Skip to content

Add cached find support#894

Open
premtsd-code wants to merge 20 commits into
mainfrom
feat/cached-find
Open

Add cached find support#894
premtsd-code wants to merge 20 commits into
mainfrom
feat/cached-find

Conversation

@premtsd-code

@premtsd-code premtsd-code commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Summary

  • add Database::findCached() for trusted internal reads that can tolerate bounded staleness
  • store cached find results using an Appwrite-list-cache-like Redis key with a :find suffix so it does not collide with Appwrite's native list cache
  • add getFindCacheKey() and getFindCacheField() so callers can build and purge the same find-cache key/field shape explicitly
  • keep cached reads and writes limited to authorization-skipped/internal reads; authorized calls delegate to find()
  • validate query types and collection-specific query semantics before cache lookup so cached paths raise QueryException consistently with find()
  • clamp cached TTLs to Database::TTL (24h), with ttl: 0 disabling caching for a call
  • include database, schema, queries, relationship mode, active filter signatures, and roles in find-cache field generation
  • serialize cursor query documents with their full payload so pagination boundary values do not collide
  • bypass cached reads and writes for Query::orderRandom()
  • refresh cached entries when cached payloads contain TTL-expired documents or invalid payload entries

Usage

findCached() executes the normal find() query on cache miss and stores the result under a find-cache key derived from the Appwrite list-cache shape.

$documents = $db->getAuthorization()->skip(
    fn () => $db->findCached(
        collection: 'projects',
        queries: [Query::orderAsc('name')],
        ttl: Database::TTL,
        cacheCollection: 'wafrules',
        namespace: '_39',
        roles: ['waf'],
        field: 'documents',
        payloadKey: 'rules',
    )
);

Callers that need invalidation can purge the whole find-cache key they own:

$db->getCache()->purge(
    $db->getFindCacheKey('wafrules', '_39')
);

Cache Key Generation

Cached entries are stored under:

{cacheName}-cache:{hostname}:{namespace}:{tenant}:collection:{cacheCollection}:find

Example:

default-cache:mysql-console:_39::collection:wafrules:find

The :find suffix keeps this separate from Appwrite's native list-cache key for the same collection.

The hash field keeps the Appwrite list-cache field shape:

{schemaHash}:{rolesHash}:{queryStateHash}:{field}:{payloadKey}

The query-state hash includes serialized queries, database, relationship mode, and active filter signatures. Cursor queries include the cursor document payload so pagination boundaries do not collide when ordered values change.

The payloadKey is included in the field identity and cached values are stored under that configured key, for example rules for WAF.

Tests

  • php vendor/bin/phpunit tests/unit/ListCacheTest.php tests/unit/CacheKeyTest.php
  • composer lint
  • composer format
  • php vendor/bin/phpstan analyse --level 7 src/Database/Database.php tests/unit/ListCacheTest.php tests/unit/CacheKeyTest.php --memory-limit 2G

Summary by CodeRabbit

  • New Features
    • Query result caching with configurable expiration is now available. Frequently requested query results are cached to minimize database calls. The system automatically detects when cached results expire and refreshes them as needed. Caching respects authorization and query settings to ensure data accuracy and security while improving performance.

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Warning

Review limit reached

@premtsd-code, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 41 minutes and 57 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7773177a-5e35-4e88-b4dc-0bc5f2e6d7f7

📥 Commits

Reviewing files that changed from the base of the PR and between 2eb8271 and 20f0a56.

📒 Files selected for processing (3)
  • src/Database/Database.php
  • tests/unit/CacheKeyTest.php
  • tests/unit/ListCacheTest.php
📝 Walkthrough

Walkthrough

Adds findCached() to Database for TTL-backed caching of find() results, with cache bypass conditions (zero TTL, active authorization, random-order queries) and automatic eviction/refresh when cached documents are TTL-expired. Introduces getFindCacheKey(), getFindCacheField(), and deterministic internal helpers for stable cache-key derivation. Updates getCacheKeys() to use a centralized getActiveFilterSignatures(). New unit tests are added in CacheKeyTest and a new ListCacheTest suite with in-memory cache/adapter helpers.

Changes

findCached Caching Feature

Layer / File(s) Summary
Cache-key infrastructure: stable find cache key/field helpers
src/Database/Database.php
Updates getCacheKeys() to include relationship-resolution state and centralized getActiveFilterSignatures(). Adds public getFindCacheKey() and getFindCacheField() with deterministic key/field derivation from database identity, schema hash, roles, filter signatures, and serialized query payload. Adds internal serializeFindCacheQuery(), normalizeFindCacheQueryValue(), getFindCacheSchemaHash(), and getActiveFilterSignatures() helpers.
findCached() implementation and cache payload decoding
src/Database/Database.php
Adds findCached() with TTL clamping, cache bypass conditions, cache load/decode via decodeFindCachePayload(), TTL-expiration detection with eviction/reload, EVENT_DOCUMENT_FIND emission on clean hits, and best-effort save on misses/refreshes. Adds private decodeFindCachePayload() returning [documents, hasExpiredDocuments].
CacheKeyTest: find cache key and field coverage
tests/unit/CacheKeyTest.php
Extends createDatabase() helper with optional database name; adds tests for getFindCacheKey() format and namespace/tenant override, and getFindCacheField() schema derivation, role/query variation, cursor payload inclusion, ambient-state effects, and QueryException for invalid types.
ListCacheTest: findCached() behavior and in-memory test adapters
tests/unit/ListCacheTest.php
Adds ListCacheTest suite for stale-until-purge, list-key/field correctness, expired document refetch, and query-type validation. Implements HashMemoryCache (TTL-aware in-memory cache adapter with setCachedPayloadDocumentAttribute) and TtlMemoryAdapter (TTL index support override).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • utopia-php/database#828: Both PRs modify cache-key/signature generation driven by filter metadata — #828 adds a signature to getInstanceFilters(), while this PR builds find-cache keys/fields from those filter-signature inputs, with overlapping cache-key-focused test coverage.

Suggested reviewers

  • abnegate

Poem

🐰 A bunny once searched through a haystack of docs,
Now findCached keeps results tucked in a box.
Schema hash, filter sigs, TTL in tow,
Stale entries evicted, fresh data aglow!
Cache keys computed with deterministic care,
No query goes missing — results always there. 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 15.94% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The pull request title 'Add cached find support' clearly and concisely summarizes the main change: introducing a new findCached() method for caching database find query results.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/cached-find

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps

greptile-apps Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR introduces findCached(), a new auth-skip-only query path that stores result sets in Redis under a :find-suffixed key derived from the existing Appwrite list-cache shape, with configurable TTL, cursor-safe field hashing, and automatic refresh when cached documents have expired according to a collection TTL index.

  • findCached() correctly short-circuits to find() when authorization is enabled, TTL is zero, or a Query::orderRandom() is detected, and properly clamps TTL to Database::TTL (24h).
  • getFindCacheField() serializes cursor Document values via getArrayCopy() so that two cursors with the same id but different sort-field values produce distinct fields; $payloadKey is included as the fifth segment of the field, preventing two callers with different payload keys from evicting each other's entries.
  • decodeFindCachePayload() now sets $shouldRefreshCache = true for both TTL-expired documents and non-array entries, triggering a full database refetch rather than silently returning a partial or empty result.
  • Cache-hit documents are reconstructed via createDocumentInstance() only, without the castingAfter/casting/decode pipeline applied by find(), so callers can receive different PHP types on a cache hit than on a cold read; separately, nested relationship attributes are returned as plain arrays instead of Document objects on cache hits.

Confidence Score: 4/5

Safe to merge for callers that treat findCached() results as opaque blobs, but callers that read typed attributes or nested relationship objects from cache hits can receive different PHP types or plain arrays where fresh find() would return Document instances.

The cache-hit reconstruction path in decodeFindCachePayload() skips the castingAfter/casting/decode pipeline that find() applies to every result, so float attributes serialized as integers in JSON and nested relationship documents serialized as arrays are returned as-is to callers on every cache hit. This is a real behavioral difference between the first (cold) call and subsequent (warm) calls for the same query. The rest of the new logic — cursor payload serialization, random-order bypass, payloadKey inclusion in the field hash, non-array entry refresh — is correctly implemented.

src/Database/Database.php — specifically the decodeFindCachePayload() method and the cache-hit branch of findCached(), which return documents without the type-casting and decoding steps that find() applies

Important Files Changed

Filename Overview
src/Database/Database.php Adds findCached(), decodeFindCachePayload(), getFindCacheKey(), getFindCacheField(), and related helpers; correctly handles cursor serialization, TTL clamping, random-order bypass, and non-array cache entries, but cache-hit documents bypass castingAfter/casting/decode applied by find(), and relationships are returned as arrays instead of nested Documents
tests/unit/CacheKeyTest.php Adds tests for getFindCacheKey() and getFindCacheField() covering namespace/tenant overrides, cursor payload differentiation, ambient filter/relationship state, and query-type validation; all assertions look correct
tests/unit/ListCacheTest.php New integration tests using in-process HashMemoryCache and TtlMemoryAdapter stubs; covers stale-until-purge, cache field shape, TTL-expired document refetch, invalid payload refetch, and query semantic/type validation; test helpers are well-structured

Reviews (17): Last reviewed commit: "Harden find cache validation" | Re-trigger Greptile

Comment thread src/Database/Database.php Outdated
Comment thread src/Database/Database.php Outdated
Comment thread src/Database/Database.php Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Database/Database.php`:
- Around line 8625-8642: The cache hit path in the conditional block starting
with if ($cached !== null && $cached !== false && \is_array($cached)) only wraps
cached arrays into Document instances using createDocumentInstance, but does not
hydrate relationship fields that may have been serialized as arrays. This
results in different document shapes being returned compared to the cache miss
path which uses find() that returns fully processed documents. Either ensure the
cached hit path recursively hydrates relationship fields to match the shape
returned by find(), or add logic to bypass caching for finds that request
relationship-resolved data to avoid returning inconsistent types between cache
hits and misses.
- Around line 9563-9564: The cache key generation logic in the findCached method
is incorrectly dropping the serialized queries when a $key is provided. Change
the ternary expression that sets the queries field to always call
serializeFindCacheQueries($queries) instead of returning null when $key is set.
The caller key should add an additional cache dimension alongside the query
fingerprint, not replace it, so both the serialized queries and the key must be
included in the cache identifier to prevent different queries with the same key
from returning incorrect cached results.
- Around line 9572-9584: The getFindCacheSchemaHash method currently only
includes collection attributes and indexes in the hash calculation, but the
find() authorization check also validates collection permissions and
documentSecurity settings. When collection authorization policies change, cached
results remain valid incorrectly because the schema hash doesn't reflect these
changes. Modify the getFindCacheSchemaHash method to include the collection's
permissions and documentSecurity flag in the hash payload alongside the existing
attributes and indexes. Additionally, replace the MD5 hash algorithm with a
stronger digest function such as SHA256 to address the static analysis flag
regarding weak hashing.

In `@tests/unit/FindCacheTest.php`:
- Around line 140-144: The `HashMemoryCache::save` method incorrectly prevents
caching of empty arrays because the condition `empty($data)` returns true for
empty arrays and rejects them. Modify the validation logic to only reject empty
strings while allowing empty arrays to be cached as valid results. Change the
condition to specifically check if `$data` is an empty string or if `$key` is
empty, rather than using `empty()` which treats empty arrays the same as empty
strings.
- Around line 83-99: The test testFindCachedTriggersFindEventOnCacheHit does not
actually verify a cache hit because both cache hits and cache misses emit
EVENT_DOCUMENT_FIND. To fix this, add data mutation between the two findCached
calls to force a true cache hit scenario. After the first findCached call on
line 92, modify the seeded project data in the database, then verify that the
second findCached call on line 93 returns the stale cached result from the first
call rather than the modified data, which would definitively prove the second
call was a cache hit.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d0eed82b-9eca-489f-a0fd-19748bc2694c

📥 Commits

Reviewing files that changed from the base of the PR and between cfba533 and 1d615f5.

📒 Files selected for processing (4)
  • src/Database/Database.php
  • tests/e2e/Adapter/Scopes/DocumentTests.php
  • tests/unit/CacheKeyTest.php
  • tests/unit/FindCacheTest.php

Comment thread src/Database/Database.php Outdated
Comment thread src/Database/Database.php Outdated
Comment thread src/Database/Database.php Outdated
Comment thread tests/unit/FindCacheTest.php Outdated
Comment thread tests/unit/FindCacheTest.php Outdated
Comment thread src/Database/Database.php Outdated
Comment thread src/Database/Database.php

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/Database/Database.php (2)

8611-8617: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Validate queries before serializing the cache key.

findCached() reaches serializeQueriesForFindCache() before find() runs checkQueryTypes(), so malformed query arrays can fail outside the normal QueryException path on cached calls.

Proposed fix
         if ($collectionDocument->isEmpty()) {
             throw new NotFoundException('Collection not found');
         }
 
+        $this->checkQueryTypes($queries);
+
         [$findKey, $findField] = $this->getFindCacheKeys($collectionDocument->getId(), $queries, $key, $forPermission, $collectionDocument);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Database/Database.php` around lines 8611 - 8617, In the findCached()
method, validate the queries before they are serialized for the cache key. Call
checkQueryTypes() on the queries parameter before invoking getFindCacheKeys(),
which calls serializeQueriesForFindCache(). This ensures malformed query arrays
are caught early with proper QueryException handling rather than failing during
serialization.

8617-8639: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid serving document-security caches after permission revocation.

The cache hit path returns serialized documents without re-reading current document permissions. If a cached document later loses $permissions, the same role can still receive it until the find TTL expires. Until permission-changing writes purge affected find variants, bypass caching for documentSecurity collections.

Conservative fix
         if ($collectionDocument->isEmpty()) {
             throw new NotFoundException('Collection not found');
         }
 
+        if ($collectionDocument->getAttribute('documentSecurity', false)) {
+            return $this->find($collectionDocument->getId(), $queries, $forPermission);
+        }
+
         [$findKey, $findField] = $this->getFindCacheKeys($collectionDocument->getId(), $queries, $key, $forPermission, $collectionDocument);
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Database/Database.php` around lines 8617 - 8639, The current caching
logic in the find method returns cached documents without re-validating
permissions, which creates a security vulnerability for documentSecurity
collections where permissions may have been revoked. Add a check using the
$collectionDocument to determine if the collection is a documentSecurity
collection, and if so, skip the entire cache loading and cache return logic (the
try-catch block for $this->cache->load() and the conditional check for $cached
!== null) to force a fresh permission check on every query. This ensures that
permission changes are respected immediately rather than serving stale cached
results until the TTL expires.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/Database/Database.php`:
- Around line 8611-8617: In the findCached() method, validate the queries before
they are serialized for the cache key. Call checkQueryTypes() on the queries
parameter before invoking getFindCacheKeys(), which calls
serializeQueriesForFindCache(). This ensures malformed query arrays are caught
early with proper QueryException handling rather than failing during
serialization.
- Around line 8617-8639: The current caching logic in the find method returns
cached documents without re-validating permissions, which creates a security
vulnerability for documentSecurity collections where permissions may have been
revoked. Add a check using the $collectionDocument to determine if the
collection is a documentSecurity collection, and if so, skip the entire cache
loading and cache return logic (the try-catch block for $this->cache->load() and
the conditional check for $cached !== null) to force a fresh permission check on
every query. This ensures that permission changes are respected immediately
rather than serving stale cached results until the TTL expires.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 29c2ae6a-250e-4626-8ae6-59effc86c769

📥 Commits

Reviewing files that changed from the base of the PR and between 6fad393 and 5275b33.

📒 Files selected for processing (5)
  • src/Database/Database.php
  • tests/e2e/Adapter/Base.php
  • tests/e2e/Adapter/RedisTest.php
  • tests/e2e/Adapter/Scopes/DocumentTests.php
  • tests/unit/FindCacheTest.php
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/e2e/Adapter/Base.php
  • tests/e2e/Adapter/RedisTest.php
  • tests/e2e/Adapter/Scopes/DocumentTests.php

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Database/Database.php`:
- Line 8659: The condition on line 8659 checks `if ($touchOnHit &&
!$hasExpiredDocuments)`, but the `!$hasExpiredDocuments` part is redundant
because the code already guarantees this value through an early return statement
in the preceding lines (8646-8657). Remove the `&& !$hasExpiredDocuments` clause
from the condition, leaving only `if ($touchOnHit)`, to eliminate the
always-true condition that is causing the PHPStan error.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 647e586e-bf40-4233-b042-568a6b0b5436

📥 Commits

Reviewing files that changed from the base of the PR and between 5275b33 and 7cd9d19.

📒 Files selected for processing (4)
  • src/Database/Database.php
  • tests/e2e/Adapter/Scopes/DocumentTests.php
  • tests/unit/CacheKeyTest.php
  • tests/unit/FindCacheTest.php
💤 Files with no reviewable changes (1)
  • tests/unit/CacheKeyTest.php
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/e2e/Adapter/Scopes/DocumentTests.php
  • tests/unit/FindCacheTest.php

Comment thread src/Database/Database.php Outdated
Comment thread src/Database/Database.php Outdated
Comment thread src/Database/Database.php Outdated
Comment thread src/Database/Database.php Outdated
Comment thread src/Database/Database.php
Comment thread src/Database/Database.php
Comment thread src/Database/Database.php
Comment thread src/Database/Database.php Outdated
Comment thread src/Database/Database.php Outdated
Comment thread src/Database/Database.php Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/Database/Database.php (1)

8831-8838: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Hydrate cached hits through the same post-processing as find().

decodeCachedFindPayload() only wraps top-level arrays, while the miss path returns fully cast/decoded documents and may include populated relationship Document instances. JSON-backed cache adapters can therefore make cache hits return a different shape than misses; recursively hydrate relationship fields or bypass relationship-resolved cached finds until hit-path hydration matches find().

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Database/Database.php` around lines 8831 - 8838, The
decodeCachedFindPayload() method only wraps top-level arrays but does not
perform the same post-processing and hydration of relationship fields that the
find() method applies to documents from the miss path. To fix this, ensure that
documents retrieved from the cache in this code section (where isTtlExpired() is
checked) are passed through the same post-processing/hydration logic that is
applied to non-cached results, so that relationship Document instances and field
casting match the behavior of the find() method. This will ensure cache hits
return the same shape as cache misses and prevent JSON-backed cache adapters
from returning inconsistently structured data.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Database/Database.php`:
- Around line 9788-9793: Replace the MD5 hashing algorithm with SHA-256 for
collision resistance in the cache key generation. In the sprintf call that
builds the cache key, replace both instances of \md5(\json_encode($roles) ?: '')
and \md5(\json_encode($queryPayload) ?: '') with hash('sha256', ...) calls
instead. Also apply the same SHA-256 replacement at the location around line
9822 as indicated in the review comment.

---

Duplicate comments:
In `@src/Database/Database.php`:
- Around line 8831-8838: The decodeCachedFindPayload() method only wraps
top-level arrays but does not perform the same post-processing and hydration of
relationship fields that the find() method applies to documents from the miss
path. To fix this, ensure that documents retrieved from the cache in this code
section (where isTtlExpired() is checked) are passed through the same
post-processing/hydration logic that is applied to non-cached results, so that
relationship Document instances and field casting match the behavior of the
find() method. This will ensure cache hits return the same shape as cache misses
and prevent JSON-backed cache adapters from returning inconsistently structured
data.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 25343a0e-0838-4824-bd18-b9cd2fd72e4b

📥 Commits

Reviewing files that changed from the base of the PR and between 7cd9d19 and e4d8f12.

📒 Files selected for processing (3)
  • src/Database/Database.php
  • tests/unit/CacheKeyTest.php
  • tests/unit/FindCacheTest.php

Comment thread src/Database/Database.php
Comment on lines +9788 to +9793
return \sprintf(
'%s:%s:%s:%s',
$this->getCachedFindSchemaHash($collection),
\md5(\json_encode($roles) ?: ''),
\md5(\json_encode($queryPayload) ?: ''),
$field,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use a collision-resistant digest for cache-field hashes.

These hashes decide which cached result is returned. Replace MD5 with SHA-256 for the role/query/find payload fingerprints to avoid aliasing distinct cache variants.

Proposed fix
         return \sprintf(
             '%s:%s:%s:%s',
             $this->getCachedFindSchemaHash($collection),
-            \md5(\json_encode($roles) ?: ''),
-            \md5(\json_encode($queryPayload) ?: ''),
+            \hash('sha256', \json_encode($roles) ?: ''),
+            \hash('sha256', \json_encode($queryPayload) ?: ''),
             $field,
         );
@@
-        return [$findKey, \md5(\json_encode($payload) ?: '')];
+        return [$findKey, \hash('sha256', \json_encode($payload) ?: '')];

Also applies to: 9822-9822

🧰 Tools
🪛 ast-grep (0.43.0)

[error] 9787-9793: Do not use a weak hash algorithm
Context: \sprintf(
'%s:%s:%s:%s',
$this->getCachedFindSchemaHash($collection),
\md5(\json_encode($roles) ?: ''),
\md5(\json_encode($queryPayload) ?: ''),
$field,
)
Note: [CWE-328] Use of Weak Hash.

(weak-hash-algorithm)


[error] 9790-9790: Do not use a weak hash algorithm
Context: \md5(\json_encode($roles) ?: '')
Note: [CWE-328] Use of Weak Hash.

(weak-hash-algorithm)


[error] 9791-9791: Do not use a weak hash algorithm
Context: \md5(\json_encode($queryPayload) ?: '')
Note: [CWE-328] Use of Weak Hash.

(weak-hash-algorithm)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/Database/Database.php` around lines 9788 - 9793, Replace the MD5 hashing
algorithm with SHA-256 for collision resistance in the cache key generation. In
the sprintf call that builds the cache key, replace both instances of
\md5(\json_encode($roles) ?: '') and \md5(\json_encode($queryPayload) ?: '')
with hash('sha256', ...) calls instead. Also apply the same SHA-256 replacement
at the location around line 9822 as indicated in the review comment.

Source: Linters/SAST tools

Comment thread src/Database/Database.php

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/Database/Database.php`:
- Around line 8616-8627: The code attempts to load from cache before performing
full query validation against the collection structure. Add the complete
semantic validation logic from the full DocumentsValidator (not just
checkQueryTypes() which only validates that values are Query objects) before the
cache->load() call to ensure that cached hits cannot bypass standard
QueryException behavior for invalid query semantics. This validation block
should validate the queries parameter against the collectionDocument before
attempting to retrieve from the cache identified by the cacheKey and cacheField
variables.

In `@tests/unit/CacheKeyTest.php`:
- Around line 284-293: The PHPDoc annotation on the $queries variable in the
testFindCacheFieldValidatesQueryTypes method incorrectly declares the type as
array<Query>, but the actual value assigned is ['invalid'], which is an array of
strings. Update the PHPDoc comment to accurately reflect the actual data type by
changing the annotation to array<string> to match what is actually being
assigned to the variable.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: eeaf2dfe-7928-4818-8c18-4d69cbbec6f8

📥 Commits

Reviewing files that changed from the base of the PR and between 7cd9d19 and 2eb8271.

📒 Files selected for processing (3)
  • src/Database/Database.php
  • tests/unit/CacheKeyTest.php
  • tests/unit/ListCacheTest.php

Comment thread src/Database/Database.php
Comment thread tests/unit/CacheKeyTest.php
Comment thread src/Database/Database.php
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant