Part of #357 (Epic 2 — Task Lifecycle Fixes).
Context
TaskHistoryStore uses a promise-chain mutex (withLock) to serialize all cache mutations. Every write path — upsert, delete, deleteMany, reconcile, atomicReadAndUpdate — acquires the lock before touching this.cache. However, two public methods bypass the lock entirely:
invalidateAll() — synchronous cache.clear(), no lock acquired
invalidate(taskId) — async file read + cache.set, no lock acquired
There is one production caller today (webviewMessageHandler.ts:954, Roo history import):
provider.taskHistoryStore.invalidateAll() // sync, no lock
await provider.taskHistoryStore.reconcile() // async, acquires lock
The Race
_upsertUnlocked (called by atomicReadAndUpdate and upsert) contains an await at line 175 (await this.writeTaskFile(merged)). At that yield point the event loop is free to run other synchronous work — including invalidateAll(). The resulting interleaving:
atomicReadAndUpdate holds the lock, enters _upsertUnlocked
await writeTaskFile(merged) — lock is held but execution yields
invalidateAll() fires: cache.clear() — entire cache wiped while lock is logically held
_upsertUnlocked resumes: cache.set(merged.id, merged) — only the one item is back
getAll() returns a single-item list
onWrite([singleItem]) → scheduleGlobalStateWriteThrough() runs with partial history
- Lock releases;
reconcile() queues and eventually rebuilds the full cache
Impact: Between steps 6 and 7 globalState contains only one history item. If VS Code crashes in this window, the downgrade-compatibility globalState history list is corrupted. invalidate() has no production callers today, but carries an equivalent stale-write risk if ever called concurrently.
Developer Notes
Why the lock doesn't protect you here
The promise-chain mutex in withLock serializes other withLock callers — it does not prevent synchronous code outside the lock from observing or mutating cache mid-await. As the matklad "On Async Mutexes" post explains: "mutual exclusion is a property of the logic itself, not of the runtime… if a certain snippet of code must be executed atomically with respect to everything else that is concurrent, then it must be annotated as such in the source code." Any public method that touches this.cache without withLock is an escape hatch that breaks the invariant the rest of the class relies on. The Trabe article on mutex-based cache sync makes the same point: the lock must own the full critical section including any cache reads that feed into the write.
Recommended fix — invalidateAndReconcile()
Since the one production caller always pairs invalidateAll() + reconcile(), the cleanest fix is a combined method that does both under a single lock acquisition, eliminating the empty-cache window:
async invalidateAndReconcile(): Promise<void> {
return this.withLock(async () => {
this.cache.clear()
// inline reconcile body (or extract _reconcileUnlocked like _upsertUnlocked)
...
})
}
Then webviewMessageHandler.ts:954 becomes:
await provider.taskHistoryStore.invalidateAndReconcile()
await provider.taskHistoryStore.flushIndex()
await provider.postStateToWebview()
CodeRabbit suggested wrapping each method in its own withLock call. That also fixes the corruption, but leaves a brief window between two separate lock acquisitions where reads see an empty cache. A single combined method is strictly better.
invalidate(taskId) should also be wrapped in withLock (or use an _invalidateUnlocked helper if ever needed inside an existing lock holder).
What to validate
- The
onWrite callback receives a complete history list after any concurrent invalidateAndReconcile + atomicReadAndUpdate
- A test that fires
invalidateAndReconcile() concurrently with atomicReadAndUpdate() should show the full item set in cache after both settle (analogous to delegation-concurrent.spec.ts)
- The one call site in
webviewMessageHandler.ts is updated to await invalidateAndReconcile()
Scope note
This is a pre-existing defect that became visible when atomicReadAndUpdate (Story 2.1, PR #691) made the locking contract explicit. It is independent of Stories 2.2 and 2.3.
Part of #357 (Epic 2 — Task Lifecycle Fixes).
Context
TaskHistoryStoreuses a promise-chain mutex (withLock) to serialize all cache mutations. Every write path —upsert,delete,deleteMany,reconcile,atomicReadAndUpdate— acquires the lock before touchingthis.cache. However, two public methods bypass the lock entirely:invalidateAll()— synchronouscache.clear(), no lock acquiredinvalidate(taskId)— async file read +cache.set, no lock acquiredThere is one production caller today (
webviewMessageHandler.ts:954, Roo history import):The Race
_upsertUnlocked(called byatomicReadAndUpdateandupsert) contains anawaitat line 175 (await this.writeTaskFile(merged)). At that yield point the event loop is free to run other synchronous work — includinginvalidateAll(). The resulting interleaving:atomicReadAndUpdateholds the lock, enters_upsertUnlockedawait writeTaskFile(merged)— lock is held but execution yieldsinvalidateAll()fires:cache.clear()— entire cache wiped while lock is logically held_upsertUnlockedresumes:cache.set(merged.id, merged)— only the one item is backgetAll()returns a single-item listonWrite([singleItem])→scheduleGlobalStateWriteThrough()runs with partial historyreconcile()queues and eventually rebuilds the full cacheImpact: Between steps 6 and 7
globalStatecontains only one history item. If VS Code crashes in this window, the downgrade-compatibilityglobalStatehistory list is corrupted.invalidate()has no production callers today, but carries an equivalent stale-write risk if ever called concurrently.Developer Notes
Why the lock doesn't protect you here
The promise-chain mutex in
withLockserializes otherwithLockcallers — it does not prevent synchronous code outside the lock from observing or mutatingcachemid-await. As the matklad "On Async Mutexes" post explains: "mutual exclusion is a property of the logic itself, not of the runtime… if a certain snippet of code must be executed atomically with respect to everything else that is concurrent, then it must be annotated as such in the source code." Any public method that touchesthis.cachewithoutwithLockis an escape hatch that breaks the invariant the rest of the class relies on. The Trabe article on mutex-based cache sync makes the same point: the lock must own the full critical section including any cache reads that feed into the write.Recommended fix —
invalidateAndReconcile()Since the one production caller always pairs
invalidateAll()+reconcile(), the cleanest fix is a combined method that does both under a single lock acquisition, eliminating the empty-cache window:Then
webviewMessageHandler.ts:954becomes:CodeRabbit suggested wrapping each method in its own
withLockcall. That also fixes the corruption, but leaves a brief window between two separate lock acquisitions where reads see an empty cache. A single combined method is strictly better.invalidate(taskId)should also be wrapped inwithLock(or use an_invalidateUnlockedhelper if ever needed inside an existing lock holder).What to validate
onWritecallback receives a complete history list after any concurrentinvalidateAndReconcile+atomicReadAndUpdateinvalidateAndReconcile()concurrently withatomicReadAndUpdate()should show the full item set in cache after both settle (analogous todelegation-concurrent.spec.ts)webviewMessageHandler.tsis updated toawait invalidateAndReconcile()Scope note
This is a pre-existing defect that became visible when
atomicReadAndUpdate(Story 2.1, PR #691) made the locking contract explicit. It is independent of Stories 2.2 and 2.3.