fix(ntx): split internal nodes correctly in a deep index (off-by-one)#92
fix(ntx): split internal nodes correctly in a deep index (off-by-one)#92Admnwk wants to merge 2 commits into
Conversation
An NTX index built over a key stream containing a clustered run of duplicate keys came out structurally incomplete: a top-to-bottom Skip walk reached only a fraction of the records and AdsSeek found almost none of the survivors, even though AdsCreateIndex reported success. Two independent defects in the B-tree insert path were responsible. 1. seek_key_for_write_(), shared by insert() and erase(), stopped the descent at an internal node whenever the key matched a separator exactly. NTX keeps keys in internal nodes (a B-tree, not a B+tree), so for erase / read positioning that is correct, but an insert must always reach a leaf. Stopping at a branch made insert() write the new entry into that internal node with a 0 left child, orphaning the whole subtree. Exact matches only happen for duplicate keys, so unique streams never tripped it. Fix: a descend_to_leaf flag (set by insert) ignores the exact-match branch shortcut and keeps walking down to a leaf. 2. The "room in parent" split path reused the parent's sentinel slot (its rightmost child) for the promoted separator without preserving the original rightmost child when the separator landed left of the end (pos < key_count). Runs of duplicates sort into the leftmost subtree, so their separators are inserted left of the end, dropping the right subtree and leaving an unreachable branch with a 0 right child. Ascending unique keys always split at the rightmost position, so they never tripped it. Fix: capture the rightmost child before the offset shift and reinstate it as the new sentinel when pos < key_count. Adds abi_ntx_multipage_dup_test: builds a multi-page NTX over a duplicate-laden key stream and asserts every record is reachable both by a full Skip walk (in order) and an individual Seek. RED before the fix (walk reached 30/300, Seek found 24/240), GREEN after. Full unit suite 786/786, 0 regressions; builds clean under -Werror. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Once an NTX index grows a third level — an internal (branch) node fills up and must split, which happens around 1500+ keys for a 15-byte key — insert() built the parent's entry list, inserted the promoted separator at `pos`, then wired the new right sibling in with `ents[pos + 1].lchild = prop_right`. For an ascending key stream the descent always lands at the rightmost slot (pos == key_count), so the separator is appended at the end and `ents[pos + 1]` reads one past the vector. Worse, prop_right was never recorded as the node's rightmost child: the stale original sentinel was reused instead, which for a rightmost leaf split aliases prop_left — so prop_left ended up referenced twice and prop_right was orphaned. Effect: every internal-node split injected one phantom / out-of-order entry. A full Skip walk returned slightly more rows than exist (e.g. 2003 vs 2000), keys came back out of order, and a few Seeks missed. Fix: when the separator is appended at the end, prop_right becomes the node's new rightmost child (mirroring the non-splitting "room in parent" path); otherwise the original sentinel is preserved. This also removes the one-past-the-end vector access. Adds abi_ntx_internal_split_test: builds a 2000-key unique index (no duplicates, no PACK) and asserts a full Skip walk visits exactly N records in order and every key is reachable by Seek. RED before the fix (walk 2003/2000, out of order, Seek 1943/2000), GREEN after. Full unit suite passes; builds clean under -Werror. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request fixes critical bugs in the NTX B-tree index implementation related to duplicate key insertion and internal node splitting, which previously caused orphaned subtrees and out-of-order/missing keys. It introduces a descend_to_leaf flag to ensure insertions always reach a leaf node, and corrects the handling of the rightmost child sentinel during internal node splits. Two regression tests are added to cover these scenarios. The review feedback highlights potential buffer overflow vulnerabilities in the newly added test files where fixed-size buffers are populated using std::memcpy without bounds checking, and suggests safer alternatives using std::snprintf or assertions.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| UNSIGNED8 fld[32]; | ||
| std::memcpy(fld, field, std::strlen(field) + 1); |
There was a problem hiding this comment.
Using std::memcpy with std::strlen to copy into a fixed-size buffer fld of size 32 can lead to a buffer overflow if field exceeds 31 characters. It is safer to use std::snprintf to ensure null-termination and prevent out-of-bounds writes.
| UNSIGNED8 fld[32]; | |
| std::memcpy(fld, field, std::strlen(field) + 1); | |
| UNSIGNED8 fld[32]; | |
| std::snprintf(reinterpret_cast<char*>(fld), sizeof(fld), "%s", field); |
| UNSIGNED8 srv[256]; | ||
| std::memcpy(srv, dir.string().c_str(), dir.string().size() + 1); |
There was a problem hiding this comment.
Copying dir.string() into a fixed-size buffer srv of size 256 using std::memcpy can cause a buffer overflow if the temporary directory path is extremely long (which can happen in some deep CI or Windows environments). Adding a REQUIRE check ensures that the path fits within the buffer limits.
| UNSIGNED8 srv[256]; | |
| std::memcpy(srv, dir.string().c_str(), dir.string().size() + 1); | |
| UNSIGNED8 srv[256]; | |
| REQUIRE(dir.string().size() < sizeof(srv)); | |
| std::memcpy(srv, dir.string().c_str(), dir.string().size() + 1); |
| UNSIGNED8 fld[32]; | ||
| std::memcpy(fld, field, std::strlen(field) + 1); |
There was a problem hiding this comment.
Using std::memcpy with std::strlen to copy into a fixed-size buffer fld of size 32 can lead to a buffer overflow if field exceeds 31 characters. It is safer to use std::snprintf to ensure null-termination and prevent out-of-bounds writes.
| UNSIGNED8 fld[32]; | |
| std::memcpy(fld, field, std::strlen(field) + 1); | |
| UNSIGNED8 fld[32]; | |
| std::snprintf(reinterpret_cast<char*>(fld), sizeof(fld), "%s", field); |
| UNSIGNED8 srv[256]; | ||
| std::memcpy(srv, dir.string().c_str(), dir.string().size() + 1); |
There was a problem hiding this comment.
Copying dir.string() into a fixed-size buffer srv of size 256 using std::memcpy can cause a buffer overflow if the temporary directory path is extremely long (which can happen in some deep CI or Windows environments). Adding a REQUIRE check ensures that the path fits within the buffer limits.
| UNSIGNED8 srv[256]; | |
| std::memcpy(srv, dir.string().c_str(), dir.string().size() + 1); | |
| UNSIGNED8 srv[256]; | |
| REQUIRE(dir.string().size() < sizeof(srv)); | |
| std::memcpy(srv, dir.string().c_str(), dir.string().size() + 1); |
|
Merged directly to |
Problem
Once an NTX index grows a third level — an internal (branch) node fills up and has to split, which happens around 1500+ keys for a 15-byte key — the build produced a slightly corrupt tree: a full
Skipwalk returned more rows than exist (e.g. 2003 vs 2000), keys came back out of order, and a fewSeeks missed. Unlike the duplicate-key case (#86), this hits plain ascending unique keys, so any large index is affected.Root cause
In
insert(), the internal-node split builds the parent's entry list, inserts the promoted separator atpos, then wired the new right sibling in with:ents[pos + 1].lchild = prop_right;For an ascending key stream the descent always lands at the rightmost slot (
pos == key_count), so the separator is appended at the end andents[pos + 1]reads one past the vector. Worse,prop_rightwas never recorded as the node's rightmost child — the stale original sentinel was reused instead, which for a rightmost leaf split aliasesprop_left. Soprop_leftended up referenced twice andprop_rightwas orphaned, injecting one phantom / out-of-order entry per internal split.Fix
When the separator is appended at the end,
prop_rightbecomes the node's new rightmost child (mirroring the non-splitting "room in parent" path); otherwise the original sentinel is preserved. This also removes the one-past-the-end vector access.Validation
tests/unit/abi_ntx_internal_split_test.cpp: builds a 2000-key unique index (no duplicates, no PACK) and asserts a fullSkipwalk visits exactly N records in order and every key is reachable bySeek.-DOPENADS_WARNINGS_AS_ERRORS=ON(MSVC x64).Notes
src/drivers/ntx/ntx_index.{cpp,h}in different regions and are logically independent.abi_adi_multilevel_build_testis order-dependent — it fails when run in isolation (-tc=...) and passes inside the full suite, on plain v1.4.0 before any of these changes. It's anAdsCreateIndex61 → 5000on an 8000-row ADI build that depends on warm heap state; not introduced or affected by this change (this fix in fact removes a one-past-the-end write). Flagging it as a separate item worth a look.