fix(ntx): build a complete multi-page index over runs of duplicate keys#86
fix(ntx): build a complete multi-page index over runs of duplicate keys#86Admnwk wants to merge 1 commit into
Conversation
An NTX index built over a key stream containing a clustered run of duplicate keys came out structurally incomplete: a top-to-bottom Skip walk reached only a fraction of the records and AdsSeek found almost none of the survivors, even though AdsCreateIndex reported success. Two independent defects in the B-tree insert path were responsible. 1. seek_key_for_write_(), shared by insert() and erase(), stopped the descent at an internal node whenever the key matched a separator exactly. NTX keeps keys in internal nodes (a B-tree, not a B+tree), so for erase / read positioning that is correct, but an insert must always reach a leaf. Stopping at a branch made insert() write the new entry into that internal node with a 0 left child, orphaning the whole subtree. Exact matches only happen for duplicate keys, so unique streams never tripped it. Fix: a descend_to_leaf flag (set by insert) ignores the exact-match branch shortcut and keeps walking down to a leaf. 2. The "room in parent" split path reused the parent's sentinel slot (its rightmost child) for the promoted separator without preserving the original rightmost child when the separator landed left of the end (pos < key_count). Runs of duplicates sort into the leftmost subtree, so their separators are inserted left of the end, dropping the right subtree and leaving an unreachable branch with a 0 right child. Ascending unique keys always split at the rightmost position, so they never tripped it. Fix: capture the rightmost child before the offset shift and reinstate it as the new sentinel when pos < key_count. Adds abi_ntx_multipage_dup_test: builds a multi-page NTX over a duplicate-laden key stream and asserts every record is reachable both by a full Skip walk (in order) and an individual Seek. RED before the fix (walk reached 30/300, Seek found 24/240), GREEN after. Full unit suite 786/786, 0 regressions; builds clean under -Werror. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request fixes two critical defects in the NTX multi-page index implementation that occur when building indexes with duplicate keys: ensuring insertion always descends to a leaf node, and preserving the parent's original rightmost child during splits. A comprehensive regression test is also added. The review feedback highlights a critical out-of-bounds read/write vulnerability on 1024-byte NTX pages when the key width is 245, as well as two potential buffer overflows in the newly introduced test code due to unsafe std::memcpy operations on fixed-size buffers.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| // common case for runs of duplicate keys, which sort to the | ||
| // left subtree) drops the original right subtree, leaving an | ||
| // unreachable branch with a 0 right child. | ||
| std::uint32_t old_rightmost = get_left_child(parent, pkc); |
There was a problem hiding this comment.
Critical Security/Correctness Issue: Potential Out-of-Bounds Read/Write on 1024-byte NTX Page
An NTX page has a fixed size of 1024 bytes.
- The page header/metadata starts with 2 bytes for the key count.
- The offset table has
max_keys + 1entries (including the sentinel), taking2 * (max_keys + 1)bytes. - Therefore,
blocks_start = 2 + 2 * (max_keys + 1) = 4 + 2 * max_keys. - The sentinel offset (at index
max_keys) is located atblocks_start + max_keys * item_size. - Since the sentinel slot stores a 4-byte child pointer, we must ensure that:
blocks_start + max_keys * item_size + 4 <= 1024
8 + max_keys * (item_size + 2) <= 1024
When width = 245:
item_size = width + 8 = 253.max_keysis calculated as(1022 / 255) - 1 = 3, which is then clamped to the minimum of4.- Plugging these into the formula:
8 + 4 * (253 + 2) = 8 + 4 * 255 = 1020 + 8 = 1028 > 1024
This means the sentinel offset is calculated as 1024. Reading or writing the 4-byte child pointer at offset 1024 (e.g., via get_left_child(parent, pkc) here or put_left_child(parent, pkc + 1, old_rightmost) below) will access bytes 1024 to 1027, resulting in an out-of-bounds read/write of 4 bytes.
Recommendation:
The maximum safe width for numeric keys must be reduced from 245 to 244 in set_numeric_format (line 856). For width = 244, item_size = 252, and the sentinel offset is 12 + 4 * 252 = 1020, which allows writing the 4-byte child pointer safely within the 1024-byte page boundary.
| UNSIGNED8 srv[256]; | ||
| std::memcpy(srv, dir.string().c_str(), dir.string().size() + 1); |
There was a problem hiding this comment.
Potential Buffer Overflow in Test Code
The dir.string() path is copied into a fixed-size buffer srv[256] using std::memcpy without verifying that the path length fits within the buffer. If the temporary directory path is extremely long, this will cause a buffer overflow.
Please add a safety check or assertion to ensure the path fits within srv before copying.
| UNSIGNED8 srv[256]; | |
| std::memcpy(srv, dir.string().c_str(), dir.string().size() + 1); | |
| UNSIGNED8 srv[256]; | |
| REQUIRE(dir.string().size() < sizeof(srv)); | |
| std::memcpy(srv, dir.string().c_str(), dir.string().size() + 1); |
| UNSIGNED8 fld[32]; | ||
| std::memcpy(fld, field, std::strlen(field) + 1); |
There was a problem hiding this comment.
Potential Buffer Overflow in walk_index
The field parameter is copied into a fixed-size buffer fld[32] using std::memcpy without verifying that the length of field is less than 32. If a longer field name is passed in the future, this will cause a buffer overflow.
Please add a safety check to ensure the field name fits within fld before copying.
| UNSIGNED8 fld[32]; | |
| std::memcpy(fld, field, std::strlen(field) + 1); | |
| UNSIGNED8 fld[32]; | |
| REQUIRE(std::strlen(field) < sizeof(fld)); | |
| std::memcpy(fld, field, std::strlen(field) + 1); |
|
Merged directly to |
Problem
An NTX index built over a key stream that contains a clustered run of duplicate keys comes out structurally incomplete. A top-to-bottom
Skipwalk reaches only a fraction of the records andAdsSeekfinds almost none of them — even thoughAdsCreateIndexreturns success. Unique-key streams are unaffected, which is why this stayed hidden: it only surfaces once duplicate keys span more than one page (e.g. aCfield with many blank/empty values, as is common in legacy Clipper data).Root cause — two independent defects in the B-tree insert path
NTX is a B-tree that stores keys in internal nodes too (unlike CDX's B+tree). Both defects only trigger with duplicates and a tree at least two levels deep:
seek_key_for_write_()stopped the descent at an internal node on an exact key match. This helper is shared byinsert()anderase(). For erase / read positioning that is correct (the key may live in a branch), but an insert must always reach a leaf. Stopping at a branch madeinsert()write the new entry into that internal node with a0left child, orphaning the whole subtree. Exact matches during insert only happen for duplicate keys, so unique streams never hit it.The "room in parent" split path dropped the parent's rightmost child. When a promoted separator is inserted into a non-full parent at a position left of the end (
pos < key_count), the code reused the sentinel slot (which holds the rightmost child pointer) for the new key without preserving the original rightmost child. Runs of duplicates sort into the leftmost subtree, so their separators are inserted left of the end → the right subtree is dropped and an unreachable branch with a0right child is left behind. Ascending unique keys always split at the rightmost position (pos == key_count), so they never hit it.Fix
descend_to_leafflag toseek_key_for_write_(set only byinsert()); it ignores the exact-match branch shortcut and keeps walking down to a leaf.erase()keeps its previous behaviour.pos < key_count.Both changes are confined to
src/drivers/ntx/ntx_index.{cpp,h}.Validation
tests/unit/abi_ntx_multipage_dup_test.cpp: builds a multi-page NTX over a duplicate-laden key stream (300 records, every 5th an empty/blank key) and asserts every record is reachable by both a fullSkipwalk (in order) and an individualSeek.Seekfound 24/240 survivors.-DOPENADS_WARNINGS_AS_ERRORS=ON(MSVC x64).Scope
This addresses the duplicate-key incompleteness only. A separate, pre-existing off-by-one in the internal-node split (
fill_internal) that surfaces with very large unique-key sets (≳1500 keys, where the tree grows a third level — a few phantom/duplicated traversal entries) is not included here and will be a follow-up. This PR is independent of the NTX PACK fix in #85 (different functions; this test never deletes/packs) and is based directly on v1.4.0.