Skip to content

fix(cdx): rewrite stored key expression when re-creating a tag (re-filter on another column)#93

Merged
Admnwk merged 2 commits into
FiveTechSoft:mainfrom
Admnwk:fix/cdx-recreate-tag-stale-expr
Jun 26, 2026
Merged

fix(cdx): rewrite stored key expression when re-creating a tag (re-filter on another column)#93
Admnwk merged 2 commits into
FiveTechSoft:mainfrom
Admnwk:fix/cdx-recreate-tag-stale-expr

Conversation

@Admnwk

@Admnwk Admnwk commented Jun 26, 2026

Copy link
Copy Markdown
Collaborator

Problem

Re-running INDEX ON <other column> TAG ORDERX over an existing .cdx without first deleting the bag — the common Clipper/Harbour "re-filter on another column" idiom — left the tag's stored key expression pointing at the old column.

The tag-already-exists path (AdsCreateIndex61, is_cdx && exists) cleared the B+tree (clear_data) and rewrote the unique/descend/keysize options (set_options), but never rewrote the key expression in the sub-tag header. The freshly built tree carried the new column's keys, so a read straight after the re-create looked fine, but:

  • AdsGetIndexExpr reported the old column; and
  • the next dbAppend/dbReplace (sync_all_indexes_ evaluates idx->expression()) keyed the new record off the old column and inserted it out of order — silent disorder that only surfaces after a later write or reopen.

Reference behavior

Native DBFCDX has no such gap: hb_cdxOrderCreate (src/rdd/dbfcdx/dbfcdx1.c) deletes the existing tag wholesale (hb_cdxIndexDelTag) and adds a fresh one, so expression, FOR clause, options and tree all follow the new column. rddads simply forwards OrdCreate to AdsCreateIndex61, so matching DBFCDX here also removes the need for the app-side OrdSetFocus(0) + OrdBagClear + FERASE workaround callers used to defend against the stale tree.

Fix

On the CDX re-create path, rewrite the whole key-/FOR-expression pool via a new CdxIndex::set_expression(key_expr, for_expr). The pool layout (offsets 504..1024, NUL-terminated, FOR just past the key NUL) is factored out of build_subtag_header into write_expr_pool_ and reused, so the fresh-create and re-create paths can't drift. set_expression subsumes the former set_condition (which only rewrote FOR and now has no callers), so that dead method is removed.

Test

abi_cdx_recreate_tag_diff_expr_test: builds tag ORDERX on K1, re-creates it on K2 (reverse order, no FERASE), and asserts the stored expression is K2, the walk follows K2, and a record appended after the re-create lands in K2 order. RED before the fix (expression == "K1", appended record misplaced); GREEN after.

Full unit suite 785/785, builds clean, no new warnings. (A few ADI/ADT/serverd tests on fixed temp paths are intermittently red regardless of this change — they pass on isolated and repeat runs; this CDX change is green deterministically.)

Scope

CDX only. The ADT/.adi re-create path has the same shape and is left for a follow-up.

Re-running INDEX ON <other column> TAG ORDERX over an existing .cdx
without first deleting the bag — the "re-filter on another column"
idiom — left the tag's stored key expression pointing at the OLD
column. The tag-already-exists path cleared the B+tree (clear_data)
and rewrote the unique/descend/keysize options (set_options) but never
touched the key expression in the sub-tag header. The freshly built
tree carried the new column's keys, so a read straight after the
re-create looked correct, but:

  - AdsGetIndexExpr reported the old column; and
  - the next dbAppend/dbReplace (sync_all_indexes_ evaluates
    idx->expression()) keyed the new record off the OLD column and
    inserted it out of order — the "al cambiar de columna la lista se
    desordenaba" symptom, surfacing only after a later write/reopen.

Native DBFCDX has no such gap: hb_cdxOrderCreate deletes the existing
tag wholesale (hb_cdxIndexDelTag) and adds a fresh one, so expression,
FOR clause, options and tree all follow the new column. rddads simply
forwards OrdCreate to AdsCreateIndex61, so matching DBFCDX here also
removes the need for the app-side OrdSetFocus(0)+OrdBagClear+FERASE
workaround callers were using to defend against the stale tree.

Fix: on the CDX re-create path, rewrite the whole key-/FOR-expression
pool via a new CdxIndex::set_expression(key_expr, for_expr). The pool
layout (offsets 504..1024, NUL-terminated, FOR just past the key NUL)
is factored out of build_subtag_header into write_expr_pool_ and reused
so the fresh-create and re-create paths can't drift. set_expression
subsumes the former set_condition (which only rewrote FOR and now has
no callers), so that dead method is removed.

Adds abi_cdx_recreate_tag_diff_expr_test: builds tag ORDERX on K1,
re-creates it on K2 (reverse order, no FERASE), and asserts the stored
expression is K2, the walk follows K2, and a record appended AFTER the
re-create lands in K2 order. RED before the fix (expression == "K1",
append misplaced); GREEN after. Full unit suite 785/785 (the handful
of intermittently-red tests are pre-existing infra-flaky ADI/ADT/
serverd cases on fixed temp paths, unrelated to this CDX change and
green on isolated/repeat runs). Builds clean, no new warnings.

Scope: CDX only. The ADT/.adi re-create path has the same shape and is
left for a follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the index re-creation logic in AdsCreateIndex61 to correctly overwrite the stored key expression and FOR clause when a tag is re-created on a different column, matching native DBFCDX behavior. It introduces a helper write_expr_pool_ to handle rewriting the expression pool and adds a comprehensive unit test. The review feedback suggests restoring validation checks in set_expression to prevent silent truncation of expressions exceeding the pool limit, and replacing a fixed-size buffer in the unit test with a dynamically sized vector to avoid potential buffer overflows.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines 1522 to 1525
if (sub_header_offset_ == 0) {
return util::Error{6106, 0,
"CDX sub-tag header offset uninitialised", ""};
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The previous implementation of set_condition explicitly prevented silent truncation of the FOR expression if it exceeded the available pool space. By replacing it with set_expression and relying solely on write_expr_pool_ (which silently truncates using std::min), we lose this safety check. This could lead to silent index corruption or incorrect query filtering if a user provides expressions that exceed the 510-byte pool limit. We should restore this validation check in set_expression.

Suggested change
if (sub_header_offset_ == 0) {
return util::Error{6106, 0,
"CDX sub-tag header offset uninitialised", ""};
}
if (sub_header_offset_ == 0) {
return util::Error{6106, 0,
"CDX sub-tag header offset uninitialised", ""};
}
if (key_expr.size() > 510) {
return util::Error{6106, 0,
"Key expression too long for CDX header pool", ""};
}
const std::size_t exp_len_nul = key_expr.size() + 1;
const std::size_t for_room = (exp_len_nul <= 510) ? (510 - exp_len_nul) : 0;
if (for_expr.size() > for_room) {
return util::Error{6106, 0,
"FOR expression too long for CDX header pool", ""};
}

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restored the fail-loud guard in set_expression: it now returns 6106 if the key (>510) or FOR (> remaining pool) would not fit, instead of letting write_expr_pool_ clamp with std::min. Covers both halves of the pool, not just the FOR as the old set_condition did. Added a unit case asserting the rejection. (769bc15)

Comment on lines +75 to +76
UNSIGNED8 srv[256];
std::memcpy(srv, dir.string().c_str(), dir.string().size() + 1);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using a fixed-size buffer of 256 bytes for the server path (srv) can lead to a buffer overflow if the temporary directory path is very long (which is common in nested CI/CD environments). Using a dynamically sized std::vector<UNSIGNED8> avoids this risk entirely.

Suggested change
UNSIGNED8 srv[256];
std::memcpy(srv, dir.string().c_str(), dir.string().size() + 1);
std::vector<UNSIGNED8> srv(dir.string().begin(), dir.string().end());
srv.push_back(0);

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched srv to a std::vector sized to the path. Note: building it directly from dir.string().begin()/.end() reads iterators into two distinct temporaries (length_error at runtime), so the string is materialised once first. (769bc15)

… test

Addresses the Gemini Code Assist review on the PR:

- set_expression now rejects a key or FOR expression that does not fit the
  510-byte sub-header pool (error 6106) instead of letting write_expr_pool_
  silently clamp it with std::min — a clipped key expression is a corrupt
  index and a clipped FOR changes which rows are indexed. Restores the
  fail-loud guard the removed set_condition used to enforce, now covering
  both halves of the pool. Adds a unit case asserting the rejection.

- The test's server-path buffer is a std::vector sized to the path instead
  of a fixed 256-byte array, so a long temp dir (nested CI) can't overflow.
  (The materialised string avoids building the vector from iterators into
  two distinct std::string temporaries.)

Full unit suite 788/788.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Admnwk Admnwk merged commit 5dc77f8 into FiveTechSoft:main Jun 26, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant