Skip to content

fix: stop Oracle decode crashes and mislabeled MongoDB TLS errors (#483, #1418)#1723

Merged
datlechin merged 2 commits into
mainfrom
fix/483-1418-oracle-decoder-mongo-tls
Jun 19, 2026
Merged

fix: stop Oracle decode crashes and mislabeled MongoDB TLS errors (#483, #1418)#1723
datlechin merged 2 commits into
mainfrom
fix/483-1418-oracle-decoder-mongo-tls

Conversation

@datlechin

Copy link
Copy Markdown
Member

Fixes two crash/UX reports from #483 (Oracle) and #1418 (MongoDB).

#483 — Oracle connects, then crashes the app during fetch

Root cause: the oracle-nio decode path trapped (force-unwraps, preconditionFailure, an array out-of-bounds) on backend bytes it didn't expect. A trap on the NIO background thread takes the whole app down. Earlier fixes only hardened the handshake path; the remaining crashes were in the row-decode path on Oracle 12.1/19c/23ai.

oracle-nio fork (TableProApp/oracle-nio, branch fix/483-backend-decode-traps, commit 54b73e30, which this PR re-pins to):

  • OracleBackendMessage+RowData.swift — the real culprit: bitVector[Int(byteNumber)] is now bounds-checked and throws; the rowData/processBindRow preconditionFailures and two readSlice(...)! now throw.
  • +BitVector.swift, +Parameter.swift, BackendError.swift, and the decoder's two ! sites + catch-all preconditionFailure now throw catchable errors.
  • New RowDataTrapHardeningTests (3 tests, passing) assert these throw instead of trapping.

In-app:

  • Channel-fatal decode/connection errors now invalidate the connection state (stops querying a dead channel; health monitor reconnects) and surface a clear message via a new .protocolError category with a retry/reconnect diagnostic, instead of a raw redacted dump.
  • New OracleConnectionErrorTests covers the classification.

#1418 — MongoDB Atlas TLS handshake mislabeled

The original NaN crash was already fixed in #1423. The reporter's post-update error was an Atlas handshake failure (internal error -9838, the old Secure Transport libmongoc.a) that classifySSLError mislabeled as a cipher/protocol mismatch.

  • MongoDBConnection.classifySSLError checks specific signatures first and reserves .cipherMismatch for genuine negotiation strings; a generic handshake failure (Atlas -9838) now maps to .unknown with no false cipher claim. Test mirror and cases updated with the real Atlas string.
  • The Atlas connection itself is fixed by the OpenSSL libmongoc.a already in main (TLS connect mongodb #1599); users get that via a MongoDB registry re-release. This PR is the error-mapping half.

Notes for review

  • Re-pins oracle-nio by exact revision in project.pbxproj + Package.resolved. The fork branch is pushed; it may want its own PR/merge (the app builds by revision regardless).
  • swiftlint clean on changed lines; fork tests pass via swift test.

Refs #483, #1418

@datlechin datlechin merged commit 07961dc into main Jun 19, 2026
3 of 4 checks passed
@datlechin datlechin deleted the fix/483-1418-oracle-decoder-mongo-tls branch June 19, 2026 06:09

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 75400208e7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 384 to +385
await queryGate.release()
throw OracleError(message: detail)
throw mapQueryError(sqlError)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Prevent queued Oracle calls from using the reset connection

When a channel-fatal OracleSQLError occurs while another executeQuery/streamQuery call is waiting on queryGate, this releases the gate before (and independently of) invalidating the wrapper, and waiting calls have already copied connection before acquiring the gate. The resumed waiter can therefore execute on the same poisoned OracleNIO connection instead of seeing notConnected/reconnect, so the new fatal-error reset doesn't actually stop queued work from querying a dead channel.

Useful? React with 👍 / 👎.

Comment on lines +833 to +840
let signatures = [
"no shared cipher",
"sslv3 alert handshake failure",
"wrong version number",
"unsupported protocol",
"no protocols available",
"alert protocol version",
"protocol version",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve bad-cipher handshakes as cipher mismatches

With this narrowed signature list, a genuine libmongoc message such as TLS handshake failed: bad cipher (the case the previous test covered) no longer matches and then falls through to the generic handshake branch as .unknown. Users in that cipher-negotiation failure case lose the cipher/protocol recovery guidance, so include the bad-cipher wording in the mismatch signatures rather than treating it as unknown.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant