Skip to content

fix: correct broken ISIN checksum (validator accepted any check digit)#464

Open
DucMinhNe wants to merge 1 commit into
python-validators:masterfrom
DucMinhNe:fix/isin-checksum
Open

fix: correct broken ISIN checksum (validator accepted any check digit)#464
DucMinhNe wants to merge 1 commit into
python-validators:masterfrom
DucMinhNe:fix/isin-checksum

Conversation

@DucMinhNe

Copy link
Copy Markdown

Problem

_isin_checksum never actually validates the check digit. It computes a per-character val inside the loop but never adds it to check:

check, val = 0, None
for idx in range(12):
    ...
    val = ...        # computed
    if idx & 1:
        val += val   # still never added to check
return (check % 10) == 0   # check is always 0 -> always True

So (check % 10) == 0 is always True and isin() accepts any 12-character string with valid characters, regardless of the check digit:

isin('US0378331006')  # wrong check digit -> True (should be invalid)
isin('US0004026251')  # wrong check digit -> True
isin('AAAAAAAAAAAA')  # -> True

The existing "invalid" test cases (010378331005, XCVF, 00^^^1234, A000009) are all rejected earlier by the length / character / idx > 1 guards, so the checksum branch was effectively untested and the bug stayed hidden.

Fix

Implement the actual ISO 6166 algorithm: the check digit must be numeric, expand each letter to its two-digit value (A=10 .. Z=35), then run a right-to-left Luhn sum over the expanded digit string.

I validated the new implementation against python-stdnum's authoritative isin.is_valid over 100k+ country-code-valid random inputs plus random fuzzing: zero mismatches.

Tests

  • Added wrong-check-digit cases (US0378331006, US0004026251, GB0002634947) to the invalid set. These fail on master and pass with the fix.
  • Replaced JP000K0VF054 in the valid set: it is not a valid ISIN (its check digit is wrong), it only passed before because the checksum did nothing. Substituted real valid ISINs (GB0002634946, DE000BAY0017).

pytest tests/test_finance.py is green (25 passed); ruff check and ruff format --check clean.

_isin_checksum computed a per-character val in the loop but never added it
into check, which stayed 0. So (check % 10) == 0 was always True and isin()
accepted any 12-character string with valid characters regardless of its
check digit. The existing invalid-ISIN tests all tripped the earlier length
or character guards, so the checksum branch was never actually exercised.

Reimplement the ISO 6166 algorithm: require a numeric check digit, expand
each letter to its two-digit value (A=10..Z=35), then run a right-to-left
Luhn sum over the expanded digits. Validated against python-stdnum's
isin.is_valid over 100k+ country-code-valid inputs with zero mismatches.
Also dropped JP000K0VF054 from the valid samples (it is not a valid ISIN;
it only passed because the checksum did nothing) and added wrong-check-digit
cases to the invalid set.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant