
PDF stage 3.5: Type1 (/FontFile) #552

Draft
andiwand wants to merge 6 commits into pdf-stage-3.4-cff from pdf-stage-3.5-type1

andiwand commented Jun 23, 2026


Fifth piece of stage 3 — embedded Type1 fonts (/FontFile). Stacked on the 3.4 CFF PR. Now functionally complete end-to-end (still draft pending review).

Design: docs/design/pdf/stage-3.5-type1.md.

Landed (the full Type1 pipeline)

  1. font::type1 decryption (type1_crypt) — eexec (key 55665, binary + ASCII-hex) and charstring (key 4330, /lenIV).
  2. Type1Program (type1_font) — split sections, strip PFB framing, parse header (/FontName, /FontMatrix, /FontBBox, /Encoding), decrypt eexec, extract glyph charstrings + /Subrs.
  3. Type1 → Type2 translation (type1_charstring) — stack machine: flatten callsubr, fold div, lift the hsbw side bearing, drop Type1-only hints, translate flex + seac.
  4. CFF builder (cff_builder) — serialize a CFF from Type2 charstrings (INDEX/DICT/charset/Private, single-pass layout).
  5. type1::to_cff assembles it all, .notdef first; /FontFile wiring reads it back as a CffFont, reusing the entire 3.4 CFF path (PUA re-encode, @font-face wrap, reverse map).

Tests

Each layer has assertion-based tests: non-circular cipher round-trips, exact Type2 output, a CFF round-trip through the reader, and Type1→CFF→OTTO end to end. The full font + PDF + HTML corpus is green (460 tests).

Known follow-up

Simple-font glyph selection by PostScript name (PDF /Encoding → name → glyph) is the shared CFF/Type1 item tied to the AGL / name-mapping decision; composite fonts and the wrap/display path work today.

🤖 Generated with Claude Code

andiwand force-pushed the pdf-stage-3.5-type1 branch from 289f85a to dccb1d9 on June 23, 2026 19:23
andiwand force-pushed the pdf-stage-3.4-cff branch from f479767 to 83c13af on June 23, 2026 19:32
andiwand force-pushed the pdf-stage-3.5-type1 branch 2 times, most recently from 424f31f to 75d240f, on June 23, 2026 19:46
andiwand and others added 6 commits June 23, 2026 23:05
Seed the stage-3.5 branch. Read a Type1 program (eexec + charstring
decryption), translate Type1 -> Type2 charstrings, build a CFF and reuse
3.4's CFF -> OTF path; reverse map via glyph names -> AGL. Stacked on 3.4.
Implementation follows.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
First self-contained piece of 3.5: the Type1 running-key cipher
(font::type1::decrypt) and its two entry points — decrypt_eexec (key 55665,
4-byte skip, binary or ASCII-hex/PFA auto-detected) and decrypt_charstring
(key 4330, /lenIV-aware). These don't depend on the CFF translation work, so
they land ahead of the full Type1Font reader (eexec parse + Type1->Type2
charstring translation -> reuse 3.4's CFF->OTF path).

Tests: round-trips against an independent forward-cipher reference (so they're
not circular), the lenIV override, and the hex eexec form.
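
For reference, the running-key cipher described above is small enough to sketch in full. The constants (52845, 22719) and the two keys come from the Adobe Type 1 font format; the `sketch` namespace, the function signatures, and the `encrypt` helper (standing in for the "independent forward-cipher reference" used by the tests) are assumptions for illustration, not the actual `font::type1` API. ASCII-hex detection is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {  // hypothetical namespace, not the project's

constexpr std::uint16_t kC1 = 52845;
constexpr std::uint16_t kC2 = 22719;
constexpr std::uint16_t kEexecKey = 55665;
constexpr std::uint16_t kCharstringKey = 4330;

// Advance the 16-bit running key using the cipher byte just processed.
inline std::uint16_t next_key(std::uint16_t r, std::uint8_t cipher_byte) {
  // Unsigned 32-bit arithmetic avoids signed overflow; the cast wraps mod 65536.
  return static_cast<std::uint16_t>(
      (static_cast<std::uint32_t>(cipher_byte) + r) * kC1 + kC2);
}

// Forward cipher: only used as an independent reference in tests.
inline std::vector<std::uint8_t> encrypt(const std::vector<std::uint8_t>& plain,
                                         std::uint16_t key) {
  std::vector<std::uint8_t> out;
  out.reserve(plain.size());
  std::uint16_t r = key;
  for (std::uint8_t p : plain) {
    const auto c = static_cast<std::uint8_t>(p ^ (r >> 8));
    r = next_key(r, c);
    out.push_back(c);
  }
  return out;
}

// Decrypt and drop the first `skip` plaintext bytes (the random prefix).
inline std::vector<std::uint8_t> decrypt(const std::vector<std::uint8_t>& cipher,
                                         std::uint16_t key, std::size_t skip) {
  std::vector<std::uint8_t> out;
  std::uint16_t r = key;
  for (std::size_t i = 0; i < cipher.size(); ++i) {
    const auto p = static_cast<std::uint8_t>(cipher[i] ^ (r >> 8));
    r = next_key(r, cipher[i]);
    if (i >= skip) out.push_back(p);
  }
  return out;
}

// Binary eexec section: key 55665, 4 prefix bytes skipped.
inline std::vector<std::uint8_t> decrypt_eexec(const std::vector<std::uint8_t>& cipher) {
  return decrypt(cipher, kEexecKey, 4);
}

// Charstring: key 4330, prefix length taken from /lenIV (4 when absent).
inline std::vector<std::uint8_t> decrypt_charstring(
    const std::vector<std::uint8_t>& cipher, std::size_t len_iv = 4) {
  return decrypt(cipher, kCharstringKey, len_iv);
}

}  // namespace sketch
```

Because `encrypt` and `decrypt` are written separately, a round-trip test exercises the decryptor against something other than itself, which is the point of the non-circular test setup.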

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
Parse an Adobe Type1 font program into its decrypted parts: split the
clear-text header / eexec section / trailer, read /FontName, /FontMatrix,
/FontBBox and /Encoding (StandardEncoding or a custom dup-code-name-put
array) from the header, decrypt the eexec section (type1_crypt) and extract
every glyph's decrypted charstring plus /Subrs (RD/-| binary entries,
/lenIV-aware). PFB segment framing is stripped if present. Charstrings are
not yet interpreted — that's the Type1->Type2 translation that follows,
feeding 3.4's CFF->OTF path.

Tests: a hand-built encrypted Type1 program (independent forward cipher) —
magic, header/encoding parse, and the decrypted charstrings/subrs.
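
The PFB framing mentioned above is a simple segment wrapper: each segment starts with the marker byte 0x80, a type byte (1 = ASCII, 2 = binary, 3 = end of file), and, for types 1 and 2, a 4-byte little-endian payload length. A minimal sketch of stripping it follows; `strip_pfb` and the `sketch` namespace are hypothetical names, and the lenient handling of truncated input is an assumption rather than the PR's behaviour.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {  // hypothetical namespace, not the project's

// Return the concatenated segment payloads of a PFB file.
// Input that does not start with the 0x80 marker is returned unchanged
// (assumed to be a PFA or a bare PDF /FontFile stream).
inline std::vector<std::uint8_t> strip_pfb(const std::vector<std::uint8_t>& in) {
  if (in.empty() || in[0] != 0x80) {
    return in;
  }
  std::vector<std::uint8_t> out;
  std::size_t pos = 0;
  while (pos + 2 <= in.size() && in[pos] == 0x80) {
    const std::uint8_t type = in[pos + 1];
    if (type != 1 && type != 2) break;  // 3 = EOF; anything else: stop
    if (pos + 6 > in.size()) break;     // truncated segment header
    std::size_t len = static_cast<std::size_t>(in[pos + 2]) |
                      (static_cast<std::size_t>(in[pos + 3]) << 8) |
                      (static_cast<std::size_t>(in[pos + 4]) << 16) |
                      (static_cast<std::size_t>(in[pos + 5]) << 24);
    pos += 6;
    len = std::min(len, in.size() - pos);  // clamp to the bytes actually present
    const auto first = in.begin() + static_cast<std::ptrdiff_t>(pos);
    out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(len));
    pos += len;
  }
  return out;
}

}  // namespace sketch
```

After stripping, the clear-text header, eexec section, and trailer sit back to back, so the section split can work the same way for PFB and non-PFB input.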

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
cff::build_cff serializes a name-keyed CFF from a list of (name, Type2
charstring) glyphs + default/nominalWidthX + bbox: Header, Name INDEX, Top
DICT (FontBBox + charset/CharStrings/Private offsets, fixed-width so the
layout resolves in one pass), String INDEX (every glyph name as a custom SID,
so no standard-strings table is needed), empty Global Subr INDEX, CharStrings
INDEX, format-0 charset, Private DICT. This is the assembly target for the
Type1 -> CFF path: the translated Type2 charstrings land here, the result
feeds CffFont + wrap_to_otf (3.4).

Test: build a 2-glyph CFF, read it back through CffFont (name, glyph name,
bbox, charstring width vs. default) and confirm it wraps to a loadable OTTO.
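
Two CFF building blocks carry most of the layout described above: the INDEX structure and the fixed-width DICT integer. The sketch below follows the CFF format (big-endian count, `offSize`, 1-based offsets, then data; operand prefix byte 29 followed by a 32-bit value). The function names and the `sketch` namespace are hypothetical, and it assumes at most 65535 items per INDEX.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {  // hypothetical namespace, not the project's

// DICT integer operand in its 5-byte form: 29, then a big-endian int32.
// Always using this form keeps the Top DICT size independent of the
// offset values stored in it.
inline void put_dict_int32(std::vector<std::uint8_t>& out, std::int32_t v) {
  const auto u = static_cast<std::uint32_t>(v);
  out.push_back(29);
  for (int shift = 24; shift >= 0; shift -= 8) {
    out.push_back(static_cast<std::uint8_t>((u >> shift) & 0xFF));
  }
}

// Serialize a CFF INDEX: count, offSize, count+1 offsets, object data.
inline std::vector<std::uint8_t> build_index(
    const std::vector<std::vector<std::uint8_t>>& items) {
  std::vector<std::uint8_t> out;
  const std::size_t count = items.size();  // assumed to fit in 16 bits
  out.push_back(static_cast<std::uint8_t>((count >> 8) & 0xFF));
  out.push_back(static_cast<std::uint8_t>(count & 0xFF));
  if (count == 0) return out;  // an empty INDEX is just the 2-byte count

  std::size_t total = 0;
  for (const auto& item : items) total += item.size();
  const std::size_t last = total + 1;  // offsets are 1-based
  const int off_size = last <= 0xFF ? 1 : last <= 0xFFFF ? 2
                     : last <= 0xFFFFFF ? 3 : 4;
  out.push_back(static_cast<std::uint8_t>(off_size));

  auto put_offset = [&](std::size_t off) {
    for (int shift = 8 * (off_size - 1); shift >= 0; shift -= 8) {
      out.push_back(static_cast<std::uint8_t>((off >> shift) & 0xFF));
    }
  };
  std::size_t off = 1;
  put_offset(off);
  for (const auto& item : items) {
    off += item.size();
    put_offset(off);
  }
  for (const auto& item : items) {
    out.insert(out.end(), item.begin(), item.end());
  }
  return out;
}

}  // namespace sketch
```

With every offset operand written through `put_dict_int32`, the Top DICT has a known byte length before the charset, CharStrings, and Private positions are computed, which is one way to get the single-pass layout the commit message describes.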

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
type1::to_type2 translates a decrypted Type1 charstring to Type2 (CFF): a
stack machine that flattens callsubr (inlining the font's /Subrs, depth
guarded), folds div, lifts the hsbw side bearing into the first moveto and
returns the advance width separately, drops Type1-only hints (dotsection,
*stem3, hint-replacement OtherSubr 3), and translates the flex OtherSubrs
(1/2/0 -> two rrcurvetos) and seac (-> Type2 endchar form). Path operators
(r/h/v lineto, rr/vh/hv curveto, stems, moves, endchar) share opcodes with
Type2 and pass through. Best-effort / display-oriented: hints affect
rendering quality, not glyph shape.

Tests: exact Type2 output for hsbw width + side-bearing folding into the
first move, callsubr inlining, and div folding.
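
To make the hsbw and div handling concrete, here is a deliberately reduced sketch of the stack machine. It works on already-tokenized charstrings rather than bytes and covers only div folding, hsbw lifting, and pass-through; callsubr, flex, seac, and hint dropping are omitted. `Token`, `Translated`, `translate`, and the `sketch` namespace are hypothetical and do not mirror the real `type1::to_type2` signature.

```cpp
#include <string>
#include <vector>

namespace sketch {  // hypothetical namespace, not the project's

struct Token {
  bool is_op;        // true: operator, false: number
  double value;      // used when !is_op
  std::string name;  // used when is_op
};

inline Token number(double v) { return Token{false, v, ""}; }
inline Token oper(const std::string& n) { return Token{true, 0.0, n}; }

struct Translated {
  std::vector<Token> type2;  // translated charstring tokens
  double width = 0.0;        // advance width from hsbw, returned separately
};

inline Translated translate(const std::vector<Token>& type1) {
  Translated result;
  std::vector<double> stack;
  double sbx = 0.0;
  bool first_move_pending = false;

  for (const Token& t : type1) {
    if (!t.is_op) {
      stack.push_back(t.value);
      continue;
    }
    if (t.name == "div" && stack.size() >= 2) {
      // Fold "a b div" into the single number a / b.
      const double b = stack.back();
      stack.pop_back();
      const double a = stack.back();
      stack.pop_back();
      stack.push_back(b != 0.0 ? a / b : 0.0);
      continue;
    }
    if (t.name == "hsbw" && stack.size() >= 2) {
      // hsbw: "sbx wx hsbw". Remember both, emit nothing.
      sbx = stack[0];
      result.width = stack[1];
      first_move_pending = true;
      stack.clear();
      continue;
    }
    std::string name = t.name;
    const bool is_move =
        name == "rmoveto" || name == "hmoveto" || name == "vmoveto";
    if (first_move_pending && is_move && !stack.empty()) {
      if (name == "vmoveto") {
        // A vertical move cannot carry dx, so widen it to rmoveto.
        if (sbx != 0.0) {
          stack.insert(stack.begin(), sbx);
          name = "rmoveto";
        }
      } else {
        stack[0] += sbx;  // dx is the first operand of rmoveto / hmoveto
      }
      first_move_pending = false;
    }
    // Operators shared with Type2 pass through with their operands.
    for (double v : stack) result.type2.push_back(number(v));
    result.type2.push_back(oper(name));
    stack.clear();
  }
  return result;
}

}  // namespace sketch
```

For example, `50 600 hsbw 100 200 rmoveto 300 2 div 0 rlineto endchar` comes out as `150 200 rmoveto 150 0 rlineto endchar` with a width of 600. The sketch assumes the first path operator after hsbw is a moveto; a charstring that starts drawing without one would need an explicit move inserted.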

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
type1::to_cff translates every glyph (to_type2, flattening /Subrs), places
.notdef at glyph 0 (synthesizing one when absent) and assembles a CFF via the
builder. load_embedded_font now reads /FontFile: parse the Type1 program,
convert to CFF, and hold it as a CffFont — so embedded Type1 reuses the entire
3.4 CFF path (PUA re-encode, @font-face wrap, reverse map) with no new
abstract::Font subclass.
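
The .notdef placement can be illustrated with a small ordering step run before the glyph list is handed to the CFF builder. This is a sketch under assumptions: `Glyph`, `order_glyphs`, and the `sketch` namespace are hypothetical names, and the synthesized .notdef is assumed to be an empty outline consisting of a lone Type2 endchar (opcode 14), which may differ from what the PR emits.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

namespace sketch {  // hypothetical namespace, not the project's

struct Glyph {
  std::string name;
  std::vector<std::uint8_t> charstring;  // Type2 charstring bytes
};

// Ensure ".notdef" is glyph 0, keeping the other glyphs in their order.
inline std::vector<Glyph> order_glyphs(std::vector<Glyph> glyphs) {
  const auto it = std::find_if(
      glyphs.begin(), glyphs.end(),
      [](const Glyph& g) { return g.name == ".notdef"; });
  if (it == glyphs.end()) {
    // Assumption: an empty glyph is just endchar (Type2 opcode 14).
    glyphs.insert(glyphs.begin(), Glyph{".notdef", {14}});
  } else {
    // Move the existing .notdef to the front without reordering the rest.
    std::rotate(glyphs.begin(), it, it + 1);
  }
  return glyphs;
}

}  // namespace sketch
```

Keeping .notdef at index 0 matters because CFF charsets omit glyph 0 and readers treat it as .notdef implicitly.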

Simple-font glyph selection by PostScript name (PDF /Encoding -> name -> glyph)
is the shared CFF/Type1 follow-up tied to the AGL/name-mapping decision;
composite fonts and the wrap/display path work today.

Tests: a Type1 program converts to a CFF that reads back through CffFont
(glyph count incl. synthesized .notdef, names) and wraps to a loadable OTTO.
Full font + PDF + HTML corpus green (460 tests).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
andiwand force-pushed the pdf-stage-3.5-type1 branch from f9351ef to 29cdc2f on June 23, 2026 21:05