PDF stage 3.4: bare CFF (/FontFile3 / Type1C) #551
Draft
andiwand wants to merge 4 commits into
Seed the stage-3.4 branch with the detailed design that precedes implementation (roadmap in src/odr/internal/pdf/AGENTS.md). Read a bare CFF into abstract::Font, wrap to OTF reusing the 3.1 PUA pipeline, wire into PDF @font-face reusing 3.3. Implementation follows. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
First implementation piece of 3.4: a bare-CFF abstract::Font reader. Parses the CFF structure (INDEX/DICT primitives, Name/Top-DICT/String INDEXes, Top DICT, charset formats 0/1/2, CharStrings INDEX, Private DICT) for the abstract::Font facts — glyph count, units-per-em (FontMatrix), bbox, advance widths (Type2 charstring leading-width extraction + default/nominalWidthX), glyph names (custom String-INDEX SIDs), CID-keyed facts — while the raw CFF bytes pass through for later verbatim embedding as a `CFF ` table. Reverse map (code_point_for_glyph) currently covers the algorithmic uniXXXX/uXXXXXX names; the 391-entry standard-strings table and full AGL hookup are follow-ups (see TODO + design doc). OTF wrap + PDF /FontFile3 wiring land next. Tests: a hand-built minimal CFF (assertion-based, no fixtures) covering the facts, custom-name resolution, charstring width vs. default, and magic. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
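The algorithmic `uniXXXX`/`uXXXXXX` names mentioned above can be decoded without any lookup table. The following is a minimal sketch of that decoding, not the code from this PR: the helper names `parse_upper_hex` and `code_point_from_algorithmic_name` are hypothetical, and the sketch assumes the Adobe Glyph List naming rules (uppercase hex only, surrogates rejected) and handles only single-code-point names.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical helper: parses uppercase hexadecimal digits only.
// Lowercase digits and any other character are rejected.
inline std::optional<std::uint32_t> parse_upper_hex(std::string_view digits) {
  if (digits.empty()) {
    return std::nullopt;
  }
  std::uint32_t value = 0;
  for (const char c : digits) {
    std::uint32_t nibble = 0;
    if (c >= '0' && c <= '9') {
      nibble = static_cast<std::uint32_t>(c - '0');
    } else if (c >= 'A' && c <= 'F') {
      nibble = static_cast<std::uint32_t>(c - 'A') + 10;
    } else {
      return std::nullopt;
    }
    value = (value << 4) | nibble;
  }
  return value;
}

// A Unicode scalar value is any code point up to U+10FFFF except surrogates.
inline bool is_scalar_value(std::uint32_t cp) {
  return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// Hypothetical stand-in for the algorithmic part of the reverse map:
//   "uni" + exactly 4 hex digits   -> one BMP code point
//   "u"   + 4 to 6 hex digits      -> one code point up to U+10FFFF
// Multi-code-point "uni" sequences (8+ digits) are deliberately not handled.
inline std::optional<std::uint32_t>
code_point_from_algorithmic_name(std::string_view name) {
  std::optional<std::uint32_t> cp;
  if (name.size() == 7 && name.substr(0, 3) == "uni") {
    cp = parse_upper_hex(name.substr(3));
  } else if (name.size() >= 5 && name.size() <= 7 && name[0] == 'u') {
    cp = parse_upper_hex(name.substr(1));
  }
  if (cp && is_scalar_value(*cp)) {
    return cp;
  }
  return std::nullopt;
}
```

Under these assumptions, `uni0041` and `u1F600` decode to U+0041 and U+1F600, while `uniD800` (a surrogate) and a plain name such as `A` yield no code point, which is why standard names need a table-driven lookup.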
f479767 to 83c13af
Wire embedded CFF fonts end to end. cff::wrap_to_otf synthesizes the SFNT skeleton (head/hhea/maxp v0.5/hmtx/name/post/OS/2) from the abstract::Font facts and embeds the CFF verbatim as a `CFF ` table, with a uniform PUA cmap (pua_code_point(glyph) -> glyph) baked in — reusing the 3.1 serializers (build_sfnt, serialize_cmap/post/os2). load_embedded_font now reads /FontFile3 (bare CFF -> CffFont, or a full SFNT -> SfntFont via magic), and the HTML font_family path wraps a CffFont the same way it re-encodes an SfntFont, so embedded CFF renders real glyphs via @font-face with the transparent Unicode selection layer. Tests: wrap_to_otf round-trips through SfntFont (OTTO, glyph count, PUA cmap, synthesized hmtx widths). Full font + PDF HTML-output corpus green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
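Distinguishing a bare CFF from a full SFNT "via magic" can be illustrated by looking at the first four bytes of the stream. This is a sketch under assumptions, not the logic of `load_embedded_font`: the names `FontContainer` and `sniff_font_container` are hypothetical, and the CFF check (major version 1, header size of at least 4, offset size 1 to 4) is one plausible heuristic based on the CFF header layout.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical classification of a /FontFile3 stream.
enum class FontContainer { sfnt, bare_cff, unknown };

// Hypothetical sniffing helper: inspects the first four bytes only.
inline FontContainer
sniff_font_container(const std::vector<std::uint8_t> &data) {
  if (data.size() < 4) {
    return FontContainer::unknown;
  }

  // SFNT containers start with a big-endian 32-bit version tag.
  const std::uint32_t tag = (static_cast<std::uint32_t>(data[0]) << 24) |
                            (static_cast<std::uint32_t>(data[1]) << 16) |
                            (static_cast<std::uint32_t>(data[2]) << 8) |
                            static_cast<std::uint32_t>(data[3]);
  switch (tag) {
  case 0x00010000: // TrueType outlines
  case 0x4F54544F: // 'OTTO', OpenType with CFF outlines
  case 0x74727565: // 'true', Apple TrueType
    return FontContainer::sfnt;
  default:
    break;
  }

  // CFF header: major, minor, hdrSize, offSize (one byte each).
  // Assumption: only major version 1 is accepted here.
  const bool major_ok = data[0] == 1;
  const bool header_size_ok = data[2] >= 4;
  const bool offset_size_ok = data[3] >= 1 && data[3] <= 4;
  if (major_ok && header_size_ok && offset_size_ok) {
    return FontContainer::bare_cff;
  }
  return FontContainer::unknown;
}
```

The two checks cannot collide: every SFNT tag listed starts with a byte other than 1, whereas a version 1 CFF header always starts with 1.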
Close the CFF reverse-map gap. Generate the 391-entry CFF Standard Strings
table (Adobe TN #5176 Appendix A) as committed C++
(tools/font/generate_cff_standard_strings.py -> cff_standard_strings.{hpp,cpp},
mirroring the pdf encoding-data generator), so charset SIDs < 391 resolve to
their glyph names. CffFont::code_point_for_glyph now maps glyph -> name ->
Unicode through the real Adobe Glyph List (pdf::glyph_name_to_unicode) instead
of the algorithmic uni-names only — so a name-keyed CFF's reverse map and
glyph_for_code_point work for standard glyph names (e.g. "A" -> U+0041), which
sharpens simple-font glyph selection for embedded CFF (and, downstream, Type1).
This intentionally makes the font module depend on the pdf module for the AGL
(decision 2026-06-23): the AGL is a font-domain table that lives in pdf today,
and the static lib has no link cycle.
Test: a standard-SID ("A") glyph resolves its name and round-trips through the
AGL reverse/forward maps. Full font + pdf + html corpus green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
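The SID lookup described in this commit can be sketched as follows: SIDs below 391 index the standard strings table, and higher SIDs index the font's own String INDEX after subtracting 391. This is an illustration only. `resolve_sid` and `kStandardStringCount` are hypothetical names, and both tables are passed in as plain vectors rather than using the generated `cff_standard_strings` data.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Number of predefined CFF standard strings (SIDs 0 through 390).
constexpr std::size_t kStandardStringCount = 391;

// Hypothetical helper: resolves a string identifier (SID) to a glyph name.
// `standard_strings` stands in for the generated standard strings table;
// `custom_strings` stands in for the entries of the font's String INDEX.
inline std::optional<std::string>
resolve_sid(std::uint16_t sid, const std::vector<std::string> &standard_strings,
            const std::vector<std::string> &custom_strings) {
  if (sid < kStandardStringCount) {
    // Standard SID: direct lookup, guarded in case the table is incomplete.
    if (sid < standard_strings.size()) {
      return standard_strings[sid];
    }
    return std::nullopt;
  }
  // Custom SID: the String INDEX is numbered starting at 391.
  const std::size_t index = sid - kStandardStringCount;
  if (index < custom_strings.size()) {
    return custom_strings[index];
  }
  return std::nullopt;
}
```

With a complete standard table, SID 34 resolves to "A" and SID 391 resolves to the first custom string. The resolved name is what a name-to-Unicode step such as the AGL lookup would then consume.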
Fourth piece of stage 3: embedded CFF fonts (`/FontFile3` Type1C / CIDFontType0C, OpenType-CFF). Stacked on #550 (3.3). Draft.

Design: `docs/design/pdf/stage-3.4-cff.md`.

Landed
- `cff::CffFont : abstract::Font`: parses the CFF structure (INDEX/DICT, Name/Top-DICT/String INDEXes, charset 0/1/2, CharStrings INDEX, Private DICT) for the `abstract::Font` facts. Raw bytes pass through for the `CFF ` embed.
- `cff::wrap_to_otf`: synthesizes the SFNT skeleton (head/hhea/maxp/hmtx/name/post/OS/2), embeds `CFF ` verbatim, and bakes in the uniform PUA cmap (reuses the 3.1 serializers).
- `/FontFile3` loading (bare CFF vs. full SFNT via magic) + HTML `@font-face` wiring.
- The 391-entry CFF Standard Strings table, generated as committed C++ (`tools/font/generate_cff_standard_strings.py` → `cff_standard_strings.{hpp,cpp}`), and `code_point_for_glyph` now maps glyph → name → Unicode through the real Adobe Glyph List. Decision (2026-06-23): the font module depends on the pdf module for the AGL (a font-domain table that lives in pdf today; no link cycle).

Tests
CFF facts, custom + standard glyph-name resolution, the AGL reverse/forward maps, charstring widths, magic, and `wrap_to_otf` round-tripping through `SfntFont`. Full font + pdf + html corpus green.

🤖 Generated with Claude Code