From 6397b2e7aee8ff8dc23598bbbae67f2fb0361f7f Mon Sep 17 00:00:00 2001 From: Andreas Stefl Date: Tue, 23 Jun 2026 18:20:05 +0200 Subject: [PATCH 1/6] =?UTF-8?q?PDF=20stage=203.5:=20design=20=E2=80=94=20T?= =?UTF-8?q?ype1=20(/FontFile)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Seed the stage-3.5 branch. Read a Type1 program (eexec + charstring decryption), translate Type1 -> Type2 charstrings, build a CFF and reuse 3.4's CFF -> OTF path; reverse map via glyph names -> AGL. Stacked on 3.4. Implementation follows. Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz --- docs/design/pdf/stage-3.5-type1.md | 79 ++++++++++++++++++++++++++++++ 1 file changed, 79 insertions(+) create mode 100644 docs/design/pdf/stage-3.5-type1.md diff --git a/docs/design/pdf/stage-3.5-type1.md b/docs/design/pdf/stage-3.5-type1.md new file mode 100644 index 00000000..8028869f --- /dev/null +++ b/docs/design/pdf/stage-3.5-type1.md @@ -0,0 +1,79 @@ +# PDF stage 3.5 — Type1 (`/FontFile`) + +Design for the Type1 font sub-stage. Status: **design draft** (no implementation +yet — this PR seeds the branch). Roadmap entry lives in +[`src/odr/internal/pdf/AGENTS.md`](../../../src/odr/internal/pdf/AGENTS.md). + +Stacked on **3.4** — the whole point is to reuse 3.4's CFF → OTF path, so Type1 +support is "translate Type1 to CFF, then everything downstream is 3.4". + +## Goal + +Read a PDF `/FontFile` (a Type1 / PostScript font program) and render it through +the same `@font-face` + dual-layer pipeline as TrueType (3.3) and CFF (3.4). The +hardest single font piece, but precisely specified (Adobe *Type 1 Font Format* +T1 spec; pdf.js as a reference implementation). + +## What gets read (`internal/font/type1_font.{hpp,cpp}`) + +`/FontFile` has three parts sized by the descriptor's `/Length1` (clear ASCII), +`/Length2` (binary eexec), `/Length3` (trailer of zeros + `cleartomark`): + +1. 
**Clear text** — `/Encoding` (code → glyph name, or `StandardEncoding`), + `/FontMatrix`, `/FontBBox`, `/FontName`. +2. **eexec section** — decrypt with R = 55665 (skip the 4 random bytes), then + parse: + - **`/Subrs`** — index → (decrypted) charstring. + - **`/CharStrings`** — glyph name → charstring; each charstring decrypted + with R = 4330, `lenIV` (default 4) leading bytes dropped. +3. **Trailer** — ignored. + +PFB segment framing (`0x80` markers) is handled if present; PDF embeds the raw +three-segment form. + +## Type1 → Type2 charstring translation (the core) + +Translate each decrypted **Type1** charstring into a **Type2 (CFF)** charstring, +then build a CFF and hand it to 3.4's wrap. The non-trivial cases: + +- `hsbw` → seed the left side bearing + advance width, emit as the Type2 width + + initial `rmoveto`. +- `seac` (accented composite) → decompose into base + accent (StandardEncoding + lookup), or emit `endchar` with the seac operands (Type2 deprecated-seac form). +- `div`, `callsubr` / `return` (Subrs), and the `callothersubr` family — + **flex** (OtherSubrs 0–2) and **hint replacement** (OtherSubr 3) must be + interpreted/flattened, not passed through; this is the part that needs care. +- hint operators (`hstem`/`vstem`/`dotsection`) → Type2 equivalents (or drop; + display tolerates missing hints). + +Output: a `cff::CffFont` (3.4) built from the translated charstrings, a charset +from the glyph names, and a private dict carrying the widths. Everything after +that — OTF wrap, PUA re-encode, OTS gate — is 3.4 unchanged. + +## Reverse map + +Charstring **glyph names** → AGL → Unicode (reuse `pdf_encoding`), same shape as +CFF. A symbolic Type1 with a built-in encoding becomes selectable via this map. + +## PDF wiring (reuse 3.3 / 3.4) + +- `pdf_document_parser`: `/FontFile` → `Type1Font` → (translate) → `CffFont` → + `Font::embedded_font`. 
+- `Font::glyph_for_code` simple-font branch resolves code → glyph name via the + PDF `/Encoding` (Differences over base) or the font's built-in `/Encoding`, + then name → glyph id through the CFF charset. +- `to_unicode` reverse-map fallback and HTML dual-layer emission unchanged. + +## Scope / non-goals + +- CID-keyed Type1 (Type1 in a Type0, rare) — defer unless a corpus file needs it. +- Multiple Master Type1 — out of scope. +- Hinting fidelity is best-effort (display only). + +## Tests + +Font-only, assertion-based: a minimal hand-built (or frozen-literal) Type1 — +eexec + charstring decryption, an `hsbw` + a `flex`/hint-replacement charstring +translated and round-tripped through 3.4's CFF wrap and OTS, the glyph-name +reverse map. Plus a `pdf_document_parser` case: `/FontFile` → `embedded_font` +with Unicode recovery. From f9215ad13b747a8cb106873cba71cf133871b333 Mon Sep 17 00:00:00 2001 From: Andreas Stefl Date: Tue, 23 Jun 2026 21:48:29 +0200 Subject: [PATCH 2/6] PDF stage 3.5: Type1 eexec/charstring decryption primitives MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit First self-contained piece of 3.5: the Type1 running-key cipher (font::type1::decrypt) and its two entry points — decrypt_eexec (key 55665, 4-byte skip, binary or ASCII-hex/PFA auto-detected) and decrypt_charstring (key 4330, /lenIV-aware). These don't depend on the CFF translation work, so they land ahead of the full Type1Font reader (eexec parse + Type1->Type2 charstring translation -> reuse 3.4's CFF->OTF path). Tests: round-trips against an independent forward-cipher reference (so they're not circular), the lenIV override, and the hex eexec form. 
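For reviewers unfamiliar with the cipher, the round trip the tests rely on can be sketched as below. This is a minimal illustration, not the code in this patch: `sketch_encrypt`, `sketch_decrypt`, `kC1`, `kC2` and `check_round_trip` are hypothetical names, and the sketch multiplies in 32-bit unsigned arithmetic as an assumption about how to keep the key update well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical names for illustration only; the constants are the Type1 ones.
constexpr std::uint16_t kC1 = 52845;
constexpr std::uint16_t kC2 = 22719;

// Forward direction (plaintext -> ciphertext). The running key advances on the
// ciphertext byte that was just produced.
std::string sketch_encrypt(std::string_view plain, std::uint16_t r) {
  std::string out;
  out.reserve(plain.size());
  for (const char ch : plain) {
    const auto p = static_cast<std::uint8_t>(ch);
    const auto c = static_cast<std::uint8_t>(p ^ (r >> 8));
    out += static_cast<char>(c);
    // Multiply as unsigned 32-bit, then truncate to the 16-bit key.
    r = static_cast<std::uint16_t>(static_cast<std::uint32_t>(c + r) * kC1 + kC2);
  }
  return out;
}

// Reverse direction (ciphertext -> plaintext). The key advances on the
// ciphertext byte that was just consumed; the first `skip` plaintext bytes are
// random padding and are dropped.
std::string sketch_decrypt(std::string_view cipher, std::uint16_t r,
                           std::size_t skip) {
  std::string out;
  out.reserve(cipher.size());
  for (const char ch : cipher) {
    const auto c = static_cast<std::uint8_t>(ch);
    out += static_cast<char>(c ^ (r >> 8));
    r = static_cast<std::uint16_t>(static_cast<std::uint32_t>(c + r) * kC1 + kC2);
  }
  return skip >= out.size() ? std::string{} : out.substr(skip);
}

// Encrypt with a 4-byte prefix under the eexec key, then decrypt and compare.
void check_round_trip() {
  const std::string plain = "/Private 10 dict dup begin";
  const std::string cipher = sketch_encrypt("ABCD" + plain, 55665);
  assert(sketch_decrypt(cipher, 55665, 4) == plain);
}
```

Both directions update the key from the ciphertext byte, which is why a forward reference cipher is enough to keep the round-trip tests from being circular.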
Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz --- CMakeLists.txt | 1 + src/odr/internal/font/type1_crypt.cpp | 94 ++++++++++++++++++++++++++ src/odr/internal/font/type1_crypt.hpp | 34 ++++++++++ test/CMakeLists.txt | 1 + test/src/internal/font/type1_crypt.cpp | 65 ++++++++++++++++++ 5 files changed, 195 insertions(+) create mode 100644 src/odr/internal/font/type1_crypt.cpp create mode 100644 src/odr/internal/font/type1_crypt.hpp create mode 100644 test/src/internal/font/type1_crypt.cpp diff --git a/CMakeLists.txt b/CMakeLists.txt index ccbd8eab..f74f4945 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -201,6 +201,7 @@ set(ODR_SOURCE_FILES "src/odr/internal/font/cff_font.cpp" "src/odr/internal/font/cff_standard_strings.cpp" "src/odr/internal/font/cff_transform.cpp" + "src/odr/internal/font/type1_crypt.cpp" "src/odr/internal/font/sfnt_font.cpp" "src/odr/internal/font/sfnt_parser.cpp" "src/odr/internal/font/sfnt_transform.cpp" diff --git a/src/odr/internal/font/type1_crypt.cpp b/src/odr/internal/font/type1_crypt.cpp new file mode 100644 index 00000000..3805c631 --- /dev/null +++ b/src/odr/internal/font/type1_crypt.cpp @@ -0,0 +1,94 @@ +#include + +#include +#include +#include + +namespace odr::internal::font::type1 { + +namespace { + +constexpr std::uint16_t c1 = 52845; +constexpr std::uint16_t c2 = 22719; + +[[nodiscard]] bool is_hex_digit(const unsigned char c) { + return std::isxdigit(c) != 0; +} + +/// Hex-decode @p in, skipping whitespace; stops at the first non-hex, non-space +/// byte (the binary `eexec` form never reaches here). +[[nodiscard]] std::string hex_decode(std::string_view in) { + std::string out; + int high = -1; + for (const char ch : in) { + const auto c = static_cast<unsigned char>(ch); + if (std::isspace(c) != 0) { + continue; + } + if (!is_hex_digit(c)) { + break; + } + const int value = (c <= '9') ? c - '0' + : (c <= 'F') ? 
c - 'A' + 10 + : c - 'a' + 10; + if (high < 0) { + high = value; + } else { + out += static_cast<char>((high << 4) | value); + high = -1; + } + } + return out; +} + +/// Whether @p eexec is the ASCII-hex form: the first four non-space bytes are +/// all hex digits (Type1 spec 7.2 — the binary form is detected as not-this). +[[nodiscard]] bool looks_like_hex(std::string_view eexec) { + int seen = 0; + for (const char ch : eexec) { + const auto c = static_cast<unsigned char>(ch); + if (std::isspace(c) != 0) { + continue; + } + if (!is_hex_digit(c)) { + return false; + } + if (++seen == 4) { + return true; + } + } + return false; +} + +} // namespace + +std::string decrypt(const std::string_view cipher, const std::uint16_t key, + const std::size_t skip) { + std::uint16_t r = key; + std::string out; + out.reserve(cipher.size()); + for (const char ch : cipher) { + const auto c = static_cast<unsigned char>(ch); + out += static_cast<char>(c ^ (r >> 8)); + r = static_cast<std::uint16_t>(static_cast<std::uint32_t>(c + r) * c1 + c2); + } + if (skip >= out.size()) { + return {}; + } + return out.substr(skip); +} + +std::string decrypt_eexec(const std::string_view eexec) { + if (looks_like_hex(eexec)) { + const std::string binary = hex_decode(eexec); + return decrypt(binary, 55665, 4); + } + return decrypt(eexec, 55665, 4); +} + +std::string decrypt_charstring(const std::string_view charstring, + const std::size_t len_iv) { + return decrypt(charstring, 4330, len_iv); +} + +} // namespace odr::internal::font::type1 diff --git a/src/odr/internal/font/type1_crypt.hpp b/src/odr/internal/font/type1_crypt.hpp new file mode 100644 index 00000000..388d2f5e --- /dev/null +++ b/src/odr/internal/font/type1_crypt.hpp @@ -0,0 +1,34 @@ +#pragma once + +#include +#include +#include +#include + +namespace odr::internal::font::type1 { + +/// Type1 (Adobe Type 1 Font Format) `eexec` / charstring decryption. 
+/// +/// Both the `eexec`-encrypted portion of the font program and each individual +/// charstring use the same stream cipher with different keys: a 16-bit running +/// key `R`, constants c1 = 52845 / c2 = 22719, where each plaintext byte is +/// `cipher ^ (R >> 8)` and `R = (cipher + R) * c1 + c2` (mod 2^16). The leading +/// @p skip bytes of plaintext are random padding and discarded. + +/// Decrypt @p cipher with the running-key cipher seeded at @p key, discarding +/// the first @p skip plaintext bytes. +[[nodiscard]] std::string decrypt(std::string_view cipher, std::uint16_t key, + std::size_t skip); + +/// Decrypt the `eexec` section (key 55665, 4 random bytes). Accepts either the +/// binary form (PDF `/FontFile`, the `/Length2` portion) or the ASCII-hex form +/// (PFA fonts): when the section's leading bytes are all hex digits/whitespace +/// it is hex-decoded first. +[[nodiscard]] std::string decrypt_eexec(std::string_view eexec); + +/// Decrypt one charstring (key 4330), discarding @p len_iv leading bytes +/// (the font's `/lenIV`, default 4). 
+[[nodiscard]] std::string decrypt_charstring(std::string_view charstring, + std::size_t len_iv = 4); + +} // namespace odr::internal::font::type1 diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt index 95fe4cf8..9bcd4254 100644 --- a/test/CMakeLists.txt +++ b/test/CMakeLists.txt @@ -55,6 +55,7 @@ add_executable(odr_test "src/internal/pdf/pdf_test_file_builder.cpp" "src/internal/font/cff_font.cpp" + "src/internal/font/type1_crypt.cpp" "src/internal/font/sfnt_font.cpp" "src/internal/font/sfnt_transform.cpp" "src/internal/font/font_file.cpp" diff --git a/test/src/internal/font/type1_crypt.cpp b/test/src/internal/font/type1_crypt.cpp new file mode 100644 index 00000000..7bfbd4d8 --- /dev/null +++ b/test/src/internal/font/type1_crypt.cpp @@ -0,0 +1,65 @@ +#include + +#include + +#include +#include + +using namespace odr::internal::font::type1; + +namespace { + +/// Independent reference implementation of the Type1 *encryption* (the inverse +/// of `decrypt`), so the round-trip tests are not circular: this codes the +/// cipher forwards (plaintext -> ciphertext), `decrypt` codes it backwards. 
+std::string encrypt(const std::string &plain, std::uint16_t r, + const std::string &random_prefix) { + constexpr std::uint16_t c1 = 52845; + constexpr std::uint16_t c2 = 22719; + std::string out; + const std::string full = random_prefix + plain; + for (const char ch : full) { + const auto p = static_cast<std::uint8_t>(ch); + const auto cipher = static_cast<std::uint8_t>(p ^ (r >> 8)); + out += static_cast<char>(cipher); + r = static_cast<std::uint16_t>(static_cast<std::uint32_t>(cipher + r) * c1 + c2); + } + return out; +} + +} // namespace + +TEST(Type1CryptTest, EexecRoundTrip) { + const std::string plain = "/Private 10 dict dup begin"; + const std::string cipher = encrypt(plain, 55665, "ABCD"); + EXPECT_EQ(decrypt_eexec(cipher), plain); +} + +TEST(Type1CryptTest, CharstringRoundTrip) { + const std::string plain("\x0d\x0e\xff\x00\x01", 5); // hsbw-ish bytes + const std::string cipher = encrypt(plain, 4330, "wxyz"); + EXPECT_EQ(decrypt_charstring(cipher), plain); // default lenIV = 4 +} + +TEST(Type1CryptTest, CharstringHonoursLenIv) { + const std::string plain = "hello"; + const std::string cipher = encrypt(plain, 4330, ""); + EXPECT_EQ(decrypt_charstring(cipher, 0), plain); +} + +TEST(Type1CryptTest, EexecAcceptsHexForm) { + const std::string plain = "dup /CharStrings"; + const std::string binary = encrypt(plain, 55665, "0000"); + // Hex-encode the binary eexec (PFA form), with whitespace the decoder skips. 
+ std::string hex; + const char *digits = "0123456789abcdef"; + for (std::size_t i = 0; i < binary.size(); ++i) { + const auto b = static_cast<std::uint8_t>(binary[i]); + hex += digits[b >> 4]; + hex += digits[b & 0x0f]; + if (i % 8 == 7) { + hex += '\n'; + } + } + EXPECT_EQ(decrypt_eexec(hex), plain); +} From cf306e2d7f9ed19d1566bfae6ee58522f7459837 Mon Sep 17 00:00:00 2001 From: Andreas Stefl Date: Tue, 23 Jun 2026 22:00:13 +0200 Subject: [PATCH 3/6] PDF stage 3.5: Type1 program parser (Type1Program) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Parse an Adobe Type1 font program into its decrypted parts: split the clear-text header / eexec section / trailer, read /FontName, /FontMatrix, /FontBBox and /Encoding (StandardEncoding or a custom dup-code-name-put array) from the header, decrypt the eexec section (type1_crypt) and extract every glyph's decrypted charstring plus /Subrs (RD/-| binary entries, /lenIV-aware). PFB segment framing is stripped if present. Charstrings are not yet interpreted — that's the Type1->Type2 translation that follows, feeding 3.4's CFF->OTF path. Tests: a hand-built encrypted Type1 program (independent forward cipher) — magic, header/encoding parse, and the decrypted charstrings/subrs. 
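The PFB framing mentioned above can be pictured with a short sketch. It is an illustration under stated assumptions, not the parser in this patch: `strip_pfb_framing` and `check_strip_pfb` are hypothetical names, and the sketch simply concatenates segment payloads until the EOF segment or a truncated header.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper for illustration. Each PFB segment is
// `0x80, type, len (4 bytes, little-endian)` followed by `len` payload bytes;
// type 1 = ASCII, 2 = binary, 3 = EOF (no length, no payload).
std::string strip_pfb_framing(std::string_view data) {
  std::string out;
  std::size_t p = 0;
  while (p + 6 <= data.size() && static_cast<std::uint8_t>(data[p]) == 0x80) {
    const auto type = static_cast<std::uint8_t>(data[p + 1]);
    if (type == 3) {
      break; // EOF segment
    }
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i) {
      // Go through an unsigned byte first so a high bit does not sign-extend.
      const auto byte = static_cast<std::uint8_t>(data[p + 2 + i]);
      len |= static_cast<std::uint32_t>(byte) << (8 * i);
    }
    p += 6;
    if (len > data.size() - p) {
      break; // truncated segment: keep what was read so far
    }
    out.append(data.substr(p, len));
    p += len;
  }
  return out;
}

// One ASCII segment ("abc"), one binary segment (01 02), then EOF.
void check_strip_pfb() {
  std::string pfb;
  pfb += std::string("\x80\x01\x03\x00\x00\x00", 6);
  pfb += "abc";
  pfb += std::string("\x80\x02\x02\x00\x00\x00", 6);
  pfb += std::string("\x01\x02", 2);
  pfb += std::string("\x80\x03", 2);
  assert(strip_pfb_framing(pfb) == std::string("abc\x01\x02", 5));
}
```

The sketch returns an empty string for input that does not start with `0x80`, so a caller would check for the marker first and otherwise use the bytes unchanged.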
Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz --- CMakeLists.txt | 1 + src/odr/internal/font/type1_font.cpp | 295 ++++++++++++++++++++++++++ src/odr/internal/font/type1_font.hpp | 81 +++++++ test/CMakeLists.txt | 1 + test/src/internal/font/type1_font.cpp | 111 ++++++++++ 5 files changed, 489 insertions(+) create mode 100644 src/odr/internal/font/type1_font.cpp create mode 100644 src/odr/internal/font/type1_font.hpp create mode 100644 test/src/internal/font/type1_font.cpp diff --git a/CMakeLists.txt b/CMakeLists.txt index f74f4945..d6eb4f6a 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -202,6 +202,7 @@ set(ODR_SOURCE_FILES "src/odr/internal/font/cff_standard_strings.cpp" "src/odr/internal/font/cff_transform.cpp" "src/odr/internal/font/type1_crypt.cpp" + "src/odr/internal/font/type1_font.cpp" "src/odr/internal/font/sfnt_font.cpp" "src/odr/internal/font/sfnt_parser.cpp" "src/odr/internal/font/sfnt_transform.cpp" diff --git a/src/odr/internal/font/type1_font.cpp b/src/odr/internal/font/type1_font.cpp new file mode 100644 index 00000000..b149c6ed --- /dev/null +++ b/src/odr/internal/font/type1_font.cpp @@ -0,0 +1,295 @@ +#include + +#include + +#include +#include +#include +#include +#include +#include +#include + +namespace odr::internal::font::type1 { + +namespace { + +[[nodiscard]] bool is_ps_space(const char c) { + return c == ' ' || c == '\t' || c == '\r' || c == '\n' || c == '\f' || + c == '\0'; +} + +/// Skip PostScript whitespace starting at @p p. +[[nodiscard]] std::size_t skip_space(std::string_view s, std::size_t p) { + while (p < s.size() && is_ps_space(s[p])) { + ++p; + } + return p; +} + +/// Read a whitespace-delimited token starting at @p p; advances @p p past it. 
+[[nodiscard]] std::string_view read_token(std::string_view s, std::size_t &p) { + p = skip_space(s, p); + const std::size_t begin = p; + while (p < s.size() && !is_ps_space(s[p])) { + ++p; + } + return s.substr(begin, p - begin); +} + +[[nodiscard]] bool parse_int(std::string_view token, int &out) { + const char *begin = token.data(); + const char *end = begin + token.size(); + const auto [ptr, ec] = std::from_chars(begin, end, out); + return ec == std::errc() && ptr == end; +} + +[[nodiscard]] double parse_double(std::string_view token) { + // std::from_chars for double is not universally available; std::stod needs a + // null-terminated copy. + try { + return std::stod(std::string(token)); + } catch (const std::exception &) { + return 0.0; + } +} + +/// Parse the numbers inside the next `[...]` or `{...}` after @p key in @p s. +[[nodiscard]] std::vector<double> parse_number_array(std::string_view s, + std::string_view key) { + std::vector<double> out; + const std::size_t k = s.find(key); + if (k == std::string_view::npos) { + return out; + } + std::size_t open = s.find_first_of("[{", k); + if (open == std::string_view::npos) { + return out; + } + const std::size_t close = s.find_first_of("]}", open); + if (close == std::string_view::npos) { + return out; + } + std::size_t p = open + 1; + while (p < close) { + const std::string_view token = read_token(s.substr(0, close), p); + if (token.empty()) { + break; + } + out.push_back(parse_double(token)); + } + return out; +} + +/// Read the `length` binary bytes of an `RD`/`-|` value: at @p p sits the +/// length integer, then the RD operator, then exactly one space, then the +/// bytes. On success returns the bytes and advances @p p past them; on failure +/// returns nullopt. 
+[[nodiscard]] std::optional<std::string_view> read_rd_binary(std::string_view s, + std::size_t &p) { + std::size_t q = p; + const std::string_view length_token = read_token(s, q); + int length = 0; + if (!parse_int(length_token, length) || length < 0) { + return std::nullopt; + } + const std::string_view rd = read_token(s, q); // "RD" or "-|" + if (rd != "RD" && rd != "-|") { + return std::nullopt; + } + // Exactly one space separates the RD operator from the binary data. + if (q >= s.size()) { + return std::nullopt; + } + ++q; // the single delimiter space + if (q + static_cast<std::size_t>(length) > s.size()) { + return std::nullopt; + } + const std::string_view bytes = s.substr(q, static_cast<std::size_t>(length)); + p = q + static_cast<std::size_t>(length); + return bytes; +} + +} // namespace + +bool Type1Program::is_type1(const std::string_view data) { + if (data.size() >= 2 && static_cast<std::uint8_t>(data[0]) == 0x80) { + return true; // PFB segment marker + } + return data.substr(0, 256).find("%!PS-AdobeFont") != std::string_view::npos || + data.substr(0, 256).find("%!FontType1") != std::string_view::npos; +} + +Type1Program::Type1Program(std::string_view program) { + // Strip PFB segment framing if present: each segment is `0x80 type len32le` + // followed by `len` bytes (type 1 = ASCII, 2 = binary, 3 = EOF). 
+ std::string unframed; + if (!program.empty() && static_cast<std::uint8_t>(program[0]) == 0x80) { + std::size_t p = 0; + while (p + 6 <= program.size() && + static_cast<std::uint8_t>(program[p]) == 0x80) { + const std::uint8_t type = static_cast<std::uint8_t>(program[p + 1]); + if (type == 3) { + break; + } + const std::uint32_t len = + static_cast<std::uint8_t>(program[p + 2]) | + (static_cast<std::uint8_t>(program[p + 3]) << 8) | + (static_cast<std::uint8_t>(program[p + 4]) << 16) | + (static_cast<std::uint32_t>(static_cast<std::uint8_t>(program[p + 5])) + << 24); + p += 6; + if (p + len > program.size()) { + break; + } + unframed.append(program.substr(p, len)); + p += len; + } + program = unframed; + } + + const std::size_t eexec = program.find("eexec"); + if (eexec == std::string_view::npos) { + throw std::runtime_error("type1: no eexec section"); + } + + parse_clear(program.substr(0, eexec)); + + // The encrypted blob begins after `eexec` and its trailing whitespace. + const std::size_t blob = skip_space(program, eexec + 5); + const std::string decrypted = decrypt_eexec(program.substr(blob)); + parse_private(decrypted); + + if (m_glyphs.empty()) { + throw std::runtime_error("type1: no /CharStrings"); + } +} + +void Type1Program::parse_clear(const std::string_view clear) { + if (const std::size_t k = clear.find("/FontName"); + k != std::string_view::npos) { + std::size_t p = clear.find('/', k + 1); + if (p != std::string_view::npos) { + ++p; + const std::size_t begin = p; + while (p < clear.size() && !is_ps_space(clear[p])) { + ++p; + } + m_name = std::string(clear.substr(begin, p - begin)); + } + } + + if (const std::vector<double> matrix = + parse_number_array(clear, "/FontMatrix"); + matrix.size() == 6) { + m_font_matrix = matrix; + } + if (const std::vector<double> bbox = parse_number_array(clear, "/FontBBox"); + bbox.size() == 4) { + m_font_bbox = { + static_cast<int>(bbox[0]), static_cast<int>(bbox[1]), + static_cast<int>(bbox[2]), static_cast<int>(bbox[3])}; + } 
+ + // /Encoding: `StandardEncoding def`, or a custom array built with + // `dup <code> /<name> put` lines. 
+ const std::size_t enc = clear.find("/Encoding"); + if (enc != std::string_view::npos) { + const std::string_view after = clear.substr(enc); + if (after.substr(0, 64).find("StandardEncoding") != + std::string_view::npos) { + m_standard_encoding = true; + } else { + m_standard_encoding = false; + std::size_t p = 0; + while ((p = after.find("dup ", p)) != std::string_view::npos) { + std::size_t q = p + 4; + int code = 0; + const std::string_view code_token = read_token(after, q); + const std::size_t slash = after.find('/', q); + if (parse_int(code_token, code) && slash != std::string_view::npos) { + std::size_t r = slash + 1; + const std::size_t begin = r; + while (r < after.size() && !is_ps_space(after[r])) { + ++r; + } + m_encoding[code] = std::string(after.substr(begin, r - begin)); + } + p = q; + } + } + } +} + +void Type1Program::parse_private(const std::string_view decrypted) { + int len_iv = 4; + if (const std::size_t k = decrypted.find("/lenIV"); + k != std::string_view::npos) { + std::size_t p = k + 6; + int value = 0; + if (parse_int(read_token(decrypted, p), value)) { + len_iv = value; + } + } + m_len_iv = len_iv; + + // /Subrs: entries `dup <index> <len> RD <bytes> NP`. + if (const std::size_t k = decrypted.find("/Subrs"); + k != std::string_view::npos) { + std::size_t p = k; + while ((p = decrypted.find("dup ", p)) != std::string_view::npos) { + // Stop when /CharStrings starts (Subrs precede it). 
+ const std::size_t cs = decrypted.find("/CharStrings"); + if (cs != std::string_view::npos && p > cs) { + break; + } + std::size_t q = p + 4; + int index = 0; + if (!parse_int(read_token(decrypted, q), index) || index < 0) { + p += 4; + continue; + } + const std::optional<std::string_view> bytes = + read_rd_binary(decrypted, q); + if (!bytes.has_value()) { + p += 4; + continue; + } + if (static_cast<int>(m_subrs.size()) <= index) { + m_subrs.resize(index + 1); + } + m_subrs[index] = decrypt_charstring(*bytes, len_iv); + p = q; + } + } + + // /CharStrings: entries `/<name> <len> RD <bytes> ND`. 
+ const std::size_t cs = decrypted.find("/CharStrings"); + if (cs == std::string_view::npos) { + return; + } + const std::size_t begin = decrypted.find("begin", cs); + std::size_t p = (begin == std::string_view::npos) ? cs : begin + 5; + while (p < decrypted.size()) { + const std::size_t slash = decrypted.find('/', p); + if (slash == std::string_view::npos) { + break; + } + std::size_t q = slash + 1; + const std::size_t name_begin = q; + while (q < decrypted.size() && !is_ps_space(decrypted[q])) { + ++q; + } + std::string name(decrypted.substr(name_begin, q - name_begin)); + const std::optional<std::string_view> bytes = read_rd_binary(decrypted, q); + if (!bytes.has_value()) { + // Not a charstring entry (e.g. `end`); advance past this slash. + p = slash + 1; + continue; + } + m_glyphs.push_back({std::move(name), decrypt_charstring(*bytes, len_iv)}); + p = q; + } +} + +} // namespace odr::internal::font::type1 diff --git a/src/odr/internal/font/type1_font.hpp b/src/odr/internal/font/type1_font.hpp new file mode 100644 index 00000000..6dc56734 --- /dev/null +++ b/src/odr/internal/font/type1_font.hpp @@ -0,0 +1,81 @@ +#pragma once + +#include + +#include +#include +#include +#include + +namespace odr::internal::font::type1 { + +/// One glyph of a Type1 font: its PostScript name and its **decrypted** Type1 +/// charstring (charstring encryption removed, `/lenIV` leading bytes dropped). +struct Glyph { + std::string name; + std::string charstring; +}; + +/// @brief Parses an Adobe Type1 font program into its decrypted parts. +/// +/// A Type1 program has three sections: a clear-text header (font dictionary up +/// to `eexec`), an `eexec`-encrypted private portion (`/Subrs`, +/// `/CharStrings`), and a zero-padded trailer. This reads the header for +/// `/FontMatrix`, +/// `/FontBBox`, `/Encoding` and `/FontName`, decrypts the `eexec` section +/// (`type1_crypt`) and extracts every glyph's decrypted charstring plus the +/// `/Subrs`. 
It does **not** yet interpret the charstrings — that is the +/// Type1 -> Type2 translation that follows, feeding 3.4's CFF -> OTF path. +/// +/// Throws `std::runtime_error` when the program has no `eexec` section or no +/// `/CharStrings`. +class Type1Program { +public: + /// Cheap magic test: the PostScript font sentinel (`%!PS-AdobeFont`, + /// `%!FontType1`) or a PFB segment marker (`0x80`). + [[nodiscard]] static bool is_type1(std::string_view data); + + /// Parse @p program (the raw `/FontFile` bytes, PFB markers stripped if + /// present). + explicit Type1Program(std::string_view program); + + [[nodiscard]] std::string_view name() const noexcept { return m_name; } + /// The 6-element `/FontMatrix` (defaults to `[0.001 0 0 0.001 0 0]`). + [[nodiscard]] const std::vector<double> &font_matrix() const noexcept { + return m_font_matrix; + } + [[nodiscard]] FontBBox font_bbox() const noexcept { return m_font_bbox; } + + /// `/Encoding` as code -> glyph name. Empty when the font uses + /// `StandardEncoding` (see `standard_encoding`). + [[nodiscard]] const std::map<int, std::string> &encoding() const noexcept { + return m_encoding; + } + [[nodiscard]] bool standard_encoding() const noexcept { + return m_standard_encoding; + } + + /// Decrypted glyphs in declaration order. + [[nodiscard]] const std::vector<Glyph> &glyphs() const noexcept { + return m_glyphs; + } + /// Decrypted `/Subrs`, indexed by subr number. 
+ [[nodiscard]] const std::vector<std::string> &subrs() const noexcept { + return m_subrs; + } + +private: + void parse_clear(std::string_view clear); + void parse_private(std::string_view decrypted); + + std::string m_name; + std::vector<double> m_font_matrix{0.001, 0.0, 0.0, 0.001, 0.0, 0.0}; + FontBBox m_font_bbox{}; + std::map<int, std::string> m_encoding; + bool m_standard_encoding{true}; + std::vector<Glyph> m_glyphs; + std::vector<std::string> m_subrs; + int m_len_iv{4}; +}; + +} // namespace odr::internal::font::type1 diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt index 9bcd4254..249ebc51 100644 --- a/test/CMakeLists.txt +++ b/test/CMakeLists.txt @@ -56,6 +56,7 @@ add_executable(odr_test "src/internal/font/cff_font.cpp" "src/internal/font/type1_crypt.cpp" + "src/internal/font/type1_font.cpp" "src/internal/font/sfnt_font.cpp" "src/internal/font/sfnt_transform.cpp" "src/internal/font/font_file.cpp" diff --git a/test/src/internal/font/type1_font.cpp b/test/src/internal/font/type1_font.cpp new file mode 100644 index 00000000..a372b0b2 --- /dev/null +++ b/test/src/internal/font/type1_font.cpp @@ -0,0 +1,111 @@ +#include + +#include + +#include +#include + +using namespace odr::internal::font::type1; + +namespace { + +/// Forward Type1 cipher (the inverse of `decrypt`), so the test builds a real +/// encrypted program rather than trusting the decryptor. +std::string encrypt(const std::string &plain, std::uint16_t r, + const std::string &random_prefix) { + constexpr std::uint16_t c1 = 52845; + constexpr std::uint16_t c2 = 22719; + std::string out; + for (const char ch : random_prefix + plain) { + const auto p = static_cast<std::uint8_t>(ch); + const auto cipher = static_cast<std::uint8_t>(p ^ (r >> 8)); + out += static_cast<char>(cipher); + r = static_cast<std::uint16_t>(static_cast<std::uint32_t>(cipher + r) * c1 + c2); + } + return out; +} + +/// A `/name len RD <bytes> ND` charstring entry, the charstring encrypted with +/// the charstring key (4330) and a 4-byte lenIV prefix. 
+std::string charstring_entry(const std::string &name, + const std::string &plain_charstring) { + // The 4-byte lenIV prefix must be a real 4 NUL bytes — a "\x00\x00\x00\x00" + // string literal would be empty (the first NUL terminates it). + const std::string enc = encrypt(plain_charstring, 4330, std::string(4, '\0')); + return "/" + name + " " + std::to_string(enc.size()) + " RD " + enc + " ND\n"; +} + +/// Assemble a minimal but well-formed Type1 program: a clear header (with a +/// custom /Encoding and /FontMatrix) and an eexec-encrypted private section +/// holding two glyphs and one subr. +std::string build_type1() { + std::string clear = "%!PS-AdobeFont-1.0: TestType1 001.000\n" + "/FontName /TestType1 def\n" + "/FontMatrix [0.001 0 0 0.001 0 0] readonly def\n" + "/FontBBox {0 -200 700 800} readonly def\n" + "/Encoding 256 array\n" + "0 1 255 {1 index exch /.notdef put} for\n" + "dup 65 /A put\n" + "dup 66 /B put\n" + "readonly def\n" + "currentdict end\n" + "currentfile eexec\n"; + + std::string private_section = "dup /Private 16 dict dup begin\n" + "/lenIV 4 def\n" + "/Subrs 1 array\n"; + private_section += "dup 0 "; + { + const std::string subr = + encrypt(std::string("\x0b", 1), 4330, std::string(4, '\0')); // return + private_section += std::to_string(subr.size()) + " RD " + subr + " NP\n"; + } + private_section += "ND\n" + "2 index /CharStrings 2 dict dup begin\n"; + // .notdef-ish + two named glyphs. Charstring bytes are arbitrary here: the + // parser does not interpret them, it only extracts them. + private_section += charstring_entry("A", std::string("\x8b\x8b\x0d\x0e", 4)); + private_section += charstring_entry("B", std::string("\xf0\x0d\x0e", 3)); + private_section += "end\nend\n"; + + std::string program = clear; + program += encrypt(private_section, 55665, "wxyz"); + // Trailer (would be 512 zeros + cleartomark in a real font); the parser + // tolerates trailing data, so a short stub is enough. 
+ program += std::string(8, '\0'); + return program; +} + +} // namespace + +TEST(Type1FontTest, IsType1Magic) { + EXPECT_TRUE(Type1Program::is_type1(build_type1())); + EXPECT_FALSE(Type1Program::is_type1("not a font program at all")); +} + +TEST(Type1FontTest, ParsesHeaderAndEncoding) { + const Type1Program font{build_type1()}; + + EXPECT_EQ(font.name(), "TestType1"); + EXPECT_FALSE(font.standard_encoding()); + ASSERT_EQ(font.font_matrix().size(), 6u); + EXPECT_DOUBLE_EQ(font.font_matrix()[0], 0.001); + EXPECT_EQ(font.font_bbox().y_min, -200); + EXPECT_EQ(font.font_bbox().x_max, 700); + + EXPECT_EQ(font.encoding().at(65), "A"); + EXPECT_EQ(font.encoding().at(66), "B"); +} + +TEST(Type1FontTest, DecryptsCharstringsAndSubrs) { + const Type1Program font{build_type1()}; + + ASSERT_EQ(font.glyphs().size(), 2u); + EXPECT_EQ(font.glyphs()[0].name, "A"); + EXPECT_EQ(font.glyphs()[0].charstring, std::string("\x8b\x8b\x0d\x0e", 4)); + EXPECT_EQ(font.glyphs()[1].name, "B"); + EXPECT_EQ(font.glyphs()[1].charstring, std::string("\xf0\x0d\x0e", 3)); + + ASSERT_EQ(font.subrs().size(), 1u); + EXPECT_EQ(font.subrs()[0], std::string("\x0b", 1)); // return +} From b5cb39fc3adaf1ad0a284309f82945251fc6a826 Mon Sep 17 00:00:00 2001 From: Andreas Stefl Date: Tue, 23 Jun 2026 22:06:13 +0200 Subject: [PATCH 4/6] PDF stage 3.5: CFF builder (assemble a CFF from Type2 charstrings) cff::build_cff serializes a name-keyed CFF from a list of (name, Type2 charstring) glyphs + default/nominalWidthX + bbox: Header, Name INDEX, Top DICT (FontBBox + charset/CharStrings/Private offsets, fixed-width so the layout resolves in one pass), String INDEX (every glyph name as a custom SID, so no standard-strings table is needed), empty Global Subr INDEX, CharStrings INDEX, format-0 charset, Private DICT. This is the assembly target for the Type1 -> CFF path: the translated Type2 charstrings land here, the result feeds CffFont + wrap_to_otf (3.4). 
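Why the offsets use a fixed-width operand can be seen from the two DICT integer encodings side by side. The following is a sketch under stated assumptions rather than the builder itself: `encode_dict_int`, `encode_dict_int_fixed`, `append_be` and `check_dict_int` are hypothetical names; the byte ranges follow the CFF DICT operand encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: append the low `bytes` bytes of `v`, big-endian.
void append_be(std::string &s, const std::uint32_t v, const int bytes) {
  for (int i = bytes - 1; i >= 0; --i) {
    s += static_cast<char>((v >> (8 * i)) & 0xff);
  }
}

// Compact form: 1, 2, 3 or 5 bytes depending on the magnitude of `v`.
std::string encode_dict_int(const std::int32_t v) {
  std::string s;
  if (v >= -107 && v <= 107) {
    s += static_cast<char>(v + 139);
  } else if (v >= 108 && v <= 1131) {
    const int u = v - 108;
    s += static_cast<char>((u >> 8) + 247);
    s += static_cast<char>(u & 0xff);
  } else if (v >= -1131 && v <= -108) {
    const int u = -v - 108;
    s += static_cast<char>((u >> 8) + 251);
    s += static_cast<char>(u & 0xff);
  } else if (v >= -32768 && v <= 32767) {
    s += static_cast<char>(28);
    append_be(s, static_cast<std::uint16_t>(v), 2);
  } else {
    s += static_cast<char>(29);
    append_be(s, static_cast<std::uint32_t>(v), 4);
  }
  return s;
}

// Fixed form: always `29` + int32, so the operand's size does not depend on
// its value and a placeholder can be sized before the real offset is known.
std::string encode_dict_int_fixed(const std::int32_t v) {
  std::string s;
  s += static_cast<char>(29);
  append_be(s, static_cast<std::uint32_t>(v), 4);
  return s;
}

void check_dict_int() {
  assert(encode_dict_int(0) == std::string("\x8b", 1));
  assert(encode_dict_int(1000) == std::string("\xfa\x7c", 2));
  assert(encode_dict_int(-108) == std::string("\xfb\x00", 2));
  assert(encode_dict_int(10000) == std::string("\x1c\x27\x10", 3));
  assert(encode_dict_int_fixed(0) == std::string("\x1d\x00\x00\x00\x00", 5));
}
```

With the compact form, the Top DICT's length would change with the offsets it stores, which in turn shifts those offsets; the fixed form breaks that cycle at the cost of a few bytes.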
Test: build a 2-glyph CFF, read it back through CffFont (name, glyph name, bbox, charstring width vs. default) and confirm it wraps to a loadable OTTO. Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz --- CMakeLists.txt | 1 + src/odr/internal/font/cff_builder.cpp | 184 ++++++++++++++++++++++++++ src/odr/internal/font/cff_builder.hpp | 40 ++++++ test/src/internal/font/cff_font.cpp | 32 +++++ 4 files changed, 257 insertions(+) create mode 100644 src/odr/internal/font/cff_builder.cpp create mode 100644 src/odr/internal/font/cff_builder.hpp diff --git a/CMakeLists.txt b/CMakeLists.txt index d6eb4f6a..a0b4b1e4 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -198,6 +198,7 @@ set(ODR_SOURCE_FILES "src/odr/internal/pdf/pdf_object_parser.cpp" "src/odr/internal/pdf/pdf_page_text.cpp" + "src/odr/internal/font/cff_builder.cpp" "src/odr/internal/font/cff_font.cpp" "src/odr/internal/font/cff_standard_strings.cpp" "src/odr/internal/font/cff_transform.cpp" diff --git a/src/odr/internal/font/cff_builder.cpp b/src/odr/internal/font/cff_builder.cpp new file mode 100644 index 00000000..26174f5e --- /dev/null +++ b/src/odr/internal/font/cff_builder.cpp @@ -0,0 +1,184 @@ +#include + +#include +#include +#include + +namespace odr::internal::font::cff { + +namespace { + +void put16(std::string &s, const std::uint16_t v) { + s += static_cast<char>(v >> 8); + s += static_cast<char>(v & 0xff); +} + +/// A CFF DICT integer in the compact encoding (used for widths / bbox). 
+void dict_int(std::string &s, const int v) { + if (v >= -107 && v <= 107) { + s += static_cast<char>(v + 139); + } else if (v >= 108 && v <= 1131) { + const int u = v - 108; + s += static_cast<char>((u >> 8) + 247); + s += static_cast<char>(u & 0xff); + } else if (v >= -1131 && v <= -108) { + const int u = -v - 108; + s += static_cast<char>((u >> 8) + 251); + s += static_cast<char>(u & 0xff); + } else if (v >= -32768 && v <= 32767) { + s += static_cast<char>(28); + put16(s, static_cast<std::uint16_t>(v)); + } else { + s += static_cast<char>(29); + s += static_cast<char>((v >> 24) & 0xff); + s += static_cast<char>((v >> 16) & 0xff); + s += static_cast<char>((v >> 8) & 0xff); + s += static_cast<char>(v & 0xff); + } +} + +/// A CFF DICT integer in the fixed 5-byte form (`29 + int32`), so an operand +/// whose value (an offset) is not yet known can be sized before it is filled. +void dict_int_fixed(std::string &s, const std::int32_t v) { + s += static_cast<char>(29); + s += static_cast<char>((v >> 24) & 0xff); + s += static_cast<char>((v >> 16) & 0xff); + s += static_cast<char>((v >> 8) & 0xff); + s += static_cast<char>(v & 0xff); +} + +void dict_operator(std::string &s, const int op) { + if (op >= 1200) { + s += static_cast<char>(12); + s += static_cast<char>(op - 1200); + } else { + s += static_cast<char>(op); + } +} + +/// Serialize a CFF INDEX from its members. +std::string build_index(const std::vector<std::string> &members) { + std::string out; + put16(out, static_cast<std::uint16_t>(members.size())); + if (members.empty()) { + return out; // count 0: no offSize/offsets + } + std::uint32_t total = 1; + for (const std::string &m : members) { + total += static_cast<std::uint32_t>(m.size()); + } + const std::uint8_t off_size = total <= 0xff ? 1 + : total <= 0xffff ? 2 + : total <= 0xffffff ? 
3 + : 4; + out += static_cast<char>(off_size); + const auto put_off = [&](const std::uint32_t off) { + for (int i = off_size - 1; i >= 0; --i) { + out += static_cast<char>((off >> (8 * i)) & 0xff); + } + }; + std::uint32_t offset = 1; + put_off(offset); + for (const std::string &m : members) { + offset += static_cast<std::uint32_t>(m.size()); + put_off(offset); + } + for (const std::string &m : members) { + out += m; + } + return out; +} + +} // namespace + +std::string build_cff(const std::string_view name, + const std::vector<BuilderGlyph> &glyphs, + const double default_width, const double nominal_width, + const FontBBox bbox) { + // CharStrings INDEX (one Type2 charstring per glyph). + std::vector<std::string> charstrings; + charstrings.reserve(glyphs.size()); + for (const BuilderGlyph &glyph : glyphs) { + charstrings.push_back(glyph.charstring); + } + const std::string charstrings_index = build_index(charstrings); + + // String INDEX: every glyph name gets a custom SID (391 + position). Glyph 0 + // is the implicit `.notdef` (SID 0), so its name is not stored; the charset + // lists SIDs for glyphs 1..n-1. + std::vector<std::string> strings; + for (std::size_t i = 1; i < glyphs.size(); ++i) { + strings.push_back(glyphs[i].name); + } + const std::string string_index = build_index(strings); + + // Format-0 charset: SID per glyph 1..n-1. + std::string charset; + charset += static_cast<char>(0); // format 0 + for (std::size_t i = 1; i < glyphs.size(); ++i) { + put16(charset, static_cast<std::uint16_t>(391 + (i - 1))); + } + + // Private DICT: defaultWidthX (20), nominalWidthX (21). + std::string private_dict; + dict_int(private_dict, static_cast<int>(default_width)); + dict_operator(private_dict, 20); + dict_int(private_dict, static_cast<int>(nominal_width)); + dict_operator(private_dict, 21); + + const std::string name_index = + build_index({std::string(name.empty() ? 
"ODRType1" : name)}); + const std::string global_subrs = build_index({}); + + // Top DICT, with the offsets to charset / CharStrings / Private filled once + // the layout is known.
Fixed-width offset integers keep the size constant. + const auto top_dict = [&](const std::uint32_t charset_off, + const std::uint32_t charstrings_off, + const std::uint32_t private_off) { + std::string d; + dict_int(d, bbox.x_min); + dict_int(d, bbox.y_min); + dict_int(d, bbox.x_max); + dict_int(d, bbox.y_max); + dict_operator(d, 5); // FontBBox + dict_int_fixed(d, static_cast(charset_off)); + dict_operator(d, 15); // charset + dict_int_fixed(d, static_cast(charstrings_off)); + dict_operator(d, 17); // CharStrings + dict_int_fixed(d, static_cast(private_dict.size())); + dict_int_fixed(d, static_cast(private_off)); + dict_operator(d, 18); // Private [size offset] + return d; + }; + + const std::string top_dict_probe = build_index({top_dict(0, 0, 0)}); + constexpr std::uint32_t header_size = 4; + const auto prefix = static_cast( + header_size + name_index.size() + top_dict_probe.size() + + string_index.size() + global_subrs.size()); + // Layout after the prefix: CharStrings, charset, Private. 
+ const std::uint32_t charstrings_off = prefix; + const std::uint32_t charset_off = + charstrings_off + static_cast(charstrings_index.size()); + const std::uint32_t private_off = + charset_off + static_cast(charset.size()); + + const std::string top_dict_index = + build_index({top_dict(charset_off, charstrings_off, private_off)}); + + std::string out; + out += static_cast(1); // major + out += static_cast(0); // minor + out += static_cast(4); // hdrSize + out += static_cast(4); // offSize (absolute offsets; legacy/unused) + out += name_index; + out += top_dict_index; + out += string_index; + out += global_subrs; + out += charstrings_index; + out += charset; + out += private_dict; + return out; +} + +} // namespace odr::internal::font::cff diff --git a/src/odr/internal/font/cff_builder.hpp b/src/odr/internal/font/cff_builder.hpp new file mode 100644 index 00000000..cf8f6ac5 --- /dev/null +++ b/src/odr/internal/font/cff_builder.hpp @@ -0,0 +1,40 @@ +#pragma once + +#include + +#include +#include +#include + +namespace odr::internal::font::cff { + +/// One glyph for the CFF builder: its PostScript name and its **Type2** +/// charstring (already translated from Type1, if applicable). +struct BuilderGlyph { + std::string name; + std::string charstring; +}; + +/// Serialize a name-keyed CFF font from Type2 charstrings. +/// +/// Assembles the minimal CFF a `CffFont` reader (and, after wrapping, a +/// browser) needs: Header, Name INDEX, Top DICT (FontBBox + +/// charset/CharStrings/Private offsets), String INDEX (every glyph name, SID +/// 391+), an empty Global Subr INDEX, the CharStrings INDEX, a format-0 charset +/// and a Private DICT +/// (`defaultWidthX`/`nominalWidthX`). Glyph 0 is the implicit `.notdef`; the +/// caller orders @p glyphs so glyph 0 is `.notdef`. +/// +/// This is the assembly target for the Type1 -> CFF path (stage 3.5): the +/// translated Type2 charstrings go in here, the result feeds `CffFont` + +/// `wrap_to_otf` (3.4). 
No `FontMatrix` is emitted, so the font is 1000 +/// units/em (the Type1 default); a non-default matrix is a follow-up. +/// +/// Offsets in the Top DICT use the fixed-width 5-byte integer form so the +/// layout resolves in a single pass. +[[nodiscard]] std::string build_cff(std::string_view name, + const std::vector &glyphs, + double default_width, double nominal_width, + FontBBox bbox); + +} // namespace odr::internal::font::cff diff --git a/test/src/internal/font/cff_font.cpp b/test/src/internal/font/cff_font.cpp index 3be0bebc..e5b1a3f8 100644 --- a/test/src/internal/font/cff_font.cpp +++ b/test/src/internal/font/cff_font.cpp @@ -1,6 +1,7 @@ #include #include +#include #include #include #include @@ -193,6 +194,37 @@ TEST(CffFontTest, IsCffMagic) { EXPECT_FALSE(CffFont::is_cff("not a font")); } +TEST(CffFontTest, BuildCffRoundTripsThroughReader) { + using odr::internal::font::cff::build_cff; + using odr::internal::font::cff::BuilderGlyph; + + // Type2 charstrings: .notdef = endchar; "A" = width-operand 50 then endchar + // (50 -> single byte 50 + 139 = 0xBD; endchar = 0x0E). + std::vector glyphs = { + {".notdef", std::string("\x0e", 1)}, + {"A", std::string("\xbd\x0e", 2)}, + }; + const std::string cff_bytes = + build_cff("MyType1", glyphs, /*default_width=*/0, /*nominal_width=*/100, + FontBBox{0, -200, 700, 800}); + + const CffFont font{cff_bytes}; + EXPECT_EQ(font.format(), FontFormat::cff); + EXPECT_EQ(font.name(), "MyType1"); + EXPECT_EQ(font.glyph_count(), 2); + EXPECT_FALSE(font.is_cid_keyed()); + EXPECT_EQ(font.glyph_name(1), "A"); + EXPECT_EQ(font.bounding_box().x_max, 700); + // explicit charstring width: nominalWidthX (100) + 50. + EXPECT_EQ(font.advance_width(1), 150); + // no explicit width: defaultWidthX (0). + EXPECT_EQ(font.advance_width(0), 0); + + // The built CFF wraps into a loadable OTTO (3.4 path) end to end. 
+ const std::string otf = odr::internal::font::cff::wrap_to_otf(font); + EXPECT_TRUE(odr::internal::font::sfnt::SfntFont::is_sfnt(otf)); +} + TEST(CffFontTest, WrapsToLoadableOtf) { using namespace odr::internal::font; const CffFont cff{build_cff()}; From 8b11d73084d8cfbd75ce35b10cf016874bb9d2b7 Mon Sep 17 00:00:00 2001 From: Andreas Stefl Date: Tue, 23 Jun 2026 22:11:24 +0200 Subject: [PATCH 5/6] PDF stage 3.5: Type1 -> Type2 charstring translation type1::to_type2 translates a decrypted Type1 charstring to Type2 (CFF): a stack machine that flattens callsubr (inlining the font's /Subrs, depth guarded), folds div, lifts the hsbw side bearing into the first moveto, emits the advance width as the leading Type2 width operand (and also returns it), drops Type1-only hints (dotsection, *stem3, hint-replacement OtherSubr 3), and translates the flex OtherSubrs (1/2/0 -> two rrcurvetos) and seac (-> Type2 endchar form). Path operators (r/h/v lineto, rr/vh/hv curveto, stems, moves, endchar) share opcodes with Type2 and pass through. Best-effort / display-oriented: hints affect rendering quality, not glyph shape. Tests: exact Type2 output for hsbw width + side-bearing folding into the first move, callsubr inlining, and div folding. 
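The side-bearing folding described above can be sketched in isolation. This is a simplified illustration, not the translator itself: `Move` and `fold_side_bearing` are hypothetical names, and the function only models the first move after `hsbw`.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified model of a move operator and its operands.
struct Move {
  std::string op;           // "rmoveto", "hmoveto" or "vmoveto"
  std::vector<double> args; // operands in charstring order
};

// Fold the hsbw side bearing (sbx) into the first move of a glyph.
// Type1 starts the path at (sbx, 0); Type2 starts at (0, 0), so the
// first move has to carry the extra x offset.
Move fold_side_bearing(Move first_move, const double sbx) {
  if (first_move.op == "rmoveto" && first_move.args.size() == 2) {
    first_move.args[0] += sbx; // dx + sbx, dy unchanged
  } else if (first_move.op == "hmoveto" && first_move.args.size() == 1) {
    first_move.args[0] += sbx; // still a purely horizontal move
  } else if (first_move.op == "vmoveto" && first_move.args.size() == 1 &&
             sbx != 0.0) {
    // vmoveto cannot carry an x offset: promote to rmoveto(sbx, dy).
    const double dy = first_move.args[0];
    first_move = Move{"rmoveto", {sbx, dy}};
  }
  return first_move;
}
```

For `10 200 hsbw 100 0 rmoveto`, the sketch yields `110 0 rmoveto`, matching the expectation in the `HsbwWidthAndSideBearing` test.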
Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz --- CMakeLists.txt | 1 + src/odr/internal/font/type1_charstring.cpp | 424 ++++++++++++++++++++ src/odr/internal/font/type1_charstring.hpp | 30 ++ test/CMakeLists.txt | 1 + test/src/internal/font/type1_charstring.cpp | 122 ++++++ 5 files changed, 578 insertions(+) create mode 100644 src/odr/internal/font/type1_charstring.cpp create mode 100644 src/odr/internal/font/type1_charstring.hpp create mode 100644 test/src/internal/font/type1_charstring.cpp diff --git a/CMakeLists.txt b/CMakeLists.txt index a0b4b1e4..be77b4aa 100644 --- a/CMakeLists.txt +++ b/CMakeLists.txt @@ -202,6 +202,7 @@ set(ODR_SOURCE_FILES "src/odr/internal/font/cff_font.cpp" "src/odr/internal/font/cff_standard_strings.cpp" "src/odr/internal/font/cff_transform.cpp" + "src/odr/internal/font/type1_charstring.cpp" "src/odr/internal/font/type1_crypt.cpp" "src/odr/internal/font/type1_font.cpp" "src/odr/internal/font/sfnt_font.cpp" diff --git a/src/odr/internal/font/type1_charstring.cpp b/src/odr/internal/font/type1_charstring.cpp new file mode 100644 index 00000000..ad7365d6 --- /dev/null +++ b/src/odr/internal/font/type1_charstring.cpp @@ -0,0 +1,424 @@ +#include + +#include +#include +#include +#include +#include + +namespace odr::internal::font::type1 { + +namespace { + +// Type1 charstring operators (single byte; 12 = escape to a two-byte op). 
+enum T1 : int { + t1_hstem = 1, + t1_vstem = 3, + t1_vmoveto = 4, + t1_rlineto = 5, + t1_hlineto = 6, + t1_vlineto = 7, + t1_rrcurveto = 8, + t1_closepath = 9, + t1_callsubr = 10, + t1_return = 11, + t1_hsbw = 13, + t1_endchar = 14, + t1_rmoveto = 21, + t1_hmoveto = 22, + t1_vhcurveto = 30, + t1_hvcurveto = 31, + t1_dotsection = 1200, + t1_vstem3 = 1201, + t1_hstem3 = 1202, + t1_seac = 1206, + t1_sbw = 1207, + t1_div = 1212, + t1_callothersubr = 1216, + t1_pop = 1217, + t1_setcurrentpoint = 1233, +}; + +/// Encode an integer operand in the Type2 charstring number forms. +void emit_int(std::string &out, const int v) { + if (v >= -107 && v <= 107) { + out += static_cast(v + 139); + } else if (v >= 108 && v <= 1131) { + const int u = v - 108; + out += static_cast((u >> 8) + 247); + out += static_cast(u & 0xff); + } else if (v >= -1131 && v <= -108) { + const int u = -v - 108; + out += static_cast((u >> 8) + 251); + out += static_cast(u & 0xff); + } else { + out += static_cast(28); // shortint + out += static_cast((v >> 8) & 0xff); + out += static_cast(v & 0xff); + } +} + +/// Encode a (possibly fractional) operand: an integer form when whole and in +/// range, else the Type2 16.16 fixed form (`255 + int32`). +void emit_num(std::string &out, const double v) { + if (v == std::floor(v) && v >= -32768 && v <= 32767) { + emit_int(out, static_cast(v)); + return; + } + const auto fixed = static_cast(std::lround(v * 65536.0)); + out += static_cast(255); + out += static_cast((fixed >> 24) & 0xff); + out += static_cast((fixed >> 16) & 0xff); + out += static_cast((fixed >> 8) & 0xff); + out += static_cast(fixed & 0xff); +} + +/// The translation state machine. Walks the Type1 charstring (recursing through +/// `callsubr`), emitting a Type2 charstring. 
+class Translator { +public: + explicit Translator(const std::vector &subrs) : m_subrs(subrs) {} + + Type2Charstring run(std::string_view charstring) { + execute(charstring, 0); + if (!m_ended) { + m_out += static_cast(t1_endchar); + } + return {std::move(m_out), m_width, m_has_width}; + } + +private: + // Emit the pending width (once) ahead of the first stem/move/endchar's + // operands, as the Type2 width does. nominalWidthX is 0 in the built CFF, so + // the width is the absolute advance. + void emit_width() { + if (m_width_pending) { + emit_int(m_out, m_width); + m_width_pending = false; + } + } + + void flush_stack() { + for (const double v : m_stack) { + emit_num(m_out, v); + } + m_stack.clear(); + } + + // Emit width + operands + a one-byte operator, clearing the stack. + void emit_op(const int op) { + emit_width(); + flush_stack(); + m_out += static_cast(op); + } + + void execute(std::string_view cs, const int depth) { + if (depth > 16 || m_ended) { + return; + } + std::size_t p = 0; + while (p < cs.size() && !m_ended) { + const auto b = static_cast(cs[p]); + if (b >= 32) { + // operand + double value = 0.0; + if (b <= 246) { + value = static_cast(b) - 139; + p += 1; + } else if (b <= 250) { + value = (static_cast(b) - 247) * 256 + + static_cast(cs[p + 1]) + 108; + p += 2; + } else if (b <= 254) { + value = -(static_cast(b) - 251) * 256 - + static_cast(cs[p + 1]) - 108; + p += 2; + } else { // 255: Type1 32-bit integer + value = static_cast( + (static_cast(cs[p + 1]) << 24) | + (static_cast(cs[p + 2]) << 16) | + (static_cast(cs[p + 3]) << 8) | + static_cast(cs[p + 4])); + p += 5; + } + m_stack.push_back(value); + continue; + } + int op = b; + ++p; + if (b == 12) { + op = 1200 + static_cast(cs[p]); + ++p; + } + handle(op, depth); + } + } + + void handle(const int op, const int depth) { + switch (op) { + case t1_hsbw: + if (m_stack.size() >= 2) { + m_sbx = m_stack[0]; + m_width = static_cast(m_stack[1]); + m_has_width = true; + m_width_pending = true; + 
m_sbx_pending = true; + } + m_stack.clear(); + break; + case t1_sbw: + if (m_stack.size() >= 4) { + m_sbx = m_stack[0]; + m_width = static_cast(m_stack[2]); + m_has_width = true; + m_width_pending = true; + m_sbx_pending = true; + } + m_stack.clear(); + break; + + case t1_rmoveto: + if (m_flex_active) { + collect_flex_point(); + } else { + if (m_sbx_pending && !m_stack.empty()) { + m_stack[0] += m_sbx; + m_sbx_pending = false; + } + emit_op(t1_rmoveto); + } + break; + case t1_hmoveto: + if (m_flex_active) { + collect_flex_point(); + } else { + // hmoveto has no y; the side bearing adds an x, so keep it hmoveto. + if (m_sbx_pending && !m_stack.empty()) { + m_stack[0] += m_sbx; + } + m_sbx_pending = false; + emit_op(t1_hmoveto); + } + break; + case t1_vmoveto: + if (m_flex_active) { + collect_flex_point(); + } else if (m_sbx_pending && m_sbx != 0.0) { + // A side bearing adds an x offset, which vmoveto cannot carry: promote + // to rmoveto(sbx, dy). + const double dy = m_stack.empty() ? 0.0 : m_stack[0]; + m_stack = {m_sbx, dy}; + m_sbx_pending = false; + emit_op(t1_rmoveto); + } else { + m_sbx_pending = false; + emit_op(t1_vmoveto); + } + break; + + case t1_hstem: + case t1_vstem: + case t1_rlineto: + case t1_hlineto: + case t1_vlineto: + case t1_rrcurveto: + case t1_vhcurveto: + case t1_hvcurveto: + emit_op(op); // identical opcodes/semantics in Type2 + break; + + case t1_closepath: + case t1_dotsection: + case t1_vstem3: + case t1_hstem3: + case t1_setcurrentpoint: + m_stack.clear(); // dropped (implicit / hints / no-op) + break; + + case t1_div: + if (m_stack.size() >= 2) { + const double b = m_stack.back(); + m_stack.pop_back(); + const double a = m_stack.back(); + m_stack.pop_back(); + m_stack.push_back(b != 0.0 ? 
a / b : 0.0); + } + break; + + case t1_callsubr: { + if (m_stack.empty()) { + break; + } + const auto index = static_cast(m_stack.back()); + m_stack.pop_back(); + if (index >= 0 && index < static_cast(m_subrs.size())) { + execute(m_subrs[index], depth + 1); + } + break; + } + case t1_return: + break; // end of the current subr + + case t1_callothersubr: + handle_othersubr(); + break; + case t1_pop: + // Push the value the matching callothersubr left on the PS stack. + if (!m_ps_stack.empty()) { + m_stack.push_back(m_ps_stack.back()); + m_ps_stack.pop_back(); + } else { + m_stack.push_back(0.0); + } + break; + + case t1_seac: + emit_seac(); + break; + + case t1_endchar: + emit_op(t1_endchar); + m_ended = true; + break; + + default: + m_stack.clear(); // unknown: skip + break; + } + } + + // OtherSubr dispatch: flex (1 start / 2 add-point / 0 end) and hint + // replacement (3). The Type1 convention is `arg1..argN N othersubr# + // callothersubr`, so the operand stack top is the othersubr number, below it + // the argument count, below that the arguments. + void handle_othersubr() { + if (m_stack.size() < 2) { + m_stack.clear(); + return; + } + const auto othersubr = static_cast(m_stack.back()); + m_stack.pop_back(); + const auto argc = static_cast(m_stack.back()); + m_stack.pop_back(); + + std::vector args; + for (int i = 0; i < argc && !m_stack.empty(); ++i) { + args.push_back(m_stack.back()); + m_stack.pop_back(); + } + // args is reversed (top first); restore call order. + std::reverse(args.begin(), args.end()); + + switch (othersubr) { + case 1: // flex start + m_flex_active = true; + m_flex_points.clear(); + break; + case 2: // flex add-point marker (the rmoveto already collected the point) + break; + case 0: // flex end: emit two curves from the collected points + emit_flex(); + m_flex_active = false; + // OtherSubr 0 leaves the end x,y on the PS stack for `pop pop + // setcurrentpoint`. 
+ if (args.size() >= 3) { + m_ps_stack.push_back(args[2]); // y (popped second) + m_ps_stack.push_back(args[1]); // x (popped first) + } + break; + case 3: // hint replacement: result is the subr number, used by callsubr + m_ps_stack.push_back(args.empty() ? 3.0 : args[0]); + break; + default: + // Unknown OtherSubr: make the arguments available to subsequent pops. + for (auto it = args.rbegin(); it != args.rend(); ++it) { + m_ps_stack.push_back(*it); + } + break; + } + } + + // During flex the 7 points arrive as `dx dy rmoveto`; collect their deltas. + void collect_flex_point() { + const double dx = m_stack.size() >= 2 ? m_stack[m_stack.size() - 2] : 0.0; + const double dy = m_stack.empty() ? 0.0 : m_stack.back(); + m_flex_points.push_back({dx, dy}); + m_stack.clear(); + } + + // Emit the flex as two rrcurvetos. Point 1 is the reference point; points + // 2..7 are the two beziers. The first curve's leading delta folds in the + // reference delta (point 2 relative to the pre-flex point = d1 + d2). + void emit_flex() { + if (m_flex_points.size() < 7) { + m_flex_points.clear(); + return; + } + const auto &d = m_flex_points; + emit_width(); + emit_num(m_out, d[1].x + d[0].x); + emit_num(m_out, d[1].y + d[0].y); + emit_num(m_out, d[2].x); + emit_num(m_out, d[2].y); + emit_num(m_out, d[3].x); + emit_num(m_out, d[3].y); + m_out += static_cast(t1_rrcurveto); + emit_num(m_out, d[4].x); + emit_num(m_out, d[4].y); + emit_num(m_out, d[5].x); + emit_num(m_out, d[5].y); + emit_num(m_out, d[6].x); + emit_num(m_out, d[6].y); + m_out += static_cast(t1_rrcurveto); + m_flex_points.clear(); + } + + // seac: asb adx ady bchar achar. Emit the Type2 deprecated endchar-seac form + // `adx' ady bchar achar endchar`, adjusting adx for the accent side bearing. 
+ void emit_seac() { + if (m_stack.size() >= 5) { + const double asb = m_stack[0]; + const double adx = m_stack[1]; + const double ady = m_stack[2]; + const double bchar = m_stack[3]; + const double achar = m_stack[4]; + m_stack.clear(); + emit_width(); + emit_num(m_out, adx - asb + m_sbx); + emit_num(m_out, ady); + emit_num(m_out, bchar); + emit_num(m_out, achar); + m_out += static_cast<char>(t1_endchar); + } + m_ended = true; + } + + struct Point { + double x; + double y; + }; + + const std::vector<std::string> &m_subrs; + std::string m_out; + std::vector<double> m_stack; + std::vector<double> m_ps_stack; + + int m_width{}; + bool m_has_width{}; + bool m_width_pending{}; + double m_sbx{}; + bool m_sbx_pending{}; + bool m_ended{}; + + bool m_flex_active{}; + std::vector<Point> m_flex_points; +}; + +} // namespace + +Type2Charstring to_type2(const std::string_view type1, + const std::vector<std::string> &subrs) { + return Translator(subrs).run(type1); +} + +} // namespace odr::internal::font::type1 diff --git a/src/odr/internal/font/type1_charstring.hpp b/src/odr/internal/font/type1_charstring.hpp new file mode 100644 index 00000000..47bd0d6c --- /dev/null +++ b/src/odr/internal/font/type1_charstring.hpp @@ -0,0 +1,30 @@ +#pragma once + +#include +#include +#include + +namespace odr::internal::font::type1 { + +/// The result of translating a Type1 charstring to Type2 (CFF). +struct Type2Charstring { + std::string charstring; ///< the Type2 charstring (leading width included) + int width{}; ///< advance width from `hsbw`/`sbw`, in glyph units + bool has_width{}; ///< whether an `hsbw`/`sbw` set the width +}; + +/// Translate one **decrypted** Type1 charstring to a Type2 (CFF) charstring. +/// +/// Type1 and Type2 share most path operators; this flattens `callsubr` +/// (inlining @p subrs), folds `div`, lifts the `hsbw` side bearing into the +/// first move, drops Type1-only hint operators (`dotsection`, `*stem3`, hint +/// replacement) and translates flex (OtherSubrs 0-2) and the `seac` operator. 
The +/// advance width (`hsbw`) is returned separately rather than baked into the +/// charstring, so the caller emits it against the CFF `nominalWidthX`. +/// +/// Best-effort and display-oriented: hints are dropped (they affect rendering +/// quality, not glyph shape), and unknown operators are skipped. +[[nodiscard]] Type2Charstring to_type2(std::string_view type1, + const std::vector &subrs); + +} // namespace odr::internal::font::type1 diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt index 249ebc51..975180f4 100644 --- a/test/CMakeLists.txt +++ b/test/CMakeLists.txt @@ -55,6 +55,7 @@ add_executable(odr_test "src/internal/pdf/pdf_test_file_builder.cpp" "src/internal/font/cff_font.cpp" + "src/internal/font/type1_charstring.cpp" "src/internal/font/type1_crypt.cpp" "src/internal/font/type1_font.cpp" "src/internal/font/sfnt_font.cpp" diff --git a/test/src/internal/font/type1_charstring.cpp b/test/src/internal/font/type1_charstring.cpp new file mode 100644 index 00000000..b79e2a22 --- /dev/null +++ b/test/src/internal/font/type1_charstring.cpp @@ -0,0 +1,122 @@ +#include + +#include + +#include +#include + +using namespace odr::internal::font::type1; + +namespace { + +/// Encode an integer in the Type1/Type2 shared number forms (no 28/255 needed +/// for the small values used here). 
+void num(std::string &s, const int v) { + if (v >= -107 && v <= 107) { + s += static_cast(v + 139); + } else if (v >= 108 && v <= 1131) { + const int u = v - 108; + s += static_cast((u >> 8) + 247); + s += static_cast(u & 0xff); + } else if (v >= -1131 && v <= -108) { + const int u = -v - 108; + s += static_cast((u >> 8) + 251); + s += static_cast(u & 0xff); + } +} + +void op(std::string &s, const int o) { s += static_cast(o); } + +} // namespace + +TEST(Type1CharstringTest, HsbwWidthAndSideBearing) { + // sbx=10 wx=200 hsbw 100 0 rmoveto 50 50 rlineto endchar + std::string t1; + num(t1, 10); + num(t1, 200); + op(t1, 13); // hsbw + num(t1, 100); + num(t1, 0); + op(t1, 21); // rmoveto + num(t1, 50); + num(t1, 50); + op(t1, 5); // rlineto + op(t1, 14); // endchar + + const Type2Charstring out = to_type2(t1, {}); + EXPECT_TRUE(out.has_width); + EXPECT_EQ(out.width, 200); + + // Type2: [width 200][dx 100+sbx 10 = 110][dy 0] rmoveto [50][50] rlineto + // endchar. + std::string expected; + num(expected, 200); // width prepended + num(expected, 110); // 100 + side bearing 10 + num(expected, 0); + op(expected, 21); // rmoveto + num(expected, 50); + num(expected, 50); + op(expected, 5); // rlineto + op(expected, 14); // endchar + EXPECT_EQ(out.charstring, expected); +} + +TEST(Type1CharstringTest, FlattensCallSubr) { + // subr 0: 50 50 rlineto return + std::string subr0; + num(subr0, 50); + num(subr0, 50); + op(subr0, 5); // rlineto + op(subr0, 11); // return + + // 0 0 hsbw 0 0 rmoveto 0 callsubr endchar + std::string t1; + num(t1, 0); + num(t1, 0); + op(t1, 13); // hsbw + num(t1, 0); + num(t1, 0); + op(t1, 21); // rmoveto + num(t1, 0); + op(t1, 10); // callsubr 0 + op(t1, 14); // endchar + + const Type2Charstring out = to_type2(t1, {subr0}); + + // The subr's rlineto is inlined; expect width(0) rmoveto, then rlineto, then + // endchar. 
+ std::string expected; + num(expected, 0); // width + num(expected, 0); + num(expected, 0); + op(expected, 21); // rmoveto + num(expected, 50); + num(expected, 50); + op(expected, 5); // rlineto (from subr) + op(expected, 14); // endchar + EXPECT_EQ(out.charstring, expected); +} + +TEST(Type1CharstringTest, FoldsDiv) { + // 0 0 hsbw 600 2 div 0 rmoveto endchar -> dx = 300 + std::string t1; + num(t1, 0); + num(t1, 0); + op(t1, 13); // hsbw + num(t1, 600); + num(t1, 2); + t1 += static_cast(12); + t1 += static_cast(12); // div + num(t1, 0); + op(t1, 21); // rmoveto + op(t1, 14); // endchar + + const Type2Charstring out = to_type2(t1, {}); + std::string expected; + num(expected, 0); // width + num(expected, 300); // 600 / 2 + num(expected, 0); + op(expected, 21); // rmoveto + op(expected, 14); // endchar + EXPECT_EQ(out.charstring, expected); +} From 29cdc2ff03dcfaa014249922be092c86e5be536e Mon Sep 17 00:00:00 2001 From: Andreas Stefl Date: Tue, 23 Jun 2026 22:19:31 +0200 Subject: [PATCH 6/6] PDF stage 3.5: Type1 -> CFF assembly + /FontFile wiring (end-to-end) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit type1::to_cff translates every glyph (to_type2, flattening /Subrs), places .notdef at glyph 0 (synthesizing one when absent) and assembles a CFF via the builder. load_embedded_font now reads /FontFile: parse the Type1 program, convert to CFF, and hold it as a CffFont — so embedded Type1 reuses the entire 3.4 CFF path (PUA re-encode, @font-face wrap, reverse map) with no new abstract::Font subclass. Simple-font glyph selection by PostScript name (PDF /Encoding -> name -> glyph) is the shared CFF/Type1 follow-up tied to the AGL/name-mapping decision; composite and the wrap/display path work today. Tests: a Type1 program converts to a CFF that reads back through CffFont (glyph count incl. synthesized .notdef, names) and wraps to a loadable OTTO. Full font + PDF + HTML corpus green (460 tests). 
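The `.notdef` placement described above can be sketched on glyph names alone. This is an illustration under assumptions: `order_with_notdef_first` is a hypothetical helper that reorders names only, whereas the real `to_cff` also translates each charstring while ordering.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: return the glyph names with ".notdef" at index 0,
// synthesizing one when the font has none. The relative order of all other
// glyphs is preserved.
std::vector<std::string>
order_with_notdef_first(std::vector<std::string> names) {
  const auto it = std::find(names.begin(), names.end(), ".notdef");
  if (it == names.end()) {
    names.insert(names.begin(), std::string(".notdef"));
  } else {
    // Rotate the found entry to the front; entries before it shift right.
    std::rotate(names.begin(), it, it + 1);
  }
  return names;
}
```

For a font with glyphs `A` and `B` and no `.notdef`, this gives three entries, which is the glyph count the `ConvertsToLoadableCff` test expects.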
Co-Authored-By: Claude Opus 4.8 Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz --- src/odr/internal/font/type1_font.cpp | 37 ++++++++++++++++++++ src/odr/internal/font/type1_font.hpp | 7 ++++ src/odr/internal/pdf/pdf_document_parser.cpp | 17 +++++++-- test/src/internal/font/type1_font.cpp | 22 ++++++++++++ 4 files changed, 80 insertions(+), 3 deletions(-) diff --git a/src/odr/internal/font/type1_font.cpp b/src/odr/internal/font/type1_font.cpp index b149c6ed..e53c2974 100644 --- a/src/odr/internal/font/type1_font.cpp +++ b/src/odr/internal/font/type1_font.cpp @@ -1,5 +1,7 @@ #include +#include +#include #include #include @@ -292,4 +294,39 @@ void Type1Program::parse_private(const std::string_view decrypted) { } } +std::string to_cff(const Type1Program &program) { + // Order glyphs with `.notdef` at index 0 (CFF requires it). Translate each + // Type1 charstring to Type2; the width rides in the charstring (the CFF + // builder uses nominalWidthX = 0). + std::vector glyphs; + glyphs.reserve(program.glyphs().size() + 1); + + const auto translate = [&](const Glyph &glyph) { + Type2Charstring t2 = to_type2(glyph.charstring, program.subrs()); + glyphs.push_back({glyph.name, std::move(t2.charstring)}); + }; + + // .notdef first. 
+ std::size_t notdef = program.glyphs().size(); + for (std::size_t i = 0; i < program.glyphs().size(); ++i) { + if (program.glyphs()[i].name == ".notdef") { + notdef = i; + break; + } + } + if (notdef < program.glyphs().size()) { + translate(program.glyphs()[notdef]); + } else { + glyphs.push_back({".notdef", std::string(1, static_cast(14))}); + } + for (std::size_t i = 0; i < program.glyphs().size(); ++i) { + if (i != notdef) { + translate(program.glyphs()[i]); + } + } + + return cff::build_cff(program.name(), glyphs, /*default_width=*/0, + /*nominal_width=*/0, program.font_bbox()); +} + } // namespace odr::internal::font::type1 diff --git a/src/odr/internal/font/type1_font.hpp b/src/odr/internal/font/type1_font.hpp index 6dc56734..037df539 100644 --- a/src/odr/internal/font/type1_font.hpp +++ b/src/odr/internal/font/type1_font.hpp @@ -78,4 +78,11 @@ class Type1Program { int m_len_iv{4}; }; +/// Convert a parsed Type1 program to a **CFF** font: translate every glyph's +/// charstring to Type2 (`to_type2`, flattening the program's `/Subrs`) and +/// assemble via the CFF builder, with `.notdef` placed at glyph 0. The result +/// is a bare CFF that `cff::CffFont` reads and `cff::wrap_to_otf` wraps for the +/// browser — so an embedded Type1 font reuses the entire 3.4 CFF path. 
+[[nodiscard]] std::string to_cff(const Type1Program &program); + } // namespace odr::internal::font::type1 diff --git a/src/odr/internal/pdf/pdf_document_parser.cpp b/src/odr/internal/pdf/pdf_document_parser.cpp index cae04371..09830556 100644 --- a/src/odr/internal/pdf/pdf_document_parser.cpp +++ b/src/odr/internal/pdf/pdf_document_parser.cpp @@ -5,6 +5,7 @@ #include #include +#include #include #include #include @@ -276,9 +277,10 @@ util::math::Transform2D parse_matrix(DocumentParser &parser, Object object) { /// interface: `/FontFile2` (TrueType / `CIDFontType2`) -> `SfntFont`, and /// `/FontFile3` (CFF / `Type1C` / `CIDFontType0C`, or OpenType-CFF) -> either /// an `SfntFont` (when the program is already a full SFNT, `/Subtype -/// /OpenType`) or a bare `CffFont`. `/FontFile` (Type1) is not yet read and -/// leaves `font.embedded_font` null, so such fonts keep rendering through the -/// fallback path. A malformed font is logged and left null. +/// /OpenType`) or a bare `CffFont`. `/FontFile` (Type1) is translated to a CFF +/// (`type1::to_cff`) and read as a `CffFont`, so it reuses the whole CFF path. +/// A malformed font is logged and leaves `font.embedded_font` null, so such +/// fonts keep rendering through the fallback path. void load_embedded_font(DocumentParser &parser, const Dictionary &descriptor, Font &font) { try { @@ -301,6 +303,15 @@ void load_embedded_font(DocumentParser &parser, const Dictionary &descriptor, font.embedded_font = std::make_shared(std::move(data)); } + } else if (descriptor.has_key("FontFile") && + descriptor["FontFile"].is_reference()) { + // Type1 (`/FontFile`): translate the program to a CFF, then read it as a + // CffFont so the whole CFF path (re-encode / wrap / reverse map) applies. 
+ std::string data = + parser.read_decoded_stream(descriptor["FontFile"].as_reference()); + const font::type1::Type1Program program(data); + font.embedded_font = + std::make_shared(font::type1::to_cff(program)); } } catch (const std::exception &e) { ODR_WARNING(parser.logger(), diff --git a/test/src/internal/font/type1_font.cpp b/test/src/internal/font/type1_font.cpp index a372b0b2..7ef5d974 100644 --- a/test/src/internal/font/type1_font.cpp +++ b/test/src/internal/font/type1_font.cpp @@ -1,5 +1,9 @@ #include +#include +#include +#include + #include #include @@ -109,3 +113,21 @@ TEST(Type1FontTest, DecryptsCharstringsAndSubrs) { ASSERT_EQ(font.subrs().size(), 1u); EXPECT_EQ(font.subrs()[0], std::string("\x0b", 1)); // return } + +TEST(Type1FontTest, ConvertsToLoadableCff) { + namespace cff = odr::internal::font::cff; + namespace sfnt = odr::internal::font::sfnt; + + const Type1Program program{build_type1()}; + const std::string cff_bytes = to_cff(program); + + const cff::CffFont font{cff_bytes}; + EXPECT_EQ(font.format(), odr::FontFormat::cff); + // .notdef (synthesized, since the test font has none) + A + B. + EXPECT_EQ(font.glyph_count(), 3); + EXPECT_EQ(font.glyph_name(1), "A"); + EXPECT_EQ(font.glyph_name(2), "B"); + + // The converted CFF wraps into a browser-loadable OTTO (the 3.4 path). + EXPECT_TRUE(sfnt::SfntFont::is_sfnt(cff::wrap_to_otf(font))); +}