From 6397b2e7aee8ff8dc23598bbbae67f2fb0361f7f Mon Sep 17 00:00:00 2001
From: Andreas Stefl <stefl.andreas@gmail.com>
Date: Tue, 23 Jun 2026 18:20:05 +0200
Subject: [PATCH 1/6] =?UTF-8?q?PDF=20stage=203.5:=20design=20=E2=80=94=20T?=
 =?UTF-8?q?ype1=20(/FontFile)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Seed the stage-3.5 branch. Read a Type1 program (eexec + charstring
decryption), translate Type1 -> Type2 charstrings, build a CFF and reuse
3.4's CFF -> OTF path; reverse map via glyph names -> AGL. Stacked on 3.4.
Implementation follows.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
---
 docs/design/pdf/stage-3.5-type1.md | 79 ++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)
 create mode 100644 docs/design/pdf/stage-3.5-type1.md

diff --git a/docs/design/pdf/stage-3.5-type1.md b/docs/design/pdf/stage-3.5-type1.md
new file mode 100644
index 00000000..8028869f
--- /dev/null
+++ b/docs/design/pdf/stage-3.5-type1.md
@@ -0,0 +1,79 @@
+# PDF stage 3.5 — Type1 (`/FontFile`)
+
+Design for the Type1 font sub-stage. Status: **design draft** (no implementation
+yet — this PR seeds the branch). Roadmap entry lives in
+[`src/odr/internal/pdf/AGENTS.md`](../../../src/odr/internal/pdf/AGENTS.md).
+
+Stacked on **3.4** — the whole point is to reuse 3.4's CFF → OTF path, so Type1
+support is "translate Type1 to CFF, then everything downstream is 3.4".
+
+## Goal
+
+Read a PDF `/FontFile` (a Type1 / PostScript font program) and render it through
+the same `@font-face` + dual-layer pipeline as TrueType (3.3) and CFF (3.4). This
+is the hardest single font piece, but it is precisely specified (Adobe *Type 1
+Font Format* spec; pdf.js serves as a reference implementation).
+
+## What gets read (`internal/font/type1_font.{hpp,cpp}`)
+
+`/FontFile` has three parts sized by the embedded font stream's `/Length1`
+(clear ASCII), `/Length2` (binary eexec), `/Length3` (zeros + `cleartomark`):
+
+1. **Clear text** — `/Encoding` (code → glyph name, or `StandardEncoding`),
+   `/FontMatrix`, `/FontBBox`, `/FontName`.
+2. **eexec section** — decrypt with R = 55665 (skip the 4 random bytes), then
+   parse:
+   - **`/Subrs`** — index → (decrypted) charstring.
+   - **`/CharStrings`** — glyph name → charstring; each charstring decrypted
+     with R = 4330, `lenIV` (default 4) leading bytes dropped.
+3. **Trailer** — ignored.
+
+PFB segment framing (`0x80` markers) is handled if present; PDF embeds the raw
+three-segment form.
+
+## Type1 → Type2 charstring translation (the core)
+
+Translate each decrypted **Type1** charstring into a **Type2 (CFF)** charstring,
+then build a CFF and hand it to 3.4's wrap. The non-trivial cases:
+
+- `hsbw` → seed the left side bearing + advance width, emit as the Type2 width +
+  initial `rmoveto`.
+- `seac` (accented composite) → decompose into base + accent (StandardEncoding
+  lookup), or emit `endchar` with the seac operands (Type2 deprecated-seac form).
+- `div`, `callsubr` / `return` (Subrs), and the `callothersubr` family —
+  **flex** (OtherSubrs 0–2) and **hint replacement** (OtherSubr 3) must be
+  interpreted/flattened, not passed through; this is the part that needs care.
+- hint operators (`hstem`/`vstem`/`dotsection`) → Type2 equivalents (or drop;
+  display tolerates missing hints).
+
+Output: a `cff::CffFont` (3.4) built from the translated charstrings, a charset
+from the glyph names, and a private dict carrying the widths. Everything after
+that — OTF wrap, PUA re-encode, OTS gate — is 3.4 unchanged.
+
+## Reverse map
+
+Charstring **glyph names** → AGL → Unicode (reuse `pdf_encoding`), same shape as
+CFF. A symbolic Type1 with a built-in encoding becomes selectable via this map.
+
+## PDF wiring (reuse 3.3 / 3.4)
+
+- `pdf_document_parser`: `/FontFile` → `Type1Font` → (translate) → `CffFont` →
+  `Font::embedded_font`.
+- `Font::glyph_for_code` simple-font branch resolves code → glyph name via the
+  PDF `/Encoding` (Differences over base) or the font's built-in `/Encoding`,
+  then name → glyph id through the CFF charset.
+- `to_unicode` reverse-map fallback and HTML dual-layer emission unchanged.
+
+## Scope / non-goals
+
+- CID-keyed Type1 (Type1 in a Type0, rare) — defer unless a corpus file needs it.
+- Multiple Master Type1 — out of scope.
+- Hinting fidelity is best-effort (display only).
+
+## Tests
+
+Font-only, assertion-based: a minimal hand-built (or frozen-literal) Type1 —
+eexec + charstring decryption, an `hsbw` + a `flex`/hint-replacement charstring
+translated and round-tripped through 3.4's CFF wrap and OTS, the glyph-name
+reverse map. Plus a `pdf_document_parser` case: `/FontFile` → `embedded_font`
+with Unicode recovery.

From f9215ad13b747a8cb106873cba71cf133871b333 Mon Sep 17 00:00:00 2001
From: Andreas Stefl <stefl.andreas@gmail.com>
Date: Tue, 23 Jun 2026 21:48:29 +0200
Subject: [PATCH 2/6] PDF stage 3.5: Type1 eexec/charstring decryption
 primitives
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

First self-contained piece of 3.5: the Type1 running-key cipher
(font::type1::decrypt) and its two entry points — decrypt_eexec (key 55665,
4-byte skip, binary or ASCII-hex/PFA auto-detected) and decrypt_charstring
(key 4330, /lenIV-aware). These don't depend on the CFF translation work, so
they land ahead of the full Type1Font reader (eexec parse + Type1->Type2
charstring translation -> reuse 3.4's CFF->OTF path).

Tests: round-trips against an independent forward-cipher reference (so they're
not circular), the lenIV override, and the hex eexec form.
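
For reviewers, a minimal standalone sketch of the cipher round trip the
tests exercise. The sketch_* names are hypothetical and exist only in this
note; they are not the functions added by this patch. The key update is
done in 32-bit unsigned arithmetic, an assumption made here to keep the
intermediate product well defined before truncating to 16 bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Cipher constants from the Type 1 format.
constexpr std::uint32_t kC1 = 52845;
constexpr std::uint32_t kC2 = 22719;

// Forward direction (plaintext -> ciphertext), seeded with key r.
std::string sketch_encrypt(std::string_view plain, std::uint16_t r) {
  std::string out;
  out.reserve(plain.size());
  for (const char ch : plain) {
    const auto p = static_cast<std::uint8_t>(ch);
    const auto c = static_cast<std::uint8_t>(p ^ (r >> 8));
    out += static_cast<char>(c);
    // The running key always advances on the *cipher* byte.
    r = static_cast<std::uint16_t>((c + std::uint32_t{r}) * kC1 + kC2);
  }
  return out;
}

// Backward direction (ciphertext -> plaintext); the first `skip` plaintext
// bytes are random padding and are dropped.
std::string sketch_decrypt(std::string_view cipher, std::uint16_t r,
                           std::size_t skip) {
  std::string out;
  out.reserve(cipher.size());
  for (const char ch : cipher) {
    const auto c = static_cast<std::uint8_t>(ch);
    out += static_cast<char>(c ^ (r >> 8));
    r = static_cast<std::uint16_t>((c + std::uint32_t{r}) * kC1 + kC2);
  }
  return skip >= out.size() ? std::string{} : out.substr(skip);
}
```

Encrypting "ABCD" + payload with key 55665 and decrypting with skip = 4
should give back the payload; key 4330 with skip = lenIV is the charstring
case.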

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
---
 CMakeLists.txt                         |  1 +
 src/odr/internal/font/type1_crypt.cpp  | 94 ++++++++++++++++++++++++++
 src/odr/internal/font/type1_crypt.hpp  | 34 ++++++++++
 test/CMakeLists.txt                    |  1 +
 test/src/internal/font/type1_crypt.cpp | 65 ++++++++++++++++++
 5 files changed, 195 insertions(+)
 create mode 100644 src/odr/internal/font/type1_crypt.cpp
 create mode 100644 src/odr/internal/font/type1_crypt.hpp
 create mode 100644 test/src/internal/font/type1_crypt.cpp

diff --git a/CMakeLists.txt b/CMakeLists.txt
index ccbd8eab..f74f4945 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -201,6 +201,7 @@ set(ODR_SOURCE_FILES
         "src/odr/internal/font/cff_font.cpp"
         "src/odr/internal/font/cff_standard_strings.cpp"
         "src/odr/internal/font/cff_transform.cpp"
+        "src/odr/internal/font/type1_crypt.cpp"
         "src/odr/internal/font/sfnt_font.cpp"
         "src/odr/internal/font/sfnt_parser.cpp"
         "src/odr/internal/font/sfnt_transform.cpp"
diff --git a/src/odr/internal/font/type1_crypt.cpp b/src/odr/internal/font/type1_crypt.cpp
new file mode 100644
index 00000000..3805c631
--- /dev/null
+++ b/src/odr/internal/font/type1_crypt.cpp
@@ -0,0 +1,94 @@
+#include <odr/internal/font/type1_crypt.hpp>
+
+#include <cctype>
+#include <cstdint>
+#include <string>
+
+namespace odr::internal::font::type1 {
+
+namespace {
+
+constexpr std::uint16_t c1 = 52845;
+constexpr std::uint16_t c2 = 22719;
+
+[[nodiscard]] bool is_hex_digit(const unsigned char c) {
+  return std::isxdigit(c) != 0;
+}
+
+/// Hex-decode @p in, skipping whitespace; stops at the first non-hex, non-space
+/// byte (the binary `eexec` form never reaches here).
+[[nodiscard]] std::string hex_decode(std::string_view in) {
+  std::string out;
+  int high = -1;
+  for (const char ch : in) {
+    const auto c = static_cast<unsigned char>(ch);
+    if (std::isspace(c) != 0) {
+      continue;
+    }
+    if (!is_hex_digit(c)) {
+      break;
+    }
+    const int value = (c <= '9')   ? c - '0'
+                      : (c <= 'F') ? c - 'A' + 10
+                                   : c - 'a' + 10;
+    if (high < 0) {
+      high = value;
+    } else {
+      out += static_cast<char>((high << 4) | value);
+      high = -1;
+    }
+  }
+  return out;
+}
+
+/// Whether @p eexec is the ASCII-hex form: the first four non-space bytes are
+/// all hex digits (Type1 spec 7.2 — the binary form is detected as not-this).
+[[nodiscard]] bool looks_like_hex(std::string_view eexec) {
+  int seen = 0;
+  for (const char ch : eexec) {
+    const auto c = static_cast<unsigned char>(ch);
+    if (std::isspace(c) != 0) {
+      continue;
+    }
+    if (!is_hex_digit(c)) {
+      return false;
+    }
+    if (++seen == 4) {
+      return true;
+    }
+  }
+  return false;
+}
+
+} // namespace
+
+std::string decrypt(const std::string_view cipher, const std::uint16_t key,
+                    const std::size_t skip) {
+  std::uint16_t r = key;
+  std::string out;
+  out.reserve(cipher.size());
+  for (const char ch : cipher) {
+    const auto c = static_cast<std::uint8_t>(ch);
+    out += static_cast<char>(c ^ (r >> 8));
+    r = static_cast<std::uint16_t>((c + r) * std::uint32_t{c1} + c2);
+  }
+  if (skip >= out.size()) {
+    return {};
+  }
+  return out.substr(skip);
+}
+
+std::string decrypt_eexec(const std::string_view eexec) {
+  if (looks_like_hex(eexec)) {
+    const std::string binary = hex_decode(eexec);
+    return decrypt(binary, 55665, 4);
+  }
+  return decrypt(eexec, 55665, 4);
+}
+
+std::string decrypt_charstring(const std::string_view charstring,
+                               const std::size_t len_iv) {
+  return decrypt(charstring, 4330, len_iv);
+}
+
+} // namespace odr::internal::font::type1
diff --git a/src/odr/internal/font/type1_crypt.hpp b/src/odr/internal/font/type1_crypt.hpp
new file mode 100644
index 00000000..388d2f5e
--- /dev/null
+++ b/src/odr/internal/font/type1_crypt.hpp
@@ -0,0 +1,34 @@
+#pragma once
+
+#include <cstddef>
+#include <cstdint>
+#include <string>
+#include <string_view>
+
+namespace odr::internal::font::type1 {
+
+/// Type1 (Adobe Type 1 Font Format) `eexec` / charstring decryption.
+///
+/// Both the `eexec`-encrypted portion of the font program and each individual
+/// charstring use the same stream cipher with different keys: a 16-bit running
+/// key `R`, constants c1 = 52845 / c2 = 22719, where each plaintext byte is
+/// `cipher ^ (R >> 8)` and `R = (cipher + R) * c1 + c2` (mod 2^16). The leading
+/// @p skip bytes of plaintext are random padding and discarded.
+
+/// Decrypt @p cipher with the running-key cipher seeded at @p key, discarding
+/// the first @p skip plaintext bytes.
+[[nodiscard]] std::string decrypt(std::string_view cipher, std::uint16_t key,
+                                  std::size_t skip);
+
+/// Decrypt the `eexec` section (key 55665, 4 random bytes). Accepts either the
+/// binary form (PDF `/FontFile`, the `/Length2` portion) or the ASCII-hex form
+/// (PFA fonts): when the section's first four non-whitespace bytes are all hex
+/// digits it is hex-decoded first.
+[[nodiscard]] std::string decrypt_eexec(std::string_view eexec);
+
+/// Decrypt one charstring (key 4330), discarding @p len_iv leading bytes
+/// (the font's `/lenIV`, default 4).
+[[nodiscard]] std::string decrypt_charstring(std::string_view charstring,
+                                             std::size_t len_iv = 4);
+
+} // namespace odr::internal::font::type1
diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt
index 95fe4cf8..9bcd4254 100644
--- a/test/CMakeLists.txt
+++ b/test/CMakeLists.txt
@@ -55,6 +55,7 @@ add_executable(odr_test
         "src/internal/pdf/pdf_test_file_builder.cpp"
 
         "src/internal/font/cff_font.cpp"
+        "src/internal/font/type1_crypt.cpp"
         "src/internal/font/sfnt_font.cpp"
         "src/internal/font/sfnt_transform.cpp"
         "src/internal/font/font_file.cpp"
diff --git a/test/src/internal/font/type1_crypt.cpp b/test/src/internal/font/type1_crypt.cpp
new file mode 100644
index 00000000..7bfbd4d8
--- /dev/null
+++ b/test/src/internal/font/type1_crypt.cpp
@@ -0,0 +1,65 @@
+#include <odr/internal/font/type1_crypt.hpp>
+
+#include <gtest/gtest.h>
+
+#include <cstdint>
+#include <string>
+
+using namespace odr::internal::font::type1;
+
+namespace {
+
+/// Independent reference implementation of the Type1 *encryption* (the inverse
+/// of `decrypt`), so the round-trip tests are not circular: this codes the
+/// cipher forwards (plaintext -> ciphertext), `decrypt` codes it backwards.
+std::string encrypt(const std::string &plain, std::uint16_t r,
+                    const std::string &random_prefix) {
+  constexpr std::uint16_t c1 = 52845;
+  constexpr std::uint16_t c2 = 22719;
+  std::string out;
+  const std::string full = random_prefix + plain;
+  for (const char ch : full) {
+    const auto p = static_cast<std::uint8_t>(ch);
+    const auto cipher = static_cast<std::uint8_t>(p ^ (r >> 8));
+    out += static_cast<char>(cipher);
+    r = static_cast<std::uint16_t>((cipher + r) * std::uint32_t{c1} + c2);
+  }
+  return out;
+}
+
+} // namespace
+
+TEST(Type1CryptTest, EexecRoundTrip) {
+  const std::string plain = "/Private 10 dict dup begin";
+  const std::string cipher = encrypt(plain, 55665, "ABCD");
+  EXPECT_EQ(decrypt_eexec(cipher), plain);
+}
+
+TEST(Type1CryptTest, CharstringRoundTrip) {
+  const std::string plain("\x0d\x0e\xff\x00\x01", 5); // hsbw-ish bytes
+  const std::string cipher = encrypt(plain, 4330, "wxyz");
+  EXPECT_EQ(decrypt_charstring(cipher), plain); // default lenIV = 4
+}
+
+TEST(Type1CryptTest, CharstringHonoursLenIv) {
+  const std::string plain = "hello";
+  const std::string cipher = encrypt(plain, 4330, "");
+  EXPECT_EQ(decrypt_charstring(cipher, 0), plain);
+}
+
+TEST(Type1CryptTest, EexecAcceptsHexForm) {
+  const std::string plain = "dup /CharStrings";
+  const std::string binary = encrypt(plain, 55665, "0000");
+  // Hex-encode the binary eexec (PFA form), with whitespace the decoder skips.
+  std::string hex;
+  const char *digits = "0123456789abcdef";
+  for (std::size_t i = 0; i < binary.size(); ++i) {
+    const auto b = static_cast<std::uint8_t>(binary[i]);
+    hex += digits[b >> 4];
+    hex += digits[b & 0x0f];
+    if (i % 8 == 7) {
+      hex += '\n';
+    }
+  }
+  EXPECT_EQ(decrypt_eexec(hex), plain);
+}

From cf306e2d7f9ed19d1566bfae6ee58522f7459837 Mon Sep 17 00:00:00 2001
From: Andreas Stefl <stefl.andreas@gmail.com>
Date: Tue, 23 Jun 2026 22:00:13 +0200
Subject: [PATCH 3/6] PDF stage 3.5: Type1 program parser (Type1Program)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Parse an Adobe Type1 font program into its decrypted parts: split the
clear-text header / eexec section / trailer, read /FontName, /FontMatrix,
/FontBBox and /Encoding (StandardEncoding or a custom dup-code-name-put
array) from the header, decrypt the eexec section (type1_crypt) and extract
every glyph's decrypted charstring plus /Subrs (RD/-| binary entries,
/lenIV-aware). PFB segment framing is stripped if present. Charstrings are
not yet interpreted — that's the Type1->Type2 translation that follows,
feeding 3.4's CFF->OTF path.

Tests: a hand-built encrypted Type1 program (independent forward cipher) —
magic, header/encoding parse, and the decrypted charstrings/subrs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
---
 CMakeLists.txt                        |   1 +
 src/odr/internal/font/type1_font.cpp  | 295 ++++++++++++++++++++++++++
 src/odr/internal/font/type1_font.hpp  |  81 +++++++
 test/CMakeLists.txt                   |   1 +
 test/src/internal/font/type1_font.cpp | 111 ++++++++++
 5 files changed, 489 insertions(+)
 create mode 100644 src/odr/internal/font/type1_font.cpp
 create mode 100644 src/odr/internal/font/type1_font.hpp
 create mode 100644 test/src/internal/font/type1_font.cpp

diff --git a/CMakeLists.txt b/CMakeLists.txt
index f74f4945..d6eb4f6a 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -202,6 +202,7 @@ set(ODR_SOURCE_FILES
         "src/odr/internal/font/cff_standard_strings.cpp"
         "src/odr/internal/font/cff_transform.cpp"
         "src/odr/internal/font/type1_crypt.cpp"
+        "src/odr/internal/font/type1_font.cpp"
         "src/odr/internal/font/sfnt_font.cpp"
         "src/odr/internal/font/sfnt_parser.cpp"
         "src/odr/internal/font/sfnt_transform.cpp"
diff --git a/src/odr/internal/font/type1_font.cpp b/src/odr/internal/font/type1_font.cpp
new file mode 100644
index 00000000..b149c6ed
--- /dev/null
+++ b/src/odr/internal/font/type1_font.cpp
@@ -0,0 +1,295 @@
+#include <odr/internal/font/type1_font.hpp>
+
+#include <odr/internal/font/type1_crypt.hpp>
+
+#include <charconv>
+#include <cstdint>
+#include <optional>
+#include <stdexcept>
+#include <string>
+#include <string_view>
+#include <vector>
+
+namespace odr::internal::font::type1 {
+
+namespace {
+
+[[nodiscard]] bool is_ps_space(const char c) {
+  return c == ' ' || c == '\t' || c == '\r' || c == '\n' || c == '\f' ||
+         c == '\0';
+}
+
+/// Skip PostScript whitespace starting at @p p.
+[[nodiscard]] std::size_t skip_space(std::string_view s, std::size_t p) {
+  while (p < s.size() && is_ps_space(s[p])) {
+    ++p;
+  }
+  return p;
+}
+
+/// Read a whitespace-delimited token starting at @p p; advances @p p past it.
+[[nodiscard]] std::string_view read_token(std::string_view s, std::size_t &p) {
+  p = skip_space(s, p);
+  const std::size_t begin = p;
+  while (p < s.size() && !is_ps_space(s[p])) {
+    ++p;
+  }
+  return s.substr(begin, p - begin);
+}
+
+[[nodiscard]] bool parse_int(std::string_view token, int &out) {
+  const char *begin = token.data();
+  const char *end = begin + token.size();
+  const auto [ptr, ec] = std::from_chars(begin, end, out);
+  return ec == std::errc() && ptr == end;
+}
+
+[[nodiscard]] double parse_double(std::string_view token) {
+  // std::from_chars for double is not universally available; std::stod needs a
+  // null-terminated copy.
+  try {
+    return std::stod(std::string(token));
+  } catch (const std::exception &) {
+    return 0.0;
+  }
+}
+
+/// Parse the numbers inside the next `[...]` or `{...}` after @p key in @p s.
+[[nodiscard]] std::vector<double> parse_number_array(std::string_view s,
+                                                     std::string_view key) {
+  std::vector<double> out;
+  const std::size_t k = s.find(key);
+  if (k == std::string_view::npos) {
+    return out;
+  }
+  std::size_t open = s.find_first_of("[{", k);
+  if (open == std::string_view::npos) {
+    return out;
+  }
+  const std::size_t close = s.find_first_of("]}", open);
+  if (close == std::string_view::npos) {
+    return out;
+  }
+  std::size_t p = open + 1;
+  while (p < close) {
+    const std::string_view token = read_token(s.substr(0, close), p);
+    if (token.empty()) {
+      break;
+    }
+    out.push_back(parse_double(token));
+  }
+  return out;
+}
+
+/// Read the `length` binary bytes of an `RD`/`-|` value: at @p p sits the
+/// length integer, then the RD operator, then exactly one space, then the
+/// bytes. On success returns the bytes and advances @p p past them; on failure
+/// returns nullopt.
+[[nodiscard]] std::optional<std::string_view> read_rd_binary(std::string_view s,
+                                                             std::size_t &p) {
+  std::size_t q = p;
+  const std::string_view length_token = read_token(s, q);
+  int length = 0;
+  if (!parse_int(length_token, length) || length < 0) {
+    return std::nullopt;
+  }
+  const std::string_view rd = read_token(s, q); // "RD" or "-|"
+  if (rd != "RD" && rd != "-|") {
+    return std::nullopt;
+  }
+  // Exactly one space separates the RD operator from the binary data.
+  if (q >= s.size()) {
+    return std::nullopt;
+  }
+  ++q; // the single delimiter space
+  if (q + static_cast<std::size_t>(length) > s.size()) {
+    return std::nullopt;
+  }
+  const std::string_view bytes = s.substr(q, static_cast<std::size_t>(length));
+  p = q + static_cast<std::size_t>(length);
+  return bytes;
+}
+
+} // namespace
+
+bool Type1Program::is_type1(const std::string_view data) {
+  if (data.size() >= 2 && static_cast<std::uint8_t>(data[0]) == 0x80) {
+    return true; // PFB segment marker
+  }
+  return data.substr(0, 256).find("%!PS-AdobeFont") != std::string_view::npos ||
+         data.substr(0, 256).find("%!FontType1") != std::string_view::npos;
+}
+
+Type1Program::Type1Program(std::string_view program) {
+  // Strip PFB segment framing if present: each segment is `0x80 type len32le`
+  // followed by `len` bytes (type 1 = ASCII, 2 = binary, 3 = EOF).
+  std::string unframed;
+  if (!program.empty() && static_cast<std::uint8_t>(program[0]) == 0x80) {
+    std::size_t p = 0;
+    while (p + 6 <= program.size() &&
+           static_cast<std::uint8_t>(program[p]) == 0x80) {
+      const std::uint8_t type = static_cast<std::uint8_t>(program[p + 1]);
+      if (type == 3) {
+        break;
+      }
+      const std::uint32_t len =
+          static_cast<std::uint8_t>(program[p + 2]) |
+          (static_cast<std::uint8_t>(program[p + 3]) << 8) |
+          (static_cast<std::uint8_t>(program[p + 4]) << 16) |
+          (static_cast<std::uint32_t>(static_cast<std::uint8_t>(program[p + 5]))
+           << 24);
+      p += 6;
+      if (p + len > program.size()) {
+        break;
+      }
+      unframed.append(program.substr(p, len));
+      p += len;
+    }
+    program = unframed;
+  }
+
+  const std::size_t eexec = program.find("eexec");
+  if (eexec == std::string_view::npos) {
+    throw std::runtime_error("type1: no eexec section");
+  }
+
+  parse_clear(program.substr(0, eexec));
+
+  // The encrypted blob begins after `eexec` and its trailing whitespace.
+  const std::size_t blob = skip_space(program, eexec + 5);
+  const std::string decrypted = decrypt_eexec(program.substr(blob));
+  parse_private(decrypted);
+
+  if (m_glyphs.empty()) {
+    throw std::runtime_error("type1: no /CharStrings");
+  }
+}
+
+void Type1Program::parse_clear(const std::string_view clear) {
+  if (const std::size_t k = clear.find("/FontName");
+      k != std::string_view::npos) {
+    std::size_t p = clear.find('/', k + 1);
+    if (p != std::string_view::npos) {
+      ++p;
+      const std::size_t begin = p;
+      while (p < clear.size() && !is_ps_space(clear[p])) {
+        ++p;
+      }
+      m_name = std::string(clear.substr(begin, p - begin));
+    }
+  }
+
+  if (const std::vector<double> matrix =
+          parse_number_array(clear, "/FontMatrix");
+      matrix.size() == 6) {
+    m_font_matrix = matrix;
+  }
+  if (const std::vector<double> bbox = parse_number_array(clear, "/FontBBox");
+      bbox.size() == 4) {
+    m_font_bbox = {
+        static_cast<std::int16_t>(bbox[0]), static_cast<std::int16_t>(bbox[1]),
+        static_cast<std::int16_t>(bbox[2]), static_cast<std::int16_t>(bbox[3])};
+  }
+
+  // /Encoding: `StandardEncoding def`, or a custom array built with
+  // `dup <code> /<name> put` lines.
+  const std::size_t enc = clear.find("/Encoding");
+  if (enc != std::string_view::npos) {
+    const std::string_view after = clear.substr(enc);
+    if (after.substr(0, 64).find("StandardEncoding") !=
+        std::string_view::npos) {
+      m_standard_encoding = true;
+    } else {
+      m_standard_encoding = false;
+      std::size_t p = 0;
+      while ((p = after.find("dup ", p)) != std::string_view::npos) {
+        std::size_t q = p + 4;
+        int code = 0;
+        const std::string_view code_token = read_token(after, q);
+        const std::size_t slash = after.find('/', q);
+        if (parse_int(code_token, code) && slash != std::string_view::npos) {
+          std::size_t r = slash + 1;
+          const std::size_t begin = r;
+          while (r < after.size() && !is_ps_space(after[r])) {
+            ++r;
+          }
+          m_encoding[code] = std::string(after.substr(begin, r - begin));
+        }
+        p = q;
+      }
+    }
+  }
+}
+
+void Type1Program::parse_private(const std::string_view decrypted) {
+  int len_iv = 4;
+  if (const std::size_t k = decrypted.find("/lenIV");
+      k != std::string_view::npos) {
+    std::size_t p = k + 6;
+    int value = 0;
+    if (parse_int(read_token(decrypted, p), value)) {
+      len_iv = value;
+    }
+  }
+  m_len_iv = len_iv;
+
+  // /Subrs: entries `dup <index> <length> RD <bytes> NP`.
+  if (const std::size_t k = decrypted.find("/Subrs");
+      k != std::string_view::npos) {
+    std::size_t p = k;
+    while ((p = decrypted.find("dup ", p)) != std::string_view::npos) {
+      // Stop when /CharStrings starts (Subrs precede it).
+      const std::size_t cs = decrypted.find("/CharStrings");
+      if (cs != std::string_view::npos && p > cs) {
+        break;
+      }
+      std::size_t q = p + 4;
+      int index = 0;
+      if (!parse_int(read_token(decrypted, q), index) || index < 0) {
+        p += 4;
+        continue;
+      }
+      const std::optional<std::string_view> bytes =
+          read_rd_binary(decrypted, q);
+      if (!bytes.has_value()) {
+        p += 4;
+        continue;
+      }
+      if (static_cast<int>(m_subrs.size()) <= index) {
+        m_subrs.resize(index + 1);
+      }
+      m_subrs[index] = decrypt_charstring(*bytes, len_iv);
+      p = q;
+    }
+  }
+
+  // /CharStrings: entries `/<name> <length> RD <bytes> ND`.
+  const std::size_t cs = decrypted.find("/CharStrings");
+  if (cs == std::string_view::npos) {
+    return;
+  }
+  const std::size_t begin = decrypted.find("begin", cs);
+  std::size_t p = (begin == std::string_view::npos) ? cs : begin + 5;
+  while (p < decrypted.size()) {
+    const std::size_t slash = decrypted.find('/', p);
+    if (slash == std::string_view::npos) {
+      break;
+    }
+    std::size_t q = slash + 1;
+    const std::size_t name_begin = q;
+    while (q < decrypted.size() && !is_ps_space(decrypted[q])) {
+      ++q;
+    }
+    std::string name(decrypted.substr(name_begin, q - name_begin));
+    const std::optional<std::string_view> bytes = read_rd_binary(decrypted, q);
+    if (!bytes.has_value()) {
+      // Not a charstring entry (e.g. `end`); advance past this slash.
+      p = slash + 1;
+      continue;
+    }
+    m_glyphs.push_back({std::move(name), decrypt_charstring(*bytes, len_iv)});
+    p = q;
+  }
+}
+
+} // namespace odr::internal::font::type1
diff --git a/src/odr/internal/font/type1_font.hpp b/src/odr/internal/font/type1_font.hpp
new file mode 100644
index 00000000..6dc56734
--- /dev/null
+++ b/src/odr/internal/font/type1_font.hpp
@@ -0,0 +1,81 @@
+#pragma once
+
+#include <odr/font.hpp>
+
+#include <map>
+#include <string>
+#include <string_view>
+#include <vector>
+
+namespace odr::internal::font::type1 {
+
+/// One glyph of a Type1 font: its PostScript name and its **decrypted** Type1
+/// charstring (charstring encryption removed, `/lenIV` leading bytes dropped).
+struct Glyph {
+  std::string name;
+  std::string charstring;
+};
+
+/// @brief Parses an Adobe Type1 font program into its decrypted parts.
+///
+/// A Type1 program has three sections: a clear-text header (font dictionary up
+/// to `eexec`), an `eexec`-encrypted private portion (`/Subrs`,
+/// `/CharStrings`), and a zero-padded trailer. This reads the header for
+/// `/FontMatrix`, `/FontBBox`, `/Encoding` and `/FontName`, decrypts the
+/// `eexec` section (`type1_crypt`) and extracts every glyph's decrypted
+/// charstring plus the `/Subrs`. It does **not** yet interpret the
+/// charstrings; that is the Type1 -> Type2 translation that follows, feeding
+/// 3.4's CFF -> OTF path.
+///
+/// Throws `std::runtime_error` when the program has no `eexec` section or no
+/// `/CharStrings`.
+class Type1Program {
+public:
+  /// Cheap magic test: the PostScript font sentinel (`%!PS-AdobeFont`,
+  /// `%!FontType1`) or a PFB segment marker (`0x80`).
+  [[nodiscard]] static bool is_type1(std::string_view data);
+
+  /// Parse @p program (the raw `/FontFile` bytes, PFB markers stripped if
+  /// present).
+  explicit Type1Program(std::string_view program);
+
+  [[nodiscard]] std::string_view name() const noexcept { return m_name; }
+  /// The 6-element `/FontMatrix` (defaults to `[0.001 0 0 0.001 0 0]`).
+  [[nodiscard]] const std::vector<double> &font_matrix() const noexcept {
+    return m_font_matrix;
+  }
+  [[nodiscard]] FontBBox font_bbox() const noexcept { return m_font_bbox; }
+
+  /// `/Encoding` as code -> glyph name. Empty when the font uses
+  /// `StandardEncoding` (see `standard_encoding`).
+  [[nodiscard]] const std::map<int, std::string> &encoding() const noexcept {
+    return m_encoding;
+  }
+  [[nodiscard]] bool standard_encoding() const noexcept {
+    return m_standard_encoding;
+  }
+
+  /// Decrypted glyphs in declaration order.
+  [[nodiscard]] const std::vector<Glyph> &glyphs() const noexcept {
+    return m_glyphs;
+  }
+  /// Decrypted `/Subrs`, indexed by subr number.
+  [[nodiscard]] const std::vector<std::string> &subrs() const noexcept {
+    return m_subrs;
+  }
+
+private:
+  void parse_clear(std::string_view clear);
+  void parse_private(std::string_view decrypted);
+
+  std::string m_name;
+  std::vector<double> m_font_matrix{0.001, 0.0, 0.0, 0.001, 0.0, 0.0};
+  FontBBox m_font_bbox{};
+  std::map<int, std::string> m_encoding;
+  bool m_standard_encoding{true};
+  std::vector<Glyph> m_glyphs;
+  std::vector<std::string> m_subrs;
+  int m_len_iv{4};
+};
+
+} // namespace odr::internal::font::type1
diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt
index 9bcd4254..249ebc51 100644
--- a/test/CMakeLists.txt
+++ b/test/CMakeLists.txt
@@ -56,6 +56,7 @@ add_executable(odr_test
 
         "src/internal/font/cff_font.cpp"
         "src/internal/font/type1_crypt.cpp"
+        "src/internal/font/type1_font.cpp"
         "src/internal/font/sfnt_font.cpp"
         "src/internal/font/sfnt_transform.cpp"
         "src/internal/font/font_file.cpp"
diff --git a/test/src/internal/font/type1_font.cpp b/test/src/internal/font/type1_font.cpp
new file mode 100644
index 00000000..a372b0b2
--- /dev/null
+++ b/test/src/internal/font/type1_font.cpp
@@ -0,0 +1,111 @@
+#include <odr/internal/font/type1_font.hpp>
+
+#include <gtest/gtest.h>
+
+#include <cstdint>
+#include <string>
+
+using namespace odr::internal::font::type1;
+
+namespace {
+
+/// Forward Type1 cipher (the inverse of `decrypt`), so the test builds a real
+/// encrypted program rather than trusting the decryptor.
+std::string encrypt(const std::string &plain, std::uint16_t r,
+                    const std::string &random_prefix) {
+  constexpr std::uint16_t c1 = 52845;
+  constexpr std::uint16_t c2 = 22719;
+  std::string out;
+  for (const char ch : random_prefix + plain) {
+    const auto p = static_cast<std::uint8_t>(ch);
+    const auto cipher = static_cast<std::uint8_t>(p ^ (r >> 8));
+    out += static_cast<char>(cipher);
+    r = static_cast<std::uint16_t>((cipher + r) * std::uint32_t{c1} + c2);
+  }
+  return out;
+}
+
+/// A `/name len RD <bytes> ND` charstring entry, the charstring encrypted with
+/// the charstring key (4330) and a 4-byte lenIV prefix.
+std::string charstring_entry(const std::string &name,
+                             const std::string &plain_charstring) {
+  // The 4-byte lenIV prefix must be a real 4 NUL bytes — a "\x00\x00\x00\x00"
+  // string literal would be empty (the first NUL terminates it).
+  const std::string enc = encrypt(plain_charstring, 4330, std::string(4, '\0'));
+  return "/" + name + " " + std::to_string(enc.size()) + " RD " + enc + " ND\n";
+}
+
+/// Assemble a minimal but well-formed Type1 program: a clear header (with a
+/// custom /Encoding and /FontMatrix) and an eexec-encrypted private section
+/// holding two glyphs and one subr.
+std::string build_type1() {
+  std::string clear = "%!PS-AdobeFont-1.0: TestType1 001.000\n"
+                      "/FontName /TestType1 def\n"
+                      "/FontMatrix [0.001 0 0 0.001 0 0] readonly def\n"
+                      "/FontBBox {0 -200 700 800} readonly def\n"
+                      "/Encoding 256 array\n"
+                      "0 1 255 {1 index exch /.notdef put} for\n"
+                      "dup 65 /A put\n"
+                      "dup 66 /B put\n"
+                      "readonly def\n"
+                      "currentdict end\n"
+                      "currentfile eexec\n";
+
+  std::string private_section = "dup /Private 16 dict dup begin\n"
+                                "/lenIV 4 def\n"
+                                "/Subrs 1 array\n";
+  private_section += "dup 0 ";
+  {
+    const std::string subr =
+        encrypt(std::string("\x0b", 1), 4330, std::string(4, '\0')); // return
+    private_section += std::to_string(subr.size()) + " RD " + subr + " NP\n";
+  }
+  private_section += "ND\n"
+                     "2 index /CharStrings 2 dict dup begin\n";
+  // Two named glyphs (no .notdef). Charstring bytes are arbitrary here: the
+  // parser does not interpret them, it only extracts them.
+  private_section += charstring_entry("A", std::string("\x8b\x8b\x0d\x0e", 4));
+  private_section += charstring_entry("B", std::string("\xf0\x0d\x0e", 3));
+  private_section += "end\nend\n";
+
+  std::string program = clear;
+  program += encrypt(private_section, 55665, "wxyz");
+  // Trailer (would be 512 zeros + cleartomark in a real font); the parser
+  // tolerates trailing data, so a short stub is enough.
+  program += std::string(8, '\0');
+  return program;
+}
+
+} // namespace
+
+TEST(Type1FontTest, IsType1Magic) {
+  EXPECT_TRUE(Type1Program::is_type1(build_type1()));
+  EXPECT_FALSE(Type1Program::is_type1("not a font program at all"));
+}
+
+TEST(Type1FontTest, ParsesHeaderAndEncoding) {
+  const Type1Program font{build_type1()};
+
+  EXPECT_EQ(font.name(), "TestType1");
+  EXPECT_FALSE(font.standard_encoding());
+  ASSERT_EQ(font.font_matrix().size(), 6u);
+  EXPECT_DOUBLE_EQ(font.font_matrix()[0], 0.001);
+  EXPECT_EQ(font.font_bbox().y_min, -200);
+  EXPECT_EQ(font.font_bbox().x_max, 700);
+
+  EXPECT_EQ(font.encoding().at(65), "A");
+  EXPECT_EQ(font.encoding().at(66), "B");
+}
+
+TEST(Type1FontTest, DecryptsCharstringsAndSubrs) {
+  const Type1Program font{build_type1()};
+
+  ASSERT_EQ(font.glyphs().size(), 2u);
+  EXPECT_EQ(font.glyphs()[0].name, "A");
+  EXPECT_EQ(font.glyphs()[0].charstring, std::string("\x8b\x8b\x0d\x0e", 4));
+  EXPECT_EQ(font.glyphs()[1].name, "B");
+  EXPECT_EQ(font.glyphs()[1].charstring, std::string("\xf0\x0d\x0e", 3));
+
+  ASSERT_EQ(font.subrs().size(), 1u);
+  EXPECT_EQ(font.subrs()[0], std::string("\x0b", 1)); // return
+}

From b5cb39fc3adaf1ad0a284309f82945251fc6a826 Mon Sep 17 00:00:00 2001
From: Andreas Stefl <stefl.andreas@gmail.com>
Date: Tue, 23 Jun 2026 22:06:13 +0200
Subject: [PATCH 4/6] PDF stage 3.5: CFF builder (assemble a CFF from Type2
 charstrings)

cff::build_cff serializes a name-keyed CFF from a list of (name, Type2
charstring) glyphs + default/nominalWidthX + bbox: Header, Name INDEX, Top
DICT (FontBBox + charset/CharStrings/Private offsets, fixed-width so the
layout resolves in one pass), String INDEX (every glyph name as a custom SID,
so no standard-strings table is needed), empty Global Subr INDEX, CharStrings
INDEX, format-0 charset, Private DICT. This is the assembly target for the
Type1 -> CFF path: the translated Type2 charstrings land here, the result
feeds CffFont + wrap_to_otf (3.4).

Test: build a 2-glyph CFF, read it back through CffFont (name, glyph name,
bbox, charstring width vs. default) and confirm it wraps to a loadable OTTO.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
---
 CMakeLists.txt                        |   1 +
 src/odr/internal/font/cff_builder.cpp | 184 ++++++++++++++++++++++++++
 src/odr/internal/font/cff_builder.hpp |  40 ++++++
 test/src/internal/font/cff_font.cpp   |  32 +++++
 4 files changed, 257 insertions(+)
 create mode 100644 src/odr/internal/font/cff_builder.cpp
 create mode 100644 src/odr/internal/font/cff_builder.hpp
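
Reviewer note (below the `---`, so not part of the commit message): a minimal
standalone sketch of the compact DICT integer forms that dict_int writes,
paired with a decoder to show the round trip. `encode_int` and `decode_int`
are hypothetical names used only for this illustration; they are not part of
the patch, and the decoder assumes a single well-formed operand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not in the patch): encode v in the compact CFF DICT
// integer forms: 1 byte, 2 bytes, `28 + int16`, or `29 + int32`.
std::string encode_int(const std::int32_t v) {
  std::string s;
  if (v >= -107 && v <= 107) {
    s += static_cast<char>(v + 139);
  } else if (v >= 108 && v <= 1131) {
    const std::int32_t u = v - 108;
    s += static_cast<char>((u >> 8) + 247);
    s += static_cast<char>(u & 0xff);
  } else if (v >= -1131 && v <= -108) {
    const std::int32_t u = -v - 108;
    s += static_cast<char>((u >> 8) + 251);
    s += static_cast<char>(u & 0xff);
  } else {
    // Big-endian two's complement, 2 bytes after 28 or 4 bytes after 29.
    const auto bits = static_cast<std::uint32_t>(v);
    const bool is_short = v >= -32768 && v <= 32767;
    s += static_cast<char>(is_short ? 28 : 29);
    for (int shift = is_short ? 8 : 24; shift >= 0; shift -= 8) {
      s += static_cast<char>((bits >> shift) & 0xff);
    }
  }
  return s;
}

// Hypothetical inverse (not in the patch): decode one operand produced by
// encode_int. Assumes the input holds exactly one well-formed operand.
std::int32_t decode_int(const std::string &s) {
  assert(!s.empty());
  const auto byte = [&s](const std::size_t i) {
    return static_cast<std::int32_t>(static_cast<std::uint8_t>(s.at(i)));
  };
  const std::int32_t b0 = byte(0);
  if (b0 >= 32 && b0 <= 246) {
    return b0 - 139;
  }
  if (b0 >= 247 && b0 <= 250) {
    return (b0 - 247) * 256 + byte(1) + 108;
  }
  if (b0 >= 251 && b0 <= 254) {
    return -(b0 - 251) * 256 - byte(1) - 108;
  }
  if (b0 == 28) {
    const std::int32_t raw = (byte(1) << 8) | byte(2);
    return raw >= 0x8000 ? raw - 0x10000 : raw; // sign-extend the int16
  }
  assert(b0 == 29);
  const std::uint32_t raw = (static_cast<std::uint32_t>(byte(1)) << 24) |
                            (static_cast<std::uint32_t>(byte(2)) << 16) |
                            (static_cast<std::uint32_t>(byte(3)) << 8) |
                            static_cast<std::uint32_t>(byte(4));
  return static_cast<std::int32_t>(raw);
}
```

For example, encode_int(50) yields the single byte 0xBD, the same width
operand the round-trip test in this patch feeds to glyph "A".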

diff --git a/CMakeLists.txt b/CMakeLists.txt
index d6eb4f6a..a0b4b1e4 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -198,6 +198,7 @@ set(ODR_SOURCE_FILES
         "src/odr/internal/pdf/pdf_object_parser.cpp"
         "src/odr/internal/pdf/pdf_page_text.cpp"
 
+        "src/odr/internal/font/cff_builder.cpp"
         "src/odr/internal/font/cff_font.cpp"
         "src/odr/internal/font/cff_standard_strings.cpp"
         "src/odr/internal/font/cff_transform.cpp"
diff --git a/src/odr/internal/font/cff_builder.cpp b/src/odr/internal/font/cff_builder.cpp
new file mode 100644
index 00000000..26174f5e
--- /dev/null
+++ b/src/odr/internal/font/cff_builder.cpp
@@ -0,0 +1,184 @@
+#include <odr/internal/font/cff_builder.hpp>
+
+#include <cstdint>
+#include <string>
+#include <vector>
+
+namespace odr::internal::font::cff {
+
+namespace {
+
+void put16(std::string &s, const std::uint16_t v) {
+  s += static_cast<char>(v >> 8);
+  s += static_cast<char>(v & 0xff);
+}
+
+/// A CFF DICT integer in the compact encoding (used for widths / bbox).
+void dict_int(std::string &s, const int v) {
+  if (v >= -107 && v <= 107) {
+    s += static_cast<char>(v + 139);
+  } else if (v >= 108 && v <= 1131) {
+    const int u = v - 108;
+    s += static_cast<char>((u >> 8) + 247);
+    s += static_cast<char>(u & 0xff);
+  } else if (v >= -1131 && v <= -108) {
+    const int u = -v - 108;
+    s += static_cast<char>((u >> 8) + 251);
+    s += static_cast<char>(u & 0xff);
+  } else if (v >= -32768 && v <= 32767) {
+    s += static_cast<char>(28);
+    put16(s, static_cast<std::uint16_t>(v));
+  } else {
+    s += static_cast<char>(29);
+    s += static_cast<char>((v >> 24) & 0xff);
+    s += static_cast<char>((v >> 16) & 0xff);
+    s += static_cast<char>((v >> 8) & 0xff);
+    s += static_cast<char>(v & 0xff);
+  }
+}
+
+/// A CFF DICT integer in the fixed 5-byte form (`29 + int32`), so an operand
+/// whose value (an offset) is not yet known can be sized before it is filled.
+void dict_int_fixed(std::string &s, const std::int32_t v) {
+  s += static_cast<char>(29);
+  s += static_cast<char>((v >> 24) & 0xff);
+  s += static_cast<char>((v >> 16) & 0xff);
+  s += static_cast<char>((v >> 8) & 0xff);
+  s += static_cast<char>(v & 0xff);
+}
+
+void dict_operator(std::string &s, const int op) {
+  if (op >= 1200) {
+    s += static_cast<char>(12);
+    s += static_cast<char>(op - 1200);
+  } else {
+    s += static_cast<char>(op);
+  }
+}
+
+/// Serialize a CFF INDEX from its members.
+std::string build_index(const std::vector<std::string> &members) {
+  std::string out;
+  put16(out, static_cast<std::uint16_t>(members.size()));
+  if (members.empty()) {
+    return out; // count 0: no offSize/offsets
+  }
+  std::uint32_t total = 1;
+  for (const std::string &m : members) {
+    total += static_cast<std::uint32_t>(m.size());
+  }
+  const std::uint8_t off_size = total <= 0xff       ? 1
+                                : total <= 0xffff   ? 2
+                                : total <= 0xffffff ? 3
+                                                    : 4;
+  out += static_cast<char>(off_size);
+  const auto put_off = [&](const std::uint32_t off) {
+    for (int i = off_size - 1; i >= 0; --i) {
+      out += static_cast<char>((off >> (8 * i)) & 0xff);
+    }
+  };
+  std::uint32_t offset = 1;
+  put_off(offset);
+  for (const std::string &m : members) {
+    offset += static_cast<std::uint32_t>(m.size());
+    put_off(offset);
+  }
+  for (const std::string &m : members) {
+    out += m;
+  }
+  return out;
+}
+
+} // namespace
+
+std::string build_cff(const std::string_view name,
+                      const std::vector<BuilderGlyph> &glyphs,
+                      const double default_width, const double nominal_width,
+                      const FontBBox bbox) {
+  // CharStrings INDEX (one Type2 charstring per glyph).
+  std::vector<std::string> charstrings;
+  charstrings.reserve(glyphs.size());
+  for (const BuilderGlyph &glyph : glyphs) {
+    charstrings.push_back(glyph.charstring);
+  }
+  const std::string charstrings_index = build_index(charstrings);
+
+  // String INDEX: glyphs 1..n-1 each get a custom SID (391 + position). Glyph 0
+  // is the implicit `.notdef` (SID 0), so its name is not stored; the charset
+  // lists SIDs for glyphs 1..n-1.
+  std::vector<std::string> strings;
+  for (std::size_t i = 1; i < glyphs.size(); ++i) {
+    strings.push_back(glyphs[i].name);
+  }
+  const std::string string_index = build_index(strings);
+
+  // Format-0 charset: SID per glyph 1..n-1.
+  std::string charset;
+  charset += static_cast<char>(0); // format 0
+  for (std::size_t i = 1; i < glyphs.size(); ++i) {
+    put16(charset, static_cast<std::uint16_t>(391 + (i - 1)));
+  }
+
+  // Private DICT: defaultWidthX (20), nominalWidthX (21).
+  std::string private_dict;
+  dict_int(private_dict, static_cast<int>(default_width));
+  dict_operator(private_dict, 20);
+  dict_int(private_dict, static_cast<int>(nominal_width));
+  dict_operator(private_dict, 21);
+
+  const std::string name_index =
+      build_index({std::string(name.empty() ? "ODRType1" : name)});
+  const std::string global_subrs = build_index({});
+
+  // Top DICT, with the offsets to charset / CharStrings / Private filled once
+  // the layout is known. Fixed-width offset integers keep the size constant.
+  const auto top_dict = [&](const std::uint32_t charset_off,
+                            const std::uint32_t charstrings_off,
+                            const std::uint32_t private_off) {
+    std::string d;
+    dict_int(d, bbox.x_min);
+    dict_int(d, bbox.y_min);
+    dict_int(d, bbox.x_max);
+    dict_int(d, bbox.y_max);
+    dict_operator(d, 5); // FontBBox
+    dict_int_fixed(d, static_cast<std::int32_t>(charset_off));
+    dict_operator(d, 15); // charset
+    dict_int_fixed(d, static_cast<std::int32_t>(charstrings_off));
+    dict_operator(d, 17); // CharStrings
+    dict_int_fixed(d, static_cast<std::int32_t>(private_dict.size()));
+    dict_int_fixed(d, static_cast<std::int32_t>(private_off));
+    dict_operator(d, 18); // Private [size offset]
+    return d;
+  };
+
+  const std::string top_dict_probe = build_index({top_dict(0, 0, 0)});
+  constexpr std::uint32_t header_size = 4;
+  const auto prefix = static_cast<std::uint32_t>(
+      header_size + name_index.size() + top_dict_probe.size() +
+      string_index.size() + global_subrs.size());
+  // Layout after the prefix: CharStrings, charset, Private.
+  const std::uint32_t charstrings_off = prefix;
+  const std::uint32_t charset_off =
+      charstrings_off + static_cast<std::uint32_t>(charstrings_index.size());
+  const std::uint32_t private_off =
+      charset_off + static_cast<std::uint32_t>(charset.size());
+
+  const std::string top_dict_index =
+      build_index({top_dict(charset_off, charstrings_off, private_off)});
+
+  std::string out;
+  out += static_cast<char>(1); // major
+  out += static_cast<char>(0); // minor
+  out += static_cast<char>(4); // hdrSize
+  out += static_cast<char>(4); // offSize (absolute offsets; legacy/unused)
+  out += name_index;
+  out += top_dict_index;
+  out += string_index;
+  out += global_subrs;
+  out += charstrings_index;
+  out += charset;
+  out += private_dict;
+  return out;
+}
+
+} // namespace odr::internal::font::cff
diff --git a/src/odr/internal/font/cff_builder.hpp b/src/odr/internal/font/cff_builder.hpp
new file mode 100644
index 00000000..cf8f6ac5
--- /dev/null
+++ b/src/odr/internal/font/cff_builder.hpp
@@ -0,0 +1,40 @@
+#pragma once
+
+#include <odr/font.hpp>
+
+#include <string>
+#include <string_view>
+#include <vector>
+
+namespace odr::internal::font::cff {
+
+/// One glyph for the CFF builder: its PostScript name and its **Type2**
+/// charstring (already translated from Type1, if applicable).
+struct BuilderGlyph {
+  std::string name;
+  std::string charstring;
+};
+
+/// Serialize a name-keyed CFF font from Type2 charstrings.
+///
+/// Assembles the minimal CFF a `CffFont` reader (and, after wrapping, a
+/// browser) needs: Header, Name INDEX, Top DICT (FontBBox +
+/// charset/CharStrings/Private offsets), String INDEX (the names of glyphs
+/// 1..n-1, SID 391+), an empty Global Subr INDEX, the CharStrings INDEX, a
+/// format-0 charset and a Private DICT (`defaultWidthX`/`nominalWidthX`).
+/// Glyph 0 is the implicit `.notdef`, whose name is not stored; the caller
+/// orders @p glyphs so glyph 0 is `.notdef`.
+///
+/// This is the assembly target for the Type1 -> CFF path (stage 3.5): the
+/// translated Type2 charstrings go in here, the result feeds `CffFont` +
+/// `wrap_to_otf` (3.4). No `FontMatrix` is emitted, so the font is 1000
+/// units/em (the Type1 default); a non-default matrix is a follow-up.
+///
+/// Offsets in the Top DICT use the fixed-width 5-byte integer form so the
+/// layout resolves in a single pass.
+[[nodiscard]] std::string build_cff(std::string_view name,
+                                    const std::vector<BuilderGlyph> &glyphs,
+                                    double default_width, double nominal_width,
+                                    FontBBox bbox);
+
+} // namespace odr::internal::font::cff
diff --git a/test/src/internal/font/cff_font.cpp b/test/src/internal/font/cff_font.cpp
index 3be0bebc..e5b1a3f8 100644
--- a/test/src/internal/font/cff_font.cpp
+++ b/test/src/internal/font/cff_font.cpp
@@ -1,6 +1,7 @@
 #include <odr/internal/font/cff_font.hpp>
 
 #include <odr/font.hpp>
+#include <odr/internal/font/cff_builder.hpp>
 #include <odr/internal/font/cff_transform.hpp>
 #include <odr/internal/font/sfnt_font.hpp>
 #include <odr/internal/font/sfnt_transform.hpp>
@@ -193,6 +194,37 @@ TEST(CffFontTest, IsCffMagic) {
   EXPECT_FALSE(CffFont::is_cff("not a font"));
 }
 
+TEST(CffFontTest, BuildCffRoundTripsThroughReader) {
+  using odr::internal::font::cff::build_cff;
+  using odr::internal::font::cff::BuilderGlyph;
+
+  // Type2 charstrings: .notdef = endchar; "A" = width-operand 50 then endchar
+  // (50 -> single byte 50 + 139 = 0xBD; endchar = 0x0E).
+  std::vector<BuilderGlyph> glyphs = {
+      {".notdef", std::string("\x0e", 1)},
+      {"A", std::string("\xbd\x0e", 2)},
+  };
+  const std::string cff_bytes =
+      build_cff("MyType1", glyphs, /*default_width=*/0, /*nominal_width=*/100,
+                FontBBox{0, -200, 700, 800});
+
+  const CffFont font{cff_bytes};
+  EXPECT_EQ(font.format(), FontFormat::cff);
+  EXPECT_EQ(font.name(), "MyType1");
+  EXPECT_EQ(font.glyph_count(), 2);
+  EXPECT_FALSE(font.is_cid_keyed());
+  EXPECT_EQ(font.glyph_name(1), "A");
+  EXPECT_EQ(font.bounding_box().x_max, 700);
+  // explicit charstring width: nominalWidthX (100) + 50.
+  EXPECT_EQ(font.advance_width(1), 150);
+  // no explicit width: defaultWidthX (0).
+  EXPECT_EQ(font.advance_width(0), 0);
+
+  // The built CFF wraps into a loadable OTTO (3.4 path) end to end.
+  const std::string otf = odr::internal::font::cff::wrap_to_otf(font);
+  EXPECT_TRUE(odr::internal::font::sfnt::SfntFont::is_sfnt(otf));
+}
+
 TEST(CffFontTest, WrapsToLoadableOtf) {
   using namespace odr::internal::font;
   const CffFont cff{build_cff()};

From 8b11d73084d8cfbd75ce35b10cf016874bb9d2b7 Mon Sep 17 00:00:00 2001
From: Andreas Stefl <stefl.andreas@gmail.com>
Date: Tue, 23 Jun 2026 22:11:24 +0200
Subject: [PATCH 5/6] PDF stage 3.5: Type1 -> Type2 charstring translation

type1::to_type2 translates a decrypted Type1 charstring to Type2 (CFF): a
stack machine that flattens callsubr (inlining the font's /Subrs, depth
guarded), folds div, lifts the hsbw side bearing into the first moveto and
returns the advance width separately, drops Type1-only hints (dotsection,
*stem3, hint-replacement OtherSubr 3), and translates the flex OtherSubrs
(1/2/0 -> two rrcurvetos) and seac (-> Type2 endchar form). Path operators
(r/h/v lineto, rr/vh/hv curveto, stems, moves, endchar) share opcodes with
Type2 and pass through. Best-effort / display-oriented: hints affect
rendering quality, not glyph shape.

Tests: exact Type2 output for hsbw width + side-bearing folding into the
first move, callsubr inlining, and div folding.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
---
 CMakeLists.txt                              |   1 +
 src/odr/internal/font/type1_charstring.cpp  | 424 ++++++++++++++++++++
 src/odr/internal/font/type1_charstring.hpp  |  30 ++
 test/CMakeLists.txt                         |   1 +
 test/src/internal/font/type1_charstring.cpp | 122 ++++++
 5 files changed, 578 insertions(+)
 create mode 100644 src/odr/internal/font/type1_charstring.cpp
 create mode 100644 src/odr/internal/font/type1_charstring.hpp
 create mode 100644 test/src/internal/font/type1_charstring.cpp
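
Reviewer note (below the `---`, so not part of the commit message): a minimal
sketch of the flex folding described above, assuming the seven flex points
arrive as relative deltas in call order. `Delta`, `Curve` and `flex_to_curves`
are hypothetical names for this illustration only; the patch does the same
arithmetic inside the translator and writes the operands straight to the
output charstring.

```cpp
#include <array>

// Hypothetical types for illustration (not in the patch).
struct Delta {
  double x;
  double y;
};
using Curve = std::array<double, 6>; // dx1 dy1 dx2 dy2 dx3 dy3 (rrcurveto)

// Turn the 7 relative flex deltas into two rrcurveto operand lists. Delta 0
// moves to the flex reference point, which is not on the outline, so it is
// folded into the first control point of the first curve.
std::array<Curve, 2> flex_to_curves(const std::array<Delta, 7> &d) {
  const Curve first = {d[0].x + d[1].x, d[0].y + d[1].y, d[2].x,
                       d[2].y,          d[3].x,          d[3].y};
  const Curve second = {d[4].x, d[4].y, d[5].x, d[5].y, d[6].x, d[6].y};
  return {{first, second}};
}
```

Because every delta is relative, the two curves end at the same point the
seven original moves would have reached, which is why the trailing
`setcurrentpoint` can be dropped.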

diff --git a/CMakeLists.txt b/CMakeLists.txt
index a0b4b1e4..be77b4aa 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -202,6 +202,7 @@ set(ODR_SOURCE_FILES
         "src/odr/internal/font/cff_font.cpp"
         "src/odr/internal/font/cff_standard_strings.cpp"
         "src/odr/internal/font/cff_transform.cpp"
+        "src/odr/internal/font/type1_charstring.cpp"
         "src/odr/internal/font/type1_crypt.cpp"
         "src/odr/internal/font/type1_font.cpp"
         "src/odr/internal/font/sfnt_font.cpp"
diff --git a/src/odr/internal/font/type1_charstring.cpp b/src/odr/internal/font/type1_charstring.cpp
new file mode 100644
index 00000000..ad7365d6
--- /dev/null
+++ b/src/odr/internal/font/type1_charstring.cpp
@@ -0,0 +1,424 @@
+#include <odr/internal/font/type1_charstring.hpp>
+
+#include <algorithm>
+#include <cmath>
+#include <cstdint>
+#include <string>
+#include <vector>
+
+namespace odr::internal::font::type1 {
+
+namespace {
+
+// Type1 charstring operators (single byte; 12 = escape to a two-byte op).
+enum T1 : int {
+  t1_hstem = 1,
+  t1_vstem = 3,
+  t1_vmoveto = 4,
+  t1_rlineto = 5,
+  t1_hlineto = 6,
+  t1_vlineto = 7,
+  t1_rrcurveto = 8,
+  t1_closepath = 9,
+  t1_callsubr = 10,
+  t1_return = 11,
+  t1_hsbw = 13,
+  t1_endchar = 14,
+  t1_rmoveto = 21,
+  t1_hmoveto = 22,
+  t1_vhcurveto = 30,
+  t1_hvcurveto = 31,
+  t1_dotsection = 1200,
+  t1_vstem3 = 1201,
+  t1_hstem3 = 1202,
+  t1_seac = 1206,
+  t1_sbw = 1207,
+  t1_div = 1212,
+  t1_callothersubr = 1216,
+  t1_pop = 1217,
+  t1_setcurrentpoint = 1233,
+};
+
+/// Encode an integer operand in the Type2 charstring number forms.
+void emit_int(std::string &out, const int v) {
+  if (v >= -107 && v <= 107) {
+    out += static_cast<char>(v + 139);
+  } else if (v >= 108 && v <= 1131) {
+    const int u = v - 108;
+    out += static_cast<char>((u >> 8) + 247);
+    out += static_cast<char>(u & 0xff);
+  } else if (v >= -1131 && v <= -108) {
+    const int u = -v - 108;
+    out += static_cast<char>((u >> 8) + 251);
+    out += static_cast<char>(u & 0xff);
+  } else {
+    out += static_cast<char>(28); // shortint
+    out += static_cast<char>((v >> 8) & 0xff);
+    out += static_cast<char>(v & 0xff);
+  }
+}
+
+/// Encode a (possibly fractional) operand: an integer form when whole and in
+/// range, else the Type2 16.16 fixed form (`255 + int32`).
+void emit_num(std::string &out, const double v) {
+  if (v == std::floor(v) && v >= -32768 && v <= 32767) {
+    emit_int(out, static_cast<int>(v));
+    return;
+  }
+  const auto fixed = static_cast<std::int32_t>(std::lround(v * 65536.0));
+  out += static_cast<char>(255);
+  out += static_cast<char>((fixed >> 24) & 0xff);
+  out += static_cast<char>((fixed >> 16) & 0xff);
+  out += static_cast<char>((fixed >> 8) & 0xff);
+  out += static_cast<char>(fixed & 0xff);
+}
+
+/// The translation state machine. Walks the Type1 charstring (recursing through
+/// `callsubr`), emitting a Type2 charstring.
+class Translator {
+public:
+  explicit Translator(const std::vector<std::string> &subrs) : m_subrs(subrs) {}
+
+  Type2Charstring run(std::string_view charstring) {
+    execute(charstring, 0);
+    if (!m_ended) {
+      m_out += static_cast<char>(t1_endchar);
+    }
+    return {std::move(m_out), m_width, m_has_width};
+  }
+
+private:
+  // Emit the pending width (once) ahead of the first stem/move/endchar's
+  // operands, as the Type2 width does. nominalWidthX is 0 in the built CFF, so
+  // the width is the absolute advance.
+  void emit_width() {
+    if (m_width_pending) {
+      emit_int(m_out, m_width);
+      m_width_pending = false;
+    }
+  }
+
+  void flush_stack() {
+    for (const double v : m_stack) {
+      emit_num(m_out, v);
+    }
+    m_stack.clear();
+  }
+
+  // Emit width + operands + a one-byte operator, clearing the stack.
+  void emit_op(const int op) {
+    emit_width();
+    flush_stack();
+    m_out += static_cast<char>(op);
+  }
+
+  void execute(std::string_view cs, const int depth) {
+    if (depth > 16 || m_ended) {
+      return;
+    }
+    std::size_t p = 0;
+    while (p < cs.size() && !m_ended) {
+      const auto b = static_cast<std::uint8_t>(cs[p]);
+      if (b >= 32) {
+        // operand
+        double value = 0.0;
+        if (b <= 246) {
+          value = static_cast<int>(b) - 139;
+          p += 1;
+        } else if (b <= 250) {
+          value = (static_cast<int>(b) - 247) * 256 +
+                  static_cast<std::uint8_t>(cs[p + 1]) + 108;
+          p += 2;
+        } else if (b <= 254) {
+          value = -(static_cast<int>(b) - 251) * 256 -
+                  static_cast<std::uint8_t>(cs[p + 1]) - 108;
+          p += 2;
+        } else { // 255: Type1 32-bit integer
+          value = static_cast<std::int32_t>(
+              (static_cast<std::uint8_t>(cs[p + 1]) << 24) |
+              (static_cast<std::uint8_t>(cs[p + 2]) << 16) |
+              (static_cast<std::uint8_t>(cs[p + 3]) << 8) |
+              static_cast<std::uint8_t>(cs[p + 4]));
+          p += 5;
+        }
+        m_stack.push_back(value);
+        continue;
+      }
+      int op = b;
+      ++p;
+      if (b == 12) {
+        op = 1200 + static_cast<std::uint8_t>(cs[p]);
+        ++p;
+      }
+      handle(op, depth);
+    }
+  }
+
+  void handle(const int op, const int depth) {
+    switch (op) {
+    case t1_hsbw:
+      if (m_stack.size() >= 2) {
+        m_sbx = m_stack[0];
+        m_width = static_cast<int>(m_stack[1]);
+        m_has_width = true;
+        m_width_pending = true;
+        m_sbx_pending = true;
+      }
+      m_stack.clear();
+      break;
+    case t1_sbw:
+      if (m_stack.size() >= 4) {
+        m_sbx = m_stack[0];
+        m_width = static_cast<int>(m_stack[2]);
+        m_has_width = true;
+        m_width_pending = true;
+        m_sbx_pending = true;
+      }
+      m_stack.clear();
+      break;
+
+    case t1_rmoveto:
+      if (m_flex_active) {
+        collect_flex_point();
+      } else {
+        if (m_sbx_pending && !m_stack.empty()) {
+          m_stack[0] += m_sbx;
+          m_sbx_pending = false;
+        }
+        emit_op(t1_rmoveto);
+      }
+      break;
+    case t1_hmoveto:
+      if (m_flex_active) {
+        collect_flex_point();
+      } else {
+        // hmoveto has no y; the side bearing adds an x, so keep it hmoveto.
+        if (m_sbx_pending && !m_stack.empty()) {
+          m_stack[0] += m_sbx;
+        }
+        m_sbx_pending = false;
+        emit_op(t1_hmoveto);
+      }
+      break;
+    case t1_vmoveto:
+      if (m_flex_active) {
+        collect_flex_point();
+      } else if (m_sbx_pending && m_sbx != 0.0) {
+        // A side bearing adds an x offset, which vmoveto cannot carry: promote
+        // to rmoveto(sbx, dy).
+        const double dy = m_stack.empty() ? 0.0 : m_stack[0];
+        m_stack = {m_sbx, dy};
+        m_sbx_pending = false;
+        emit_op(t1_rmoveto);
+      } else {
+        m_sbx_pending = false;
+        emit_op(t1_vmoveto);
+      }
+      break;
+
+    case t1_hstem:
+    case t1_vstem:
+    case t1_rlineto:
+    case t1_hlineto:
+    case t1_vlineto:
+    case t1_rrcurveto:
+    case t1_vhcurveto:
+    case t1_hvcurveto:
+      emit_op(op); // identical opcodes/semantics in Type2
+      break;
+
+    case t1_closepath:
+    case t1_dotsection:
+    case t1_vstem3:
+    case t1_hstem3:
+    case t1_setcurrentpoint:
+      m_stack.clear(); // dropped (implicit / hints / no-op)
+      break;
+
+    case t1_div:
+      if (m_stack.size() >= 2) {
+        const double b = m_stack.back();
+        m_stack.pop_back();
+        const double a = m_stack.back();
+        m_stack.pop_back();
+        m_stack.push_back(b != 0.0 ? a / b : 0.0);
+      }
+      break;
+
+    case t1_callsubr: {
+      if (m_stack.empty()) {
+        break;
+      }
+      const auto index = static_cast<int>(m_stack.back());
+      m_stack.pop_back();
+      if (index >= 0 && index < static_cast<int>(m_subrs.size())) {
+        execute(m_subrs[index], depth + 1);
+      }
+      break;
+    }
+    case t1_return:
+      break; // end of the current subr
+
+    case t1_callothersubr:
+      handle_othersubr();
+      break;
+    case t1_pop:
+      // Push the value the matching callothersubr left on the PS stack.
+      if (!m_ps_stack.empty()) {
+        m_stack.push_back(m_ps_stack.back());
+        m_ps_stack.pop_back();
+      } else {
+        m_stack.push_back(0.0);
+      }
+      break;
+
+    case t1_seac:
+      emit_seac();
+      break;
+
+    case t1_endchar:
+      emit_op(t1_endchar);
+      m_ended = true;
+      break;
+
+    default:
+      m_stack.clear(); // unknown: skip
+      break;
+    }
+  }
+
+  // OtherSubr dispatch: flex (1 start / 2 add-point / 0 end) and hint
+  // replacement (3). The Type1 convention is `arg1..argN N othersubr#
+  // callothersubr`, so the operand stack top is the othersubr number, below it
+  // the argument count, below that the arguments.
+  void handle_othersubr() {
+    if (m_stack.size() < 2) {
+      m_stack.clear();
+      return;
+    }
+    const auto othersubr = static_cast<int>(m_stack.back());
+    m_stack.pop_back();
+    const auto argc = static_cast<int>(m_stack.back());
+    m_stack.pop_back();
+
+    std::vector<double> args;
+    for (int i = 0; i < argc && !m_stack.empty(); ++i) {
+      args.push_back(m_stack.back());
+      m_stack.pop_back();
+    }
+    // args is reversed (top first); restore call order.
+    std::reverse(args.begin(), args.end());
+
+    switch (othersubr) {
+    case 1: // flex start
+      m_flex_active = true;
+      m_flex_points.clear();
+      break;
+    case 2: // flex add-point marker (the rmoveto already collected the point)
+      break;
+    case 0: // flex end: emit two curves from the collected points
+      emit_flex();
+      m_flex_active = false;
+      // OtherSubr 0 leaves the end x,y on the PS stack for `pop pop
+      // setcurrentpoint`.
+      if (args.size() >= 3) {
+        m_ps_stack.push_back(args[2]); // y (popped second)
+        m_ps_stack.push_back(args[1]); // x (popped first)
+      }
+      break;
+    case 3: // hint replacement: result is the subr number, used by callsubr
+      m_ps_stack.push_back(args.empty() ? 3.0 : args[0]);
+      break;
+    default:
+      // Unknown OtherSubr: make the arguments available to subsequent pops.
+      for (auto it = args.rbegin(); it != args.rend(); ++it) {
+        m_ps_stack.push_back(*it);
+      }
+      break;
+    }
+  }
+
+  // During flex the 7 points arrive as `dx dy rmoveto`; collect their deltas.
+  void collect_flex_point() {
+    const double dx = m_stack.size() >= 2 ? m_stack[m_stack.size() - 2] : 0.0;
+    const double dy = m_stack.empty() ? 0.0 : m_stack.back();
+    m_flex_points.push_back({dx, dy});
+    m_stack.clear();
+  }
+
+  // Emit the flex as two rrcurvetos. Point 1 is the reference point; points
+  // 2..7 are the two beziers. The first curve's leading delta folds in the
+  // reference delta (point 2 relative to the pre-flex point = d1 + d2).
+  void emit_flex() {
+    if (m_flex_points.size() < 7) {
+      m_flex_points.clear();
+      return;
+    }
+    const auto &d = m_flex_points;
+    emit_width();
+    emit_num(m_out, d[1].x + d[0].x);
+    emit_num(m_out, d[1].y + d[0].y);
+    emit_num(m_out, d[2].x);
+    emit_num(m_out, d[2].y);
+    emit_num(m_out, d[3].x);
+    emit_num(m_out, d[3].y);
+    m_out += static_cast<char>(t1_rrcurveto);
+    emit_num(m_out, d[4].x);
+    emit_num(m_out, d[4].y);
+    emit_num(m_out, d[5].x);
+    emit_num(m_out, d[5].y);
+    emit_num(m_out, d[6].x);
+    emit_num(m_out, d[6].y);
+    m_out += static_cast<char>(t1_rrcurveto);
+    m_flex_points.clear();
+  }
+
+  // seac: asb adx ady bchar achar. Emit the Type2 deprecated endchar-seac form
+  // `adx' ady bchar achar endchar`, adjusting adx for the accent side bearing.
+  void emit_seac() {
+    if (m_stack.size() >= 5) {
+      const double asb = m_stack[0];
+      const double adx = m_stack[1];
+      const double ady = m_stack[2];
+      const double bchar = m_stack[3];
+      const double achar = m_stack[4];
+      m_stack.clear();
+      emit_width();
+      emit_num(m_out, adx - asb + m_sbx);
+      emit_num(m_out, ady);
+      emit_num(m_out, bchar);
+      emit_num(m_out, achar);
+      m_out += static_cast<char>(t1_endchar);
+    }
+    m_ended = true;
+  }
+
+  struct Point {
+    double x;
+    double y;
+  };
+
+  const std::vector<std::string> &m_subrs;
+  std::string m_out;
+  std::vector<double> m_stack;
+  std::vector<double> m_ps_stack;
+
+  int m_width{};
+  bool m_has_width{};
+  bool m_width_pending{};
+  double m_sbx{};
+  bool m_sbx_pending{};
+  bool m_ended{};
+
+  bool m_flex_active{};
+  std::vector<Point> m_flex_points;
+};
+
+} // namespace
+
+Type2Charstring to_type2(const std::string_view type1,
+                         const std::vector<std::string> &subrs) {
+  return Translator(subrs).run(type1);
+}
+
+} // namespace odr::internal::font::type1
diff --git a/src/odr/internal/font/type1_charstring.hpp b/src/odr/internal/font/type1_charstring.hpp
new file mode 100644
index 00000000..47bd0d6c
--- /dev/null
+++ b/src/odr/internal/font/type1_charstring.hpp
@@ -0,0 +1,30 @@
+#pragma once
+
+#include <string>
+#include <string_view>
+#include <vector>
+
+namespace odr::internal::font::type1 {
+
+/// The result of translating a Type1 charstring to Type2 (CFF).
+struct Type2Charstring {
+  std::string charstring; ///< the Type2 charstring (leading width included)
+  int width{};            ///< advance width from `hsbw`/`sbw`, in glyph units
+  bool has_width{};       ///< whether an `hsbw`/`sbw` set the width
+};
+
+/// Translate one **decrypted** Type1 charstring to a Type2 (CFF) charstring.
+///
+/// Type1 and Type2 share most path operators; this flattens `callsubr`
+/// (inlining @p subrs), folds `div`, lifts the `hsbw` side bearing into the
+/// first move, drops Type1-only hint operators (`dotsection`, `*stem3`, hint
+/// replacement) and translates the flex OtherSubrs and `seac`. The advance
+/// width (`hsbw`/`sbw`) is prepended to the charstring as an absolute value
+/// (this assumes `nominalWidthX` is 0) and also returned in `width`.
+///
+/// Best-effort and display-oriented: hints are dropped (they affect rendering
+/// quality, not glyph shape), and unknown operators are skipped.
+[[nodiscard]] Type2Charstring to_type2(std::string_view type1,
+                                       const std::vector<std::string> &subrs);
+
+} // namespace odr::internal::font::type1
diff --git a/test/CMakeLists.txt b/test/CMakeLists.txt
index 249ebc51..975180f4 100644
--- a/test/CMakeLists.txt
+++ b/test/CMakeLists.txt
@@ -55,6 +55,7 @@ add_executable(odr_test
         "src/internal/pdf/pdf_test_file_builder.cpp"
 
         "src/internal/font/cff_font.cpp"
+        "src/internal/font/type1_charstring.cpp"
         "src/internal/font/type1_crypt.cpp"
         "src/internal/font/type1_font.cpp"
         "src/internal/font/sfnt_font.cpp"
diff --git a/test/src/internal/font/type1_charstring.cpp b/test/src/internal/font/type1_charstring.cpp
new file mode 100644
index 00000000..b79e2a22
--- /dev/null
+++ b/test/src/internal/font/type1_charstring.cpp
@@ -0,0 +1,122 @@
+#include <odr/internal/font/type1_charstring.hpp>
+
+#include <gtest/gtest.h>
+
+#include <string>
+#include <vector>
+
+using namespace odr::internal::font::type1;
+
+namespace {
+
+/// Encode an integer in the number forms Type1 and Type2 share. Only the 1-
+/// and 2-byte forms (|v| <= 1131) are handled; larger values emit nothing.
+void num(std::string &s, const int v) {
+  if (v >= -107 && v <= 107) {
+    s += static_cast<char>(v + 139);
+  } else if (v >= 108 && v <= 1131) {
+    const int u = v - 108;
+    s += static_cast<char>((u >> 8) + 247);
+    s += static_cast<char>(u & 0xff);
+  } else if (v >= -1131 && v <= -108) {
+    const int u = -v - 108;
+    s += static_cast<char>((u >> 8) + 251);
+    s += static_cast<char>(u & 0xff);
+  }
+}
+
+void op(std::string &s, const int o) { s += static_cast<char>(o); }
+
+} // namespace
+
+TEST(Type1CharstringTest, HsbwWidthAndSideBearing) {
+  // sbx=10 wx=200 hsbw  100 0 rmoveto  50 50 rlineto  endchar
+  std::string t1;
+  num(t1, 10);
+  num(t1, 200);
+  op(t1, 13); // hsbw
+  num(t1, 100);
+  num(t1, 0);
+  op(t1, 21); // rmoveto
+  num(t1, 50);
+  num(t1, 50);
+  op(t1, 5);  // rlineto
+  op(t1, 14); // endchar
+
+  const Type2Charstring out = to_type2(t1, {});
+  EXPECT_TRUE(out.has_width);
+  EXPECT_EQ(out.width, 200);
+
+  // Type2: [width 200][dx 100+sbx 10 = 110][dy 0] rmoveto  [50][50] rlineto
+  //        endchar.
+  std::string expected;
+  num(expected, 200); // width prepended
+  num(expected, 110); // 100 + side bearing 10
+  num(expected, 0);
+  op(expected, 21); // rmoveto
+  num(expected, 50);
+  num(expected, 50);
+  op(expected, 5);  // rlineto
+  op(expected, 14); // endchar
+  EXPECT_EQ(out.charstring, expected);
+}
+
+TEST(Type1CharstringTest, FlattensCallSubr) {
+  // subr 0: 50 50 rlineto return
+  std::string subr0;
+  num(subr0, 50);
+  num(subr0, 50);
+  op(subr0, 5);  // rlineto
+  op(subr0, 11); // return
+
+  // 0 0 hsbw  0 0 rmoveto  0 callsubr  endchar
+  std::string t1;
+  num(t1, 0);
+  num(t1, 0);
+  op(t1, 13); // hsbw
+  num(t1, 0);
+  num(t1, 0);
+  op(t1, 21); // rmoveto
+  num(t1, 0);
+  op(t1, 10); // callsubr 0
+  op(t1, 14); // endchar
+
+  const Type2Charstring out = to_type2(t1, {subr0});
+
+  // The subr's rlineto is inlined; expect width(0) rmoveto, then rlineto, then
+  // endchar.
+  std::string expected;
+  num(expected, 0); // width
+  num(expected, 0);
+  num(expected, 0);
+  op(expected, 21); // rmoveto
+  num(expected, 50);
+  num(expected, 50);
+  op(expected, 5);  // rlineto (from subr)
+  op(expected, 14); // endchar
+  EXPECT_EQ(out.charstring, expected);
+}
+
+TEST(Type1CharstringTest, FoldsDiv) {
+  // 0 0 hsbw  600 2 div 0 rmoveto  endchar  -> dx = 300
+  std::string t1;
+  num(t1, 0);
+  num(t1, 0);
+  op(t1, 13); // hsbw
+  num(t1, 600);
+  num(t1, 2);
+  t1 += static_cast<char>(12); // escape
+  t1 += static_cast<char>(12); // 12 12 = div
+  num(t1, 0);
+  op(t1, 21); // rmoveto
+  op(t1, 14); // endchar
+
+  const Type2Charstring out = to_type2(t1, {});
+  std::string expected;
+  num(expected, 0);   // width
+  num(expected, 300); // 600 / 2
+  num(expected, 0);
+  op(expected, 21); // rmoveto
+  op(expected, 14); // endchar
+  EXPECT_EQ(out.charstring, expected);
+}

From 29cdc2ff03dcfaa014249922be092c86e5be536e Mon Sep 17 00:00:00 2001
From: Andreas Stefl <stefl.andreas@gmail.com>
Date: Tue, 23 Jun 2026 22:19:31 +0200
Subject: [PATCH 6/6] PDF stage 3.5: Type1 -> CFF assembly + /FontFile wiring
 (end-to-end)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

type1::to_cff translates every glyph (to_type2, flattening /Subrs), places
.notdef at glyph 0 (synthesizing one when absent) and assembles a CFF via the
builder. load_embedded_font now reads /FontFile: parse the Type1 program,
convert to CFF, and hold it as a CffFont — so embedded Type1 reuses the entire
3.4 CFF path (PUA re-encode, @font-face wrap, reverse map) with no new
abstract::Font subclass.

Simple-font glyph selection by PostScript name (PDF /Encoding -> name -> glyph)
is deferred to the shared CFF/Type1 follow-up, which depends on the
AGL/name-mapping decision; composite fonts and the wrap/display path work today.

Tests: a Type1 program converts to a CFF that reads back through CffFont
(glyph count incl. synthesized .notdef, names) and wraps to a loadable OTTO.
Full font + PDF + HTML corpus green (460 tests).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
---
 src/odr/internal/font/type1_font.cpp         | 37 ++++++++++++++++++++
 src/odr/internal/font/type1_font.hpp         |  7 ++++
 src/odr/internal/pdf/pdf_document_parser.cpp | 17 +++++++--
 test/src/internal/font/type1_font.cpp        | 22 ++++++++++++
 4 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/src/odr/internal/font/type1_font.cpp b/src/odr/internal/font/type1_font.cpp
index b149c6ed..e53c2974 100644
--- a/src/odr/internal/font/type1_font.cpp
+++ b/src/odr/internal/font/type1_font.cpp
@@ -1,5 +1,7 @@
 #include <odr/internal/font/type1_font.hpp>
 
+#include <odr/internal/font/cff_builder.hpp>
+#include <odr/internal/font/type1_charstring.hpp>
 #include <odr/internal/font/type1_crypt.hpp>
 
 #include <charconv>
@@ -292,4 +294,39 @@ void Type1Program::parse_private(const std::string_view decrypted) {
   }
 }
 
+std::string to_cff(const Type1Program &program) {
+  // Order glyphs with `.notdef` at index 0 (CFF requires it). Translate each
+  // Type1 charstring to Type2; the width rides in the charstring (the CFF
+  // builder uses nominalWidthX = 0).
+  std::vector<cff::BuilderGlyph> glyphs;
+  glyphs.reserve(program.glyphs().size() + 1);
+
+  const auto translate = [&](const Glyph &glyph) {
+    Type2Charstring t2 = to_type2(glyph.charstring, program.subrs());
+    glyphs.push_back({glyph.name, std::move(t2.charstring)});
+  };
+
+  // .notdef first; synthesize a bare `endchar` (14) glyph if absent.
+  std::size_t notdef = program.glyphs().size();
+  for (std::size_t i = 0; i < program.glyphs().size(); ++i) {
+    if (program.glyphs()[i].name == ".notdef") {
+      notdef = i;
+      break;
+    }
+  }
+  if (notdef < program.glyphs().size()) {
+    translate(program.glyphs()[notdef]);
+  } else {
+    glyphs.push_back({".notdef", std::string(1, static_cast<char>(14))});
+  }
+  for (std::size_t i = 0; i < program.glyphs().size(); ++i) {
+    if (i != notdef) {
+      translate(program.glyphs()[i]);
+    }
+  }
+
+  return cff::build_cff(program.name(), glyphs, /*default_width=*/0,
+                        /*nominal_width=*/0, program.font_bbox());
+}
+
 } // namespace odr::internal::font::type1
diff --git a/src/odr/internal/font/type1_font.hpp b/src/odr/internal/font/type1_font.hpp
index 6dc56734..037df539 100644
--- a/src/odr/internal/font/type1_font.hpp
+++ b/src/odr/internal/font/type1_font.hpp
@@ -78,4 +78,11 @@ class Type1Program {
   int m_len_iv{4};
 };
 
+/// Convert a parsed Type1 program to a **CFF** font: translate every glyph's
+/// charstring to Type2 (`to_type2`, flattening the program's `/Subrs`) and
+/// assemble via the CFF builder, with `.notdef` placed at glyph 0. The result
+/// is a bare CFF that `cff::CffFont` reads and `cff::wrap_to_otf` wraps for the
+/// browser — so an embedded Type1 font reuses the entire 3.4 CFF path.
+[[nodiscard]] std::string to_cff(const Type1Program &program);
+
 } // namespace odr::internal::font::type1
diff --git a/src/odr/internal/pdf/pdf_document_parser.cpp b/src/odr/internal/pdf/pdf_document_parser.cpp
index cae04371..09830556 100644
--- a/src/odr/internal/pdf/pdf_document_parser.cpp
+++ b/src/odr/internal/pdf/pdf_document_parser.cpp
@@ -5,6 +5,7 @@
 
 #include <odr/internal/font/cff_font.hpp>
 #include <odr/internal/font/sfnt_font.hpp>
+#include <odr/internal/font/type1_font.hpp>
 #include <odr/internal/pdf/pdf_cmap_parser.hpp>
 #include <odr/internal/pdf/pdf_document.hpp>
 #include <odr/internal/pdf/pdf_document_element.hpp>
@@ -276,9 +277,10 @@ util::math::Transform2D parse_matrix(DocumentParser &parser, Object object) {
 /// interface: `/FontFile2` (TrueType / `CIDFontType2`) -> `SfntFont`, and
 /// `/FontFile3` (CFF / `Type1C` / `CIDFontType0C`, or OpenType-CFF) -> either
 /// an `SfntFont` (when the program is already a full SFNT, `/Subtype
-/// /OpenType`) or a bare `CffFont`. `/FontFile` (Type1) is not yet read and
-/// leaves `font.embedded_font` null, so such fonts keep rendering through the
-/// fallback path. A malformed font is logged and left null.
+/// /OpenType`) or a bare `CffFont`. `/FontFile` (Type1) is translated to a CFF
+/// (`type1::to_cff`) and read as a `CffFont`, so it reuses the whole CFF path.
+/// A malformed font is logged and leaves `font.embedded_font` null, so such
+/// fonts keep rendering through the fallback path.
 void load_embedded_font(DocumentParser &parser, const Dictionary &descriptor,
                         Font &font) {
   try {
@@ -301,6 +303,15 @@ void load_embedded_font(DocumentParser &parser, const Dictionary &descriptor,
         font.embedded_font =
             std::make_shared<font::cff::CffFont>(std::move(data));
       }
+    } else if (descriptor.has_key("FontFile") &&
+               descriptor["FontFile"].is_reference()) {
+      // Type1 (`/FontFile`): translate the program to a CFF, then read it as a
+      // CffFont so the whole CFF path (re-encode / wrap / reverse map) applies.
+      std::string data =
+          parser.read_decoded_stream(descriptor["FontFile"].as_reference());
+      const font::type1::Type1Program program(data);
+      font.embedded_font =
+          std::make_shared<font::cff::CffFont>(font::type1::to_cff(program));
     }
   } catch (const std::exception &e) {
     ODR_WARNING(parser.logger(),
diff --git a/test/src/internal/font/type1_font.cpp b/test/src/internal/font/type1_font.cpp
index a372b0b2..7ef5d974 100644
--- a/test/src/internal/font/type1_font.cpp
+++ b/test/src/internal/font/type1_font.cpp
@@ -1,5 +1,9 @@
 #include <odr/internal/font/type1_font.hpp>
 
+#include <odr/internal/font/cff_font.hpp>
+#include <odr/internal/font/cff_transform.hpp>
+#include <odr/internal/font/sfnt_font.hpp>
+
 #include <gtest/gtest.h>
 
 #include <cstdint>
@@ -109,3 +113,21 @@ TEST(Type1FontTest, DecryptsCharstringsAndSubrs) {
   ASSERT_EQ(font.subrs().size(), 1u);
   EXPECT_EQ(font.subrs()[0], std::string("\x0b", 1)); // return
 }
+
+TEST(Type1FontTest, ConvertsToLoadableCff) {
+  namespace cff = odr::internal::font::cff;
+  namespace sfnt = odr::internal::font::sfnt;
+
+  const Type1Program program{build_type1()};
+  const std::string cff_bytes = to_cff(program);
+
+  const cff::CffFont font{cff_bytes};
+  EXPECT_EQ(font.format(), odr::FontFormat::cff);
+  // .notdef (synthesized, since the test font has none) + A + B.
+  EXPECT_EQ(font.glyph_count(), 3);
+  EXPECT_EQ(font.glyph_name(1), "A");
+  EXPECT_EQ(font.glyph_name(2), "B");
+
+  // The converted CFF wraps into a browser-loadable OTTO (the 3.4 path).
+  EXPECT_TRUE(sfnt::SfntFont::is_sfnt(cff::wrap_to_otf(font)));
+}