Add draft security threat model (THREAT_MODEL.md + SECURITY.md + AGENTS.md)#3052

Open
potiuk wants to merge 3 commits into apache:master from potiuk:asf-security/threat-model-2026-06-18

potiuk commented Jun 19, 2026

Member

This is a proposal for the Drill PMC to review — please correct, reject, or discuss as needed. Every claim is provenance-tagged ((documented) / (inferred)); the (inferred) ones are the team's draft reasoning for you to confirm or strike, collected as "Open questions for the maintainers" (§14, three waves).

This adds a draft THREAT_MODEL.md plus the AGENTS.md -> SECURITY.md -> THREAT_MODEL.md discoverability wiring for Apache Drill, drafted at the PMC's request (Charles Givre, path 3) using the threat-model-producer rubric.

What's needed from the PMC: walk the §14 questions (a one-line confirm / correct / strike per question is plenty). We fold your answers in and the (inferred) tags become (maintainer). Nothing here is a requirement — the scan just runs with less noise when the model is filled in.

Context: this is pre-flight for an automated agentic security scan the ASF Security team is piloting; discoverability (AGENTS.md -> SECURITY.md -> the model) is the one hard gate. Questions / pushback welcome.
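As a rough illustration of the discoverability gate, the sketch below checks that each file in the chain exists and mentions the next one by filename. How the pilot scan actually verifies the chain is not stated in this thread; the `missing_links` helper and the "mentions the filename" criterion are assumptions made only for this example.

```python
from pathlib import Path

# Chain named in the PR description. Treating "discoverable" as
# "the file mentions the next filename" is an assumption of this sketch.
CHAIN = ["AGENTS.md", "SECURITY.md", "THREAT_MODEL.md"]


def missing_links(repo_root):
    """Return a list of problems; an empty list means the chain is intact."""
    root = Path(repo_root)
    problems = []

    # Every file in the chain must exist at the repository root.
    for name in CHAIN:
        if not (root / name).is_file():
            problems.append(f"{name} is missing")

    # Each existing file must mention its successor by filename.
    for src, dst in zip(CHAIN, CHAIN[1:]):
        src_path = root / src
        if src_path.is_file() and dst not in src_path.read_text(encoding="utf-8"):
            problems.append(f"{src} does not mention {dst}")

    return problems


if __name__ == "__main__":
    for problem in missing_links("."):
        print(problem)
```

Run from a repository checkout, the script prints one line per broken link and nothing when the chain is complete.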

Generated-by: Claude Opus 4.8 (1M context)

potiuk added 2 commits June 18, 2026 20:44
…h already covers **/*.md), so .ratignore is unread here and is itself the lone unapproved-license file failing RAT

cgivre left a comment

Contributor

One minor change regarding the contrib/ folder. As a general comment, we've received a lot of so-called vulnerabilities that basically make the claim that you can enumerate file systems within Drill. Our response usually is "yes... that's the point of Drill". I'm hoping the audit will not repeat that.

Comment thread THREAT_MODEL.md Outdated
- **An operator with cluster/root access, the Drillbit service account, or direct ZooKeeper write access.** Anyone who controls the cluster processes or the coordination store has already won. `OUT-OF-MODEL: adversary-not-in-scope`. *(inferred — §14 Q5)*
- **Embedded / single-user mode**, where the caller owns the JVM, the filesystem, and all input — equivalent to an in-process library call. *(inferred — §14 Q1)*
- **SQL injection in an embedding application** that string-concatenates untrusted input into Drill SQL — that is the embedding app's bug, not Drill's. *(inferred — §11)*
- **Bundled connectors / formats under `contrib/`, tests, and examples** — separately authored; threat-model per connector if promoted. *(inferred)*

Contributor

I don't think the components in the contrib/ folder should be out of scope. These ship with Drill and are maintained as part of the Drill project.

…cope; storage-enumeration is by-design

Generated-by: Claude Opus 4.8 (1M context)

potiuk commented Jun 24, 2026

Member Author

Thanks @cgivre — both points folded into THREAT_MODEL.md (just pushed):

  1. contrib/ plugins are in scope. You're right that the storage/format plugins under contrib/ ship with Drill and are maintained as part of the project — they carry the same connector trust surface as the core storage plugins. I'd over-broadly lumped them into the "tests/examples" out-of-scope bullet. Fixed in three places: the §2 component-family table now has a dedicated "contrib/ storage + format plugins → In" row; §3 narrows the out-of-scope bullet to tests/examples/sample-data only, with an explicit "the contrib/ plugins are in scope" note; and the §13 unsupported-component disposition row carries the same caveat so a triager can't misroute a contrib/ finding.

  2. Storage/filesystem enumeration is by-design, not a vuln. Added a KNOWN-NON-FINDING bullet to §11a capturing exactly this: "Drill can enumerate / read the files, schemas, or storage systems it is pointed at" is the function of a federated query engine, not a vulnerability — in-model only when the access crosses the authenticated identity's authorization (reading data an impersonated user shouldn't, or bypassing a Drill view). It cites your "yes… that's the point of Drill" framing directly, so the audit has a citable line to suppress that whole class. Also tagged the relevant §14 question as partially answered by your review.
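The two rules above can be sketched as a small triage function. This is only an illustration of the stated dispositions, not the scan's actual logic: the `Finding` fields, their string values, and the `IN-MODEL` label are hypothetical names chosen for the example, while `KNOWN-NON-FINDING` and `OUT-OF-MODEL` are taken from the model's own wording.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical values: "core", "contrib-plugin", "tests-examples".
    component: str
    # Hypothetical values: "storage-enumeration", "other".
    kind: str
    # True when the access exceeds what the authenticated (or impersonated)
    # identity is authorized to read, e.g. bypassing a Drill view.
    crosses_authorization: bool


def triage(finding):
    """Map a finding to a disposition label under the rules described above."""
    # Tests, examples, and sample data stay out of scope;
    # contrib/ plugins are deliberately NOT excluded here.
    if finding.component == "tests-examples":
        return "OUT-OF-MODEL"

    # Enumerating or reading storage Drill was pointed at is by design,
    # unless the access crosses the identity's authorization.
    if finding.kind == "storage-enumeration" and not finding.crosses_authorization:
        return "KNOWN-NON-FINDING"

    return "IN-MODEL"
```

For example, `triage(Finding("contrib-plugin", "storage-enumeration", False))` is suppressed as a known non-finding, while the same finding with `crosses_authorization=True` stays in the model regardless of whether the plugin lives in core or under contrib/.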

The model is still a v0 draft for the PMC to react to — corrections welcome on any of the (inferred) claims.

cgivre left a comment

Contributor

LGTM +1 (Pending CI)
Thanks @potiuk

cgivre commented Jun 24, 2026

Contributor

@jnturton @rymarm Any comments?

2 participants