Performance & Cost

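All token counts and timings below are measured, not estimated: token counts use tiktoken (cl100k_base), and discovery-loop figures come from tools/discovery_tax.py. Generation costs are the one exception; they are estimated from measured token counts. Reproduce any of them with the commands shown.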

Extraction (real conversions)

Measured with pdftotext (PDF) and ebooklib (EPUB):

Book	Format	Pages	Tokens	Chapters auto-detected
Think Python 2	PDF	244	119K	19
Working Backwards	PDF	371	175K	10
Pro Git	PDF	501	229K	— †
Moby-Dick	EPUB	—	301K	133

† Pro Git heads its chapters with section titles (no "Chapter N"), so it does not auto-segment. Moby-Dick's chapter bodies also use bare titles, but its Roman-numeral table of contents is detected (133 chapters); see Known limitations in the README.
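The distinction in the footnote can be illustrated with a minimal sketch. It assumes, purely for illustration, that detection keys on heading lines of the form "Chapter N"; the pattern, the function name, and the sample strings are hypothetical and are not taken from the tool's source.

```python
import re

# Hypothetical pattern: a line that starts with "Chapter <number>".
# This is an illustrative assumption, not the tool's actual rule.
CHAPTER_HEADING = re.compile(r"^[ \t]*chapter[ \t]+\d+\b", re.IGNORECASE | re.MULTILINE)


def count_chapter_headings(text: str) -> int:
    """Count lines that begin with 'Chapter' followed by a number."""
    return len(CHAPTER_HEADING.findall(text))


# Made-up samples: one with numbered headings, one with bare section titles.
numbered = "Chapter 1\nThe way of the program\n\nChapter 2\nVariables\n"
titled = "Getting Started\n\nGit Basics\n\nGit Branching\n"

print(count_chapter_headings(numbered))  # 2
print(count_chapter_headings(titled))    # 0
```

A book whose chapters open with bare titles yields zero matches under a rule like this, which is consistent with the "does not auto-segment" behavior described for Pro Git.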

Extraction method matters for technical books. On a 103-page technical PDF:

Method	Time	Tables	Code blocks
pdftotext	0.1s	0	0
Docling (technical mode)	164s	48	36

pdftotext is near-instant but flattens structure; Docling takes ~1.6 s/page (164 s over 103 pages) but preserves tables and code blocks as markdown. Pick text mode for prose and technical mode for code and tables.
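To budget extraction time for a longer book, the per-page rate can be derived from the table and extrapolated. The sketch below assumes time scales linearly with page count, which the measurements above do not establish; the function names are illustrative.

```python
# Timings from the table above (103-page technical PDF).
PAGES = 103
SECONDS = {"pdftotext": 0.1, "docling": 164.0}


def seconds_per_page(method: str) -> float:
    """Measured wall time divided by page count."""
    return SECONDS[method] / PAGES


def estimate_seconds(method: str, pages: int) -> float:
    """Linear extrapolation to another page count (an assumption, not a measurement)."""
    return seconds_per_page(method) * pages


print(f"{seconds_per_page('docling'):.2f} s/page")                    # 1.59 s/page
print(f"{estimate_seconds('docling', 501) / 60:.1f} min for 501 pages")  # 13.3 min
```

Under that linear assumption, a 501-page book would take roughly 13 minutes in technical mode, so the mode choice matters more as page count grows.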

The Discovery Loop Tax

Each figure is the number of tokens that enter context to answer one targeted question. book-to-skill loads a resident core (~4K tokens) plus one compiled chapter (~1K tokens), about 5,000 tokens in total.

Book (chapter size)	Context-dump	Discovery loop	book-to-skill	vs dump / loop
Think Python 2 (small)	119,264	12,152	~5,000	24× / 2.4×
Working Backwards (medium)	175,253	33,444	~5,000	35× / 6.7×
AI Engineering (large)	256,287	77,866	~5,000	51× / 15.6×

python3 tools/discovery_tax.py --full-text /tmp/book_skill_work/full_text.txt --target-chapter 5

The context-dump advantage (24–51×) is the strongest claim, because the dump cost recurs on every conversation turn.
The discovery-loop advantage (2.4–15.6×) is smaller: the loop is a one-time cost, and the figure comes from a model that uses the book's real ToC and chapter sizes. It scales with chapter size.
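The ratios in the last column are each baseline divided by the ~5,000-token book-to-skill load. A short sketch that recomputes them from the table values (the variable and function names are illustrative):

```python
RESIDENT_TOKENS = 5_000  # ~4K core + ~1K compiled chapter, from the text above

# (context-dump tokens, discovery-loop tokens) per book, copied from the table.
BOOKS = {
    "Think Python 2": (119_264, 12_152),
    "Working Backwards": (175_253, 33_444),
    "AI Engineering": (256_287, 77_866),
}


def ratios(dump: int, loop: int, resident: int = RESIDENT_TOKENS) -> tuple[float, float]:
    """Return (dump / resident, loop / resident)."""
    return dump / resident, loop / resident


for title, (dump, loop) in BOOKS.items():
    vs_dump, vs_loop = ratios(dump, loop)
    print(f"{title}: {vs_dump:.0f}x / {vs_loop:.1f}x")
# Think Python 2: 24x / 2.4x
# Working Backwards: 35x / 6.7x
# AI Engineering: 51x / 15.6x
```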

Generation cost

One-pass full conversion, estimated from measured tokens (Claude Sonnet 4.5, \$3 / \$15 per MTok input/output):

Book	Input	Output	~Cost
Think Python 2	155K	28K	\$0.88
Working Backwards	228K	19K	\$0.96
Pro Git	298K	23K	\$1.23
Moby-Dick	391K	17K	\$1.42

Roughly \$1 per book (\$0.88 to \$1.42 in the table above) for a full skill, paid once. Re-reading the same PDF into context every session costs far more over time (see the Discovery Loop Tax above).
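The cost column follows from the token counts and the listed prices, and the same arithmetic gives a rough break-even point against re-sending the whole book. The sketch below rests on simplifying assumptions that are not from the measurements: each question sends the full text once at the input price, there is no prompt caching, and answer tokens are ignored. Because the table's token counts are rounded to the nearest thousand, results land within about a cent of the listed costs.

```python
import math

# Prices from the text above: USD 3 / USD 15 per million input/output tokens.
INPUT_PRICE = 3.00 / 1_000_000
OUTPUT_PRICE = 15.00 / 1_000_000


def generation_cost(input_tokens: int, output_tokens: int) -> float:
    """One-time conversion cost in USD."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE


def break_even_questions(one_time_cost: float, dump_tokens: int, skill_tokens: int = 5_000) -> int:
    """Smallest number of questions at which the skill is cheaper than
    re-sending the full text each time (input tokens only, no caching)."""
    saving_per_question = (dump_tokens - skill_tokens) * INPUT_PRICE
    return math.floor(one_time_cost / saving_per_question) + 1


cost = generation_cost(155_000, 28_000)       # Think Python 2 row
print(f"{cost:.3f}")                          # 0.885
print(break_even_questions(cost, 119_264))    # 3
```

Under these assumptions the Think Python 2 conversion pays for itself by the third question; prompt caching or multi-turn sessions would shift that point.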

Generated-skill output quality

A before/after of the adaptive-depth change (v1.0.0, #20) on one chapter:

Artifact	Old spec	New spec
Chapter file (tokens)	473	1,219
Worked example present	no	yes
Cheatsheet decision rules	0	32
Cheatsheet keyword/definition lines	9	0

The new spec turns the cheatsheet from a glossary into a decision layer and gives study-depth chapters a reproduced worked example.