Eyal Rosenthal · Web scraping at scale

Open Library Books Extractor

Open Library Books Extractor — Bulk Book Metadata via Internet Archive's Open DB

Bulk-extracts book metadata from Open Library (the Internet Archive's open book database) via the search.json endpoint, with optional per-work enrichment. Fits briefs like "build me a book metadata DB for [author / topic / year range]".

Built 2026-05-03 as Demo #32.

Run

. ~/freelance/.venv/bin/activate
cd ~/freelance/portfolio_demos/openlibrary_books_extractor
python extract.py --queries "tolkien,asimov,le guin,octavia butler" --limit 5
python extract.py --queries "isaac asimov" --enrich    # adds description + subjects
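
For readers who want to see what a run like this might do under the hood, here is a minimal sketch of the search step. It is not the actual extract.py. The helper names (split_queries, build_search_url, fetch_docs), the FIELDS list, and the User-Agent string are assumptions made for illustration; the endpoint and its q, limit, and fields parameters are Open Library's public search API.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://openlibrary.org/search.json"

# Fields requested from the search API. This list is an assumption, chosen to
# cover the per-row columns this README mentions.
FIELDS = [
    "key", "title", "author_name", "first_publish_year", "edition_count",
    "language", "subject", "has_fulltext", "ia", "cover_i",
]


def split_queries(raw: str) -> list[str]:
    """Split a comma-separated --queries value into trimmed, non-empty terms."""
    return [q.strip() for q in raw.split(",") if q.strip()]


def build_search_url(query: str, limit: int = 5) -> str:
    """Build a search.json URL for one query term."""
    params = {"q": query, "limit": limit, "fields": ",".join(FIELDS)}
    return f"{SEARCH_URL}?{urllib.parse.urlencode(params)}"


def fetch_docs(query: str, limit: int = 5) -> list[dict]:
    """Fetch the raw search results ("docs") for one query term."""
    request = urllib.request.Request(
        build_search_url(query, limit),
        # Hypothetical User-Agent; identify your own client here.
        headers={"User-Agent": "openlibrary-books-extractor-demo"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response).get("docs", [])


if __name__ == "__main__":
    for term in split_queries("tolkien,asimov,le guin,octavia butler"):
        print(term, "->", build_search_url(term, limit=5))
```

Running the file only prints the URLs that would be requested; calling fetch_docs performs the actual HTTP request.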

Result

  • 20 books extracted across 4 author queries (Tolkien, Asimov, Le Guin, Octavia Butler) ✅
  • Per-row fields: title, authors, first_publish_year, edition_count, languages, subjects_count, has_fulltext, Internet Archive ID, cover URL, Open Library key ✅
  • Optional --enrich adds per-work description + full subjects ✅
  • Use cases: book recommendation startups, library catalogs, academic data ✅
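
The per-row fields and the enrichment step listed above could be produced roughly as follows. This is a sketch under assumptions, not the demo's real code: the function names and the output keys ia_id, cover_url, and ol_key are hypothetical, while the input keys (author_name, language, subject, ia, cover_i, key) are the ones the Open Library search API returns. Work records sometimes store description as a plain string and sometimes as an object with a value key, so the helper handles both.

```python
def doc_to_row(doc: dict) -> dict:
    """Flatten one search.json doc into a row (output key names are assumptions)."""
    cover_id = doc.get("cover_i")
    ia_ids = doc.get("ia") or []
    return {
        "title": doc.get("title"),
        "authors": "; ".join(doc.get("author_name") or []),
        "first_publish_year": doc.get("first_publish_year"),
        "edition_count": doc.get("edition_count", 0),
        "languages": ",".join(doc.get("language") or []),
        "subjects_count": len(doc.get("subject") or []),
        "has_fulltext": bool(doc.get("has_fulltext")),
        # Keep only the first Internet Archive identifier, if any.
        "ia_id": ia_ids[0] if ia_ids else None,
        "cover_url": (
            f"https://covers.openlibrary.org/b/id/{cover_id}-M.jpg"
            if cover_id else None
        ),
        "ol_key": doc.get("key"),
    }


def work_url(ol_key: str) -> str:
    """URL of the per-work JSON record, e.g. for a key like '/works/<id>'."""
    return f"https://openlibrary.org{ol_key}.json"


def description_text(work: dict) -> str:
    """Return the work description whether it is a string or a {'value': ...} object."""
    description = work.get("description", "")
    if isinstance(description, dict):
        return description.get("value", "")
    return description or ""


def enrich_row(row: dict, work: dict) -> dict:
    """Return a copy of the row with description and full subjects added."""
    enriched = dict(row)
    enriched["description"] = description_text(work)
    enriched["subjects"] = list(work.get("subjects") or [])
    return enriched
```

Missing fields fall back to None, an empty string, or zero, so a sparse record still yields a complete row that can be written straight to CSV or JSON.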

Hire me to build this for your stack

Same patterns, your target site. Send the brief and I'll quote fixed-price within 24 hours.

info@luba.media