Shopify Bookstore Lead Finder
Pipeline that finds verified Shopify-hosted bookstores from a candidate list. The core is a two-step check: it uses Shopify's /products.json endpoint (Shopify-only, returns the catalog as JSON) to verify the platform, then applies a book-shape filter (ISBN-pattern SKUs + book/author/poetry/textbook keywords) to confirm the catalog. Contact enrichment and tier scoring run on the stores that pass.
Run
pip install requests beautifulsoup4 curl_cffi
python3 extract.py
Output: leads.csv (raw verified leads) and leads_qa.csv (US-filtered + tier-scored).
How it works
Stage 1 — Candidate sourcing. Combines a curated seed list of indie/specialty bookstore URLs with Bing search queries. The verifier handles the false-positive removal, so the candidate list can be noisy.
Stage 2 — Shopify verification. For each candidate, fetch /products.json from the candidate's domain. This endpoint exists only on Shopify storefronts. If it returns a valid JSON payload with a products array, the domain is confirmed as Shopify. Otherwise the candidate is rejected.
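The check above can be sketched roughly as follows. This is a minimal illustration, not the code in extract.py: the function names are hypothetical, and it uses only the standard library so it runs without dependencies (the real script presumably goes through requests or curl_cffi).

```python
import http.client
import json
import urllib.request
from typing import Optional


def parse_products_payload(body: str) -> Optional[list]:
    """Return the products list if body is a JSON object with a 'products' array, else None."""
    try:
        data = json.loads(body)
    except ValueError:
        return None  # HTML error page, empty body, etc.
    if not isinstance(data, dict):
        return None
    products = data.get("products")
    return products if isinstance(products, list) else None


def fetch_shopify_products(domain: str, timeout: float = 10.0) -> Optional[list]:
    """Fetch https://<domain>/products.json; None means 'not confirmed as Shopify'."""
    url = f"https://{domain}/products.json"
    # Assumption: a browser-like User-Agent is sent; some storefronts may reject bare clients.
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        # Covers HTTP errors (404 etc.), DNS failures, timeouts, and malformed URLs.
        return None
    return parse_products_payload(body)
```

Keeping the payload check separate from the network call makes the verification rule easy to test against saved responses.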
Stage 3 — Bookstore filter. Apply two heuristics to the catalog:
- ≥30% of sampled products contain book-shaped keywords (book, author, ISBN, paperback, novel, poetry, textbook, memoir, etc.)
- OR ≥3 products with ISBN-pattern SKUs (matching ^(97[89])?\d{9,12}$)
Either signal qualifies. This rejects gift shops with a few books while accepting comic-book stores, RPG/tabletop publishers, used-book sellers, etc.
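A rough sketch of the two heuristics, assuming each product is a dict shaped like the entries in /products.json (title, body_html, product_type, tags, and variants with a sku field). The function names and the shortened keyword tuple are illustrative, not the exact ones used in extract.py.

```python
import re

# Regex and thresholds taken from the description above.
ISBN_SKU = re.compile(r"^(97[89])?\d{9,12}$")
# Assumption: a shortened keyword list; the real list is longer ("etc." above).
BOOK_KEYWORDS = ("book", "author", "isbn", "paperback", "novel",
                 "poetry", "textbook", "memoir")


def has_book_keyword(product: dict) -> bool:
    """True if any book-shaped keyword appears in the product's text fields."""
    tags = product.get("tags") or []
    if isinstance(tags, str):  # tolerate tags delivered as one string
        tags = [tags]
    text = " ".join([
        str(product.get("title") or ""),
        str(product.get("body_html") or ""),
        str(product.get("product_type") or ""),
        " ".join(str(t) for t in tags),
    ]).lower()
    return any(keyword in text for keyword in BOOK_KEYWORDS)


def has_isbn_sku(product: dict) -> bool:
    """True if at least one variant SKU matches the ISBN pattern."""
    for variant in product.get("variants") or []:
        sku = str(variant.get("sku") or "").strip()
        if ISBN_SKU.match(sku):
            return True
    return False


def is_bookstore(products: list, keyword_ratio: float = 0.30, min_isbn: int = 3) -> bool:
    """Either signal qualifies: keyword share >= 30% OR >= 3 ISBN-pattern SKUs."""
    if not products:
        return False
    keyword_hits = sum(has_book_keyword(p) for p in products)
    isbn_hits = sum(has_isbn_sku(p) for p in products)
    return keyword_hits / len(products) >= keyword_ratio or isbn_hits >= min_isbn
```

Note that plain substring matching also fires on words like "notebook", so a stricter version might match on word boundaries instead.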
Stage 4 — Contact enrichment. Fetch /, /pages/contact, and /pages/about. Extract: company name, email (mailto: links + visible addresses), phone (tel: links + US-pattern numbers), country (footer hints + ZIP-code presence), services blurb (meta description / first paragraph).
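As an illustration of the email and phone part only, here is a regex-based sketch. It is an assumption about how the extraction could look, not the actual implementation (which likely uses BeautifulSoup); the patterns and the function name are hypothetical, and company name, country, and blurb extraction are left out.

```python
import re
from urllib.parse import unquote

MAILTO = re.compile(r'href=["\']mailto:([^"\'?]+)', re.I)
TEL = re.compile(r'href=["\']tel:([^"\']+)', re.I)
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Separators are required so bare 10-digit strings (e.g. ISBN-10s) are not read as phones.
US_PHONE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]\d{3}[\s.-]\d{4}(?!\d)"
)


def extract_contacts(html: str) -> dict:
    """Return the first email and phone found, preferring mailto:/tel: links."""
    emails = [unquote(m).strip() for m in MAILTO.findall(html)]
    emails += EMAIL.findall(html)  # fall back to visible addresses
    phones = [unquote(m).strip() for m in TEL.findall(html)]
    phones += US_PHONE.findall(html)  # fall back to US-formatted numbers
    return {
        "email": emails[0] if emails else None,
        "phone": phones[0] if phones else None,
    }
```

Running this over the home, contact, and about pages in turn and keeping the first non-empty value per field would match the page order listed above.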
Stage 5 — Tier scoring.
- Gold = full contact (email + phone) + ISBN-validated catalog
- Silver = email OR phone
- Bronze = URL only
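The tier rules reduce to a few lines. In this sketch the function name and arguments are hypothetical, and one reading of the rules is assumed: a lead with both email and phone but no ISBN-validated catalog falls through to Silver.

```python
from typing import Optional


def score_tier(email: Optional[str], phone: Optional[str], isbn_validated: bool) -> str:
    """Map contact completeness and catalog validation to a tier label."""
    if email and phone and isbn_validated:
        return "Gold"    # full contact + ISBN-validated catalog
    if email or phone:
        return "Silver"  # at least one contact channel
    return "Bronze"      # URL only
```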
Why this is better than hand-curated lists
The interesting finding from the demo run: of 50 well-known US indie bookstores I tested as a seed list, only one (Tattered Cover) was on Shopify. Most run on IndieCommerce (the American Booksellers Association platform), Squarespace, or custom builds.
In that sample, 49 of 50 candidates (98%) were false positives. A "Shopify bookstore" list hand-built from Google results, with no programmatic verification, is likely to be dominated by false positives in the same way. The verifier fixes that by checking the platform layer instead of the search-result layer.
Real Upwork brief this maps to
See PROPOSAL.md — written for a $100 fixed-price brief asking for verified US Shopify bookstores with contact enrichment.
Sample output
leads_qa.csv contains the verified Tattered Cover record from the seed run (50 products, 48 ISBN-pattern SKUs in catalog). Production version would source candidates from the Shopify Stores Directory + Bookshop.org affiliate list + niche bookstore directories — typically yields 50-150 verified bookstores.
Hire me to build this for your stack
Same patterns, your target site. Send the brief and I'll quote fixed-price within 24 hours.
info@luba.media