Eyal Rosenthal · Web scraping at scale

Sitemap → JSON-LD Bulk Extractor — Universal Pattern for 'Scrape Every Recipe / Product / Article'

Maps to "scrape every X on this site" Upwork briefs. Two-stage pipeline: pull sitemap.xml (handles sitemap-index nesting), filter URLs by pattern, then extract every JSON-LD