Scrape job listings from searX instances and dump them to a CSV file.
mharb
da835d4fb0
Saving my progress on job_scraper_v2. This overhaul separates cURL requests and HTML parsing into distinct stages. Buyer beware: it is far from complete. The flow is: request all HTML documents first, close all network connections, parse the HTML, and finally save the results to disk.
| File |
|---|
| .gitignore |
| job_scraper.py |
| LICENSE |
| README.md |
searX-py-job-scraper (WIP NOT COMPLETE)
Scrape job listings from searX instances and dump them to a CSV file. For instances to query, see the list of all active public searX hosts.
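
The staged flow described in the commit message (request everything, close connections, parse, then save) could look roughly like the sketch below. This is not the code in job_scraper.py: the function and class names are hypothetical, `urllib` stands in for cURL so the example needs only the standard library, and the parser simply collects every link on a page because the actual searX result markup is not shown here.

```python
import csv
from html.parser import HTMLParser
from urllib.request import urlopen


def fetch_all(urls, fetch=None):
    """Stage 1: download every page before any parsing starts.

    `fetch` is a hypothetical hook for swapping in another client (or a stub).
    """
    if fetch is None:
        def fetch(url):
            # The context manager closes the connection once the body is read.
            with urlopen(url, timeout=10) as response:
                return response.read().decode("utf-8", errors="replace")
    return {url: fetch(url) for url in urls}


class LinkCollector(HTMLParser):
    """Collects (text, href) pairs for every <a href=...> element.

    Assumption: a real scraper would filter on searX-specific markup instead.
    """

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def parse_all(pages):
    """Stage 2: parse the already-downloaded HTML; no network access here."""
    rows = []
    for source, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        collector.close()
        rows.extend((source, title, href) for title, href in collector.links)
    return rows


def write_csv(rows, out):
    """Stage 3: write the parsed rows to an open text file object."""
    writer = csv.writer(out)
    writer.writerow(["source", "title", "url"])
    writer.writerows(rows)
```

To use it, pass a list of search URLs to `fetch_all`, hand the result to `parse_all`, and write the rows with `write_csv` to a file opened with `newline=""`. Keeping the stages separate means a parsing bug can be debugged against saved HTML without hitting any searX host again.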