Crawl 10,000 URLs and put HTML blobs in Elasticsearch
$250-750 USD
Closed
Posted about 9 years ago
Paid on delivery
Crawl 10,000 URLs and store the HTML blobs in Elasticsearch
For each page, need to store:
name, ID, full URL, and the site's domain URL
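As a rough illustration of what one stored record could look like, here is a minimal sketch. The field names, the `build_page_doc` helper, and the use of a SHA-1 hash of the URL as the ID are all assumptions, not part of the requirements. The "domain url" field is read here as the scheme plus host of the page's site.

```python
import hashlib
from urllib.parse import urlsplit


def build_page_doc(name, full_url, html):
    """Build one record holding the listed fields plus the raw HTML blob.

    Hypothetical helper: field names and the ID scheme are illustrative only.
    """
    parts = urlsplit(full_url)
    # Assumption: "domain URL" means scheme + host, e.g. https://example.com
    domain_url = f"{parts.scheme}://{parts.netloc}"
    # Assumption: a hash of the full URL gives a stable ID, so re-crawling
    # the same page overwrites its record rather than duplicating it.
    doc_id = hashlib.sha1(full_url.encode("utf-8")).hexdigest()
    return {
        "id": doc_id,
        "name": name,
        "url": full_url,
        "domain_url": domain_url,
        "html": html,
    }
```

A record built this way is a plain dictionary, so it could be handed to whichever Elasticsearch client ends up being used.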
Need to limit the crawl to each site's core domain (or subdomain)
Need to limit to 5,000 pages per site
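The two limits above could be enforced along these lines. This is a sketch under stated assumptions: `in_scope` and `PageBudget` are hypothetical names, and "core domain (or subdomain)" is interpreted as "the seed host itself or any host ending in `.` plus the seed host".

```python
from collections import Counter
from urllib.parse import urlsplit

MAX_PAGES_PER_SITE = 5000  # per-site cap from the requirements


def in_scope(url, seed_host):
    """Return True if url's host is seed_host or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    seed_host = seed_host.lower()
    # The leading dot prevents "notexample.com" from matching "example.com".
    return host == seed_host or host.endswith("." + seed_host)


class PageBudget:
    """Counts pages claimed per site and refuses once the cap is reached."""

    def __init__(self, limit=MAX_PAGES_PER_SITE):
        self.limit = limit
        self.counts = Counter()

    def try_claim(self, seed_host):
        """Reserve one page for seed_host; return False if the cap is hit."""
        if self.counts[seed_host] >= self.limit:
            return False
        self.counts[seed_host] += 1
        return True
```

A crawler loop would call `in_scope` on each discovered link and `try_claim` just before fetching, skipping the link if either returns False.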
Would be nice to run this on several AWS spot instances at the same time so we can crawl more quickly
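One possible way to split the work across several instances is to assign each seed site to a worker by hashing its host name. The sketch below assumes a fixed number of workers, each knowing its own index. `worker_for` and `seeds_for_worker` are hypothetical helpers, and nothing here is specific to AWS.

```python
import hashlib
from urllib.parse import urlsplit


def worker_for(seed_url, num_workers):
    """Map a seed URL to a worker index in [0, num_workers) by its host."""
    host = (urlsplit(seed_url).hostname or "").lower()
    # A cryptographic digest is used only because it is stable across
    # processes and machines, unlike Python's built-in hash() for strings.
    digest = hashlib.md5(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


def seeds_for_worker(seed_urls, worker_index, num_workers):
    """Return the subset of seed URLs that this worker should crawl."""
    return [u for u in seed_urls if worker_for(u, num_workers) == worker_index]
```

Because every URL on the same host maps to the same worker, each site is crawled by exactly one instance, which keeps a per-site page count local to that instance.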
Will run Elasticsearch on a single large AWS instance (lots of RAM and CPU)