Feature request: Master parsed root domain list
GSADomination
California
I've noticed that after scraping for a while, many of the same root domains get crawled again at some point, as I'm finding many duplicate emails/URLs. Every day or so I clear out the project so that the URLs/data don't take up too many resources.
Is there a way to build a master list of already-parsed root domains so that I'm not recrawling the same domains over and over again?
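In the meantime, a workaround along these lines might help: keep a persisted set of root domains that have already been parsed, and filter new URLs against it before feeding them back into the scraper. This is only a sketch, not anything built into the software; the file name is hypothetical, and the root-domain extraction is deliberately naive (a real version should consult the Public Suffix List, e.g. via the `tldextract` package, to handle domains like `.co.uk` correctly):

```python
from urllib.parse import urlparse
from pathlib import Path

MASTER_FILE = Path("master_domains.txt")  # hypothetical location of the master list

def root_domain(url):
    """Naive root-domain extraction: the last two labels of the hostname.
    Good enough for .com/.net-style domains; multi-part TLDs need the
    Public Suffix List instead."""
    host = (urlparse(url).hostname or "").lower()
    host = host.removeprefix("www.")
    parts = host.split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else host

def load_master():
    """Read the persisted master list into a set (empty set if absent)."""
    if MASTER_FILE.exists():
        return set(MASTER_FILE.read_text().split())
    return set()

def save_master(seen):
    """Write the master list back out, one root domain per line."""
    MASTER_FILE.write_text("\n".join(sorted(seen)) + "\n")

def filter_new(urls, seen):
    """Yield only URLs whose root domain hasn't been seen before,
    adding each new domain to the seen-set as it goes."""
    for url in urls:
        dom = root_domain(url)
        if dom and dom not in seen:
            seen.add(dom)
            yield url
```

Run before each scraping session: `seen = load_master()`, filter the URL list through `filter_new(urls, seen)`, then `save_master(seen)` afterwards, so duplicates from earlier sessions never get recrawled.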