Feature request: Master parsed root domain list

I've noticed that after scraping for a while, many of the same root domains get crawled again, and I end up with a lot of duplicate emails/URLs. Every day or so I clear out the project so that the URLs/data don't use too many resources.

Is there a way to build a master list of root domains that have already been parsed, so that I'm not recrawling the same domains over and over?
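
As a rough illustration of the idea (not how the tool works internally), here is a minimal Python sketch of a master list kept in a plain text file and used to filter new URLs before crawling. All names (`root_domain`, `filter_new`, `load_seen`, `save_seen`) are hypothetical, and the root-domain logic is a simplification: it assumes the last two labels of the hostname form the root, plus a tiny hand-written set of two-label suffixes such as `co.uk`. A real implementation would want the full Public Suffix List.

```python
import os
from urllib.parse import urlsplit

# Assumption: a tiny, incomplete set of two-label suffixes, for illustration only.
TWO_LABEL_SUFFIXES = {"co.uk", "com.au", "co.jp"}


def root_domain(url):
    """Return a rough root domain for url, or '' if no hostname can be parsed."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    if len(labels) < 2:
        return host
    # Keep three labels when the last two form a known two-label suffix.
    two_label = ".".join(labels[-2:]) in TWO_LABEL_SUFFIXES
    keep = 3 if two_label and len(labels) >= 3 else 2
    return ".".join(labels[-keep:])


def load_seen(path):
    """Load the master list (one root domain per line); empty set if no file yet."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as f:
        return {line.strip() for line in f if line.strip()}


def save_seen(path, seen):
    """Write the master list back to disk, one root domain per line."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(sorted(seen)) + "\n")


def filter_new(urls, seen):
    """Return URLs whose root domain is not yet in seen, recording each new root."""
    fresh = []
    for url in urls:
        root = root_domain(url)
        if root and root not in seen:
            seen.add(root)
            fresh.append(url)
    return fresh
```

Because the list lives in its own file, it would survive clearing out the project. Note that `filter_new` keeps only the first URL it sees for each new root domain and skips URLs it cannot parse a hostname from (for example, ones without a scheme); whether that is the right behaviour depends on how the crawl is seeded.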

