I have an idea: when I give a fresh list of URLs to PI and PI finds an engine (or engines) for a domain, it shouldn't need to check that domain again in the same list (just let all further URLs on those domains pass through).
Ahh, I see what you're saying. Well, the problem with that is we would need to do an extra check on every URL that it sorts. So a URL comes in, and it has to check the identified URLs to see if that URL has been sorted before, which also means that as the identified list grows, it's having to compare each URL against big .txt files.
It would slow things down, use more resources, etc.
The best thing for you to do would be to set up an automatic dedup project and have it automatically remove duplicate domains.
Comments
somedomain14.com/somepath/somefile.html
somedomain32.com/somepath/somefile.html
somedomain14.com/somepath/somefile2.html ...
Once PI has identified somedomain14.com, there's no need to check somedomain14.com/somepath/somefile2.html.
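The pre-dedup step suggested above can be done outside PI before submitting the list. A minimal sketch (not part of PI itself, just an illustration using Python's standard library) that keeps only the first URL seen per domain:

```python
from urllib.parse import urlparse

def dedup_by_domain(urls):
    """Keep only the first URL encountered for each domain."""
    seen = set()
    kept = []
    for url in urls:
        # urlparse only fills in netloc when a scheme is present,
        # so prepend one for bare "domain.com/path" entries
        netloc = urlparse(url if "://" in url else "http://" + url).netloc.lower()
        if netloc not in seen:
            seen.add(netloc)
            kept.append(url)
    return kept

urls = [
    "somedomain14.com/somepath/somefile.html",
    "somedomain32.com/somepath/somefile.html",
    "somedomain14.com/somepath/somefile2.html",
]
print(dedup_by_domain(urls))
# the second somedomain14.com URL is dropped
```

This way PI only ever sees one URL per domain, so no extra per-URL lookup against the growing identified lists is needed.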