Custom time between search queries
GSA suggests putting 60 seconds of waiting time between search queries to prevent proxy bans.
My question is: what is the waiting time if I leave it unchecked?
I want to optimize GSA scraping because I recently upgraded to 150 rotating proxies.
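To make the trade-off concrete, here is a minimal sketch (not SER's actual internals) of how a wait time interacts with a rotating proxy pool: if queries round-robin across N proxies, each individual proxy is only hit every N-th query, so the effective per-proxy cooldown is the wait time multiplied by the pool size. The proxy URLs and footprint queries below are made-up placeholders.

```python
import itertools
import time

# Hypothetical proxy pool and footprint queries, purely for illustration.
PROXIES = [f"http://proxy{i}.example.com:8080" for i in range(150)]
QUERIES = ['"powered by wordpress" inurl:blog', 'site:.edu "leave a comment"']

def paced_queries(queries, proxies, wait_seconds=60):
    """Yield (query, proxy) pairs, sleeping between queries.

    With N rotating proxies, each proxy is reused only every N-th
    query, so the same per-proxy cooldown is achieved with a global
    delay of wait_seconds / N instead of wait_seconds.
    """
    proxy_cycle = itertools.cycle(proxies)
    for query in queries:
        yield query, next(proxy_cycle)
        time.sleep(wait_seconds / len(proxies))
```

With 150 proxies, a 60-second per-proxy cooldown works out to only 0.4 seconds of global delay between queries, which is why a bigger pool speeds scraping up so much.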
This is true of a lot of the little things SER does: they are super handy, and I usually leave them on in a minimal/secondary way, but everything SER does besides posting links slows it down dramatically. (It pauses those posting threads to verify links, check emails, etc.)
Scraping is probably the thing that slows SER down most of all.
The nice thing is that SER usually does not scrape for more targets until it has used up everything loaded manually or from your site lists.
What we do is load our big, bad, massive scraped lists and keep search engines toggled on as secondaries. That way, if we run out of targets while sleeping or otherwise not watching the servers, SER keeps on going (though at a much reduced speed).
For the individual looking to scrape a few lists, I would say Scrapebox is the way to go. Hrefer is $900 (includes an Xrumer license), while Scrapebox is just $50.
I think Gscraper is out of the game now as well, so those are your only real options outside of coding something yourself.
If you're up to *that* challenge (it's taken us a good 6 years and constant refinement), you should use Nutch. (It's the framework we built a lot of 1Linklist from.)
But don't panic. I have a simpler, easier solution :P
If none of that sounds good, you can also ramp SER's scraping up quite a bit by adding new/custom search engines. There are a LOT of them, and after we added around 20 or 30, our SER-side scraping speed went up a good 40-50%.
If you go this route, a good starting point:
PS: Make sure SER does not already have the search engine you're adding before you bother. Sven has been pretty thorough as-is.
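The payoff of adding engines is simply that one footprint query fans out into many result pages to harvest. As a rough sketch (SER's real custom-engine definitions use its own format, and the two URL templates below are just illustrative examples), the idea looks like this:

```python
from urllib.parse import quote_plus

# Hypothetical engine URL templates; each added engine multiplies
# the number of result pages one footprint query can pull.
ENGINES = {
    "bing": "https://www.bing.com/search?q={q}&first={offset}",
    "yandex": "https://yandex.com/search/?text={q}&p={page}",
    # ...adding 20-30 more entries is what produced the speedup above.
}

def build_urls(query, page=0):
    """Expand one footprint query into one search URL per engine."""
    q = quote_plus(query)
    return {name: template.format(q=q, offset=page * 10, page=page)
            for name, template in ENGINES.items()}
```

Duplicate results across engines are expected; dedupe the harvested URLs afterward, just as SER does with its target cache.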