
Scraping is "Finished" in a Matter of Seconds

I loaded up 22 private proxies for scraping, tested them in Bing, and all of them work. I am only using the Bing US search engine, with 1 thread and the timeout set to 5 seconds. "Automatically disable proxies if discovered as being down" is checked off under "Private Proxies."

I let it rip and it "finishes" in 2 seconds. I test the proxies on Bing again, and all of them seem to be working.

Last night, I let it rip with 10 threads on 22 proxies, all tested, all good. I got a total of about 200 sites and then it aborted. It said my proxies were burned. I kept trying and eventually got up to 500 sites scraped, but half my proxies were burned. When I woke up today, my proxies were all working again on Bing, and I'm not getting aborted anymore; it just "finishes." I got 3 URLs on the last scrape. Finished.

Please give me a clue.

Comments

  • Sven www.GSA-Online.de
    Why rely on Bing US only?
  • edgarallen Massachusetts
    I had about 80 search engines checked the first couple times, but I kept getting aborted, so one change I made was to only use Bing US. And I want Bing US because I'm scraping for only a certain kind of US business. I don't want results from all over the world.
  • Sven www.GSA-Online.de
    If your keywords already contain city names or the like, you should choose international search engines.
    However, this is just a workaround and does not explain why your scraping stops instantly. I can't help you right now because I'm still on vacation, so this will need to wait. I would also need the project backup.
  • edgarallen Massachusetts
    Maybe too many threads? Not enough "timeout", which was set to 5? I am using 22 private proxies. My threads were set to 10, then 5, then 1.