
Google detects footprints and bans proxies. How do you scrape these days?

I tried the GSA SER footprints first, then some custom ones, but ScrapeBox won't scrape at all; Google detects it and blocks it. I tried elite Google-passed proxies, then 250 backconnect proxies that rotate every 10 minutes, but it's still the same. What are you doing to avoid this problem? The problem is without doubt the footprints.
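One quick way to tell whether a proxy is actually being blocked (rather than the footprints simply returning nothing) is to fire a single footprint query through it and look at what Google sends back; a blocked proxy typically gets HTTP 429 or a redirect to the /sorry/ CAPTCHA page. Below is a minimal sketch of that check; the proxy address, the example footprint, and the check_google_block helper are placeholders for illustration, not anything from ScrapeBox or GSA SER.

```python
import requests

# Hypothetical proxy address and example footprint query, for illustration only.
PROXY = "http://127.0.0.1:8080"
QUERY = '"powered by wordpress" "leave a comment" -captcha'

def check_google_block(query, proxy):
    """Send one footprint query through a proxy and report whether
    Google served results or a block/CAPTCHA response."""
    resp = requests.get(
        "https://www.google.com/search",
        params={"q": query, "num": 10},
        proxies={"http": proxy, "https": proxy},
        headers={"User-Agent": "Mozilla/5.0"},
        allow_redirects=False,
        timeout=15,
    )
    # 429 or a redirect to the /sorry/ page means the proxy is rate-limited/flagged.
    if resp.status_code == 429 or "/sorry/" in resp.headers.get("Location", ""):
        return "blocked"
    if resp.status_code == 200:
        return "ok"
    return f"unexpected status {resp.status_code}"

if __name__ == "__main__":
    print(check_google_block(QUERY, PROXY))
```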

Comments

  • A low thread count will do the trick.
  • Low threads and custom footprints.

    For blogs, use -captcha (see the query sketch after the comments).
  • Agreed. Definitely pull back on the thread count.
  • penumbra Antarctica
    I tried lowering the threads but that doesn't help. I will now try dedicated proxies.
  • What version of SB are you using? If you don't have it, download v2.0; it makes a huge difference. Next, there are two options on the harvester: go to the custom harvester and use SB's server proxies, not yours. This will not work on Google, but who cares. I am doing my scrape today; 3 hours in and I have 2.1 million URLs, without scraping Google.
  • penumbra Antarctica
    @viking I use v2.0. So you scrape Yahoo? Do footprint operators work the same for Yahoo?
  • shaun https://www.youtube.com/ShaunMarrs
    I use GSA Proxy Scraper to scrape Bing proxies and then scrape Bing/Yahoo rather than Google. There are many more proxies available and they don't get banned as fast either.

    Thread count is a bit of a problem as I don't have a set number of proxies that auto-update every 5 minutes, but I try to work out my average proxy count and run 1 thread per 10 public proxies (see the thread-count sketch after the comments).
  • No issues being experienced on SB with Yahoo. Works great.
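On the footprint side, the -captcha tip above just means appending an exclusion operator to each blog footprint before it goes into the harvester, so pages that mention a CAPTCHA are filtered out at query time. Here is a minimal sketch of merging footprints with keywords that way; the footprint and keyword lists and the build_queries helper are placeholders, not ScrapeBox's own keyword merge or GSA SER's footprint list.

```python
# Placeholder footprints and keywords; swap in your own lists.
footprints = ['"powered by wordpress"', '"leave a comment"', 'inurl:blog']
keywords = ["gardening tips", "vegan recipes"]

def build_queries(footprints, keywords, exclusions=("-captcha",)):
    """Combine every footprint with every keyword and append
    exclusion operators such as -captcha to each query."""
    suffix = " ".join(exclusions)
    return [f'{fp} "{kw}" {suffix}' for fp in footprints for kw in keywords]

for q in build_queries(footprints, keywords):
    print(q)
# e.g. "powered by wordpress" "gardening tips" -captcha
```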
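shaun's ratio of roughly 1 thread per 10 public proxies is easy to keep automatic if the harvesting is scripted: recount the live proxy list before each run and derive the thread cap from it. A minimal sketch under those assumptions follows; the proxies.txt file, the fetch helper, and the example query are hypothetical, and parsing/error handling are left out. It targets Bing here simply because that is the engine mentioned above.

```python
import itertools
import random
from concurrent.futures import ThreadPoolExecutor

import requests

def load_proxies(path="proxies.txt"):
    """One proxy per line, e.g. http://1.2.3.4:8080 (hypothetical file)."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def fetch(query, proxy):
    """Run one footprint query against Bing through the given proxy."""
    resp = requests.get(
        "https://www.bing.com/search",
        params={"q": query},
        proxies={"http": proxy, "https": proxy},
        headers={"User-Agent": "Mozilla/5.0"},
        timeout=15,
    )
    return resp.status_code, len(resp.text)

if __name__ == "__main__":
    proxies = load_proxies()
    threads = max(1, len(proxies) // 10)   # ~1 thread per 10 public proxies
    queries = ['"powered by wordpress" gardening -captcha']  # placeholder queries
    # Shuffle the pool and cycle through it so requests spread across proxies.
    pool = itertools.cycle(random.sample(proxies, len(proxies)))
    with ThreadPoolExecutor(max_workers=threads) as ex:
        for status, size in ex.map(fetch, queries, pool):
            print(status, size)
```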