
Scraping running too slow

edited February 2023 in GSA Website Contact
Hi Guys,

My scraping has become very slow over the last few hours. Nothing has changed on my computer. I was previously scraping around 20,000-30,000 links per hour. I did add a batch of new keywords to the project and noticed that scraping is now down to about 500 links per hour. I use public proxies for scraping. Any suggestions, @Sven?

  

Comments

  • Any suggestions on this, @Sven? I just can't make it go faster. I've tried flushing the DNS cache, using Google DNS, etc., but it's still painfully slow :'(
  • Sven www.GSA-Online.de
    Well, using private proxies might be a good start, as it will enable you to parse Google. With public proxies this usually does not work well.
  • s_matysik SmartIndexer.net
    For harvesting, I recommend A-Parser.
  • Sven said:
    Well, using private proxies might be a good start, as it will enable you to parse Google. With public proxies this usually does not work well.
    Thanks, @Sven. I have changed my approach and am now using both private and public proxies for scraping. Still no joy. To be honest, public proxies were working just fine until now, and I was scraping a large number of links with them. Not sure what changed all of a sudden. Any thoughts?
  • Sven www.GSA-Online.de
    I would need to see the details on this. Maybe you can offer login to the system for a closer look!?
  • Thank you for offering to take a look, @Sven, I really appreciate it. Taking your first advice, I moved completely to private proxies. I have around 40 private proxies and started 350 scraping threads on them, and scraping has started to look normal (phew) for the moment. I will surely take you up on the offer if I get stuck again.

    On another note, I need help with the captcha settings. I'm testing the CaptchaAI service via their API, but it is solving captchas very slowly. Their support recommended sending cookies and the proxy with the API call for faster solving, but GSA WC cannot send cookies or a proxy with the CaptchaAI API call. Any chance you can add this? I can point you to the documentation if you like. Thanks again, mate.
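
    As a rough sanity check on the figures quoted in this thread (350 threads on about 40 private proxies, and a drop from 20,000-30,000 to 500 links per hour), the arithmetic can be sketched as below. The function names are hypothetical and only illustrate the calculation; nothing here is part of GSA's software, and no particular threads-per-proxy ratio is being recommended.

```python
def threads_per_proxy(threads: int, proxies: int) -> float:
    """Average number of concurrent threads sharing one proxy."""
    if proxies <= 0:
        raise ValueError("proxies must be a positive number")
    return threads / proxies


def slowdown_factor(before_per_hour: float, after_per_hour: float) -> float:
    """How many times slower the scrape became (before / after)."""
    if after_per_hour <= 0:
        raise ValueError("after_per_hour must be a positive number")
    return before_per_hour / after_per_hour


if __name__ == "__main__":
    # Figures taken from the posts in this thread.
    print(threads_per_proxy(350, 40))      # 8.75 threads per proxy
    print(slowdown_factor(20_000, 500))    # 40.0
    print(slowdown_factor(30_000, 500))    # 60.0
```

    In other words, each private proxy carries roughly nine concurrent threads, and the earlier slowdown on public proxies was on the order of 40x to 60x.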
     

  • Sven www.GSA-Online.de
    I really don't like sending proxies along with captcha requests, as you never know what they do with them. There is a huge risk of the proxies being stolen. Also, most proxy providers don't like you sharing them anyway.
  • Understood. Thanks for the clarification, @Sven. Can we send cookies, though?

  • Sven www.GSA-Online.de
    Cookies are not needed for captcha solving.
  • Okay, thanks. I only asked because their support mentioned it and their documentation covers it too.
     