New Google Scraper Coming Soon!



  • BlazingSEO
    @Vijayaraj - Haha, those aren't a part of our website, it's part of ;). But thanks so much for the encouraging words, we're very excited to get this rolling out!
  • I would be interested to try this out...
  • Are the results unique, or do we have to dedupe them?
  • BlazingSEO
    @fengli - We return (and charge) all results that are scraped. On your results page you then have the option to export the results as:

    1. Original results (all of them)
    2. De-duped by domain
    3. De-duped by URL
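    The three export modes above can be sketched roughly like this (an illustrative sketch only; the function and mode names are assumptions, not BlazingSEO's actual implementation):

```python
from urllib.parse import urlparse

def export(urls, mode="original"):
    """Return scraped results per export mode.

    mode: "original" (all results), "domain" (keep first URL per
    domain), or "url" (exact-URL de-dupe); order is preserved.
    """
    if mode == "original":
        return list(urls)
    seen, out = set(), []
    for u in urls:
        # De-dupe key: the domain for "domain" mode, the full URL otherwise.
        key = urlparse(u).netloc if mode == "domain" else u
        if key not in seen:
            seen.add(key)
            out.append(u)
    return out

urls = [
    "http://example.com/page1",
    "http://example.com/page2",
    "http://example.com/page1",
]
print(export(urls, "url"))     # exact duplicate dropped
print(export(urls, "domain"))  # only the first example.com hit kept
```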

  • Great, it is amazing.
  • So you are storing all the scraped URLs?
  • BlazingSEO
    @momba12 - Correct. We won't ever disclose or sell them for any kind of lists, but we'll simply make the scraping process for all users faster by giving you the scraped results if another user has already scraped them. Why waste resources scraping for the same keywords over and over again? :)
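    The keyword-level cache described above could work along these lines (a minimal sketch; the function names and in-memory storage are assumptions, not the service's real design):

```python
# Hypothetical keyword-result cache: if another user already scraped a
# keyword, serve the stored results instead of re-scraping Google.
cache = {}  # keyword -> list of previously scraped URLs

def scrape_google(keyword):
    # Stand-in for the real scraper; returns placeholder results here.
    return [f"http://example.com/{keyword.replace(' ', '-')}/{i}" for i in range(3)]

def get_results(keyword):
    if keyword not in cache:          # only hit Google on a cache miss
        cache[keyword] = scrape_google(keyword)
    return cache[keyword]

first = get_results("seo tools")   # scrapes and stores
second = get_results("seo tools")  # served from cache, no re-scrape
assert first == second
```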
  • When is this coming out? Sign me up
  • Yep, excellent concept.  Please count me in, even on the beta testing if possible. 
  • BlazingSEO
    @jgf213 - We had to put it behind our new text captcha system because that was a much higher priority, but we're hoping we can roll out some beta testers in the next 2-3 weeks.
  • BlazingSEO
    Quick update everyone - we are very very close! After running some tests today our system came back with some incredible statistics.... an average of 400 links per SECOND! That's right, with the few servers we have right now running the system we will be able to handle over 30 million links a day, and that's not even mentioning with the help of our cache system!

    We have one last thing left to do and that's to find enough proxy sources so that we never run out with the volume we expect to use. With that being said, if you have or know of anyone who sells good Google-passed proxies, please shoot me a PM. If it's not a 'well known' or easily findable source, I will make sure to reward you with a lot of free scraping when we go live :)
  • Looking forward to it, message me once you are live.
  • I've never been satisfied with how Scrapebox gets its scraping done at the moment. Willing to try your service ASAP when it goes live, mate. Let me know when you're ready to go ;)
  • I am in for the beta tests. I have several servers running with hrefer and gscraper 24/7. I would happily try and break your system! LMAO. Dashboard looks nice and user-friendly.

    One little feature request is a "sieve filter", which is the main reason why I love using hrefer over gscraper. It filters out a lot of the crap URLs as it harvests. You can filter the same way with gscraper, but it requires a lot of extra steps with the final list. It works a lot better if results are filtered automatically as they are scraped.
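    A harvest-time sieve filter like the one requested might look something like this (illustrative only; the patterns and generator shape are assumptions, not hrefer's actual filter):

```python
import re

# Example patterns for unwanted URLs to drop as they are harvested.
BAD_PATTERNS = [re.compile(p) for p in (r"\.pdf$", r"/tag/", r"facebook\.com")]

def sieve(harvested):
    """Yield URLs as they arrive, skipping any that match a bad pattern."""
    for url in harvested:
        if not any(p.search(url) for p in BAD_PATTERNS):
            yield url

stream = [
    "http://blog.example.com/post-1",
    "http://example.com/files/report.pdf",
    "http://facebook.com/somepage",
    "http://forum.example.org/thread/42",
]
print(list(sieve(stream)))
```

    Because it is a generator, the filter runs inline with the harvest instead of as a post-processing pass over the final list.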
  • BlazingSEO
    Just an update for all the new posters, we are testing the system right now, but as you may have seen in the recent SER update our newest Text Captcha System (like askMeBot) was implemented a couple days ago. We've been hard at work getting that prepped and working properly so we can go live with it instead of just beta testers.

    Make sure to sign up for our email list, which will contain information regarding the release and updates of the Google scraper (no additional emails will be sent - you have my word)

  • BlazingSEO
    Newest update - we're getting close everybody! Stats are rolling in and data is looking good. Need some additional work on our proxy handler, but once that finishes up we will be in search of many proxy sources that will allow us to scrape FAST.

    If anyone reading this has used GScraper (or Scrapebox), what was the fastest links per minute or links per second that you have received before? Need to know what we have to beat :)


  • miren Macedonia
    I'm currently scraping with Scrapebox with 160 Avg urls/s
  • BlazingSEO
    Thanks @miren ! I forgot to note in my previous post that we are only using a handful of crappy public proxy lists during our testing phase (hence the low links per minute / second). May I ask how many proxies you're using and how much you're paying for them that is achieving those results? Maybe even provide me here (or PM if you need) the source you're using to achieve that? Thanks pal!
  • @miren I would be cautious giving out proxy sources here.

    With services like these your proxies will become useless for future use.
  • BlazingSEO
    @j1387 - thanks for trolling ;)
  • @BanditIM Trolling? It's the truth.
    You've been begging for proxy sources to make your service work for a long time.

    Why would anyone in their right mind tell you their proxy source.
    Why would I give you my source that allows me to scrape at 100k per minute?

    So you could kill them with your service LOL. No thanks :D
  • miren Macedonia
    edited September 2014
    @BanditIM With 20 shared proxies and one connection you will get at least 5 Avg urls/s :)
  • Brandon Reputation Management Pro
    With GScraper and my proxy guy (when it's running well) I can get 100k+ per minute. When it's running badly, about 15k+ per minute.
  • Gscraper and public proxies, up to 50K per minute
  • Count me in for beta testing, dear sir. At the moment I am using Gscraper and Scrapebox to scrape, but I am definitely interested in using this.
  • Keen to test this out! @banditim
  • BlazingSEO
    @miren @Brandon @Seljo -- Thanks guys for the input :). We're seeing that once we buy the proxy sources we have on our list we will be able to easily obtain these numbers - so that's awesome!
  • Count me in as well!
  • Currently scraping with a Gscraper so would be very interested to beta test and see the difference. Count me in as well!
  • BlazingSEO

    First off, anyone who is truly interested in this system should sign up to the mailing list if you want to be a beta tester. I will not send any promotional emails out -- ONLY emails regarding this system will be sent to you. The list can be found here:

    About the update -- we're confident in being ready for beta users within the next few days. The system is currently scraping at 30 links per second with the couple of public proxy sources we currently use. This theoretically means... we increase the proxy sources = increase in links per second. Testing will commence throughout the week and another update will be given shortly.

    Note: The auto-footprint scraper isn't quite working yet, so it will not be functional in the first beta release. The dashboard for it is there, but the backbone of it is a very complex AI project using natural language processing and heuristics, and it's occupying a lot of time right now. It will be done though :)