@banditim $50 for 500 million links sounds fair, especially if those were deduped links.
Suggestion: how about integrating a web crawler that crawls the entire web? Each page the crawler visits would be matched against a footprint. You could then build a search engine that searches the crawled pages for keywords and filters the results by engine.
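To illustrate the footprint-plus-keyword idea, here's a minimal sketch of how footprints and keywords could be paired into search queries (hypothetical helper, not how the service actually does it):

```python
# Sketch: pair every footprint with every keyword to form search queries.
# Footprints and keywords below are made-up examples.

def build_queries(footprints, keywords):
    """Combine each footprint with each keyword into one search query."""
    return [f'{fp} "{kw}"' for fp in footprints for kw in keywords]

footprints = ['"powered by wordpress"', 'inurl:guestbook']
keywords = ["gardening", "fitness"]

queries = build_queries(footprints, keywords)
# 2 footprints x 2 keywords -> 4 distinct queries
```

This is why query counts blow up so fast: the list grows as footprints x keywords, which is exactly where caching repeated terms pays off.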
BlazingSEO http://blazingseollc.com/proxy
@bencrabara - Crawling the 'entire web' is a little overkill, but we definitely do some extra crawling outside of just users' scrapes, which lets us gather even more footprints and websites for specific engines.
raperez
This sounds really fantastic! Scraping lists is always a very time- and power-consuming process, so this service could have a real impact!
Thank you for the lack of response; I know now that I won't be doing business with you.
BlazingSEO http://blazingseollc.com/proxy
@Kaine - That's extremely odd -- I had typed out a very long response to you and was awaiting YOUR response haha. No need to get bitter - I'll retype it and send the PM again (not sure what happened to it).
Kaine thebestindexer.com
edited May 2014
Must be a strange bug, I received nothing.
No need to send it by PM. Post the answer here; we're all gentlemen here.
BlazingSEO http://blazingseollc.com/proxy
@Kaine - I'm sorry, I'm a bit confused with your response... do you not want me to reply now?
How good will it be at importing massive footprint lists? By massive I mean around the 5M mark (something I am about to start on). And how quickly would it return the results?
Also, when charging for results, will you be charging for the number after or before the dedupe?
sumusiko
Yeah, I'd pay around $30-70 for this after testing it myself or hearing reviews and comparisons with the other two.
spammasta
Is the beta out yet?
BlazingSEO http://blazingseollc.com/proxy
@Flembot - We haven't gotten to the point of testing 5 million, but we allow users to copy/paste or upload files with their respective footprints and/or keywords, so it shouldn't be an issue. Result return speed will vary depending on how many of your search terms we have in our cache. If most of your search terms are cached, you could get 100 million results back in a matter of minutes. As for charging, to keep the comparisons even with Scrapebox and GScraper, we will charge before dedupe. We obviously know there will be a lot of dupes in there and will keep that in mind, but we want to make an exact comparison with the other tools to prove we WILL be cheaper and more efficient than the others.
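The before-dedupe billing described above basically comes down to counting the raw list while still handing back the unique one. A rough sketch with made-up URLs:

```python
# Sketch: charge on the raw (pre-dedupe) count, return the deduped list.
# URLs are illustrative, not real scrape output.

def dedupe_preserving_order(urls):
    """Drop duplicate URLs while keeping first-seen order."""
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique

scraped = [
    "http://example.com/page1",
    "http://example.com/page2",
    "http://example.com/page1",  # duplicate hit from a second query
]

billable = len(scraped)                     # charged before dedupe: 3
unique = dedupe_preserving_order(scraped)   # 2 unique URLs delivered
```

Charging on the raw count is what makes the numbers directly comparable with what Scrapebox and GScraper report, since those also count every returned result.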
@spammasta - Not yet. We've still got a couple weeks to make sure the whole system is ready to go. I wanted to make this thread now to get our beta testers signed up. I will stop accepting new beta testers fairly soon.
Ferryman
@BanditIM - Interested to see the pricing then, because with GScraper you can use the free version to scrape all day long. With dedupe enabled it is really hard to reach the URL limit.
Will there be the ability to filter results? For example, I doubt anyone wants to get Google webcache results at all.
Peisithanatos
Really nice idea. Count me in to test it, as always.
BlazingSEO http://blazingseollc.com/proxy
@Ferryman - With the free version you still have to provide your own proxies and server costs though. If you're talking about just using 1 or 2 threads to scrape with your plain IP, I think that's a stretch haha. We may offer something like 10,000 free scrapes per month for all users so it is comparable to that free scraping.
As for filtering results, we'll add in all those extra features as we progress. They are simple and easy to add in, but just take some time because there are so many of them to add. We want to get the core idea down before getting too far ahead of ourselves. I do have to ask though -- I've been scraping for years and haven't encountered my scraped results coming back with webcache results... does Google somehow show this in the searches? Mind giving an example?
The idea of caching looks great, since Google is getting smarter about proxies every single day. But it comes down to the refresh rate of the cached queries. If it's going to be something like 2 weeks to 1 month, they will be quite useless.
Ferryman
edited May 2014
@BanditIM - Yes, of course you have to provide your own proxies. Still, for the $30 mentioned above you get enough reverse proxies to use all day long. For $100 I wouldn't even bother getting the service unless it is really phenomenal.
About the webcache - weird, I am getting them a lot on GScraper (every third result or so). It doesn't really bother me, as I just dedupe down to unique domains.
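For what it's worth, filtering these out yourself is trivial, since Google cache results all live under one hostname. A quick sketch (example URLs are made up):

```python
# Sketch: strip Google webcache entries from a scraped result list.
# Google cache results are served from webcache.googleusercontent.com.

def drop_webcache(urls):
    """Remove Google cache URLs from a list of scraped results."""
    return [u for u in urls if "webcache.googleusercontent.com" not in u]

results = [
    "http://example.com/blog",
    "http://webcache.googleusercontent.com/search?q=cache:example.com/blog",
    "http://example.org/forum",
]

clean = drop_webcache(results)  # the cache entry is dropped
```

Deduping to unique domains afterwards would also catch most of these, since the cached copy and the live page point at the same target domain anyway.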
Would be nice if there were an option to get daily, weekly, monthly, etc. results so you could just get the fresh ones instead of scraping the same thing over and over again.
BlazingSEO http://blazingseollc.com/proxy
@derdor - Completely agreed. We plan on monitoring certain popular footprints (i.e. "powered by wordpress") and seeing how often the search results update when using a handful of keywords with those footprints. Right now we're seeing that 3-10 days will be a good number to start off with for the refresh rate.
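The refresh logic itself is simple: a cached query only gets re-scraped once its age passes the window. A sketch, using an assumed 7-day threshold (the midpoint of the 3-10 day range above):

```python
# Sketch: decide whether a cached query result is stale and needs a
# re-scrape. The 7-day threshold is an assumption, not the real setting.

from datetime import datetime, timedelta

REFRESH_AFTER = timedelta(days=7)

def needs_rescrape(last_scraped: datetime, now: datetime) -> bool:
    """True once the cached result is older than the refresh window."""
    return now - last_scraped > REFRESH_AFTER

now = datetime(2014, 5, 20)
stale = needs_rescrape(datetime(2014, 5, 1), now)    # 19 days old
fresh = needs_rescrape(datetime(2014, 5, 18), now)   # 2 days old
```

Popular footprints could get a shorter window and obscure ones a longer one, which is presumably what the monitoring is meant to calibrate.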
@Ferryman - Regarding reverse proxies, of course you can scrape all day long with those, but $30 gets you something like 10 ports... once we do some case studies on certain thread counts, we will have a good idea of what we need to charge to be competitive. 10 ports or 1,000 ports, it all scales pretty evenly, so the number of links you can scrape in a month going that route will be less than the number you can get with us at the same price. Also note, we see our service as a little more premium (though we won't charge for it) due to the fact it can auto-scrape 24/7 and auto-FTP, something the other software cannot do.
Just want to keep everyone in the loop -- we haven't forgotten about you guys. With the upcoming release of our new text captcha system, the scraper has been bumped down one priority, but it's very, very close! Check out the easy-to-use dashboard that'll have you scraping links 24/7 in a matter of a couple of minutes:
Sounds awesome - I would definitely be interested in this...
Vijayaraj India
@banditIM After trying your email service and spin service (too bad you removed it), I'm waiting to get my hands on this. I've always had trouble with scraping and proxies, so this will be the better option for me. On a completely unrelated note, the ads in the screenshots show a south Indian actress.
@sumusiko - Sweet, great to hear!
@Peisithanatos - Great man, thanks for the interest!