
Can't search on Google

Hi, in my SER log it says my proxy is blocked by Google a lot. When I double-click the entry, it leads to this URL: https://www.google.com.au/search?q=site:.edu+inurl:"/user/profile.php?id="+moodle&as_qdr=all&filter=0&num=100&complete=0&cr=countryAU It seems there are no results for the footprint, but when I click on 'Reset search tools', the results appear. How can I make SER find URLs from Google? Please help. Thanks
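For reference, the logged query carries several narrowing parameters. A small Python sketch (assuming nothing about SER's internals) pulls them apart and drops the `cr=countryAU` country restriction, which is roughly what 'Reset search tools' does in the browser:

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# The blocked query exactly as it appears in the SER log.
blocked_url = ('https://www.google.com.au/search?'
               'q=site:.edu+inurl:"/user/profile.php?id="+moodle'
               '&as_qdr=all&filter=0&num=100&complete=0&cr=countryAU')

params = parse_qs(urlsplit(blocked_url).query)
# cr=countryAU limits results to pages Google classifies as Australian;
# dropping it widens the result set for a narrow footprint like this one.
params.pop('cr', None)

relaxed_url = 'https://www.google.com/search?' + urlencode(params, doseq=True)
print(relaxed_url)
```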

Comments

  • spunko2010spunko2010 Isle of Man
    edited February 2014
    The only options seem to be increasing the time between searches and using loads of proxies, and even that only works for so long. If you have FoxyProxy or something, you could try entering the captchas from Google manually in your browser; I have always wondered if that worked.
  • Thanks very much. I have 10 projects at the moment and the LPM is very low, about 0.20 or less. How much LPM should I expect with 10 projects (running 5 at a time) and 20 proxies (10 private, 10 semi-private), with SER scraping for URLs? I guess my low LPM is because SER can't search for URLs.
  • @Spunko2010, I actually just tried entering the recaptcha myself the other day, don't ask me why, and it allowed me to once again scrape Google from that proxy. It might not be a bad idea if SER could do this automatically in the future whenever a proxy gets banned. It would be very useful for people who don't mind the recaptcha costs or have a monthly paid service.

    @hoanglo, it's impossible to predict your LPM. It could be 0.0001 or even 50-100. As spunko suggested, you need to increase the time between search queries if you can't get more proxies, or do both until the desired results have been achieved.
  • spunko2010spunko2010 Isle of Man
    edited February 2014
    Hi fakenickahl, I and a few others have requested this feature many times from Sven, but he has said no, it's too difficult. However, I am considering paying a developer to do it if there is enough scope, and releasing a plugin. I think it will be what separates the wheat from the chaff :D I don't think Google is using reCaptcha though; it looks like a custom one?
  • My time between queries is 120 seconds. I don't know exactly how this feature works, but I heard somebody recommend 120 seconds.
    Is it enough?
  • How many proxies do you have, and how many projects active? Hard to judge...
  • I have 10 private and 10 semi-private proxies, and about 10 projects. I run the scheduler, so only 5 projects are active at a time, for 20 minutes each.
  • @spunko2010, I did not really pay attention to what kind of captcha it was, but if DBC can solve it, I'll try making a quick tool for it soon. Before I start, I imagine one problem will be that SER does not offer an easy way to tell you which proxies have been banned, though I have a solution in mind I'll try. I'll release it on the forum if I can get a working solution. It seems counterproductive to test the proxies every x minutes outside SER, but it would be quite useful to accomplish, so I'll give it a try.
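    To make the "which proxies have been banned" check concrete, here is a minimal, hypothetical sketch of the classification step such a tool would need. This is not SER's internals, and the fetch-through-proxy and DBC/captcha-solving parts are deliberately left out; it only encodes the common signals that Google has flagged an IP (an HTTP 429, or a redirect to its /sorry/ captcha page):

```python
# Hypothetical helper, not part of SER: decide from a raw HTTP response
# whether Google has flagged the proxy's IP. Fetching through the proxy
# and solving the captcha are out of scope here.
BAN_MARKERS = ('/sorry/', 'unusual traffic')

def looks_banned(status_code, final_url, body):
    """Heuristic only: Google commonly answers a flagged IP with
    HTTP 429, or redirects the search to its /sorry/ captcha page."""
    if status_code == 429:
        return True
    text = (final_url + ' ' + body).lower()
    return any(marker in text for marker in BAN_MARKERS)
```

    A tool would run this against one test query per proxy and queue only the proxies that come back "banned" for captcha solving.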
  • @fakenickahl what are some useful tools available for SER?
  • @hoanglo, uhm, I can't suggest any related to your problem off the top of my head, I'm afraid. If I'm successful doing what I briefly explained above, I'll surely have a useful tool for you.

    Regarding your problem: how many threads are you using in SER? A wait time of 120 seconds between search queries might simply not be enough if you're forcing SER to scrape links too often. You could make sure your proxies are Google-passed once again, try increasing the wait between search queries, and then see if your proxies are still getting banned. It's also a great advantage to use multiple Google search engines.
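    As a back-of-the-envelope check: if the wait applies per proxy (an assumption; the thread doesn't say how SER schedules it), 20 proxies at 120 seconds each allow roughly 10 search queries per minute across the whole pool, no matter how many threads or projects are running. A toy throttle makes that bookkeeping concrete; it is an illustration of the idea, not SER's actual scheduler:

```python
import time

# Toy per-proxy throttle (an assumption about what the "time between
# search queries" setting does, not SER's real scheduler): each proxy
# must wait at least `delay_seconds` between its own Google queries.
class ProxyThrottle:
    def __init__(self, delay_seconds=120):
        self.delay = delay_seconds
        self.last_used = {}  # proxy -> timestamp of its last query

    def wait_time(self, proxy, now=None):
        """Seconds this proxy must still wait before its next query."""
        now = time.monotonic() if now is None else now
        last = self.last_used.get(proxy)
        if last is None:
            return 0.0
        return max(0.0, self.delay - (now - last))

    def mark_used(self, proxy, now=None):
        self.last_used[proxy] = time.monotonic() if now is None else now
```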
  • I'm trying 200 threads at the moment. I will try increasing the wait time to 180.
  • I know what the problem is. At the end of the URL there's a country filter, so I think I will try choosing only the international search engines.