Yahoo search engine bug in GSA SER? Has anyone else experienced this?

Hi, whenever I use Scrapebox with the Yahoo engine selected, it finds massive amounts of URLs, and really good ones too (after loading them into GSA PI, they match the footprints from the GSA SER footprint studio very accurately).
But whenever I deselect all search engines in a GSA SER project except Yahoo,
I only get "may be blocked by search engine, and displaying proxy" or something like that.

But the thing is, I use these same proxies for Scrapebox, so I know it's a GSA SER error. But what is the error? The user agent? I have never seen options for user agents in GSA SER, but this was the first thing that came to mind. Does anyone know if you can edit user agents in GSA SER?


  • Sven
    Indeed, Yahoo is a problem.

    They seem to allow access only if you have agreed to their terms on an extra webpage, which is shown when a special cookie is not set. Unfortunately, this cookie value is not something I can simply set.

    I don't know how Scrapebox is parsing Yahoo, but I use the normal webpage that you would use in a browser.
  • No problem @sven, you have already done so much to automate many things. For the people who have this issue, all I can say is: just let Scrapebox do the scraping into a folder and set GSA SER to monitor that folder... problem solved, move along :-)
  • Sven
    Though it would be nice to know how others parse Yahoo these days... Maybe someone can get details, maybe by running an HTTP debugger and sniffing the traffic?
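
For anyone who tries the HTTP debugger idea, here is a rough Python sketch of how two captured requests could be compared once you have them, one from the tool that gets results and one from the tool that gets blocked. It only diffs the header names and values, which should make a missing cookie or a different user agent stand out. Everything in the sample data (host name, user agent strings, cookie name and value) is a made-up placeholder, not a real capture from Scrapebox, GSA SER, or Yahoo, and the helper names are hypothetical.

```python
def parse_raw_headers(raw):
    """Parse 'Name: value' lines into a dict keyed by lower-cased header name."""
    headers = {}
    for line in raw.strip().splitlines():
        name, sep, value = line.partition(":")
        # Skip the request line and anything else that is not a header:
        # header names never contain spaces.
        if not sep or " " in name.strip() or not name.strip():
            continue
        headers[name.strip().lower()] = value.strip()
    return headers


def diff_headers(first, second):
    """Report headers present on one side only, or present on both with different values."""
    return {
        "only_in_first": sorted(set(first) - set(second)),
        "only_in_second": sorted(set(second) - set(first)),
        "different": sorted(k for k in set(first) & set(second) if first[k] != second[k]),
    }


# Placeholder captures (assumed values, for illustration only).
WORKING_REQUEST = """GET /search?p=test HTTP/1.1
Host: search.example.com
User-Agent: ExampleBrowser/1.0
Cookie: consent=placeholder
Accept: text/html
"""

BLOCKED_REQUEST = """GET /search?p=test HTTP/1.1
Host: search.example.com
User-Agent: ExampleTool/2.0
Accept: text/html
"""

result = diff_headers(parse_raw_headers(WORKING_REQUEST), parse_raw_headers(BLOCKED_REQUEST))
print(result)
# {'only_in_first': ['cookie'], 'only_in_second': [], 'different': ['user-agent']}
```

With real captures pasted in place of the two sample strings, the "only_in_first" and "different" lists would be the first places to look. This does not tell you how to obtain a valid cookie, only where the two requests differ.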
