
New Search Engine That Doesn't Need Proxies

@sven

There is a search engine, www.webcrawler.com, that uses Yahoo and Google at the same time.
I'm using it in hrefer with no proxies and with 200 threads.
I have been using it for over 6 months.
It is a very fast SE and good for scraping, but the problem is that the URLs need to be cleaned before importing them into GSA.

Can you find some time to work on this, please?
If you find a way to use this in GSA, it could be the only SE used in GSA, as it is very fast and doesn't require proxies or timeouts.


Comments

  • SvenSven www.GSA-Online.de
    thanks, next version has this
  • Just please test it inside GSA... I have added that engine myself in GSA, but it doesn't give good links because part of each URL needs to be stripped.
    So you need to make custom functions just for that engine.
    But once you make that, it will scrape like crazy without proxies, timeouts, etc...

  • SvenSven www.GSA-Online.de

    add it like this:

    
    [webcrawler.com]
    url=http://www.webcrawler.com/search/web?qsi=%page%&q=%search%
    country=international
    links_on_page=10
    start_page=1
    inc_page=10
    enabled=1
    must_have=*ClickHandler.ashx?du=*
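
The URL cleaning mentioned earlier in the thread could be sketched roughly as follows. This is only an illustration in Python, not how GSA or hrefer do it. It assumes that each result link is a redirect whose `du` query parameter carries the percent-encoded destination URL, which is a guess based on the `must_have` pattern above. The function name `extract_target` and the sample link are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def extract_target(link):
    """Return the URL carried in the `du` query parameter, or None if absent.

    Assumption: the destination is percent-encoded inside `du`;
    parse_qs decodes it for us.
    """
    query = urlsplit(link).query
    values = parse_qs(query).get("du")
    return values[0] if values else None


# Hypothetical redirect link: host and the extra `x` parameter are made up.
wrapped = (
    "http://example.invalid/ClickHandler.ashx"
    "?du=http%3A%2F%2Fexample.com%2Fblog%2F&x=1"
)
print(extract_target(wrapped))
```

Links without a `du` parameter return `None`, so they can be filtered out before importing the list.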
    
  • ronron SERLists.com
    +1 @miki and @sven - That is a great addition!
  • LeeGLeeG Eating your first bourne

    Looks good, just need to test it

    I take it we just do it the same way, with the additional file and the engine added to it?

  • SvenSven www.GSA-Online.de
    yes, just add it to se.dat file
  • LeeGLeeG Eating your first bourne

    Done that.

    Just checking in case you have one of those incredibly annoying messages set and going off.

    The only incredibly annoying message so far is the normal incredibly annoying message about mega ocr, which doesn't have an off button option yet :D

  • ronron SERLists.com
    edited January 2014

    If it wasn't for those error messages, @sven would just be relaxing in the Bavarian Alps with no internet connection. Thank you @Sven

    =))
  • LeeGLeeG Eating your first bourne

    Sir, can you please help me :D

    If we push adding this new engine using the se.dat file, Sven will think I'm being vindictive because of those damned annoying error messages and no off button yet :D

    This engine is so goooooooooooooooooood once you add the se.dat file :D

  • Nice one, thank ya, will try it.
  • Somehow it's not working for me. It produces no results, and all the search queries look like this:

    http://www.webcrawler.com/search/web?amp;q="Powered+by+UCenter+Home"

    If you open it in your browser, it says:

    No search term was entered into the WebCrawler search box.

    Please enter your word or phrase to submit your search.

    any idea?
    thx
  • The URL needs to have "fcoid=417".
    Just visit that SE manually and try a search... you will see what the URL looks like.
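
For comparison, here is a rough Python sketch of how the posted template would expand if `%page%` and `%search%` were filled in as plain text. It is an illustration only: how GSA actually substitutes the placeholders is not shown in this thread, and the helper name `build_urls` is made up. The failing URL quoted above has no `qsi` value and a stray `amp;`, which looks as if the `&` in the template got HTML-escaped somewhere, but that is a guess. The `fcoid` parameter is left out because only one commenter reports it.

```python
from urllib.parse import quote_plus

# Template and paging values copied from the se.dat entry posted in this thread.
TEMPLATE = "http://www.webcrawler.com/search/web?qsi=%page%&q=%search%"


def build_urls(search, pages, start_page=1, inc_page=10, template=TEMPLATE):
    """Expand the template for the first `pages` result pages.

    Assumption: %page% is an offset that starts at start_page and grows
    by inc_page for each following page (1, 11, 21, ...).
    """
    encoded = quote_plus(search)  # spaces become '+', quotes become %22
    urls = []
    for i in range(pages):
        offset = start_page + i * inc_page
        url = template.replace("%page%", str(offset)).replace("%search%", encoded)
        urls.append(url)
    return urls


for url in build_urls('"Powered by UCenter Home"', 3):
    print(url)
```

If the expanded URLs differ from what a manual search in the browser produces, the template itself needs adjusting.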


  • But as I said above, Sven needs to make a custom parser for this engine!
    It will not work if you just use what he gave us.
  • @sven, did you find time to play with this SE?
    It requires some custom functions and testing.
    The code you gave us above won't work at all.
  • Doesn't need proxies, but gives very few results :S
  • Visit that SE manually and type "powered by wordpress".
    It takes its results from Google and Yahoo.
  • @sven, is this implemented in the latest update from yesterday?
  • SvenSven www.GSA-Online.de
    yes
  • I'm testing it now, but it looks like it doesn't move from page 1 to page 2 and the other pages inside that SE.
    @sven, did you test this? Did you get it to crawl all results, or just the home page?
  • SvenSven www.GSA-Online.de
    working better on next version
  • edited January 2014
    This engine sounds awesome, I could just use this and stop getting banned proxies :)

    @sven Does the engine work correctly now (7.42), or does it need another update?
  • As he said, it will be ready in the next update.

  • @sven

    How can I test whether GSA crawls all pages of this engine?
    I mean, does it go to page 2, page 3, etc.?
  • SvenSven www.GSA-Online.de
    make a fake project and just select that engine with one engine to see results. Maybe turn debug mode on to see details.
  • Now that this engine has been added, anyone wanna bet on how long this thing stays online or until they change their interface?
  • Those who use hrefer have been using that engine for more than a year.
    And trust me, the hrefer spammer community is much bigger than GSA's.

    I think it still doesn't work in GSA
  • My LPM and threads dropped after selecting only this engine.
  • Yes, @sven, can you please find a way to crawl more than the home page on this engine?