[Suggestion] New Engines: WikkaWiki, DokuWiki, PHPizabi, Socialware, Socialengine

Hello everyone!
This is a list of footprints that I think Sven and Ozz should work on and add in the near future:
-wikka wiki
-doku wiki
-jomsocial
-PHPizabi
-socialware
-socialengine

It would be awesome if these were added to GSA SER. I can also supply the footprints needed to scrape those sites.

Comments

  • Ozz
    edited October 2012
    Those are not footprints.

    These are footprints:
    WikkaWiki: "Powered by WikkaWiki"|inurl:/wikka/UserSettings|inurl:/wikka.php?wakka=UserSettings
    DokuWiki: "dokuwiki.txt"|inurl:/wiki/dokuwiki|"wiki is licensed under"|"Driven by DokuWiki"


    To improve the chances of getting these engines added and to speed up the whole process, you need to post some sample sites as well as good footprints for the engines not mentioned above.

    PS: I've moved this thread and edited the title.
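For anyone preparing footprints in this style, here is a minimal Python sketch of how a line like the ones above could be broken into separate search queries. It assumes that `|` separates alternative footprints, which the examples suggest but this thread does not confirm. `split_footprint` is a hypothetical helper, not part of GSA SER.

```python
def split_footprint(footprint: str) -> list[str]:
    """Split a '|'-separated footprint line into individual search queries.

    Assumption: '|' never appears inside a quoted phrase.
    """
    return [part.strip() for part in footprint.split("|") if part.strip()]


# The WikkaWiki footprint line from the post above.
wikka = '"Powered by WikkaWiki"|inurl:/wikka/UserSettings|inurl:/wikka.php?wakka=UserSettings'

for query in split_footprint(wikka):
    print(query)
```

Each printed line can then be tried as its own search query to check that it really returns sites running the engine.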
  • Maybe there's a way to increase the success rate with MediaWiki sites as well, since a lot of them have a new sort of captcha that CS can't solve on its own.


  • Samples please.
  • @mmtj
    Captcha Sniper lets you save unknown captchas,
    and then you can use Captcha Destruction Kit to try to solve them.

    The following thread has a lot of info on newly created captcha solvers
    and an export/import utility for sharing them with others,
    so that we can try to improve your solvers:
    https://forum.gsa-online.de/discussion/comment/5292
  • Ozz
    edited October 2012
    [Script] SocialEngine.ini

    I've written this script blindly as I have no opportunity for testing within the next 48h.

    Notes:
    - "page must have=" as an identifier is somewhat vague, but it might work as there is no "powered by socialnet" available. The identifier is made for sites with possible blog posting.
    - The profile backlink is not defined, but you can post your URL during the registration process, so no extra step is needed. The profile URL verification is not defined yet.
    - about_yourself is without anchor/HTML and without line breaks.
    - article is with anchor/HTML, but if you post a URL without HTML code it will be shown as a plain-text URL.
    - register step2 -> I think there are some fields I don't know of yet, like 1_1_1 or 1_1_2.
    - I got a "token" failure when posting an article the first time, so that could cause some issues. When I tried a second time, everything was OK.
  • [Script] DZOIC Handshakes.ini

    Again, no tests, and I fear this script may cause some major issues.

    Notes:
    - many, many steps
    - Registration is complicated and I hope this works ^^. You get a verification code as well, but the mail contains a verification link, and if we click it, we don't need to enter the code.

    sample url that works: bitchingbullshitclub.com
    sample url where public profile/blog seems not to work: nymomsconnect.com

    Take your time. I know you have a lot to do, so priority for both scripts is low.
  • Hey Ozz,

    Just thought I'd check out your SocialEngine script, as I already have some sites scraped, and I get a lot of the following in my error log:

    unknown field "Filedata" was used in form.
    unknown field "captcha[input]" was used in form.

    There are hundreds of these, so I'm wondering if fixing them might improve the current success rate drastically :)
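To see which fields are worth fixing first, the messages can be tallied per field name. This is a rough Python sketch that assumes every such log line has exactly the wording quoted above. `count_unknown_fields` and the sample lines are illustrative only, and how the log is exported from SER is left open.

```python
import re
from collections import Counter

# Matches the message format quoted above and captures the field name.
UNKNOWN_FIELD = re.compile(r'unknown field "([^"]+)" was used in form')


def count_unknown_fields(log_lines):
    """Count how often each unknown form field shows up in the log lines."""
    counts = Counter()
    for line in log_lines:
        match = UNKNOWN_FIELD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Illustrative sample only; a real log would be read from a file.
sample_log = [
    'unknown field "Filedata" was used in form.',
    'unknown field "captcha[input]" was used in form.',
    'unknown field "Filedata" was used in form.',
]

counts = count_unknown_fields(sample_log)
for field, n in counts.most_common():
    print(f"{field}: {n}")
```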

  • Ozz
    edited October 2012
    As I said, I couldn't test the scripts so far and have written them blindly.

    "Filedata" is the field for image-upload. You should fix that with:

    Filedata=%leave%

    "captcha[input]" should be the field for the captcha input, lol. Some captchas work differently from the standard ones and need to be handled with some other lines of code I don't know of yet.
    I can't test them within the next 30h or so, so you'll have to wait until Sven has time for this, as my scripts don't work without fixing by Sven anyway ;)

  • WordPress Multisite looks great too, and it's not an overspammed platform.
    Footprint:
    "Please enter your username and e-mail address. You will receive a new password via e-mail."  "Username:"  "E-mail:"
  • Sven www.GSA-Online.de
    just got SocialEngine to work :)
  • Sweet! Thanks Sven - can you paste code?
  • Sven www.GSA-Online.de
    DZOIC Handshakes also done :)
  • AlexR Cape Town
    Sven...you're a machine...I vote we all buy Sven a cup of coffee to go into overdrive! Imagine what he could achieve....maybe even deconstruct the Google algorithm next.. :-0
  • cool story
  • Sven www.GSA-Online.de
    I don't drink coffee...imagine if I did ;P
  • Hey Sven and Ozz, here are some new footprints for platforms that I think are not integrated yet:

    "Powered by PHPizabi" "Drop your"
    "Powered by PHP-Fusion" inurl:article_id
    Drupal Forums (inurl:forum "create new account" "request new password")
    eZ Publish * ("Log in or create a user account to comment." "Powered by eZ publish")
    Geeklog (inurl:article "Powered By GeekLog")
    IpBoard * ("powered by ip.board" inurl:topic)
    jForum ("Powered by JForum" inurl:posts register)
    Livestreet * ("powered by livestreet" inurl:blog)
    Mambo ("No account yet? Create one" inurl:Itemid "Please login or register to add")
    Mantis ("Mantis 1.1.8" Notes inurl:bug_id "Signup for a new account")
    Redmine ("powered by redmine" -inurl:projects inurl:issues Register Open -Resolved -Fixed)
    Trac * (trac inurl:ticket "Register")
    UBBThreads * ("Powered by UBB.threads" inurl:showflat -inurl:site_id "Register User")
    Fireboard * ("powered by Fireboard" "The administrator has disabled public write access")
    pMachine ("powered by pMachine" "Notify me when someone replies to this post")
    Pivot ("powered by pivot" inurl:entry "Remember personal info?")
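Most entries in this list follow a `Name (footprint)` pattern, some with a `*` after the name. A small Python sketch for turning such lines into structured records follows. `parse_entry` is a hypothetical helper, and since the meaning of the `*` is not stated in the post, it is only recorded as a flag. Lines without parentheses, like the first two, are returned as `None` and need handling by hand.

```python
import re

# Name, optional '*', then the footprint inside the outermost parentheses.
ENTRY = re.compile(r'^(?P<name>[^(*]+?)\s*(?P<starred>\*)?\s*\((?P<footprint>.+)\)$')


def parse_entry(line):
    """Parse 'Name (footprint)' or 'Name * (footprint)'; return None otherwise."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "starred": match.group("starred") is not None,
        "footprint": match.group("footprint"),
    }


entries = [
    'Drupal Forums (inurl:forum "create new account" "request new password")',
    'Trac * (trac inurl:ticket "Register")',
    '"Powered by PHPizabi" "Drop your"',
]

for line in entries:
    print(parse_entry(line))
```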
  • @globalgoogler -- I would vote for Sven being able to deconstruct G's algo, but honestly, I don't believe even the scientists at Google can deconstruct it. They literally have teams of people who work on SECTIONS of the algo and then run it on a test version of Google to see how their changes affected the SERP results.

    @sven -- is the socialengine.ini now in 4.54???  I see there is a SocialEngine.ini in the /engines/ directory of my SER install.
  • Sven www.GSA-Online.de
    Yes, it's added (see changelog).