Scripting Question, how to identify code in source of page

If I understand the manual correctly, entering page must have1=/wp-content/ should make GSA Search Engine Ranker mark any site whose source contains wp-content as a match for this engine. All I want is for it to stay in search mode and keep searching, based on keywords I enter, for sites that have wp-content in the page source. I took one of the WordPress engine files and changed it to look only for this, but when I run the modified engine file it just keeps searching and doesn't find anything. Am I doing something wrong?

Comments

  • Sven www.GSA-Online.de
    Well, show the modification...
  • That's all I did: I changed one of the .ini script files so that the only identification required is page must have1=/wp-content/, since almost every WordPress site has /wp-content/ in its source code.

    What I want to do is identify all WordPress sites, not just the ones GSA can submit to, such as articles and comments. When I rename the file to something like wordpressfind, change only that line, and try it, it doesn't find anything. I put GSA in search-only mode because I don't want any submissions, just searches.
  • Sven www.GSA-Online.de

    You mean it always gets identified as something else? In that case, duplicate the line a few times so that it counts more than the others...

    page must have1=/wp-content/

    page must have2=/wp-content/

    ...

  • edited February 2014
    It never identifies it. I go into the folder that holds the identified engines, and it never generates a text file listing the sites where it found /wp-content/. Would it be easier to just modify the Wp Buddypress or the Article Directory file so that anything with /wp-content/ in the page source is identified as a match for the engine (page must have1=/wp-content/)?
  • Sven www.GSA-Online.de
    Sorry, something else must be wrong then. Please upload it to Pastebin for a closer look.
  • I just uploaded it to one of my throwaway domains. All I'm doing is taking, for example, the WordPress article script and changing it so that the only things it looks for are page must have1=/wp-content/ and url must have1=.com

    What I then want to do is use GSA differently: just let it run, search for sites that match these two conditions, and record them in the list of identified engines. I run this and it never finds anything.

    http://www.hbkdemosite1.info/Wordpressfind.ini
  • I use a similar engine to check the search engines. The engine below should identify wordpress sites:

    [SETUP]
    enabled=1
    default checked=0
    engine type=Utilities
    description=SE tester by cherub
    dofollow=2
    anchor text=1
    creates own page=1
    uses pages=0

    page must have1=wp-content
    page must have2=wp-content
    page must have3=wp-content

    search term="powered by wordpress"

    add keyword to search=2
    use blog search=0

    ;-------------------------------------------------------------------------
    ; Variables we have to define for this engine
    ;

    [Anchor_Text]
    type=text
    alternate data=%spinfile-generic_anchor_text.dat%

    [URL]
    type=url

    ;---------------------------

    [STEP1]
    link type=Profile-Contextual
    find link=User CP
    find url=*profile.php
    alternative url=./profile.php

    form url=*profile.php
    form id=newentry
    form name=newentry

    set unknown variable=%leave%
    variable must be used=u_sign,u_site,u_sig

    verify submission=1
    verify by=url
    first verify=0
    verify interval=10
    verify timeout=7200
    verify on unknown status=1
    verify search detail url=0

    avatar=%random_option%
    u_loca=%columnspinfile-address_data.dat-3%, %columnspinfile-address_data.dat-1%
    u_site=%url%
    u_timezone=%random_option%
  • Thanks, this works perfectly!  
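
For readers who want to sanity-check a list of URLs outside GSA, the two conditions discussed in this thread (the page source contains wp-content, and the URL contains .com) can be sketched in a few lines of Python. This is only an illustration of the idea, not how GSA evaluates engine files: the function names and the rule that every marker must be present are assumptions made for the example.

```python
# Minimal sketch of the filter described in the thread. It is NOT GSA's
# implementation; the names and the "all markers must match" rule are
# assumptions for illustration only.

def matches_conditions(url, html,
                       page_must_have=("wp-content",),
                       url_must_have=(".com",)):
    """Return True if every required marker appears in the page and the URL."""
    page_ok = all(marker in html for marker in page_must_have)
    url_ok = all(marker in url for marker in url_must_have)
    return page_ok and url_ok


def filter_wordpress(pages):
    """Keep the URLs of (url, html) pairs that satisfy both conditions."""
    return [url for url, html in pages if matches_conditions(url, html)]


# Hypothetical sample data: (url, page source) pairs already downloaded.
sample_pages = [
    ("http://example.com/blog/",
     '<link rel="stylesheet" href="/wp-content/themes/a/style.css">'),
    ("http://example.org/",
     '<link rel="stylesheet" href="/wp-content/themes/b/style.css">'),
    ("http://example.com/shop/",
     "<html><body>No WordPress markers here</body></html>"),
]

wordpress_urls = filter_wordpress(sample_pages)
print(wordpress_urls)
```

Only the first sample entry passes both checks; the second fails the URL condition and the third fails the page condition. Dropping the url_must_have tuple (passing an empty one) would keep every page that contains wp-content regardless of domain.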