Comments

  • You want footprints to harvest urls
  • HARVEST AWAY


    edu inurl:login 
    site:edu “powered by vbulletin”
    inurl:.edu/phpbb2
    inurl:.edu/ (Powered by Invision Power Board)
    site:edu “powered by SMF”
    edu forum sites, gov forum sites
    site:.mil
    site:edu inurl:login (Create an account)
    "keyword" forum site:.edu
    "keyword" forum site:.gov
    "keyword" blog site:.gov
    inurl:.gov +inurl:forum + inurl:register
    inurl:.gov +inurl:forum
    inurl:.edu/phpbb inurl:register
    inurl:edu forum
    inurl:gov forum
    inurl:.edu+inurl:forum

    edu wikis:
    site:.edu wiki
    site:.edu inurl:MediaWiki_talk
    WordPress blogs:
    site:.edu "Powered By Wordpress"
  • ron SERLists.com
    edited March 2014
    @qwerty - just start searching for lists like "the top 100,000 most searched terms", lists of the most common single words, and lists of the most common two-word phrases. Stuff like that. The more generic and the simpler the words, the better.
  • Thanks for sharing the footprints.
  • magically http://i.imgur.com/Ban0Uo4.png
    Well... I really don't know what I'm doing wrong.

    I'm using ScrapeBox to try to build a decent gov/edu list.

    I use 3 things for the scraping setup:

    1. A huge list of generic keywords (100K)
    2. Footprints (listed below)
    3. 40 fast, private, Google-approved proxies

    So I use the merge function in ScrapeBox to create multiple search targets, since it will only return a maximum of 1,000 URLs per keyword.
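For readers unfamiliar with the merge step, here is a minimal sketch of the idea: every footprint is paired with every keyword to produce one search query per pair. This is only an illustration of the concept, not ScrapeBox's own code; the function name `merge_queries` and the sample footprints and keywords are made up for the example.

```python
from itertools import product


def merge_queries(footprints, keywords):
    """Pair every footprint with every keyword (a cross product).

    Hypothetical helper for illustration; it only builds query strings.
    """
    return [f"{footprint} {keyword}" for footprint, keyword in product(footprints, keywords)]


# Sample inputs (assumed for the example, not taken from a real run).
footprints = ['site:.edu "powered by wordpress"', "site:.gov inurl:forum"]
keywords = ["gardening", "chess"]

queries = merge_queries(footprints, keywords)
for query in queries:
    print(query)
```

Note that the number of queries is the number of footprints multiplied by the number of keywords, so a 100K keyword list merged with a long footprint list grows very quickly.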

    Unfortunately I get really bad results every single time!

    Perhaps someone else is facing the same problem, or even better, someone has a suggestion on how to improve my success rate?

    An idiot-proof guide would be greatly appreciated, since my approach seems to fail all the time :(

    Footprints:
    “powered by wordpress” “leave a comment”
    Inurl:.edu “Powered by wordpress”
    Inurl:.gov “Powered by WordPress”
    Inurl:.ac.uk “Powered by wordpress”
    Inurl:.edu “Powered by BlogEngine.NET 1.4.5.0”
    Inurl:.gov “Powered by BlogEngine.NET 1.4.5.0”
    Inurl:.ac.uk “Powered by BlogEngine.NET 1.4.5.0”
    “Powered by BlogEngine”
    “Powered by Blogsmith”
    “powered by Typepad”
    “powered by scoop”
    “Powered by phpBB”
    “Powered by vBulletin”
    “Powered by SMF”
    “powered by Simple Machines”
    inurl:/index.php?action=register 
    “powered by punBB”
    “powered by expressionengine” “yourkeyword” site:.uk
    “powered by expressionengine” “yourkeyword” site:.us
    “powered by expressionengine” “yourkeyword” site:.eu
    “Powered by Tagbox”
    “Powered by DRBGuestbook”
    “Mugu Guyman” Guestbook
    * site:.edu inurl:wp-login.php +blog
    * site:.gov inurl:wp-login.php +blog
    * site:.edu “your keyword”
    * site:.gov “your keyword” -“you must be logged in” -“comments are closed”
    * site:.edu “no comments” +blogroll -“posting closed” -“you must be logged in” -“comments are closed”
    * site:.gov “no comments” +blogroll -“posting closed” -“you must be logged in” -“comments are closed”
    * site:.edu “powered by expressionengine” “ADD YOUR KEYWORD”
    * site:.gov “powered by expressionengine” “ADD YOUR KEYWORD”
    * site:.com “powered by expressionengine” “ADD YOUR KEYWORD”
    * site:.org “powered by expressionengine” “ADD YOUR KEYWORD”
    * site:.edu “Powered by BlogEngine.NET”
    * site:.gov “Powered by BlogEngine.NET”
    * site:.com “Powered by BlogEngine.NET”
    * site:.org “Powered by BlogEngine.NET”
    * site:.edu inurl:blog “post a comment” -“comments closed” -“you must be logged in” “ADD YOUR KEYWORD”
    * “powered by wordpress” your keyword
    * “powered by wordpress” “your keyword” (the quotation marks around the keyword look for an exact keyword match)
    * “powered by wordpress” intitle:your keyword (this targets pages that have your keyword in the title of the page)
    * “powered by wordpress” inurl:your keyword (this targets pages that have your keyword in the URL of the page)
    * “If you have a TypeKey or TypePad account” your keyword
    * site:blogspot.com your keyword
    * “powered by wordpress” your keyword -“comments are closed”
    * intitle:add+url “your keyword”
    * intitle:submit+site “your keyword”
    * intitle:submit+url “your keyword”
    * intitle:add+your+site “your keyword”
    * intitle:add+site “your keyword”
    * intitle:directory “your keyword”
    * intitle:sites “your keyword”
    * intitle:list “your keyword”
    * “Powered by SMF” your keyword
    * phpBB your keyword
    * “powered by IPB” your keyword
    * MyBB your keyword
    * “powered by PunBB” your keyword
    * “Powered by phpBB” your keyword
    * “Powered by vBulletin” your keyword
    * site:squidoo.com “new links plexo” “add to this list”
    * site:squidoo.com “your keyword” “add to this list”
    forums register 
    register I am over 13 years of age forum
    discussion board register 
    bulletin board register 
    message board register 
    phpbb register forum
    punbb register forum
    forum signup
    vbulletin forum signup
    SMF register forum
    register forum Please Enter Your Date of Birth
    forums - Registration Agreement
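Footprint lists passed around on forums tend to pick up curly quotes, leading bullets, and repeated entries along the way. A rough sketch of a cleanup pass is shown below. The helper name `clean_footprints` is hypothetical, and treating entries that differ only in letter case as duplicates is an assumption made for this example.

```python
# Map curly quotes to their plain ASCII equivalents.
QUOTE_MAP = str.maketrans({
    "\u201c": '"', "\u201d": '"',
    "\u2018": "'", "\u2019": "'",
})


def clean_footprints(lines):
    """Normalize quotes, strip bullets and blank lines, and drop duplicates.

    Hypothetical helper; the order of first appearance is preserved.
    """
    seen = set()
    cleaned = []
    for line in lines:
        text = line.translate(QUOTE_MAP).strip()
        text = text.lstrip("*").strip()   # drop a leading "*" bullet, if any
        text = " ".join(text.split())     # collapse runs of whitespace
        key = text.lower()                # assumption: case does not matter
        if text and key not in seen:
            seen.add(key)
            cleaned.append(text)
    return cleaned


raw = [
    "site:edu \u201cpowered by vbulletin\u201d",
    'site:edu "powered by vbulletin"',
    "* site:.edu inurl:wp-login.php +blog",
    "",
]
cleaned = clean_footprints(raw)
print(cleaned)
```

One caveat: stripping a leading `*` would also remove a deliberate wildcard at the start of a footprint, so adjust that line if your list uses one.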



  • ianclyde san jose
    edited June 2020

    This will definitely help you guys :)

    *Link Removed by Mod*

    Forum Footprints:

    vBulletin – “Powered by vBulletin”
    phpBB – “Powered by phpBB”
    MyBB – “Powered By MyBB” “Return to Content | Lite (Archive) Mode”
    FluxBB – “Powered by fluxbb”
    XennoBB – “Powered by XennoBB” + “Xenno Group”
    UseBB – “Powered by UseBB 1 Forum Software”
    XMB Forums – “Powered by XMB”
    PHPNuke – “PHP-Nuke Copyright” + “by Francisco Burzi”

    Blogs & Other Platform Footprints:

    WordPress – “mail address will not be published” “Powered by WordPress”
    Joomla – “Powered by Joomla!” “Write comment” “Website:”
    B2Evolution – “Your email address will not be revealed on this site.” “Leave a comment”
    Drupal – “Powered by Drupal” + “Web page addresses and e-mail addresses turn into links automatically.”
    4Images – “Powered by 4images” “Author:” “Comment”
    BlogEngine – “Powered By BlogEngine” “Add A Comment” “Name*”
    SquareSpace – site:*.squarespace.com “Post a New Comment”
    SharePoint – “Built using the SharePoint” “Comments”
    Geeklog – “Powered by Geeklog” “The following comments are owned”
    Plogger – “Powered by Plogger” “Post a comment:”
    Movable Type – “Powered by Movable Type” + “Post a comment” -movabletype.org


