Keyword Generator - Made in Java - For Scraping

Comments

  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    edited May 2015
    Scraping Tool-Box 1.4 Final is released.

    As a donor, you are encouraged to keep version 1.3 at all times, as it is considered the stable version.

    Donors Instruction & information:

    For those who want to test out some new additional features, please observe the following.

    1. Download the new update.
    2. Delete all existing directories, except the directory 'serial' in 'Scraping Tool-Box 1.3 Final'
    3. Extract the rar-file and move the old directory 'serial' into the new directory 'Scraping Tool-Box 1.4 Final'
    4. Delete old main directory 'Scraping Tool-Box 1.3 Final'

    Article Extractor:

    image


    Automatic GSA Project Backup to Dropbox:

    image

    Experimental features:

    image

    Change Log:

    image

    Keyword Generator:

    image

    Thank you all donors - It has been an interesting journey!

    With the release of Scraping Tool-Box 1.4 - This project is now officially discontinued and no further development will take place.
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    Guide - How to use Article Extractor:

    1. Identify potential targets using Google Search Feature:

    image

    Selecting the number of search results:

    image

    Notice the directories and select destination etc:

    image

    Identified urls found:

    image

    Getting ready to extract articles from identified targets:

    image

    image

    2. Observe we use our new identified list:

    image

    Article Extraction:

    image

    Article Extraction Complete:

    image

    Reviewing the new article text-files:

    image

    Preview of a new article:

    image
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    edited May 2015
    Donors Special Offer Only - 25 lucky slots - One Off Only List!


    This is a ONE-OFF-LIST ONLY - Meaning there will be no future additional lists. Consider it as a Project Development Booster...

    The list is targeting those who struggle to get some verified links - The so-called pro's can go fuck themselves, this is not for you guys (or is it??):D

    Offer is open until all slots are full or 4 weeks from today!

    image

    THIS IS IT!

    - More than 2000 .edu and .gov links & 700 wikis!
    - Tons of dofollow links
    - Tons of contextuals

    38,000 verified links in total, plus an additional 7,000 links that can also be posted to.

    image


    Wait...does that mean there are no nofollow links? Nope, of course not. There are nofollow links too in the list, mainly because of 3 reasons:

    1. Nofollow links are good to draw traffic
    2. To make a more natural mix for the search engines.
    3. Some contextuals, wikis and forums are simply nofollow links

    No modifications were used in making this list, and only GSA Captcha Breaker + reCaptcha Solver were used. No hocus-pocus here - this is the real deal!

    Terms:

    Once you have purchased and received the list, there are no refunds.
    That means you have the list and it's up to you to use it correctly as described below (*See Instruction Section)
    Upon purchase, expect the list to be sent within 24 hours (normally faster, depending on real life and time constraints.)
    This is not negotiable as this list really works without any bullshit - Exactly as stated!

    - Already verified links WTF?

    Surely some of you will already have some of these links in your setup. That is completely normal; just keep things running until the list is completely exhausted.
    You will get tons of verified - that is a promise!

    Purchase:
    If you are interested in obtaining this list, please send me a PM.
    Instructions will be given in PM.

    Become Donor:
    Not a donor yet? - No problem! - You may make a donation of $5, which entitles you to receive 'Scraping Tool-Box'
    - If there still are empty slots - you may purchase this list at the specified price.

    Other:
    Don't share the list with friends or in public.
    Each list is marked with a special unique token, which makes it easy to track down anyone considering sharing/reselling/public distribution.

    Instructions:

    The list will be delivered in .sl format.
    Make a new folder on your hard drive, e.g. 'Amazing List', and then either use GSA SER to import the list as 'IDENTIFIED' or use an unzip tool to unpack it.

    - Check your proxies and make sure they are working
    - Modify Project-->Delete target Url Cache
    - Modify Project-->Delete Unused Accounts
    - Add 10 new fresh emails
    - Add fresh articles into GSA SER
    - Project Options---> 'USE URLS FROM GLOBAL SITE IF ENABLED' [x] and 'Identified' [X]
    - Project-->Set Status->Active..Use Global Sitelist only

    - USE GSA CAPTCHA BREAKER + some additional reCaptcha Solver (No need for DBC or Imagetyperz)
    --> Set GSA CB to 2 retries and reCaptcha to 1 retry.

    (*) For absolute best performance: Create a brand new project and blast it with this list, using new fresh emails!

    [HIT START] and watch the magic happen!

    PM me now to get the payment instructions.

    (I normally answer within 24 hours - Patience guys, and remember real life;))

    Enjoy:)


  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    Amazing to see what you can do with the proper footprints and keywords.

    This was done using only 75 threads and of course keywords generated with the tool in this thread:

    image
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    - Optimized Speed in 'Clean Scrapings':

    - Moved all tasks into one task, i.e. 'load of file', in order to avoid redundant operations and copying from one structure to another
    - Enhanced garbage filter that removes unneeded file extensions
    - Reduced memory usage - using streams instead of local memory
    - Ensured bigger files can be loaded (> 1 GB text-files)
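
The streaming idea behind these changes can be sketched roughly as follows. This is not the tool's actual code: the class name `CleanScrapingsSketch`, the `UNWANTED` extension list and the method names are all assumptions made for illustration. The sketch reads the input line by line and writes the kept URLs straight to the output, so memory use stays flat even for text files larger than 1 GB.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Set;

final class CleanScrapingsSketch {
    // Hypothetical "garbage" extensions; the tool's real list is not given in this thread.
    static final Set<String> UNWANTED =
            Set.of(".jpg", ".jpeg", ".png", ".gif", ".pdf", ".zip", ".exe", ".css", ".js");

    // True if the URL path (ignoring query string and fragment) ends in an unwanted extension.
    static boolean hasUnwantedExtension(String url) {
        String s = url.trim().toLowerCase(Locale.ROOT);
        int end = s.length();
        int q = s.indexOf('?');
        if (q >= 0) end = Math.min(end, q);
        int h = s.indexOf('#');
        if (h >= 0) end = Math.min(end, h);
        s = s.substring(0, end);
        int scheme = s.indexOf("://");
        int pathStart = s.indexOf('/', scheme >= 0 ? scheme + 3 : 0);
        if (pathStart < 0) return false;              // bare host, no path to inspect
        int dot = s.lastIndexOf('.');
        if (dot < pathStart || dot < s.lastIndexOf('/')) return false; // no extension in last segment
        return UNWANTED.contains(s.substring(dot));
    }

    // Streams 'in' to 'out' one line at a time; returns the number of URLs kept.
    static long clean(Path in, Path out) throws IOException {
        long kept = 0;
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String url = line.trim();
                if (url.isEmpty() || hasUnwantedExtension(url)) continue;
                writer.write(url);
                writer.newLine();
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        long kept = clean(Path.of(args[0]), Path.of(args[1]));
        System.out.println("Kept " + kept + " URLs");
    }
}
```

Because nothing is collected in a list, the filter and the load happen in a single pass, which matches the "one task" idea described above.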
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    - KEYWORD GENERATOR:

    - Added support for Arabic
    - Added support for Hebrew

    image

    image
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    Scraping Tool-Box 1.4.2 Final is released.

    As a donor, you are encouraged to keep version 1.3 at all times, as it is considered the stable version.

    Donors Instruction & information:

    For those who want to test out some new additional features, please observe the following.

    1. Download the new update.
    2. Delete all existing directories, except the directory 'serial' in 'Scraping Tool-Box 1.3 Final'
    3. Extract the rar-file and move the old directory 'serial' into the new directory 'Scraping Tool-Box 1.4.2 Final'
    4. Delete old main directory 'Scraping Tool-Box 1.3 Final'
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    - Added support for Chinese Keywords

    image
  • does it scrape keywords from google suggest, google planner or what ?
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    @Milan
    Nope, that feature is not included in this program.

    In terms of 'keywords' it covers 2 other areas:

    1. Input any Ebook in the format book.txt (utf-8 format) and it will identify all unique words and output them.
    2. Scrape keywords from meta-tags from a list of URLs
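
For the first point, extracting unique words from a UTF-8 text could look roughly like the sketch below. It is only an illustration under assumptions: the class name `UniqueWordsSketch`, the word pattern and the lowercasing are my choices, not the program's documented behaviour.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UniqueWordsSketch {
    // \p{L} matches letters of any script (Latin, Arabic, Hebrew, ...), \p{M} matches combining marks.
    private static final Pattern WORD = Pattern.compile("[\\p{L}\\p{M}]+");

    // Returns each distinct word once, lowercased, in order of first appearance.
    static Set<String> uniqueWords(String text) {
        Set<String> words = new LinkedHashSet<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            words.add(matcher.group().toLowerCase(Locale.ROOT));
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a UTF-8 text file such as book.txt
        String text = Files.readString(Path.of(args[0]), StandardCharsets.UTF_8);
        uniqueWords(text).forEach(System.out::println);
    }
}
```

Note that this pattern splits on anything that is not a letter, so text written without spaces between words (Chinese, for example) would come out as long runs and would need a proper segmenter.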

  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    edited June 2015
    Well I could not resist it:P

    The offer for Donors only (Verified List above) is awesome in terms of quality - The one below is pretty decent too.

  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    edited June 2015
    - Prepared some code for an awesome feature...

    Well, I will add 'Advanced Feature' into 'Clean Scrapings' once I get some more time...

    What it does is actually pretty smart. It subtracts one list from another, and it does so really fast!

    Example:

    image

    image

    In the above example, I had a 'Cleaned List' after using the 'Clean Scrapings' feature in this toolbox.

    Now I made some additional code to sort that list into the Joomla K2 platform - flying fast without any proxies!

    That resulted in the 'sitelist_sitelist_Article-Joomla K2' you see above

    Now, to avoid running all those urls through GSA Ser one more time - I made a feature to 'SUBTRACT' those two lists from each other - and the result is 'sitelist_sitelist_Intersection'.

    You could take your verified list and subtract your target list too - that would save time;)

    I plan to enhance this feature and add some more platforms - it will not replace GSA PI or the sorting in GSA Ser - 

    However this is meant as a fast 'raw sorting' without actually looking up each url.
    Additionally it can subtract two lists - e.g. Verified List - Target List = Ready-To-Blast List (Not present in GSA Verified List)
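
A list subtraction of this kind can be sketched with a hash set, which is what makes it fast: one pass to load the list to remove, one pass over the other list. The class and method names below are hypothetical, and comparing URLs as exact trimmed strings is an assumption; the tool may normalize URLs differently.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SubtractListsSketch {
    // Returns the entries of 'source' that do not occur in 'toRemove'.
    // Input order is kept, and duplicates inside 'source' are dropped as well.
    static List<String> subtract(List<String> source, List<String> toRemove) {
        Set<String> remove = new HashSet<>();
        for (String url : toRemove) {
            remove.add(url.trim());
        }
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String url : source) {
            String key = url.trim();
            if (key.isEmpty() || remove.contains(key)) continue;
            if (seen.add(key)) result.add(key);   // add() is false for repeats
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> target = List.of("http://a.example/1", "http://b.example/2", "http://c.example/3");
        List<String> verified = List.of("http://b.example/2");
        // Only the target URLs that are not already in the verified list remain.
        subtract(target, verified).forEach(System.out::println);
    }
}
```

Since the comparison is on the full URL string rather than the domain, two different pages on the same site both survive the subtraction.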

    Personally, I like this feature as it saves me a lot of time:P
    Like in the example above - A huge list with a single platform - Really smooth;)

    Note:
    You will still see 'already successful'.
    Why?

    - Because we did not look/sort according to domain root
    - Some platforms like 'images' should not be sorted according to unique domains

    However some redundant urls are removed in order to save time.

    So, as promised when I have some more time - I will build it in to Scraping Tool-Box as an 'Advanced Feature'.
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    Another example:

    Here I scraped two lists with custom footprints and unique keywords.
    The footprints were closely related, with very little variation.

    So in order to prevent running the same urls through GSA Ser one more time - I simply subtracted the lists:

    image

    image
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    edited June 2015
    - Early Preview of upcoming feature - 'Advanced Sorting'

    This new feature can be used - after normal 'Clean Scrapings'

    image

    image

    image
    image
    - It will rapidly sort into specific platforms - without making any lookup, hence no need for proxies!
    - Currently I have just added a few platforms (More will be added over time)

    *You would then sort the remaining using either GSA Ser or GSA PI.
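
Sorting without any lookup means classifying a URL purely from the text of the URL itself, so no request is sent and no proxy is needed. A rough sketch of that idea follows. The platform names and URL markers in `MARKERS` are illustrative guesses only (the patterns the tool really uses are not listed in this thread), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class QuickPreSortSketch {
    // Assumed markers: substrings that often appear in URLs of a given platform.
    static final Map<String, List<String>> MARKERS = new LinkedHashMap<>();
    static {
        MARKERS.put("Article-Joomla K2", List.of("option=com_k2", "/component/k2/"));
        MARKERS.put("Wiki-MediaWiki", List.of("/index.php?title="));
    }

    // Returns the first platform whose marker occurs in the URL, or "Unsorted".
    static String classify(String url) {
        String lower = url.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, List<String>> entry : MARKERS.entrySet()) {
            for (String marker : entry.getValue()) {
                if (lower.contains(marker)) return entry.getKey();
            }
        }
        return "Unsorted";
    }

    // Groups URLs into one bucket per platform, keeping input order inside each bucket.
    static Map<String, List<String>> sort(List<String> urls) {
        Map<String, List<String>> buckets = new LinkedHashMap<>();
        for (String url : urls) {
            buckets.computeIfAbsent(classify(url), k -> new ArrayList<>()).add(url);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<String> urls = List.of(
                "http://example.com/index.php?option=com_k2&view=itemlist",
                "http://example.org/about");
        sort(urls).forEach((platform, list) -> System.out.println(platform + ": " + list));
    }
}
```

Everything that lands in the "Unsorted" bucket is what would still go through GSA SER or GSA PI for a real check.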

    I will also add the ability to 'Subtract two lists from each other - Fast' as mentioned previously above.

    Please bear in mind - it's not finished yet - this is just a preview;)

  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    Work in progress for version 1.5

    Bug reported and identified:

    - There are issues with some menus and buttons (Related to higher resolution/screen settings)

    Current Plan for upcoming version 1.5:
    - Fix resolution issues
    - Add 'Quick Pre-Sort'
    - Add 'Subtract List'

    [HELP/Support]
    - Add videos to support each feature of the program
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    edited June 2015
    Scraping Tool-Box 1.4.3 Final is released.

    As a donor, you are encouraged to keep version 1.3 at all times, as it is considered the stable version.

    If you didn't experience issues with version 1.4.2 and below - don't update before version 1.5 is released.

    Donors Instruction & information:

    For those who want to test out some new additional features, please observe the following.

    1. Download the new update.
    2. Delete all existing directories, except the directory 'serial' in 'Scraping Tool-Box 1.3 Final'
    3. Extract the rar-file and move the old directory 'serial' into the new directory 'Scraping Tool-Box 1.4.3 Final'
    4. Delete old main directory 'Scraping Tool-Box 1.3 Final'

    Release Information:
    - Fixed issues related to GUI (Resolution)
    - Added part of new Feature (Advanced Sorting) - Not Completed.

    This release fixes significant bugs related to the GUI.

    image

    image

    image