
importing lists from scrapebrokers?

Is it possible to import URL lists from scrapebrokers and similar services? And how do you know whether GSA SER can actually identify and post to those sites?

Comments

  • You can either import them into your global lists through Options > Advanced > Tools > Import URLs
    or import them directly into a project through right click > Import Target URLs

    How to know if SER can post to such sites? Ask the list seller!
  • thanks @cherub, appreciate that. The reason I am even considering that is (a) I have not been successful at harvesting my own lists with ScrapeBox or GSA SER, and (b) various people have mentioned that their posting success rates are quite low because the lists of sites built into GSA get spammed to death. (?)

  • 'successful' is relative. Depending on your harvesting skills and the platform, the % of good ones can vary DRAMATICALLY. Either way, one of the most important skills you can have in this industry is continually learning and honing better harvesting skills.

    harvest in ScrapeBox or Hrefer, not GSA; let GSA just post away
  • Where is the best place for a new user to start learning harvesting skills and best practices?
  • have a look at THIS thread:
    http://www.blackhatworld.com/blackhat-seo/black-hat-seo-tools/605958-tut-how-easily-build-huge-sites-lists-gsa-ser.html

    read and understand the principle = it is NOT about posting comments, but about finding verified links,
    using SB + the link extractor addon.
    It is based on the fact that a backlink created on another tier requires that tier's link to be verified, usually auto-approve;
    hence a conversion of maybe 80% of all submitted = verified is about normal

    all you need to begin is ANY list of comment links or other links,
    then extract external links from those pages (a rough sketch of that step follows below)
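
    (In the thread this extraction is done with SB + the link extractor addon. Purely to illustrate the principle, here is a minimal shell sketch of the same step; the file names urls.txt and extracted.txt are made up for the example.)

    ```
    # seed list urls.txt and output extracted.txt are example names
    while read -r url; do
      # keep the seed page's host so internal links can be dropped below
      host=$(printf '%s\n' "$url" | awk -F/ '{print $3}')
      curl -s --max-time 10 "$url" |
        grep -oE 'href="https?://[^"]+"' |   # pull absolute href targets from the HTML
        sed -e 's/^href="//' -e 's/"$//' |   # strip the href="..." wrapper
        grep -v "://$host"                   # keep external links only
    done < urls.txt | sort -u > extracted.txt  # dedup by full URL, NOT by domain
    ```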
    then you have 2 options:

    1.
    import the URLs into SER after a cleanup in SB and let SER do all the filtering;
    a 100k URL target list extracted by SB may take many hrs or a day to process, depending on your CPU/RAM power

    OR

    2.
    after cleanup, use Linux with a self-made shell script to clean up and filter FOR the footprints found IN SER (WITHOUT the search-engine code part), just the footprint part AFTER inurl: > NO quotes, plain original text, any language

    filtering like this offline BEFORE importing into SER may convert an extracted URL list of 100k into maybe 5k to 10k of highly matching URLs, which are then imported into SER (a minimal sketch of such a filter is below)
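
    (A minimal sketch of such a filter script, assuming the SER footprints were exported to footprints.txt as plain text, one per line, and the cleaned harvest sits in extracted.txt; both names are just examples.)

    ```
    # footprints.txt: one plain footprint per line (the text AFTER inurl:,
    # no quotes, no SE operators); extracted.txt: the cleaned harvest
    # grep -F = fixed strings, -i = any case, -f = patterns from file
    grep -F -i -f footprints.txt extracted.txt | sort -u > targets.txt

    wc -l extracted.txt targets.txt   # e.g. ~100k in, maybe 5k-10k out
    ```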

    100k offline filtering in Linux takes about 1-2 seconds,
    then import the list to SER.
    SER still does NOT recognize another 39 or so %,
    but that is much more resource-efficient than a direct import of ALL harvested URLs;
    you save with this method about 90% of SER's time/resources

    Importing URLs may consume approx. 20-25% CPU in SER,
    hence it is best done while SER is stopped.
    I have used this system for a few days now and my LpM went up about 10x

    and once you have your cleaned URL target list,
    import this list back into SB to extract new target URLs from your self-made list

    hence, properly done,
    you start with ONE original list = maybe from your own SER verified list,
    then dedup each extracted list by URL
    and use it as the NEXT extraction list for SB

    for extraction of NEW external links,
    NEVER dedup by domain !!
    you want as many pages, posts, and comments from any domain as possible;
    only for the final SER submission may you do whatever dedup you prefer (see the sketch below)
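
    (A small sketch of the two kinds of dedup, using the example file extracted.txt from above; the output names are illustrative.)

    ```
    # dedup by full URL: safe for the next SB extraction round
    sort -u extracted.txt > next_round.txt

    # dedup by domain: keeps only ONE page per domain, so use it only
    # for a final submission list, never for the extraction loop
    awk -F/ '!seen[$3]++' extracted.txt > one_per_domain.txt
    ```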

    then, when you have your cleaned target URLs,
    you have 2 options:

    1. import into a project or tier
    or
    2. import into the global list

    at the beginning I preferred option 1):
    I stopped ALL projects except ONE,
    imported the self-created target URL list,
    and watched the LpM and the verified backlinks

    imported target URLs have priority over any other target URLs, as far as I can see

    if importing to the global site list (which is what I do NOW after repeated initial testing) you have to:

    1. disable all SEs
    2. select global list + only identified (uncheck all others, i.e. uncheck submitted, verified, failed)

    one important side effect that requires a modification of the configuration in SER:

    with a highly matching target list, your CPU load will drastically INCREASE = you may accordingly need to substantially reduce your threads, else you may overload the system or skip submissions

    working from home on a small laptop, I have to reduce my LpM to 22-28,
    but I got in 2 hrs the number of verified links I previously had in a full 24-hr day when letting SER scrape the SEs

    the above system is the most efficient I have experienced so far:
    NO SEs,
    SER only for submission,
    high CPU load because SER has to check PR and submit to almost every URL, and check mail often for verification / activation

    hence, deactivate options that cause SER to do extra work = for example the excellent new feature
    "use URLs linking on same verified URL ..."; it does the same as your SB extraction but uses additional resources that you need for the submission of your own list.

    and if you have no Linux machine, maybe you can find a way to CREATE such filtering using any Microsoft OS based scripting language (a sketch of the Linux pipeline to port is shown after the list);
    you need to:

    filter wrong domains
    remove domains containing wrong words
    clean wrong URL code
    clean up wrong characters and symbols harvested
    dedup URLs
    extract footprints you want to submit to
    etc
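
    (For reference, a sketch of what such a Linux cleanup script could look like, as a starting point to port to a Windows scripting language. The word list bad_words.txt and the other file names are assumptions for the example.)

    ```
    grep -E '^https?://' harvested.txt |  # filter out lines that are not URLs
      tr -d '\r' |                        # clean up stray carriage returns
      grep -v -i -f bad_words.txt |       # remove domains containing unwanted words
      grep -v -E '[ <>"{}|\^`]' |         # drop URLs with broken characters/symbols
      sort -u |                           # dedup URLs
      grep -F -i -f footprints.txt > targets.txt  # keep footprints you want to submit to
    ```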

    enjoy
  • Hi...

    If you have a (purchased/found) mixed, unsorted list:

    1. Before importing it, do you check whether those domains are still alive? Can you do that in SER, and how?
    2. Do you check whether those URLs are still alive? If a URL is not alive / fails to stick, do you still try to post to it, or delete it while checking? Can you do that in SER, and how?
    3. After that, do you simply choose platforms to post to & press Start? Or do you "Import URLs (identify and sort in)" first?

    Thanks
  • Can anyone advise, please? Thanks