Use URLs from Global Site Lists If Enabled

edited October 2012 in Need Help
I have a question regarding this function.

1) If we check this, does that mean it only posts to our global site lists? Or does it do both the normal searching and posting to the global site lists? If both, which one gets priority?

2) If it searches the global site lists, does it mean we'll pretty much be getting links from the same place for all our websites that are using this? Is this a good idea?

3) If it searches the global site lists, that means it ignores the keywords. That's why GlobalGoogler said that it will give unrelated or irrelevant links and thus should be used for, say, links to the 1st or 2nd tier and not to the money site because of the irrelevance. Am I right?

4) Another more basic and foundational question that I've always had and I hope can get answered. How does GSA actually work? It has all the platforms. But does it use all the platform footprints and your keywords and then go and post? If so, what is the place for the "Search Online for URLs" or "Search Online for Site Lists" in the settings? Do we still need to do this? Or does GSA do this automatically when it does its main job?



  • Sven

    1) If you have search engines selected, it will use them first. It will only use the site list if not enough sites are found with search engines. Let's say the search engine parsing delivers 50 results to post to but you have set it to use 100 threads, then at least 50 sites are taken from your site lists to keep the program busy on all threads.

    2) It uses the things you added to the site lists. If you didn't add anything there manually, the site lists hold the sites that other projects found via search engines or imported, so yes...they will be used even though other projects have used them too.

    3) Yes, but usually this doesn't matter as many engines don't use the keywords anyway to locate new sites, like social bookmarks or directories. They have all kinds of different sites and are not related to a special topic. But if you only want to post to sites that are related to a special topic, you might not want to use the site lists.

    4) You don't have to do anything extra. The site lists are not required. The program builds the site lists on its own over time as it saves the found sites (if enabled). During a running project, it is not using the "Search Online for Site Lists" at all. It will only query search engines for "footprint of engine" (like "powered by xyz") and "your keyword from project" (if the engine is making use of it).
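
The priority described in answer 1 can be sketched roughly as follows. This is only an illustration of the idea, not how the program is actually implemented: `pick_targets` and its parameters are hypothetical names, and skipping site-list entries that the search engines already returned is an assumption.

```python
from typing import List


def pick_targets(search_results: List[str], site_list: List[str], threads: int) -> List[str]:
    """Prefer search-engine results; top up from the site list only if threads would idle."""
    targets = list(search_results[:threads])
    shortfall = threads - len(targets)
    if shortfall > 0:
        # Assumption: skip site-list entries the search engines already returned.
        seen = set(targets)
        extras = [url for url in site_list if url not in seen]
        targets.extend(extras[:shortfall])
    return targets


# Example from the post: 50 search results, 100 threads.
found = [f"http://found{i}.example" for i in range(50)]
stored = [f"http://stored{i}.example" for i in range(200)]
batch = pick_targets(found, stored, threads=100)
print(len(batch))  # 100: 50 from search engines plus 50 from the site list
```

With enough search results to fill every thread, the site list would not be touched at all in this sketch.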

  • Thanks Sven, so what's the use of the Search Online for URLs in the settings if GSA already does this automatically?
  • Sven
    It's for people using their own footprints, e.g. to locate targets. If you know a better footprint to find Pligg (social bookmark) sites, for example, you can use it there. Or if you don't want to actually build links now but only search for possible targets.
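
A search query of the kind discussed here (an engine footprint, optionally combined with a project keyword) might be put together like this. It is a minimal sketch under assumptions: `build_queries` is a hypothetical helper, "powered by xyz" is the placeholder footprint from the earlier post, and the exact query syntax the program sends to search engines is not stated in this thread.

```python
from typing import List, Optional


def build_queries(footprints: List[str], keywords: Optional[List[str]] = None) -> List[str]:
    """Pair each footprint with each keyword; use the footprint alone if no keywords apply."""
    queries = []
    for footprint in footprints:
        quoted = f'"{footprint}"'  # assumption: footprints are searched as exact phrases
        if keywords:
            queries.extend(f"{quoted} {keyword}" for keyword in keywords)
        else:
            queries.append(quoted)
    return queries


# Engine that makes use of keywords.
print(build_queries(["powered by xyz"], ["gardening", "tools"]))
# Engine that ignores keywords (e.g. directories, per the earlier answer).
print(build_queries(["powered by xyz"]))
```

Swapping in your own footprint only changes the first argument; the rest of the search stays the same.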
  • I see, but if we don't use this function, will GSA eventually find it all out on its own? I'm just wondering about the use of this function. Is it to speed things up and allow GSA not to do so much searching but go directly to these sites you've found for it?
  • Here's a question relating to this. Let's say I have 'use global site lists' checked and a PR of over 3 on a specific project. Am I right in thinking that when the global site lists are used, PR levels from the global list are ignored? If this is so, it would really be a helpful option for GSA to check against the PR of the URL from the global lists and honor your PR settings before posting from global lists.
  • Sven
    The filter checking is done after the URL is "imported" (from whatever source). So PR checking is done after the import and is not ignored.
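
In other words, the filter sits after the point where URLs from all sources come together. A minimal sketch of that ordering, assuming the PR of each URL is already known (`apply_filters` and the sample data are hypothetical, not the program's internals):

```python
from typing import Iterable, List, Tuple

# (url, page rank); assumption: the PR value has already been looked up.
Candidate = Tuple[str, int]


def apply_filters(candidates: Iterable[Candidate], min_pr: int) -> List[str]:
    """Keep only URLs whose PR meets the project's minimum, whatever their source."""
    return [url for url, pr in candidates if pr >= min_pr]


from_search = [("http://a.example", 4), ("http://b.example", 1)]
from_site_list = [("http://c.example", 5), ("http://d.example", 0)]

# Both sources are merged first, then the same filter step runs on everything.
accepted = apply_filters(from_search + from_site_list, min_pr=3)
print(accepted)  # ['http://a.example', 'http://c.example']
```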
  • Yes, GSA will store all URLs in global sitelists if you have selected to do so in Options -> Advanced

    I use the "search online for URLs" mostly when new engines are added to SER, to get a couple of URLs for those new engines and import them into the project afterwards. You can do this faster with Scrapebox, but SER has more SEs to choose from and you can do this "on the fly".
  • Thanks both. Yes, this is my point, I have collected a large amount of site lists (options - advanced). Whenever I use them I see that many of the verified URLs are not honoring my PR filter for that particular project, which tends to tell me that for whatever reason, the PR filter you set is not being applied to global site lists.....
  • Sorry, Sven and Ozz, I don't think my last question is being answered or maybe I'm not getting it. I'm just wondering what's the point of doing the "Search Online for URLs" if GSA would already do it naturally in the course of its search? Is it to speed things up by putting a list in for GSA to go to directly?

  • @jonathanjon - apologies, it looks like I am hijacking your thread, doesn't it? Not my intention. If it's any help, my understanding is that this is an entirely 'optional' step that you can take to find more URLs for a particular platform. You would also use it if, for example, you had a different footprint from the predefined footprints; using this option you could then scrape for more URLs than GSA would otherwise get with the default footprints. But, as Sven says, GSA will scrape by itself using the predefined footprints, so unless you want to speed up the scraping process or you have a different footprint to use, you don't have to do anything with this setting. Hope that helps.
  • Ozz
    edited October 2012
    @jonatnjon: My last post was a reply to your last question. While I was writing that post, Sven and takeachance jumped in between our posts.

    Think of "Search online for URLs" as just a little add-on that people might use instead of Scrapebox if they don't have a license for it.
  • takeachance, no worries :)

    Thanks Ozz!
  • Which global list does it use?  Does it use the identified, or submission... or verified lists?
    I'm hoping we can set this somewhere, so we can use "identified" until we've built a huge verified list, and then we can ignore the global identified and submission lists.
  • Sven
    It is using all checked lists.
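
As a rough illustration of "all checked lists": only the list types that are ticked contribute URLs. The list names below follow the ones mentioned in the question; `load_checked`, the dictionary layout and the sample URLs are assumptions made for this sketch, not the program's real data structures.

```python
from typing import Dict, List


def load_checked(site_lists: Dict[str, List[str]], checked: Dict[str, bool]) -> List[str]:
    """Collect URLs from every site list whose checkbox is ticked."""
    urls: List[str] = []
    for name, entries in site_lists.items():
        if checked.get(name, False):
            urls.extend(entries)
    return urls


site_lists = {
    "identified": ["http://i1.example", "http://i2.example"],
    "submitted": ["http://s1.example"],
    "verified": ["http://v1.example"],
}

# Only the verified list is ticked, so identified and submitted are ignored.
checked = {"identified": False, "submitted": False, "verified": True}
print(load_checked(site_lists, checked))  # ['http://v1.example']
```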
  • edited December 2012

    1. Where should it be unchecked to ignore them? Advanced options, right?

    2. I am getting a lot of log messages like this 

    "Loaded 0/0 urls from site list", "Loaded 1/2 urls from site list" ...............

    At certain times, almost like 50% of the log messages are like this. Is it referring to the global site list?

    3. Will it parse through all the platform files in the global list folder even if I choose just a few platforms?

    4. Which do you think will be faster if I have a 5k list across 20 platforms? 
    a. using verified list from global list
    b. importing a file (which is verified) into the project (disabling global list) 
  • i'm trying to answer that.

    1) yes
    2) yes
    3) yes, in random order
    4) i assume b) will be faster as it's importing the urls in order and not randomly
  • Thank you Ozz. 
  • thanks for getting back to me, I didn't see that option.  It looks like you've thought of everything.  Can't wait for the captcha solver!
  • Sven
    what @Ozz said...and yes 4b is faster
