Mythbusters: do links need to be indexed to be counted?

There is an ongoing debate about whether links have to be indexed to be counted (pass link juice) by Google. Unfortunately this debate is mainly anecdotal and I've not seen real proof from either side. Personally I do think they need to be indexed, for several reasons beyond the scope of this topic, but I just realized there is an easy test to bring some scientific evidence into this debate.

  • Why not copy the backlink profile from Webmaster Tools, run the links through the Scrapebox Alive Checker and check whether the alive ones are indexed? It's really that simple. Since the backlink data comes directly from Google itself, if 100% of the alive backlinks are indexed, that is definite proof these links have to be indexed to be counted!

If the number is below 100% but, let's say, between 70% and 99%, there seem to be some other variables at play, but indexing does seem to be important. However, if the number is below, say, 50%, we can argue that indexing might not be that important after all. Sure, theoretically only the indexed links might be the ones passing link juice, while the others are just sitting there idle, but at least we tried.
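A rough sketch of how this test could be scripted, for anyone who prefers code over clicking through tools. The `is_alive` and `is_indexed` checkers are hypothetical placeholders for whatever does the real work (for example the Scrapebox Alive Checker and its index check); nothing here talks to Google. The 50-70% band is not covered by the thresholds above, so the sketch simply reports it as undecided.

```python
from typing import Callable, Iterable


def indexed_share_of_alive(
    urls: Iterable[str],
    is_alive: Callable[[str], bool],    # hypothetical checker, supplied by the caller
    is_indexed: Callable[[str], bool],  # hypothetical checker, supplied by the caller
) -> float:
    """Return the percentage of alive URLs that are also indexed."""
    # Step 1: weed out dead links so they don't contaminate the result.
    alive = [u for u in urls if is_alive(u)]
    if not alive:
        raise ValueError("no alive URLs to check")
    # Step 2: index-check only the alive links.
    indexed = [u for u in alive if is_indexed(u)]
    return 100.0 * len(indexed) / len(alive)


def interpret(share: float) -> str:
    """Map the indexed share onto the rough thresholds from the post."""
    if share >= 100.0:
        return "links apparently must be indexed to be counted"
    if share >= 70.0:
        return "indexing seems important, but other variables are at play"
    if share < 50.0:
        return "indexing might not be that important"
    # Assumption: the post does not say what 50-70% means.
    return "undecided (band not covered by the thresholds)"


if __name__ == "__main__":
    # Toy data only: five links, four alive, three of those indexed.
    links = ["u1", "u2", "u3", "u4", "u5"]
    alive_set = {"u1", "u2", "u3", "u4"}
    indexed_set = {"u1", "u2", "u3"}
    share = indexed_share_of_alive(links, alive_set.__contains__, indexed_set.__contains__)
    print(f"{share:.1f}% -> {interpret(share)}")
```

Passing the checkers in as functions keeps the arithmetic separate from the tool that does the checking, so the same sketch works with exported Scrapebox results or any other source.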

Unfortunately I don't use Webmaster Tools on my SER sites, to prevent footprints leading to my other sites (if I get a manual penalty the domain is doomed anyway), but if someone does use WMT for their SER sites, please run this easy test and share your conclusions. If you don't have Scrapebox and feel comfortable sending me the data, I'm willing to do it for you.

Comments

  • edited March 2015
    Good shout! I'm all for the "a non-indexed backlink is a useless one" view, but I appreciate what you're saying.

    But....GWMT only shows a sample of links...top 1,000 but yeah I guess it still gives you an idea..

    My laptop is running at 71C at the moment so loading up Scrapebox would push it over the edge. I'll do this tomorrow morning and report back. Interested to hear others' figures.

    Edit, had a spare SB license on a server. OK first site, 69% alive....rechecked and same number. Second site, has an incredible link profile and was 56%....

    So....regarding the second site, do you think it would be an idea to Disavow those dead links? I manually checked some and they were indeed dead/404/moved/parked.
  • A primary authority site that had about 4000 links showing in GWMT was at 68% (3 Error Retries set), and the domains resolved at 99%. Interesting.
  • edited March 2015
    @JudderMan

    You should be able to extract more links by going to (translating this from Dutch WMT version) more links > and then pressing the "download more links" button to export everything to .CSV.

    Thanks for doing that test. But unfortunately it's not clear whether those 69% and 56% numbers refer to indexed pages or alive pages. Just because they're alive (online) doesn't mean Google will index them. But maybe you checked both and those numbers refer to that?


    Thank you for doing the test as well! One question though: did you do the same as Judderman, or did you first do the Alive Check and then the (Google) Index Check?

    I have some thoughts on how to go further from here, but first I need to know whether the numbers refer to actual Google indexed pages AND whether the dead pages were weeded out by the Alive Checker first.

    Edit: To complicate things even further, I just noticed this tidbit in Google's WMT FAQ:

    "Webmaster Tools does not always show 100% of the links that Google knows about, so just because a particular link doesn't appear in Webmaster Tools doesn't mean that Google doesn't know about that link, or that your site isn't "getting credit" for that link."

    I really wonder whether they say this because there is a substantial lag before (indexed) links get passed to WMT or whether some links will never show up in WMT. I know some links from automatically generated pages like domain2008.com or rssing.com never get to WMT, but for normal links this seems very strange because it also means you can't disavow them.
  • Did the SB Alive Check after downloading the links from GWMT. Did another round with the Alive Check at 5 Error retries and it went up to 72%.
  • edited March 2015
    Ok, so if I'm correct you only did the Alive Check? If so, can you extract the alive ones from it and Google index check them? To do this you press the Save/Transfer button in the Alive Checker >> Transfer Alive back to Scrapebox. Then you press Check Indexed >> Google Indexed.
  • catchallseo www.catchallseo.com <- Disposable Emails NO MORE!
    @rogerke

    I appreciate this method to find out the actual reality behind indexing links.

    In any case, links that get indexed in Google search should have greater value in terms of quality than those that don't! Yes, after Google crawls a page, it then decides whether to index it or not. At least let Google crawl it!

    @dwwwb

    Please check if the links are indexed using the method specified by @rogerke !

  • @rogerke  Oh sorry. The Google Index check via SB came back at about 80% of the live links, which seems to leave only 55% of the original set. There have to be other factors involved in these results. Proxy location, time of day, etc.?
    Perhaps live indexing isn't as important. Historical index: Majestic offers this metric for a reason, maybe?

    How does this compare with your results?
  • edited March 2015
    @dwwwb

    Sorry for the late response. Been a bit busy the last few days.

    80% is a pretty high percentage actually. Sure it's only 55% of the original set, but those included dead links, and it wouldn't be fair to index check those because Google probably noticed as well and already deleted them from their index (Google's index data and WMT's data are definitely not in sync).

    If these results can be replicated, it definitely seems important to index your links. Anyone willing to take up the gauntlet and try to replicate dwwwb's results? I have a very interesting hypothesis about what the remaining 20% might mean, but I'd like to see this replicated first.
  • 2Take2 UK
    edited March 2015
    Looks interesting.....

    I just ran a test on a 10+ year old local business (white hat) site that I've got.

    All links are natural, or as natural as they're ever likely to be;

    - WMT gave me 391 links to play with.

    - After running them through SB alive check 351 came back as good 'live' links..

    A total of 304 of those 351 urls were actually indexed in Google.
  • Tim89 www.expressindexer.solutions
    edited March 2015
    This doesn't really mean anything.

    GWT provides a "snapshot" of your backlinks that may have appeared within their index at one point, so testing this by index checking a snapshot isn't highly accurate.

    This is why I use tools like back link monitor so you actually know how many links you've built and you can also index check them within the tool itself instead of importing into scrapebox for checking.

    I'm sure you are familiar with how Google indexes and deindexes pages from checking your own websites on a daily basis. You may have 100 pages on your site in total and at times this may drop to 95 or 90, but eventually all 100 pages will return to the index, just like how Bing drops pages from its index one day and then reindexes them later.

    I have a 5 year old domain with over 2000 pages and that even fluctuates in indexed pages.

    Besides all of the above, for the non-indexed links you found, try to index check them again in a week's time. They may have dropped out of the index due to the sites' stability or anything else for that matter, and may reappear.

    I'm not saying anyone is wrong, nor am I saying anyone is 100% right, but an indexed link will almost definitely carry the weight you would need to boost SERPs.

    A good test would be to buy two new domains, set one up as a money site and set the other to noindex so it stays out of Google's index. Place a backlink on the noindexed domain that points to your money site and see if it ranks. The non-indexed site can still be crawled, just not indexed.
  • edited March 2015
    @2Take2

    Thanks for helping out mate! Interesting results (86% indexed) and it definitely replicates dwwwb's work! 


    I'm not sure I can totally follow your snapshot argument. The reason to do the Alive Check on those links is to get rid of the dead links so they won't contaminate the results. Obviously if a link doesn't exist anymore it gets removed from Google's index. Since WMT's data and Google's live index data are not in sync, this is an absolute necessity. This is also supported by dwwwb's data (55% of all WMT links indexed vs. 80% of all alive WMT links indexed).

    The reason why it's not a good idea to use your own data (i.e. all of the links you built) is that, as you said, it might take a while before they get indexed/WMT picks them up. By going directly to WMT's data you're at least sure Google has "noticed" those links, so it's pretty much the only test you can do to see whether links have to be indexed to be "noticed/counted".

    To be honest, 80-86% is pretty damn high, so it definitely seems important to get your links indexed. The only remaining question is what the other ~15-20% actually mean. Although I've never done the tests you mentioned, the daily variations in indexed links could possibly explain that percentage. I also have an alternative/complementary hypothesis, but I'm not sure I should share it here publicly. In theory it would allow for some neat link analysis to find out what Google considers harmful links, so it could also help to optimize your SER campaigns.
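For reference, the percentages quoted in this thread can be reproduced with a few lines of arithmetic. This is only a sketch: `funnel` is a made-up helper name, the counts are the ones 2Take2 reported (391 links from WMT, 351 alive, 304 indexed), and for dwwwb only rates were posted (68-72% alive, about 80% of those indexed), so those are simply multiplied.

```python
from typing import Dict


def funnel(total: int, alive: int, indexed: int) -> Dict[str, float]:
    """Express an alive-check / index-check funnel as percentages."""
    if total <= 0 or alive <= 0:
        raise ValueError("total and alive must be positive")
    return {
        "alive_of_total": 100.0 * alive / total,
        "indexed_of_alive": 100.0 * indexed / alive,
        "indexed_of_total": 100.0 * indexed / total,
    }


# 2Take2's reported counts: 391 from WMT, 351 alive, 304 indexed.
result = funnel(391, 351, 304)
for name, pct in result.items():
    print(f"{name}: {pct:.1f}%")

# dwwwb reported rates only: 68% to 72% alive, about 80% of those indexed.
for alive_rate in (0.68, 0.72):
    share = alive_rate * 0.80
    print(f"{alive_rate:.0%} alive x 80% indexed = {share:.1%} of the original set")
```

The gap between the share of alive links that are indexed and the share of the original set that is indexed is exactly why the dead links have to be weeded out before index checking.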