Additional filter for indexing services

I noticed yesterday that even though noindex tags are stored with the verified link information ('flag 0'), these URLs still get sent to indexing services. Sending a noindex page for indexing is a waste of resources, so could we get an additional checkbox in the indexing options to choose not to send them?

Comments

  • Sven www.GSA-Online.de
    Hmm, but "doFollow" has nothing to do with "noIndex". Those are two different things. Some customers might still want a "noIndex" link to be sent to indexing services.
    However, I get your point and will add an option to not send links that are clearly set to "noIndex".
  • Yeah I neglected to actually put another checkbox in that image for noindex like I thought I had  :D
    An option either way would be great
  • googlealchemist Anywhere I want
    edited May 2022
    Sven said:
    Hmm, but "doFollow" has nothing to do with "noIndex". Those are two different things. Some customers might still want a "noIndex" link to be sent to indexing services.
    However, I get your point and will add an option to not send links that are clearly set to "noIndex".
    I'd also love that option so it doesn't send noindex links to indexer services, similar to the existing function of only sending T2 links to an indexable link.
  • Sven www.GSA-Online.de
    In the next update you will have this option.
  • googlealchemist Anywhere I want
    Sven said:
    In the next update you will have this option.
    Awesome, thanks. I see it is enabled by default now, which makes sense.

    In the 'days to index' section of the main program's global options: is that the same function as the project-specific option 'send to index/ping delayed by'? And if so, will the project-specific option always override the global one?

  • Sven www.GSA-Online.de
    No, this is an option from the indexing service. Some allow you to define when they should start indexing.
  • googlealchemist Anywhere I want
    Sven said:
    No, this is an option from the indexing service. Some allow you to define when they should start indexing.
    Is the drip feed function what you mean?
  • Sven www.GSA-Online.de
    yes
  • googlealchemist Anywhere I want
    Sven said:
    Hmm, but "doFollow" has nothing to do with "noIndex". Those are two different things. Some customers might still want a "noIndex" link to be sent to indexing services.
    However, I get your point and will add an option to not send links that are clearly set to "noIndex".
    Hey, I don't remember if it was in another thread or in a DM, but I had asked whether the software could detect a noindex tag before posting to a site to avoid wasting resources. You said that wasn't practical, which I understand when I think about it.

    But could we reuse the noindex detection that already runs on existing URLs before they are sent to indexers or added to a tiered project? Could it send those particular websites to a blacklist so the software avoids posting to them again in the future?
  • Sven www.GSA-Online.de
    The noindex via meta tag is easily detectable before posting. However, most noindex happens via robots.txt, and you really don't want SER to download that file each time it tries to post. That wastes a lot of traffic.
  • edited May 2022
    Browser Automation Studio (BAS) could do the robots.txt checking and list filtering if you can't write your own scripts. BAS is free, and this would be trivial to accomplish.
  • Sven www.GSA-Online.de
    @the_other_dude Don't get me wrong, parsing the robots.txt file is easy. I can do that as well, but it would waste a huge amount of traffic and time, as you would need to do this for every URL.
  • edited May 2022
    Right, I agree. I was just posting a free solution for the few that will inevitably come across this and believe they need to be checking robots.txt :)
  • googlealchemist Anywhere I want
    Sven said:
    The noindex via meta tag is easily detectable before posting. However, most noindex happens via robots.txt, and you really don't want SER to download that file each time it tries to post. That wastes a lot of traffic.
    About the new 'skip noindex urls' option in the indexing settings: if it detects the noindex tag on a particular created profile page or article page and does not send that link to indexers, can't that site also be added at the same time to an internal blacklist or skip list, so the software never builds another link like that again and avoids wasting resources?
  • Sven www.GSA-Online.de
    This can only be detected after submission/verification unless you define it in the engine script.
  • googlealchemist Anywhere I want
    Sven said:
    This can only be detected after submission/verification unless you define it in the engine script.
    Yeah, that's what I'm saying too. When it detects it at that stage, add that domain to an internal blacklist so the software doesn't use it again to sign up or post.
  • googlealchemist Anywhere I want
    Sven said:
    This can only be detected after submission/verification unless you define it in the engine script.
    Yeah, that's what I'm saying too. When it detects it at that stage, add that domain to an internal blacklist so the software doesn't use it again to sign up or post.
    Did the way I put this make sense, and if so, is it possible to implement?
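
For anyone who wants to try the blacklist idea outside the software, here is a minimal Python sketch. It is not how SER works internally. It assumes you already have the HTML of your verified pages (fetching is left out), checks each page for a robots meta tag with noindex, and collects the affected domains so they can be filtered out of a target list. All function names and the (url, html) input format are made up for illustration, and noindex set through robots.txt or HTTP headers is not covered.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {key: (value or "") for key, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_meta_noindex(html):
    """True if the page carries a noindex directive in a robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in parser.directives or "none" in parser.directives


def domain_of(url):
    """Lower-cased host name of a URL, or "" if it has none."""
    return (urlsplit(url).hostname or "").lower()


def build_blacklist(verified_pages):
    """verified_pages: iterable of (url, html) pairs (hypothetical format)."""
    return {domain_of(url) for url, html in verified_pages if has_meta_noindex(html)}


def filter_targets(targets, blacklist):
    """Drop target URLs whose domain is on the blacklist."""
    return [url for url in targets if domain_of(url) not in blacklist]


if __name__ == "__main__":
    pages = [
        ("https://a.example/profile/1",
         '<head><meta name="robots" content="noindex, follow"></head>'),
        ("https://b.example/post/2",
         '<head><meta name="robots" content="index, follow"></head>'),
    ]
    blacklist = build_blacklist(pages)
    print(sorted(blacklist))
    print(filter_targets(["https://a.example/register",
                          "https://b.example/register"], blacklist))
```

Note that this blacklists a whole domain as soon as one verified page is noindex. Some sites only noindex certain page types (profiles, for example), so a per-engine or per-path list may be the safer choice.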