Bug With Crawl Anchors Option

sickseo London, UK
I've imported a few hundred URLs into a project and am using the crawl anchors option to generate anchors from the page titles. The only option selected is "add <title> as anchor".
Once it has finished crawling and I inspect the anchors, everything looks fine. There are 348 URLs.
I then click OK to close the project. When I re-open the project, there are now 700+ URLs, so the software has added extra URLs. For example:


It seems to be making extra URLs from the page title. The format of the page title is:

Dog Training | 24-7 Dog Training In Atlanta

Is there a workaround for this?

Thanks
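
Until a fix is available, one possible workaround is to clean the titles before they are used as anchors. The sketch below is a minimal example, assuming the anchors can be prepared outside the tool and that `|`, `#`, `{` and `}` are the characters that get misread (an inference from this thread, not from the tool's documentation). The function name `sanitize_anchor` is hypothetical.

```python
import re

# Characters assumed to be special in anchor text. This set is inferred
# from the behaviour described in this thread, not from official docs.
RESERVED = re.compile(r"[|#{}]")


def sanitize_anchor(title: str) -> str:
    """Replace reserved characters with a hyphen and collapse whitespace."""
    cleaned = RESERVED.sub("-", title)
    return re.sub(r"\s+", " ", cleaned).strip()


print(sanitize_anchor("Dog Training | 24-7 Dog Training In Atlanta"))
# prints: Dog Training - 24-7 Dog Training In Atlanta
```

A hyphen is used as the replacement only because it keeps the title readable; any character the tool does not treat specially would do.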

Comments

  • Sven www.GSA-Online.de
    Thanks for the report. I will fix that right away, as it is indeed the "|" character in the title.
  • sickseo London, UK
    edited September 2021
    Hi Sven,

    Thanks for fixing this so quickly. I've spotted another issue when crawling anchors from a different site. It seems to be creating broken spintax as anchor text. For example, after crawling this URL it creates this:

    https://headphonesshop.co.uk/wireless-earbudsxunpuls-bluetooth-50-in-ear-tws-earbuds-auto-pairing/#{IPX5 Waterproof Built-in Mic Headsets for Sports Running#Wireless Earbuds#Xunpuls Bluetooth 5.0 in-Ear TWS Earbuds Auto Pairing Earphones with 2000mAh Charging Case LED Battery Display 95H Playtime}

    The page title: Wireless Earbuds,Xunpuls Bluetooth 5.0 in-Ear TWS Earbuds Auto Pairing Earphones with 2000mAh Charging Case LED Battery Display 95H Playtime, IPX5 Waterproof Built-in Mic Headsets for Sports Running

    It seems to do this when it sees a comma: it replaces the comma with # instead of |.

    As a result, I've got a few anchors looking like this: 

    {IPX5 Waterproof Built-in Mic Headsets for Sports Running


    If you need to take a closer look, here is a link to the SER project with the created links:
    https://1drv.ms/u/s!AvaPxZBKuUdah_5a_0pp2Y3HPNTLzA?e=tTGVDE 

    Thanks



  • Sven www.GSA-Online.de
    Should be fixed by now, or at least with the next update.
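
For checking a project after the update, the expected result for a comma-separated title can be sketched as follows. This is only an illustration under assumptions: the `URL#{a|b|c}` layout is taken from the example in this thread, the `URL#anchor` form for a single anchor is a guess, and both function names are hypothetical rather than part of the software.

```python
def title_to_spintax(url: str, title: str) -> str:
    """Turn a comma-separated title into URL#{a|b|c} (layout assumed)."""
    parts = [p.strip() for p in title.split(",") if p.strip()]
    if not parts:
        return url
    if len(parts) == 1:
        # Assumed form for a single anchor: no braces needed.
        return url + "#" + parts[0]
    return url + "#{" + "|".join(parts) + "}"


def has_balanced_braces(text: str) -> bool:
    """Return True if every '{' has a matching '}' in the right order."""
    depth = 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


print(title_to_spintax("https://example.com/", "A,B, C"))
# prints: https://example.com/#{A|B|C}
print(has_balanced_braces("{IPX5 Waterproof Built-in Mic Headsets"))
# prints: False
```

Running `has_balanced_braces` over the anchors in a project is a quick way to spot leftovers such as the broken `{IPX5 Waterproof ...` anchor reported above.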