Bug With Crawl Anchors Option

September 2021

I've imported a few hundred urls into a project and am using the crawl anchors option to generate anchors from the title of the pages. The only option selected is "add <title> as anchor".

Once it's finished crawling and I inspect the anchors, everything looks fine. There are 348 urls.

I click OK to close the project. When I re-open it, there are now 700+ urls: the software has added extra ones. For example:

Actual url: https://www.mydomain.com/dog-training/#Dog Training
Extra url added: http://24-7 Dog Training in Atlanta

It seems to be making extra urls from the page title. The page title has this format:

Dog Training | 24-7 Dog Training In Atlanta

Is there a workaround for this?

Thanks

September 2021

Thanks for the report. I will fix that right away, as it is indeed the "|" character in the title that causes it.
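
To illustrate the kind of cleanup involved, here is a minimal Python sketch that strips characters from a page title that could be read as separators before the title is used as an anchor. This is not the actual fix in the software. The function name `sanitize_title` and the set of reserved characters are assumptions made for the example.

```python
import re

# Characters ASSUMED (not confirmed by the software's documentation) to be
# treated as delimiters in anchor or spintax fields.
RESERVED = "|#{}"


def sanitize_title(title: str, replacement: str = "-") -> str:
    """Replace assumed delimiter characters, then collapse whitespace."""
    cleaned = "".join(replacement if ch in RESERVED else ch for ch in title)
    return re.sub(r"\s+", " ", cleaned).strip()


print(sanitize_title("Dog Training | 24-7 Dog Training In Atlanta"))
# prints: Dog Training - 24-7 Dog Training In Atlanta
```

Replacing the character rather than dropping it keeps the two halves of the title readable as one anchor.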

September 2021

Hi Sven,

Thanks for fixing this so quickly. I've spotted another issue when crawling anchors from a different site. It seems to be creating broken spintax as anchor text. For example, after crawling this url it creates this:

https://headphonesshop.co.uk/wireless-earbudsxunpuls-bluetooth-50-in-ear-tws-earbuds-auto-pairing/#{IPX5 Waterproof Built-in Mic Headsets for Sports Running#Wireless Earbuds#Xunpuls Bluetooth 5.0 in-Ear TWS Earbuds Auto Pairing Earphones with 2000mAh Charging Case LED Battery Display 95H Playtime}

The page title: Wireless Earbuds,Xunpuls Bluetooth 5.0 in-Ear TWS Earbuds Auto Pairing Earphones with 2000mAh Charging Case LED Battery Display 95H Playtime, IPX5 Waterproof Built-in Mic Headsets for Sports Running

It seems to do this when it sees a comma: it replaces the comma with # instead of |.

As a result, I've got a few anchors looking like this:

{IPX5 Waterproof Built-in Mic Headsets for Sports Running
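
One way to find the affected entries is to check each anchor for unbalanced braces. A minimal Python sketch, assuming the anchors are available as plain strings (the `anchors` list and the helper name `has_balanced_braces` are made up for the example, not part of the software):

```python
def has_balanced_braces(anchor: str) -> bool:
    """Return True if every '{' has a matching '}' in the right order."""
    depth = 0
    for ch in anchor:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:  # a closing brace appeared before any opening one
                return False
    return depth == 0


# Hypothetical sample data: one clean anchor and one broken one from above.
anchors = [
    "Wireless Earbuds",
    "{IPX5 Waterproof Built-in Mic Headsets for Sports Running",
]

broken = [a for a in anchors if not has_balanced_braces(a)]
print(broken)
```

This only flags anchors whose braces do not pair up. It does not check whether the separators inside the braces are valid.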

If you need to take a closer look, here is a link to the SER project with the created links:
https://1drv.ms/u/s!AvaPxZBKuUdah_5a_0pp2Y3HPNTLzA?e=tTGVDE

Thanks

September 2021

This should be fixed by now, or at least with the next update.
