Bug With Crawl Anchors Option
I've imported a few hundred urls into a project and am using the crawl anchors option to generate anchors from the title of the pages. The only option selected is "add <title> as anchor".
Once it's finished crawling and I inspect the anchors, everything looks fine. There are 348 urls.
I click OK to close the project. When I re-open the project there are now 700+ urls, because the software has added extra ones. For example:
Actual url: https://www.mydomain.com/dog-training/#Dog Training
Extra url added: http://24-7 Dog Training in Atlanta
Extra url added: http://24-7 Dog Training in Atlanta
It seems to be making extra urls from the page title. The format of the page title is:
Dog Training | 24-7 Dog Training In Atlanta
Is there a workaround for this?
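In the meantime, one possible stopgap is to filter the exported url list outside the software. This is only a minimal sketch, assuming the urls are exported one per line in the url#anchor form shown above; the function name is hypothetical and nothing here reflects how the software itself parses titles.

```python
from urllib.parse import urlsplit


def is_plausible_url(entry: str) -> bool:
    """Return True if the part before the first '#' looks like a real url."""
    # Assumption: everything after the first '#' is the anchor text.
    url_part = entry.split("#", 1)[0].strip()
    try:
        parts = urlsplit(url_part)
    except ValueError:
        return False
    host = parts.netloc
    # A real host has no spaces and contains at least one dot.
    return parts.scheme in ("http", "https") and " " not in host and "." in host


entries = [
    "https://www.mydomain.com/dog-training/#Dog Training",
    "http://24-7 Dog Training in Atlanta",
]

# Keep only the entries whose url part passes the check.
kept = [e for e in entries if is_plausible_url(e)]
print(kept)
```

The check is deliberately crude: it only drops entries whose host contains a space or no dot, which is enough to catch the title fragments above.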
Thanks
Comments
Thanks for fixing this so quickly. I've spotted another issue when crawling anchors from a different site. It seems to be creating broken spintax as anchor text. For example, after crawling this url it creates this:
This seems to happen when the page title contains a comma: the comma gets replaced with # instead of |
As a result, I've got a few anchors looking like this:
{IPX5 Waterproof Built-in Mic Headsets for Sports Running
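To find the affected anchors, a brace-balance check over an exported anchor list should be enough. This is a rough sketch, assuming one anchor per line; the function name and the second sample anchor are made up for illustration.

```python
def has_balanced_spintax(anchor: str) -> bool:
    """Return True if every '{' in the anchor has a matching '}'."""
    depth = 0
    for ch in anchor:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                # A closing brace appeared before any opening brace.
                return False
    return depth == 0


anchors = [
    "{IPX5 Waterproof Built-in Mic Headsets for Sports Running",
    "{Dog Training|24-7 Dog Training In Atlanta}",  # hypothetical well-formed anchor
]

# Collect the anchors whose braces do not balance.
broken = [a for a in anchors if not has_balanced_spintax(a)]
print(broken)
```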
If you need to take a closer look, here is a link to the SER project with the created links:
https://1drv.ms/u/s!AvaPxZBKuUdah_5a_0pp2Y3HPNTLzA?e=tTGVDE
Thanks