Large discrepancy between total verified and verified lists
I'm a little confused as to why there's such a huge difference between the total number of URLs verified by my project and the total number of URLs in my verified sitelist directory files.
I scraped and filtered lists of target URLs by engine to push through GSA to build verified lists for another project. My identified lists contained around 500k target URLs, and after running the project for around 24 hours GSA is showing the total number of verified URLs at 63k. This is the only GSA project I currently have, and it's set up to create 1 link per URL.
I then imported my verified directory into Scrapebox just to double-check that everything was good with the verified lists, but there are fewer than 8,000 URLs across all the sitelist files.
Anyone have any idea why the actual verified lists are missing so many URLs?
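For anyone who wants to reproduce that count without Scrapebox, here's a minimal sketch of how the unique URLs across a verified folder could be tallied. The folder path and the sitelist_*.txt naming are assumptions based on GSA's default plain-text sitelist layout, so adjust them to your own setup.

```python
# Count unique URLs across all verified sitelist files.
# Assumption: the verified folder holds plain-text files named like
# "sitelist_<Engine>-<Platform>.txt" with one URL per line.
from pathlib import Path

verified_dir = Path(r"C:\GSA\site_lists-verified")  # hypothetical path - adjust

unique_urls = set()
for sitelist in verified_dir.glob("sitelist_*.txt"):
    for line in sitelist.read_text(errors="ignore").splitlines():
        url = line.strip()
        if url:
            unique_urls.add(url)

print(f"Unique verified URLs across all sitelist files: {len(unique_urls)}")
```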
Comments
What did you use Scrapebox to check: whether the links were live, or whether there were duplicates?
I happened to spot the "per URL" issue about 12 hours ago and disabled it, but the verified lists still only total around 8,300 URLs, even though an additional 20k URLs have been verified since then.
The identified list with 500k targets is seriously big. If you've scraped these, I would not expect anywhere near a 100% success rate. Scraping normally yields around a 1% verified rate, depending on footprints, so anything above that is considered good.
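To put rough numbers on that: 500k scraped targets at roughly a 1% verified rate works out to about 5,000 verified URLs, so a sitelist of around 8,000 unique URLs is actually above the typical yield rather than suspiciously low.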
Anyone know what determines if and when a URL is added to a verified list?
Other engines, such as articles, shouldn't work that way, as you only need the domain added to the verified list once for the software to post on that domain. There is no benefit in adding multiple article URLs from the same site to the verified list.
This is why you should remove duplicate domains, excluding blog comments and some other engines, which are already deselected when you use this option (see the sketch below).
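As an illustration of what domain-level deduping means here, below is a rough sketch in the same spirit as SER's "remove duplicate domains" option. The path, the file naming, and the check for "Blog Comment" in the file name are assumptions; SER's own option knows which engines to skip and uses its own sitelist format, so this only approximates the idea.

```python
# Rough sketch of domain-level deduping of verified sitelist files.
# Assumptions: plain-text files with one URL per line, and blog-comment
# lists identifiable by "Blog Comment" in the file name.
from pathlib import Path
from urllib.parse import urlparse

verified_dir = Path(r"C:\GSA\site_lists-verified")  # hypothetical path - adjust

for sitelist in verified_dir.glob("sitelist_*.txt"):
    if "Blog Comment" in sitelist.name:
        continue  # keep every URL for engines where the exact page matters

    seen_domains = set()
    kept = []
    for line in sitelist.read_text(errors="ignore").splitlines():
        url = line.strip()
        if not url:
            continue
        domain = urlparse(url).netloc.lower()
        if domain not in seen_domains:
            seen_domains.add(domain)
            kept.append(url)

    # Overwrites the file in place with one URL per unique domain.
    sitelist.write_text("\n".join(kept) + "\n")
    print(f"{sitelist.name}: kept {len(kept)} unique domains")
```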