Cleaning list: 2M links in 2 days and still going

I am cleaning my list for the third time, I think. The thing is, it has been running 24/7 and still hasn't finished:

Last Cleaning: 1st Week March 2014


Is it normal?


  • Sven
    What version is that? In a more recent one I fixed an issue with the progress bar not being accurate.
  • @sven It's 8.15.

    I don't get it, because I only have about 40K URLs in my verified list in GSA.

    I only have Verified and Identified checked in my global settings. Where is GSA getting the 2M links from?

  • To explain clearly what happened, please see below:

    • I decided to clean my verified list using GSA SER.
    • I went to Tools, hit "remove duplicate urls" and chose my Verified list.
    • The process finished and left me with 40K+ URLs in my verified list.
    • Next I went to Tools, then Clean List.
    • I chose the Verified list.
    • It ran for 2 days non-stop and showed it was checking more than 2M links by then.
    • After 2 days I lost my patience and aborted the task.
    • I checked my verified list and it had blown up to 2M URLs. I thought I had hit the jackpot.
    • I clicked Tools, chose "remove duplicate urls" and chose Verified.
    • It left me with 40K+ URLs in my verified list again.
    @sven Is this a bug or what?
  • @addywordy: Links go bad very quickly, and SER builds a lot of dupes, so this result is not that surprising. We don't recommend using SER's cleanup function; we believe it deletes a high percentage of valid URLs. A more reliable way to clean a verified list is as follows:

    1. Export your verified list as a .sl file for backup.
    2. Empty your identified folder.
    3. Import the .sl file into identified.
    4. Empty your verified folder.
    5. Dedupe the identified folder for both URLs and domains.
    6. Set your projects to run from identified for a day or two.
    7. Whatever ends up in your verified folder is valid.

    We recommend doing this every 2 weeks.
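    For anyone wondering how a 2M-line list can shrink back to 40K, here is a rough Python sketch of what deduping by URL and by domain means. It is only an illustration, not how SER does it internally: the function names are made up, it assumes a plain list with one URL per line, and it assumes "dedupe domains" means keeping one URL per host.

```python
from urllib.parse import urlsplit


def dedupe_urls(urls):
    """Drop exact duplicate URLs, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for raw in urls:
        url = raw.strip()  # ignore surrounding whitespace and newlines
        if url and url not in seen:
            seen.add(url)
            unique.append(url)
    return unique


def dedupe_domains(urls):
    """Keep only the first URL seen for each host (assumed meaning of 'dedupe domains')."""
    seen = set()
    unique = []
    for url in urls:
        # Hosts are case-insensitive; fall back to the whole line if no host parses.
        host = urlsplit(url).netloc.lower() or url
        if host not in seen:
            seen.add(host)
            unique.append(url)
    return unique


if __name__ == "__main__":
    sample = [
        "http://a.com/1",
        "http://a.com/1\n",   # exact duplicate once stripped
        "http://A.com/2",     # same host, different page
        "http://b.org/x",
        "",
    ]
    by_url = dedupe_urls(sample)
    by_domain = dedupe_domains(by_url)
    print(len(sample), len(by_url), len(by_domain))
```

    A list with many repeated lines collapses sharply after the first pass, and further again after the per-host pass.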
  • @Satans_Apprentice Thanks for the advice.

    Will definitely do this!
  • @sven What happens if you're cleaning a list and you abort it because it seems to never end? Does it just throw the rest of the URLs that it did not finish checking into unknown, or does it go back as if you had never started the cleanup?
  • Sven
    It aborts the engine it is currently working on and its links. The old links are still there.
