Already Parsed Question

Hey All,

So I load up 3 brand new campaigns, and import a freshly scraped list that has been sorted in PI.  Then I remove duplicate URLs.

I start running it only to see a ton of "Already Parsed" errors for the URLs. 

Does this make sense?  How can the project have already parsed a URL if it's a fresh campaign using a deduped, freshly imported list?  The campaign should not have seen ANY URLs before, and since the list was deduped, each URL should be unique, so how could they already be parsed?

Would love your input, since I'm probably missing something here...

Comments

  • Sven www.GSA-Online.de
    Yes, it makes sense, because you de-duped on URLs, not on domains. Certain URLs are removed and handled as "already parsed" if the domain was previously detected by an engine where it doesn't matter which sublink you come from.
  • I see, so de-duping on domains should stop this.

    The reason I didn't dedupe on domains was because I figured some URLs would be better than others at posting, and didn't want to hurt my chances of getting a link submitted.  Maybe that was a faulty assumption, however.
  • Sven www.GSA-Online.de
    Yes, that's a good point. But then you have to accept the "already parsed" message in the log for certain URLs.
  • As an update to this, I removed duplicate domains and saw my verified links plummet to essentially zero.  Sticking with removing duplicate URLs and living with "already parsed"...
  • Theoretically you can post on the same domain more than once if you enable scheduled posting (don't check Per URL)

    Let's say you have 3 URLs in the project but only 1 email account: it will try to post a comment on a different post from the same domain, but linking to 1 of the 3 URLs.

    And according to Sven's answer here https://forum.gsa-online.de/discussion/comment/44430/#Comment_44430 it cannot register another account on forums (or other sites that require registration), since the email for that domain will already be in use.
  • charlesalli Huntsville, AL
    Solution:

    Step 1: Right-click on the log and uncheck "Enabled log".
    Step 2: Relax.

    :)
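
For anyone who wants to see the URL vs. domain difference on a small list, here is a rough Python sketch. It is only an illustration of the two dedupe modes discussed above, not how the software does it internally; the function names and the example.com / example.org URLs are made up for the example.

```python
from urllib.parse import urlsplit

def dedupe_urls(urls):
    """Keep the first occurrence of each exact URL string."""
    seen = set()
    kept = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            kept.append(url)
    return kept

def dedupe_domains(urls):
    """Keep only the first URL seen for each host (hypothetical helper)."""
    seen = set()
    kept = []
    for url in urls:
        host = urlsplit(url).netloc.lower()
        if host not in seen:
            seen.add(host)
            kept.append(url)
    return kept

# Made-up sample list: two different pages on one domain, plus one exact repeat.
urls = [
    "http://example.com/post-1",
    "http://example.com/post-2",
    "http://example.com/post-1",
    "http://example.org/thread",
]

by_url = dedupe_urls(urls)        # both example.com pages survive
by_domain = dedupe_domains(urls)  # only one URL per host survives

print(len(by_url), len(by_domain))
```

After a URL-level dedupe, the list still holds two pages from the same domain, which is the situation where the second one can show up as "already parsed". A domain-level dedupe leaves one URL per host, which avoids the message but also throws away the alternative pages.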