
Best Practices for Using Identified URLs from GSA PI in SER

Hi 

I scrape my own targets with ScrapeBox 24/7.
After I scrape 5-10 GB of data, I trim to root, de-duplicate, and send the resulting list to GSA PI.
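
For readers unfamiliar with the "trim to root, de-duplicate" step, here is a minimal Python sketch of the idea. It is not ScrapeBox's own implementation: the function names are hypothetical, bare hosts are assumed to default to `http://`, and the exact output format of ScrapeBox's trim may differ.

```python
from urllib.parse import urlsplit


def trim_to_root(url: str) -> str:
    """Reduce a URL to scheme://host/ (the 'trim to root' idea)."""
    url = url.strip()
    if "://" not in url:
        # Assumption: entries without a scheme are treated as http.
        url = "http://" + url
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc.lower()}/"


def trim_and_dedupe(urls):
    """Trim every URL to its root and keep the first occurrence of each root."""
    seen = set()
    roots = []
    for url in urls:
        if not url.strip():
            continue  # skip blank lines
        root = trim_to_root(url)
        if root not in seen:
            seen.add(root)
            roots.append(root)
    return roots


if __name__ == "__main__":
    sample = [
        "http://Example.com/a/b?x=1",
        "http://example.com/c",
        "https://site.org/page",
    ]
    for root in trim_and_dedupe(sample):
        print(root)
```

For multi-gigabyte scrapes, the same logic would need to stream the file line by line rather than hold a list in memory; the set of roots is usually much smaller than the raw scrape.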

GSA PI feeds the Identified list to my GSA SER instance, which verifies the targets and uses them for link building.

My questions:

*Question 1*
Should I keep adding new targets to the existing Identified list, or should I occasionally wipe the Identified list and start over with a fresh one from GSA PI?

I ask because, over time, GSA SER has already tried every target in the list, so there is nothing left in it to verify.

If wiping is the way to go, how often? Or, how big do you let your Identified list get before you wipe it?

I am running GSA SER on a dedicated server with 2000 threads and getting 100+ LPM, which should give you an idea of how quickly I go through a list.
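
To illustrate the "keep adding" option from Question 1, one possible approach is to filter each new batch against the hosts that are already in the list, so only untried targets are appended. This is a rough sketch, not a GSA SER or GSA PI feature: the function names are hypothetical, and treating `www.example.com` and `example.com` as the same host is an assumption.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lowercase host of a URL, ignoring a leading 'www.'."""
    url = url.strip()
    if "://" not in url:
        # Assumption: entries without a scheme are treated as http.
        url = "http://" + url
    host = urlsplit(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def only_new_targets(new_batch, already_tried):
    """Keep URLs from new_batch whose host does not appear in already_tried."""
    tried_hosts = {host_of(u) for u in already_tried if u.strip()}
    fresh = []
    for url in new_batch:
        if not url.strip():
            continue  # skip blank lines
        host = host_of(url)
        if host not in tried_hosts:
            tried_hosts.add(host)  # also de-duplicates within the batch
            fresh.append(url.strip())
    return fresh


if __name__ == "__main__":
    tried = ["http://b.com/", "https://d.com/"]
    batch = ["http://a.com/", "http://www.b.com/x", "http://c.com/"]
    for url in only_new_targets(batch, tried):
        print(url)
```

Whether this is better than wiping the list periodically is exactly the open question; the sketch only shows that appending does not have to mean re-feeding old targets.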

*Question 2*
I am using the SB Link Extractor to scrape the initial targets, and I believe I am getting too few unique targets.

How can I increase that number?

Thanks in advance

Comments

  • Update 

    I had this option enabled for my GSA verified-list creator campaigns:

    *Allow Posting on the Same Site Again*

    I don't think it makes sense if I am only building a verified list.

    So disabling this option should let GSA mark a target as "Done" sooner, so I can wipe and replace my Identified list,

    correct?

    I am going to disable this option on the GSA verified-list creator campaigns.

    Thanks