Scraping for targets - Massive Duplicates

So I have been experimenting with many different ways of scraping for targets using all kinds of footprints. I've mostly focused on the Google API and Bing, more so Bing lately.

I've tried limiting results per query from 1,000 down to 100, but it doesn't seem to make much difference in the number of duplicates you end up scraping.

One tool I've found that is capable of finding a lot of UNIQUE targets is actually SEO List Builder; the only problem is it's really unstable. When it works, though, it is pretty awesome.

What techniques have you come up with to reduce the number of duplicate targets you get on your scrapes? Do you just accept it as an inevitability, or have you found a method of eliminating most of your dupes?
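One common answer to the dedup question is to collapse scraped URLs down to their root domain before comparing, since most link targets only need one entry per site anyway. Here's a minimal sketch using only the Python standard library; the `www.`-stripping is a naive heuristic (a real implementation would use the Public Suffix List to get true registrable domains), and the sample URLs are hypothetical:

```python
# Sketch: deduplicate a scraped URL list by (naive) root domain.
from urllib.parse import urlparse

def root_domain(url: str) -> str:
    """Naive root-domain key: hostname with a leading 'www.' stripped."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def dedupe_by_domain(urls):
    """Keep the first URL seen for each root domain, preserving order."""
    seen, unique = set(), []
    for url in urls:
        key = root_domain(url)
        if key and key not in seen:
            seen.add(key)
            unique.append(url)
    return unique

urls = [
    "http://www.example.com/page1",
    "https://example.com/page2",      # same domain as above -> dropped
    "http://other.net/post?id=3",
]
print(dedupe_by_domain(urls))
```

Running this against a merged scrape from several engines before importing the list is where it pays off most, since cross-engine overlap is usually at the domain level rather than the exact-URL level.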

Comments

  • Good question... I've been googling around a bit - seems to me that most best practice tutorials have become outdated *very recently*.

    Google is getting better and better at spotting those evil Scrapeboxers :)

    And switching to other engines inevitably produces a *lot* of dupes - haven't found a solution for that myself yet either. That's basically why I started this thread.
  • Hinkys (SEOSpartans.com - Catchalls for SER - 30 Day Free Trial)
    Basically, use seed keywords that are not long tails. Meaning: don't take 100 keywords and run them through a keyword scraper; instead, try to find 10,000 keywords that are all as different from each other as possible and then do a run with that.

    Also, some footprints tend to be very similar, so I try to use no more than 1 footprint per engine.

    And naturally, don't scrape from more than 1 engine; that's going to get you nothing but dupes most of the time.
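    The advice above (diverse seed keywords, one footprint per engine, one engine per run) can be sketched as a small query builder. The footprints and keywords here are hypothetical placeholders, and the one-footprint-per-engine rule is enforced simply by the dict structure:

    ```python
    # Sketch: build scrape queries per the commenter's advice -
    # exactly one footprint per engine, many diverse seed keywords,
    # with duplicate queries dropped before they ever hit a scraper.
    footprints = {                      # one footprint per engine (by construction)
        "google": '"powered by wordpress"',
        "bing": "inurl:guestbook",
    }
    keywords = ["aquarium plants", "vintage synthesizers", "trail running"]

    def build_queries(footprints, keywords):
        """Return (engine, query) pairs, skipping duplicate query strings."""
        queries, seen = [], set()
        for engine, footprint in footprints.items():
            for kw in keywords:
                q = f"{footprint} {kw}"
                if q not in seen:       # dedupe queries up front
                    seen.add(q)
                    queries.append((engine, q))
        return queries

    for engine, q in build_queries(footprints, keywords):
        print(engine, q)
    ```

    You'd then run each engine's queries in a separate scrape job rather than merging them, which is the point of the "don't scrape from more than 1 engine" advice.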