
NON contextual link list (Spam links)?

Hi guys,
I joined the GSA family a couple of days ago and everything has worked just great since then, but now I've hit a bit of a bump in the road. I'm not quite sure how to approach this problem and was hoping you guys could give me some input on how to solve it.

The campaign I'm talking about is a German one, which already limits the number of scrapable niche-relevant links by a lot. However, I was able to scrape enough links (at least I think so) to start a tiered quality campaign. This means that my tiers will all consist of contextual quality posts, and the only difference between the tiers is going to be the quality (OBLs, PR, etc.).

So what I need now is to scrape URLs for my secondary links (the links coming in 'from the side' of the link pyramid). They will mostly consist of blog comments, guestbooks, etc., so really spammy stuff.
The problem is that I literally have no idea how to scrape for these links. I'd appreciate it if you guys could help me out on this one.

Regards.

Comments

  • Take the footprints from these engines and combine them with keywords in a scraper like you would with any platform?
  • edited April 2014
    @fakenickahl Dude, never mind the question. I felt so stupid after submitting it. I don't even know why I hadn't thought of this. I instantly figured it out seconds after pressing the submit button haha.
    Thanks anyway.

    One question though - if you were to create spam links pointing to your contextual tiered links, would you even bother combining the footprints with your (broad) KWs? I'm just wondering, since they don't necessarily have to be niche related.
  • I don't do tiered links, but if I did, I would not care about building my lower tiers on niche-relevant sites. Therefore I would use as many generic keywords as possible.
  • I just had quite the discussion with someone who claims to know a lot about automated link building.

    He told me that I shouldn't bother scraping for URLs using SER's built-in scraper. Instead I should either invest in good lists ($30-40) or buy Scrapebox/Hrefer.
    Now, I was under the impression that SER's built-in scraper isn't that bad - at least if you're scraping for niche-relevant sites.
    However, I'm nowhere near 5k+ links, and I think that's at least what I need to be able to spam my lower tiers with secondary links.

    Do I absolutely need a scraping tool like Scrapebox? I could afford it right now, but I don't feel like wasting money on something I don't really need yet.

    What's your opinion?
  • @tixxpff Gscraper has a free version which scrapes everything you need.
  • The scraper within SER is just fine if you have low needs. However, I think a good scraper is a must if you want to go through a lot of volume. Gscraper is a beast for me and it scrapes for weeks, but I would not use it without access to thousands of good proxies - and definitely not their proxy service.
  • @ImLazy Wow, didn't know that. Will try this one for sure.

    @fakenickahl These are exactly the two things I've heard: 1. their proxy service sucks, and 2. you'll need a huge amount of proxies.

    @ImLazy @fakenickahl Let's say I try the free Gscraper and do some basic scraping for low/medium competition projects - will 20-30 proxies suffice? Because that's pretty much how many I have access to.
    Also, comparing the free Gscraper to Scrapebox quality-wise, is there any difference? I'm only talking about the scraping here, not the additional features like the KW suggester, etc.

    Thanks guys, you're very helpful.
  • If it's all about scraping for you, then choose Gscraper. It's not necessarily faster, but it's better at sorting and managing stuff. For other functions, though, they are not interchangeable.
  • I would not even consider scraping outside SER with 30 proxies. Get access to a shitload of proxies before using a scraper. Also, the free version of Gscraper has some very annoying limits.
  • @Jesse Are you comparing it to Scrapebox? Because I actually like Scrapebox's bonus features. But as of right now, it's just for scraping, yes.

    @fakenickahl Wow, didn't expect an answer like that. In all the videos/guides/threads I've found, people mostly use something between 10-50 proxies. How would you define 'shitload'?
    Please keep in mind that I don't intend to scrape 24/7 for the next month and create the biggest list in the history of scraping. I'm rather looking for a decent list that can be used for secondary spam links to my tiered link structure.
  • Actually, having a great proxy source is exhilarating; it's like 100 times faster. But still, your money is better spent on Scrapebox or Gscraper than on anything like monthly indexing services or monthly OCR services.
  • The more I read from you guys, the more I feel I need to open another thread regarding scraping itself.
    I feel like there's not enough info about scraping. I've watched pretty much every tutorial video posted on this forum and a couple of others, but I still don't really know how to properly scrape for KWs.

    I read about people scraping lists with 100-500k verified links in them, but when I scrape with related KWs using the SER scraper I end up with 2-4k.
  • edited April 2014
    Scraping shouldn't be the focus of your business unless list-selling is your business. Lists are overhyped. Just build a reasonable amount of your own and then it's all about how you manage them.
  • edited April 2014
    Well, as you can see I'm not even able to create a proper list. I don't know how to approach the scraping process in a structured and organised manner.

    Right now I'm just punching in a couple of my KWs plus variations and LSIs, not selecting any footprints, hitting 'Scrape', and then hoping for the best. I assume that's not the proper way, is it?
  • edited April 2014
    Scraping is not just about SER; it's one of the most basic skills in SEO. I mean, if you spend one month learning it to become a pro, that knowledge will carry you for 5 years. I can't talk details unless it's about SER settings, sorry.
  • goonergooner SERLists.com
    edited April 2014
    @tixxpff - You need to know how successful your scrapes are. You should do something like this:

    - Take a set of 5k or 10k keywords (save these and use them for all scrapes).
    - Scrape only 1 footprint with your keywords. Record the total URLs scraped.
    - Set up 1 project in SER for only that 1 scrape. Let it run until it has completely finished. Record how many URLs end up verified.

    Now you have total URLs scraped and total URLs verified for one footprint. With some simple maths you can get a % verified.

    Repeat for all footprints. It's a long process but if you want to find the best footprints it works.
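
    To make the 'simple maths' concrete, the bookkeeping is just verified divided by scraped, per footprint. A minimal sketch in Python, assuming you've recorded both totals yourself (the footprints and counts below are made-up placeholders, not real scrape results):

    ```python
    # Made-up example numbers: total URLs scraped and verified per footprint.
    results = {
        '"powered by wordpress" "leave a comment"': {"scraped": 42000, "verified": 900},
        'inurl:guestbook.php': {"scraped": 15000, "verified": 120},
        '"powered by drupal"': {"scraped": 30000, "verified": 450},
    }

    # Rank footprints by verified percentage, best first.
    ranked = sorted(results.items(),
                    key=lambda kv: kv[1]["verified"] / kv[1]["scraped"],
                    reverse=True)
    for footprint, r in ranked:
        pct = 100.0 * r["verified"] / r["scraped"]
        print(f"{footprint:<45} {r['verified']:>5}/{r['scraped']:<6} {pct:5.2f}% verified")
    ```

    Sorting by that percentage shows which footprints are actually worth keeping in your batches.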
  • edited April 2014
    @gooner - Thanks for your suggestion. I was going to do something very similar to this anyway, as it has been suggested in the tutorial videos. But your post is more about optimizing the quality of the scraped URLs; right now my problem is that I can't even get enough URLs for a medium-sized campaign.

    • Now, you mentioned 5k - 10k KWs. Is that the usual amount you guys use for (contextual) scraping?
    • When looking for blogs/guestbooks/etc. (secondary link material), they obviously don't need to be contextual. How do I scrape for those? Putting in my niche KWs along with the engine footprints limits the results to a fraction, and for these links quantity > quality. But if I only put in the footprints, I don't get any results at all.

    Thanks.

  • goonergooner SERLists.com
    @tixxpff - No probs :)

    I have a huge list of keywords, 10 million or more, in lots of languages. So I split them into chunks of 10k or whatever and scrape like that. I have my footprints organised into batches too.

    Sent you a PM.
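
    Splitting a big keyword file into 10k chunks like that is easy to script. A quick sketch, assuming one keyword per line in a file called keywords.txt (the file names and chunk size here are just placeholders):

    ```python
    # Split keywords.txt (one keyword per line) into files of 10,000 keywords each.
    CHUNK_SIZE = 10_000

    with open("keywords.txt", encoding="utf-8") as f:
        keywords = [line.strip() for line in f if line.strip()]

    for i in range(0, len(keywords), CHUNK_SIZE):
        chunk = keywords[i:i + CHUNK_SIZE]
        with open(f"keywords_{i // CHUNK_SIZE + 1:04d}.txt", "w", encoding="utf-8") as out:
            out.write("\n".join(chunk) + "\n")
    ```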


  • @gooner Holy moly.. that's a lot. I saw your PM. Will reply soon.

    One general question about foot prints and KWs:
    If I use the SER scraper tool and ONLY use my KW list (no footprints), will that make the tool scrape every website possible? And is there a difference between doing it like that and manually adding every footprint from the list and merging them with the KWs from a text file?
    Because right now I don't quite understand whether footprints are used as filters to scrape for particular platforms, or whether they're necessary for you to get results at all.
  • goonergooner SERLists.com
    @tixxpff - If you just put keywords you will get all sites, like if you go to Google and just search a keyword.
    The problem is SER won't be able to post to most of them so you need to refine your scrape with footprints.
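
    For what it's worth, the footprint + keyword merge being discussed here is just every footprint paired with every keyword to form one search query each. A rough sketch, assuming footprints.txt and keywords.txt hold one entry per line (the file names are assumptions, not SER's own files):

    ```python
    from itertools import product

    # Read one footprint / keyword per line, skipping blanks.
    with open("footprints.txt", encoding="utf-8") as f:
        footprints = [line.strip() for line in f if line.strip()]
    with open("keywords.txt", encoding="utf-8") as f:
        keywords = [line.strip() for line in f if line.strip()]

    # Every footprint combined with every keyword becomes one search query.
    with open("queries.txt", "w", encoding="utf-8") as out:
        for footprint, keyword in product(footprints, keywords):
            out.write(f'{footprint} "{keyword}"\n')
    ```

    The resulting queries.txt can then be fed to whatever scraper you use; with footprints the result set shrinks, but what's left is mostly on platforms SER can actually post to, which is the trade-off described above.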

  • @gooner That was very helpful. So footprints act as a filter and will not provide you with additional results.
  • goonergooner SERLists.com
    You will probably get fewer results with footprints, but they are more targeted.
