NON contextual link list (Spam links)?

Tixxpff · April 2014

Hi guys,
I joined the GSA family a couple of days ago and everything worked just great since then, but now I hit sort of a bump in the road and I'm not quite sure how to approach this problem and was hoping you guys could give me some input to solve it.

The campaign I'm talking about is a German one, which already limits the number of scrapable niche-relevant links by a lot. However, I was able to scrape enough links (at least I guess) to start a tiered quality campaign. This means that my tiers will all consist of contextual quality posts and the only difference between the tiers is going to be the quality (OBLs, PR, etc.).

So what I need now is to scrape for URLs for my secondary links (the links coming in 'from the side' of the link pyramid). They will mostly consist of blog comments, guestbooks, etc. so really spammy stuff.
The problem I encountered is that I literally have no idea how to scrape for these links. I'd appreciate it if you guys could help me out on this one.

Regards.

fakenickahl · April 2014

Take the footprints from these engines and combine them with keywords in a scraper like you would with any platform?
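
A minimal sketch of that idea, in case it helps: every footprint gets paired with every keyword, and each pair becomes one search query. The two footprints and the keywords below are made-up placeholders, not copied from SER's engine files, and `build_queries` is just a hypothetical helper name.

```python
from itertools import product


def build_queries(footprints, keywords):
    """Pair every footprint with every keyword, one search query per pair."""
    return [f"{fp} {kw}" for fp, kw in product(footprints, keywords)]


# Placeholder values for illustration only (assumptions, not real engine data).
footprints = ['"Powered by WordPress" "leave a comment"', '"sign the guestbook"']
keywords = ["garten", "auto"]

queries = build_queries(footprints, keywords)
for query in queries:
    print(query)
```

The number of queries is footprints times keywords, so large lists grow quickly.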

Tixxpff · April 2014

@fakenickahl Dude.. never mind the question. I felt so stupid after submitting it. I don't even know why I didn't think of this. I figured it out seconds after pressing the submit button haha.
Thanks anyway.

One question though - if you were to create spam links to your contextual tiered links, would you even bother combining the footprints with your (broad) KWs? I'm just wondering, since they don't necessarily have to be niche related.

fakenickahl · April 2014

I don't do tiered links, but if I were to, I would not care about making my lower tiers on niche relevant sites. Therefore I would use as many generic keywords as possible.

Tixxpff · April 2014

I just had quite the discussion with someone who claims to know a lot about automated link building.

He told me that I shouldn't bother scraping for URLs using SER's built-in scraper. Instead I should either invest in good lists ($30-40) or buy Scrapebox/Hrefer.
Now, I was under the impression that SER's built-in scraper isn't that bad - at least if you're scraping for niche relevant sites.
However, I am nowhere near 5k+ links and I think this is at least what I need to be able to spam my lower tiers with secondary links.

Do I absolutely need a scraping tool like Scrapebox? As of right now I could afford it, but I don't feel like wasting money on something that I don't absolutely need right now.

What's your opinion?

ImLazy · April 2014

@Tixxpff Gscraper has a free version which scrapes everything you need.

fakenickahl · April 2014

The scraper within SER is just fine if you have low needs. However, I think a good scraper is a must if you want to scrape at high volume. Gscraper is a beast for me and it scrapes for weeks, but I would not use it without access to thousands of good proxies. Definitely not their proxy service.

Tixxpff · April 2014

@ImLazy Wow, didn't know that. Will try this one for sure.

@fakenickahl These are exactly the two things I've heard of. 1. Their proxy service sucks and 2. You'll need a huge amount of proxies.

@ImLazy @fakenickahl Let's say I'll try free GScraper and do some basic scraping for low/medium comp projects, will 20-30 proxies suffice? Because that's pretty much how many I have access to.
Also, comparing free GScraper to Scrapebox quality-wise, is there any difference? I'm only talking about the scraping here, not the additional features, like KW suggester, etc.

Thanks guys, you're very helpful.

Jesse · April 2014

If it's all about scraping for you, then choose Gscraper. It's not necessarily faster but better with sorting and managing stuff. But for other functions they are not interchangeable.

fakenickahl · April 2014

I would not even consider scraping outside SER with 30 proxies. Get access to a shitload of proxies instead before using a scraper. Also the free version of gscraper has some very annoying limits.

Tixxpff · April 2014

@Jesse Are you comparing to Scrapebox? Because I actually like Scrapebox's bonus features. But as of right now, it's just for scraping, yes.

@fakenickahl Wow.. didn't expect an answer like that. In all the videos/guides/threads/.. I've found that people mostly use something between 10-50 proxies. How'd you define 'shitload'?
Please keep in mind that I don't intend to scrape 24/7 for the next month and create the biggest list in the history of scraping. I'm rather looking for a decent list that can be used for secondary spam links to my tiered link structure.

Jesse · April 2014

Actually, having a great proxy source is exhilarating and it's like 100 times faster. Still, your money is better spent on Scrapebox or Gscraper before anything like monthly indexing services or monthly OCR services.

Tixxpff · April 2014

The more I read from you guys, the more I feel I need to open another thread regarding scraping itself.
I feel like there's not enough info about scraping. I've watched pretty much every tutorial video posted on this forum and a couple of others, but I still don't really know how to properly scrape for KWs.

I read about people scraping lists with 100-500k verified links in them, but when I scrape for related KWs using the SER scraper I end up with 2-4k.

Jesse · April 2014

Scraping shouldn't be the focus of your business unless list-selling is your business. Lists are overhyped. Just build up a reasonable number of your own, and then it's all about how you manage them.

Tixxpff · April 2014

Well, as you can see I'm not even able to create a proper list. I don't know how to approach the scraping process in a structured and organised manner.

Right now I'm just punching in a couple of my KWs + variations and LSIs, don't select footprints, hit 'Scrape' and then I hope for the best. I assume that's not the proper way, is it?

Jesse · April 2014

Scraping is not just about SER; it's one of the most basic things in SEO. I mean, if you spend one month learning it to become a pro, that knowledge will last you 5 years. I can't go into details unless it's about SER settings, sorry.

gooner · April 2014

@tixxpff - You need to know how successful your scrapes are. You should do something like this:

- Take a set of 5k or 10k keywords (save these and use for all scrapes).
- Scrape only 1 footprint with your keywords. Record the total URLs scraped.
- Set up 1 project in SER for only that 1 scrape. Let it run until it has completely finished. Record how many verified URLs.

Now you have total URLs scraped and total URLs verified for one footprint. With some simple maths you can get a % verified.

Repeat for all footprints. It's a long process but if you want to find the best footprints it works.
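
The "simple maths" is just verified divided by scraped, times 100. A rough sketch, assuming you note both counts per footprint by hand; the footprint names and numbers below are invented for illustration, and `verified_rate` is a hypothetical helper name.

```python
def verified_rate(scraped, verified):
    """Return the percentage of scraped URLs that ended up verified."""
    if scraped == 0:
        return 0.0  # nothing scraped, so avoid dividing by zero
    return 100.0 * verified / scraped


# footprint -> (total URLs scraped, total URLs verified); made-up numbers
results = {
    "footprint A": (12000, 360),
    "footprint B": (8000, 80),
}

# Best-performing footprints first.
ranked = sorted(results, key=lambda fp: verified_rate(*results[fp]), reverse=True)
for fp in ranked:
    scraped, verified = results[fp]
    print(f"{fp}: {verified_rate(scraped, verified):.1f}% verified")
```

Sorting by the percentage rather than the raw verified count keeps a footprint from looking good just because it returned a lot of URLs.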

Tixxpff · April 2014

@gooner - Thanks for your suggestion. I was going to do something very similar to this anyway, as it has been suggested in the tutorial videos. But your post is more about optimizing the quality of the scraped URLs. Right now my problem is that I can't even get enough URLs for a medium-sized campaign.

Now you mentioned 5k - 10k KWs. Is that the usual amount you guys use for (contextual) scraping?

When looking for blogs/guestbooks/etc. (secondary link material) they obviously don't need to be contextual. How do I scrape for those? Putting in my niche KWs along with the engine footprints limits the results to a fraction, but for these links it's quantity > quality. And if I only put in the footprints, then I don't get any results at all.

Thanks.

gooner · April 2014

@tixxpff - No probs

I have a huge list of keywords, 10 million or more. Lots of languages. So I split them into chunks of 10k or whatever and scrape like that. I have footprints organised into batches too.
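
Splitting a big keyword list into chunks could look roughly like this. It is only a sketch: the keyword list is generated as a stand-in for a real file, `chunked` is a hypothetical helper name, and 10k is just the batch size mentioned above.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Stand-in for a real keyword list loaded from a text file (assumption).
keywords = [f"keyword{i}" for i in range(25000)]

batches = list(chunked(keywords, 10000))
for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {len(batch)} keywords")
```

The last batch is simply whatever is left over, so nothing gets dropped.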

Sent you a PM.

Tixxpff · April 2014

@gooner Holy moly.. that's a lot. I saw your PM. Will reply soon.

One general question about footprints and KWs:
If I use the SER scraper tool and ONLY use my KW list (no footprints), will this make the tool scrape every website possible? And is there a difference between doing it like that and manually adding every footprint from the list + merging them with the KWs from a text file?
Because right now I don't quite understand whether the footprints are used as filters to scrape for particular platforms, or whether they're necessary to get results at all.

gooner · April 2014

@tixxpff - If you just put keywords you will get all sites, like if you go to Google and just search a keyword.
The problem is SER won't be able to post to most of them so you need to refine your scrape with footprints.

Tixxpff · April 2014

@gooner That was very helpful. So footprints act as a filter and will not provide you with additional results.

gooner · April 2014

You will probably get fewer results with footprints, but they are more targeted.
