Strange results while using built-in footprints to scrape sites
Hi,
I am familiarizing myself with footprints, and I will use the WordPress Article footprints as an example for my question.
So, below are all the SER built-in footprints for WordPress Articles. I used them all with Scrapebox to scrape for such article sites, but I only got a few results (around 400 total, and they aren't even all WordPress sites; there's a mix of sites like Facebook, Scribd, Quora, Reddit, etc.). I didn't even specify any keywords (if I do, I get nearly zero results).
I was expecting to get at least thousands of results...
I used the footprints as-is from SER:
"Additional Articles From"Does it mean these footprints are outdated and don't work anymore, or are there no more wordpress articles sites online (the few sites I found are very old, there design seems so outdated)?
"Do not submit articles filled with spelling errors and bad grammar"
"If you have hired a ghost writer, you agree that you have"
"Powered by WordPress Using Article Directory plugin"
"Publish your article in RSS format for other websites to syndicate"
"registered authors in our article directory"
"RSS Articles" "RSS comments" "Recent Articles"
"RSS Articles" "RSS comments" "Recent Articles" "Authorization" "Username:" "Password:" "Remember Me" "Register" "Lost your password?"
"There are * published articles and * registered authors in our article directory."
"There are * published articles and * registered authors"
"This author has published * articles so far. More info about the author is coming soon."
"Using Article Directory plugin"
"Welcome to article directory *. Here you can find interesting and useful information on most popular themes."
"Powered by WordPress + Article Directory plugin"
inurl:"/wp-login.php?action=register"
"This entry was posted in Uncategorized by" "Bookmark the permalink."
Does that mean these footprints are outdated and don't work anymore, or are there simply no more WordPress article sites online? (The few sites I found are very old; their design looks quite dated.) Since these footprints are built into SER and people use the tool as-is, I guess nobody can find WordPress Article platforms with them?
I didn't try other engines, but if it's the same story, there wouldn't be many available sites to post on.
What should I do, then, to find more sites? Do you have better footprints?
And people who sell lists: how do they find footprints for WordPress article sites?
Thanks for your help, guys.
Comments
Test the footprint manually in Google Search first and see what results are returned. If there are no relevant results, then don't use the footprint. Google is continually blocking footprint queries, so what worked yesterday can stop working at any time as Google updates things. Of course, the footprints may still work with other search engines.
Try this footprint:
"Welcome to WordPress. This is your first post. Edit or delete it, then start writing!"
That will get you lots of WordPress sites, but you'll likely find them to be working blog comment sites rather than sites that support article posting. Many of these sites have been spam targets in the past, so they will have all sorts of site protection in place as well as disabled registration.
Considering how much of the internet runs on WordPress, there are plenty of them out there. You just need to get better at scraping by using different keyword lists and custom footprints. Even using footprints in foreign languages will get you sites in that language; you would never find those sites if you only used English footprints.
The more varied your footprints and keywords, the more results you'll scrape.
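To illustrate that merge step, here's a minimal Python sketch (footprints.txt and keywords.txt are just placeholder file names, and this mirrors what Scrapebox does when it merges footprints with keywords, not anything built into SER): it pairs every footprint with every keyword, writes the combined queries to a file, and prints one as a Google URL so you can eyeball it in a browser first.

```python
from itertools import product
from urllib.parse import quote_plus

# Hypothetical inputs: footprints.txt and keywords.txt, one entry per line.
with open("footprints.txt", encoding="utf-8") as f:
    footprints = [line.strip() for line in f if line.strip()]
with open("keywords.txt", encoding="utf-8") as f:
    keywords = [line.strip() for line in f if line.strip()]

# Every footprint x keyword pair becomes one search query, the same
# merge step Scrapebox performs when you load footprints and keywords.
queries = [f'{fp} "{kw}"' for fp, kw in product(footprints, keywords)]

with open("queries.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(queries))

# Print the first query as a Google URL so you can sanity-check it
# manually before feeding the whole list to your scraper.
if queries:
    print("https://www.google.com/search?q=" + quote_plus(queries[0]))
```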
Aside from WordPress, you can also try scraping BuddyPress, wpForo, Gnuboard, DWQA, Moodle and MediaWiki. Combined, you'll probably be able to find 1000+ sites still working with GSA. All contextual links, but many will be no follow.
Pay very little attention to what is written in the software. Just because the software says they are generally no follow, this does not mean that you won't find any do follow links. That info in the software is for general guidance only. It's up to the site owners if they make their links on these sites do follow or no follow. GSA SER has no control over this. It will solely depend on the sites you scrape.
Unfortunately you'll have to go through the process of scraping, testing and filtering to end up with 100% do follow contextuals. Probably not worth the effort or expense.
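If you do decide to filter, the testing part can be sketched roughly like this in Python (this is not a SER feature; requests and beautifulsoup4 need to be installed, and verified.txt plus yourdomain.com are placeholder names): it fetches each verified page and reports whether your link there is do follow, no follow, or missing.

```python
import requests
from bs4 import BeautifulSoup

def link_status(page_url, my_domain):
    """Return 'dofollow', 'nofollow' or 'missing' for links to my_domain on page_url."""
    resp = requests.get(page_url, timeout=15,
                        headers={"User-Agent": "Mozilla/5.0"})
    soup = BeautifulSoup(resp.text, "html.parser")
    status = "missing"
    for a in soup.find_all("a", href=True):
        if my_domain in a["href"]:
            # BeautifulSoup exposes rel as a list of tokens (or None).
            rel = [token.lower() for token in (a.get("rel") or [])]
            if "nofollow" not in rel:
                return "dofollow"   # one followed link is enough
            status = "nofollow"
    return status

if __name__ == "__main__":
    # verified.txt: one page URL per line where your link was placed.
    with open("verified.txt", encoding="utf-8") as f:
        for url in filter(None, map(str.strip, f)):
            try:
                print(link_status(url, "yourdomain.com"), url)
            except requests.RequestException:
                print("unreachable", url)
```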
I don't discriminate against no follow links anymore. I'll select all engines under article, forum, social network, SER Nuke and wiki as T1. They have their value in SEO, so they are worth using. For getting other links crawled/indexed, no follow links are still very useful.
From my own experiments they do help with rankings in other search engines, such as Bing. Personally I think they also make for a more natural link profile, as only an SEO would manipulate their link profile to have 100% do follow links. So I treat them the same way as do follow links. They even get tiers built to them and powered up just as a do follow link does.
Even RankerX works this way. When I build pyramids, the site list is a mix of no follow and do follow across 3 tiers. Then I further blast the pyramid with mixed no follow and do follow links from GSA SER.
Whilst a no follow link is not supposed to help with passing link juice, it will still help with crawling/indexing.
Of course, it's your call which strategy you want to follow.
Forums are absolutely great for targeted traffic. Just ensure you actually read the emails from the profiles and posts you make using automated tools.
Link loss is always going to be an issue, especially when you build links from public sites.
I've seen this cycle twice now with GSA SER. First we had Joomla K2 many, many years ago, and at one point I had over 10k domains for Joomla K2, all contextuals and 100% do follow. Then, as more and more users spammed the platform, things changed to the point where I now have maybe 3 working Joomla K2 sites. Sites are still out there, but now Google blocks the footprints for finding Joomla K2.
The second time I saw this cycle was with Gnuboard, which makes nice contextual links on article posts. Maybe 2 years ago I had several thousand of these sites, and over 1000 of them were do follow. Now those numbers have dwindled to fewer than 100 do follow in my site list. As sites have died, I've lost rankings on client projects both times over. Lesson learnt.
So whilst I still use GSA SER as T1 and on the other tiers, I do expect link loss over the months/years; it's inevitable. My solution is to focus on different link sources such as RankerX, Money Robot, my own PBN network (500 sites) as well as outsourced links from Fiverr. I have my own software that runs scheduled link checks every month, so it's easy to manage now and see where the link loss is happening.
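As a rough idea of what such a scheduled check can look like (this is only a bare-bones sketch, not the software mentioned above; links.csv with page_url and target_url columns is an assumed input format): it refetches every page that should carry one of your links and marks it alive, lost or unreachable, so running it monthly via cron or Task Scheduler shows where the loss is happening.

```python
import csv
import requests

def check_links(csv_path):
    """Yield (status, page_url) for each row of a CSV with page_url,target_url columns."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            status = "lost"
            try:
                resp = requests.get(row["page_url"], timeout=15,
                                    headers={"User-Agent": "Mozilla/5.0"})
                # Naive check: the backlink URL appears somewhere in the HTML.
                if resp.ok and row["target_url"] in resp.text:
                    status = "alive"
            except requests.RequestException:
                status = "unreachable"
            yield status, row["page_url"]

if __name__ == "__main__":
    # Compare this month's output with last month's to spot link loss.
    for status, page in check_links("links.csv"):
        print(status, page)
```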
The new engines from SER Nuke are very good, but I've changed my strategy with them. I'm using them sparingly, on T1 only, to ensure they don't get spammed too quickly. At one point I had over 100 SER installs running, so you can imagine what happens to the sites in my list if I start running them on all installs. The new engines get used on homepage projects only for all my clients, just to boost authority and referring domain stats. They only run on one install, which limits their use.
Register on a forum, post something using automated tools to provoke reaction and take it from there manually.
Look at yourself. You are posting in a forum, begging for answers Google Search can't give you.
Don't make generic posts or comments. Use AI to create relevant content.