Strange results while using built-in footprints to scrape sites
Hi,
I am familiarizing myself with footprints, and I will use the WordPress Article footprints as an example for my question.
So, below are all the SER built-in footprints for WordPress Articles. I used them all with Scrapebox to scrape for such article sites, but I only got a few results (around 400 total, and they aren't even all WordPress sites; there's a mix of sites like Facebook, Scribd, Quora, Reddit, etc.). I didn't even specify any keywords (if I do, I get nearly zero results).
I was expecting to get at least thousands of results...
I used the footprints as-is from SER:
"Additional Articles From"Does it mean these footprints are outdated and don't work anymore, or are there no more wordpress articles sites online (the few sites I found are very old, there design seems so outdated)?
"Do not submit articles filled with spelling errors and bad grammar"
"If you have hired a ghost writer, you agree that you have"
"Powered by WordPress Using Article Directory plugin"
"Publish your article in RSS format for other websites to syndicate"
"registered authors in our article directory"
"RSS Articles" "RSS comments" "Recent Articles"
"RSS Articles" "RSS comments" "Recent Articles" "Authorization" "Username:" "Password:" "Remember Me" "Register" "Lost your password?"
"There are * published articles and * registered authors in our article directory."
"There are * published articles and * registered authors"
"This author has published * articles so far. More info about the author is coming soon."
"Using Article Directory plugin"
"Welcome to article directory *. Here you can find interesting and useful information on most popular themes."
"Powered by WordPress + Article Directory plugin"
inurl:"/wp-login.php?action=register"
"This entry was posted in Uncategorized by" "Bookmark the permalink."
Does that mean these footprints are outdated and don't work anymore, or are there simply no more WordPress article sites online? (The few sites I found are very old; their design looks quite dated.) Since these footprints are built into SER and people use the tool as-is, I guess nobody can find WordPress Article platforms with them?
I didn't try other engines, but if it's the same story, there wouldn't be many available sites to post on.
What should I do, then, to find more sites? Do you have better footprints?
And people who sell lists: how do they find footprints for WordPress article sites?
Thanks for your help, guys.
Comments
Test the footprint manually in Google Search first and see what results are returned. If there are no relevant results, then don't use the footprint. Google is continually blocking footprint queries, so what worked yesterday can stop working at any time as Google updates things. Of course, the footprints may still work with other search engines.
Try this footprint:
"Welcome to WordPress. This is your first post. Edit or delete it, then start writing!"
That will get you lots of WordPress sites, but you'll likely find them to be working blog comment sites rather than sites that support article posting. Many of these sites have been spam targets in the past, so they will have all sorts of site protection in place as well as disabled registration.
Considering how much of the internet runs on WordPress, there are plenty of them out there. You just need to get better at scraping by using different keyword lists and custom footprints. Even using footprints in foreign languages will get you sites in that language; you would never find those sites if you only used English footprints.
The more varied your footprints and keywords, the more results you'll scrape.
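To illustrate that merge step, here's a minimal Python sketch (footprints.txt and keywords.txt are just placeholder file names, and this mirrors what Scrapebox does when it merges footprints with keywords, not anything built into SER): it pairs every footprint with every keyword, writes the combined queries to a file, and prints one as a Google URL so you can eyeball it in a browser first.

```python
from itertools import product
from urllib.parse import quote_plus

# Hypothetical inputs: footprints.txt and keywords.txt, one entry per line.
with open("footprints.txt", encoding="utf-8") as f:
    footprints = [line.strip() for line in f if line.strip()]
with open("keywords.txt", encoding="utf-8") as f:
    keywords = [line.strip() for line in f if line.strip()]

# Every footprint x keyword pair becomes one search query, the same
# merge step Scrapebox performs when you load footprints and keywords.
queries = [f'{fp} "{kw}"' for fp, kw in product(footprints, keywords)]

with open("queries.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(queries))

# Print the first query as a Google URL so you can sanity-check it
# manually before feeding the whole list to your scraper.
if queries:
    print("https://www.google.com/search?q=" + quote_plus(queries[0]))
```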
Aside from WordPress, you can also try scraping BuddyPress, wpForo, Gnuboard, DWQA, Moodle and MediaWiki. Combined, you'll probably be able to find 1000+ sites still working with GSA. All contextual links, but many will be no follow.
Pay very little attention to what is written in the software. Just because the software says they are generally no follow, this does not mean that you won't find any do follow links. That info in the software is for general guidance only. It's up to the site owners if they make their links on these sites do follow or no follow. GSA SER has no control over this. It will solely depend on the sites you scrape.
Unfortunately you'll have to go through the process of scraping, testing and filtering to end up with 100% do follow contextuals. Probably not worth the effort or expense.
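If you do decide to filter, the testing part can be sketched roughly like this in Python (this is not a SER feature; requests and beautifulsoup4 need to be installed, and verified.txt plus yourdomain.com are placeholder names): it fetches each verified page and reports whether your link there is do follow, no follow, or missing.

```python
import requests
from bs4 import BeautifulSoup

def link_status(page_url, my_domain):
    """Return 'dofollow', 'nofollow' or 'missing' for links to my_domain on page_url."""
    resp = requests.get(page_url, timeout=15,
                        headers={"User-Agent": "Mozilla/5.0"})
    soup = BeautifulSoup(resp.text, "html.parser")
    status = "missing"
    for a in soup.find_all("a", href=True):
        if my_domain in a["href"]:
            # BeautifulSoup exposes rel as a list of tokens (or None).
            rel = [token.lower() for token in (a.get("rel") or [])]
            if "nofollow" not in rel:
                return "dofollow"   # one followed link is enough
            status = "nofollow"
    return status

if __name__ == "__main__":
    # verified.txt: one page URL per line where your link was placed.
    with open("verified.txt", encoding="utf-8") as f:
        for url in filter(None, map(str.strip, f)):
            try:
                print(link_status(url, "yourdomain.com"), url)
            except requests.RequestException:
                print("unreachable", url)
```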
I don't discriminate against no follow links anymore. I'll select all engines under article, forum, social network, SER Nuke and wiki as T1. They have their value in SEO, so they are worth using. For getting other links crawled/indexed, no follow links are still very useful.
From my own experiments they do help with rankings in other search engines, such as Bing. Personally I think they also make for a more natural link profile, as only an SEO would manipulate their link profile to have 100% do follow links. So I treat them the same way as do follow links. They even get tiers built to them and powered up just as a do follow link does.
Even RankerX works this way. When I build pyramids, the site list is a mix of no follow and do follow across 3 tiers. Then I further blast the pyramid with mixed no follow and do follow links from GSA SER.
Whilst a no follow link is not supposed to help with passing link juice, it will still help with crawling/indexing.
Of course, it's your call which strategy you want to follow.
Forums are absolutely great for targeted traffic. Just ensure you actually read the emails from the profiles and posts you make using automated tools.
Link loss is always going to be an issue, especially when you build links from public sites.
I've seen this cycle twice now with GSA SER. First we had Joomla K2 many, many years ago, and at one point I had over 10k domains for Joomla K2, all contextuals and 100% do follow. Then, as more and more users spammed the platform, things changed to the point where I now have maybe 3 working Joomla K2 sites. Sites are still out there, but now Google blocks the footprints for finding Joomla K2.
The second time I saw this cycle was with Gnuboard, which makes nice contextual links on article posts. Maybe 2 years ago I had several thousand of these sites, and over 1000 of them were do follow. Now those numbers have dwindled to fewer than 100 do follow in my site list. As sites have died, I've lost rankings on client projects both times over. Lesson learnt.
So whilst I still use GSA SER as T1 and on the other tiers, I do expect link loss over the months/years; it's inevitable. My solution is to focus on different link sources such as RankerX, Money Robot, my own PBN network (500 sites) as well as outsourced links from Fiverr. I have my own software that runs scheduled link checks every month, so it's easy to manage now and see where the link loss is happening.
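As a rough idea of what such a scheduled check can look like (this is only a bare-bones sketch, not the software mentioned above; links.csv with page_url and target_url columns is an assumed input format): it refetches every page that should carry one of your links and marks it alive, lost or unreachable, so running it monthly via cron or Task Scheduler shows where the loss is happening.

```python
import csv
import requests

def check_links(csv_path):
    """Yield (status, page_url) for each row of a CSV with page_url,target_url columns."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            status = "lost"
            try:
                resp = requests.get(row["page_url"], timeout=15,
                                    headers={"User-Agent": "Mozilla/5.0"})
                # Naive check: the backlink URL appears somewhere in the HTML.
                if resp.ok and row["target_url"] in resp.text:
                    status = "alive"
            except requests.RequestException:
                status = "unreachable"
            yield status, row["page_url"]

if __name__ == "__main__":
    # Compare this month's output with last month's to spot link loss.
    for status, page in check_links("links.csv"):
        print(status, page)
```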
The new engines from SER Nuke are very good, but I've changed my strategy with them. I'm using them sparingly, on T1 only, to ensure they don't get spammed too quickly. At one point I had over 100 SER installs running, so you can imagine what happens to the sites in my list if I start running them on all installs. The new engines get used on homepage projects only for all my clients, just to boost authority and referring domain stats. They only run on one install, which limits their use.
Register on a forum, post something using automated tools to provoke reaction and take it from there manually.
Look at yourself. You are posting in a forum, begging for answers Google Search can't give you.
Don't make generic posts or comments. Use AI to create relevant content.