Scrapebox > Platform Identifier > GSA no engines found

Hello!

I am currently scraping with Scrapebox, using a list of footprints I found online until I can build my own.

Once the list is scraped, I open Platform Identifier and process the file, per engine, directly into GSA SER's Identified folder. Then I remove the duplicates.

I'm still getting a lot of 'no engine matches' messages in the logs, and my LPM has dropped significantly.

I also notice that when I run 'Clean Up' from the tools in GSA SER, a lot of URLs still get removed. That's confusing, because I had just run the same kind of check in Platform Identifier. So what's different? Do the two tools not use the same footprints? What do I need to add in GSA SER?

What are some tips here? How can I improve my workflow?

Comments

  • There's no need to search for footprints on third-party sites. Export them from SER instead:
    • Create a new project in SER.
    • Select the engines you want (and apply filters).
    • Right-click in the "where to submit" field -> "export footprints of all checked engine"
      This way, your custom footprints are included as well.
    • Enter your niche's keywords, or use generic A - Z, 0 - 9, in Scrapebox.
    • Merge them with the exported footprints.
    • Use good proxies when scraping Google.
    • Opt for the detailed harvester rather than the custom harvester in Scrapebox.
      It seems to take longer than the custom harvester, but when you look at the results, you'll see that it is far more successful.
    This way, you will mostly scrape target URLs that SER's engines recognize. You can then expand this list in Scrapebox using the "link extractor" addon.
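For anyone who wants to see what the merge step amounts to, here is a rough Python sketch. It assumes a plain "footprint + keyword" join, which may not match Scrapebox's merge behaviour in every detail. The function name `merge_footprints` and the sample footprints and keywords are made up for illustration; they are not taken from a SER export.

```python
from itertools import product


def merge_footprints(footprints, keywords):
    """Pair every footprint with every keyword, skipping blanks and duplicates."""
    seen = set()
    queries = []
    for footprint, keyword in product(footprints, keywords):
        footprint, keyword = footprint.strip(), keyword.strip()
        if not footprint or not keyword:
            continue  # ignore empty lines from the input files
        query = f"{footprint} {keyword}"
        if query not in seen:
            seen.add(query)
            queries.append(query)
    return queries


# Hypothetical sample data; real footprints would come from the SER export.
footprints = ['"Powered by WordPress"', "inurl:guestbook", ""]
keywords = ["gardening", "a", "a"]
queries = merge_footprints(footprints, keywords)
for q in queries:
    print(q)
```

Each output line is one search query to feed to the harvester. The query count grows as footprints × keywords, so a large export combined with a large keyword list gets big quickly.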
  • Hello,

    My goal in using 3rd party footprints is to scrape websites that aren't being scraped by other GSA users. I want a different link profile. @backlinkaddict

    It also doesn't explain why GSA Platform Identifier identifies URLs as certain engines while GSA SER can't recognize them. Logically, it should be cleaning my scraped lists properly. That's why I was wondering whether Platform Identifier somehow has more "footprints" (or whatever it uses to spot the engine) than GSA SER. I can't figure this out through Footprint Studio either.

    Can anyone advise on that point?

    I will however try what you both suggested and see if my results improve.

    Thank you for your input!
  • rosath said:

    My goal in using 3rd party footprints is to scrape websites that aren't being scraped by other GSA users. I want a different link profile.
    That's perfectly fine, but you need to make sure that SER knows how to identify and act on these targets. Hence the suggestion to export footprints from SER.

    Can't comment on PI as I am not using it.

  • Hi, what do you mean by "You can then expand this list in Scrapebox using the "link extractor" addon." ?
    Does it mean you look for external links on the pages SB already found, hoping they will be on the same platforms? Thanks
  • Both internal and external links. Some people like to post again on the same site.
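To make the internal/external distinction concrete, here is a small Python sketch of the idea. It is not how the Scrapebox addon is implemented, just an illustration using the standard library. `LinkCollector`, `split_links`, and the example URLs are hypothetical, and "internal" is assumed to mean "same host as the page".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_links(page_url, html):
    """Return (internal, external) absolute http(s) links found in html."""
    parser = LinkCollector()
    parser.feed(html)
    page_host = urlparse(page_url).netloc.lower()
    internal, external = [], []
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        if parts.netloc.lower() == page_host:
            internal.append(absolute)
        else:
            external.append(absolute)
    return internal, external


# Hypothetical page content for illustration.
html = (
    '<a href="/forum/thread-2">next</a> '
    '<a href="https://other.example/blog">blog</a> '
    '<a href="mailto:x@example.com">mail</a>'
)
internal, external = split_links("https://site.example/forum/thread-1", html)
print(internal)
print(external)
```

Internal links give you more pages on a site you already know, and external links may lead to new sites. Either way, the extracted URLs still need to go through engine identification before they are useful.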