
Noob question about pulling from Advanced settings/Identified folders

gsa8mycows forum.gsa-online.de/profile/11343/gsa8mycows
edited December 2015 in GSA Search Engine Ranker
This is a quick question. I assume SER pulls all .txt files if a project is set to pull URLs to post to from any of the four folder options in Advanced settings.

Is that correct? So you don't have to follow the Article-Engine-whatever naming convention, right?

That means I could dump a bunch of URLs from ScrapeBox in there and let some projects process them.

Thanks.

Comments

  • Sven www.GSA-Online.de
    SER reads only the files that match engines enabled in the project options. Putting just any file there will not work.
  • gsa8mycows forum.gsa-online.de/profile/11343/gsa8mycows
    edited December 2015
    What I meant is: if I export files from ScrapeBox to the Identified folder, as in the picture below, and I have some dummy projects set up to process those raw scrapes, will SER read any text file for URLs, whether it's called junk.txt or Article-Wordpress-Articles.txt?

    [screenshot of the ScrapeBox export settings]
  • gsa8mycows forum.gsa-online.de/profile/11343/gsa8mycows
    edited December 2015
    It looks like Sven is right, or I misunderstood him.

    Anyway, I got my answer. Whatever I put in D:/gsadata/links/identified from ScrapeBox, SER deletes.

    At least I can't find the files I put in the identified folder.

    Now I just need to figure out how to automate feeding SER from ScrapeBox without Platform Identifier.
  • Tim89 www.expressindexer.solutions
    edited December 2015
    Just import your scrapes directly into the campaign after you've created it: uncheck all sitelists and OK the project. It will pop up and state that you haven't selected any sitelists; ignore this. Then right-click on the project and import a sitelist from clipboard/file.

    Let the project run and it will save all verified links in the verified sitelists.

    I used to have many dummy projects set up simply to process my raw scrapes.

    I would scrape, import the scrape into my dummy SER campaign, and let that run constantly, which steadily builds up a verified list.

    It's not 100% automated this way, but it's just a matter of copying/importing your scraped URLs.
  • gsa8mycows forum.gsa-online.de/profile/11343/gsa8mycows
    edited December 2015
    Thank you, Tim. That's what I used to do earlier this year, but I just wondered whether there was something I was missing.

    There's no official solution for this, I guess, other than Platform Identifier, which I can't pay for this time.

    I also wondered: what if you wrote a dummy engine for SER (like one of Ozz's old templates) and appended your cleaned/deduped raw scrapes to that engine's file inside your Identified folder, then just let SER pick it up? But it would probably not post to those links at all, since the dummy engine is just a work-around, not a real engine; SER would probably not pick up the links and sort them out properly.

    SER really does go through the Identified folder and deletes all my amateurish attempts to put raw scrapes there.

  • Tim89 www.expressindexer.solutions
    I suppose you could write a script to have SER automatically import a raw sitelist scrape into a project.

    The easiest way to streamline this task the way you want it is Platform Identifier.
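
    Building on that, a minimal sketch of the part you could script yourself, assuming a folder of ScrapeBox .txt exports: collect every file, dedupe the URLs, and write one combined list ready for the right-click "import from file" step in SER. The folder and file names here are placeholders for illustration, not anything SER requires.

    ```python
    from pathlib import Path

    def merge_scrapes(scrape_dir, out_file):
        """Gather URLs from every .txt file in scrape_dir, dedupe, and write one combined list."""
        urls = set()
        for txt in sorted(Path(scrape_dir).glob("*.txt")):
            for line in txt.read_text(encoding="utf-8", errors="ignore").splitlines():
                line = line.strip()
                # keep only lines that look like URLs; ScrapeBox exports are one URL per line
                if line.lower().startswith(("http://", "https://")):
                    urls.add(line)
        Path(out_file).write_text("\n".join(sorted(urls)) + "\n", encoding="utf-8")
        return len(urls)
    ```

    You would still do the final import in the SER GUI, but running this after each scrape keeps a single clean, deduped file instead of dozens of overlapping exports.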