Noob question about pulling from Advanced settings/Identified folders

December 2015

This is a quick question. I assume SER pulls all txt files if a project is set to pull URLs to post to from any of the 4 folder options in Advanced settings.

Is that correct? So you don't have to follow the naming convention of Article-Engine-whatever right? Amirite?

That means I can dump a bunch of URLs from Scrapebox in there and let some projects process them.

Thanks.

December 2015

SER only reads the files that match engines enabled in the project options. Putting an arbitrary file in there will not work.

December 2015

What I meant is this: if I export files from Scrapebox to the Identified folder like in the picture below, and I've got some dummy projects to process those raw scrapes, then SER will read any text file for URLs, whether it's called junk.txt or Article-Wordpress-Articles.txt.

December 2015

It looks like Sven is right, or I misunderstood him.

Anyway, I got my answer. Whatever I put in /gsadata/links/identified from Scrapebox, SER deletes it.

At least I can't find the files I put in the identified folder.

Now I just need to figure out how to automate feeding SER from Scrapebox without Platform Identifier.

December 2015

Just import your scrapes directly into the campaign after you've created it. Uncheck all sitelists and OK the project. A pop-up will state that you haven't selected any sitelists; ignore it. Then right-click the project and import sitelist from clipboard/file.

Let the project run and it will save all verified links in the verified sitelists.

I used to have many dummy projects set up simply to process my raw scrapes.

I would scrape, import the scrape into my dummy SER campaign and let that run constantly, which easily builds up a verified list.

It's not 100% automated this way but it's just a matter of copy/importing your scraped urls.
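If you end up with several Scrapebox export files, the copy/import step gets a bit easier when you merge them into one clean file first. Here's a rough Python sketch of that, not anything built into SER or Scrapebox. The function names and file names are made up for illustration, and treating two URLs that differ only by a trailing slash as duplicates is my own assumption. You'd still import the resulting file into the project by hand.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import urlsplit


def normalize(line: str) -> Optional[str]:
    """Return the stripped URL, or None if the line is not an http(s) URL."""
    url = line.strip()
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit raises on some malformed hosts, e.g. a broken IPv6 literal
        return None
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None
    return url


def merge_scrapes(paths, out_path):
    """Merge scrape files into one file, dropping junk lines and duplicates."""
    seen = set()
    merged = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="ignore")
        for line in text.splitlines():
            url = normalize(line)
            if url is None:
                continue
            # Assumption: a trailing slash does not make a different target
            key = url.rstrip("/")
            if key not in seen:
                seen.add(key)
                merged.append(url)
    # One URL per line, first occurrence wins, original order kept
    Path(out_path).write_text("".join(u + "\n" for u in merged), encoding="utf-8")
    return merged
```

Something like `merge_scrapes(["scrape1.txt", "scrape2.txt"], "to_import.txt")` would then give you a single file to import into the dummy project.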

December 2015

Thank you Tim. That's what I used to do earlier this year, but I just wondered whether there was something I'm missing.

I guess there's no official solution for this other than Platform Identifier, which I can't pay for right now.

I also wondered about writing something that would append my cleaned, deduped raw scrapes to a new dummy engine file inside the identified folder (a custom engine written for GSA SER, like one of Ozz's old templates) and just letting SER pick it up. But SER would probably not post to these new links at all, since the dummy engine is only a workaround and not a real engine. It probably wouldn't pick up the dummy engine's links and sort them properly.

SER really goes through the identified folder and deletes all amateurish attempts of mine to put the raw scrapes there.

December 2015

I suppose you could write a script to have SER automatically import a raw sitelist scrape into a project.

The easiest way to streamline this task the way you want is Platform Identifier.
