Import List Questions
1. I've scraped a very comprehensive list with ScrapeBox using numerous SER footprints+keywords. Do I need to "trim to root" for these lists before I import them, or just leave the full URLs and "remove duplicate urls"? Or does it even matter?
2. I have 3 locations running SER. From time to time I like to take the "submitted url lists" from each location and transfer them to my other locations running SER. When importing site lists from other SER installs, should I use Advanced>Tools>Import Site Lists, OR open the submitted sites list folder, merge all lists into a text file, and then import them by right-clicking a Project>Import Target URLs>From File?
I'm assuming they'll both do the same thing, but Import Target URLs will run through all URLs immediately, whereas Import Site Lists will submit to the URLs over time, for all projects (NOTE: only if Global Lists is checked).
3. What's the difference between the 2 features under Advanced>Tools: Add URLs from Projects VS Import Site Lists?
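For anyone who goes the manual route from question 2 (merging the list files from several installs into one text file), here is a minimal Python sketch. It is not part of SER: the function name `merge_site_lists` and the `*.txt` pattern are assumptions for illustration, so adjust them to whatever your site list folders actually contain.

```python
from pathlib import Path


def merge_site_lists(folders, out_file):
    """Merge every *.txt file found in `folders` into `out_file`.

    Blank lines are skipped and exact duplicate URLs are written only once,
    in the order they were first seen. Returns the number of URLs written.
    """
    seen = set()
    merged = []
    for folder in folders:
        # Assumption: each list is a plain text file with one URL per line.
        for path in sorted(Path(folder).glob("*.txt")):
            text = path.read_text(encoding="utf-8", errors="ignore")
            for line in text.splitlines():
                url = line.strip()
                if url and url not in seen:
                    seen.add(url)
                    merged.append(url)
    Path(out_file).write_text("".join(u + "\n" for u in merged), encoding="utf-8")
    return len(merged)
```

The resulting file can then be loaded through Project>Import Target URLs>From File. Keep the output file outside the source folders so it is not picked up again on the next merge.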
Comments
If it were me, I would split the lists if they are big, then add them to t3 or t4 projects.
They get filtered that way and are only processed once; the working links are then added to the site lists.
1) doesn't matter
2) Use Import Site Lists
3) Import/Export is for site lists from other SER instances
@Sven - I need a bit more clarification please.
1. Ok, so SER will find the registration pages on its own, regardless of whether the URL is at its root or is a random scraped URL? That doesn't seem to make much sense.
2. Is my assumption correct in the 2nd paragraph of #2?
3. This explanation didn't make sense.
1. Just have a look at, e.g., any forum engine. It first does "find link=Register" or something similar. And the Register/Login link is usually visible on sub pages as well.
2) yes correct
3) That Import/Export of site lists was made for exactly what you want to do. So use it to import site lists that you built on a different machine.
1. Ok then, I'll trim to root. I would think an extended URL (www.website.com/string.php?t=something&=something) would be harder for SER to find register links on, as it would have to trim to root before it started the registration process. That's if it even does that at all.
2. Ok thanks! Followup questions: a) If I import my URL lists, using "Import URL Lists" feature, am I importing them into the "Successful" AKA - Submitted folder? b) Once I have imported my URL lists into the submitted folder, will SER ensure that ALL PROJECTS run through the URLs imported?
3. Cool....thanks!
UPDATE: I ended up importing my HUGE URL list into each project by choosing "Import Target URLs" and now I want to delete all of them.
I am selecting Show URLs>Show left Target URLs>Selecting All>Deleting then clicking OK. When I open the Target URL list again to ensure the URLs are gone, they are still there! I believe this is a bug, as none of the URLs for any of the projects are being deleted. I have tried this numerous times with no success.
@Sven- thanks for the fix. I will be awaiting this.....
My last 2 followup questions were ignored above. Followup questions: a) If I import my URL lists, using "Import URL Lists" feature, am I importing them into the "Successful" AKA - Submitted folder? b) Once I have imported my URL lists into the submitted folder, will SER ensure that ALL PROJECTS run through the URLs imported?
a) as it is parsing the URLs...it is adding them to "identified list"
b) yes and no. The projects can use the new URLs any time you add a new URL to the list. But since they pick URLs randomly (random position in the file), it might take some time until they get a new one rather than an old one.
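To put a rough number on "it might take some time": the sketch below assumes each pick is uniformly random over the whole file and that already-used lines can be picked again. The thread only says "random position in file", so this is a simplification and not a description of how SER actually works internally. Both function names are made up for the example.

```python
def chance_next_pick_is_new(old_count, new_count):
    """Probability that one uniformly random pick lands on a newly added URL."""
    total = old_count + new_count
    if total == 0:
        raise ValueError("the list is empty")
    return new_count / total


def expected_picks_until_new(old_count, new_count):
    """Average number of random picks (with repeats allowed) before a new URL comes up."""
    if new_count == 0:
        raise ValueError("no new URLs were added")
    return (old_count + new_count) / new_count


# Example: 10,000 freshly imported URLs added to a list that already holds 90,000.
print(chance_next_pick_is_new(90_000, 10_000))
print(expected_picks_until_new(90_000, 10_000))
```

Under those assumptions, the larger the existing list is relative to the import, the longer a project tends to keep hitting old entries before it reaches a new one.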
b) Understood and thanks!
Trim to root seems to make the most sense when importing URLs. Then you can keep track of ALL your scraped URLs in one Excel file. Have one tab as a master list of all URLs and a second tab for newly scraped URLs.
Each time you do a scrape:
1. Trim to root
2. Copy and paste into Excel, tab 2
3. Do a VLOOKUP against the master list tab and sort out the duplicates. Move the unique ones into SER and also paste them into the master tab.
Eliminates SER having to process duplicates.
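The same three steps can also be done without Excel. Below is a minimal Python sketch of the idea, not something SER or ScrapeBox provides: `trim_to_root` and `new_roots` are hypothetical helper names, and entries scraped without a scheme are assumed to be plain `http://`.

```python
from urllib.parse import urlsplit


def trim_to_root(url):
    """Reduce a URL to scheme + host, dropping path and query string."""
    url = url.strip()
    # Assumption: entries without a scheme (e.g. "www.site.com/page") are plain HTTP.
    if not url.lower().startswith(("http://", "https://")):
        url = "http://" + url
    parts = urlsplit(url)
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}/"


def new_roots(scraped, master):
    """Return the trimmed roots from `scraped` that are not yet in `master`.

    `master` is a set playing the role of the master tab; it is updated in
    place so the next scrape is checked against everything seen so far.
    """
    fresh = []
    for raw in scraped:
        if not raw.strip():
            continue  # skip blank lines
        root = trim_to_root(raw)
        if root not in master:
            master.add(root)
            fresh.append(root)
    return fresh


master = {"http://old.example/"}
scraped = [
    "http://old.example/forum/index.php?t=1",
    "www.new.example/string.php?t=something",
    "http://WWW.new.example/other-page",
]
print(new_roots(scraped, master))
```

Only the URLs returned by `new_roots` would go into SER. Note that this treats `www.site.com` and `site.com` as different roots; strip the `www.` prefix as well if you want them merged.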
a) advanced > tools > "import URLs (identify platforms and sort in)"... do they get sorted to site_list-identified?
b) i accidentally left the "save identified sites to" [unchecked] when i was doing my mass import... does it matter?
c) in order to use the URLs from "import URLs (identify platforms and sort in)", we have to enable this in all projects like this?
in each project, we [check] use URLs from global site lists if enabled
[check] identified
is this correct?
thank you for the help!
a) yes as no verification/submission was done
b) no, should be imported anyway
c) of course, else they are not used