Import Target URLs & Show Remaining Target URLs
Not sure if this is a bug or if I need help.
I scraped a list of URLs with Scrapebox using the GSA SER footprints, de-duped it by URL/domain, and was left with a list of 540,000.
I attempted to import this list via GSA SER (Right click > Import Target URLs > From File) on a dummy project.
The pop up appears and says "Imported 540,000 URLs to the target project..."
When I click show URLs (Right click > Show URLs > Show Remaining Target URLs), it only shows 290 URLs.
I then thought I would take a look at my URL list, even though GSA confirmed 540,000 were imported.
On line 290 there was a URL like the following (I replaced the actual domain with 'domain'; the rest is exactly the same):
http://www.domain.com/index.php/اقسام-اخرى/متÙرقات/10229-هل-Øان-وقت-خروج-المهدي-Ùˆ-عودة-المسيØ-ØŸ.html
So I removed this URL, cleared the URL cache so the list was empty, tried again, and checked the show URLs option. It now shows 65,000 URLs.
I went back to the URL list, and at line 65,000 was another URL like the one above. I removed it, repeated the previous steps, and it now shows 350,000 URLs.
Does anyone know why this happens?
Even if it only shows 290 URLs, will it still post to my 500k?
If not, would I need to manually check files to remove dodgy URLs like the one above in order to import the full list?
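Rather than hunting for these lines by hand, the check could be scripted. Below is a minimal sketch, assuming the "dodgy" URLs are the ones containing non-ASCII characters (as in the example above) and that the list is a plain text file with one URL per line. The function names and file paths are hypothetical, and whether dropping these lines is enough for GSA SER to read the whole list is not confirmed here.

```python
from pathlib import Path


def split_urls(lines):
    """Separate ASCII-only URLs from URLs containing non-ASCII characters.

    Returns (clean, suspect), where suspect holds (line_number, url) pairs
    so the offending lines can be located in the original file.
    """
    clean, suspect = [], []
    for number, line in enumerate(lines, start=1):
        url = line.strip()
        if not url:
            continue  # skip blank lines
        if url.isascii():
            clean.append(url)
        else:
            suspect.append((number, url))
    return clean, suspect


def filter_file(src, dst, encoding="utf-8"):
    """Write only the ASCII-only URLs from src to dst; return the suspects.

    Undecodable bytes are replaced with U+FFFD, which is non-ASCII,
    so those lines end up in the suspect list as well.
    """
    text = Path(src).read_text(encoding=encoding, errors="replace")
    clean, suspect = split_urls(text.splitlines())
    Path(dst).write_text("\n".join(clean) + "\n", encoding="ascii")
    return suspect
```

Calling `filter_file("urls.txt", "urls_clean.txt")` (hypothetical file names) would write the ASCII-only URLs to the second file and return the line numbers and text of everything it left out, so nothing is discarded silently.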
Thanks
Comments
Thanks
Even if I right-click the project and import the list while it is active, it does the same thing after a couple of minutes. I tried several lists I haven't tried previously.
In Notepad++, change the encoding to UTF-8 and save the file.
This was a list from a provider. The other months worked fine until this list.
I did try saving as UTF-8 with a different program, but it was the Notepad++ conversion that worked.
The list as downloaded was in UCS-2 Little Endian encoding.
Funny, it is supposed to be over 300k URLs, but when opened in Notepad++ it shows about half of that: 157k.
SER purring like a kitten again, Thanks for the help!!!
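For anyone who would rather script the Notepad++ conversion, here is a minimal sketch. It assumes the source file is UCS-2/UTF-16 with a byte order mark, which Python's `utf-16` codec uses to detect endianness; for a file without a BOM, pass `utf-16-le` instead. The function name and file names are hypothetical. The output is written as UTF-8 without a BOM; the thread does not say whether GSA SER cares about that.

```python
def convert_to_utf8(src, dst, source_encoding="utf-16"):
    """Re-encode a text file as UTF-8 and return the number of lines read.

    newline="" keeps the original line endings untouched on both the
    read and the write side.
    """
    with open(src, "r", encoding=source_encoding, newline="") as infile:
        text = infile.read()
    with open(dst, "w", encoding="utf-8", newline="") as outfile:
        outfile.write(text)
    return len(text.splitlines())
```

For example, `convert_to_utf8("list.txt", "list_utf8.txt")` returns the line count it read, which can be compared against the number of URLs the list is supposed to contain.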