Regularly save scraping progress

Tixxpff · May 2014

Ok, I'm quite pissed so instead of creating a rage thread, I'll channel this rage into something useful.

For the last couple of days I was scraping contextual URLs for my latest project. At ~95% scraping progress my computer crashed. After turning it back on I realised that the SER scraper isn't saving the URLs to your custom .txt file in real time. It only saves once the scrape has finished or been aborted.

So how about making it save every ~1-5% of progress, so that instead of losing the whole list you only lose the last 5% or so?

Sven · May 2014

Well, it saves every 5 min or every 100+ URLs. How did you feed it with URLs?
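For anyone building their own scraper, the "every 5 min or 100+ URLs" rule described above can be sketched roughly as follows. This is not SER's actual code; `CheckpointWriter`, its parameters, and the file name are hypothetical, and the thresholds are simply the numbers mentioned in this thread.

```python
import os
import time


class CheckpointWriter:
    """Buffer scraped URLs and append them to a file at regular checkpoints.

    Hypothetical illustration only: flushes when 100 URLs are pending
    or 5 minutes have passed since the last flush, whichever comes first.
    """

    def __init__(self, path, max_pending=100, max_age_seconds=300.0,
                 clock=time.monotonic):
        self.path = path
        self.max_pending = max_pending
        self.max_age_seconds = max_age_seconds
        self.clock = clock  # injectable so the time rule can be tested
        self.pending = []
        self.last_flush = clock()

    def add(self, url):
        """Queue one URL; write the queue to disk if a threshold is hit."""
        self.pending.append(url)
        too_many = len(self.pending) >= self.max_pending
        too_old = self.clock() - self.last_flush >= self.max_age_seconds
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Append all pending URLs to the file and reset the timer."""
        if self.pending:
            with open(self.path, "a", encoding="utf-8") as f:
                f.writelines(u + "\n" for u in self.pending)
                f.flush()
                os.fsync(f.fileno())  # ask the OS to push the data to disk
            self.pending.clear()
        self.last_flush = self.clock()
```

With this approach a crash costs at most the URLs still in the buffer (fewer than 100, or about 5 minutes of work). Calling `flush()` once more when the scrape finishes or is aborted writes out the remainder. Note that the time check only runs when a URL is added, so a stalled scrape would not flush on its own.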

Tixxpff · May 2014

@Sven It didn't for me... my list is completely empty.
Did you mean 'feed'? If so, I was scraping using GSA footprints + a keyword list.

Edit: Where does SER save the URLs if no custom .txt file is selected?

Sven · May 2014

If no custom file is selected, it saves them in site lists -> identified.

Tixxpff · May 2014

I have Identified unticked. I only save to custom files and read from previously created lists. Well, I guess it was just bad luck or a bug.
I'm scraping again now and it's updating my custom file every ~100 URLs, just as you said it would.

Thanks for the help Sven.
