Feature suggestion
sysco32
Skopje
As we would like to automate everything, I am missing the following things.
When I finish harvesting URLs with ScrapeBox,
first I would like to remove the duplicates from the file. If I use the monitor-folder-for-dupes feature, I would like to set a folder where PI writes out the deduped files. This output folder is important because we don't want PI to start identifying the raw harvested URLs, only the deduped ones (which are a very small percentage of the raw file).
In that case I would set up an identify project that monitors the deduped output folder. The identified URLs would then go to their set folder, and the unrecognized ones would go to theirs.
It would also be good if SER could monitor the unrecognized-URL folder and start trying to post to those URLs, so I don't have to manually import the URLs into a project. The other option is to set a size limit for the unidentified file; once it is reached, PI would start writing a new one, and I would not lose track of which was the last URL I imported into a project. Otherwise it will import all the URLs from the beginning.
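The dedupe step described above can be sketched in a few lines. This is only an illustration of the requested behavior, not PI's actual implementation; the file and folder names are assumptions.

```python
# Sketch: read a raw harvested-URL file, drop duplicate lines while
# preserving order, and write the result into a separate output folder
# so that only the deduped file gets picked up for identification.
from pathlib import Path

def dedupe_file(raw_path: Path, out_dir: Path) -> Path:
    seen = set()
    deduped = []
    for line in raw_path.read_text(encoding="utf-8", errors="ignore").splitlines():
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            deduped.append(url)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / raw_path.name  # same name, different (deduped) folder
    out_path.write_text("\n".join(deduped) + "\n", encoding="utf-8")
    return out_path
```

Keeping the output in a separate folder is the key point: the identify project only watches the deduped folder, so the much larger raw file is never processed.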
Thank you
Comments
So
1. it will save to another file,
2. I don't think it is a good idea to make it delete the file; in case we want to use the file again, or the saved file gets corrupted, we at least have a backup for a few days,
3. appending to an existing save is also good, as long as PI knows which one was the last URL.
Do you have any solution for importing the unidentified file automatically?
Thank you
I tried it already, but the file name is different from the one in a site list. It didn't pull anything.
Hi, I see the dedupe option with a different save folder implemented! You guys rock!
Thank you very much!
I have another suggestion, or more of a question. My jobs have been running for a while now. I save the unidentified URLs as well, as a lot of good links end up there.
So as long as SER can't monitor the folder like a global site list (I also tried changing the file name to name only), I need to import the file manually, which is not a problem.
The problem starts here: the file is currently over 1.5 GB. My VPS is strong, but it was struggling to import this amount of URLs into a project, and I ended up restarting the VPS.
So is there any option to apply a size/URL/line limit to the unidentified file, so that when it is reached, PI starts a new one?
Let's say 200 MB, or even better if we could set the limit ourselves.
So we would have
unidentified.txt - 200 MB
and the new one would start like
unidentified01.txt
It would also be beneficial for space management: as we finish with the files, we can delete the ones we don't need.
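The rollover scheme suggested above could look something like this. The 200 MB limit and the unidentified01.txt naming follow the suggestion; treat the whole thing as a hypothetical sketch, not PI's real behavior.

```python
# Sketch: append URLs to unidentified.txt until a configurable byte
# limit is reached, then roll over to unidentified01.txt,
# unidentified02.txt, and so on.
from pathlib import Path

LIMIT = 200 * 1024 * 1024  # 200 MB; ideally a user-configurable setting

def current_target(folder: Path, limit: int = LIMIT) -> Path:
    """Return the first unidentified file that is still under the limit."""
    n = 0
    while True:
        name = "unidentified.txt" if n == 0 else f"unidentified{n:02d}.txt"
        path = folder / name
        if not path.exists() or path.stat().st_size < limit:
            return path
        n += 1

def append_url(folder: Path, url: str, limit: int = LIMIT) -> Path:
    """Append one URL, rolling over to a new file once the limit is hit."""
    folder.mkdir(parents=True, exist_ok=True)
    target = current_target(folder, limit)
    with target.open("a", encoding="utf-8") as f:
        f.write(url + "\n")
    return target
```

Because each file is capped, a finished file can be imported into a project once and then deleted, which is exactly the space-management benefit described above.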
Thank you