New "identify & sort in" feature - Show # of URLs detected per engine
When we scrape URLs with external scrapers for selected engines only, for example Drupal, lots of URLs for other engines get detected too (the most common and annoying is "General Blogs", which has the lowest verification rate).
So my request, which I think may be super useful to lots of other guys here too, is this: at the end of "identify & sort in" on a custom scraped URL list, show stats on how many URLs were detected for each engine, like:
732 URLs for Drupal
1000 URLs for General Blogs
322 URLs for Moodle
You get the point, I think. Also, give us a checkbox to the left of each detected engine so we can choose whether to save URLs from that engine to our custom file. For example, when I am scraping only for the Drupal engine and only need Drupal URLs, running the other detected engines in my projects is a waste of resources, because I am not going to post to them in the first place. If we could instead drop them at "identify and sort in" time, only the URLs from the engines we need would be run in the project.
So overall efficiency is increased as well.
And most importantly, we could see how well we are scraping and how many engine-specific targeted URLs we are getting, because right now the general list shows no numbers for particular engines. So I feel this would certainly be a great addition!
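To make the request a bit more concrete, here is a rough Python sketch of the behaviour I have in mind. It is only an illustration, not how the tool works internally: the `identified` list of (URL, engine) pairs, the example URLs, and the function names `engine_stats` and `keep_selected` are all made up, and I am assuming the identification step has already labelled each URL with an engine.

```python
from collections import Counter

# Hypothetical input: (url, engine) pairs as they might look after
# identification has finished. The URLs are placeholders.
identified = [
    ("http://example.com/node/1", "Drupal"),
    ("http://example.org/blog/post", "General Blogs"),
    ("http://example.net/node/7", "Drupal"),
    ("http://example.edu/course/view.php", "Moodle"),
]


def engine_stats(pairs):
    """Count how many URLs were detected for each engine."""
    return Counter(engine for _, engine in pairs)


def keep_selected(pairs, selected_engines):
    """Keep only the URLs whose engine was ticked (the checkbox idea)."""
    selected = set(selected_engines)
    return [url for url, engine in pairs if engine in selected]


# Show the per-engine stats at the end of the run.
stats = engine_stats(identified)
for engine, count in stats.most_common():
    print(f"{count} URLs for {engine}")

# Save only the engines the user ticked, here just Drupal.
drupal_only = keep_selected(identified, ["Drupal"])
```

The point is just that the counts and the filter both come from the same engine labels, so showing one and offering the other should not need a second pass over the list.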
Sorry if this is confusing. If you have any questions, @Sven, please feel free to ask.
Thank you.
Comments
It would be nice if there were a standalone version of identify and sort, so it could be run on different computers.
You could sell it as an add-on.