CUSTOM GLOBAL LISTS
Yeah, I know some of you guys have been waiting for me to come out with this here. Now that CB is here and relatively stable, I want to shift the conversation to a feature I think will solve a LOT of the big issues we're having with submission quality.
SER is essentially a plain-text database of lists segmented into platforms. Something I've noticed is that the larger my database gets (now 2.5 million... yikes!), the more threads get eaten up filtering the low PR targets out of my global list each run. Sometimes half an hour goes by before I get anything higher than a PR2.
So here's a no-brainer... What we need is a way to sort and send selections of links to our own separate, named lists. Lists that are totally separate from the global list, because otherwise it just churns through hundreds of thousands of low PR links in the global list looking for high enough PRs. In my case, it's sometimes an hour before it stumbles on a few decent targets. At least that's what it's like with a big fatty-mcfatterson list.
So is it just me noticing this or does the PR filtering process seem grossly redundant? It wastes all of those threads rechecking PRs from the global list which it should already have recorded from when it first identified the platform (or at least on first submission attempt).
I've been wondering for a while now... why can't I specifically target PR links instead of having to filter through the whole global list over again each time? Why not use a database system like every other submission software instead of just churning through a plain-text grab-bag of platform links?
I love the platform segmentation, but now how about quality segmentation?
If I want to submit to my highest quality links efficiently, the only way I know to do this is to filter every one of my global list segments with ScrapeBox and set up an entirely separate instance of SER to churn through it.
Comments
The simple solution is that rather than the global site list being simply a list of URLs, it should be a CSV that contains all this information (or better still, a database that can be exported to a CSV if required). From a user perspective, you set your project up as normal and it instantly knows which sites meet your criteria (even with a couple million sites, if it takes over a second it won't be by much). It shouldn't be too hard from a programming perspective either (a simple select statement).
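A minimal sketch of what that select could look like, assuming a hypothetical sitelist.csv with url, engine and pr columns (SER does not actually store its lists this way, this is just to illustrate the idea):

```python
import csv
import sqlite3

# Load the hypothetical sitelist.csv (url, engine, pr) into an in-memory table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sites (url TEXT, engine TEXT, pr INTEGER)")

with open("sitelist.csv", newline="", encoding="utf-8") as f:
    rows = ((r["url"], r["engine"], int(r["pr"])) for r in csv.DictReader(f))
    conn.executemany("INSERT INTO sites VALUES (?, ?, ?)", rows)
conn.execute("CREATE INDEX idx_engine_pr ON sites (engine, pr)")

# The "project criteria" become a select: e.g. Article engines with PR >= 3
targets = [row[0] for row in conn.execute(
    "SELECT url FROM sites WHERE engine = ? AND pr >= ?", ("Article", 3))]
print(len(targets), "targets match the filter")
```

With an index on (engine, pr), pulling only the high-PR targets for a project is a single indexed lookup instead of re-checking every URL in the list.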
Switching to an embeddable database brings its own issues, such as concurrency. When you write to any of these databases, it gets locked, which means you can't have multiple threads writing to it at once or the writes will fail. GSA does writes so fast that Sven would probably have to build some sort of queue system to keep up, and that adds way more work than simple selects and inserts/updates/deletes.
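For what it's worth, a single-writer queue of that kind is only a few lines. Here is a rough, hypothetical sketch (table layout and names invented for illustration, not anything SER actually does):

```python
import queue
import sqlite3
import threading

write_q: "queue.Queue" = queue.Queue()

def writer(db_path: str) -> None:
    # The only thread that ever touches the database connection
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS sites (url TEXT PRIMARY KEY, pr INTEGER)")
    while True:
        item = write_q.get()
        if item is None:          # shutdown signal
            break
        url, pr = item
        conn.execute("INSERT OR REPLACE INTO sites VALUES (?, ?)", (url, pr))
        conn.commit()
    conn.close()

t = threading.Thread(target=writer, args=("sites.db",), daemon=True)
t.start()

# Any submission thread just enqueues and never blocks on the lock:
write_q.put(("http://example.com/blog", 3))
write_q.put(None)   # tell the writer to finish
t.join()
```

All worker threads enqueue; only the writer thread holds the connection, so the database lock is never contended. It works, but it is still extra machinery compared to appending a line to a .TXT file.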
SENuke, MS and UD all use database systems. Now I challenge anyone to try to add "2.5 MILLION of ANYTHING" to them and see if they even load. SENuke starts to freeze and lag when you add 30K+ sites.
There is a reason Xrumer, ScrapeBox and GSA are faster than a LOT of these other programs out there, and that's mainly because of the way they use flat .TXT files. If Xrumer used an embeddable database for everything, it would not be anywhere near as fast.
Leave the coding part to me! Using a database is not required in most cases. It is, however, for massive amounts of data that you have to search through a lot. That's not the case for an SEO program like SER. Programs where this is required are things like our GENOM2005 program, where DNA analysis produces a lot of data. Believe me, it is not needed and I will not add it.
Everything can be sped up, but there is not much I can get out of that, as everything here is already very optimized.
1. sort lines
2. go from top to bottom and delete entries if two lines are the same (same URL) or the domain is the same.
I don't see any optimization here.
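In other words, the cleanup is just a sort plus a duplicate-removal pass. A rough Python equivalent, assuming a plain one-URL-per-line site list file (the file name is only an example):

```python
from urllib.parse import urlparse

path = "sitelist_Article-Engine1.txt"   # example file name only

# Step 1: sort the lines (the set() also drops exact duplicate URLs)
with open(path, encoding="utf-8", errors="ignore") as f:
    lines = sorted(set(line.strip() for line in f if line.strip()))

# Step 2: walk top to bottom and drop a line if its domain was already seen
seen_domains = set()
unique = []
for url in lines:
    domain = urlparse(url).netloc.lower() or url
    if domain in seen_domains:
        continue
    seen_domains.add(domain)
    unique.append(url)

with open(path, "w", encoding="utf-8") as f:
    f.write("\n".join(unique) + "\n")
```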
Slight side note question while we are on the topic... if a site is verified, but then later fails, is it removed from the verified list? (Or should I leave this for another thread?)
>Slight side note question while we are on the topic... if a site is verified, but then later fails, is it removed from the verified list? (Or should I leave this for another thread?)
No, it stays there.
The problem with saving all kinds of information for a URL is having a proper way to keep it updated. If you save PR1 for it and it is actually now PR0, you have another problem.
Maybe I am not thinking complex enough here, but let me give an example:
Folder PR 1: Sitelist Engine 1, Sitelist Engine 2 ...
Folder PR 2: Sitelist Engine 1, Sitelist Engine 2 ...
...
If the PR of a specific site changes, you could just delete the entry from that list and add it to the corresponding site list in the other PR folder; a rough sketch of the move is below.
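Something like this, where the PR_1/PR_2 folder layout and the file names are hypothetical, just to illustrate moving an entry between PR folders:

```python
from pathlib import Path

def move_url(url: str, engine_file: str, old_pr: int, new_pr: int, root: Path) -> None:
    old_list = root / f"PR_{old_pr}" / engine_file
    new_list = root / f"PR_{new_pr}" / engine_file

    # Drop the URL from the old PR folder's site list (if that list exists)
    if old_list.exists():
        kept = [line for line in old_list.read_text(encoding="utf-8").splitlines()
                if line.strip() != url]
        old_list.write_text("\n".join(kept) + "\n", encoding="utf-8")

    # Append it to the matching engine list in the new PR folder
    new_list.parent.mkdir(parents=True, exist_ok=True)
    with new_list.open("a", encoding="utf-8") as f:
        f.write(url + "\n")

# Example: a site whose PR dropped from 2 to 1
move_url("http://example.com/blog", "Sitelist_Engine1.txt", 2, 1, Path("site_lists"))
```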