@sonic81 - The more 'no limits' projects you have, the higher your LPM will be. Everyone thinks a person with 10 'no limits' projects will have the same LPM as a person with 30 'no limits' projects. It isn't so.
Personally I only have 6 engines selected; I have been doing this for a while and I have huge site lists.
I've been reading this thread with great interest since my verified per day fell drastically after I imported a lot of scraped sites, and I have an idea why. I just hope you guys can help me so my LpM can get back to normal - or even better.
I tried to compare stats to weed out the platforms that didn't perform. Stats were the same for identified, submitted and verified. Then I noticed that GSA (at least my install of GSA) saves these URLs in the same GSA root folder. Shouldn't they be in different folders? As it is right now, choosing a project to pick only from the verified site list doesn't really work, as they are all mixed in together - am I right about this?
Just open that folder and you will see there are all kinds of different files labeled as 'identified', 'submitted' and so on.
To increase your submission rate just use the 'submitted' or 'verified' list, depending on how big your lists are. To do that, uncheck 'save identified sites to' in the options shown in your screenshot, or select what you want to use in your project options.
@Ozz that's the problem - I don't have any files labelled like that. I only have files starting with:
sitelist_Article-BuddyPress
sitelist_Guestbook-TPK Guestbook
etc.
Nothing with identified, submitted, verified etc.
That is probably also why, when I choose to view the stats for identified, submitted and verified, the numbers are all the same - because all URLs are being saved to the same files (sitelist_*)?
So - is my GSA installation bugged, or do I just need to point it to new subfolders?
I hope I can help you here. Don't worry about the file names - look at the folders the files lie in. They should be verified, submitted, identified and failed. You will have the same file names in each folder. I hope that helps.
Have a look at my screenshot. Identified, submitted and verified URLs are ALL being saved in the same root dir of GSA, and there is only one sitelist per platform. It looks to me like ALL 3 types of URLs are being written to the same sitelist files, which renders the whole idea of separating them useless.
So - is my GSA FUBAR? Do I need to reinstall, or am I missing something here? As I wrote before, when I view stats in the advanced section, the numbers for identified, submitted and verified are all the same - which of course they shouldn't be.
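If anyone wants to sanity-check what their own install is doing, here is a rough sketch of that check in Python. The base path is an assumption - point it at wherever your SER site lists actually live - and it only reads, nothing gets changed:

```python
from pathlib import Path

base = Path(r"C:\GSA\site_lists")  # hypothetical location - adjust to your setup
expected = ["identified", "submitted", "verified", "failed"]

for name in expected:
    folder = base / name
    if folder.is_dir():
        count = len(list(folder.glob("sitelist_*")))
        print(f"{name}: {count} sitelist files")
    else:
        print(f"{name}: folder missing")

# If all four folders are missing and the sitelist_* files sit directly in the
# base folder, every project is reading and writing the same mixed list.
loose = len(list(base.glob("sitelist_*")))
print(f"sitelist files sitting loose in the root: {loose}")
```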
@LeeG Exactly - when I bought CB it was the first time in many years that I bought a piece of software or hardware without researching reviews first. I just knew that Sven and his team would produce quality. No need to check up on it.
I think I bought it around the end of January 2012.
One last question, which I asked in another thread actually, but never got an answer to.
5 projects using the exact same keywords. Would it not make sense to have one project scraping and the other 4 projects just using the global site list, without scraping?
I run a similar idea to that, but on a much bigger scale.
Everything scrapes from the engines and also uses the global site lists.
I have managed to use every Google listed in the process. That's four random Googles plus the international .com version, across all my projects.
Sometimes you can be on a time limit block on a search engine.
So you can scrape once and get zero results.
Scrape again and get some results.
And if Google does the dirty and rolls out a new algo, you might be running a few thousand keywords behind on one of the other projects and pick up the new sites that have jumped places.
That's my theory.
LeeG is very open minded when it comes to sharing his techniques, but I don't think he will give you a blueprint. I can't speak for him, but for me it isn't motivating at all if everything has to be repeated again and again. Just read, learn and understand his techniques.
Furthermore, a screenshot of your status bar helps no one, as we don't know your settings. So do your homework, implement what you learn from this board, and modify and test your setup first.
Been using Parallels + Win XP on a Mac mini 2012, which was a bit slow to work on sometimes.
The submit % rate is still poorer when using junk / spam links...
@sonic, try experimenting with the number of SEs you're using.
I personally prefer to use a lot fewer than you have selected.
Over 150 fewer than you're using.
I am looking at the figures posted above by @doubleup and @LeeG.
I can honestly say that getting these numbers isn't really that difficult. Good proxies, a good rig, scale down the engines/platforms that don't have high success rates, and you can get these numbers as well.
Another thing: @doubleup is running at 800 threads. He is pushing the hell out of that machine. What is amazing to me is the low RAM usage. I have had the software all the way up to 2000 threads, and the 2GB limitation just eats me alive.
These days though I have found that it isn't about the threads so much as it is about the proxy load. I have several machines that run at 150-250 LPM with thread counts between 200-300. The machines run more stably that way and I don't have to worry about them locking up when I am away.
I am thinking about other things though. Verified links per minute would be a more valuable stat. Just my opinion.
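For what it's worth, the arithmetic behind a verified-per-minute figure is trivial. A minimal sketch, where the counter values are made-up examples of what you might read off the status bar at the start and end of an hour:

```python
def per_minute(start_count, end_count, minutes):
    # Rate over the window, from two counter readings.
    return (end_count - start_count) / minutes

window = 60  # minutes
lpm = per_minute(10_000, 19_000, window)   # submitted counter at start/end of the hour
vpm = per_minute(1_200, 2_100, window)     # verified counter at start/end of the hour
print(f"LPM: {lpm:.1f}, verified per minute: {vpm:.1f}, ratio: {vpm / lpm:.2%}")
```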
4 samples from different machines.
I only run 5 search engines on any tier.
Or to be more exact, I only use one engine, which is Google, and then choose four random countries plus the .com version.
That reduces getting your proxies blocked for too many search queries.
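Just to illustrate that selection idea (the engine labels below are placeholders, not SER's exact names):

```python
import random

# Placeholder country-specific Google labels.
country_googles = ["Google DE", "Google UK", "Google FR", "Google ES",
                   "Google IT", "Google NL", "Google SE", "Google PL"]

# One fixed .com engine plus four random country versions per project.
selected = ["Google COM"] + random.sample(country_googles, 4)
print("search engines for this project:", selected)
```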
20hrs of running, I'm still under the 2GB threshold.
I'm now banging out 1/4 million submissions daily.
It's just that you need to use the magic formula of
time + effort = high LPM and verified
Time: analysing your results to see which engines you get good results submitting to.
Effort: making the changes so you only submit to those engines.
There is no magic button in SER to do that.
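To make the "time analysing your results" part concrete, here is a rough sketch that ranks engines by how well submissions turn into verified links. The per-engine numbers and the 5% cut-off are made-up examples - plug in your own stats from SER:

```python
# Made-up per-engine stats: engine name -> (submitted, verified).
stats = {
    "Article-BuddyPress": (5400, 310),
    "Guestbook-TPK Guestbook": (12000, 240),
    "Social Bookmark-Pligg": (800, 52),
    "Forum-PHPBB": (3000, 45),
}

# Rank engines by verification rate (verified / submitted).
ranked = sorted(stats.items(), key=lambda kv: kv[1][1] / kv[1][0], reverse=True)

CUTOFF = 0.05  # assumed 5% threshold - tune to taste
for engine, (submitted, verified) in ranked:
    rate = verified / submitted
    verdict = "keep" if rate >= CUTOFF else "consider disabling"
    print(f"{engine:25s} {rate:6.1%} ({verified}/{submitted}) -> {verdict}")
```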
Depending on the search engines and the words you're trying to rank for, using just any Google won't help.
Some words get used in most languages (football, sex, viagra etc)
My site lists are huge - GSA must be struggling hard with them. I only took notice today, when I saw how efficient some of you guys are with GSA.
Knew there was a reason behind 'remove duplicate websites / URLs' :-(
@claus10 sounds like you're one of the older hands at SER.
In the early days, that's how SER sorted the lists.
Another quick way to do it, which will take about ten minutes, maybe a bit longer depending on list sizes:
Go into the location where they are stored at present.
If the folder names are not there, right click and create a new folder named 'verified', then repeat three more times for the other names (submitted, identified, failed).
Then set those folders as the locations where the files should be saved.
Next, the time-saving clever bit:
Hit the Tools button on the Advanced tab > Add URLs from projects > submitted, and again for verified.
Tools again > Remove duplicate URLs.
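If you would rather script that cleanup outside SER, here is a sketch of the same idea. The base path is an assumption, and you would want SER closed and your lists backed up before touching the files:

```python
from pathlib import Path

base = Path(r"C:\GSA\site_lists")  # hypothetical location - adjust to your setup

# Create the four folders if they are missing.
for name in ["identified", "submitted", "verified", "failed"]:
    (base / name).mkdir(parents=True, exist_ok=True)

def dedupe_file(path: Path) -> None:
    """Keep only the first occurrence of each URL in a sitelist file."""
    seen = set()
    unique = []
    for line in path.read_text(encoding="utf-8", errors="ignore").splitlines():
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            unique.append(url)
    path.write_text("\n".join(unique) + "\n", encoding="utf-8")

# Dedupe every sitelist file under the base folder, including the subfolders.
for sitelist in base.rglob("sitelist_*"):
    if sitelist.is_file():
        dedupe_file(sitelist)
        print(f"deduped {sitelist}")
```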
Yeah - I've had it for a loooooooooooong time - before the BHW thread reached 5 pages :-)
Thanks for the import tip!
You're just plain showing off with your copy.
One install on your original PC / VPS and still running strong, only ever using the update button.
Testament to Sven's programming.
Which BHW thread, the sales thread or the original discussion?
I'm trying to maximize the effect of 40 threads running on very quick proxies, so I'd like to keep the scraping as low as possible. Does my approach make sense then (disable scraping on 4/5)?
Try it for 24hrs and see how it goes.
You have an idea of your present daily submitted and verified figures, which should take a boost after you make the changes to the global site lists.