Max thread usage of SER?
shaun
https://www.youtube.com/ShaunMarrs
Probably one for @Sven, but if anyone else is able to help, please chip in.
Is there an upper thread count limit for SER, or for 32-bit programs in general?
I found this online, which suggests there may be a limit of 2048 - https://blogs.technet.microsoft.com/markrussinovich/2009/07/05/pushing-the-limits-of-windows-processes-and-threads/
I have been slowly increasing my thread count bit by bit, and it runs fine at around 1600 displayed threads in the tool, but Resource Monitor shows SER using many more than this. If I set it any higher than 1600 threads, SER seems to randomly spike its actual thread usage in Resource Monitor and then starts losing active threads, dropping to around 200. The spike displayed in Resource Monitor suggests it went higher than 2048.
Also, the decrease shouldn't have anything to do with scraping/verifying/email checking, as the projects I am running are not doing any of these actions; they simply post, nothing else.
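If anyone wants to sanity-check that 2048 figure outside of SER, the Russinovich article boils down to this: each thread reserves stack address space (1 MB by default), and a 32-bit process only has about 2 GB of user address space, so 2 GB / 1 MB ≈ 2,000 threads. Here is a rough Python sketch of that probe (run it on a 32-bit interpreter to mimic SER's situation; the cap just keeps a 64-bit run from going on forever):

```python
import threading

def probe_thread_limit(stack_bytes=1 << 20, cap=5000):
    """Create parked threads until creation fails or `cap` is hit."""
    threading.stack_size(stack_bytes)  # per-thread stack reservation (1 MB here)
    stop = threading.Event()
    threads = []
    try:
        while len(threads) < cap:
            t = threading.Thread(target=stop.wait)  # thread just parks until released
            t.start()
            threads.append(t)
    except (RuntimeError, MemoryError):
        pass  # the OS refused to create another thread
    count = len(threads)
    stop.set()  # release the parked threads
    for t in threads:
        t.join()
    return count

if __name__ == "__main__":
    # On a 32-bit process with 1 MB stacks, expect roughly 2 GB / 1 MB ~= 2000.
    print(probe_thread_limit())
```

Per the same article, lowering the per-thread stack reservation raises the ceiling accordingly, which is presumably how a 32-bit tool can get anywhere near these numbers at all.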
Comments
I think for efficiency's sake you should keep it below 1,000 threads. Sometimes less is more.
Yes, as you mentioned though, a big factor is proxies when increasing thread count, and you may also want to adjust the HTML timeout setting, giving it a little longer to load pages - perhaps 120 seconds.
It is a mixture of both; SER supports both contextual and non-contextual platforms without email requirements.
Also, while it's a really interesting experiment, you're probably better off optimizing your SER pipeline first if your end goal is more LPM: eliminate engines that don't get verified, limit email verification, etc. I'd guess you're already doing that if you're experimenting with 1600 threads, but 280 LPM with that insane number of threads seems pretty low (assuming you're using a verified, submitted or even identified list), especially since you're not using proxies.
Don't know how that compares to trackbacks, pingbacks and other junk link types in terms of speed, but that is just the result of turning off engines with a low identified-to-verified ratio.
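If you want to put a number on that cull, something like this is all it takes. The engine names and counts below are made up for illustration; in practice you'd plug in the identified and verified totals from SER's own statistics and pick whatever cutoff your data supports:

```python
# Hypothetical per-engine counts: engine -> (identified, verified).
stats = {
    "Article Engine A": (5000, 900),
    "Blog Comment B": (20000, 400),
    "Trackback C": (8000, 50),
}

MIN_RATIO = 0.05  # arbitrary cutoff; tune it to your own numbers

for engine, (identified, verified) in stats.items():
    ratio = verified / identified if identified else 0.0
    verdict = "keep" if ratio >= MIN_RATIO else "disable"
    print(f"{engine}: {ratio:.1%} verified/identified -> {verdict}")
```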
Articles obviously aren't as fast, but they still run at pretty decent speeds once you cherry-pick the engines.
Now, how did you find CPU usage scaled when increasing the threads? Did it start growing exponentially after a certain number of threads? Because it might be cheaper to split a dedi into VMs and run multiple instances of SER separately with a lower number of threads, rather than trying to run over 1,000 threads on a single machine.
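One way to measure that outside of SER is to benchmark a fixed batch of fake I/O-bound tasks at increasing thread counts and see where throughput flattens out. A rough Python sketch (the sleep stands in for a network round trip; the thread counts and task sizes are illustrative, not SER's):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(_):
    time.sleep(0.01)  # stand-in for one network round trip

def throughput(n_threads, n_tasks=2000):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(fake_request, range(n_tasks)))
    return n_tasks / (time.perf_counter() - start)

# If tasks/sec stops climbing past some thread count, the extra threads are
# only buying scheduler and memory overhead.
for n in (100, 300, 600, 1200):
    print(f"{n:>5} threads: {throughput(n):7.0f} tasks/sec")
```

If the curve flattens well before 1,000 threads, splitting into several smaller instances is probably the cheaper route.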
Also, while SER certainly is VERY well optimized (because I can't imagine any other tool in which you could even set threads that high), imo you can't really expect a piece of software like this to operate effectively on 2000 threads.
That all being said, props for the "let's test it" approach; you're running a really interesting experiment.
Right at this moment, for example, I'm processing a deduped identified list straight from GSA Platform Identifier of about 100,000 URLs, consisting of about 70% blog comments and 30% articles, using 3 projects and 300 threads at 344 LPM (it's been about 20 minutes and the LPM is only going up).
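For what it's worth, the dedupe step itself is trivial to reproduce outside of GSA Platform Identifier. A minimal sketch (the file names are hypothetical, and it dedupes on the exact URL string rather than doing any normalization):

```python
def dedupe_urls(path_in, path_out):
    """Copy path_in to path_out, keeping only the first occurrence of each URL."""
    seen = set()
    with open(path_in, encoding="utf-8") as fin, \
         open(path_out, "w", encoding="utf-8") as fout:
        for line in fin:
            url = line.strip()
            if url and url not in seen:
                seen.add(url)
                fout.write(url + "\n")

# Hypothetical file names; 100k URLs fits comfortably in an in-memory set.
dedupe_urls("identified.txt", "identified_deduped.txt")
```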
No, that is the result of the latest article-site scrape run, which just happens to be mostly blog comments (because every run seems to be mostly blog comments, no matter the footprints).
@shaun
Yeah, a single non-relevant URL.