Max thread usage of SER?
shaun
https://www.youtube.com/ShaunMarrs
Probably one for @Sven, but if anyone else is able to help, please chip in.
Is there an upper thread count limit for SER, or for 32-bit programs in general?
I found this online, which suggests there may be a limit of 2048 - https://blogs.technet.microsoft.com/markrussinovich/2009/07/05/pushing-the-limits-of-windows-processes-and-threads/
I have been slowly increasing my thread count bit by bit, and it runs fine at around 1600 displayed threads in the tool, but Resource Monitor shows SER using many more than this. If I set it any higher than 1600 threads, SER seems to randomly spike its actual thread usage in Resource Monitor and then starts losing active threads, dropping to around 200. The spike displayed in Resource Monitor suggests it went higher than 2048.
Also, the decrease shouldn't have anything to do with scraping/verifying/email checking, as the projects I am running are not doing any of these actions; they simply post, nothing else.
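If anyone wants to sanity-check that 2048 figure outside of SER, the Russinovich article boils down to this: each thread reserves stack address space (1 MB by default), and a 32-bit process only has about 2 GB of user address space, so 2 GB / 1 MB ≈ 2,000 threads. Here is a rough Python sketch of that probe (run it on a 32-bit interpreter to mimic SER's situation; the cap just keeps a 64-bit run from going on forever):

```python
import threading

def probe_thread_limit(stack_bytes=1 << 20, cap=5000):
    """Create parked threads until creation fails or `cap` is hit."""
    threading.stack_size(stack_bytes)  # per-thread stack reservation (1 MB here)
    stop = threading.Event()
    threads = []
    try:
        while len(threads) < cap:
            t = threading.Thread(target=stop.wait)  # thread just parks until released
            t.start()
            threads.append(t)
    except (RuntimeError, MemoryError):
        pass  # the OS refused to create another thread
    count = len(threads)
    stop.set()  # release the parked threads
    for t in threads:
        t.join()
    return count

if __name__ == "__main__":
    # On a 32-bit process with 1 MB stacks, expect roughly 2 GB / 1 MB ~= 2000.
    print(probe_thread_limit())
```

Per the same article, lowering the per-thread stack reservation raises the ceiling accordingly, which is presumably how a 32-bit tool can get anywhere near these numbers at all.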
Comments
I think for efficiency's sake you should keep it below 1,000 threads. Sometimes less is more.
Yes, as you mentioned though, a big factor is proxies when increasing thread count, and you may also want to adjust the HTML timeout setting, giving it a little longer to load pages - perhaps 120 seconds.
It is a mixture of both; SER supports both contextual and non-contextual platforms without email requirements.
Also, while it's a really interesting experiment, you're probably better off optimizing your SER pipeline first if your end goal is more LPM: eliminate engines that don't get verified, limit email verification, etc. I'd guess you're already doing that if you're experimenting with 1600 threads, but 280 LPM with that insane number of threads seems pretty low (assuming you're using a verified, submitted or even identified list), especially since you're not using proxies.
Don't know how that compares to trackbacks, pingbacks and other junk link types in terms of speed, but that is just the result of turning off engines with a low identified-to-verified ratio.
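If you want to put a number on that cull, something like this is all it takes. The engine names and counts below are made up for illustration; in practice you'd plug in the identified and verified totals from SER's own statistics and pick whatever cutoff your data supports:

```python
# Hypothetical per-engine counts: engine -> (identified, verified).
stats = {
    "Article Engine A": (5000, 900),
    "Blog Comment B": (20000, 400),
    "Trackback C": (8000, 50),
}

MIN_RATIO = 0.05  # arbitrary cutoff; tune it to your own numbers

for engine, (identified, verified) in stats.items():
    ratio = verified / identified if identified else 0.0
    verdict = "keep" if ratio >= MIN_RATIO else "disable"
    print(f"{engine}: {ratio:.1%} verified/identified -> {verdict}")
```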
Articles obviously aren't as fast, but they still run at pretty decent speeds once you cherry-pick the engines.
Now, how did you find CPU usage scaled when increasing the threads? Did it start growing exponentially after a certain number of threads? Because it might be cheaper to split a dedi into VMs and run multiple instances of SER separately with a lower number of threads, rather than trying to run over 1,000 threads on a single machine.
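One way to measure that outside of SER is to benchmark a fixed batch of fake I/O-bound tasks at increasing thread counts and see where throughput flattens out. A rough Python sketch (the sleep stands in for a network round trip; the thread counts and task sizes are illustrative, not SER's):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_request(_):
    time.sleep(0.01)  # stand-in for one network round trip

def throughput(n_threads, n_tasks=2000):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        list(pool.map(fake_request, range(n_tasks)))
    return n_tasks / (time.perf_counter() - start)

# If tasks/sec stops climbing past some thread count, the extra threads are
# only buying scheduler and memory overhead.
for n in (100, 300, 600, 1200):
    print(f"{n:>5} threads: {throughput(n):7.0f} tasks/sec")
```

If the curve flattens well before 1,000 threads, splitting into several smaller instances is probably the cheaper route.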
Also, while SER certainly is VERY well optimized (because I can't imagine any other tool in which you could even set threads that high), imo you can't really expect a piece of software like this to operate effectively on 2000 threads.
That all being said, props for the "let's test it" approach; you're running a really interesting experiment.
Right at this moment, for example, I'm processing a deduped identified list straight from GSA Platform Identifier of about 100,000 URLs, consisting of about 70% blog comments and 30% articles, using 3 projects and 300 threads at 344 LPM (it's been about 20 minutes and the LPM is only going up).
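For what it's worth, the dedupe step itself is trivial to reproduce outside of GSA Platform Identifier. A minimal sketch (the file names are hypothetical, and it dedupes on the exact URL string rather than doing any normalization):

```python
def dedupe_urls(path_in, path_out):
    """Copy path_in to path_out, keeping only the first occurrence of each URL."""
    seen = set()
    with open(path_in, encoding="utf-8") as fin, \
         open(path_out, "w", encoding="utf-8") as fout:
        for line in fin:
            url = line.strip()
            if url and url not in seen:
                seen.add(url)
                fout.write(url + "\n")

# Hypothetical file names; 100k URLs fits comfortably in an in-memory set.
dedupe_urls("identified.txt", "identified_deduped.txt")
```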
No, that is the result of the latest article-site scrape run, which just happens to be mostly blog comments (because every run seems to be mostly blog comments, no matter the footprints).
@shaun
Yeah, a single non-relevant URL.