Increase proxy thread loading speed (proxy test)

KaineKaine thebestindexer.com
edited January 11 in GSA Proxy Scraper

I need to test millions of proxies (anonymity test only, no additional tests); the file is 25 MB.
I have a fast fiber connection (a full 1 Gbps) and would like to use around 700 threads, but I'm running into a problem.
Threads aren't starting fast enough to reach the number of concurrent tests my connection can handle (I could go higher without a problem).
By the time new test threads have started, earlier tests have already completed, so I can't get past about 250 concurrent connections.
I think the loading thread is not fast enough, or there should be several of them to compensate: for example, 2 or 4 loading threads sharing the full set of proxies to be tested.
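
The ceiling described here matches what Little's law predicts: the average number of tests in flight equals the rate at which tests start multiplied by the average test duration. Below is a quick sketch of that arithmetic. The test duration and start rate are made-up values chosen for illustration, since the post states only the 250 and 700 figures.

```python
def steady_state_concurrency(starts_per_second: float, mean_test_seconds: float) -> float:
    """Little's law: average tests in flight = start rate * average test duration."""
    return starts_per_second * mean_test_seconds


def required_start_rate(target_concurrency: int, mean_test_seconds: float) -> float:
    """Start rate needed to keep `target_concurrency` tests in flight."""
    return target_concurrency / mean_test_seconds


# Hypothetical figures for illustration only; the post gives neither value.
MEAN_TEST_SECONDS = 5.0   # assumed average duration of one proxy test
STARTS_PER_SECOND = 50.0  # assumed rate at which one loading thread starts tests

print(steady_state_concurrency(STARTS_PER_SECOND, MEAN_TEST_SECONDS))  # 250.0
print(required_start_rate(700, MEAN_TEST_SECONDS))                     # 140.0
```

Under these assumed numbers, holding 700 tests in flight would need roughly 2.8 times the start rate, whether that comes from a faster loader or from several loaders working in parallel.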

What do you think? Is this feasible?

I imagine many of us end up in this kind of situation (running only the bare anonymity test), and performance would surely improve a great deal.

Sorry for my extreme optimization fanaticism. I expect a gain of at least 2x to 3x.

EDIT: The number of loading threads could be adjusted automatically by comparing the actual thread count with the desired thread count. If the actual thread count is less than 10, an additional loading thread is added.

This implies that, at the very start of the process, the full set of proxies should not be divided (and therefore allocated) evenly among the loading threads. Instead, each loading thread takes a fixed number of proxies at a time, for example 500 proxies per loading thread.
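
The idea of several loading threads that each claim a fixed-size batch could be sketched roughly as follows. This is not GSA Proxy Scraper's code. The names `BatchSource` and `check_all` and the `check` callback are hypothetical, and the number of loaders is fixed here rather than adjusted automatically as the EDIT proposes.

```python
import threading
from typing import Callable, Dict, List

BATCH_SIZE = 500  # proxies per claim, the figure suggested in the post


class BatchSource:
    """Hands out fixed-size batches to whichever loader thread asks next."""

    def __init__(self, proxies: List[str], batch_size: int = BATCH_SIZE) -> None:
        self._proxies = proxies
        self._batch_size = batch_size
        self._next = 0
        self._lock = threading.Lock()

    def claim(self) -> List[str]:
        """Return the next batch, or an empty list when nothing is left."""
        with self._lock:
            start = self._next
            self._next = min(start + self._batch_size, len(self._proxies))
            return self._proxies[start:self._next]


def check_all(
    proxies: List[str],
    check: Callable[[str], bool],
    num_loaders: int = 4,
    max_concurrent: int = 700,
) -> Dict[str, bool]:
    """Test every proxy using several loader threads and a cap on running tests."""
    source = BatchSource(proxies)
    slots = threading.Semaphore(max_concurrent)  # one permit per running test
    results: Dict[str, bool] = {}
    results_lock = threading.Lock()

    def worker(proxy: str) -> None:
        try:
            ok = check(proxy)
            with results_lock:
                results[proxy] = ok
        finally:
            slots.release()  # free the slot even if check() raises

    def loader() -> None:
        while True:
            batch = source.claim()
            if not batch:
                return
            for proxy in batch:
                slots.acquire()  # blocks while max_concurrent tests are running
                threading.Thread(target=worker, args=(proxy,), daemon=True).start()

    loaders = [threading.Thread(target=loader) for _ in range(num_loaders)]
    for t in loaders:
        t.start()
    for t in loaders:
        t.join()
    # Taking every permit back means every started test has finished.
    for _ in range(max_concurrent):
        slots.acquire()
    return results
```

Because loaders claim batches on demand instead of receiving a fixed share up front, a loader that finishes early simply claims the next batch, so no loader sits idle while another still holds a large backlog.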

Comments

  • SvenSven www.GSA-Online.de
    I don't really see room to optimize this. If there are free slots to start threads, it's done immediately. Maybe the interval to check for free slots can be optimized, but that's it.

    Please explain exactly what you are doing, and maybe send a bug report from the Help menu if you see it not acting fast enough.
  • KaineKaine thebestindexer.com
    edited January 11
    I'm not doing anything special, just importing a big URL file. Then I click Test and select only the anonymous proxy test. The only problem is that it can't start threads faster than running ones finish. For me this limit is around 250.
    I sent you a bug report mentioning this thread, but I don't think it will help.
  • SvenSven www.GSA-Online.de
    Please update and report whether the speed has improved (uploading the new version now).
  • KaineKaine thebestindexer.com
    edited January 11
    Sorry for the delay, I no longer receive notifications. No, it is still approximately 250 threads (with 700 in the settings).

    If I understand correctly, setting 700 threads makes something like 700 slots available. In the time it takes to check for a free slot and start a thread, at least one other thread finishes once we are around 250 (threads close as fast as they start).
    Perhaps the check should be disabled until the thread count defined in the settings has been reached?
    That said, if the check plus the thread start cannot be made faster, I think the thread count would gradually fall again afterwards (once the desired setting is reached and the check is re-enabled).

    This is why I suggested multiplying the loading queues, if you want to leave the check as it is.

    Just out of curiosity, and to think in the right direction: I imagine you increased the speed of the check, but by how much? 2x, 3x ...?
  • SvenSven www.GSA-Online.de
    OK, I think I found the bottleneck now. Next update will improve speed here.
  • KaineKaine thebestindexer.com
    Thank you :)

    It now goes up to the 700 threads that I set. There is still some slight instability, such as drops to around 400 threads lasting 1 or 2 seconds, but then it climbs back. Most of the time it stays at the selected value. A little something is missing for stability, but it's already a lot better (3x).
  • SvenSven www.GSA-Online.de
    "missing for stability" = ? What do you mean by that? The temporary thread drop is normal, as it needs to clear some lists or other stuff that can't be left alone.
  • KaineKaine thebestindexer.com
    By "missing for stability" I mean staying within about ±10 threads of the desired setting (as opposed to the drops from 700 to 400, even if they only last a short time). At this stage it is no longer very serious: you have met 90% of expectations, and this kind of fix can surely be reused elsewhere.
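
The before and after behaviour reported in this thread (a plateau near 250, then reaching the configured 700) can be reproduced with a toy tick-based model. This is only a sketch under stated assumptions, not the tool's actual scheduler: every test is assumed to last exactly 50 ticks, and the dispatcher is assumed to start at most a fixed number of tests per tick. The function name `simulate` and all numbers except 700 are hypothetical.

```python
from collections import deque
from typing import Deque, List


def simulate(target: int, test_ticks: int, starts_per_tick: int, total_ticks: int) -> List[int]:
    """Return the number of running tests after each tick."""
    finish_times: Deque[int] = deque()  # finish tick of each running test, oldest first
    history: List[int] = []
    for tick in range(total_ticks):
        # Retire tests whose finish tick has arrived.
        while finish_times and finish_times[0] <= tick:
            finish_times.popleft()
        # Start new tests, limited by free slots and by the dispatcher's speed.
        free = target - len(finish_times)
        for _ in range(min(starts_per_tick, free)):
            finish_times.append(tick + test_ticks)
        history.append(len(finish_times))
    return history


# Assumed values: 50-tick tests, 5 starts per tick before the fix, 15 after.
before = simulate(target=700, test_ticks=50, starts_per_tick=5, total_ticks=200)
after = simulate(target=700, test_ticks=50, starts_per_tick=15, total_ticks=200)
print(max(before), max(after))  # 250 700
```

In this model the slow dispatcher never fills the 700 slots because tests finish as fast as they start, while a threefold faster dispatcher reaches the cap and holds it. The model does not include the brief drops toward 400 that Sven attributes to list cleanup.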