Set Individual Proxy Use By Percentage Activating Gradually So Paid Rotating Proxy Gateways Get Used

Deeeeeeee the Americas
edited February 14 in Feature Requests
OK, GSA friends...I was wondering...
If I'm using a pool of rotating proxies for scraping, and there are two gateway IPs...
AND
I'm using additional found-and-tested public proxies as well to scrape Google and other SEs...
(which I actually have SER set-up to do right now)
The more public proxies I find, the less the rotating proxies I pay for get used.
Because I use my other set of proxies for posting and other functions, this set has to hold both the scraping proxies from my gateways and the found proxies that passed the scraping test.

Unlike URLs to post to in a project, where we can enter the same URL more than once and create percentages that way, proxies are handled intelligently rather than as plain text strings: each proxy is only listed once, even if you try to enter the same one fifty times. So we can't fix it that way...

At one time I dealt with this by having the 2 or 4 gateway IPs and then setting GSA Proxy Scraper to use multiple ports, each acting as a proxy server. With 2 gateways and 4 proxy servers, my purchased rotating proxies would get 33% of the use. Then Sven showed me how to use file output instead, which uses fewer system resources.
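For anyone checking the 33% figure, here is a tiny sketch of the arithmetic. It assumes every entry in the list is picked equally often, which is my assumption about how the list is rotated, not something SER documents here. The function name is made up for illustration.

```python
def gateway_share(gateways, other_servers):
    """Fraction of requests that land on the gateways when every
    list entry is picked equally often (an assumption)."""
    total = gateways + other_servers
    if total == 0:
        raise ValueError("the proxy list is empty")
    return gateways / total


# 2 gateways alongside 4 proxy-server ports: 2 / 6 of the requests
print(f"{gateway_share(2, 4):.0%}")  # prints 33%
```

Every extra public proxy added to the list grows the denominator, which is exactly why the paid gateways get used less and less.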

So..
I guess what I'm asking is: could SER keep a counter of proxy requests, and let the user set certain proxies to be used for a set percentage of all proxy requests on that list, whenever Public or Private proxies are used?

So each proxy (or some, or none) can be given a usage percentage.
If more than one is set, of course, the total can't be over 100%.
But if you just set one proxy to 30%... and another to 10%... that would work...
You could also set 20 individual proxies to 2.5% each.
Or whatever.
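To make the request concrete, here is a rough Python sketch of what I mean. This is not how SER works internally; the function names and the idea of splitting the leftover share evenly among the other proxies are my own assumptions for illustration.

```python
import random


def build_weights(proxies, pinned):
    """Give each pinned proxy its set percentage of requests and split
    whatever is left evenly among the remaining proxies.

    proxies: list of proxy names in the list
    pinned:  dict of proxy name -> percentage of all requests
    """
    total_pinned = sum(pinned.values())
    if total_pinned > 100:
        raise ValueError("pinned percentages add up to more than 100%")

    weights = {p: pinned[p] for p in proxies if p in pinned}
    unpinned = [p for p in proxies if p not in pinned]
    if unpinned:
        share = (100 - total_pinned) / len(unpinned)
        for p in unpinned:
            weights[p] = share
    return weights


def pick_proxy(weights, rng=random):
    """Pick one proxy at random, in proportion to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


proxies = ["gw1", "gw2", "pub1", "pub2", "pub3", "pub4"]
weights = build_weights(proxies, {"gw1": 30, "gw2": 10})
print(weights)
print(pick_proxy(weights))
```

With one gateway at 30% and another at 10%, the four public proxies in this example each end up with 15% of the requests.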

The issue is, with this idea, let's say the public-found proxies are all now bad and you're down to the one or two gateways ONLY.
The only logic I see that would work is to treat the percentage as AT LEAST 30 percent, because as the public proxies go down and all you're left with is the gateways, you don't want to keep those at a fixed 30% or whatever. What would SER use for the other proxy requests?!
Any ideas would be great. This has been an issue and I want to resolve it and get my use of the gateways, not only when the public proxies are depleted! :)
I'm seeing the problem with this as:
1 Proxy set for XXX [10%] of requests. (Public or Private, whichever list it's on!)
I guess we'd need a second user-settable variable.
With fewer than YYY proxies, ignore the above percentages?

Or, better yet, could there be a sliding-scale type way of integrating this?

With fewer than YYY proxies, totally ignore any percentages set for a proxy?
At ZZZ proxies, begin raising the percentage as the count drops toward YYY.

This way the user gets to determine how quickly this turns on or off, and it's not just a binary thing, rather more suited to the circumstance and user's requirements.

Example:
//////////////////////////////////////////////////////////////////////////////
///
///Proxy 1. percentage: [10%]
///(Displayed in Proxy list column showing percentage)
///
///Totally Stop Percentage-Based-Use on [50] or fewer proxies
///Begin Raising Percentage (On any proxies set with a percentage-use) At [100] Proxies
///(Displayed at bottom Proxy pane)
//////////////////////////////////////////////////////////////////////////////

With the number of proxies at which SER begins raising percentages set to [now 100], each percentage-set proxy starts from its own set percentage.
So for this test we have:
Proxy 1 at 10% minimum
At 100 proxies left this number begins climbing from the 10%
until we hit 50 public proxies left, when it's effectively 100% (off)

(So, let's say we have 75 proxies left, we'd be at 55% usage, etc...)
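Here is the sliding scale from that example as a small Python sketch. I'm assuming a straight-line ramp between the two thresholds, which is what gives the 55% figure at 75 proxies. The function and parameter names are hypothetical, not anything that exists in SER.

```python
def effective_percentage(base_pct, other_count, stop_at=50, ramp_at=100):
    """Usage percentage for a percentage-set proxy, given how many
    other (non-percentage) proxies are left in the list.

    Assumes a linear ramp from base_pct at ramp_at proxies
    up to 100% (percentage rule effectively off) at stop_at proxies.
    """
    if ramp_at <= stop_at:
        raise ValueError("ramp_at must be larger than stop_at")
    if other_count >= ramp_at:
        return float(base_pct)   # plenty of other proxies: fixed percentage
    if other_count <= stop_at:
        return 100.0             # too few left: percentage-based use is off
    # How far we have moved from ramp_at down toward stop_at (0.0 to 1.0)
    fraction = (ramp_at - other_count) / (ramp_at - stop_at)
    return base_pct + fraction * (100.0 - base_pct)


for count in (120, 100, 75, 50, 10):
    print(count, effective_percentage(10, count))
```

At 75 proxies the count is halfway between 100 and 50, so the 10% setting has climbed halfway to 100%, which is 55%.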

Percentages have to be based not on the total number of proxies in that list (Public or Private) but on the total number of non-gateway proxies in that list, meaning the ones without a percentage set. So anything that you set with a percentage can't be counted!
What if you had 20 proxies each set at 2.5%? These can't be counted as part of the total! These are the ones that will be affected and raised on fewer proxies!

So we need three variables...
1. an array (another column) so each proxy can have a percentage
and then
2. the proxy count at which all percentage-set proxies begin getting their percentages raised
3. the proxy count at or below which percentages are ignored completely.
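As a sketch only, those three settings could be held something like this in Python. All the names here are made up by me; the one point taken from the post is that proxies with a percentage set are left out of the count that gets compared against the two thresholds.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyPercentSettings:
    # 1. per-proxy percentage (the extra column in the proxy list)
    per_proxy_pct: dict = field(default_factory=dict)
    # 2. count at which percentages begin getting raised
    ramp_at: int = 100
    # 3. count at or below which percentages are ignored completely
    stop_at: int = 50


def count_unpinned(proxies, settings):
    """Count only proxies WITHOUT a percentage set (the public ones)."""
    return sum(1 for p in proxies if p not in settings.per_proxy_pct)


def percentage_mode(proxies, settings):
    """Report which of the three behaviours applies right now."""
    n = count_unpinned(proxies, settings)
    if n <= settings.stop_at:
        return "ignore"    # percentages switched off
    if n < settings.ramp_at:
        return "ramping"   # percentages being raised
    return "fixed"         # percentages used as set


settings = ProxyPercentSettings(
    per_proxy_pct={f"gw{i}": 2.5 for i in range(20)}
)
proxies = list(settings.per_proxy_pct) + [f"pub{i}" for i in range(75)]
print(count_unpinned(proxies, settings), percentage_mode(proxies, settings))
```

In this example the list holds 95 entries, but only the 75 public proxies are counted, so the 20 proxies set at 2.5% each would be in the raised range.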

Can anyone think of anything else to solve this issue with under-utilization of paid proxies? This is all I could come up with. This is a serious GSA SER issue for my set-up, so any help would be appreciated. Thanks for reading.

The idea is to use proxy percentages to INCREASE usage of certain proxies, not to cap the usage of those same proxies when few or no other scraped proxies are left in the same list! That would be the flip side: not limiting their use!