
Public Proxies - Best Practice & Suggestions

Ozz
edited October 2012 in Other / Mixed
Hi, as I'm using public proxies on my test machine at the moment, I want to show you how to find good sources and how to tweak the proxy scraper to get good proxies you can use in SER or in tools like Scrapebox.

This is a possible setting:
[screenshot: proxy scraper settings]

You first have to take a look at which proxy lists give you good and fresh proxies every day. Some lists get updated more than once a day, some only once a week or even less often. Get rid of all providers that don't update their lists at least daily. To do that, it's best to visit the sites by hand and take a look at their proxy posts. To get the URL of a list, click "Add/Edit ProxySites" and copy the URL.

Once you have plenty of good lists (~10), activate them and download the proxies: click "Add Proxy" -> "Find online". Keep an eye on the log and note which providers give you new proxies. Many proxies are duplicates across lists and won't add anything new. Keep only the ProxySites that give you the most "new" proxies and get rid of the rest. The fewer ProxySites you use, the fewer threads you waste when searching for new proxies.
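
If you want to see what this step boils down to outside of SER, here is a rough Python sketch (the list URLs are made-up placeholders, not recommendations): download each list, pull out the ip:port pairs, and count how many proxies each source adds that the earlier sources didn't.

```python
# Sketch: fetch several proxy lists, dedupe, and count "new" proxies per source.
# The SOURCES URLs are placeholders -- swap in your own ProxySites.
import re
import urllib.request

SOURCES = [
    "http://example-proxy-list-1.com/daily.txt",   # placeholder URL
    "http://example-proxy-list-2.com/fresh.html",  # placeholder URL
]
PROXY_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}:\d{2,5}\b")  # ip:port

seen = set()
for url in SOURCES:
    try:
        html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
    except OSError:
        print(f"{url}: download failed")
        continue
    found = set(PROXY_RE.findall(html))
    print(f"{url}: {len(found)} proxies, {len(found - seen)} new")
    seen |= found
# Keep the sources with the highest "new" counts; drop the rest.
```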

Now test the proxies (Test Proxies / don't skip unchecked). In my settings I use 50 threads with a timeout of 2 seconds. You need to figure out what works best for you.
If I reduce the timeout to 1 second, for instance, my router can't take the heat and I lose my connection. The same could happen if you increase the threads.
If you think you don't get enough working proxies, you could increase the timeout a little bit (~5 seconds), but the more threads you use and the lower the timeout you set, the faster your proxy test will be.
Automatically search for new proxies (anything from 30 to 720 minutes). Figure out what works best for you.
You can also re-test proxies flagged "not working". Just mark the fastest proxies (up to 2 seconds) -> right mouse click -> Test selected.
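
To give you an idea of what the test itself does, here is a rough stand-in in Python with the same 50 threads and 2-second timeout (the proxy list is dummy data, and this is only an illustration of the idea, not SER's actual code):

```python
# Sketch: test proxies concurrently; a proxy counts as "working" if the test
# page loads through it within the timeout. Tune threads/timeout for your router.
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

TEST_URL = "http://www.bing.com"  # SER's default test URL
TIMEOUT = 2                       # seconds; 1 second was too aggressive for my router

def test_proxy(proxy):
    """Return (proxy, seconds) if alive, (proxy, None) if dead or too slow."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": f"http://{proxy}"}))
    start = time.time()
    try:
        opener.open(TEST_URL, timeout=TIMEOUT).read()
        return proxy, time.time() - start
    except OSError:
        return proxy, None

proxies = ["1.2.3.4:8080", "5.6.7.8:3128"]        # dummy data; use your scraped list
with ThreadPoolExecutor(max_workers=50) as pool:  # 50 threads, as above
    results = list(pool.map(test_proxy, proxies))
working = sorted((r for r in results if r[1] is not None), key=lambda r: r[1])
print(f"{len(working)} of {len(results)} proxies working")
```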

Let the proxy scraper run with your best settings for a few days. Then sort the list by speed and check which sources gave you the best results in terms of working proxies, number of proxies and speed. If you see that some ProxySites give you bad results, with few or no added proxies or only really crappy ones, then you can get rid of them, too.

I, for instance, use just 3 proxy list providers, but I have my own ones ;)
But I'm sure you'll find some good ones among the providers that come with SER. Most proxy lists duplicate each other anyway.

Comments

  • Ozz
    edited October 2012
    Suggestions to Sven:
    - ability to hide "not working" proxies
    - selection of all proxies with CTRL+A
    - separate private proxies from public ones, which may be used for Scrapebox for instance
    - better proxy management:
      - only use the fastest ~100 proxies in random order (see the sketch after this list)
      - and/or only use proxies we know are working at the moment
      - retest dead but formerly fast proxies (< 2 seconds) automatically up to X times
    - auto-delete old proxies, with
      -> an option to retest them before banning
      -> an option to blacklist the IP so it doesn't get added anymore
    - an "already parsed" flag for proxy forum threads and blog posts, so the scraper doesn't parse already known/old proxy sources
    - an option to test a proxy right before it gets used for posting
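    To make the "fastest ~100 in random order" idea concrete, here is a small Python sketch of how such a selection could look (not SER code; `tested` stands for a list of (proxy, seconds) results from a test run):

    ```python
    # Sketch: keep only the fastest proxies and rotate through them randomly.
    import random

    def pick_pool(tested, size=100, max_secs=2.0):
        """Keep the fastest `size` proxies under `max_secs`, in random order."""
        fast = sorted((t for t in tested if t[1] is not None and t[1] <= max_secs),
                      key=lambda t: t[1])
        pool = [proxy for proxy, _ in fast[:size]]
        random.shuffle(pool)  # random order spreads the load across IPs
        return pool

    tested = [("1.2.3.4:8080", 0.8), ("5.6.7.8:3128", 1.7)]  # dummy test results
    rotation = pick_pool(tested)
    ```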
  • AlexR Cape Town
    Ozz - you impress yet again! Great post. Bookmarked and noted!  :)>-
  • Ozz
    edited October 2012
    Hi folks!

    I did some testing with my new setup. 

    Proxy sources: 1nd0-mp3.blogspot, blackhat-blog, blackhatworld, damnaccess, elite-proxies.blogspot, elite-proxy, free-elite-proxy.blogspot, free-proxy-serverlist.blogspot, megaproxy.blogspot, members.multimania, sohbetgo, sslproxylist, waiqq, xpazit, xproxy.blogspot

    Some of those sources I added myself, but most of them are already included in SER.

    The really important part of this study was the proxy test. By default, SER tests each proxy with
    URL: Bing.com
    String: microsoft.com

    I guess the problem with this test is that it only checks whether the proxy is alive, not whether it can be used for posting. I saw many proxies that required a login, for example. They were alive, but couldn't be used.

    Sven inspired me to use another URL and string. You can find URLs to test against on this site: http://web.freerk.com/proxyjudge/azenv.htm. Not all URLs worked well for me, so you need to test whether you can find a better one than mine.
    String: HTTP_CLIENT_IP
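
    In code, the difference between the two tests is just a string check on the response body. Here is a rough sketch (I use the azenv.htm page itself as the judge URL only as a stand-in, since the best URL is something you have to find yourself):

    ```python
    # Sketch: a proxy passes only if the judge page loads through it AND the
    # expected string appears in the body. An "alive" proxy that returns a
    # login or error page fails the string check.
    import urllib.request

    JUDGE_URL = "http://web.freerk.com/proxyjudge/azenv.htm"  # stand-in judge URL
    EXPECTED = "HTTP_CLIENT_IP"

    def usable(proxy, timeout=5):
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({"http": f"http://{proxy}"}))
        try:
            body = opener.open(JUDGE_URL, timeout=timeout).read().decode("utf-8", "ignore")
        except OSError:
            return False           # dead: the request never came back
        return EXPECTED in body    # alive, and actually relaying requests
    ```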

    You will notice that you get far fewer "green" proxies from now on, but I think those are the ones that actually work to some extent. Instead of 2000 green proxies I only got around 300 in the beginning. I sorted my proxies by "Speed" and tested all of them (green and red).

    My fresh test project used only blog commenting, trackbacks and guestbooks. No captcha solver was used, and because of that I got almost no guestbook backlinks. I let it run for ~5 hours.

    Within this time span I had
    - submission successful: 1285
    - verification successful: 114
    - submission unsuccessful: 1374 ("url" was not used in form / captcha not solvable)
    - no engine matches: 15631
    - download failed: 4977

    As you can see, the ratio of "alive" proxy submissions to "dead" proxy submissions is 18290/4977, i.e. roughly 79% of requests went through a working proxy (18290 / (18290 + 4977) ≈ 0.79).
    I think that is pretty good for public proxies. I don't know if that number will hold up in the long run. Probably not.

    [screenshot: project statistics]

    Please note that I only used 3 keywords, so many potential target URLs were "already parsed" within a few hours.

    [screenshot: verified links]

    PR9 (Domain) -> lol

    The problem with my test setup is that it can be risky to end up with no working proxies after some time. I retested formerly working proxies twice within 5 hours to keep a stack of 300 green proxies. Many of those "dead" proxies only failed temporarily and were alive when retested. Until a solution is found for reanimating those proxies automatically, you need to babysit them, but in the last 2 hours of testing I lost only 66 of 332 proxies.
    If you have good proxy sources that add new proxies every couple of hours, you should be fine.
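
    The babysitting boils down to a retest loop over the formerly green proxies. A rough sketch, reusing the usable() check from the sketch above (the half-hour interval and round count are assumptions, not tested values):

    ```python
    # Sketch: periodically retest known-good proxies; many "dead" ones are only
    # down temporarily and come back on the next pass.
    import time

    def babysit(known_good, interval_secs=1800, rounds=10):
        alive = list(known_good)
        for _ in range(rounds):
            alive = [p for p in known_good if usable(p)]  # usable() from the earlier sketch
            print(f"{len(alive)} of {len(known_good)} proxies currently working")
            time.sleep(interval_secs)
        return alive
    ```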
  • Sven www.GSA-Online.de
    edited October 2012

    great analysis :)

    next version has "Re-test previously working proxies"

  • Ozz
    edited October 2012
    That's great. I think "newbies" can put SER on another level by tweaking their public proxies. I hope they get pretty decent submission rates with them, but you need to test and tweak everything for a day or two before you can set and forget it.
  • I tried scraping for public proxies while waiting for my private proxies to arrive, but it didn't seem to work. I checked all the proxy sources and got about 160K proxies, but none worked. I used Bing.com & microsoft.com as the testing parameters. Could anyone help me improve on this?

    Thanks!
  • Sounds to me like your router is crashing or your firewall/antivirus is blocking ports.
  • I'm using a Bermanhosting VPS with Avira AntiVir antivirus... not sure if the antivirus is blocking ports :s
  • Sven www.GSA-Online.de
    Ohh, that antivirus program with the umbrella? Do yourself a favor and use something different. The free version (and even the registered one we bought some years back) is full of bugs and false detections.
  • Any suggestions on security software to use with GSA SER? Thanks!
  • Sven www.GSA-Online.de

    You don't really need anything, as no IE is used in the background and no JavaScript is executed.

    But if you use any other SEO tool on the same PC, you might want to use AVG or Avast, as many tools do use IE in the background.

  • AlexR Cape Town
    How are Ozz's suggestions on proxies coming along? As per Oct '12:

    - separate private proxies from public ones, which may be used for Scrapebox for instance
    - better proxy management:
      - only use the fastest ~100 proxies in random order
      - and/or only use proxies we know are working at the moment
      - retest dead but formerly fast proxies (< 2 seconds) automatically up to X times
    - auto-delete old proxies, with
      -> an option to retest them before banning
      -> an option to blacklist the IP so it doesn't get added anymore
    - an "already parsed" flag for proxy forum threads and blog posts, so the scraper doesn't parse already known/old proxy sources
    - an option to test a proxy right before it gets used for posting
  • Can't believe I missed this thread. Thanks Global for reviving it. :) Was a good read.
  • ron SERLists.com
    Same here. Ozz is amazing. Good job @GG !
  • AlexR Cape Town
    I have some good threads bookmarked...I've read every post/every thread on this forum...spent an hour or two cleaning up my bookmarked threads...

    Thanks to @Ozz for this great thread!

    Check this one out too and add your thoughts: