
Proxies, HTML Timeout, Threads - Max Efficiency

AlexR Cape Town
edited January 2013 in Need Help
Let's get together as a community and look at this aspect a little more.

Can you answer the following please:
1) How many private proxies you using?
2) How many SE's have you selected?
3) How many threads are you running?
4) What's your custom HTML timeout?
5) What's your custom search time between queries?

It seems everyone has wildly different settings here... it would be neat to compile some data so we can see what the options are.

Let me go first:
1) How many private proxies you using? 20 Private.
2) How many SE's have you selected? 40
3) How many threads are you running? 300
4) What's your custom HTML timeout? 120
5) What's your custom search time between queries? 4s


Comments

  • GiorgosK Greece
    edited January 2013
    The reason everyone has different settings is that so many options in the program can affect performance,
    but let me play along. I will also include submitted/verified per day and the number of projects.

    0) home server: dedicated laptop with Pentium P6200, 4 GB RAM, Win7 Home
    1) public
    2) 50
    3) 95
    4) 90
    5) unchecked
    6) submitted / verified per day (avg.)  4000/400 (10% verified)
    7) projects 10
    8) GSA CB only
  • LeeG Eating your first bourne

    Time to humor the person who has made false accusations about me photoshopping my results.

    0) VPS running Windows 2008 R2, 6 GB of RAM, 20 GB hard drive, 100 Mbit connection
    1) 40 shared proxies from proxy hub

    2) 20 random Google engines, with the engine files edited not to use the blog search engines. No other engine types used
    3) 230, altered whenever the CPU is being maxed out too often
    4) 130, altered on a daily basis depending on how the proxies are running when I check the VPS
    5) 5 seconds, but the results pulled are checked on a regular basis
    6) submitted / verified per day (avg.)  80,000/10,000
    7) projects: 27 with 3 tiers. Run on the scheduler function to reduce memory load and spread links in a random order
    8) captcha service used. CSX and GSA CB. New captchas added daily
  • Brandon Reputation Management Pro
    1) How many private proxies you using?
    2) How many SE's have you selected?
    3) How many threads are you running?
    4) What's your custom HTML timeout?
    5) What's your custom search time between queries?

    1. 20
    2. 14 (united states)
    3. 600
    4. 180
    5. 20
  • Alright Lee, you've been right about everything I've tried that you have said before, so I'm gonna give your settings a try. :)

    Thanks for posting it. :)
  • ron SERLists.com

    I think this is important to say:

    If you have a ton of projects - and most of those have posting limits - and you have just a few "no limit" projects, you will have a lot less submissions than somebody who has a whole bunch of "no limit" projects.

    If you feed GSA a whole bunch of scraped lists, you can push the submission count way above what GSA will do on its own.

    So if you are going to compare apples to apples, it goes far beyond just the settings. It also has a lot to do with feeding lists and the number of "no limit" projects you have going.

    At least, that is my experience.

  • Well, all my projects are no limit projects, but I'm also not feeding it lists other than what it has built up on its own.

    There's a few variations that Lee has that I don't, namely the custom SEs and CB, but just from setting mine to his I'm noticing a huge increase so far. Usually by the end of the day I'm at maybe 22k submissions; it's only 3pm here and I'm already at that. So it definitely increased by a lot. :)

    However, I doubt I'll get even close to the 80k.
  • ron SERLists.com
    edited January 2013

    I have 35 projects, but only four have no limits, and I am hitting about 30,000 submissions a day with no lists. If I had more no limit projects or fed it lists, I know it could go much higher.

    However, with CB solving about 70% faster than CSX, that will add a whole bunch more submitted each day.

  • That's not fair guys. I didn't think CB was that good o_O
    I'm anxious to try on my server :)
  • Why is everyone after 'I have xxxx submissions' when the only thing that matters is the verification number? If you submit 100k and get 10k, but someone else submits 20k and gets 10k, that's who I would call the winner :-)
  • Because doesn't it take 5 days to fully account for all the submitted? As the number of submitted increases, I'd expect the number of verified to increase as well. I know when I first started with GSA, max submitted was maybe around 5-6k with maybe 100-200 verified a day. As I've been getting more submitted each day, my verified grows as well. I'm up to around 3-4k per day now.
  • LeeG Eating your first bourne

    What you don't see is when I get ideas wrong and hit a massive fail.

    It happens a lot when I have a new idea or tweak to test

    I only run ser with the search engines and the global site lists.

    No feeding of link lists.

    What I do, is set all lower tiers to a five day link verify check.

    Most links will be killed in the first couple of days if the site you have posted on has an owner with half a brain and half an idea about link building and spam. That way, when building links on lower tiers, you know the targets are still live when verified. Forums, you will find, have moderators with spam-killing fingers.

    Even worse if the forum owner uses link building methods and knows how to spot spam members.

    Changing emails because of Stop Forum Spam is another overblown fantasy in my opinion.

    If you're going to change your email, do it when you get your next batch of proxies.

    Stop Forum Spam also logs the IP, and a lot of webmasters that use it also use the IP blocking function.

    How do you know the next batch of IPs you get have not already been used and are listed there?

    I have a good knowledge of Stop Forum Spam, and not through being caught in it too often.

    If you're going to check your email for being blacklisted, you should also waste more time checking each IP.

    If you're totally paranoid, check your IP and email against the blacklists after each submission.

    I can't remember the last time I swapped out my emails.

    Top level is 100 submitted per day, PR4 domain.

    The reason: if you have 100 verified per day and some links have a five-day check by default, how many links could you build before, bam, you hit the 100 for the next two weeks without submitting a new link? You could in theory blast a lot of links that go live on day one and SER only picks up on day five.

    I have a big, niche-related keyword list for top tiers and a general keyword list for lower tiers.

    That way you find more article sites, social media, etc.

    The rule of thumb on building links used to be 100 per day to your money site.

    A bit of simple maths: 100 links to tier 1; now, if you were to build 100 links to each of those, you're talking 10,000, and the tier below that 1,000,000. That's one project (there's a quick sketch of the maths further down this post).

    If you're doing that on multiple projects, Sven needs to make SER post a lot more links in a day :D

    Again, on checking your verified to submitted stats:

    Some links will only be verified five days after posting them, so comparing the submitted and verified counters at the bottom is not a true method for building any kind of % values.
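
    Here is the quick sketch of that tier maths mentioned above. It is only an illustration of the multiplication, assuming 100 links per tier as in the example:

        # Python sketch: how the link counts multiply when every link in a tier
        # gets 100 links of its own built to it.
        tier1 = 100
        tier2 = tier1 * 100   # 10,000 links needed behind tier 1
        tier3 = tier2 * 100   # 1,000,000 links behind tier 2
        print(tier1, tier2, tier3)   # 100 10000 1000000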

     

  • LeeG Eating your first bourne

    Another fantasy is Google International.

    In some countries, Google picks up the IP and diverts you to the Google for that country.

    Click on google.com and you get diverted to google.es if you're in Spain, etc.

    When you're searching using IPs, if you have International selected, you might end up on the Google that corresponds to your IP.

    You can try it with your proxies. Try accessing google.com with your proxies and see what you end up viewing.
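
    If you would rather script that check than click around in a browser, here is a minimal sketch. It is not anything built into SER, and it assumes HTTP proxies listed one per line as ip:port in a proxies.txt file, with the requests library installed:

        # Fetch google.com through each proxy and show which country Google
        # you end up on (for example google.es for a Spanish IP).
        import requests

        with open("proxies.txt") as f:
            proxies = [line.strip() for line in f if line.strip()]

        for p in proxies:
            proxy_cfg = {"http": f"http://{p}", "https": f"http://{p}"}
            try:
                r = requests.get("http://www.google.com", proxies=proxy_cfg,
                                 timeout=15, allow_redirects=True)
                print(f"{p} -> {r.url}")   # final URL after any country redirect
            except requests.RequestException as exc:
                print(f"{p} -> failed ({exc})")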

  • Thanks for the information Lee, it does for sure explain a lot. It explains a lot of the differences between what you are doing and what I'm doing. I used to do the tiers, but I actually stopped doing that because I saw the same results, if not better, just blasting my money site.
  • LeeG Eating your first bourne
    edited January 2013

    I constantly try new things.

    If I do it as a project by project edit, it takes between one and two hours to make any changes.

    Trust me, doing the midnight VPS reboot after watching the memory usage, then just making a quick edit, can take a couple of hours.

    Then you wake up and think ooooops when you check how it's run overnight.

    And start again

    Something that I need to try is just setting four random Googles on each project.

    Or three random ones plus Google International, which should divert to the country the proxy is from.

    You're also spreading your searches that way, rather than each and every project hitting Google through the same search engines.

    If you do that, note that some engines like the blogs are set by default to use blog search engines.

    So add a random selection of blog search engines, or as I have done, edit the engine files so they use any search engine. About an hour's work to do.

    If you want a quick blast of easy links on the lower tiers, make sure you don't have a tick in "don't post to the same domain", or however it's worded.

    Most websites won't let you register a second time. If it's a forum, you're already registered, etc.

    For social bookmarks, SER will log into the accounts you already have and add more links to those, making the accounts look more natural, rather than one bookmark per account.

    It's an easy way to add extra links. And if you're using global site lists, from time to time links will be added to those accounts you already have set up.

  • ron SERLists.com

    @LeeG - +1 on this great tip:

    "I have a big sized keyword list niche related for top tiers and a general keyword list for lower tiers. That way you find more article sites, social media etc etc"

    That is simple, obvious, completely logical...and overlooked.

  • @LeeG "edit the engine files so they use any search engine. About an hour's work to do." << how did you do this?
  • LeeG Eating your first bourne
    edited January 2013

    I'm at the stage of being scared to try any new ideas now, because the above is what I use to get these kinds of results. No doubt there will be another false accusation of "photoshop" if I post any more screenshots as proof that what I teach works.

     

    Editing of engines: this is something you need to look at and work out.

    In the engines folders, look at the blog engines like blogspot

    Use notepad to view and edit the files.

    You should easily spot what needs doing.

    But if I tell people what to edit and they mess up, support will have an influx of people wanting copies of those files.

    When I do it, I make two copies, one for if things go wrong and one to edit the files in and then copy them over to the default folder.

    After any SER updates, the default folder reverts to the default files, so you always need to add your edited versions again after any updates.
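
    If it helps, here is a minimal sketch of the "make a copy first" step. The Engines path is an assumption about where SER keeps the engine .ini files, so point it at your own install; this is just the backup idea, not LeeG's exact workflow:

        # Copy the whole Engines folder to a dated backup before editing anything.
        import shutil
        from datetime import datetime

        ENGINES_DIR = r"C:\GSA Search Engine Ranker\Engines"   # assumption: adjust to your install
        backup_dir = ENGINES_DIR + "_backup_" + datetime.now().strftime("%Y%m%d")

        shutil.copytree(ENGINES_DIR, backup_dir)
        print("Backed up engine files to", backup_dir)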

     

  • Ozz
    edited January 2013
    @LeeG: the photoshop comment was a joke by GlobalGoogler that wasn't meant seriously. At least I understood it as a joke.

    Regarding the modification of the engine files I do it a bit differently. I copy and rename the file to "#engineexample.ini". After modifying I uncheck the original engine in SER and activate the modified engine.

    Whenever there is an update to engines I check the engines folder and sort by date. This way you notice which engines were updated and can change your modified engine accordingly.
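
    For anyone who wants to script that "sort by date" check, here is a minimal sketch; the Engines path is an assumption and should point at your own SER install:

        # List engine .ini files newest first, so the ones an update touched are
        # at the top and you know which modified copies to revisit.
        import os

        ENGINES_DIR = r"C:\GSA Search Engine Ranker\Engines"   # assumption: adjust to your install

        files = [(os.path.getmtime(os.path.join(ENGINES_DIR, f)), f)
                 for f in os.listdir(ENGINES_DIR) if f.endswith(".ini")]

        for mtime, name in sorted(files, reverse=True)[:20]:
            print(name)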
  • LeeG Eating your first bourne

    You can also add extra footprints to the engines files to bring in more results.

    A simple google of scrapebox footprints can give you a lot of extras to add or try in the files

    There were also a lot of extra footprints shared on the forums recently

    Again, back up the files

    This is where picking Ozz's brain can be used to confirm something.

    If we pull in, say, 1000 results from Google on a blog search and there is a mixed bag of forums, blogs, etc.:

    Because the search is for blogs, will it identify the other engine types and post to them?

    Or, because the search was aimed at a certain blog type, will it only identify that engine type to post comments to?

    The search term only helps to find certain types of sites, but it will not identify only that particular platform you've searched for.

    If you search for "powered by wordpress", for example, you get a lot of different platforms like blogs, directories and so on. The site will be identified through code snippets of the HTML code in most cases.

    Because of that I like the idea of having a "general footprint" engine where we can just put in a list of all kinds of footprints and have SER sort the results of that. The list can contain site-specific footprints as well as general types like "your homepage" + "security code", for example.
    I suggested this some time ago, so it's on Sven's agenda.
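
    To make the "identified through code snippets" point concrete, here is a toy illustration. It is not SER's actual detection code, and the marker strings are just common examples of platform fingerprints:

        # Guess the platform from markers in the page HTML, independent of
        # whatever footprint the page was found with.
        SIGNATURES = {
            "wordpress": ["wp-content", "wp-includes"],
            "drupal": ["Drupal.settings", "/sites/default/files"],
        }

        def identify_platform(html: str) -> str:
            for platform, markers in SIGNATURES.items():
                if any(m in html for m in markers):
                    return platform
            return "unknown"

        print(identify_platform('<link href="/wp-content/themes/x/style.css">'))   # wordpress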
  • LeeG Eating your first bourne

    Or go down a similar line and have a keyword search only.

    Search a keyword and then post to all site types found for that keyword.

    If you have 100,000+ keywords, that's a lot of searches and a decent mixed bag of engines in the results.

    There is a possible way of doing this at present with an edit to any of the engine files.

    If you look at the files, the footprints are laid out as footprint 1|footprint 2|footprint 3.

    Just bung a | at the end of any of the engines that also use keywords in the searches.
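
    For anyone who wants to script that edit, here is a minimal sketch. It assumes the footprints sit on a line starting with "search term=" inside the engine .ini files, which is an assumption about the file layout, so check your own files first and only ever run it on copies, never the originals:

        # Append a trailing | to the footprint line so an empty alternative
        # (a keyword-only search) gets picked some of the time.
        import glob

        WORK_DIR = r"C:\engine_copies"   # assumption: your edited copies, not the live Engines folder

        for path in glob.glob(WORK_DIR + r"\*.ini"):
            with open(path, encoding="utf-8", errors="ignore") as f:
                lines = f.read().splitlines()
            with open(path, "w", encoding="utf-8") as f:
                for line in lines:
                    if line.lower().startswith("search term=") and not line.endswith("|"):
                        line += "|"
                    f.write(line + "\n")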

  • Sweet, thanks for your great input and suggestions as always, Ozz. It's always appreciated, and it's for sure why GSA is such a great program. :)
  • 1) How many private proxies you using?
    40

    2) How many SE's have you selected?
    For tier 1: only Google US, no clones or metasearch engines
    For lower tiers: Google US, Google China, Google Korea, Yandex, Google Japan

    3) How many threads are you running?
    320

    4) What's your custom HTML timeout?
    160

    5) What's your custom search time between queries?
    10 s

    Damn, @LeeG, @Ozz and @GlobalGoogler have been very nice to me, helping every time I was stuck. I'm running 8 projects (4 tiers each) and I can't get more out of GSA. After 11 hours running, this is what I've got:

    [screenshot]

    I'm going to change the only thing I haven't: my keyword list. I'm using a pretty big keyword list (400k words) and I don't know what else to do to improve my numbers. It's clear I'm missing something. Some of you might think this is OK, but when you have hungry clients, you have to go further.
  • LeeG Eating your first bourne

    Bin the Google US only idea; replace it with four or more random Googles and four or more random Google blog searches.

    You're also using public proxies, notorious for being dead in the time it takes to scrape them and post to them.

    If you are using CSX and have retries set, set that to zero retries.

    Then see how you get on

  • Ozz
    edited January 2013
    Public proxies are your weak spot, @hyde. Use only your private proxies, reduce your thread count to 150 or so and work your way up.

    With public proxies you can run a lot more threads, because many threads are just in idle mode and do nothing more than observe the dead body of a proxy.
  • I should not use public proxies
    I should not use public proxies
    I should not use public proxies
    I should not use public proxies
    I should not use public proxies
    I should not use public proxies

    For some reason in my head I had the idea that I was heavily scraping with GSA and I needed a lot of proxies. Thanks again @LeeG and @Ozz, I feel dumb :P

    After deleting public proxies and lowering the thread number I got a nice bump in performance. Look at the bandwidth :)

    [screenshot]

    I owe you a coffee guys, thanks again.
  • LeeG Eating your first bourne

    Now go stand in the corner :D

    Also watch your HTML timeout.

    It's in the general options tab at the top, next to the start button.

    Set it to max and bring it down to your sweet spot.

    That's the point where you get a lot of verification failed messages, with only a few timeouts.

    With private proxies, you will find you have a much lower timeout setting than if you're using public proxies.

    Which in turn is a speed boost

  • ron SERLists.com

    @LeeG - I'm on pace to hit 40k submitted today. No global site lists, fed lists or imported urls. Just pure GSA searching for targets. I just needed to say that for some reason. Kind of excited. I know it's not a race for the most links. But it does something to me when I go faster :)

    @hyde - It can also depend on how many "no limit" tiers you have going. The more you have, the more targets and links you will hit.

    I'm only using a 1000 keyword list at a time for each project, so I wouldn't guess that is your problem. I think it comes down to proxies.

     

  • edited January 2013
    Hey @ron, @LeeG, @Ozz and whoever else is experimenting with the GSA CB beta. Quick question: how long can CB run before it needs to be restarted? From what I've read so far, performance-wise CB solves way more captchas than CSX2, and faster, but must you restart CB every day, week, month?
  • Thanks @LeeG, I will look at those engines and see what I can sort out. Cheers, NR