LPM and Success dropped dramatically - How to improve it?

Comments

  • Trevor_BanduraTrevor_Bandura 267,647 NEW GSA SER Verified List
    @markhoward Sending you a PM with some links to try.
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    @Trevor_Bandura

    Firstly, thanks for being so honest and open-minded - Some list sellers here could learn a lot from you!
    I totally agree with many of your observations and hence agree many are full of shit:P

    Yes, I use CaptchaTronix for reCAPTCHA, but only the small plan (I hate to pay even more).

    Solve rate in CB is currently 49%

    I use Mexala Proxies in combination with Buyproxies

    In terms of footprints - well, I used an ini extractor to get the ones I want from GSA SER itself.
    Keywords - I manually checked at least 100 to see the result rate, and use 15000 to harvest shotgun-style.

    What I do know is that many of the forums fail - login fails, wrong username, etc.

    I do understand why some of these lists are so pricey - they are absolutely not easy to build, and a bunch of additional services like solvers for reCAPTCHA are a must!

    I have managed to build some good lists, however it has not been an easy journey - That's for sure:P


  • Trevor_BanduraTrevor_Bandura 267,647 NEW GSA SER Verified List
    edited December 2014
    @magically You must be getting lots of blob recaptchas?

    A 49% solve rate with CaptchaTronix is pretty low. I'm used to seeing it up in the 80-90% range.

    image

    I don't know how you have CB set up, but change your retry setting to 0. Most people have it at 3, I think, because that's what they were told works best.

    For proxies, don't run more than 2-3 threads per proxy, especially if you're processing lots of reCAPTCHA. Otherwise you'll start to get blobs, and those are way harder to solve.

    For project options:

    image

    Make these few changes and see if it helps.
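
A rough sketch of the 2-3 threads per proxy rule of thumb from the post above, in Python. The pool size of 50 proxies is a made-up example (nobody in the thread states a proxy count), and `max_threads` is only a hypothetical helper name, not anything from SER.

```python
def max_threads(proxy_count: int, threads_per_proxy: int = 3) -> int:
    """Upper bound on total threads if each proxy carries at most threads_per_proxy."""
    if proxy_count < 0 or threads_per_proxy < 0:
        raise ValueError("counts must be non-negative")
    return proxy_count * threads_per_proxy


# Hypothetical pool of 50 proxies, at the low and high end of the 2-3 range.
low = max_threads(50, 2)
high = max_threads(50, 3)
print(f"50 proxies -> between {low} and {high} threads")
```

Under this assumption, a much higher thread count than the upper bound would mean more than 3 threads sharing each proxy.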
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    @Trevor_Bandura

    Surely will try that.
    In fact, I didn't enable CaptchaTronix directly in CB, just via the API in GSA SER itself.
    Perhaps I should try to change that, and let CB handle it?

    I will also decrease threads to 2-3 per proxy and set retry to 0 in CB globally and in each project.
  • Trevor_BanduraTrevor_Bandura 267,647 NEW GSA SER Verified List
    Have CaptchaTronix set up in both SER and CB.

    I think the global retry setting in SER should be the only one you need to change, unless you have it set in your projects too.

    Play around with that setting. I find that 0 retries works best for me anyway. I would let SER run for 24 hours with each retry setting to see which one gives the most verified links.
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    @Trevor_Bandura
    Excellent advice, which I will try out now - Many thanks!

    Here is what I have done.

    1. Adjusted CB retry from 3 to 0 globally and in active projects
    2. Enabled CaptchaTronix in CB as well as in GSA SER itself
    3. Lowered threads from 500 to 150

    Let's see the outcome in 24 hours;)
  • Trevor_BanduraTrevor_Bandura 267,647 NEW GSA SER Verified List
    @magically Great.
  • edited December 2014
    Thank you Trevor for taking time out of your day to help people. I do really appreciate your effort.

    I tried the sample list you sent me, and I got 11 contextual links out of 250, which is decent enough for me.

    Maybe it boiled down to my lists. I bought SERLISTS and got about 300 contextuals out of the 30k list. It just doesn't make sense to me. The last list was 1 day old and I couldn't build more than 300 links. I just don't know why everything changed almost overnight.

    edit: and no, I'm not using OCR for recaptchas, but I still expect more than I currently get at least
  • SvenSven www.GSA-Online.de
    @magically You should not add any captcha service to CB, as it will slow things down. Always add it to SER directly. It's a big increase in speed.
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    @Sven

    Got it - corrected that setting, and it's now back to the way it was before, in GSA SER only.

    I did reduce speed by 1000%:P

    However, now CaptchaTronix is 'overloaded' - could that cause problems with reCAPTCHA?
  • Trevor_BanduraTrevor_Bandura 267,647 NEW GSA SER Verified List
    @markhoward What is your setup like? Something is wrong for sure if you only got 11 from the list of URLs I sent you. Going to PM you.

    @Sven @magically

    I know Sven said it's a big increase in speed, but I have to disagree there. If you think about it, SER is sending the reCAPTCHA to CB anyway. With the OCR set up in CB, CB sends it to the OCR service and then sends the answer back to SER, instead of CB sending a "not solvable" back to SER and SER then sending it to the OCR service. Either way there won't be an increase or decrease in speed. But the main reason I like having the OCR in CB is so I can see if it's actually solving; in SER you can't see what's going on.

    I've tried both ways and I have always had great results with the way I have mine set up.
  • SvenSven www.GSA-Online.de

    Let me explain why it is faster to configure it in SER only and not in CB:

    SER sends the captcha to CB and has to wait until it gets a reply. During this time all other threads are queued in SER, because they all send their requests to CB in sequence. If CB now forwards the captcha to a service, all the other queued threads have to idle. That doesn't happen if CB returns immediately and SER sends the captcha to the service itself.
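
A toy model of the queueing argument above, sketched in Python. It is not a measurement of SER or CB. It assumes CB handles one request at a time (as described in the post), and that the remote service can work on any number of captchas at once when SER calls it directly. The timings (0.5 s for CB's own attempt, 10 s for the remote service) and both function names are made up for illustration.

```python
def via_cb_seconds(captchas: int, cb_seconds: float, service_seconds: float) -> float:
    """Service configured inside CB: every queued request also waits for the remote call."""
    return captchas * (cb_seconds + service_seconds)


def via_ser_seconds(captchas: int, cb_seconds: float, service_seconds: float) -> float:
    """Service configured in SER: CB answers right away, remote calls overlap."""
    if captchas == 0:
        return 0.0
    # The last captcha waits for the whole CB queue, then for one remote call.
    return captchas * cb_seconds + service_seconds


# Hypothetical numbers: 150 queued captchas, 0.5 s in CB, 10 s at the service.
print("through CB :", via_cb_seconds(150, 0.5, 10.0), "seconds")
print("through SER:", via_ser_seconds(150, 0.5, 10.0), "seconds")
```

The model leaves out the thread limit of the captcha plan, which would cap how many remote calls can overlap, and it says nothing about the logging question raised later in the thread.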

  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    I will let my current settings run for a couple more hours to get more data to analyse.

    Obviously, more threads are needed from CaptchaTronix in order to handle 150 threads in GSA SER.
    150/5 = 30 threads = $69.77 per month.

    That is surely expensive, considering additional costs such as server, text-captcha solvers, proxies etc.


  • Tim89Tim89 www.expressindexer.solutions
    @Sven but if you configure it your way, how is the user supposed to track correctly solved captchas etc. if they are not going through CB, which is useful for logging statistics?

    I don't even know if the captcha service is actually working when I use your method. If I set the OCR service in CB, I can see in the log when a reCAPTCHA is being solved or sent to the service, and then I can see the result in the CB log. Running the service directly through SER doesn't give me these stats.

    Also, if I run, let's say, Spamvilla directly through SER, the CB log displays "unable to solve this captcha" and simply skips it. How do I know whether this captcha is being dealt with by Spamvilla without going through CB?
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    @Tim89
    That's a good question right there - I would also like to know more about this.

    Which OCR solver are you using? Which Plan? Price?
  • @magically

    I think you should try Spamvilla. The success rate is usually pretty high, but these days it has gone down dramatically, and I think Kelvin is working on this problem.
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    @blackseocn

    Oh, so Spamvilla offers OCR Solving - well if that's the case I better look into it....
  • @magically

    Yes, you can check it.  
  • Trevor_BanduraTrevor_Bandura 267,647 NEW GSA SER Verified List
    @magically In the CaptchaTronix members area, you should see a log showing how many captcha threads you're using. I've run them on 3 PCs and the most threads I've ever seen used was 8. I had the 10 thread package and it was fine for me.

    I'm sure you should be able to get away with the 5 thread package and still not use it all.
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    @blackseocn

    Yes, I just found it - quite similar to captchatronix in terms of prices and threads...
    However if it's better, it's worth it for sure.

    What plan are you using versus threads?
  • Tim89Tim89 www.expressindexer.solutions
    edited December 2014
    @magically I have not used an OCR service in a LONG LONG time. I scraped targets myself and depended solely on CB until these problems with low link numbers arose.

    I only just recently purchased a sub with Spamvilla to see if my verified links would increase to something I would normally get, but this was not the case. Although it did improve verified numbers, it's nothing compared to what I would normally achieve 5 months ago.

    I was happily achieving 200,000-400,000 verified contextual links per day with only CB early this year, and now I'm down to around 20,000 verifieds per day WITH OCR services. Now that's depressing. I'm fed up with trying to find the issue at the moment, and I'm trying to find additional link sources.
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    @Trevor_Bandura
    Yes, I'm aware of that... In fact I'm always using 5/5 threads, which I find really, really strange.

    Small preview:
    Date                   Submitted Captchas
    19th December, 2014    20 583
    18th December, 2014    92 322
    17th December, 2014    52 060
    16th December, 2014    83 176

     
    Could be something is wrong here somehow, since it always uses 5/5 threads?
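
One way to sanity-check the "always 5/5 threads" observation is a back-of-the-envelope estimate: average threads in use is roughly captchas per second times seconds per solve. The sketch below uses the daily counts from the post above. The 5-second average solve time is purely an assumption (the thread gives no solve time), and `average_threads_in_use` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Daily submitted captchas as posted above (the 19th is likely a partial day).
submitted = {
    "16th December, 2014": 83_176,
    "17th December, 2014": 52_060,
    "18th December, 2014": 92_322,
    "19th December, 2014": 20_583,
}


def average_threads_in_use(captchas_per_day: int, solve_seconds: float) -> float:
    """Average number of captchas in flight, assuming submissions are spread evenly over the day."""
    return captchas_per_day / SECONDS_PER_DAY * solve_seconds


ASSUMED_SOLVE_SECONDS = 5.0  # assumption for illustration, not a figure from the thread

for day, count in submitted.items():
    estimate = average_threads_in_use(count, ASSUMED_SOLVE_SECONDS)
    print(f"{day}: about {estimate:.1f} threads busy on average")
```

With that assumed solve time, the busiest day (92 322 captchas) works out to a bit over 5 threads on average, which would be consistent with a 5 thread plan being saturated. A shorter or longer real solve time changes the estimate proportionally.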
  • Trevor_BanduraTrevor_Bandura 267,647 NEW GSA SER Verified List
    I have the 10 thread package, and it's plenty for me, and I'm a pretty heavy user.
  • magicallymagically http://i.imgur.com/Ban0Uo4.png
    @Tim89
    I fully understand your frustration, as I'm in the same situation really. I feel for you, and the money spent on testing various solutions has surely maxed out.

    Could be that I should try to increase the amount, i.e. get a bigger package.

    I'm also considering running one of your lists to see what I can pull out of a verified one.
    Would be interesting for sure. Additionally, it would be interesting to see what you can pull out of my latest scraping - perhaps your setup is different from mine and able to get it done....
  • Trevor_BanduraTrevor_Bandura 267,647 NEW GSA SER Verified List
    @Tim89 The problem is that sites have changed the reg process. I did send Sven a list of websites that SER used to be able to post to but could not with newer versions, and those were fixed.

    What you'll need to do is get a list of sites that SER used to be able to post to and send them over to Sven, but only if those sites still allow registrations and posting of articles and such. Maybe Sven only needs to add more fields to the generic_fields.dat file to get them to work again. I make my own additions to that file all the time and it does improve things for me.

    Make this one simple change to the generic_fields.dat file.

    Change all instances of:

    %random_email% to %your e-mail% and restart SER.

    I know @Sven will comment on that. lol

    @magically If you're always using all 5 threads, it might be a good idea to upgrade to the 10 thread package.
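
For anyone who would rather script the generic_fields.dat change suggested above than do it by hand, here is a minimal sketch in Python. It assumes the file stores the macros as plain single-byte text (it works on raw bytes so nothing else in the file gets re-encoded), and it keeps a `.bak` copy first. The function name is hypothetical, the real location of the file in a SER install is not given in this thread, and the demo below only touches a throwaway temp file with made-up content.

```python
import shutil
import tempfile
from pathlib import Path

OLD_MACRO = b"%random_email%"
NEW_MACRO = b"%your e-mail%"


def swap_email_macro(path: Path) -> int:
    """Replace every OLD_MACRO with NEW_MACRO in the file; return how many were replaced."""
    data = path.read_bytes()
    count = data.count(OLD_MACRO)
    if count:
        # Keep a backup next to the original before overwriting it.
        shutil.copy2(path, path.with_name(path.name + ".bak"))
        path.write_bytes(data.replace(OLD_MACRO, NEW_MACRO))
    return count


# Demo on a temporary file. The two lines below are invented, not the real file format.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "generic_fields.dat"
    demo.write_bytes(b"email=%random_email%\nconfirm=%random_email%\n")
    replaced = swap_email_macro(demo)
    result = demo.read_bytes()
    backup_exists = demo.with_name("generic_fields.dat.bak").exists()

print(f"replaced {replaced} occurrence(s), backup created: {backup_exists}")
```

To use it for real you would point `swap_email_macro` at the actual file with SER closed, then restart SER as described above. The backup makes it easy to go back to the default if the change doesn't help.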
  • Trevor_BanduraTrevor_Bandura 267,647 NEW GSA SER Verified List
    @magically Sure. It would be greatly appreciated if you purchased my list service. I'm actually running a sale right now for my service that ends later this afternoon.


    Did not want to post about my service without it being mentioned first.

  • Tim89Tim89 www.expressindexer.solutions
    edited December 2014
    @Trevor_Bandura I know what you're saying, however I simply don't have the time at the moment to debug things, narrow my thousands and thousands of potential targets down to a few, find the similarities between those targets, and narrow the list down further to send to Sven.

    I am a user of the tool at the end of the day, not a developer. This must be understood too... don't you think? I'm not having a pop at anybody, I just don't have the time needed to do any debugging.

    And what will that change (%your e-mail%)? If Sven thinks %random_email% should be the default setting, then shouldn't this be the best setting?
  • Tim89Tim89 www.expressindexer.solutions
    edited December 2014
    How about checking the official websites of all the current platforms (such as https://www.drupal.org/), looking for any recent changelogs or updates, and then including any needed/added fields etc.? This would rule out new platform-wide updates as the reason SER can't post.

    Once you have gone through the list of platforms SER supports, checking against the latest official installations, at least all of the non-postable targets we hit will be due to custom-designed platforms (platforms that have been tweaked to avoid spam).

    It shouldn't take a lot of time for a dev who knows exactly what to look out for. The cost is 1 domain and some time to download and install each platform one by one and check that the submitting/verification process in SER works with the latest version of the platform.
  • Trevor_BanduraTrevor_Bandura 267,647 NEW GSA SER Verified List
    edited December 2014
    @Tim89 The %your e-mail% forces SER to use one of the emails in your project, and not some random one that SER creates. Sometimes SER will create a random email for engines that usually don't need email validation to activate an account to log in and post.

    Now what has happened in the past couple of months or so is that these "Auto Approve" sites have changed, so you need to validate your email address to activate your account. With SER thinking a site is still auto approve, SER uses a fake email address that it can't log in to, so it can't get the validation link, and you'll see messages in the log saying something like:

    "Email validation used but no valid email chosen" or something along those lines.

    This is not a 100% fix, but I have seen an improvement in verified links, and I have seen fewer of those messages in the log.

    I'm not saying @Sven is wrong to have %random_email% in that file, but with all due respect to him, because I'm using SER all the time, I can tell within 12 hours or so whether a small change like this improves things or makes them worse.

    Because you're also a pretty heavy user of SER, you should also be able to tell if a small change like this will improve things for you.

    All I'm saying is just try it and give it about 12 hours or so to see if it helps. Reset your SER stats and let it run.