Firstly, thanks for being so honest and open-minded - Some list sellers here could learn a lot from you!
I totally agree with many of your observations and hence agree many are full of shit:P
Yes, I use Captcha-Tronix for recaptcha, but only the small plan (I hate to pay even more).
Solve rate in CB is currently 49%
I use Mexala Proxies in combination with Buyproxies
In terms of footprints - well I used an ini extractor to get the ones I want from GSA Ser itself.
Keywords - I manually checked at least 100 to see the result rate, and use 15000 to harvest shotgun-style.
What I do know is that many of the forums fail - login fails, wrong username, etc.
I do understand why some of these lists are so pricey - they are absolutely not that easy to build, and a bunch of additional services like solvers for recaptcha are a must!
I have managed to build some good lists, however it has not been an easy journey - that's for sure:P
@magically You must be getting lots of blob recaptchas?
49% solve rate with Captcha Tronics is pretty low. I'm used to seeing high 80s to 90%.
I don't know how you have CB set up, but change your ReTry to 0. Most have it at 3, I think, because that's what they were told works best.
For proxies, don't run more than 2-3 threads per proxy, especially if you're processing lots of reCaptcha. Otherwise you'll start to get blobs, and those are way harder to solve.
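As a rough illustration of the 2-3 threads per proxy rule of thumb, the thread ceiling for a proxy pool is just a multiplication. This is a minimal sketch; the function name `max_threads` and the pool size of 10 are made up for the example and are not SER settings.

```python
def max_threads(proxy_count: int, threads_per_proxy: int = 3) -> int:
    """Upper bound on total threads for a proxy pool (hypothetical helper)."""
    if proxy_count < 0 or threads_per_proxy < 0:
        raise ValueError("counts must be non-negative")
    return proxy_count * threads_per_proxy


# Example pool of 10 proxies (an assumed figure, not from the thread).
for per_proxy in (2, 3):
    print(f"{per_proxy} threads/proxy -> at most {max_threads(10, per_proxy)} threads")
```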
I think the global setting in SER for retry should be the only one you need to change unless you have that set in your projects too.
Play around with that setting. I find that 0 retries works the best for me anyway. I would let SER run for 24 hours with each retry setting to see which one gives the most verified links.
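One way to keep track of such a comparison is to note the verified count after each 24-hour run and pick the highest. A minimal sketch, assuming you record the numbers by hand; `best_retry_setting` is a hypothetical helper and the counts below are placeholders, not measurements.

```python
def best_retry_setting(verified_by_retry: dict[int, int]) -> int:
    """Return the retry value whose 24-hour run produced the most verified links."""
    if not verified_by_retry:
        raise ValueError("no runs recorded")
    return max(verified_by_retry, key=verified_by_retry.get)


# Placeholder numbers only: replace with your own 24-hour totals.
runs = {0: 1200, 1: 1100, 3: 900}
print("best retry setting:", best_retry_setting(runs))
```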
Thank you Trevor for taking time out of your day to help people. I do really appreciate your effort.
I tried the sample list you sent me, and I got 11 contextual links, which is decent enough for me out of 250.
Maybe it boiled down to my lists. I bought SERLISTS and I got like 300 contextuals out of the 30k list. It just doesn't make sense to me. The last list was 1 day old and I couldn't build more than 300. I just don't know why everything changed almost overnight.
edit: and no, I'm not using OCR for recaptchas, but I still expect more than I currently get at least
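To put the two results above on the same scale, the yield can be expressed as a percentage of targets. A small sketch using only the figures quoted in the post (11 of 250, and 300 of 30k); `yield_pct` is a name chosen for the example.

```python
def yield_pct(links: int, targets: int) -> float:
    """Verified links as a percentage of the targets in the list."""
    if targets <= 0:
        raise ValueError("targets must be positive")
    return 100.0 * links / targets


print(f"sample list:    {yield_pct(11, 250):.1f}%")      # 11 contextuals from 250 targets
print(f"purchased list: {yield_pct(300, 30_000):.1f}%")  # 300 contextuals from 30k targets
```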
I know Sven said it's a big increase in speed, but I have to disagree there. If you think about it, SER is sending reCaptcha to CB anyway. With the OCR set up in CB, CB sends it to the OCR and then sends the answer back to SER, instead of CB sending a "not solvable" back to SER and SER then sending it to the OCR. Either way there won't be an increase or decrease in speed. But the main reason why I like having the OCR in CB is so I can see if it's actually solving; in SER you can't see what's going on.
I've tried both ways and I have always had great results with the way I have mine set up.
Let me explain why it is faster to not configure it in CB but SER only:
SER sends the captcha to CB but has to wait until it gets a reply. During this time all other threads are queued in SER, as they are all sending requests in sequence to CB. When CB then sends the captcha on to a service, all other queued threads have to idle. That's not happening if CB returns immediately and SER can send it to the service itself.
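The queueing argument can be sketched with a toy timing model. It assumes, as described above, that CB handles requests strictly one at a time, that every captcha ends up needing the external service, and that service calls made from SER run in parallel. The function names and the timings (0.5 s local attempt, 15 s service) are made up for illustration.

```python
def total_time_service_in_cb(n: int, t_local: float, t_service: float) -> float:
    """Service configured inside CB: each request holds CB for the local
    attempt plus the remote call, so everything is serialized."""
    return n * (t_local + t_service)


def total_time_service_in_ser(n: int, t_local: float, t_service: float) -> float:
    """Service configured in SER: only CB's local attempts are serialized;
    each thread then calls the service on its own, in parallel."""
    if n == 0:
        return 0.0
    return n * t_local + t_service


# Hypothetical load: 10 captchas arriving at once.
print(total_time_service_in_cb(10, 0.5, 15.0))   # 155.0 seconds
print(total_time_service_in_ser(10, 0.5, 15.0))  # 20.0 seconds
```

Under these assumptions the gap grows with the number of waiting threads; with a single thread the two setups take the same time.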
@Sven but if you configure it your way, how is the user supposed to track correctly solved captchas etc, if it is not going through CB which is useful for logging statistics..?
I don't even know if the captcha service is actually working when I use your method. If I set the OCR service in CB, I can see in the log when a recaptcha is being solved or sent to the service, and then I can see the result in the CB log. Running the service directly through SER doesn't give me these stats.
Also, if I run, let's say, Spamvilla directly through SER, the CB log displays "unable to solve this captcha" and simply skips it within the log. How do I know if this captcha is being dealt with by Spamvilla without going through CB?
I think you should try Spamvilla. The success rate is pretty high, but lately it has gone down dramatically, and I think Kelvin is working on this problem.
@magically In the CaptchaTronics members area, you should see a log with how many captcha threads you're using. I've run them on 3 PCs and the most threads I've ever seen used was 8. I had the 10 thread package and it was fine for me.
I'm sure you should be able to get away with the 5 thread package and still not use it all.
@magically I have not used an OCR service in a LONG LONG time, I scraped targets myself and depended solely on CB until these problems with low link numbers arose.
I only just recently purchased a sub with Spamvilla to see if my verified links would increase to something I would normally get, but this was not the case. Although it did improve verified numbers, it's nothing compared to what I would normally achieve 5 months ago.
I was happily achieving 200-400,000 verified contextual links per day with only CB early this year, and now I'm down to around 20,000 verifieds per day WITH OCR services. Now that's depressing. I'm fed up of trying to find out the issue at the moment and I'm trying to find additional link sources.
I fully understand your frustration, as I'm in the same situation really. I feel for you, and the money spent on testing various solutions has surely maxed out.
Could be. I should try to increase the amount, i.e. get a bigger package.
I'm also considering running one of your lists to see what I can pull out of a verified one.
Would be interesting for sure. Additionally, it would be interesting to see what you can pull out of my latest scraping - perhaps your setup is different than mine and able to get it done....
@Tim89 The problem is that sites have changed the reg process. I did send Sven a list of websites that SER used to be able to post to but could not with newer versions, and those were fixed.
What you'll need to do is get a list of sites that SER used to be able to post to and send them over to Sven only if those sites still allow registrations and posting of articles and such. Maybe Sven might only need to add more fields to the generic_fields.dat file to get them to work again. I do my own additions to that file all the time and it does improve things for me.
Make this one simple change to the generic_fields.dat file.
@magically Sure, it would be greatly appreciated if you purchased my list service. I'm actually running a sale right now for my service that ends later this afternoon.
@Trevor_Bandura I know what you're saying, however I simply don't have the time at the moment to debug things and narrow my thousands and thousands of potential targets down to a few and find out similarities for these targets and narrow the list further down to send to Sven.
I am a user of the tool at the end of the day, I'm not a developer, this must be understood too... don't you think? I'm not having any pops at anybody, I just don't have the time needed to do any debugging.
And what will that change (%your e-mail%)? If Sven thinks %random_email% should be the default setting, then shouldn't this be the best setting?
How about checking the official websites of all the current platforms (such as https://www.drupal.org/), looking for any recent changelogs or updates, and then including any needed/added fields etc.? This would eliminate new platform-wide updates as a cause of SER not being able to post.
Once you have gone through the list of platforms SER supports, checking each against its latest official installation, at least all of the non-postable targets we hit will be due to custom-designed platforms (platforms that have been tweaked to avoid spam).
It shouldn't take a lot of time for a dev who knows exactly what to look out for to do this, at the cost of 1 domain and some time to download and install the platforms one by one and check that the submitting/verification process within SER works with the latest version of each platform.
@Tim89 The %your e-mail% forces SER to use one of the emails in your project, and not some random one that SER creates. Sometimes SER will create a random email for engines that usually don't need email validation to activate an account to log in and post.
Now what has happened in the past couple of months or so is that these "Auto Approve" sites have changed so that you need to validate your email address to activate your account. With SER thinking it's still auto approve, SER uses a fake email address that it can't log in to, so it can't get the validation link, and you'll see some messages in the log saying something like:
"Email validation used but no valid email chosen" or something along those lines.
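To judge whether the change reduces those messages, one option is to count matching lines in a saved copy of the log before and after. A minimal sketch: the message wording is only approximate (as noted above), the sample lines are invented, and the log line format is an assumption, so adjust the search text to whatever your log actually shows.

```python
from typing import Iterable


def count_matches(lines: Iterable[str], needle: str) -> int:
    """Count log lines containing the given text, ignoring case."""
    needle = needle.lower()
    return sum(1 for line in lines if needle in line.lower())


# Invented sample lines; a real log will look different.
sample_log = [
    "example.com - Email validation used but no valid email chosen",
    "example.org - submission successful",
    "example.net - email validation used but no valid email chosen",
]
print(count_matches(sample_log, "no valid email chosen"))
```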
This is not a 100% fix but I have seen an improvement with getting more verified links. And I have seen less of those messages in the log.
I'm not saying @Sven is wrong here by having %random_email% in that file, but with all due respect to him, because I'm using SER all the time, I can tell within 12 hours or so whether a small change like this improves things or makes them worse.
Because you're also a pretty heavy user of SER, you should also be able to tell if a small change like this will improve things for you.
All I'm saying is just try it and give it about 12 hours or so to see if it helps. Reset your SER stats and let it run.