@MaX It's because AWStats does not update in real time. For instance, I ran a test with www.max-test.com and at first nothing showed up on any pages, just like you saw, while my domains were dominating or at least present, depending on how popular the page was. But when I checked back a couple of days later it was there. See http://codrianu.ru/awstats/cgi-bin/awstats.pl?framename=mainright&output=refererpages
Look for max-test.com; it's there and on some other pages, although on popular pages it got pushed into the "others" category.
@Seljo I'm currently scraping for social networks and microblogs, and I'll be working on adding those shortly. I am also working on adding the reCaptcha targets, which is going to morph from reCaptcha targets into a "Non Captcha Breaker but still fixed cost OCR" category, as there are a couple of providers like spamvill and reverse that offer reCaptcha plus 1000 or so other captchas, so I'll include those as long as they are contextual, to start. I'll also add text captchas in that same category, so long as they are contextual targets. I think I will filter blog comments, image comments, etc. from that category, as they are not a quality use of those expensive OCR resources.
That will be at least partially up by the end of the month, aimed at reCaptcha targets, with text captcha following in October.
Hi Matt, thanks for the update. On another subject, I am aware that Microsoft is stopping support for Windows XP as of the 31st of October. "Box" has given notice that I have to upgrade my OS if I wish to continue to use "Box".
What is the exact position here? Does that mean I have to buy Windows 7 (preferably not 8) or the soon-to-be-released Windows 9? Could you let me know please?
@MaX Hmm, it's been years since I used XP, so I'm not up on it. I mean, you could use Vista, I'm sure it's still supported, but 7 is definitely better. I personally use 8.1 on my own machine and it's pretty good, more 7 style, but with the features of 8.
However, I would think Box would continue to work on XP and still sync, just without getting updates to the controller. But I will have to check. XP will have security issues at some point, so it's in your best interest to upgrade regardless. I'll get back to you.
Hi Matt, thanks, just wanting to make sure that Box will continue to operate after the 31st of October. Let me know, but as you said, I will have to change eventually. I will answer your email later today as well.
@MaX Currently the stance is that Box will no longer run on Windows XP after October 31, 2014. That could change as we get closer, but as it stands it will not work after that point. I will PM you.
Here are the stats as of right now. I've been taking a general list building approach in order to increase the size of the list, meaning scraping generally, not focusing too heavily on specific platforms. I am now ramping up the contextual targets, since things are running smoothly.
I'm trying to run the forum list but every single site is coming up as "no engine matches". Any idea what would be causing this? Literally not a single one is recognized by GSA.
Could be the "backlink type" as well. If a selected engine only creates types of backlinks you have unchecked, it will not be used for the project, and then the URLs are not recognised.
Ah, ya, that is correct, had a wrong setting on my end, thanks.
HinkysSEOSpartans.com - Catchalls for SER - 30 Day Free Trial
I've just created the account and while waiting for the box sync to be updated, I've tried to manually download the "latest" list (9-11-14) but I get a 404.
Downloading the one before that now but you might want to look into this. (surely you can create just 1 "most recent list" download link that downloads whatever is in the auto-updating box)
@Hinkys Thanks, I'll fix the link. Unfortunately I cannot create a download link that will just download whatever the latest box sync files are. The issue is that, due to security protocols, box sync doesn't integrate with the members area; they are 2 separate platforms, if you will. If I create a link that works with the latest files, the membership site's security won't secure the download link.
So that means that anyone could download the latest files, even non-members, the link could be shared around the web, and members could still download the files after they unsubscribe. Basically there are 2 different sets of security protocols in place and there is no way to make them work together.
Hmmm, however, now that I think about it, I might be able to work around it. It wouldn't be exactly live, but pretty close, better than what's there now anyway. I'll work on it.
However, I did get your box sync account activated a couple of hours ago or so. Most of the time I have the box sync accounts activated in 12 hours or less, so it generally isn't that much of an issue.
Nonetheless I will fix the link, or rather just update it with a newer list, probably. Thanks for bringing it to my attention.
Yeah I didn't mean directly pulling from the box but rather having a backend script that constantly pulls the files to your server, cleans everything up and prepares a download link which can only be accessed by current paying members.
But yeah, that was fast indeed so it really isn't an issue.
@Hinkys It's actually a good idea to have a secondary script do it and upload it. I hadn't thought of it because no one really uses the files in the members area, since they all just use box.
But it would save me having to stop and update it every so often. It's a good idea and I think I have it all worked out, thanks for the thought.
@Hinkys Thanks for the idea. I now have the lists being auto grabbed from the live list and uploaded to the membership site, so that you can manually download a fresh list without relying on me to manually upload it.
Love automation!
Cheers!
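As a rough illustration of the "auto grab and upload" idea, and not the actual script behind the service, something like the following Python sketch would pick the newest file from a synced folder and copy it to a fixed name in a members-only folder. The folder layout, the `*.zip` pattern, and the `latest-list.zip` file name are all assumptions.

```python
import shutil
from pathlib import Path
from typing import Optional


def publish_latest(sync_dir: Path, members_dir: Path,
                   pattern: str = "*.zip") -> Optional[Path]:
    """Copy the most recently modified list file to a stable download name.

    Returns the path of the published copy, or None if nothing matched.
    """
    candidates = [p for p in sync_dir.glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Newest file by modification time.
    newest = max(candidates, key=lambda p: p.stat().st_mtime)
    members_dir.mkdir(parents=True, exist_ok=True)
    target = members_dir / "latest-list.zip"  # hypothetical fixed name
    shutil.copy2(newest, target)
    return target
```

Run on a schedule (cron or Task Scheduler), this keeps one "most recent list" file fresh without anyone uploading by hand, while the membership site keeps handling access control for the folder.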
Hey, looks great! Can't beat automation.
Though the syncing solution is kinda anti-foolproof, if you know what I mean. Too bad it can't be set to read-only.
Now that I think about it, if you ever think about doing a custom "server to PC" syncing solution (not as general as box or dropbox, but rather exclusively for use in SER or other tools), please hit me up, as I'd definitely be interested in something like that for my own service.
Anyway, thanks for stepping up the automation game with your service, it's definitely a step in the right direction for a truly automated SER experience.
@Hinkys The syncing solution will be read-only soon, but in the meantime I have countermeasures on my server to prevent issues from occurring, as well as full auditing.
I do have 2 methods that were laid out for me of how I can do it myself, but I am just not quite ready to pursue a full in-house solution yet. I need to add a bunch of features first, and once there is more value I can go back and look at that. Although, like I said, read-only is in the works, so it may be a moot point.
Anyway, glad you like the service, and it will only get better from here.
@justice There are about 9,000 contextual targets in the database at the moment, and out of that about 6,000 were added last month (in Sept). I'm still tweaking the contextual target method, so that rate should go up in October, but I'm not sure how much.
@Brandon I saw you signed up, thanks for joining and glad to have you on board! Thanks for the accolades as well.
Also, I've been covered up for the past 3 days and my PM box got slammed, so I am behind. To all those who have messaged me, please bear with me, and I will get your questions answered.
@moonshine I'm backlogged on mails, got a lot of general mail, but I'll try and find your mail.
@RayBan It should work fine with Windows Server 2008 after October. They are discontinuing support for Windows XP, but that is all, to my knowledge. Why do you ask?
@ghetum Those numbers represent unique domains. GSA has a built-in filter for each platform when it comes to removing duplicates, and it chooses domains or URLs. For contextual platforms it's unique domains, but in all cases I go with the default filters.
Yes, the duplicate domains/URLs are removed daily.
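To make the difference between removing duplicate domains and duplicate URLs concrete, here is a small Python sketch. It is only an illustration and not GSA's actual filter; the `dedupe` function and its `by` parameter are made up for this example.

```python
from urllib.parse import urlsplit


def dedupe(urls, by="domain"):
    """Keep the first URL seen for each key, preserving input order.

    by="domain": one URL per host name (case-insensitive).
    by="url":    one copy of each exact URL.
    """
    seen = set()
    kept = []
    for url in urls:
        url = url.strip()
        if not url:
            continue
        if by == "domain":
            # hostname is already lowercased; fall back to the raw URL
            # when no host can be parsed (e.g. a line without a scheme).
            key = urlsplit(url).hostname or url
        else:
            key = url
        if key in seen:
            continue
        seen.add(key)
        kept.append(url)
    return kept
```

With domain mode, `http://example.com/a` and `http://example.com/b` collapse to one entry; with URL mode both stay. Note that this sketch treats `www.example.com` and `example.com` as different hosts.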
@moonshine I'm not avoiding you, but I've been short on time, and when that happens I tend to put more time into building the lists and maintaining the integrity of things vs. trying to walk users step by step through how to use GSA to build links.
I will do my best to help you in this area, but I have spent a good deal of time attempting to help you already, and I have other customers who need help as well, and everyone wants lists. I'm sorry I don't have a large number of hours to devote just to support. Out of close to 100 users you are the only current one with issues in this regard, so it's a matter of trying to figure out what you are doing wrong and helping you do it right. Again, I only have so much time to devote to each person. If need be, I can refund you, and after you figure out how to get things working for your GSA you can use my service again.
I'm getting 115 lpm using Loopline's lists with no significant dips which I assume is because of real-time updates. I could get that lpm with static lists but it would only last a couple days.
@1957525979 I just looked at one of my production servers that is using my lists to fill customer orders and it's getting 712DPM, although I don't really have it turned up all the way.
As for LPM, it depends. It's massively dependent on the actual platform types, your captcha solving setup, your server resources, your filters, etc. There are so many things I couldn't list them all. I have seen anywhere from 50 to 400+ LPM.
I don't really pay a lot of attention to speed. I mean, it's a nice metric to troubleshoot with and to help optimize, but I just set the quality metrics I want and it goes as fast as it goes. I don't care how fast it goes, I want quality end results.
@mcscappum Thanks for chiming in, and glad that you're seeing steady LPM with no big dips. That is the goal of the live sync, along with less hassle etc. Consistency, in a word.
I added 1 new server to the list production setup today and I am going to add another one tomorrow, and I'm still working on the reCaptcha and text captcha list (an optional free extra). I also spent part of the morning redoing how lists are distributed, which should result in more targets long term. So it's only going to get better as we move forward.
Comments
Thanks for the feedback!
Thanks a lot in advance
Max
Thanks
Max
Here are the stats as of right now. I've been taking a general list building approach in order to increase the size of the list, meaning scraping generally, not focusing too heavily on specific platforms. I am now ramping up the contextual targets, since things are running smoothly.
Category - Article............: 3459
Category - Blog Comment.......: 140759
Category - Directory..........: 1096
Category - Document Sharing...: 629
Category - Exploit............: 2220
Category - Forum..............: 4309
Category - Guestbook..........: 8429
Category - Image Comment......: 7054
Category - Indexer............: 22109
Category - Microblog..........: 94
Category - Pingback...........: 649
Category - Referrer...........: 48
Category - RSS................: 50
Category - Social Bookmark....: 278
Category - Social Network.....: 3602
Category - Trackback..........: 20276
Category - Unknown............: 4414
Category - URL Shortener......: 9006
Category - Video..............: 410
Category - Web 2.0............: 61
Category - Wiki...............: 771
-------------------------------
Total.........................: 229723
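If you want to double check the total or see which categories make up most of the list, a quick Python sketch like this will do it. The rows are copied from the stats above; `parse_stats` and the `ROW` pattern are just names made up for this example.

```python
import re

# Category rows copied from the stats post above.
STATS = """\
Category - Article............: 3459
Category - Blog Comment.......: 140759
Category - Directory..........: 1096
Category - Document Sharing...: 629
Category - Exploit............: 2220
Category - Forum..............: 4309
Category - Guestbook..........: 8429
Category - Image Comment......: 7054
Category - Indexer............: 22109
Category - Microblog..........: 94
Category - Pingback...........: 649
Category - Referrer...........: 48
Category - RSS................: 50
Category - Social Bookmark....: 278
Category - Social Network.....: 3602
Category - Trackback..........: 20276
Category - Unknown............: 4414
Category - URL Shortener......: 9006
Category - Video..............: 410
Category - Web 2.0............: 61
Category - Wiki...............: 771
"""
REPORTED_TOTAL = 229723

# Name, then the dot padding, then ": <count>".
ROW = re.compile(r"^Category - (.+?)\.*: (\d+)$")


def parse_stats(text):
    """Return {category name: count} for every row that matches ROW."""
    counts = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_stats(STATS)
total = sum(counts.values())
print("rows add up to reported total:", total == REPORTED_TOTAL)

# Three largest categories with their share of the whole list.
top = sorted(counts.items(), key=lambda item: item[1], reverse=True)[:3]
for name, count in top:
    print(f"{name}: {count} ({count / total:.1%})")
```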
Glad you got it sorted, thanks Sven.
Cheers as well!
Also you should be activated.
Are the duplicate domains removed?