Can we blacklist link farms?
I'd like to be able to blacklist link farms. Another piece of software attempts this by searching its post lists for duplicate IP addresses.
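For the curious, here is a rough sketch of what that duplicate-IP check could look like. This is only my own guess at the approach; the "post_list.txt" file name and the threshold of 5 hosts are made up for illustration.

```python
# Rough sketch: flag domains that resolve to the same IP address.
# "post_list.txt" and the threshold of 5 are assumptions for illustration.
import socket
from collections import defaultdict
from urllib.parse import urlparse

def group_by_ip(urls):
    """Resolve each URL's host and group the hosts by IP address."""
    by_ip = defaultdict(set)
    for url in urls:
        if "://" not in url:
            url = "http://" + url          # bare domains in the list
        host = urlparse(url).hostname
        if not host:
            continue
        try:
            ip = socket.gethostbyname(host)
        except socket.gaierror:
            continue                       # unresolvable host, skip it
        by_ip[ip].add(host)
    return by_ip

if __name__ == "__main__":
    with open("post_list.txt") as f:       # one URL or domain per line
        urls = [line.strip() for line in f if line.strip()]
    for ip, hosts in group_by_ip(urls).items():
        if len(hosts) >= 5:                # arbitrary "looks like a farm" threshold
            print(ip, sorted(hosts))
```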
We are left with two likely options.
1) import xblack.txt or the like (which is an option I don't trust)
or
2) create a blacklist URL and check it every 1440 minutes (which requires some thought about how SER imports).
As it happens, using GSA blacklists is unpopular in the GSA SER community, but my needs are different; I care less about LPM than about long-term accumulation of negative flags.
Is there a way to view SER's blacklist? (This assumes it stores the list somewhere we can read it.)
Viewing the SER blacklist matters to me because I could then verify, and trust, SER's "Import" button for "xblack.txt" and the like.
Are there parameters for creating my own blacklist site? With stopforumspam.com/contributors as an example, SER might be taking every <a href>, or every "http:" reference whether or not it is inside a link.
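To show the distinction I'm asking about, here is a small sketch of both extraction rules applied to the same page. I don't know how SER actually parses anything; "contributors.html" is assumed to be a saved copy of a page like stopforumspam.com/contributors.

```python
# Sketch of two ways an importer could read a blacklist page.
# Neither is known to be what SER does; this only shows the difference.
# "contributors.html" is assumed to be a saved copy of the page.
import re
from html.parser import HTMLParser

class HrefCollector(HTMLParser):
    """Collect only the href values of <a> tags."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

with open("contributors.html", encoding="utf-8", errors="replace") as f:
    page = f.read()

# Rule A: every http(s) reference anywhere in the markup, link or not
all_refs = re.findall(r"https?://[^\s\"'<>]+", page)

# Rule B: only references that appear as <a href> links
collector = HrefCollector()
collector.feed(page)
anchor_refs = [h for h in collector.hrefs if h.startswith("http")]

print(len(all_refs), "raw http references vs", len(anchor_refs), "anchor links")
```

The two counts can differ a lot on a page that mentions URLs in plain text, which is exactly the ambiguity I'd like cleared up before building my own blacklist site.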
Article link farms are rampant.
Comments
You could always create a fake project and then only import the domains/links that you want to use for your main campaign.
Hi Brandon
I edited out my thoughts on link farms and why I distrust software since I thought it too wordy.
No, technically many links are not on link farms. Identifying link farms in simple software is usually just a matter of flagging the sites that share IP addresses. The SE could be much more sophisticated... MUCH MUCH more.
My definition of a link farm is whatever the SE might detect in the future.
You will not want to hear this, but I foresee algorithms that can detect your use of the blogs of others. Hence, the investment of paying people to find amenable webmasters is still fraught with peril.
All of this does suggest that I should set up bots, a personal blacklist URL, and run these algorithms to feed SER automatically, along the lines of option 2. Only importing approved domains would defeat THE COOLEST feature of GSA: continual scraping for new places to drop links. That would make me sad.
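As a sketch of what that option-2 automation might look like on my end: the file name, the flagging callback, and the daily 1440-minute cycle are all my own assumptions about the workflow, nothing SER-specific.

```python
# Sketch: rebuild a personal blacklist file on a roughly daily cycle so that
# SER (or anything else) could fetch it from a URL I control.
# The file name, the callback, and the interval are assumptions, not SER behaviour.
import time
from datetime import datetime

def build_blacklist(flagged_hosts, out_path="blacklist.txt"):
    """Write one host per line, about the simplest format an importer could read."""
    with open(out_path, "w") as f:
        for host in sorted(flagged_hosts):
            f.write(host + "\n")

def run_daily(collect_flagged):
    """Re-run the collection step every 1440 minutes."""
    while True:
        flagged = collect_flagged()        # e.g. the duplicate-IP grouping sketched above
        build_blacklist(flagged)
        print(datetime.now(), "wrote", len(flagged), "hosts")
        time.sleep(1440 * 60)              # 1440 minutes

# run_daily(lambda: {"farm-example-1.com", "farm-example-2.com"})  # hypothetical source
```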
For those who are puzzled by this little exchange: I have lived near Fresno, have spoken to Brandon, and have been influenced by his writings in other forums and by our conversations.
High risk, high reward!
Clients are different, of course; I'm not willing to do my testing on their sites, but on my own sites I'll try anything I think will help.
I've come to appreciate negative SEO recently. We deduce that the penalty algorithms sift through the index and apply penalties AFTER we see a benefit from our links. Also, there is some threshold of penalties that percolates up to the linked site.
The SE (especially Google) process penalties in their index with a separate bot that works slowly.
"What has this got to do with GSA SER?" The best defense against such penalties is to not get them in the first place. UD attempts to do so by finding link farms based on sites sharing IP addresses. AFAIK, SER lacks such a feature--an IP address blacklist and a tool to catalog IP addresses of known targets.
UD is not known for being very fast, but the big-O math for this says it must
1) keep a list of all known IP addresses, at one 32-bit word each
2) check each new address against that list
Item 1 would be 4 bytes times the number of sites UD knows about... say 100,000 sites = 400 KB.
No problem.
Item 2 OUGHT to be a sorted list. Finding a new item's place takes log n time, as does checking whether an item is already present... and that work only happens when those lists are updated.
In short, O(log n) is not a time-consuming operation AND it runs infrequently. It is not a computing problem; UD just makes it obnoxious to maintain lists.
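To make that concrete, here is a minimal sketch of items 1 and 2: each IPv4 address stored as one 32-bit integer in a sorted array, with binary search for lookups. The 100,000 figure is just the example size from above; the function names are mine.

```python
# Sketch of items 1 and 2: a sorted array of 32-bit IPv4 addresses.
# Membership checks are a binary search, O(log n); inserts find their spot in
# O(log n) (the shift itself is linear, but it only happens when the list is updated).
import bisect
import socket
import struct
from array import array

def ip_to_int(ip):
    """Pack a dotted-quad IPv4 string into one 32-bit integer."""
    return struct.unpack("!I", socket.inet_aton(ip))[0]

known = array("I")                         # 4 bytes each: 100,000 IPs is about 400 KB

def add_ip(ip):
    value = ip_to_int(ip)
    pos = bisect.bisect_left(known, value)
    if pos == len(known) or known[pos] != value:
        known.insert(pos, value)           # keeps the array sorted

def seen(ip):
    value = ip_to_int(ip)
    pos = bisect.bisect_left(known, value)
    return pos < len(known) and known[pos] == value

add_ip("192.0.2.1")
print(seen("192.0.2.1"), seen("198.51.100.7"))   # True False
```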