SER and Filters/Blocklists

gsa8mycows forum.gsa-online.de/profile/11343/gsa8mycows
edited January 2016 in GSA Search Engine Ranker
@Sven

What format should I give SER for domain filtering? I've written a bash script that pulls data on malware sites, and the resulting list will be auto-updated over time.

Viewed in Cygwin, one of the default SER filters looks like this:
<pre>
        20160901        womenzz.com     exploits        www.malwaredomainlist.com/update.php    20150308        20131227        20120122        20101112
        20160901        woojeoung.com   attackpage      www.google.com/safebrowsing     20150308        20131227        20120831        20120131
        20160901        xe6.ru  attackpage      google.com/safebrowsing 20150308        20131228        20110311        20090913
        20160901        yanagi.co.kr    attackpage      safebrowsing.google.com 20150308        20131227        20120522        20110721
        20160901        yescounter.com  attackpage      safebrowsing.clients.google.com 20150308        20131227        20120407        20101211
        20160901        yondental.co.kr gumblar www.malwaredomainlist.com/update.php    20150308        20131226        20110316        20091104
        20160901        yusungtech.co.kr        gumblar blog.unmaskparasites.com        20150308        20131227        20120418        20110715
        20160901        zdesestvareznezahodi.com        iframe  blogs.paretologic.com/malwarediaries    20150308        20131227        20121901        20111003
        20160901        zenitchampion.cn        attackpage      safebrowsing.google.com 20150308        20131123        20121201        20100712
        20160931        bigtoprocks.cn  malware safebrowsing.clients.google.com 20141231        20130215        20090909
        20160931        bqtl.in malware safebrowsing.clients.google.com 20141231        20130215        20090909
        20160931        c6h.at  malware safebrowsing.clients.google.com 20141231        20130302        20090909
        20160931        c8t.at  malware safebrowsing.clients.google.com 20141231        20130302        20090909
        20160931        c9u.at  malware safebrowsing.clients.google.com 20141231        20130302        20090909
        </pre>

Does SER automatically recognize subdomains in a file?

The blocklist on my web server looks like this:

ads.blockedsite1.net
ads.blockedsite2.net
ads.blockedsite3.net
etc.

No IP addresses, only a text file served by nginx.

Can I also include bad words in there?

Do I need to add http:// or https:// in front of sub- and root domains, like www.gsa-online.de and forum.gsa-online.de?

Comments

  • Sven www.GSA-Online.de
    Listing URLs or just domains is OK.
  • gsa8mycows forum.gsa-online.de/profile/11343/gsa8mycows
    edited January 2016
    @Sven

    What about bad words? If I include words like sex, cunt, fuck, etc. in my own blacklist, will SER stop posting to sites whose URL contains them?
  • Sven www.GSA-Online.de
    For that you have to add the words to the project filter. But here you can also use a macro that loads entries from an external file, like %file-c:\bad\words.txt%
  • gsa8mycows forum.gsa-online.de/profile/11343/gsa8mycows
    Thanks!
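
For anyone who wants to turn a mixed list into the plain "just domains" form mentioned above, here is a minimal sketch in Python. It assumes the default filter lines are whitespace-separated, with a numeric date first and the domain second, as in the sample in the first post. The function names `host_of` and `extract_domains` are made up for this example and are not part of SER.

```python
from urllib.parse import urlsplit


def host_of(entry):
    """Return the lowercase host of a URL or bare domain ('' if none)."""
    entry = entry.strip()
    if "://" not in entry:
        # Prefix '//' so urlsplit treats a bare domain as a network location.
        entry = "//" + entry
    return (urlsplit(entry).hostname or "").lower()


def extract_domains(lines):
    """Collect unique hosts, keeping the order in which they first appear.

    Assumption: a line whose first field is all digits is in the default
    filter layout (date, domain, type, source, ...); any other non-empty
    line is treated as a single URL or domain.
    """
    seen = set()
    domains = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0].isdigit() and len(fields) >= 2:
            raw = fields[1]
        else:
            raw = fields[0]
        host = host_of(raw)
        if host and host not in seen:
            seen.add(host)
            domains.append(host)
    return domains


if __name__ == "__main__":
    sample = [
        "        20160901        womenzz.com     exploits        www.malwaredomainlist.com/update.php    20150308",
        "ads.blockedsite1.net",
        "http://www.gsa-online.de/some/page",
    ]
    for domain in extract_domains(sample):
        print(domain)
```

Subdomains are kept exactly as listed, since the thread does not say whether SER matches them automatically. Bad words still belong in the project filter, not in this file.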