Attention mass raters of WOT

Introduction

The malwaredomainlist.com recent updates page contains a multi-column table that isn't suitable for rating with the mass-rater. Selecting and copying this table to a text file gives a tab-delimited table that still isn't ideal.

With a short sed script, you can convert a text file in this tab-delimited format into something that works with WOT's mass-rater.

How do I do this?

Paste this into a file (I called it "mdlsed"):

#!/bin/sed -f
# Needs GNU sed: \t (tab) is a GNU extension.
s/^................\t//
/^-/d
s/\t.*$//
s/\/.*$//
s/^www\.//

Select and copy the table from the "recent updates" page into a text file.

Then, after running chmod 755 mdlsed to make the script executable, run this command:

./mdlsed < old-domain-file > new-domain-file

"new-domain-file" will contain only a list of domains, without any of the other columns.

How does it work?

The five lines of the sed script (stream editor using regular expressions) do the following:
1. remove the first sixteen characters (the date) plus the tab that follows
2. remove any entries starting with a dash (URL not applicable)
3. remove everything from the next tab to the end of the line (all other columns)
4. (optional, done anyway) remove the path after the domain
5. (optional, done anyway) remove "www." from the domain
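
To see the five steps in action, here is a small sketch that runs them on a few made-up rows. The rows, the addresses in them, and the function name clean_mdl are invented for illustration and are not real MDL data; I'm only assuming the layout described above (date, tab, URL, tab, other columns). It uses a literal tab instead of \t so it should also work with a non-GNU sed.

```shell
#!/bin/sh
# Sketch only: the five steps applied to invented rows, not real MDL data.
TAB=$(printf '\t')   # literal tab, so no GNU-only \t is needed

clean_mdl() {
    # 1. strip the 16-character date and its tab (.\{16\} = sixteen dots)
    # 2. drop rows whose URL is a dash (deleting avoids a blank line)
    # 3. strip everything from the next tab onwards
    # 4. strip the path after the domain
    # 5. strip a leading "www."
    sed -e "s/^.\{16\}${TAB}//" \
        -e '/^-/d' \
        -e "s/${TAB}.*\$//" \
        -e 's/\/.*$//' \
        -e 's/^www\.//'
}

# Hypothetical input rows: date, URL, then two other columns.
printf '2009/11/14_11:34\twww.example.com/bad/path.php\t192.0.2.1\tnote\n2009/11/13_20:52\t-\t192.0.2.2\tnote\n2009/11/13_20:50\texample.org\t192.0.2.3\tnote\n' | clean_mdl
```

With these rows, the first becomes example.com, the dash row disappears, and the third becomes example.org.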

Enjoy!

New!

Joe Wein's recent updates page - http://joewein.de/sw/bl-log.htm - is also in an "ugly" format, unlike his text-only domain list. To format it correctly, follow the steps above, but for the sed script, enter this:

#!/bin/sed -f
s/\s.*$//

For this one, as there's only one replacement, it's easy to skip the file creation altogether and just use the command directly:

sed 's/\s.*$//' < old-domain-file > new-domain-file
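
If you want to check the one-liner before running it on the real file, you can feed it a couple of made-up rows first. The rows below are invented; all I'm assuming about the page is that the domain comes first and is followed by whitespace. The sketch uses [[:space:]], the POSIX spelling of GNU sed's \s, and keeps the expression in single quotes so the shell passes it to sed unchanged. The function name strip_columns is just for this example.

```shell
#!/bin/sh
# Sketch only: keep the first whitespace-delimited field of each row.
strip_columns() {
    # Delete from the first space or tab to the end of the line.
    sed 's/[[:space:]].*$//'
}

# Hypothetical rows: a domain followed by other columns.
printf 'example.com 2009-11-14 some note\nexample.net\t2009-11-13\n' | strip_columns
```

Both rows should come out as bare domains (example.com and example.net).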

Enjoy!

Thanks

Wow that is a lot of information... Um, this is what I do;

Put this code in the address bar,
javascript:document.body.contentEditable="true";void 0

Now I can copy all of the domain names, then put them in Wordpad.
Choose Replace and just type a space in the "Find what" field.
Finally click replace all.

Hope that helps. :D

That's malwaredomains.com

That's malwaredomains.com (or the DNS-BH blog). However, I was talking about a different, unrelated site: malwaredomainlist.com.

Thanks for the help on malwaredomains.com though.

Ohh!

Ok, thanks very much then!
I have always wanted to rate the sites on that website.
I will try what you said, if I can figure it out, that is! heh (I have never used sed before.)

Note:

It appears you are a Windows user from how you mentioned Wordpad. I'm not sure if there's a Windows build for sed, and even so, this technique may not work on Windows.

Use Linux. :)

Question

Why would WOT mass raters dump the malwaredomains list into their own ratings, since malwaredomains is already integrated into WOT?

I would not rate a site I have not visited and evaluated myself.

This is what led to the misclassification of sk1project: a single error was replicated by WOT mass-raters.

Sources that are integrated

Sources that are integrated into WOT only influence a rating up to a certain point. It is the users of WOT who endorse that rating and give it more "popularity" (the five people next to the rating category). Their ratings may also turn a site that was borderline green into strong red.

re: malwaredomainlist.com

Mass-rating is quick, but not always accurate.
MDL and similar lists rely on submissions from human users, so they can be fallible (though not intentionally).

Examples:

  • fairplaygames.net
    listed on MDL on: 2009/11/14_11:34
    visit the domain and you get:
    This account has been suspended.
  • dunkerquepromotion.org
    listed on MDL on: 2009/11/13_20:52
    visit the domain and you get:
    can't establish a connection to the server at dunkerquepromotion.org.

The point here is that a domain can be pulled by the web host, in which case there isn't much sense in rating it. The exception is when further investigation shows many similar domains, some of which are still active; then IMO all of them should be rated equally.

The other point is that a domain could be hacked or infected, then caught and cleaned up. In that case the domain is safe again and should not be rated poorly for a single incident, which again requires further investigation.

These are my opinions of course.

-------
WOT Services Ltd. - gives us safety through Web of Trust.
WOT Community - gives us security through unity.
Thank you all
- G7W

I'm sure all blacklists have

I'm sure all blacklists have errors now and then, and the ratings based on those errors will be removed as they come to light. For those who use this mass-rating system, it was previously difficult to use some lists that came in HTML format with other columns. This topic attempts to solve that by stripping those columns.