Slashdot: slashdot.org
Slashdot is a popular website for people interested in reading about and discussing technology and its ramifications. The site publishes short news stories that often carry fresh news and link to sources with more detail. These stories prompt many readers to comment. Most users register and comment under their nicknames, although a considerable number participate anonymously (as "Anonymous Coward").
Although Slashdot allows users to express their opinion freely, moderation and meta-moderation mechanisms are employed to judge comments and enable readers to filter them by quality. More information is available in our forums.
The training data set contains about 140,000 comments on 496 articles dealing with politics.
The zip file contains 496 files, one per article, each holding the article and its comments. Each file has the following format:
<thread id="SD_193_122247">
  <subdomain>politics</subdomain> <!-- the slashdot category of the article (mainly politics) -->
  <title>a string</title> <!-- the title of the article -->
  <admin>a username</admin> <!-- the slashdot admin who published the story -->
  <user>another username</user> <!-- the user who proposed the story -->
  <date>1125558600</date> <!-- unix time stamp format of the date the article was published -->
  <topics>
    <main>Microsoft</main> <!-- primary topic -->
    <secondary>Politics</secondary> <!-- secondary topic -->
  </topics>
  <body>The text of the article</body>
  <posts> <!-- the list of comments related to the article -->
    <post id="SD_193_13453623"> <!-- the id of a direct comment to the article -->
      <post_score>2</post_score> <!-- the score obtained from the metamoderation system -->
      <post_moderation>Informative</post_moderation> <!-- the text descriptor obtained from the metamoderation system -->
      <user id="SD_23782765">A Username</user>
      <date>1125560220</date> <!-- unix time stamp format of the date the comment was published -->
      <body>The text of this comment</body>
    </post>
    <post id="SD_193_13454510" ref="SD_193_13454267">
      <!-- the id of a nested comment (a reply to a comment) and the reference id to its parent comment -->
      <post_score>1</post_score> <!-- the score obtained from the metamoderation system -->
      <user id="SD_90397867">another username</user>
      <date>1125565920</date> <!-- unix time stamp format of the date the comment was published -->
      <body>The text of the next comment.</body>
    </post>
    <post> <!-- other comment --></post>
    <post> <!-- other comment --></post>
  </posts>
</thread>
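The format above can be read with any XML parser. The following is a minimal sketch in Python using the standard library, assuming each file is well-formed XML laid out exactly as in the sample. The names `SAMPLE`, `to_utc`, `parse_thread` and `replies_by_parent` are illustrative and not part of the data set, and the sample values are the placeholders from the format description rather than real data.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from datetime import datetime, timezone

# Trimmed copy of the format sample (placeholder values, not real data).
SAMPLE = """<thread id="SD_193_122247">
  <subdomain>politics</subdomain>
  <title>a string</title>
  <date>1125558600</date>
  <posts>
    <post id="SD_193_13453623">
      <post_score>2</post_score>
      <post_moderation>Informative</post_moderation>
      <user id="SD_23782765">A Username</user>
      <date>1125560220</date>
      <body>The text of this comment</body>
    </post>
    <post id="SD_193_13454510" ref="SD_193_13454267">
      <post_score>1</post_score>
      <user id="SD_90397867">another username</user>
      <date>1125565920</date>
      <body>The text of the next comment.</body>
    </post>
    <post></post>
  </posts>
</thread>"""


def to_utc(text):
    """Convert a Unix timestamp string to an aware UTC datetime (None if missing)."""
    return datetime.fromtimestamp(int(text), tz=timezone.utc) if text else None


def parse_thread(xml_text):
    """Return (thread_info, comments) for one thread document."""
    root = ET.fromstring(xml_text)
    info = {
        "id": root.get("id"),
        "title": root.findtext("title"),
        "published": to_utc(root.findtext("date")),
    }
    comments = []
    for post in root.iterfind("posts/post"):
        if post.get("id") is None:
            continue  # assumption: posts without an id carry no usable data
        score = post.findtext("post_score")
        user = post.find("user")
        comments.append({
            "id": post.get("id"),
            "parent": post.get("ref"),  # None for direct comments on the article
            "score": int(score) if score else None,
            "moderation": post.findtext("post_moderation"),  # optional element
            "user_id": user.get("id") if user is not None else None,
            "date": to_utc(post.findtext("date")),
            "body": post.findtext("body"),
        })
    return info, comments


def replies_by_parent(comments):
    """Group comment ids by parent id; the key None holds direct comments."""
    tree = defaultdict(list)
    for comment in comments:
        tree[comment["parent"]].append(comment["id"])
    return dict(tree)
```

Calling `parse_thread(SAMPLE)` yields the article metadata and a list of comment dictionaries, and `replies_by_parent` on that list recovers the reply structure from the `ref` attributes. Real files may contain elements or edge cases that the sample does not show, so the optional-field handling here is a guess.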
| Attachment | Size |
|---|---|
| slashdot.zip | 33.75 MB |
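The archive can be processed without unpacking it to disk. The sketch below assumes that every member of `slashdot.zip` is one thread file in the format described above. The member names and directory layout inside the archive are not documented here, so the code makes no assumption about them and simply skips anything that does not parse. `iter_threads` and `count_comments` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
import zipfile


def iter_threads(archive):
    """Yield (member_name, root_element) for every parseable archive member.

    `archive` may be a path such as "slashdot.zip" or a binary file object.
    """
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):
                continue  # directory entry, not a file
            try:
                yield name, ET.fromstring(zf.read(name))
            except ET.ParseError:
                continue  # assumption: silently skip members that are not valid XML


def count_comments(archive):
    """Map each thread id to the number of <post> elements it contains."""
    return {
        root.get("id"): len(root.findall("posts/post"))
        for _, root in iter_threads(archive)
    }
```

For example, `sum(count_comments("slashdot.zip").values())` would give the total number of comments in the archive, which can be compared against the figure of about 140,000 stated above.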