The test set will be sent by mail to the participants after the paper submission. It will be composed of some data that is "similar" to the training set, although some metadata like labels or ratings may have been removed. The train and test sets constitute a partition of the same original data set.
The results must be sent in XML format, following the DTD attached at the end of this page. Some illustrative examples are also provided below. The test set contains data from all the sources, but some of the sources may not be useful for some tasks. For this reason, three different test sets have been prepared for the shared tasks. You will be provided with the test set(s) required for the task(s) you are participating in.
Create a zip or tgz file containing all your XML files and add it as an attachment to your submission on EasyChair.
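As an illustration of the packaging step, here is a minimal Python sketch that collects the XML result files from one directory into a zip archive. The function name `pack_results`, the flat archive layout and the directory and archive names are assumptions made for this example; the organizers do not prescribe any of them.

```python
import zipfile
from pathlib import Path


def pack_results(xml_dir, archive_path):
    """Zip every .xml file found directly inside xml_dir and return the stored names."""
    stored = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for xml_file in sorted(Path(xml_dir).glob("*.xml")):
            # Store only the file name so the archive has a flat layout (assumption).
            archive.write(xml_file, arcname=xml_file.name)
            stored.append(xml_file.name)
    return stored
```

Calling `pack_results("my_results", "submission.zip")` would produce an archive that can then be attached to the submission; both names are hypothetical.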
PLEASE READ THE INFORMATION BELOW CAREFULLY. IF YOU HAVE ANY QUESTIONS, DO NOT HESITATE TO CONTACT US (caw2 at barcelonamedia dot org) OR MAKE USE OF THE FORUM SECTION OF THIS SITE.
Test Set 1: Text Normalization / Opinion Mining & Sentiment Analysis
There is a common test set for the text normalization and the opinion mining & sentiment analysis shared tasks. It is composed of a set of 10,000 posts/comments strategically extracted from the five considered sources. No thread information is provided within this data set. The format of each entry in this test set is as follows:
<post id="0001">
<body>how r u gys...? i realy enjoyed tha moB yest alot:)</body>
</post>
The text normalization task must generate at least one tokenized transcription of the post/comment; you may also provide alternative transcriptions and include your comments and/or extra information. An example of the output for the text normalization shared task would be:
<post id="0001">
<body>how r u gys...? i realy enjoyed tha moB yest alot:)</body>
<results><TextN>
<transcription>How are you guys ? I really enjoyed the movie yesterday a lot :)</transcription>
<alternative>How are you guys ? I really enjoyed that movie yesterday a lot</alternative>
<alternative>How are you guys ? I really enjoyed a lot yesterday 's movie</alternative>
<other>
<annotation type="chatspeak, lowercase" > <annotation function="interrogative">How are you guys ? </annotation> <annotation function="declarative"> I really enjoyed the movie yesterday a lot <annotation emocticon="happy"> :) </annotation > </annotation> </annotation>
</other>
</TextN></results></post>
The "other" content is not mandatory. Its purpose is to hold extra information that may be generated during the analysis and that can be used by other tasks to classify the text. Although we propose this format, you may use a different one; in that case, please attach the corresponding DTD.
For the opinion mining and sentiment analysis shared task, floating-point values in the range between 0 (zero) and 1 (one) must be provided for each category within one or both tracks. The values are intended to represent the degree (or probability) of membership of the given post/comment in each category, with no membership represented by 0 (zero) and full membership by 1 (one). An example for the opinion mining and sentiment analysis shared task follows (you can participate in one or both tracks):
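The mandatory part of the text normalization output shown in the example above could be generated along these lines. This is only a sketch using Python's standard `xml.etree.ElementTree`; the helper name `build_textn_post` is hypothetical, the optional `other` element is left out, and the attached DTD remains the authoritative description of the format.

```python
import xml.etree.ElementTree as ET


def build_textn_post(post_id, body, transcription, alternatives=()):
    """Build a <post> element carrying a TextN result (no <other> element)."""
    post = ET.Element("post", id=post_id)
    ET.SubElement(post, "body").text = body
    results = ET.SubElement(post, "results")
    textn = ET.SubElement(results, "TextN")
    # One tokenized transcription is required by the task description.
    ET.SubElement(textn, "transcription").text = transcription
    # Alternative transcriptions are optional.
    for alternative in alternatives:
        ET.SubElement(textn, "alternative").text = alternative
    return post


post = build_textn_post(
    "0001",
    "how r u gys...? i realy enjoyed tha moB yest alot:)",
    "How are you guys ? I really enjoyed the movie yesterday a lot :)",
    alternatives=[
        "How are you guys ? I really enjoyed that movie yesterday a lot",
        "How are you guys ? I really enjoyed a lot yesterday 's movie",
    ],
)
print(ET.tostring(post, encoding="unicode"))
```

Building the tree with a library rather than by string concatenation means that special characters in the post body (such as `&` or `<`) are escaped automatically.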
<post id="0001">
<body>how r u gys...? i realy enjoyed tha moB yest alot:)</body>
<results><sentop>
<sentiment><neutral>0.307</neutral><happy>0.93</happy><angry>0.2</angry><sad>0.15</sad></sentiment>
<opinion><factual>0.42</factual><positive>0.97</positive><negative>0.05</negative></opinion>
</sentop></results></post>
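A possible way to produce such an entry while checking the value range is sketched below in Python. The category names are taken from the example above and may not be the complete list defined in the DTD. The helper name `build_sentop_post` is hypothetical, and leaving out the element of a track you do not participate in is an assumption to be checked against the DTD.

```python
import xml.etree.ElementTree as ET

# Category names copied from the example above (assumed, not taken from the DTD).
SENTIMENT = ("neutral", "happy", "angry", "sad")
OPINION = ("factual", "positive", "negative")


def build_sentop_post(post_id, body, sentiment=None, opinion=None):
    """Build a <post> element with sentop scores for one or both tracks."""
    post = ET.Element("post", id=post_id)
    ET.SubElement(post, "body").text = body
    sentop = ET.SubElement(ET.SubElement(post, "results"), "sentop")
    tracks = (("sentiment", SENTIMENT, sentiment), ("opinion", OPINION, opinion))
    for track, categories, scores in tracks:
        if scores is None:
            continue  # track not attempted: element omitted (assumption)
        node = ET.SubElement(sentop, track)
        for category in categories:
            value = float(scores[category])
            # Every score must lie between 0 and 1, as the task requires.
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{category}={value} is outside [0, 1]")
            ET.SubElement(node, category).text = str(value)
    return post
```

For instance, `build_sentop_post("0001", "some text", opinion={"factual": 0.42, "positive": 0.97, "negative": 0.05})` would yield a post with only the opinion track filled in.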
Test Set 2: Misbehavior Detection (Harassment)
There is a family of five test sets (one for every considered source) for the harassment track of the misbehavior detection shared task. Floating-point values in the range between 0.1 and 1 must be provided for every post with harassment. The values are intended to represent the degree (or probability) to which the given post contains harassment. If you have only used binary labels, do not worry: just use ones and zeros. Send us all posts for which you found a probability of more than 0.1 of containing harassment. If the harassment spans multiple posts, include every one of them and explicitly mark the first and the last one in the comment tag.
Please return an XML file according to the attached specification.
An example would be:
<thread id="MS_134_145453"> <!-- put only those threads in which you found an example for harassment -->
<posts><!-- list of posts with harassment -->
<post id="MS_134_4546464"><!-- put only the posts with harassment -->
<body>The text of the post</body>
<harassment>1</harassment> <!-- a float number in the range [0.1,1] -->
<comment> We have labeled this as ... </comment> <!-- any text, additional labels, type of harassment, etc that can be useful to explain your decision -->
</post>
<post> <!-- another post with harassment--></post>
<post> <!-- another post with harassment--></post>
</posts></thread>
<thread><!-- another thread with harassment--></thread>
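The selection rule described above (report only posts scored above 0.1, grouped by thread, and skip threads without any such post) can be sketched in Python as follows. The input layout, a sequence of `(thread_id, post_id, body, probability)` tuples, and the helper name `build_harassment_threads` are assumptions made for this example; the optional `comment` element is omitted for brevity.

```python
import xml.etree.ElementTree as ET

THRESHOLD = 0.1  # posts must score above this value to be reported


def build_harassment_threads(scored_posts, threshold=THRESHOLD):
    """Group posts scored above the threshold into <thread> elements.

    scored_posts: iterable of (thread_id, post_id, body, probability) tuples.
    """
    threads = {}
    for thread_id, post_id, body, probability in scored_posts:
        if probability <= threshold:
            continue  # not reported
        if thread_id not in threads:
            # Create the thread lazily so threads without harassment never appear.
            thread = ET.Element("thread", id=thread_id)
            ET.SubElement(thread, "posts")
            threads[thread_id] = thread
        posts = threads[thread_id].find("posts")
        post = ET.SubElement(posts, "post", id=post_id)
        ET.SubElement(post, "body").text = body
        ET.SubElement(post, "harassment").text = str(probability)
    return list(threads.values())
```

With binary labels, the probabilities are simply `1` and `0`, so only the posts labeled `1` end up in the output.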
Test Set 3: Misbehavior Detection (Off-Topic)
In this task the test set will contain only threads from Slashdot. Floating-point values in the range between 0 and 1 must be provided for every off-topic post. The values are intended to represent the degree (or probability) to which the given post is off-topic. If you have only used binary labels, do not worry: just use ones and zeros. Send us all posts for which you found a probability of more than 0.1 of being off-topic, so the values you report will effectively lie between 0.1 and 1.
Please return an XML file according to the attached specification.
An example would be:
<thread id="SD_193_122247"> <!-- put only those threads in which you found an example for an offtopic post -->
<posts><!-- list of offtopic posts -->
<post id="SD_193_122247"><!-- put only the offtopic posts -->
<body>The text of the post</body>
<offtopic>1</offtopic> <!-- a float number in the range [0.1,1] -->
<comment> We have labeled this as ... </comment> <!-- any text, additional labels, etc that can be useful to explain your decision -->
</post>
<post> <!-- another offtopic post --></post>
<post> <!-- another offtopic post --></post>
</posts></thread>
<thread><!-- another thread with offtopic posts --></thread>
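Before submitting, it may be worth checking that every reported value respects the range stated in the example. The Python sketch below is not an official validator: the function name `find_invalid_offtopic` is hypothetical, the inclusive range [0.1, 1] is taken from the comment in the example, and the text is wrapped in a temporary root element because the example shows several sibling `thread` elements without a common parent (the attached DTD may define one).

```python
import xml.etree.ElementTree as ET


def find_invalid_offtopic(xml_text, low=0.1, high=1.0):
    """Return ids of posts whose <offtopic> value is missing, non-numeric or out of range."""
    # Wrap the fragment in a temporary root so that sibling <thread> elements parse.
    # This fails if xml_text starts with an XML declaration; strip it first in that case.
    root = ET.fromstring("<root>" + xml_text + "</root>")
    invalid = []
    for post in root.iter("post"):
        raw = post.findtext("offtopic")
        try:
            ok = low <= float(raw) <= high
        except (TypeError, ValueError):
            ok = False  # element missing (None) or not a number
        if not ok:
            invalid.append(post.get("id"))
    return invalid
```

An empty list means that all reported posts carry a usable value; any returned id points to a post that needs fixing.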
| Attachment | Size |
|---|---|
| results_dtd.txt | 2.75 KB |