README.testset

The test set consists of 634 data points, each of which represents
a molecule that is either active (A) or inactive (I).  The test set
has the same format as the training set, with the exception that the
activity value (A or I) for each data point is missing, that is, has
been replaced by a question mark (?).  Please submit one prediction,
A or I, for each data point.  Your submission should be in the form
of a file that starts with your contact information, followed by a
line with 5 asterisks, followed immediately by your predictions, with
one line per data point.  The predictions should be in the same order
as the test set data points.  So your prediction for the first example
should appear on the first line after the asterisks, your prediction
for the second example should appear on the second line after the
asterisks, etc.  Hence, after your contact information, the prediction
file will consist of 635 lines (the line of asterisks plus 634
predictions) and have the form:

*****
I
I
A
I
A
I

etc.
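
The layout described above can be assembled and checked with a short
script.  What follows is only a minimal sketch in Python, not an
official tool: the function name build_submission, the constant names,
and the sample contact lines are hypothetical, and the checks simply
restate the rules given in this file (634 predictions, each A or I,
preceded by a line of 5 asterisks).

```python
EXPECTED_PREDICTIONS = 634  # number of test set data points stated in this README
SEPARATOR = "*****"         # the line with 5 asterisks


def build_submission(contact_lines, predictions):
    """Return the submission text: contact info, separator, one prediction per line.

    Hypothetical helper; it only enforces the rules stated in this README.
    """
    if len(predictions) != EXPECTED_PREDICTIONS:
        raise ValueError(
            f"expected {EXPECTED_PREDICTIONS} predictions, got {len(predictions)}"
        )
    for i, p in enumerate(predictions, start=1):
        if p not in ("A", "I"):
            raise ValueError(f"prediction {i} is {p!r}, expected 'A' or 'I'")
    # A contact line that looks like the separator would make the file ambiguous.
    if any(line.strip() == SEPARATOR for line in contact_lines):
        raise ValueError("contact information must not contain a line of 5 asterisks")
    lines = list(contact_lines) + [SEPARATOR] + list(predictions)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder predictions in test set order; replace with real model output.
    preds = ["I", "I", "A", "I", "A", "I"] + ["I"] * 628
    text = build_submission(["Jane Doe", "jane@example.org"], preds)
    # Lines from the separator onward: 1 separator + 634 predictions.
    print(len(text.splitlines()) - 2)
```

With two contact lines, as in the usage above, the count printed is the
number of lines from the asterisks onward.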

You may submit your prediction by email to page@biostat.wisc.edu
or by anonymous ftp to ftp.biostat.wisc.edu, placing the file
into the directory dropboxes/page/.  If using email, please use
the subject line "KDDcup <name> thrombin" where <name> is your
name.  If using ftp, please name the file KDDcup.<name>.thrombin
where <name> is your name.  For example, my submission would be
named KDDcup.DavidPage.thrombin
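
The naming rules above can be written down as a small helper.  This is
a sketch under assumptions: the function name submission_names is
hypothetical, and removing spaces from the name in the ftp file name is
inferred from the DavidPage example only.  The email subject keeps the
name as typed, since this file does not say otherwise.

```python
def submission_names(name, task="thrombin"):
    """Return (email_subject, ftp_filename) for a submitter name.

    Assumption: whitespace is removed for the file name, as in the
    README's example "KDDcup.DavidPage.thrombin".
    """
    compact = "".join(name.split())  # "David Page" becomes "DavidPage"
    subject = f"KDDcup {name} {task}"
    filename = f"KDDcup.{compact}.{task}"
    return subject, filename


if __name__ == "__main__":
    subject, filename = submission_names("David Page")
    print(subject)
    print(filename)
```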

Only one submission per person per task is permitted.  If you do not
receive email confirmation of your submission within 24 hours, please
email page@biostat.wisc.edu with subject "KDDcup no confirmation".

For group entries, the contact information should include the names
of everyone to be credited as a member of the group should your entry
achieve the highest score.  However, no person may be listed on more
than one entry per task.