University of Sheffield
The CLEF Gold Standard:
The Annotation Guidelines
These documents provide a common understanding of what constitutes an annotation. In the CLEF project, annotation covers not only semantic units in clinical texts (entities such as diseases, drugs and body parts) but also the relationships between those entities.
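As a rough illustration of the idea that an annotation comprises both entities and the relationships between them, the following minimal sketch represents each as a small Python record. The class names, the type labels (`drug`, `condition`, `has_indication`), the helper `find_entity` and the example sentence are all hypothetical and chosen for illustration only; they are not taken from the CLEF guidelines or schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A typed span of text, identified by character offsets (hypothetical layout)."""
    start: int
    end: int
    type: str

    def text(self, doc: str) -> str:
        # Recover the surface string the span covers.
        return doc[self.start:self.end]


@dataclass(frozen=True)
class Relation:
    """A typed link between two entities (hypothetical layout)."""
    type: str
    arg1: Entity
    arg2: Entity


def find_entity(doc: str, surface: str, type_: str) -> Entity:
    # Illustrative helper: locate the first occurrence of `surface` in `doc`.
    start = doc.find(surface)
    if start < 0:
        raise ValueError(f"{surface!r} not found in document")
    return Entity(start, start + len(surface), type_)


# Invented example sentence, not drawn from the corpus.
doc = "Tamoxifen was prescribed for breast cancer."
drug = find_entity(doc, "Tamoxifen", "drug")
condition = find_entity(doc, "breast cancer", "condition")
relation = Relation("has_indication", drug, condition)

print(relation.type, "->", relation.arg1.text(doc), "/", relation.arg2.text(doc))
```

Storing offsets rather than copied strings keeps each entity tied to its exact position in the source text, so two mentions of the same word remain distinct annotations.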
The Gold Standard Corpus
We are currently seeking approval to release the full gold standard corpus. Please check back soon for the status of the full release. In the meantime, we are able to release small samples of text for specific, agreed purposes. Please enquire by email to Yikun Guo.
- Annalist: The scoring tool we developed for evaluating the quality of the semantic annotation process.
- The CLEF Information Extraction (IE) pipeline: Extracts entities (such as clinical problems, anatomy and drugs) and the relationships between them from text. Most components of the CLEF IE pipeline are available as part of the GATE NLP toolkit distribution. We plan to release the remaining components (e.g. Termino) and all configuration files for the pipeline in the near future. Please check back soon.
Last update: 25/09/08, Yikun Guo