Clinical E-Science Framework

University of Sheffield


Brief Description of the CLEF Project:

  • CLEF aims to develop a high-quality, secure and interoperable information repository, derived from operational electronic patient records, to enable ethical and user-friendly access to patient information in support of clinical care and biomedical research (Full Project Proposal, Executive Summary).

CLEF Project Partners:
Funding and Duration of the CLEF Project:
  • Funded by Medical Research Council (MRC)
  • 2003 – 2005 (CLEF), 2005 – 2007 (CLEF-Services)

Sheffield NLP's Role in the CLEF Project:
  • Well-founded clinical studies require access to extensive, fine-grained data about individual patients. The bulk of this information is held in textual form in clinical reports, e.g. discharge summaries, radiology and pathology reports. Using generic Information Extraction technology specialised for bioinformatics applications, the Natural Language Processing Group at the University of Sheffield will develop tools to automatically identify, extract and mark up key information in clinical reports. Specifically, we shall extract the diagnosis, stage, and treatment intent from the patient summaries.

CLEF Project Members at the University of Sheffield:


  • CLEF Gold Standard Corpus:
    To evaluate how well our information extraction system extracts clinically significant information from clinical texts, we created the CLEF gold standard corpus. It contains 167 clinical documents, chosen from the 565K CLEF corpus. More detailed information about the annotation guidelines, the corpus and the tools used for the annotation exercises can be found via the link above.


More information:

Last update: 25/09/08,  Yikun Guo