Melita is a semi-automatic annotation tool that uses an IE algorithm to support users in the process of annotation.

Melita's main features are:

Melita actively supports corpus annotation using Amilcare, an adaptive Information Extraction (IE) tool based on the (LP)² algorithm [Ciravegna2001]. The novelty of Melita is that the adaptive IE system can be tuned to provide the desired level of pro-activity and intrusiveness.

Melita provides non-intrusive, just-in-time and pro-active support for annotation:

Just-in-time because training is performed while the user annotates the text.

Pro-active because it does not wait for the user before learning and calculating statistics. It takes the initiative to do any pre-processing that will be needed later.

Non-intrusive because the user can fully customize the level of support the interface provides (pervasive, very active, active, lazy or very lazy).

While users annotate texts, Amilcare runs in the background, learning how to reproduce the inserted annotations. As soon as a user annotates a document, the document is sent to the learning algorithm, which resides on a server (either local or remote). The learning algorithm always runs as an independent background process, so that no resources are taken from the user.
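
The hand-off described above can be illustrated with a minimal Python sketch. It is not Melita's or Amilcare's actual code: the class BackgroundLearner, its methods and the placeholder training step are all hypothetical names chosen for illustration. It only shows the general pattern of queuing each annotated document for a worker that trains in the background while the caller returns immediately.

```python
import queue
import threading


class BackgroundLearner:
    """Hypothetical stand-in for an adaptive IE learner running in the background."""

    def __init__(self):
        self._inbox = queue.Queue()
        self.trained_on = []  # documents the learner has processed so far
        # A daemon thread keeps training off the caller's (user-facing) thread.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, document):
        """Queue an annotated document; returns at once so the user is not blocked."""
        self._inbox.put(document)

    def _run(self):
        while True:
            document = self._inbox.get()
            self._train(document)
            self._inbox.task_done()

    def _train(self, document):
        # Placeholder for rule induction (assumption): we only record the document.
        self.trained_on.append(document)

    def wait_until_idle(self):
        """Block until every queued document has been processed."""
        self._inbox.join()
```

In a real deployment the worker would sit on a local or remote server, as the text says; here a thread in the same process stands in for it.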

Induced rules are silently applied to new texts and their results are compared with the user's annotations. When the rules reach a (user-defined) level of accuracy, Melita presents new texts with a preliminary annotation derived by applying the rules. In this case users only have to correct mistakes and add missing annotations. User corrections are fed back to the learner for retraining.
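
As a rough illustration of this comparison step, the sketch below measures how well the rule-produced annotations match the user's and checks them against user-defined thresholds. The text only speaks of a "level of accuracy"; using precision and recall, representing an annotation as a (start, end, tag) tuple, and the function names are all assumptions made for this example, not Melita's documented behaviour.

```python
def precision_recall(predicted, gold):
    """Compare rule-produced annotations with the user's annotations.

    Both arguments are collections of (start, end, tag) tuples
    (an assumed representation).
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)  # annotations that match exactly
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def ready_to_suggest(predicted, gold, min_precision, min_recall):
    """True once the silent predictions meet the user-defined thresholds."""
    precision, recall = precision_recall(predicted, gold)
    return precision >= min_precision and recall >= min_recall


# Example: the rules find two of the three annotations the user made.
user_annotations = {(0, 5, "speaker"), (10, 14, "stime"), (20, 24, "etime")}
rule_annotations = {(0, 5, "speaker"), (10, 14, "stime")}
print(ready_to_suggest(rule_annotations, user_annotations, 0.9, 0.5))  # True
print(ready_to_suggest(rule_annotations, user_annotations, 0.9, 0.8))  # False
```

Raising or lowering the two thresholds is one plausible way to make the suggestions more or less intrusive.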

This technique focuses the slow and expensive user activity on uncovered cases, and avoids asking the user to annotate cases where satisfactory effectiveness has already been reached. Moreover, validating extracted information is a much simpler (and less error-prone) task than tagging bare texts, which speeds up the process considerably. If the IE-based annotation becomes very reliable, the user can decide to let the IE system proceed automatically with further annotation.

The project

Vitaveska Lanfranchi