Seminars

Current Reading Groups

Temporal and Spatial Information Extraction reading group

Machine Learning in NLP Reading Group

2009 - 2010

17th December, 2009 Adam Funk (University of Sheffield) - TBA

10th December, 2009 Roberto Navigli (Universita di Roma "La Sapienza") - TBA

26th November, 2009 Serge Sharoff (University of Leeds) - Classifying the Web into Domains and Genres

The jungle metaphor is quite common in corpus studies. The subtitle of David Lee's seminal paper on genre classification is 'navigating a path through the BNC jungle'. According to Adam Kilgarriff, the BNC is a jungle only when compared to smaller Brown-type corpora, while it looks more like an English garden when compared to the Web. At the moment we know little about the domains and genres of webpages. In the seminar I'm going to talk about approaches to understanding the composition of the Web as a corpus.

19th November, 2009 Luke Zettlemoyer (University of Edinburgh) - Learning to Follow Orders: Reinforcement Learning for Mapping Instructions to Actions

In this talk, I will address the problem of relating linguistic analysis and control --- specifically, mapping natural language instructions to executable actions. I will present a reinforcement learning algorithm for inducing these mappings by interacting with virtual computer environments and observing the outcome of the executed actions. This technique has enabled automation of tasks that until now have required human participation --- for example, automatically configuring software by consulting how-to guides. Our results demonstrate that this method can rival supervised learning techniques while requiring few or no annotated training examples.

29th October, 2009 Allan Ramsay (University of Manchester) - Using English to Express Commonsense Rules

The talk will discuss some issues arising from an attempt to provide natural language access to a body of simple information about diet and its effect on various common medical conditions. Expressing this knowledge in natural language has a number of advantages. It also raises a number of difficult issues. I will outline the reasons why it seemed like a good idea and the reasons why it is difficult, and sketch our solution to these problems.

15th October, 2009 Diana Maynard (University of Sheffield) - Using Lexico-Syntactic Patterns for Ontology Enrichment: the case of ODd SOFAS

This talk describes the use of information extraction techniques involving lexico-syntactic patterns to generate ontological information from unstructured text and augment an existing ontology with new entities. We refine the patterns using a term extraction tool and some semantic restrictions derived from WordNet and VerbNet, in order to prevent the overgeneration that typically occurs with general patterns. We present two applications developed in GATE and available as plugins for the NeOn Toolkit: one for general use on all kinds of text, and one for specific use in the fisheries domain. Both make use of a new plugin for GATE which generates ontologies on the fly. Furthermore, we integrate support for ontology lifecycle development via a change log mechanism that enables logging of ontology versions and application of changes from one version to another.

1st October, 2009 Trevor Cohn (University of Sheffield) - Bayesian Non-Parametric Models for Parsing and Translation Slides

Many natural language processing tasks require inference over partially observed input data. Traditionally these models are trained using the expectation maximisation (EM) algorithm. However, for many models EM finds poor or degenerate solutions. Bayesian methods provide an elegant and theoretically principled way to address these problems, by including a prior over the model and integrating over uncertain events. In this talk I'll describe how we developed non-parametric Bayesian models for two related tasks: 1) learning a tree substitution grammar (DOP) for syntactic parsing and 2) learning a grammar-based machine translation model. The models learn compact and simple grammars, uncovering latent linguistic structures and in doing so outperform competitive baselines.

2008 - 2009

14th May, 2009 Sivaji Bandyopadhyay (Jadavpur University, India) - Emotion Analysis in Blog texts

Emotion analysis on blog texts is being carried out for a less privileged language like Bengali. A set of six attitude types, namely happy, sad, anger, fear, disgust and surprise, has been selected for this emotion detection task for reliable and semi-automatic annotation of the blog texts. An automatic classifier has been applied to recognize these six basic attitude types for the individual words of a sentence. Different scoring strategies have been applied to identify the sentence-level emotion type from the acquired word-level emotion information. Unsupervised techniques have been applied to the classified test output to improve accuracy. The same method has been applied to the English SemEval 2007 Affect Sensing corpus, where it gave satisfactory performance.

7th May, 2009 Leon Derczynski (University of Sheffield) - Sequencing of Events and Their Durations Based on Event Descriptions Slides

Temporal Information Extraction is the elicitation of accurate data on events in a discourse. This specifies both tense and aspect of actions, both explicitly given by text and implicit from world knowledge. Events can occur at any point along a timeline, and are often only loosely specified in terms of upper and/or lower bounds relative to other events. Being able to identify and annotate times in discourse enables us to build a richer representation of the knowledge present in text. Given a document - for example, a news article - only a subset of facts within that document ever hold true at any one time. For example, we cannot concurrently assert "The silver and black Scott bike was chained to railings" and "An hour later it was gone". Extracting and temporally linking information is the only way to know which sets of facts hold true at the same time. A brief summary of literature and models surrounding tense and temporal location will be presented, followed by a review of recent work in the field. We will look at the normalisation of temporal data (anchoring vague expressions to a fixed interval on an absolute time scale), how events in text relate to each other and ways of reasoning about them, and different representations of temporal data - logical, textual and visual.

30th April, 2009 Marta Sabou (Open University) - Exploiting Semantic Web Ontologies: An Experimental Report Slides

As a side effect of the Semantic Web research activities, a large collection of ontologies is now available online constituting one of the largest and most heterogeneous knowledge sources in the history of AI. In this talk we report on the characteristics of this novel source and on its successful use for relation discovery. Our experiments show that, in the context of an ontology matching task, relations between the concepts of two ontologies can be discovered with a precision of 70% when using online ontologies. We conclude by exploring the potential of this novel knowledge resource for language technology applications.

16th April, 2009 Kumutha Swampillai (University of Sheffield) - Inter-Sentential and Intra-Sentential Relations in IE Corpora

Some information extraction systems are limited to extracting binary relations from single sentences. This constraint means that relations occurring across sentence boundaries cannot possibly be extracted by such systems. We examine the distribution of inter-sentential and intra-sentential relations in the MUC6 and ACE03 corpora. It was found that inter-sentential relations constitute 31.4% and 9.4% of the total number of relations in MUC6 and ACE03 respectively. These results show a 69.6% and a 90.6% recall upper bound of single sentence approaches to relation extraction. As such, any comprehensive approach to relation extraction will have to treat linguistic units larger than a sentence.
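The recall upper bound for a single-sentence system is simply the share of relations that do not cross a sentence boundary. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the percentages are the ones quoted in the abstract. With the stated 31.4% inter-sentential share, the complement for MUC6 comes out at 68.6%, one point below the quoted 69.6%, so one of those two figures may contain a typo. The ACE03 pair (9.4% and 90.6%) is consistent.

```python
def recall_upper_bound(inter_sentential_pct: float) -> float:
    """Best possible recall (%) for a system that only extracts
    relations whose arguments occur within a single sentence."""
    return round(100.0 - inter_sentential_pct, 1)


# Inter-sentential shares as quoted in the abstract.
muc6_bound = recall_upper_bound(31.4)
ace03_bound = recall_upper_bound(9.4)

print(f"MUC6:  {muc6_bound}%")
print(f"ACE03: {ace03_bound}%")
```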

2nd April, 2009 Danica Damljanovic (University of Sheffield) - Natural Language Interfaces to Conceptual Models: Usability and Performance Slides

Accessing structured data in the form of ontologies currently requires the use of formal query languages (e.g., SeRQL or SPARQL) which pose significant difficulties for non-expert users. One way to lower the learning overhead and make ontology queries more straightforward is through a Natural Language Interface (NLI). While there are existing NLIs to structured data with reasonable performance, they tend to require expensive customisation to each new domain or ontology. Additionally, they often require specific adherence to a pre-defined syntax which, in turn, means that users still have to undergo training. Many methods are under development to reduce this training, and increase the usability of NLIs. We have developed Question-based Interface to Ontologies (QuestIO) which translates Natural Language text-based queries to SeRQL/SPARQL queries, which are then executed against the given ontology/knowledge base and the results are shown to the user. Customisation of this system is performed automatically from the ontology vocabulary. QuestIO is quite flexible in terms of complexity and syntax of the supported queries, as both keyword-based searches and full-blown questions are supported. However, in the user-centric evaluation of this system we noticed that performance was degraded because users did not receive sufficient help from their interaction with the system. In this talk, we propose a combination of three methods for assisting the user while interacting with the system (feedback, creating a personalised vocabulary, and query refinement) and show how these can be used together to improve the usability of NLIs to conceptual models.

19th March, 2009 Peter Wallis (University of Sheffield) - Social Engagement with Robots and Agents (SERA) Slides

Getting people to engage with robotic and virtual artifacts is easy, but keeping them engaged over time is hard: robots and agents lack some fundamental capabilities which can be summarized as sociability. The research community has realized the problem, but approaches, so far, have been dispersed and disjoint. If robots and agents are to become companions in people's lives, they will have to blend into these lives seamlessly. SERA is innovative in that it addresses sociability holistically, by advancing knowledge about what sociability in robots and agents entails, by developing methodology to analyze and evaluate it, and by making available research resources and platforms. SERA will, to this purpose, undertake real-life extended field studies of users' engagement with robotic devices. Sociability has also to be built into robot and agent architectures from scratch and the goal here is to implement an architecture that caters for both background (cultural, normative etc.) and situational individual (theory of mind, adaptivity, responsiveness) practices and needs of users, with the guiding principle of pervasive affectivity. Assistive robots and agents that are to become true companions have to be versatile in functionality and identity (style, personality) depending on the service they are required to deliver, such as (reactive) social mediators, as (in turn reactive and proactive) information assistants, or as (proactive) coaches or monitors e.g. with health-related tasks. SERA will develop pilots of such intertwined interactive service applications for a robotic device.

12th March, 2009 Chris Huyck (Middlesex University) - A Psycholinguistic Model of Natural Language Parsing Implemented in Simulated Neurons Slides

One of the central activities in natural language processing is parsing. There is a wide range of engineering solutions to parsing but none performs at human levels. The understanding of how humans process language is far from complete, but there is little doubt that humans use their neurons for all mental activities including parsing. There are several psychological models of parsing, but this talk will describe the first neuro-psychological model of parsing. That is, the parser is implemented entirely in simulated neurons. It makes use of Hebb's Cell Assembly hypothesis to form the basis of memories including words, clauses and sentences. Neural parsers require variable binding, and this parser binds via short-term potentiation. The parser produces correct semantic output. As neural cycles have an associated time, time can be measured, and the parser parses in times similar to humans. Prepositional phrase attachment ambiguities are resolved based on the semantics of the sentence. Finally, the parser is embedded in a functioning agent.

5th March, 2009 Monica Schraefel (University of Southampton) - The Path to Joyful Interaction or Why doesn't your computer make you happy?

The common computing interaction paradigm is task oriented and task silo'd. We go to a specific application that supports a specific task and do that specific thing. There is some boundary crossing within applications - calendars and address books share data; email is forced into being as flexible as a paper notebook, spreadsheets can be linked into word processing documents. Yet perhaps not too many would say they feel particularly empowered by their computers; that their quality of life is enhanced by interacting with these machines. There are several ways at least in which we might consider why this lack of joy and delight is the more usual experience of computers in our world. One may be this sense of having to do too many things FOR the computer in order for it to do things for us. Another may be that even when it has the information, it does not DO what we want with it. It is functionally obtuse. Another may be that the cost of trying to explain what to do is simply too high for the benefit that might accrue. In the past year or so, a few of us have been looking at some of these problems that appear to be quite light weight issues, and yet have been substantial road blocks towards delightful computing. We have been prototyping some approaches to explore new interactions and new types of services that might be both practically effective in freeing us from serving the computer to get on with our own missions, and may, in so doing, serve to enhance our quality of life along the way. In this talk, I'll go over some of these projects, the motivation behind them and how far we've gotten on the path to joyful computing and the perfect digital assistant.

26th February, 2009 Mark Stevenson (University of Sheffield) - Disambiguation of Biomedical Text Slides

Like text in other domains, biomedical documents contain a range of terms with more than one possible meaning. These ambiguities form a significant obstacle to the automatic processing of these texts. Previous approaches to resolving this problem have made use of a variety of knowledge sources including the context in which the ambiguous term is used and domain-specific resources (such as UMLS). We compare a range of knowledge sources which have been previously used and introduce a novel one: MeSH terms. The best performance is obtained using linguistic features in combination with MeSH terms. Performance exceeds previously reported results on a standard test set. Our approach is supervised and therefore relies on annotated training examples. A novel approach to automatically acquiring additional training data, based on the relevance feedback technique from Information Retrieval, is presented. Applying this method to generate additional training examples is shown to lead to a further increase in performance.

19th February, 2009 Mark A. Greenwood (University of Sheffield) - IR4QA: An Unhappy Marriage Slides

Over a decade of recent question answering (QA) research has relied on using off-the-shelf information retrieval (IR) engines in order to find relevant documents from which exact answers can be extracted. In this talk I will explain why most QA systems follow this approach and summarise the recent research into what has become known as IR4QA. It is becoming increasingly clear, however, that the use of IR within QA systems is nothing more than a marriage of convenience: in general, QA researchers don't want to develop IR engines and IR researchers are not interested in the QA task. I believe that this marriage is doomed and will never lead to the production of high performance QA systems. The second half of the talk will highlight the main problems inherent in modern QA systems which use IR engines and suggest some possible avenues that QA research may take in the future.

12th February, 2009 Ehud Reiter (University of Aberdeen) - BabyTalk: Generating English Summaries of Clinical Data Slides

I will give an overview of the BabyTalk project, whose goal is to generate English summaries of complex clinical data from a neonatal intensive care unit, for doctors, nurses, parents, and other family members. BabyTalk is based on the hypothesis that a textual summary of the most important information in a data set can in some cases be more useful than a visualisation which presents all of the data, or an expert system which explicitly gives advice based on the data. I will primarily focus on NLP challenges in BabyTalk, such as generating good narratives and effectively communicating temporal information. I will also present the results of our first evaluation, which were mixed but overall quite encouraging.

5th February, 2009 Julien Bourdon (Kyoto University) - Language Grid: An Infrastructure for Intercultural Collaboration Slides

The Language Grid is an on-line multilingual service platform which enables easy registration and sharing of language services such as on-line dictionaries, bilingual corpora, and machine translations. Unlike existing machine translation systems, the Language Grid allows users to register and combine user-created dictionaries and bilingual corpora with existing machine translations to realize user-oriented translation programs with greater accuracy. The main goals of this project are to combine the existing standard language services provided by linguistic professionals and to assist users to create new language services for their own purpose by permitting them to add their own language resources to the ones made by professionals. Currently, services such as translators, dictionaries, parallel texts, morphological analysers, concept dictionaries, available in 10 languages are deployed on the Language Grid. The Language Grid is used for applications such as multilingual collaboration in NPOs, intercultural coexistence in Japanese schools or hospitals.

29th January, 2009 Miles Osborne (University of Edinburgh) - POSTPONED

4th December, 2008 Diana McCarthy (University of Sussex) - Evaluating Lexical Inventories and Disambiguation Systems with Lexical Substitution Slides

There has been a surge of interest within Computational Linguistics over the last decade into methods for word sense disambiguation (WSD). A major catalyst has been the series of SENSEVAL evaluation exercises which have provided standard datasets for the field. Whilst researchers believe that WSD will ultimately prove useful for applications which need some degree of semantic interpretation, the jury is still out on this point. One significant problem is that there is no clear choice of inventory for any given task, other than the use of a parallel corpus for a specific language pair for a machine translation application. Many of the evaluation datasets produced, certainly in English, have used WordNet. Whilst WordNet is a useful resource, it would be beneficial if systems using other inventories could enter the WSD arena without the need for mappings between the inventories which may mask results. This is particularly important since there is no consensus that WordNet sense distinctions are the right ones to make for any given application. As well as the work in disambiguation, there is a growing interest in automatic acquisition of inventories of word meaning. It would be useful to investigate the merits of predefined inventories themselves, aside from their use for disambiguation, and compare these with inventories which have been acquired automatically. In this talk I will discuss these issues and some results in the context of the English Lexical Substitution Task, organised by myself and Roberto Navigli (University of Rome, "La Sapienza") last year under the auspices of SEMEVAL.

27th November, 2008 David Guthrie (University of Sheffield) - Unsupervised Detection of Anomalous Text Slides, PhD Thesis

Situations abound that rely on the ability of computers to detect differences from what is normal or expected. Credit card companies identify possible fraud by detecting spending patterns that differ from what is 'normal' for a given cardholder and network analysts detect possible attacks by spotting network traffic that is out of the ordinary. The focus for this talk is the development of unsupervised technologies to similarly detect anomalies in text. We use the term "anomalous" to refer to text that is irregular, or unusual, with respect to the writing style in the majority of a text. In this talk we show that identifying such abnormalities in text can be viewed as a type of outlier detection because these anomalies will deviate significantly from their surrounding context. We consider segments of text which are anomalous with respect to topic (i.e. about a different subject), author (written by a different person), or genre (written for a different audience or from a different source) and experiment with whether it is possible to identify these anomalous segments automatically. Several different innovative approaches to this problem are introduced and we present results over large document collections, created to contain randomly inserted anomalous segments.

18th November, 2008 Seemab Latif (University of Manchester) - Novel Automatic Technique for Linguistic Quality Assessment of Students' Essays Using Automatic Summarizers Slides

In this seminar, I will talk about experiments addressing the calculation of inter-annotator inconsistency in content selection for both manual and automatic summarization of sample TOEFL essays. A new finding is that the linguistic quality of the source essay has a very strong positive correlation with the degree of disagreement among human assessors as to what should be included in a summary. This leads to a fully automated essay evaluation technique based on the degree of disagreement among automated summarizers. ROUGE evaluation is used to measure the degree of inconsistency among the participants (human summarizers and automatic summarizers). This automated essay evaluation technique is potentially an important contribution with wider significance.
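As a rough illustration of how disagreement among summarizers could be quantified, the sketch below computes a simplified ROUGE-1 recall (clipped unigram overlap divided by reference length) for every ordered pair of summaries and reports one minus the mean as a disagreement score. This is an assumption made for illustration, not the speaker's actual procedure: the abstract does not say which ROUGE variant or aggregation was used, and the function names and the sample summaries are hypothetical.

```python
from collections import Counter
from itertools import permutations


def rouge1_recall(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 recall: clipped unigram overlap / reference length."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    overlap = sum(min(cand[word], count) for word, count in ref.items())
    return overlap / sum(ref.values())


def disagreement(summaries: list[str]) -> float:
    """One minus the mean pairwise ROUGE-1 recall over all ordered pairs."""
    pairs = list(permutations(summaries, 2))
    mean_overlap = sum(rouge1_recall(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_overlap


# Hypothetical summaries of the same essay from three summarizers.
summaries = [
    "the essay argues that school uniforms reduce bullying",
    "the essay argues uniforms reduce bullying in school",
    "students dislike uniforms according to the essay",
]
score = disagreement(summaries)
print(f"disagreement: {score:.2f}")
```

A score near 0 means the summarizers selected almost the same words; a score near 1 means they selected almost entirely different content.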

6 November, 2008 Niraj Aswani (University of Sheffield) - Tools for Alignment Tasks Slides

For some tasks, such as text alignment and cross-document co-reference resolution, one needs to refer to more than one document at the same time. Hence, a need arises for Processing Resources (PRs) which can accept more than one document as parameters. For example, given two documents, a source and a target, a Sentence Alignment PR would need to refer to both of them to identify which sentence of the source document aligns with which sentence of the target document. Similarly, for cross-document co-reference resolution, the respective PR would need to access both documents simultaneously. The standard behaviour of GATE PRs conflicts with these requirements: GATE PRs process one document at a time. Even a corpus pipeline, which accepts a corpus as input, considers only one document at a time. It is not impossible to make PRs accept more than one document, but this would require a lot of re-engineering. Recently, we have introduced a few new resources in GATE (e.g. CompoundDocument, CompositeDocument, AlignmentEditor etc.) to address these issues. In this short presentation, I will describe these components and show how to use them.

28 October, 2008 Rob Gaizauskas (University of Sheffield) - Generating Image Captions using Topic Focused Multi-document Summarization Slides

In the near future digital cameras will come standardly equipped with GPS and compass and will automatically add global position and direction information to the metadata of every picture taken. Can we use this information, together with information from geographical information systems and the Web more generally, to caption images automatically? This challenge is being pursued in the TRIPOD project and in this talk I will address one of the subchallenges this topic raises: given a set of toponyms automatically generated from geo-data associated with an image, can we use these toponyms to retrieve documents from the Web and to generate an appropriate caption for the image?

We begin by assuming that the toponyms name the principal objects or scene contents in the image. Using web resources (e.g. Wikipedia) we attempt to determine the types of these things -- is this a picture of a church? a mountain? a city? We have constructed a taxonomy of such image content types using on-line image collections, and for each such type we have constructed several collections of texts describing that type. For example, we have a collection of captions describing churches and a collection of Wiki pages describing churches. The intuition here is that these collections are examples of, e.g., the sorts of things people say in captions of churches. These collections can then be used to derive models of objects or scene types which can be used to bias or focus multi-document summaries of new images of things of the same type.

In the talk I report results of work we have carried out to explore the hypothesis underlying this approach, namely that brief multidocument summaries generated as image captions by using models of object/scene types to bias or focus content selection will be superior to generic multidocument summaries generated for this purpose. I describe how we have constructed an image content taxonomy, how we have derived text collections for object/scene types, how we have derived object/scene type models from these collections and how these have been used in multi-document summarization. I also discuss the issue of how to evaluate the resulting captions and present preliminary results from one sort of evaluation.

21 October, 2008
Leon Derczynski (University of Sheffield) - A Data Driven Approach to Query Expansion in Question Answering Slides

Automated answering of natural language questions is an interesting and useful problem to solve. Question answering (QA) systems often perform information retrieval at an initial stage. Information retrieval (IR) performance, provided by engines such as Lucene, places a bound on overall system performance. For example, no answer-bearing documents are retrieved at low ranks for almost 40% of questions. In this paper, answer texts from previous QA evaluations held as part of the Text REtrieval Conferences (TREC) are paired with queries and analysed in an attempt to identify performance-enhancing words. These words are then used to evaluate the performance of a query expansion method. Data-driven extension words were found to help in over 70% of difficult questions. These words can be used to improve and evaluate query expansion methods. Simple blind relevance feedback (RF) was correctly predicted as unlikely to help overall performance, and a possible explanation is provided for its low value in IR for QA.

Mark A. Greenwood (University of Sheffield) - Evaluation of Automatically Reformulated Questions in Question Series Slides

Having gold standards allows us to evaluate new methods and approaches against a common benchmark. In this paper we describe a set of gold standard question reformulations and associated reformulation guidelines that we have created to support research into automatic interpretation of questions in TREC question series, where questions may refer anaphorically to the target of the series or to answers to previous questions. We also assess various string comparison metrics for their utility as evaluation measures of the proximity of an automated system's reformulations to the gold standard. Finally we show how we have used this approach to assess the question processing capability of our own QA system and to pinpoint areas for improvement.

14 October, 2008 Jordi Poveda (UPC Catalunya) - A Combination of Machine Learning Methods for the Recognition of Temporal Expressions Slides

Recognizing time expressions and representing the time information they convey in a suitable normalized form is a central part of Information Extraction (IE), for it paves the way for the extraction of events and temporal relations. The most common approach to time expression recognition in the past has been the use of handmade extraction rules (grammars), which also served as the basis for normalization. Our aim is to explore the possibilities afforded by applying machine learning techniques to the recognition of time expressions, in order to see where it stands in relation to grammar-based approaches. We focus on recognizing the appearances of time expressions in text (not normalization) and transform the problem into one of chunking, where the aim is to correctly assign IOB tags to tokens. We will explain the knowledge representation used and compare the results obtained in our experiments with two different supervised methods, one statistical (support vector machines) and one of rule induction (FOIL), where the superiority of SVMs is revealed. Next, we will present a semi-supervised approach, based on bootstrapping, to the extraction of time expression mentions in large unlabelled corpora. The only supervision is in the form of seed examples, hence it becomes necessary to resort to heuristics to rank and filter out spurious patterns and candidate time expressions. We will summarize our preliminary results with this bootstrapping architecture, which is currently in a testing and improvement stage. The ultimate benefit of developing an end-to-end machine-learning-based framework for information extraction is that it can be carried to new domains and tasks with little customization.
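To make the chunking formulation concrete, the sketch below converts annotated time-expression spans into per-token IOB tags, which is the kind of target a sequence classifier would be trained to predict. It is illustrative only: the `TIMEX` label, the function name and the example sentence are assumptions, not taken from the talk.

```python
def spans_to_iob(tokens, spans, label="TIMEX"):
    """Turn (start, end) token spans (end exclusive) into IOB tags.

    B-<label> marks the first token of a time expression,
    I-<label> marks any following token inside it, and O marks the rest.
    """
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


# Hypothetical example: two time expressions, "next Tuesday" and "noon".
tokens = "The meeting was moved to next Tuesday at noon .".split()
spans = [(5, 7), (8, 9)]
tags = spans_to_iob(tokens, spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Under this encoding, recognizing time expressions reduces to predicting one tag per token, so any token-level classifier (such as the SVMs mentioned above) can be applied.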