CLI Speaker Series
-
[Fri Oct 31 2008 2:30-3:45] (ICC462)
Speaker: Dr. Cleo Condoravdi
Title: Computing Textual Inferences
Abstract
A measure of understanding a text is the ability to make inferences based on the information conveyed by it. Given a passage of text and a hypothesis, the task would be to automatically infer whether the hypothesis follows from the text, whether it is contradicted by it, or whether it is compatible with it. At PARC we have been working on a system for computing linguistically-based textual inferences such as the ones below.
Passage: Ed has been living in Athens for 3 years.
Mary visited Athens in the last 2 years.
Hypothesis: Mary visited Athens while Ed lived in Athens.
Answer: YES
Passage: The diplomat does not know that the president failed to destroy the evidence.
Hypothesis: The president managed to destroy the evidence.
Answer: NO
Passage: No one stayed throughout the concert.
Hypothesis: No one stayed throughout the first part of the concert.
Answer: UNKNOWN
Texts are parsed to produce packed functional-structures and these are rewritten and canonicalized, without unpacking, into abstract knowledge representations (AKR). An AKR is a flat set of facts that involves concepts, roles, temporal relations and contexts. In this talk I show how AKRs are derived from parsed text and discuss the system's algorithm for entailment and contradiction detection (ECD). ECD operates on the AKRs of the passage and of the hypothesis in order to detect a potential entailment or contradiction between them, without the need for disambiguation.
-
[Tue Sep 23 2008 11:40-12:50] (ICC450)
Speaker: Katrin Erk (University of Texas, Austin)
Topic: "Polysemy: Some corpus observations and computational models"
Determining the meaning of a polysemous word in a given context is a difficult task: It is difficult to do this automatically, and it is even difficult for human annotators to perform this task manually. It is possible that this is due to the underlying model of word meaning: Polysemy is usually framed as a list of dictionary senses, and the task as picking the one sense that is contextually appropriate. But there is evidence indicating that not all words may have clearly disjoint senses.
The first half of this talk describes two ongoing annotation tasks designed to determine the degree to which different words have disjoint senses. The second half of the talk discusses vector space models of word meaning, computational models that do not take recourse to dictionary senses. Vector space models have been used successfully in language technology, especially information retrieval, and in cognitive science, and they can be induced automatically from corpora. However, they are mostly used to describe word meaning in isolation, rather than meaning in a specific context, and existing vector space models for meaning in context do not take syntactic structure sufficiently into account. We present a novel "structured vector space model" that addresses these issues by incorporating the selectional preferences for words' argument positions. This makes it possible to integrate syntax into the computation of word meaning in context. In addition, the model performs at and above the state of the art for modeling the contextual adequacy of paraphrases.
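As a rough illustration of the vector space idea (not the speaker's model), the sketch below represents words as co-occurrence count vectors and compares them by cosine similarity. The contextualization step, a componentwise product of a word vector with a vector standing in for the selectional preference of an argument position, is an assumption made for illustration. The counts, dimension labels, and function names are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def contextualize(word_vec, preference_vec):
    """Reweight a word vector component by component (assumed combination rule)."""
    return [a * b for a, b in zip(word_vec, preference_vec)]

# Toy co-occurrence counts over hypothetical dimensions: [ball, glove, cold, virus]
catch = [4.0, 3.0, 2.0, 3.0]
grab = [5.0, 4.0, 0.0, 0.0]
contract = [0.0, 0.0, 3.0, 5.0]

# Hypothetical preference vector for the object position when the object is "cold"
object_pref_cold = [0.0, 0.0, 1.0, 1.0]

# Meaning of "catch" in the context "catch a cold"
catch_in_context = contextualize(catch, object_pref_cold)

# Out of context, "grab" is the closer paraphrase; in context, "contract" is.
print(cosine(catch, grab) > cosine(catch, contract))
print(cosine(catch_in_context, contract) > cosine(catch_in_context, grab))
```

With these invented counts, the ranking of the two paraphrase candidates flips once the context is taken into account, which is the kind of contextual adequacy judgment the abstract refers to.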
Previous Talks
-
[Mon Oct 22 2007 4:00-5:30] (ICC450)
Speaker: Janet Hitzeman (MITRE)
Topic: "SpatialML: A Proposed Standard for GeoSpatial Annotation"
-
[Fri Nov 2 2007 3:30-5:00]
Speaker: John Hale (Michigan State University)
Topic: "QUANTIFYING AMBIGUITY RESOLUTION WITH INFORMATION THEORY"
Abstract: Ambiguities about grammatical category and syntactic structure permeate natural language. Explaining human comprehenders' performance in the face of such confusion has been called the central problem in sentence processing (Tabor & Tanenhaus, 2001). How is it that human sentence understanders are able to recognize combinatory relationships, from an infinite range of possibilities, to arrive at a meaningful interpretation of a sentence?
This talk argues that an answer lies in formalizing the idea that comprehenders search the space of grammatical analyses in a way constrained by the words they hear. Comprehenders are constantly engaged in ambiguity resolution, and the more ambiguity is resolved, the longer they take.
To make this intuition fully explicit, ambiguity resolution will be given a precise interpretation in terms of information theory.
The general theory is tested using explicit grammar fragments that are probabilistic versions of Generalized Phrase Structure Grammars (Gazdar, Klein, Pullum & Sag 1985) and Minimalist Grammars (Stabler 1997). The theory will be shown to derive a range of well-documented processing phenomena including garden-path sentences, center-embedding, and the Accessibility (or Obliqueness) Hierarchy of relativized grammatical functions.
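One way to read "the more ambiguity is resolved, the longer they take" in information-theoretic terms is as a drop in Shannon entropy over the analyses still compatible with the input. The sketch below is a toy illustration under that assumption, not the speaker's formalization: the analyses and their probabilities are invented, and a real model would derive them from a probabilistic grammar.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a dict mapping analyses to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def condition(dist, compatible):
    """Keep only the analyses compatible with the next word and renormalize."""
    kept = {a: p for a, p in dist.items() if a in compatible}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no analysis is compatible with the input")
    return {a: p / total for a, p in kept.items()}

def entropy_reduction(before, after):
    """Bits of uncertainty removed by a word, clipped at zero (an assumption here)."""
    return max(0.0, entropy(before) - entropy(after))

# Hypothetical distribution over four analyses of a sentence prefix.
before = {
    "main-clause": 0.5,
    "reduced-relative": 0.25,
    "np-coordination": 0.125,
    "other": 0.125,
}

# Suppose the next word rules out all but two of the analyses.
after = condition(before, {"reduced-relative", "np-coordination"})

print(entropy(before))
print(entropy_reduction(before, after))
```

In this toy case the word that eliminates the most probable analysis removes a large share of the uncertainty, so a model of this kind would predict a comparatively long processing time at that word.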
Gerald Gazdar, Ewan Klein, Geoffrey Pullum and Ivan Sag. 1985. Generalized Phrase Structure Grammar. Harvard University Press.
Whitney Tabor and Michael K. Tanenhaus. 2001. "Dynamical Systems for Sentence Processing" in Connectionist Psycholinguistics, edited by Morten H. Christiansen and Nick Chater. Ablex.
Edward P. Stabler, Jr. 1997. "Derivational Minimalism" in Logical Aspects of Computational Linguistics, edited by Christian Retoré. Springer-Verlag.
-
[Mon Nov 12 2007 3:30-5:00]
Speaker: Jerry Hobbs (ISI/USC)
Topic: An Ontology of Time
I will first describe the OWL-Time ontology that was developed in conjunction with the DARPA Agent Markup Language (DAML) program. It covers the topological properties of time, including Allen's interval relations, measures of duration, and the clock and calendar, all axiomatized in first-order predicate calculus. I will then describe our effort to axiomatize temporal aggregates, such as "every third Monday of every other month". In developing all of this, we have attempted to cover a wide range of natural language constructions as well as subsume the coverage of existing calendar systems. Finally, I will describe our work on annotating events in newspaper articles with the range in which their duration is likely to fall. Here we have developed annotation guidelines to disambiguate the most common uncertain cases, we have developed an approach to measuring inter-annotator agreement, and we have performed some modestly successful machine learning experiments on this data. All of this represents joint work with Feng Pan and Rutu Mehta.
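Allen's interval relations, mentioned above, can be made concrete with a short classifier. This is a minimal sketch and not part of OWL-Time itself: the function name, the tuple representation of intervals, and the relation labels are choices made here for illustration.

```python
def allen_relation(a, b):
    """Return the Allen relation that interval a bears to interval b.

    Intervals are (start, end) pairs of comparable values with start < end.
    """
    (a1, a2), (b1, b2) = a, b
    if not (a1 < a2 and b1 < b2):
        raise ValueError("intervals must have start < end")
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only a proper partial overlap remains at this point.
    return "overlaps" if a1 < b1 else "overlapped-by"

# Example with hours of a day as interval endpoints.
morning_talk = (9, 11)
lunch = (11, 12)
workshop = (8, 17)

print(allen_relation(morning_talk, lunch))
print(allen_relation(morning_talk, workshop))
```

The function covers all thirteen relations; exactly one of them holds for any pair of well-formed intervals.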

