DALI project information
Project duration: 65 months
Start date: 2016-08-25
End date: 2022-01-31
Participation deadline
No participation deadline.
Project description
Natural language expressions are supposed to be unambiguous in context. Yet more and more examples are emerging of expressions that are ambiguous in context but nonetheless felicitous and rhetorically unmarked. In my own work, I demonstrated that ambiguity in anaphoric reference is ubiquitous through the study of disagreements in annotation, an approach I pioneered in computational linguistics (CL). Since then, additional cases of ambiguous anaphoric reference have been found, and similar findings have been made for other aspects of language interpretation, including word sense disambiguation and even part-of-speech tagging. Using the Phrase Detectives Game-With-A-Purpose to collect massive amounts of judgments online, we found that up to 30% of anaphoric expressions in our data are ambiguous. These findings raise a serious challenge for CL, because the assumption that an expression has a single interpretation in context is built into the dominant methodology, which depends on a reliably annotated gold standard.
The goal of the proposed project is to tackle this fundamental issue of disagreements in interpretation by using computational methods for collecting and analysing such disagreements. Some of these methods already exist but have never before been applied in linguistics on a large scale; others we will develop from scratch. First, I propose to develop more advanced games-with-a-purpose to collect massive amounts of data about anaphora from people playing a game. Second, I propose to use Bayesian models of annotation, widely used in epidemiology but not in linguistics, to analyse such data and identify genuine ambiguities; doing this for anaphora will require novel methods. Third, I propose to use these data to revisit current theories about anaphoric expressions that do not seem to cause infelicitousness when ambiguous. Finally, I propose to develop the first supervised approach to anaphora resolution that does not require a gold standard, as a blueprint for other areas.
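To make the idea of a model of annotation concrete, the following is a minimal sketch, not the project's own method. The description does not say which model is intended; this example assumes a Dawid and Skene (1979) style model with one confusion matrix per annotator, fitted with EM rather than full Bayesian inference. The names `fit_annotation_model` and `flag_ambiguous`, the toy votes, and the 0.8 threshold are all hypothetical choices for illustration. Items whose most probable label stays below the threshold are treated as candidates for genuine ambiguity rather than annotator error.

```python
from collections import defaultdict
from math import prod


def fit_annotation_model(annotations, n_classes, n_iter=20, smoothing=0.01):
    """EM for a Dawid-Skene style model (assumed here, not named in the source).

    annotations: list of (item, annotator, label), labels in range(n_classes).
    Returns (posteriors per item, confusion matrix per annotator).
    """
    by_item = defaultdict(list)
    for item, annotator, label in annotations:
        by_item[item].append((annotator, label))
    annotators = {a for _, a, _ in annotations}

    # Initialise each item's label posterior from its raw vote proportions.
    post = {}
    for item, votes in by_item.items():
        counts = [0.0] * n_classes
        for _, label in votes:
            counts[label] += 1.0
        post[item] = [c / len(votes) for c in counts]

    conf = {}
    for _ in range(n_iter):
        # M-step: class prior and smoothed per-annotator confusion matrices,
        # where conf[a][k][l] estimates P(annotator a says l | true label k).
        prior = [sum(p[k] for p in post.values()) / len(post)
                 for k in range(n_classes)]
        conf = {a: [[smoothing] * n_classes for _ in range(n_classes)]
                for a in annotators}
        for item, votes in by_item.items():
            for annotator, label in votes:
                for k in range(n_classes):
                    conf[annotator][k][label] += post[item][k]
        for a in annotators:
            for k in range(n_classes):
                total = sum(conf[a][k])
                conf[a][k] = [v / total for v in conf[a][k]]

        # E-step: recompute each item's posterior over true labels.
        for item, votes in by_item.items():
            scores = [prior[k] * prod(conf[a][k][l] for a, l in votes)
                      for k in range(n_classes)]
            z = sum(scores)
            post[item] = [s / z for s in scores]
    return post, conf


def flag_ambiguous(post, threshold=0.8):
    """Items whose most probable label has posterior below the threshold."""
    return sorted(item for item, p in post.items() if max(p) < threshold)


# Hypothetical toy data: four annotators (a-d) label five markables as 0 or 1.
VOTES = {"m1": [0, 0, 0, 0], "m2": [1, 1, 1, 1], "m3": [0, 0, 1, 1],
         "m4": [0, 0, 0, 0], "m5": [1, 1, 1, 1]}
TOY = [(item, ann, lab) for item, labs in VOTES.items()
       for ann, lab in zip("abcd", labs)]

if __name__ == "__main__":
    posteriors, _ = fit_annotation_model(TOY, n_classes=2)
    print(flag_ambiguous(posteriors))
```

With so few items per annotator, the estimates for disputed items are not reliable; the approach only becomes meaningful with the large volumes of judgments that a game-with-a-purpose can collect.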