Project description
The project proposes new unsupervised computational models that automatically extract background knowledge by reading large amounts of unstructured text. The extracted knowledge takes the form of classes, categorized entities, and predicates whose arguments are typed by probability distributions over classes. The classes themselves will be automatically organized into taxonomies related to the predicates in which they participate. In this way, new methods and models based on extensional definitions of concepts are developed for the automatic creation of knowledge bases that stay closely related to textual representations, so as to enable textual inference. The extracted knowledge will also be linked to external human-made resources such as Freebase, DBpedia and WordNet, and the knowledge bases will be interfaced to several engines for disambiguation, relation extraction, relatedness and expansion.

All these resources and tools will be available for the development of a reading machine as part of the project. The purpose of the reading machine is to answer queries about a given text. Texts are never self-contained, and their interpretation always requires the recovery of large amounts of background knowledge. The machine reading technology under development must therefore incorporate the recovery and use of large amounts of background knowledge into language processing.

This machine reading technology will be evaluated through multiple-choice reading comprehension (MRC) tests developed by humans over documents previously unseen by the machine. MRC tests enable objective and reproducible evaluation experiments, and they are fully reusable as benchmarks available to the international community. Interestingly, the industrial partner in charge of developing the machine reading system will apply the technology in reverse to automatically generate MRC tests for the automatic assessment of children's reading abilities. The reading machine will work with at least two languages, English and French.

The support and coordination of an international evaluation campaign for machine reading in multiple languages (English, Spanish, French, German, Italian, Romanian, Bulgarian and Arabic) is part of the proposal. This evaluation campaign will serve to measure progress in the development of machine reading technology in a comparative and competitive environment. Evaluation exercises in specific domains, such as the biomedical domain, will also provide a space for potential technology transfer and for technology assessment by industry.

This subproject addresses the knowledge linking and integration needs of the project (WP3).