Formalizing Subjective Interestingness in Exploratory Data Mining
"The rate at which research labs, enterprises and governments accumulate data is high and fast increasing. Often, these data are collected for no specific purpose, or they turn out to be useful for unanticipated purposes: Companie...
ver más
¿Tienes un proyecto y buscas un partner? Gracias a nuestro motor inteligente podemos recomendarte los mejores socios y ponerte en contacto con ellos. Te lo explicamos en este video
Información proyecto FORSIED
Líder del proyecto
UNIVERSITEIT GENT
No se ha especificado una descripción o un objeto social para esta compañía.
TRL
4-5
Presupuesto del proyecto
2M€
Fecha límite de participación
Sin fecha límite de participación.
Descripción del proyecto
"The rate at which research labs, enterprises and governments accumulate data is high and fast increasing. Often, these data are collected for no specific purpose, or they turn out to be useful for unanticipated purposes: Companies constantly look for new ways to monetize their customer databases; Governments mine various databases to detect tax fraud; Security agencies mine and cross-associate numerous heterogeneous information streams from publicly accessible and classified databases to understand and detect security threats. The objective in such Exploratory Data Mining (EDM) tasks is typically ill-defined, i.e. it is unclear how to formalize how interesting a pattern extracted from the data is. As a result, EDM is often a slow process of trial and error.
During this fellowship we aim to develop the mathematical principles of what makes a pattern interesting in a very subjective sense. Crucial in this endeavour will be research into automatic mechanisms to model and duly consider the prior beliefs and expectations of the user for whom the EDM patterns are intended, thus relieving the users of the complex task to attempt to formalize themselves what makes a pattern interesting to them.
This project will represent a radical change in how EDM research is done. Currently, researchers typically imagine a specific purpose for the patterns, try to formalize interestingness of such patterns given that purpose, and design an algorithm to mine them. However, given the variety of users, this strategy has led to a multitude of algorithms. As a result, users need to be data mining experts to understand which algorithm applies to their situation. To resolve this, we will develop a theoretically solid framework for the design of EDM systems that model the user's beliefs and expectations as much as the data itself, so as to maximize the amount of useful information transmitted to the user. This will ultimately bring the power of EDM within reach of the non-expert."