Innovating Works

GraViLa

Funded
Graphs without Labels: Multimodal Structure Learning without Human Supervision
Multimodal learning focuses on training models with data in more than one modality, such as videos capturing visual and audio information or documents containing images and text. Current approaches use such data to train large-scale deep learning models without human supervision by sampling pairwise data (e.g., an image-text pair from a website) and training the network, for example, to distinguish matching from non-matching pairs in order to learn better representations.
We argue that multimodal learning can do more. By combining information from different sources, multimodal models capture cross-modal semantic entities, and because most multimodal documents are a collection of connected modalities and topics, multimodal models should allow us to capture the inherent high-level topology of such data. The goal of this project is to learn semantic structures from multimodal data, capturing long-range concepts and relations via multimodal and self-supervised learning without human annotation. We will represent this information in the form of a graph, with latent semantic concepts as nodes and their connectivity as edges. Based on this structure, we will extend current unimodal approaches to capture and process data from different modalities in a single structure.
Finally, we will explore the challenges and opportunities of the proposed idea with respect to its impact on two main challenges in machine learning: data-efficient learning and fairness in label-free learning. By bridging the gap between two parallel trends, multimodal supervision and graph-based representations, we combine their strengths in generating and processing topological data. This will not only allow us to build new applications and tools but also open new ways of processing and understanding large-scale data that are currently out of reach.
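The pair-matching objective mentioned in the abstract (telling matching image-text pairs from non-matching ones) can be illustrated with a small contrastive loss. This is only a minimal sketch under stated assumptions, not the project's method: the function name `pair_matching_loss`, the `temperature` value, and the toy two-dimensional embeddings are all hypothetical and chosen purely for illustration.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so that dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def pair_matching_loss(image_embs, text_embs, temperature=0.1):
    """Mean cross-entropy of selecting, for each image, its matching text.

    Assumption: image_embs[i] and text_embs[i] form a matching pair; every
    other text in the batch serves as a non-matching candidate.
    """
    images = [normalize(v) for v in image_embs]
    texts = [normalize(v) for v in text_embs]
    total = 0.0
    for i, img in enumerate(images):
        logits = [dot(img, txt) / temperature for txt in texts]
        # Numerically stable log-sum-exp over all candidate texts.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_norm - logits[i]
    return total / len(images)


# Toy data (hypothetical): two images and two candidate text embeddings.
image_embs = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # text i is close to image i
swapped = [[0.1, 0.9], [0.9, 0.1]]   # text i is close to the other image

print(pair_matching_loss(image_embs, aligned))
print(pair_matching_loss(image_embs, swapped))
```

In this toy setup the loss is lower when each text embedding lies close to its own image than when the pairs are swapped, which is the signal such training relies on in place of human labels.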
31/03/2029
1M€
Project duration: 59 months
Start date: 2024-04-01
End date: 2029-03-31

Funding line: awarded

HORIZON EUROPE notified the award of the project on 2024-04-01.
Target funding line
The project was funded through the following grant:
ERC-2023-STG: ERC STARTING GRANTS
Closed 2 years ago
Budget
The total project budget amounts to 1M€.
Project leader
RHEINISCHE FRIEDRICHWILHELMSUNIVERSITAT BONN
No description or corporate purpose has been specified for this organization.
Technology profile: TRL 4-5