Deep learning continues to achieve impressive breakthroughs across disciplines and is a major driving force behind a multitude of industry innovations. Most of its successes are achieved by increasingly large neural networks that...
ver más
¿Tienes un proyecto y buscas un partner? Gracias a nuestro motor inteligente podemos recomendarte los mejores socios y ponerte en contacto con ellos. Te lo explicamos en este video
Información proyecto SPARSE-ML
Duración del proyecto: 61 meses
Fecha Inicio: 2023-10-25
Fecha Fin: 2028-11-30
Fecha límite de participación
Sin fecha límite de participación.
Descripción del proyecto
Deep learning continues to achieve impressive breakthroughs across disciplines and is a major driving force behind a multitude of industry innovations. Most of its successes are achieved by increasingly large neural networks that are trained on massive data sets. Their development inflicts costs that are only affordable by a few labs and prevent global participation in the creation of related technologies. The huge model sizes also pose computational challenges for algorithms that aim to address issues with features that are critical in real-world applications like fairness, adversarial robustness, and interpretability. The high demand of neural networks for vast amounts of data further limits their utility for solving highly relevant tasks in biomedicine, economics, or natural sciences.
To democratize deep learning and to broaden its applicability, we have to find ways to learn small-scale models. With this end in view, we will promote sparsity at multiple stages of the machine learning pipeline and identify models that are scaleable, resource- and data-efficient, robust to noise, and provide insights into problems. To achieve this, we need to overcome two challenges: the identification of trainable sparse network structures and the de novo optimization of small-scale models.
The solutions that we propose combine ideas from statistical physics, complex network science, and machine learning. Our fundamental innovations rely on the insight that neural networks are a member of a cascade model class that we made analytically tractable on random graphs. Advancing our derivations will enable us to develop novel parameter initialization, regularization, and reparameterization methods that will compensate for the missing implicit benefits of overparameterization for learning. The significant reduction in model size achieved by our methods will help unlock the full potential of deep learning to serve society as a whole.