Innovating Works

FADAMS

Funded
Foundations of Factorized Data Management Systems
The objective of this project is to investigate scalability questions arising with a new wave of smart relational data management systems that integrate analytics and query processing. These questions will be addressed by a fundamental shift from centralized processing on tabular data representation, as supported by traditional systems and analytics software packages, to distributed and approximate processing on factorized data representation. Factorized representations exploit algebraic properties of relational algebra and the structure of queries and analytics to achieve radically better data compression than generic compression schemes, while at the same time allowing processing in the compressed domain. They can effectively boost the performance of relational processing by avoiding redundant computation in the one-server setting, yet they can also be naturally exploited for approximate and distributed processing. Large relations can be approximated by their subsets and supersets, i.e., lower and upper bounds, that factorize much better than the relations themselves. Factorizing relations, which represent intermediate results shuffled between servers in distributed processing, can effectively reduce the communication cost and improve the latency of the system. The key deliverables will be novel algorithms that combine distribution, approximation, and factorization for computing mixed loads of queries and predictive and descriptive analytics on large-scale data. This research will result in fundamental theoretical contributions, such as complexity results for large-scale processing and tractable algorithms, and also in a scalable factorized data management system that will exploit these theoretical insights.
We will collaborate with industrial partners, who are committed to assisting in providing datasets and realistic workloads, infrastructure for large-scale distributed systems, and support for transferring the products of the research to industrial users.
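The abstract's claim that factorized representations allow "processing in the compressed domain" can be illustrated with a small sketch. This is not the project's system: the function names (`factorize_join`, `flatten`, `count`, `sum_b_times_c`) and the two-relation schema R(a, b), S(a, c) are assumptions chosen for illustration. The join result is stored per join key as a product of two value lists instead of a flat table, and aggregates are computed without enumerating the tuples.

```python
from collections import defaultdict
from itertools import product


def factorize_join(r, s):
    """Factorized result of joining R(a, b) with S(a, c) on a.

    Returns {a: (b_values, c_values)}, which stands for the union over a
    of {a} x b_values x c_values. Illustrative only.
    """
    bs, cs = defaultdict(list), defaultdict(list)
    for a, b in r:
        bs[a].append(b)
    for a, c in s:
        cs[a].append(c)
    # Keep only keys that appear in both relations (natural join on a).
    return {a: (bs[a], cs[a]) for a in bs if a in cs}


def flatten(fact):
    """Expand the factorized form into the usual flat list of tuples."""
    return [
        (a, b, c)
        for a, (b_vals, c_vals) in fact.items()
        for b, c in product(b_vals, c_vals)
    ]


def count(fact):
    """COUNT(*) computed on the factorized form: sum of |B_a| * |C_a|."""
    return sum(len(b_vals) * len(c_vals) for b_vals, c_vals in fact.values())


def sum_b_times_c(fact):
    """SUM(b * c) via distributivity: sum over a of (sum B_a) * (sum C_a)."""
    return sum(sum(b_vals) * sum(c_vals) for b_vals, c_vals in fact.values())


def stored_values(fact):
    """Number of values stored in the factorized form (key plus both lists)."""
    return sum(1 + len(b_vals) + len(c_vals) for b_vals, c_vals in fact.values())
```

For a key with m matching b-values and n matching c-values, the flat result holds m * n tuples while the factorized form stores m + n values plus the key, and both aggregates above touch each stored value once. The same distributivity argument is what lets sums of products be pushed past the join.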
31/05/2022
UZH
2M€
Project duration: 75 months Start date: 2016-02-02
End date: 2022-05-31

Funding line: granted

The H2020 body notified the award of the project on 2022-05-31
Target funding line: The project was funded through the following grant:
Budget: The total project budget amounts to 2M€
Project leader
UNIVERSITAT ZURICH: No description or corporate purpose has been specified for this company.
Technology profile: TRL 4-5