Progressive Data Analysis: a new computation paradigm for scalability in exploratory data analysis
Jean-Daniel Fekete
24 January 2020, 14:00 Room/Building: 435/PCRI-N
Contact:
Research activities: Web data management
Abstract:
Exploring data requires a short feedback loop, with a latency of at most 10 seconds because of human cognitive capabilities and limitations. When data becomes large or analyses become complex, sequential computations can no longer be completed in a few seconds, and interactive exploration is severely hampered. This talk will describe a novel computation paradigm called Progressive Data Analysis that brings the low-latency guarantee to the programming-language level by performing computations in a progressive fashion. Moving this progressive computation to the language level relieves programmers of exploratory data analysis systems from implementing the whole analytics pipeline in a progressive way from scratch, streamlining the implementation of scalable exploratory analytics systems. I will describe the new paradigm, report on novel experiments showing that humans can cope effectively with progressive systems, show demos using a prototype implementation called ProgressiVis, explain the requirements it implies through exemplar applications, and present opportunities and challenges ahead in the domains of visualization, visual analytics, machine learning, and databases.
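
To make the idea of progressive computation concrete, here is a minimal sketch in Python. It is not the ProgressiVis API; the function name `progressive_mean` and its `chunk_size` parameter are hypothetical and chosen only for illustration. The sketch assumes that a computation can be split into small chunks and that a usable partial result can be reported after each one, so a caller never waits for the full dataset before seeing an estimate.

```python
from typing import Iterator, List, Tuple


def progressive_mean(
    values: List[float], chunk_size: int = 1000
) -> Iterator[Tuple[float, float]]:
    """Yield (fraction_processed, running_mean) after each chunk.

    Hypothetical illustration: each yielded pair is a partial result
    that a user interface could display while the rest is still pending.
    """
    total = 0.0
    count = 0
    n = len(values)
    # Process the data in fixed-size chunks instead of all at once.
    for start in range(0, n, chunk_size):
        chunk = values[start:start + chunk_size]
        total += sum(chunk)
        count += len(chunk)
        # Hand back the current estimate and how much data it covers.
        yield count / n, total / count
```

A caller could iterate over `progressive_mean(data)` and refresh a display with each estimate, stopping early once the value looks stable. A real progressive system would presumably bound each step by elapsed time rather than by a fixed chunk size; the fixed size is a simplification made here.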