Keynote: Computational Reproducibility vs. Transparency: Is It FAIR Enough?

Abstract

The “reproducibility crisis” has sparked considerable interest in methods and tools to improve computational reproducibility. The FAIR data principles (data should be findable, accessible, interoperable, and reusable) are also being adapted and extended to other artifacts, notably computational analyses (scientific workflows, Jupyter notebooks, etc.). The current focus on computational reproducibility of scripts and other computational workflows sometimes overshadows a neglected and arguably more important issue: transparency of data analysis, including data wrangling and cleaning. In this talk I will ask: What information is gained by conducting a reproducibility experiment? This leads to a simple model (PRIMAD) that aims to answer this question by sorting out different scenarios. Using data cleaning recipes from OpenRefine as an example, I will present some approaches to improving the transparency and reusability of such recipes via workflow analysis. Finally, I will present some features of Whole-Tale, a computational platform for reproducible and transparent computational experiments.

About the speaker

Bertram Ludäscher is a professor at the School of Information Sciences at the University of Illinois, Urbana-Champaign, where he directs the Center for Informatics Research in Science and Scholarship (CIRSS). He is also a faculty affiliate with the National Center for Supercomputing Applications (NCSA) and the Department of Computer Science at Illinois. Until 2014 he was a professor at the Department of Computer Science at the University of California, Davis, and a faculty member of the UC Davis Genome Center. His research interests range from practical questions in scientific data and workflow management to database theory and knowledge representation & reasoning. Prior to his faculty appointments, he was a research scientist at the San Diego Supercomputer Center (SDSC) and an adjunct faculty member in the CSE Department at UC San Diego. He received his M.S. (Dipl.-Inform.) in computer science from the University of Karlsruhe (K.I.T.) and his PhD (Dr. rer. nat.) from the University of Freiburg, both in Germany.