Reproducible and Trustworthy Workflows for Data Science

Tiffany A. Timbers, Joel Ostblom, Florencia D’Andrea, Rodolfo Lourenzutti

January 2023

Course notes for teaching concepts and practices related to reproducible and trustworthy workflows for data science. Specifically, we cover workflows for writing reproducible, robust, and valid computer scripts, analytic reports, and data analysis pipelines, as well as computational environments and the testing and deployment of software written for data analysis. Emphasis is placed on how to collaborate effectively on these tasks with others using version control tools such as Git and GitHub. Concepts are learned and applied using real data and case studies.


Software is licensed under the MIT License; non-software content is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License. See the license file for more information.