Reproducible and Trustworthy Workflows for Data Science
Welcome
These course notes teach concepts and practices for reproducible and trustworthy data science workflows. Specifically, we cover workflows for writing reproducible, robust, and valid computer scripts, analytic reports, and data analysis pipelines; for creating computational environments; and for testing and deploying software written for data analysis. Emphasis is placed on collaborating effectively with others on these tasks using version control tools such as Git and GitHub. Concepts are learned and applied using real data and case studies.