25  Packaging: R

Learning Objectives

  1. Be able to distinguish between essential software package elements (required to make a minimal software package) and those that are not essential but act to improve the user and developer experiences.
  2. Define what a namespace is.
  3. Explain the role of the code within the following software package files:
    • R
    • DESCRIPTION
    • NAMESPACE
    • man/*.Rd
  4. Compare and contrast how the Cookiecutter project template tool sets up a software package structure to how devtools and usethis set up a software package.
  5. Generate well formatted function and package-level documentation for software packages using available tools (e.g., Roxygen2 and pkgdown in R)

25.1 Essential R package files

Using the project layout we recommend for this course, here is an R package structure with only the most essential files.

pkg
├── DESCRIPTION
├── man
│   ├── functionA.Rd
│   └── functionB.Rd
├── NAMESPACE
└── R
    └── functions.R

What does each of these do?

  • DESCRIPTION stores all the metadata and dependency installation instructions for the package.

  • The man directory contains the function documentation in .Rd files (one per function). The contents of these files are generated from the roxygen2 function documentation in the R/*.R files.

  • Unsurprisingly, the NAMESPACE file is important in defining your package’s namespace.

  • R is the directory for the *.R files, which contain your package's functions, including the exported (i.e., user-facing) ones.

  • functions.R (this one can be named something else!) contains the functions you would like to share with your package users.
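
To make the layout concrete, here is a minimal sketch that builds the same skeleton by hand with base R in a temporary directory. The package name, the field values, and the body of functionA() are placeholders invented for illustration. In practice a tool such as usethis::create_package() scaffolds most of this for you, and the man/*.Rd files are generated rather than written by hand.

```r
# Build the minimal package skeleton by hand in a temporary directory.
# "pkg", the field values, and functionA() are placeholders for illustration.
pkg_dir <- file.path(tempdir(), "pkg")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg_dir, "man"), showWarnings = FALSE)

# DESCRIPTION: package metadata (and, later, dependencies)
writeLines(
  c(
    "Package: pkg",
    "Title: A Minimal Example Package",
    "Version: 0.0.0.9000",
    "Description: Shows the smallest useful package layout.",
    "License: MIT + file LICENSE"
  ),
  file.path(pkg_dir, "DESCRIPTION")
)

# NAMESPACE: which functions the package exports
writeLines("export(functionA)", file.path(pkg_dir, "NAMESPACE"))

# R/functions.R: the code itself
writeLines(
  'functionA <- function() "hello from functionA"',
  file.path(pkg_dir, "R", "functions.R")
)

# Inspect the resulting layout
list.files(pkg_dir, recursive = TRUE, include.dirs = TRUE)
```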

25.2 Getting to know a DESCRIPTION file’s contents

Here is an example DESCRIPTION file:

Package: foofactors                                                         ┐
Title: Make Factors Less Aggravating                                        │
Version: 0.0.0.9000                                                         │
Authors@R:                                                                  │
    person(given = "Tiffany",                                               │ Package metadata
           family = "Timbers",                                              │
           role = c("aut", "cre"),                                          │
           email = "tiffany.timbers@gmail.com",                             │
           comment = c(ORCID = "0000-0002-2667-376X"))                      │
Description: Factors have driven people to extreme measures, like ordering  │
    custom conference ribbons and laptop stickers to express how HELLNO we  │
    feel about stringsAsFactors. And yet, sometimes you need them. Can they │
    be made less maddening? Let's find out.                                 │
License: MIT + file LICENSE                                                 │
Encoding: UTF-8                                                             │
LazyData: true                                                              ┘
Roxygen: list(markdown = TRUE)                                              ┐
RoxygenNote: 7.0.2                                                          │ Developer dependencies
Suggests:                                                                   │
    testthat (>= 2.1.0),                                                    │
    covr                                                                    ┘
Imports:                                                                    ┐ User function dependencies
    forcats                                                                 ┘
URL: https://github.com/ttimbers/foofactors                                 ┐ More package metadata
BugReports: https://github.com/ttimbers/foofactors/issues                   ┘

This is equivalent to the pyproject.toml file in Python packages. Again, almost everything in it is customizable based on your package’s specifics.
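
Because DESCRIPTION is a plain-text file of Field: value pairs (the Debian control file format), base R can read it with read.dcf(). The sketch below writes a shortened copy of the example above to a temporary file and pulls out a few fields. The temporary file is only there to keep the example self-contained; in a real package you would point read.dcf() at the DESCRIPTION file itself.

```r
# Write a shortened copy of the example DESCRIPTION to a temporary file
desc_file <- tempfile()
writeLines(
  c(
    "Package: foofactors",
    "Title: Make Factors Less Aggravating",
    "Version: 0.0.0.9000",
    "Suggests:",
    "    testthat (>= 2.1.0),",
    "    covr",
    "Imports:",
    "    forcats"
  ),
  desc_file
)

# read.dcf() returns a character matrix with one column per field
desc <- read.dcf(desc_file)
colnames(desc)

# Package metadata
desc[, "Package"]
desc[, "Version"]

# Dependencies: needed by users (Imports) vs. by developers (Suggests)
desc[, "Imports"]
desc[, "Suggests"]
```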

25.3 man/*.Rd files

The man directory contains the function documentation in .Rd files (one per function). These can be created from the function’s roxygen2 documentation using devtools::document(). They use a custom syntax that is loosely based on LaTeX, which can be rendered to different formats for sharing with the package users.

For example, this roxygen2 documentation in R/fbind.R:

#' Bind two factors
#'
#' Create a new factor from two existing factors, where the new factor's levels
#' are the union of the levels of the input factors.
#'
#' @param a factor
#' @param b factor
#'
#' @return factor
#' @export
#' @examples
#' fbind(iris$Species[c(1, 51, 101)], PlantGrowth$group[c(1, 11, 21)])

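gives Rd syntax (loosely based on LaTeX) in man/fbind.Rd. As an illustration of that syntax, here is man/fcount.Rd, which roxygen2 generated in the same way from the documentation of another function, fcount(), in R/fcount.R: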

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/fcount.R
\name{fcount}
\alias{fcount}
\title{Make a sorted frequency table for a factor}
\usage{
fcount(x)
}
\arguments{
\item{x}{factor}
}
\value{
A tibble
}
\description{
Make a sorted frequency table for a factor
}
\examples{
fcount(iris$Species)
}
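
To illustrate how Rd source is rendered to different formats, here is a small sketch using the tools package that ships with R. It copies the fcount Rd source shown above into a temporary file and converts it to plain text and to HTML. The temporary file paths are only for illustration; R normally does this rendering for you when a user opens a help page.

```r
# Write the Rd source shown above to a temporary file
rd_file <- tempfile(fileext = ".Rd")
writeLines(
  c(
    "\\name{fcount}",
    "\\alias{fcount}",
    "\\title{Make a sorted frequency table for a factor}",
    "\\usage{",
    "fcount(x)",
    "}",
    "\\arguments{",
    "\\item{x}{factor}",
    "}",
    "\\value{",
    "A tibble",
    "}",
    "\\description{",
    "Make a sorted frequency table for a factor",
    "}",
    "\\examples{",
    "fcount(iris$Species)",
    "}"
  ),
  rd_file
)

# Render the same source to plain text ...
txt_file <- tempfile(fileext = ".txt")
tools::Rd2txt(rd_file, out = txt_file)
cat(readLines(txt_file), sep = "\n")

# ... and to HTML
html_file <- tempfile(fileext = ".html")
tools::Rd2HTML(rd_file, out = html_file)
```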

25.4 NAMESPACE file

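The NAMESPACE file contains directives that apply to R objects. Common directives indicate that a function should be exported from your package and/or that another package (or specific functions from it) should be imported for internal use. It is best to let devtools generate the contents of this file automatically from the roxygen2 documentation of the package’s functions: @export tags produce the export directives, and @import and @importFrom tags produce the import directives. The Imports field of the DESCRIPTION file declares which packages must be installed, but it does not by itself add anything to NAMESPACE.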

Here’s an example NAMESPACE file:

# Generated by roxygen2: do not edit by hand

export(compare)
export(expect_equal)
import(rlang)
importFrom(brio,readLines)
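
To see what exporting means in practice, the sketch below inspects the namespace of the stats package, which ships with R. It illustrates the concept only and is not part of any package's NAMESPACE file: exported objects can be reached with ::, while objects that live in the namespace but are not exported are hidden from users.

```r
# Exported functions appear in the namespace's exports and work with `::`
"median" %in% getNamespaceExports("stats")
stats::median(c(1, 5, 9))

# t.test.default is defined inside the stats namespace but is not exported,
# so it is not part of the user-facing interface
"t.test.default" %in% getNamespaceExports("stats")
exists("t.test.default", envir = asNamespace("stats"), inherits = FALSE)
```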

25.5 Dealing with other package dependencies in your package

25.5.1 Dealing with package dependencies in R

  • When we write code in our package that uses functions from other packages we need to import those functions from the namespace of their packages.

  • In R, we first declare the dependency via usethis::use_package(), which adds that package to the “Imports” section of DESCRIPTION.

  • We also need to refer to that package in our function code. There are two ways to do this:

    1. refer to the function as package::fun_name (e.g., dplyr::filter) whenever you use the function in your code
    2. add the function to your package’s namespace so that you can just refer to it as you normally would. To do this, add @importFrom <package_name> <function_or_operator> to the Roxygen documentation of the function in which you would like to use the function from the other package, and then use document() to update the NAMESPACE file.

It is recommended to use method 1 (pkg::fun_name) because it is more explicit about which external functions your package code depends on (making it easier to read for collaborators, including future you). The trade-off is that it’s a little more work to write.
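
The two styles can be sketched side by side. The functions below are hypothetical and use stats::median() only because the stats package ships with R; the same pattern applies to any package listed under Imports.

```r
# Method 1: refer to the external function explicitly with pkg::fun_name
center_explicit <- function(x) {
  x - stats::median(x)
}

# Method 2: import the function into the package namespace with a roxygen2
# tag, then call it by its bare name. Running document() turns the tag into
# importFrom(stats,median) in NAMESPACE.
#' @importFrom stats median
center_imported <- function(x) {
  x - median(x)
}

# Both versions give the same result
center_explicit(c(1, 2, 6))
center_imported(c(1, 2, 6))
```

With method 1, a reader can tell where median() comes from without looking at any other file, which is the explicitness the recommendation above refers to.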

25.6 Package documentation for R

There are several levels of documentation possible for R packages:

  • code-level documentation (Roxygen-style comments)
  • vignettes
  • package websites (via pkgdown)

25.7 Code-level documentation (Roxygen-style comments)

  • We learned the basics of how to write Roxygen-style comments in DSCI 511
  • In the package context, there are two tags you should know about:
    • @export - add this to every package function you want to make available to your users (it adds the function to the NAMESPACE file)
    • @noRd - add this to internal helper functions that you don’t want your user to know about (no .Rd file is generated for them)
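
As a minimal sketch of how these tags are used, the hypothetical functions below pair one exported, documented function with one internal helper. The function names and bodies are invented for illustration.

```r
#' Count missing values
#'
#' @param x A vector.
#'
#' @return The number of `NA` values in `x`.
#' @export
#' @examples
#' count_missing(c(1, NA, 3))
count_missing <- function(x) {
  sum(is_missing(x))
}

#' Flag missing values (internal helper, not exported)
#'
#' @param x A vector.
#'
#' @noRd
is_missing <- function(x) {
  is.na(x)
}

count_missing(c(1, NA, 3))
```

Only count_missing() would get an export directive and a man/count_missing.Rd file; is_missing() stays documented for developers in the source but produces no help page.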

25.8 Vignettes

It is common for packages to have vignettes (think demos with narratives) showing how to use the package in a more real-world scenario than the documentation examples show. Think of your vignette as a demonstration of how someone would use your function to solve a problem.

  • It should demonstrate how the individual functions in your package work, as well as how they can be integrated together.

  • To create a template for your vignette, run: usethis::use_vignette("package_name-vignette")

  • Add content to that file and knit it when done.

As an example, here’s the dplyr vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html

25.9 Package websites (via pkgdown)

  • Vignettes are very helpful; however, they are not that discoverable by others. Websites are a much easier way to share your package with others.

  • The pkgdown R package lets you build a beautiful website for your R package in 4 steps!

    1. Turn on GitHub pages in your package repository, setting main branch / docs folder as the source.

    2. Install pkgdown: install.packages("pkgdown")

    3. Run pkgdown::build_site() from the root of your project. By default, this writes the website files to the docs directory.

    4. Commit the changes and push your code, including the docs directory, to GitHub.com

In addition to the beautiful website, pkgdown automatically links to your vignette under the articles section of the website!!! 🎉🎉🎉

Note you can also configure a GitHub Actions workflow to automate the rebuilding of the pkgdown site anytime changes are pushed to your package’s GitHub repository. We will discuss this later in the course under the topic of continuous deployment.