Operations Research

A site for hosting software and data repositories associated with papers appearing in the journal _Operations Research_

Dependency Management in R

R’s dependency management has evolved from basic built-in tools to modern solutions that provide reproducibility and isolation.

Core Package Management

Built-in Functions

R includes basic package management functionality:

# Install packages
install.packages("dplyr")
install.packages(c("ggplot2", "tidyr"))

# Load packages
library(dplyr)
require(ggplot2)  # Similar, but returns FALSE with a warning (instead of an error) if the package is missing

# Update packages
update.packages()

# Remove packages
remove.packages("dplyr")
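
A common pattern built on these functions is to install only the packages that are not already available. The following is a minimal sketch; `missing_packages()` is a hypothetical helper introduced for illustration, not part of base R.

```r
# missing_packages() is a hypothetical helper, not a base R function.
# It returns the subset of `pkgs` that cannot be loaded from any library.
# Note: requireNamespace() loads the namespace of each package it finds.
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!available])
}

# "stats" and "utils" ship with R, so nothing should need installing here
needed <- c("stats", "utils")
to_install <- missing_packages(needed)

# Only contact the repository when something is actually missing
if (length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking first avoids reinstalling (and silently upgrading) packages that are already present.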

Package Repositories

# From CRAN (default)
install.packages("dplyr")

# From Bioconductor (requires the BiocManager package)
BiocManager::install("DESeq2")

# From GitHub (requires the devtools package)
devtools::install_github("tidyverse/dplyr")

Package Description Files

DESCRIPTION

For R packages, the DESCRIPTION file specifies dependencies:

Package: mypackage
Version: 0.1.0
Imports:
    dplyr (>= 1.0.0),
    ggplot2
Suggests:
    testthat,
    knitr
Depends:
    R (>= 4.0.0)
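
Because DESCRIPTION uses the Debian control file format, base R can read it with `read.dcf()`. The sketch below writes the example above to a temporary file and extracts one field; `parse_field()` is a hypothetical helper introduced for illustration.

```r
# Write the example DESCRIPTION to a temporary file (illustrative content)
desc_path <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: mypackage",
  "Version: 0.1.0",
  "Imports:",
  "    dplyr (>= 1.0.0),",
  "    ggplot2",
  "Suggests:",
  "    testthat,",
  "    knitr",
  "Depends:",
  "    R (>= 4.0.0)"
), desc_path)

# parse_field() is a hypothetical helper, not a base R function.
# It returns the comma-separated entries of one DESCRIPTION field.
parse_field <- function(path, field) {
  value <- unname(read.dcf(path, fields = field)[1, field])
  if (is.na(value)) return(character(0))  # field not present
  entries <- trimws(strsplit(value, ",", fixed = TRUE)[[1]])
  entries[nzchar(entries)]                # drop empties from trailing commas
}

imports <- parse_field(desc_path, "Imports")
print(imports)
```

Each returned entry keeps its version requirement, so `"dplyr (>= 1.0.0)"` and `"ggplot2"` come back as separate strings.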

Version Specifications

dplyr (>= 1.0.0)           # Minimum version
ggplot2 (>= 3.0), ggplot2 (< 4.0)   # Version range: one entry per bound
tidyr (== 1.2.0)           # Exact version (rare)
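
Each requirement pairs a comparison operator with a version number. To illustrate how such a requirement is evaluated, here is a minimal sketch using `utils::compareVersion()`; `satisfies()` is a hypothetical helper, and the version strings are made up for the example.

```r
# satisfies() is a hypothetical helper, not a base R function.
# It checks a version string against one requirement such as ">= 1.0.0".
satisfies <- function(version, requirement) {
  parts <- strsplit(trimws(requirement), "[[:space:]]+")[[1]]
  op <- parts[1]
  required <- parts[2]
  stopifnot(op %in% c(">=", ">", "<=", "<", "=="))

  # compareVersion() returns -1, 0, or 1 for older, equal, or newer
  cmp <- utils::compareVersion(version, required)
  switch(op,
    ">=" = cmp >= 0,
    ">"  = cmp > 0,
    "<=" = cmp <= 0,
    "<"  = cmp < 0,
    "==" = cmp == 0
  )
}

satisfies("1.0.9", ">= 1.0.0")   # minimum version
satisfies("3.4.2", "< 4.0")      # upper bound
satisfies("1.2.1", "== 1.2.0")   # exact version
```

Versions are compared component by component, not as text, which is why `"1.10.0"` counts as newer than `"1.9.0"`.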

Modern Dependency Management Tools

1. renv

renv is the current standard for project-level dependency management:

# Initialize renv for a project
renv::init()

# Install packages into the project library
# (they are recorded in renv.lock at the next snapshot)
install.packages("dplyr")

# Save the current state
renv::snapshot()

# Restore from lockfile
renv::restore()

# Update packages
renv::update()

Project Structure:

myproject/
├── renv.lock          # Lockfile with exact versions
├── .Rprofile          # Auto-activates renv
└── renv/
    ├── activate.R     # Activation script
    └── library/       # Project library (installed packages)

renv.lock example:

{
  "R": {
    "Version": "4.2.0",
    "Repositories": [
      {
        "Name": "CRAN",
        "URL": "https://cran.rstudio.com"
      }
    ]
  },
  "Packages": {
    "dplyr": {
      "Package": "dplyr",
      "Version": "1.0.9",
      "Source": "Repository",
      "Repository": "CRAN"
    }
  }
}

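Key Features:

  - Project-local library: packages are installed under renv/library/
  - Lockfile: renv.lock records the exact version and source of each package
  - Automatic activation: .Rprofile loads renv when R starts in the project
  - Snapshot and restore: renv::snapshot() saves the state, renv::restore() reproduces it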

2. packrat (Legacy)

The predecessor to renv, now superseded:

packrat::init()
packrat::snapshot()
packrat::restore()

Note: Use renv instead for new projects.

3. checkpoint

Uses MRAN (Microsoft R Application Network) time-based snapshots:

library(checkpoint)
checkpoint("2023-01-15")  # Use packages as of this date

Status: Microsoft retired MRAN in July 2023, so checkpoint can no longer download snapshots from it and is of limited use now.

Dependency Resolution

Basic Approach

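R’s built-in install.packages() installs the latest version available in the configured repositories, together with any missing packages listed under Depends, Imports, and LinkingTo. It does not pin versions or record what was installed.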

With renv

# Print a diagnostics report for the project
renv::diagnostics()

# List the packages used by the project's source files
renv::dependencies()

Environment Management

System Library vs. User Library

R has multiple library paths:

.libPaths()  # View library locations
# [1] "~/R/library"           # User library
# [2] "/usr/lib/R/library"    # System library
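
R searches these paths in order and loads a package from the first library that contains it. As a small sketch of how to check where a package comes from, `library_of()` below is a hypothetical helper built on base R's `find.package()`.

```r
# library_of() is a hypothetical helper, not a base R function.
# It returns the library directory that a package would be loaded from.
library_of <- function(pkg) {
  dirname(find.package(pkg))
}

# Libraries in search order: the first match wins
search_order <- .libPaths()
print(search_order)

# "stats" ships with R, so it normally lives in the system library
stats_library <- library_of("stats")
print(stats_library)
```

This is useful when two libraries hold different versions of the same package and you need to know which one R is using.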

Project Isolation

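Without renv: every project shares the user and system libraries, so upgrading a package for one project changes it for all of them.

With renv: each project gets its own library under renv/library/, and renv.lock records the exact versions it should contain.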

Package Development

devtools Workflow

For developing packages:

library(devtools)

# Load your package for development
load_all()

# Install dependencies from DESCRIPTION
install_deps()

# Check package
check()

# Install your package
install()

roxygen2 for Documentation

#' @importFrom dplyr filter mutate
#' @import ggplot2

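These tags generate the import directives in the NAMESPACE file when you run devtools::document(). They do not update DESCRIPTION: each imported package must also be listed there under Imports, by hand or with usethis::use_package().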

Docker Integration

To pin the R version and system libraries as well as the R packages:

FROM rocker/r-ver:4.2.0

RUN R -e "install.packages('renv')"

COPY renv.lock renv.lock
RUN R -e "renv::restore()"

The rocker project provides versioned R Docker images.

Best Practices

  1. Use renv for projects: Ensures reproducibility and isolation
  2. Commit renv.lock: Version control your lockfile, not renv/library/
  3. Specify minimum versions: In DESCRIPTION files for packages
  4. Regular snapshots: Run renv::snapshot() after package changes
  5. Keep .Rprofile: renv creates it to auto-activate the project, so commit it along with renv/activate.R
  6. Document session info: Use sessionInfo() to record your environment

# Save session information
writeLines(capture.output(sessionInfo()), "session_info.txt")
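
The text output of sessionInfo() is meant for people to read. If you also want a machine-readable record, one possible approach is to write package versions to a CSV file. This is a sketch only; `package_versions()` and the output file name are hypothetical choices for illustration.

```r
# package_versions() is a hypothetical helper, not a base R function.
# It returns a data frame with the installed version of each package.
package_versions <- function(pkgs) {
  versions <- vapply(
    pkgs,
    function(p) as.character(utils::packageVersion(p)),
    character(1)
  )
  data.frame(
    package = pkgs,
    version = unname(versions),
    stringsAsFactors = FALSE
  )
}

# "stats" and "utils" ship with R; replace with the packages you use
versions <- package_versions(c("stats", "utils"))

# Written to a temporary directory here; the file name is an arbitrary choice
out_file <- file.path(tempdir(), "package_versions.csv")
write.csv(versions, out_file, row.names = FALSE)
```

In an renv project the lockfile already serves this purpose, so this is mainly useful for projects that do not use renv.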

Common Workflows

Starting a New Project

# Create project directory
dir.create("myproject")
setwd("myproject")

# Initialize renv
renv::init()

# Install packages
install.packages("tidyverse")

# Save state
renv::snapshot()

Collaborating

# In a shell: team member clones the repo
git clone repo_url
cd repo

# Open R in the project directory
# renv activates automatically via .Rprofile

# In R: install the exact versions from renv.lock
renv::restore()

Updating Dependencies

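# Update a package
install.packages("dplyr")

# Test your code before snapshotting. If the update causes problems,
# roll the library back to the versions in renv.lock
renv::restore()

# If everything works, record the update instead
renv::snapshot()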

Key Differences from Python/Julia: