Build Process
build-process.RmdThe projr Build Process
This article provides a comprehensive guide to how projr builds work, covering all three major stages (Pre-Build, Build, and Post-Build) and what happens at each step.
Overview
Build Types
projr supports two types of builds:
Production Builds - Create versioned releases with
full tracking: - projr_build_major() - Increment major
version (X.0.0) - projr_build_minor() - Increment minor
version (0.X.0) - projr_build_patch() - Increment patch
version (0.0.X) - Can run from either release or dev versions
Development Builds - Quick iterations without
version changes: - projr_build_dev() - Build without
incrementing version - Automatically bumps to dev version if not already
on one (e.g., 0.0.1 → 0.0.1-1)
Build Architecture
Each build follows a three-stage architecture:
┌─────────────┐
│ Pre-Build │ ─── Preparation, validation, versioning
└──────┬──────┘
│
▼
┌─────────────┐
│ Build │ ─── Document rendering and script execution
└──────┬──────┘
│
▼
┌─────────────┐
│ Post-Build │ ─── Finalization, distribution, commits
└─────────────┘
Stage 1: Pre-Build Phase
The pre-build phase prepares the project environment and validates that everything is ready for the build.
1.1 Validation and Checks
Purpose: Ensure all prerequisites are met before starting the build.
What happens:
-
Configuration validation -
projr_yml_check()validates_projr.yml - Scripts and hooks validation - Verifies all configured scripts and hooks exist
- Package availability check - Ensures required packages (quarto, rmarkdown, etc.) are installed
- Authentication checks - Verifies GitHub PAT and OSF tokens if needed for remote destinations
- Git repository check - Ensures Git is initialized (creates it if needed and configured)
- GitHub remote check - Verifies GitHub remote exists if push is enabled
- Upstream check - Ensures local repository is not behind remote
Example error handling:
# If packages are missing, build will stop with installation instructions
projr_build_patch()
#> Error: Required packages not available for build:
#> - quarto (install with: install.packages("quarto"))
#> Run projr_build_check_packages() to see all requirements1.2 Remote Destination Preparation
Purpose: Set up GitHub releases or other remote destinations for artifact distribution.
What happens:
- Creates GitHub releases if configured with
archive_githubparameter - Prepares local archive directories if configured with
archive_localparameter
Configuration:
1.3 Documentation and Version Updates
Purpose: Update documentation and version tracking files before the build.
What happens:
- renv snapshot - Captures current package dependencies (if renv is used)
-
Ignore files update - Ensures
.gitignoreand.Rbuildignoreare current - Documentation directory setup - Configures docs directory (safe vs unsafe paths)
- Version enforcement - Ensures project is on a development version before build
Key files affected:
-
renv.lock- Package dependencies snapshot -
.gitignore- Ensures docs and output directories are properly ignored -
VERSIONorDESCRIPTION- Version tracking
1.4 Version Calculation
Purpose: Calculate the version numbers for different build outcomes.
What happens:
projr calculates three version numbers:
- Pre-run version - Current version (e.g., “0.0.1-dev”)
- Run version - Version during build (e.g., “0.0.2” for production, “0.0.1-dev” for dev)
- Failure version - Version to revert to if build fails (e.g., “0.0.1-dev”)
Example version transitions:
Production build (patch) from dev version:
0.0.1-1 → 0.0.2 → 0.0.2-1 (success)
0.0.1-1 → 0.0.2 → 0.0.1-1 (failure)
Development build from release version (auto-bumps to dev):
0.0.1 → 0.0.1-1 → 0.0.1-1 (auto-bump)
Development build from dev version (no change):
0.0.1-1 → 0.0.1-1 → 0.0.1-1 (no version change)
1.5 Pre-Build Hooks
Purpose: Run custom scripts before the main build.
What happens:
- Executes scripts configured in
build.hooks.pre(production) ordev.hooks.pre(development) - Scripts in
build.hooks.bothordev.hooks.bothalso run here - Hooks run in the order specified in
_projr.yml - Each hook is executed via
source()in the project environment
Configuration examples:
# Production hooks
build:
hooks:
pre:
- setup-data.R
- check-dependencies.R
both:
- log-timestamp.R
# Development hooks (exclusive for dev builds)
dev:
hooks:
pre: quick-setup.R
both: dev-logger.RImportant: build.hooks are ignored in
dev builds, and dev.hooks are ignored in production
builds.
1.6 Output Directory Preparation
Purpose: Clear and prepare output directories based on configuration.
What happens:
- Version setting - Sets project to run version (e.g., “0.0.2”)
-
Directory clearing - Based on
clear_outputsetting:-
"pre"(default for production): Clear output directories now -
"post": Clear after build completes -
"never": Don’t clear directories
-
Directory types:
-
Safe directories - Temporary build locations in
cache (e.g.,
_tmp/projr/v0.0.2/output) -
Unsafe directories - Final output locations (e.g.,
_output,docs)
Environment variable:
# Set via environment variable
Sys.setenv(PROJR_CLEAR_OUTPUT = "pre")
# Or via function parameter
projr_build_patch(clear_output = "pre")1.7 Pre-Build Git Commit
Purpose: Commit all changes made during pre-build preparation.
What happens:
- Commits changes to version files, ignore files, and documentation
- Commit message is the fixed text “Snapshot pre-build”
- Only runs if
build.git.commitisTRUEin_projr.yml
Example commit message:
Snapshot pre-build
1.8 Manifest Creation (Pre-Build)
Purpose: Record the state of input files before the build.
What happens:
- Hashes all files in input directories (
raw-data,cache) - Creates temporary manifest file in cache directory
- Stores relative file paths and MD5 hashes
- This manifest will be merged with output hashes after the build
Manifest structure:
| label | fn | version | hash |
|---|---|---|---|
| raw-data | dataset.csv | v0.0.2 | abc123… |
| cache | processed.rds | v0.0.2 | def456… |
Stage 2: Build Phase
The build phase executes the actual document rendering or script execution.
2.1 Script Selection
Purpose: Determine which documents or scripts to build.
Priority order:
-
fileparameter - Explicitly specified files -
dev.scripts(dev builds) - Scripts configured for development -
build.scripts(production builds, or dev builds whendev.scriptsnot set) - Scripts configured for production -
Engine-specific config -
_quarto.ymlor_bookdown.ymlproject -
Auto-detection - Finds
.Rmd,.qmd, or.Rfiles in project root
Configuration examples:
2.2 Document Rendering
Purpose: Render documents using the appropriate engine.
Supported engines:
-
Quarto - For
.qmdfiles or Quarto projects -
Bookdown - For bookdown projects (detected via
_bookdown.yml) -
R Markdown - For
.Rmdfiles
What happens:
- Engine is detected automatically or specified via configuration
- Documents are rendered with arguments from
args_engineparameter - Output goes to the appropriate safe directory (cache-based temporary location)
- Rendering respects the project’s configuration files
(
_quarto.yml,_bookdown.yml, etc.)
Custom engine arguments:
# Pass custom arguments to the rendering engine
projr_build_patch(args_engine = list(
quiet = FALSE,
clean = TRUE
))2.3 Script Execution
Purpose: Execute R scripts that generate outputs.
What happens:
- Plain
.Rfiles are executed viasource() - Scripts run in the project environment
- Outputs should be written to appropriate directories
(
_output,docs)
Best practices:
- Use
projr_path_get_dir()to get correct output paths - Write outputs to version-specific directories when needed
- Keep scripts focused and modular
Stage 3: Post-Build Phase
The post-build phase finalizes artifacts, updates documentation, and distributes outputs.
3.1 Artifact Finalization
Purpose: Move built artifacts from temporary to final locations.
What happens:
-
Output copying - Copies files from safe directories
to unsafe directories
-
_tmp/projr/v0.0.2/output→_output
-
-
Docs copying - Copies documentation from safe to
unsafe directories
- Safe
docsdirectory →docs(project root)
- Safe
-
Directory clearing (if “post”) - Clears directories
if
clear_output = "post"
Clearing behavior:
clear_output = "pre": Clear before, final dirs contain latest + previous
clear_output = "post": Clear after, final dirs contain only latest
clear_output = "never": Never clear, all versions accumulate
3.2 Manifest Creation (Post-Build)
Purpose: Record the state of output files after the build.
What happens:
-
Hash output files - Hashes files in
outputanddocsdirectories - Merge manifests - Combines pre-build and post-build hashes
-
Append to history - Adds current version to
manifest.csvat project root - Preserve history - Maintains all previous versions in the manifest
Query functions:
# Changes between two versions
projr_manifest_changes("0.0.1", "0.0.2")
# Changes across a range
projr_manifest_range("0.0.1", "0.0.5")
# Last changes
projr_manifest_last_change()3.3 Documentation Updates
Purpose: Update project documentation and metadata files.
What happens:
-
roxygen2 documentation - Regenerates
.Rdfiles if roxygen comments found -
Citation files - Updates version in:
CITATION.cffcodemeta.jsoninst/CITATION
-
README rendering - Renders
README.Rmdif it exists - BUILDLOG update - Adds build record with change summary (production builds only)
- CHANGELOG update - Adds version entry if configured
BUILDLOG entry structure:
3.4 Post-Build Git Commit
Purpose: Commit all changes made during the build.
What happens:
- Commits output files, documentation, manifest, and metadata
- Commit message format: “Build v{version}: {message}”
- Only runs if
build.git.commitisTRUE - Does NOT push yet (push happens later)
Example commit message:
Build v0.0.2: Build message here
3.5 Remote Distribution
Purpose: Send artifacts to remote destinations (GitHub, OSF, local archives).
What happens:
For each configured remote destination:
-
Determine send strategy:
-
"sync-diff"- Sync only changed files -
"sync-purge"- Remove all, then upload all -
"upload-all"- Upload all files (may overwrite) -
"upload-missing"- Upload only files not on remote
-
-
Inspect remote (based on
send_inspect):-
"manifest"- Use manifest.csv to track versions -
"file"- Inspect actual files on remote -
"none"- Treat remote as empty
-
-
Apply send cue:
-
"always"- Always create new remote version -
"if-change"- Only if content changed -
"never"- Never send
-
-
Create structure:
-
"latest"- Overwrite files at destination -
"archive"- Create versioned subdirectories
-
Configuration example:
build:
github:
release: "archive"
structure: "archive"
send_cue: "if-change"
send_strategy: "sync-diff"
send_inspect: "manifest"Debug output:
When PROJR_OUTPUT_LEVEL="debug", you’ll see detailed
information:
Remote: github/archive (type: github, structure: archive)
Files to upload: 12
Strategy: sync-diff
Changed files: 3 modified, 2 added, 1 removed
3.6 Post-Build Hooks
Purpose: Run custom scripts after the main build and distribution.
What happens:
- Executes scripts configured in
build.hooks.post(production) ordev.hooks.post(development) - Scripts in
build.hooks.bothordev.hooks.bothalso run here - Hooks run in the order specified
- Useful for notifications, cleanup, or additional processing
Example use cases:
- Send email notifications
- Update external dashboards
- Create summary reports
- Clean up temporary files
- Trigger downstream processes
3.7 Development Version Setup
Purpose: Set project back to a development version after production build.
What happens:
- Only for production builds (not dev builds)
- Appends development suffix to current version (e.g., “0.0.2” → “0.0.2-1”)
- Commits the version change with message “Begin v{version}”
- Ensures project is ready for next development cycle
- Any changes after this commit are segmented from the released version
Version progression:
Build v0.0.2:
Pre-build: 0.0.1-1 → 0.0.2
Post-build: 0.0.2 → 0.0.2-1
Manifest System
The manifest system tracks all file changes across versions.
How It Works
Pre-build: 1. Hash files in raw-data
and cache directories 2. Store in temporary manifest:
_tmp/projr/{version}/manifest_pre.csv
Post-build: 1. Hash files in output and
docs directories 2. Merge with pre-build manifest 3. Append
to project manifest: manifest.csv
Manifest Structure
label,fn,version,hash
raw-data,dataset.csv,v0.0.1,abc123def456
cache,processed.rds,v0.0.1,789012ghi345
output,figure1.png,v0.0.1,jkl678mno901
docs,index.html,v0.0.1,pqr234stu567
Querying Changes
# What changed between versions?
changes <- projr_manifest_changes("0.0.1", "0.0.2")
changes$added # New files
changes$removed # Deleted files
changes$modified # Changed files
# What changed across multiple versions?
range_changes <- projr_manifest_range("0.0.1", "0.0.5")
# What changed in the last build?
last <- projr_manifest_last_change()Build Hooks System
Hooks allow you to run custom scripts at specific points in the build process.
Hook Stages
Pre-build hooks - Run after validation but before document rendering: - Setup tasks - Data preparation - Environment configuration
Post-build hooks - Run after distribution but before final push: - Notifications - Cleanup - Post-processing
Both-stage hooks - Run in both pre and post stages: - Logging - Time tracking - Common setup/teardown
Configuration Patterns
Production builds - Use
build.hooks:
build:
hooks:
pre:
- setup-data.R
- validate-inputs.R
post:
- send-notification.R
- cleanup-temp.R
both:
- log-timestamp.RDevelopment builds - Use dev.hooks
(exclusive):
Hook Execution
- Hooks execute via
source()in the project environment - All hooks must exist or build will fail (validated in pre-build)
- Hooks have access to all project functions and environment
- Execution order: stage-specific hooks first, then “both” hooks
Best Practices
- Keep hooks focused - Each hook should do one thing
-
Handle errors - Use
tryCatch()for non-critical hooks - Log activity - Help debug issues by logging hook execution
- Test independently - Ensure hooks work standalone
- Document purpose - Add comments explaining what each hook does
Logging and Debugging
projr provides comprehensive logging to help understand and debug builds.
Output Levels
Control console verbosity via PROJR_OUTPUT_LEVEL
environment variable:
"none" - No additional messages
(default for dev builds):
# Only errors are shown"std" - Standard progress messages
(default for production builds):
# Shows major steps
✔ Pre-build preparation completed
✔ Document rendering completed
✔ Post-build completed successfully"debug" - Verbose debugging
information:
Log Files
Detailed logs are written to the cache directory:
cache/projr/log/
├── output/ # Production build logs
│ ├── history/builds.md # All build records
│ └── output/2024-Jan-15/
│ └── 14-30-45.qmd # Detailed log
└── dev/ # Development build logs
└── ...
Log File Control
PROJR_LOG_DETAILED - Control detailed
log file creation:
# Create detailed logs (default)
Sys.setenv(PROJR_LOG_DETAILED = "TRUE")
# Skip detailed logs (history still maintained)
Sys.setenv(PROJR_LOG_DETAILED = "FALSE")Debug Workflow
When a build fails:
Check the error message - Often points to the exact issue
-
Enable debug output:
Sys.setenv(PROJR_OUTPUT_LEVEL = "debug") projr_build_patch() Review the log file - Check detailed log in cache
-
Check Git status:
-
Verify packages:
Configuration Options
Environment Variables
Build behavior:
-
PROJR_OUTPUT_LEVEL- Console verbosity:"none","std","debug" -
PROJR_CLEAR_OUTPUT- When to clear:"pre","post","never" -
PROJR_LOG_DETAILED- Detailed logs:"TRUE","FALSE"
Authentication:
-
GITHUB_PAT- GitHub Personal Access Token -
OSF_PAT- Open Science Framework token
Testing:
-
R_PKG_TEST_LITE- Enable LITE test mode -
R_PKG_TEST_CRAN- Enable CRAN test mode
YAML Configuration
Git settings:
build:
git:
commit: true # Auto-commit during build
push: true # Auto-push after build
add-untracked: true # Include untracked filesBuild scripts:
Build hooks:
Remote destinations:
Function Parameters
Build functions:
projr_build_patch(
msg = "Build message", # Git commit message
args_engine = list(), # Arguments for rendering engine
profile = NULL, # projr profile to use
archive_github = FALSE, # Quick GitHub archiving
archive_local = FALSE, # Quick local archiving
always_archive = TRUE, # Archive even if no changes
clear_output = "pre", # When to clear directories
output_level = "std" # Console verbosity
)
projr_build_dev(
file = NULL, # Specific file to build
bump = FALSE, # Whether to bump dev version
old_dev_remove = TRUE, # Remove old dev versions
args_engine = list(),
profile = NULL,
clear_output = "never", # Default for dev builds
output_level = "none" # Default for dev builds
)Common Patterns and Examples
Basic Production Build
library(projr)
# Build with patch version increment
projr_build_patch(msg = "Fix analysis bug")
# Build with minor version increment
projr_build_minor(msg = "Add new analysis section")
# Build with major version increment
projr_build_major(msg = "Complete rewrite")Development Build
# Quick dev build
projr_build_dev()
# Build specific file
projr_build_dev(file = "analysis.qmd")
# Dev build with debug output
Sys.setenv(PROJR_OUTPUT_LEVEL = "debug")
projr_build_dev()Build with Custom Options
# Build with custom engine arguments
projr_build_patch(
msg = "Update analysis",
args_engine = list(
quiet = FALSE,
clean = TRUE
)
)
# Build with different output level
projr_build_patch(
msg = "Release",
output_level = "debug"
)
# Build without clearing output
projr_build_patch(
msg = "Incremental update",
clear_output = "never"
)Build with Archiving
# Archive to GitHub release
projr_build_patch(
msg = "Release v0.0.2",
archive_github = TRUE
)
# Archive specific directories
projr_build_patch(
msg = "Archive outputs only",
archive_github = c("output", "docs")
)
# Archive locally
projr_build_patch(
msg = "Create local backup",
archive_local = TRUE
)Build with Profiles
# Use production profile
projr_build_patch(
msg = "Production release",
profile = "production"
)
# Use testing profile
projr_build_patch(
msg = "Test build",
profile = "testing"
)Troubleshooting
Build Fails with Package Error
Problem: Required packages not available
Solution:
# Check package requirements
pkg_status <- projr_build_check_packages()
# See what's needed
pkg_status$missing_pkgs
# Get installation commands
pkg_status$install_cmdsBuild Fails with Git Error
Problem: Git repository not initialized
Solution:
# Initialize Git
projr_init_git(TRUE)
# Or disable Git handling
projr_yml_git_set(FALSE)Build Fails with Authentication Error
Problem: GitHub or OSF authentication missing
Solution:
# See GitHub auth instructions
projr_instr_auth_github()
# See OSF auth instructions
projr_instr_auth_osf()
# Set tokens
Sys.setenv(GITHUB_PAT = "your_token_here")
Sys.setenv(OSF_PAT = "your_token_here")Build Succeeds but Files Not Where Expected
Problem: Safe vs unsafe directory confusion
Solution:
# Get safe directory (in cache)
projr_path_get_dir("output", safe = TRUE)
# Get unsafe directory (final location)
projr_path_get_dir("output", safe = FALSE)
# Check clear_output setting
Sys.getenv("PROJR_CLEAR_OUTPUT")Want to See What Changed
Problem: Need to understand what files changed in last build
Solution:
# Check last changes
changes <- projr_manifest_last_change()
changes$added
changes$modified
changes$removed
# Compare two specific versions
projr_manifest_changes("0.0.1", "0.0.2")Summary
The projr build process provides a comprehensive, automated workflow for reproducible research:
- Pre-Build validates, prepares, and versions your project
- Build renders documents and executes scripts
- Post-Build finalizes, documents, and distributes outputs
Key features:
- Automated versioning with manifest tracking
- Flexible hooks for customization
- Multiple remotes for distribution (GitHub, OSF, local)
- Comprehensive logging for debugging
- Git integration for version control
This architecture ensures that every build is: - Reproducible - Full record of what changed - Traceable - Complete audit trail via manifests and logs - Reliable - Validation and error handling at each step - Flexible - Configurable for different workflows
For more information:
- How-to guides - Task-focused guides for common workflows
- Concepts - Core concepts behind projr
- Environment - Environment variables reference
- Design - Architecture and design decisions