
Overview

These are the setup instructions for a demonstration of some features of the REDCap Custodian R package.

In these setup steps, we will demonstrate how to:

  1. Create a local credentials database
  2. Fetch and store your API tokens in the credentials database
  3. Create and load some fake data into the REDCap projects we want to test with.

All of this is a prelude to a demonstration of moving data between REDCap projects described in vignette("friday-call-demo").

Prerequisites

If you want to do this yourself, we recommend you do these things to set up your development and testing environment:

  1. Install R, RStudio, and the tidyverse packages
  2. Install redcapcustodian from GitHub (see the sketch after this list)
  3. Clone the redcap-docker-compose git repo
  4. Create a local REDCap with redcap-docker-compose
  5. Create a new project in RStudio. Note this is an R project, not a REDCap project.
  6. Copy local.env.txt to .env in the root of your new project folder.
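
Step 2 can be done from the R console. This is a minimal sketch, assuming the package is hosted at ctsit/redcapcustodian on GitHub; check the package’s README for the authoritative location:

# install the 'remotes' helper, then install redcapcustodian from GitHub
install.packages("remotes")
remotes::install_github("ctsit/redcapcustodian")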

You also need some REDCap projects to play with. If you have your own, that’s great. For our demo, we need two REDCap projects to exist. Please create those projects with these XML files: main.xml and biospecimen.xml.

Name the project from main.xml “Demo Main” and the project from biospecimen.xml “Demo Biospecimen”.

To talk to those projects from code, they’ll need API tokens. Create a token for your user on each of the main and biospecimen projects. Do the same for any other project you want to play with.

With those changes in place, you can start developing scripts that use REDCap Custodian.

Fetch and store API tokens

First, load some R packages:
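
The exact set isn’t critical, but based on the functions used in the rest of this vignette, the load block probably looks something like this:

# packages used below: DBI/RSQLite for the local credentials store,
# dotenv for load_dot_env(), tidyverse/lubridate for data wrangling
library(redcapcustodian)
library(DBI)
library(RSQLite)
library(dotenv)
library(tidyverse)
library(lubridate)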

Then, make a database to hold the credentials:

dir.create(here::here("credentials"))

# creates file if one does not exist
file_conn <- DBI::dbConnect(RSQLite::SQLite(), here::here("credentials/credentials.db"))

# SQLite friendly schema
credentials_sql <- "CREATE TABLE IF NOT EXISTS `credentials` (
  `redcap_uri` TEXT NOT NULL,
  `server_short_name` varchar(128) NOT NULL,
  `username` varchar(191) NOT NULL,
  `project_id` int(10) NOT NULL,
  `project_display_name` TEXT NOT NULL,
  `project_short_name` varchar(128) DEFAULT NULL,
  `token` varchar(64) NOT NULL,
  `comment` varchar(256) DEFAULT NULL
);
"

dbExecute(file_conn, credentials_sql)
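
If you want to confirm the table was created, list the tables in the file:

# should print "credentials"
DBI::dbListTables(file_conn)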

Now, fetch all of your API tokens and write them into your local credentials DB.

# fetching all extant API tokens and adding them to storage #################
load_dot_env(here::here("local.env.txt"))

my_username <- "admin"
source_conn <- connect_to_redcap_db()
scraped_credentials <- scrape_user_api_tokens(source_conn, my_username)

# alter credentials to match local schema
source_credentials_upload <- scraped_credentials %>%
  mutate(
    redcap_uri = Sys.getenv("URI"),
    server_short_name = tolower(Sys.getenv("INSTANCE"))
  ) %>%
  # drop rows already stored; with no `by` argument, anti_join matches on all shared columns
  anti_join(
    tbl(file_conn, "credentials") %>%
      collect()
  )

dbAppendTable(file_conn, "credentials", source_credentials_upload)

DBI::dbDisconnect(source_conn)
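
To verify the append worked, count the rows now in the local store:

# how many credentials are stored locally?
DBI::dbGetQuery(file_conn, "SELECT count(*) AS n FROM credentials")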

Make test data and write it to your test projects

We will generate some records in your main project that you can later port over to the Biospecimen project. First, access the credential database to get your credentials for the REDCap project you want to read from.

source_credentials <- tbl(file_conn, "credentials") %>%
  filter(username == my_username) %>%
  collect() %>%
  filter(str_detect(project_display_name, "Demo Main"))

# Disconnect when you're done with the database
dbDisconnect(file_conn)

We use Will Beasley’s REDCapR library to interact with REDCap via its API. REDCapR allows you to specify the forms, fields, event names, and time frames of interest. It even allows REDCap filtering.
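
For example, once the main project has data in it, a scoped read might look like the sketch below. The field and event names match the demo project built later in this vignette, and filter_logic uses REDCap’s own filter syntax:

# read only selected fields from one event, filtered on collection date
REDCapR::redcap_read(
  redcap_uri = source_credentials$redcap_uri,
  token = source_credentials$token,
  fields = c("record_id", "sample_collected_date"),
  events = c("event_1_arm_1"),
  filter_logic = "[sample_collected_date] > '2022-01-01'"
)$data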

It isn’t necessary to understand most of the following block, as you typically won’t be populating a project with random data.

record_count_to_create <- 50
collection_events <- 5
tubes_per_collection <- 30

record_columns <- c(
  "record_id",
  "redcap_event_name",
  "sample_collected_date",
  "tmp_event_id",
  paste0("tube_id", 1:tubes_per_collection),
  paste0("tube_specimen_type", 1:tubes_per_collection),
  paste0("tube_volume", 1:tubes_per_collection)
)

# create empty dataframe and set column names
simulated_data <- data.frame(
  matrix(ncol = length(record_columns), nrow = 0)
) %>%
  mutate(across(everything(), as.character))
colnames(simulated_data) <- record_columns

# create base entries for each event
# NOTE: this can be slow; growing a data frame row-by-row in a loop is not idiomatic R
for (record_id in 1:record_count_to_create) {
  for (event_id in 1:collection_events) {
    simulated_data <- simulated_data %>%
      add_row(
        record_id = as.character(record_id),
        redcap_event_name = paste0("event_", event_id, "_arm_1"),
        tmp_event_id = as.character(event_id), # used to generate tube IDs later
        sample_collected_date = sample(
          seq(ymd("2020-03-01"), ymd("2022-06-01"), by = "day"), size = 1, replace = TRUE
        ) %>% as.character()
      )
  }
}

# group by record and event so each random draw below happens once per collection event
simulated_data <- simulated_data %>%
  group_by(record_id, redcap_event_name)

# simulate individual samples
for (tube in 1:tubes_per_collection) {
  simulated_data <- simulated_data %>%
    mutate(
      "tube_id{tube}" := paste0(
        record_id, "-",
        str_pad(tmp_event_id, width = 2, side = "left", pad = "0"), "-",
        str_pad(tube, width = 2, side = "left", pad = "0")
      ),
      "tube_specimen_type{tube}" := sample(1:4, size = 1),
      "tube_volume{tube}" := sample(2:4, size = 1)
    )
}

# remove temporary column used in simulation
simulated_data <- simulated_data %>%
  ungroup() %>%
  select(-tmp_event_id)

# upload data to REDCap, capturing the result so we can check it
write_result <- REDCapR::redcap_write(
  redcap_uri = source_credentials$redcap_uri,
  token = source_credentials$token,
  ds_to_write = simulated_data
)
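
redcap_write() returns a list describing the outcome. A quick check confirms the upload landed:

# stop if the write failed, then report how many records were written
stopifnot(write_result$success)
write_result$records_affected_count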