Overview
This is a demonstration of some features of the REDCap Custodian R package.
We will demonstrate how to:
- Fetch and store your API tokens in a database
- Get some data from a REDCap project
- Transform your data
- Write it to another project
- Review the logs of what the job did
Then we’ll talk about how to automate that whole process.
Prerequisites
Follow the steps and run the code provided in vignette("friday-call-demo-setup").
Premise
You have a big, important project with many rows of data and many more coming in every week. A researcher asks you to build a biorepository project for the biospecimens collected under this protocol. We do not want anyone to transcribe the existing data by hand, so we will automatically copy it for them every day. No transcription, no typos, no overtime.
Get some data from a REDCap project
First, load some R packages:
library(redcapcustodian)
library(DBI)
library(tidyverse)
library(dotenv)
library(lubridate)
load_dot_env(here::here("local.env.txt"))
init_etl("friday_call_demo")
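# init_etl() names this job ("friday_call_demo") and, using the settings loaded
# from local.env.txt, sets up the logging we review at the end of this demo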
Next, connect to the credential database and read your credentials for the REDCap project you want to read from.
file_conn <- DBI::dbConnect(RSQLite::SQLite(), here::here("credentials/credentials.db"))
my_username <- "admin"
source_credentials <- tbl(file_conn, "credentials") %>%
filter(username == my_username) %>%
collect() %>%
filter(str_detect(project_display_name, "Demo Main")) %>%
unnest()
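The filters above should leave exactly one credential row. A quick defensive check (our own addition, not part of REDCap Custodian) makes a bad match fail loudly:
if (nrow(source_credentials) != 1) {
stop("Expected one 'Demo Main' credential for ", my_username, ", found ", nrow(source_credentials))
}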
At this point, it’s time to read the portions of the project data that interest you.
For our task, we want to read the identifiers for collected specimens so we can write them into the biospecimen tracking project.
# Constants describing the dimensions of the demo source project
record_count_to_create <- 50
collection_events <- 5
tubes_per_collection <- 30
fields_to_read <- c(
"record_id",
"redcap_event_name",
"sample_collected_date",
paste0("tube_id", 1:tubes_per_collection),
paste0("tube_specimen_type", 1:tubes_per_collection),
paste0("tube_volume", 1:tubes_per_collection)
)
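# For reference, the generated vector begins:
# head(fields_to_read, 5)
# [1] "record_id" "redcap_event_name" "sample_collected_date" "tube_id1" "tube_id2"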
source_project_data <- REDCapR::redcap_read(
redcap_uri = source_credentials$redcap_uri,
token = source_credentials$token,
fields = fields_to_read
)
# Validate that data was retrieved and alert regarding issues
if (!source_project_data$success) {
warning("Data was not successfully read from REDCap")
}
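In a scheduled, unattended run a warning scrolls by unseen, so you might prefer to log the failure and halt instead. A sketch of that variant, reusing the same logging call this demo uses later:
if (!source_project_data$success) {
log_job_failure(jsonlite::toJSON(list(error = "source read failed")))
stop("Data was not successfully read from REDCap")
}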
Our customer’s requirements for the biorepository call for some transformations before writing. That’s easy with the dplyr library.
# Append the event number to the subject_id to make the record_id needed in the biorepository
new_target_project_data <- source_project_data$data %>%
rename(subject_id = record_id) %>%
mutate(record_id = paste0(
subject_id, "-",
redcap_event_name %>% str_replace("event_", "") %>% str_replace("_arm_1", ""))) %>%
select(record_id, everything()) %>%
rename(date_draw = sample_collected_date)
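To see what the event-name cleanup produces, here is the transformation applied to one made-up value:
# Hypothetical example: subject 17 at event_2_arm_1
str_replace("event_2_arm_1", "event_", "") %>% str_replace("_arm_1", "")
# [1] "2"
# so the biorepository record_id for that row becomes "17-2"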
Now write that data to the target project.
target_credentials <- tbl(file_conn, "credentials") %>%
filter(username == my_username) %>%
collect() %>%
filter(str_detect(project_display_name, "Demo Biospecimen")) %>%
unnest()
# Want to know exactly what is getting updated in the target project? Fetch that data, then anti-join it with the new data set
target_fields_to_read <- c(
"record_id",
"redcap_event_name",
"sample_collected_date"
)
old_target_project_data <- REDCapR::redcap_read(
redcap_uri = target_credentials$redcap_uri,
token = target_credentials$token,
fields = target_fields_to_read
)
if (old_target_project_data$success) {
target_project_data <- new_target_project_data %>%
dplyr::anti_join(old_target_project_data$data)
} else {
target_project_data <- new_target_project_data
}
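If anti-joins are unfamiliar: anti_join() keeps only the rows of its first table that have no match in the second. A toy illustration with made-up tibbles:
new <- tibble(record_id = c("1-1", "1-2"), date_draw = c("2022-01-01", "2022-01-08"))
old <- tibble(record_id = "1-1", date_draw = "2022-01-01")
anti_join(new, old)
# returns only the "1-2" row, i.e. the data not already in the target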
# Now write that smaller dataset. slice_head(prop = 0.5) sends only the first
# half of the rows, leaving the rest for a future run of the job
result <- REDCapR::redcap_write(
ds_to_write = slice_head(target_project_data, prop = 0.5),
redcap_uri = target_credentials$redcap_uri,
token = target_credentials$token
)
# now log what we did
if (result$success) {
log_job_success(jsonlite::toJSON(target_project_data))
} else {
log_job_failure(jsonlite::toJSON(result))
}
Review the logs of what the job did
When doing automated jobs, it’s important to have a record of what happened. REDCap Custodian writes logs so you can review its actions later.
# Get the connection to the log database
log_con <- get_package_scope_var("log_con")
# Look at the entire log of jobs run
tbl(log_con, "rcc_job_log") %>%
collect() %>%
view()
# Read the summary of the last 'friday_call_demo' job
tbl(log_con, "rcc_job_log") %>%
filter(script_name == "friday_call_demo") %>%
collect() %>%
arrange(desc(log_date)) %>%
head(n=1) %>%
pull(job_summary_data) %>%
jsonlite::fromJSON() %>%
view()
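As promised in the overview, the whole workflow can run unattended. One common approach, sketched here with a hypothetical path, is to save the steps above as a script and schedule it:
# Save the code above as, e.g., friday_call_demo.R, then schedule it with cron.
# A hypothetical crontab entry to run the job every morning at 06:00:
#   0 6 * * * Rscript /path/to/friday_call_demo.R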