Research Subject Mapper is a tool for multi-center studies. It prevents exposure of medical record identifiers to the data coordination center while still allowing the data coordination center to specify data collection periods for the research subjects. The tool combines authoritative data from a research subject data store with identifiable data held at a data collection site to generate inputs for queries of an Electronic Health Record (EHR).
Working with the research-subject-mapper executable (egg file)
Requirements
To run the Research Subject Mapper tool on the target machine, install the software below before going any further.
Install Python
Execute on the target machine:
$ sudo apt-get install python2.7
Install Python Packages
$ sudo apt-get install python-setuptools
$ sudo apt-get install python-dev
$ sudo apt-get install libxml2
$ sudo apt-get install libxslt1-dev
Download the *.egg file
The Research Subject Mapper executable can be downloaded from the following address: ctsit.github.io/research-subject-mapper
Installation
To install Research Subject Mapper, run the following command, where X.X.X is the version number of the downloaded egg file:
$ sudo easy_install rsm_X.X.X.egg
Configuration Setup
Look for the config-example-gsm and config-example-gsm-input directories in the rsm installation directory on the target machine. Use them as templates to prepare custom config directories containing your own implementation details.
Files to modify
The files in the config directory that need to be modified with your implementation details are site-catalog.xml and source_data_schema.xml.
site-catalog.xml
This XML file has the following structure:
<sites_list>
    <site>
        <site_name>source</site_name>
        <site_URI>sftp.source_site.edu</site_URI>
        <site_uname>source_user</site_uname>
        <site_password>source_password</site_password>
        <site_remotepath>ftp/smi.xml</site_remotepath>
        <site_contact_email>contact@source_site.edu</site_contact_email>
    </site>
</sites_list>
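Before running the tools, it can help to confirm that an edited site-catalog.xml still parses and contains the expected elements. The following is a minimal standalone sketch, not part of Research Subject Mapper itself; the function name read_sites is hypothetical, and the sample values are the placeholders from the structure above.

```python
import xml.etree.ElementTree as ET

# Sample content mirroring the site-catalog.xml structure (placeholder values).
SITE_CATALOG = """
<sites_list>
    <site>
        <site_name>source</site_name>
        <site_URI>sftp.source_site.edu</site_URI>
        <site_uname>source_user</site_uname>
        <site_password>source_password</site_password>
        <site_remotepath>ftp/smi.xml</site_remotepath>
        <site_contact_email>contact@source_site.edu</site_contact_email>
    </site>
</sites_list>
"""


def read_sites(xml_text):
    """Return one dict per <site> element, keyed by child tag name.

    Hypothetical helper for checking a config file by hand; it is not
    the parser used by the gsm or gsmi tools.
    """
    root = ET.fromstring(xml_text.strip())
    sites = []
    for site in root.findall("site"):
        sites.append({child.tag: (child.text or "").strip() for child in site})
    return sites


sites = read_sites(SITE_CATALOG)
for site in sites:
    # Print only non-sensitive fields; the password is deliberately left out.
    print(site["site_name"], site["site_URI"], site["site_remotepath"])
```

To check a real file, read its contents and pass the text to read_sites; a malformed file raises xml.etree.ElementTree.ParseError.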
source_data_schema.xml
This XML file has the following structure:
<source>
    <redcap_uri>Your_RedCap_Instance_URI</redcap_uri>
    <apitoken>API_TOKEN</apitoken>
    <fields>
        <field>Field_Name</field>
        <field>Field_Name</field>
        <field>Field_Name</field>
    </fields>
</source>
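A similar check can be sketched for source_data_schema.xml. This is an illustrative standalone script under the assumption that the file has exactly the structure shown above; read_source_schema is a hypothetical name, and the values are the placeholders from this document rather than real settings.

```python
import xml.etree.ElementTree as ET

# Sample content mirroring the source_data_schema.xml structure (placeholders).
SOURCE_SCHEMA = """
<source>
    <redcap_uri>Your_RedCap_Instance_URI</redcap_uri>
    <apitoken>API_TOKEN</apitoken>
    <fields>
        <field>Field_Name</field>
        <field>Field_Name</field>
        <field>Field_Name</field>
    </fields>
</source>
"""


def read_source_schema(xml_text):
    """Extract the URI, token, and field names from the schema text.

    Hypothetical helper for inspecting the config; it does not contact
    the REDCap instance or validate the token.
    """
    root = ET.fromstring(xml_text.strip())
    fields = [f.text.strip() for f in root.findall("./fields/field") if f.text]
    return {
        "redcap_uri": (root.findtext("redcap_uri", default="") or "").strip(),
        "apitoken": (root.findtext("apitoken", default="") or "").strip(),
        "fields": fields,
    }


schema = read_source_schema(SOURCE_SCHEMA)
print(schema["redcap_uri"])
print(len(schema["fields"]), "fields configured")
```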
Running the generate_subject_mapper tools
Input Requirements
- Edit the source_data_schema.xml file to specify the "Person Index" fields before running gsm. You will need to specify the "study subject number", a field to verify the study subject (typically year of birth), and the subject's corresponding MRN.
- If this tool is being run at the central site, run the gsmi tool as shown below and place the generated files in the FTP location of the site. Client sites typically only need to run the generate_subject_mapper tool.
To generate input at the central site location
Run the following command:
gsmi -c <FULL_PATH_TO_CONFIG_DIRECTORY> -k <YES_OR_NO_TO_KEEP_GENERATED_FILES>
Notes:
- If a directory named config is already set up in the parent directory of the generate_subject_map_input.py file, it is not necessary to provide the path to the config directory using the -c option.
- If this tool is being run at a client site, it is enough to provide the site's FTP details in site-catalog.xml in order to use the gsm tool.
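The fallback described in the first note can be pictured with a short sketch. This is not the tool's actual code: resolve_config_dir is a hypothetical function, the /opt/rsm paths are made up for illustration, and "parent directory" is read here as the directory one level above the folder containing the script, which is an assumption.

```python
import os


def resolve_config_dir(cli_value, script_path):
    """Pick the config directory: an explicit -c value, else the default.

    Hypothetical illustration of the documented behavior, not code taken
    from Research Subject Mapper.
    """
    if cli_value:
        return os.path.abspath(cli_value)
    # Assumption: "parent directory" means one level above the script's folder.
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(os.path.dirname(script_dir), "config")


# No -c value given: fall back to the config directory next to the script's folder.
print(resolve_config_dir(None, "/opt/rsm/bin/generate_subject_map_input.py"))

# Explicit -c value given: it takes precedence over the default.
print(resolve_config_dir("/etc/rsm/config", "/opt/rsm/bin/generate_subject_map_input.py"))
```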
To run the generate_subject_map.py tool
Run the following command:
gsm -c <FULL_PATH_TO_CONFIG_DIRECTORY> -k <YES_OR_NO_TO_KEEP_GENERATED_FILES>
Note:
- If a directory named config is already set up in the parent directory of the generate_subject_map_input.py file, it is not necessary to provide the path to the config directory using the -c option.
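When the tools are launched from a scheduled job or another script, the command lines above can be assembled programmatically. The sketch below is one possible wrapper, not part of the distribution: build_command and run are hypothetical names, the config path is a made-up example, and passing the literal strings yes and no to -k is an assumption based on the placeholder shown in the commands.

```python
import subprocess


def build_command(tool, keep_files, config_dir=None):
    """Build the argument list for gsm or gsmi using only the -c and -k options.

    Hypothetical helper; the "yes"/"no" spelling for -k is an assumption.
    """
    if tool not in ("gsm", "gsmi"):
        raise ValueError("tool must be 'gsm' or 'gsmi'")
    cmd = [tool]
    if config_dir:
        # -c may be omitted when the default config directory is in place.
        cmd += ["-c", config_dir]
    cmd += ["-k", "yes" if keep_files else "no"]
    return cmd


def run(cmd):
    """Run the command and raise CalledProcessError on a non-zero exit status."""
    subprocess.check_call(cmd)


# Only build and display the command here; call run(cmd) on a machine
# where Research Subject Mapper is installed.
cmd = build_command("gsm", keep_files=True, config_dir="/home/user/rsm-config")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when the config path contains spaces.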