Purpose

The get_compile_data function retrieves study data from the DM (Demographics) domain, applies several filtering steps, and compiles the remaining records into a cleaned data frame. First, it removes recovery animals by filtering the DM data using information from the DS (Disposition) domain. Then, if the study involves rats or mice, it removes toxicokinetic (TK) animals by excluding every USUBJID present in the PC (Pharmacokinetic) domain. Together these steps leave only the target population relevant to the study's primary analysis. The function can read data from either an SQLite database or .xpt files.

Function Parameters (Arguments)

| Parameter | Type | Description | Required / Default |
|---|---|---|---|
| studyid | Character | The study ID number, which uniquely identifies the study within the database. | Mandatory |
| path_db | Character | The path to the data source: either an SQLite database file or a directory containing .xpt files. | Mandatory |
| fake_study | Boolean | Indicates whether the study data was generated using the SENDsanitizer package. | Optional (Default: FALSE) |
| use_xpt_file | Boolean | Specifies whether to use the .xpt file format when dealing with data generated by the SENDsanitizer package. | Optional (Default: FALSE) |

Output

Returns a cleaned data.frame with the following columns:

  • STUDYID
  • USUBJID
  • Species
  • SEX
  • ARMCD
  • SETCD

The cleaned data is now ready to be used for further analysis.
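As a quick sanity check before further analysis, the column list above can be verified programmatically. The sketch below uses base R only; `check_compile_data` is a hypothetical helper introduced here for illustration and is not part of the package, and the toy data frame stands in for a real return value.

```r
# Columns the documentation lists for the returned data frame.
expected_cols <- c("STUDYID", "USUBJID", "Species", "SEX", "ARMCD", "SETCD")

# Hypothetical helper: stop with a clear message if any expected column is missing.
check_compile_data <- function(df) {
  missing_cols <- setdiff(expected_cols, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing expected columns: ", paste(missing_cols, collapse = ", "))
  }
  invisible(df)
}

# Toy stand-in for the output of get_compile_data(); all values are invented.
toy_result <- data.frame(
  STUDYID = "1234123",
  USUBJID = c("1234123-1", "1234123-2"),
  Species = "RAT",
  SEX     = c("M", "F"),
  ARMCD   = c("1", "4"),
  SETCD   = c("1", "4"),
  stringsAsFactors = FALSE
)

check_compile_data(toy_result)
```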

Implementation Details

The get_compile_data function performs the following steps to build the compile_data data frame for a given study:

Database Connection

The function connects to the SQLite database, or reads the .xpt files, at the location given by path_db.

Data Fetching

The function retrieves data from the following SEND domains based on the input parameters:

  • DM (Demographics): Provides animal-level information.
  • DS (Disposition): Identifies recovery animals using the DSDECOD column.
  • PC (Pharmacokinetics): Identifies TK animals by USUBJID so they can be excluded in rat and mouse studies.
  • TX (Trial Sets): Determines dose levels such as “vehicle” or “HD”.
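The connection and fetching steps can be sketched as follows. This is a minimal illustration, not the package's internal code: `read_domain` is a hypothetical helper, and it assumes one table (or one lower-case .xpt file such as dm.xpt) per domain, each carrying a STUDYID column. The toy data is written to temporary files so the example is self-contained.

```r
library(DBI)

# Hypothetical helper (not part of the package): read one SEND domain for one
# study from either an SQLite database or a directory of .xpt files.
read_domain <- function(path_db, studyid, domain, use_xpt_file = FALSE) {
  domain <- match.arg(toupper(domain), c("DM", "DS", "PC", "TX"))
  if (use_xpt_file) {
    # Assumption: one file per domain, named in lower case (dm.xpt, ds.xpt, ...).
    xpt_path <- file.path(path_db, paste0(tolower(domain), ".xpt"))
    df <- as.data.frame(haven::read_xpt(xpt_path))
    return(df[df$STUDYID == studyid, , drop = FALSE])
  }
  con <- dbConnect(RSQLite::SQLite(), path_db)
  on.exit(dbDisconnect(con), add = TRUE)
  # Assumption: one table per domain; the study ID is bound as a parameter.
  sql <- paste0(
    "SELECT * FROM ", as.character(dbQuoteIdentifier(con, domain)),
    " WHERE STUDYID = ?"
  )
  dbGetQuery(con, sql, params = list(studyid))
}

# Toy DM data; all values are invented for illustration.
toy_dm <- data.frame(
  STUDYID = c("1234123", "1234123", "9999999"),
  USUBJID = c("1234123-1", "1234123-2", "9999999-1"),
  SEX     = c("M", "F", "M"),
  stringsAsFactors = FALSE
)

# Write the toy data to a temporary SQLite file and a temporary .xpt directory.
db_file <- tempfile(fileext = ".db")
con <- dbConnect(RSQLite::SQLite(), db_file)
dbWriteTable(con, "DM", toy_dm)
dbDisconnect(con)

xpt_dir <- tempfile("xpt_")
dir.create(xpt_dir)
haven::write_xpt(toy_dm, file.path(xpt_dir, "dm.xpt"))

dm_sqlite <- read_domain(db_file, "1234123", "DM")
dm_xpt    <- read_domain(xpt_dir, "1234123", "DM", use_xpt_file = TRUE)
```

Both calls should return only the rows belonging to the requested study, regardless of which storage format is used.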

Filtering Steps

  • Filtering Recovery Animals
    Recovery animals are excluded by filtering the DM data based on DSDECOD values in the DS domain.

  • Filtering Toxicokinetic (TK) Animals
    For studies involving rats or mice, the function removes animals whose USUBJID appears in the PC domain.

  • Dose Selection
    The function identifies and retains animals assigned to either the “vehicle” group or the “high-dose” (HD) group by applying dose-ranking logic from the TX domain, where “Control” groups are reclassified as “vehicle.”
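The three filtering steps above can be sketched in base R on toy data. This is a simplified illustration under stated assumptions, not the package's actual logic: the DSDECOD value "RECOVERY SACRIFICE", the `species` variable, and the `set_dose` lookup table (a stand-in for dose ranking derived from TX) are all assumptions made for the example.

```r
# Toy inputs; all values are invented for illustration.
dm <- data.frame(
  STUDYID = "1234123",
  USUBJID = paste0("1234123-", 1:8),
  SEX     = rep(c("M", "F"), 4),
  SETCD   = c("1", "1", "2", "2", "3", "3", "3", "1"),
  stringsAsFactors = FALSE
)
ds <- data.frame(
  USUBJID = dm$USUBJID,
  DSDECOD = c(rep("TERMINAL SACRIFICE", 6), "RECOVERY SACRIFICE",
              "TERMINAL SACRIFICE"),
  stringsAsFactors = FALSE
)
pc <- data.frame(USUBJID = "1234123-8", stringsAsFactors = FALSE)
species <- "RAT"  # assumption: species is already known for the study

# Step 1: drop recovery animals, identified via DSDECOD in DS.
# Assumption: recovery animals carry the value "RECOVERY SACRIFICE".
recovery_ids <- ds$USUBJID[ds$DSDECOD == "RECOVERY SACRIFICE"]
kept <- dm[!dm$USUBJID %in% recovery_ids, ]

# Step 2: for rats and mice only, drop TK animals (any USUBJID present in PC).
if (toupper(species) %in% c("RAT", "MOUSE")) {
  kept <- kept[!kept$USUBJID %in% unique(pc$USUBJID), ]
}

# Step 3: keep only the vehicle and high-dose groups.
# Assumption: a numeric dose per set is already available; the lowest dose is
# treated as "vehicle" and the highest as "HD".
set_dose <- data.frame(
  SETCD = c("1", "2", "3"),
  dose  = c(0, 10, 100),
  stringsAsFactors = FALSE
)
kept <- merge(kept, set_dose, by = "SETCD")
kept$dose_level <- ifelse(
  kept$dose == min(set_dose$dose), "vehicle",
  ifelse(kept$dose == max(set_dose$dose), "HD", NA_character_)
)
kept <- kept[!is.na(kept$dose_level), ]
```

In this toy study, animal 7 is removed as a recovery animal, animal 8 as a TK animal, and animals 3 and 4 because they belong to the mid-dose set, leaving animals 1, 2, 5, and 6.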

Example Usage

# Example usage with SQLite database
df <- get_compile_data(
  studyid = "1234123",
  path_db = "path/to/database.db"
)

# Example usage with .xpt files
df <- get_compile_data(
  studyid = "1234123",
  path_db = "path/to/files",
  fake_study = TRUE,
  use_xpt_file = TRUE
)

Required Libraries

This function requires the following R packages:

  • DBI
  • RSQLite
  • data.table
  • dplyr
  • haven
  • tidyr
  • stringr
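One way to confirm that these packages are available before calling the function is sketched below. It only uses base R (`requireNamespace` and `vapply`); the variable names are illustrative.

```r
# Packages listed above as requirements.
required_pkgs <- c("DBI", "RSQLite", "data.table", "dplyr",
                   "haven", "tidyr", "stringr")

# TRUE for each package that can be loaded, FALSE otherwise.
is_installed <- vapply(required_pkgs, requireNamespace,
                       logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "),
          "\nInstall them with install.packages().")
} else {
  message("All required packages are installed.")
}
```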

Notes

  • The function assumes standard SEND domains and column names.
  • For non-standard data, adjustments may be needed.
  • Check your database or .xpt files to ensure compatibility with the function.