Dataset Versioning - Kidney Health Education and Research Group

Hi all,
Over the past few months, the data team has been pooling large amounts of data into the STATA Mergedset to deliver a few enhancements for our researchers:

  1. Detailed Computerized Adaptive Testing information for each domain for all patients
  2. Enrollment statuses & screening information for all patients to allow direct access to these metrics for manuscript reporting

Though this was a great milestone, some of you may have noticed that the STATA dataset had grown to around 300 MB. Our STATA developer Mark has delivered an excellent solution: partitioning the merged dataset while maintaining the core enhancements. We are happy to relay that this versioning has been completed, and the following dataset versions will be generated moving forward:

  1. Mergedset.dta – currently 20 MB
  • This will provide only the enrolled patients
    real_enrollment_status == 3 | real_enrollment_status == 9 (cat tagged)
  2. MergedSet_WithCat.dta – currently 112 MB
  • This will provide only the enrolled patients and include their detailed CAT information for each domain, such as the order of questions administered and associated responses, running theta and standard error values, time to completion, etc.
    real_enrollment_status == 3 | real_enrollment_status == 9 (cat tagged)
  3. MergedSet_NonEnrolled.dta – currently 1 MB
  • This will provide a listing of all screen failures, declines, etc. across all studies.
  • No patient-identifying information is stored here, but you can use this dataset to obtain accurate counts of how many patients were screened out and the reasons why.
  • Example usage:

use MergedSet_NonEnrolled.dta, clear

// subset to your specific sample, then review the screening variables
codebook screening_younger screening_ckd screening_english screening_education ///
    screening_kt screening_consent screening_30days_since_tx screening_demantia ///
    screening_hamper_q screening_gfr
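
On the enrolled-patient filter: the enrolled datasets are limited to statuses 3 and 9 of real_enrollment_status. If you ever need to write that condition yourself in Stata, keep in mind that it has to be an "or" test, since no single row can hold both values at once. The sketch below uses a small made-up table (the patient_id variable and the status values other than 3 and 9 are assumptions for illustration only, not real study data) to show the difference:

```stata
* Toy data only: patient_id and the values 1 and 5 are made up for illustration.
clear
input patient_id real_enrollment_status
1 3
2 9
3 1
4 3
5 5
end

* "&" requires both conditions on the same row, so it matches nothing.
count if real_enrollment_status == 3 & real_enrollment_status == 9
assert r(N) == 0

* "|" keeps rows with either status.
count if real_enrollment_status == 3 | real_enrollment_status == 9
assert r(N) == 3

* inlist() is a shorter way to write the same "or" test.
count if inlist(real_enrollment_status, 3, 9)
assert r(N) == 3
```

The same inlist() expression can follow keep if or any other command that takes an if qualifier.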

As always, please feel free to reach out if you have any questions.

Thanks,
Nathaniel Edwards
