STATA Research Datasets - Structure update - Kidney Health Education and Research Group

Hi all,
We’ve implemented an important change to the data dump folder. It affects how the STATA research datasets are structured and how often they are produced from now on.

The problem
Currently, our automated scripts produce research datasets daily, and we resolve bugs as they arise, either in newly collected patient data or while addressing existing data issues. Because of the lag in correcting existing issues, several of the datasets produced do not reflect the corrections, and we have no good tracking mechanism for identifying which daily dataset (other than the latest version) is the best and has the most up-to-date fixes. This also makes it hard to know which dataset to point researchers towards for statistical use.

The solution
Since there is little value in having 365 research datasets per year, we are moving from daily datasets to monthly datasets, each with a development version and a live version. The data team will continue to resolve bugs on a regular basis, so that there is one ‘golden’ monthly STATA dataset that has been reviewed and is the best available at the time. Throughout the month, the data team will use our development structure, integrated with GitHub for version control, to test modifications and apply updates so that the live version reflects all corrections. With this structure, we only need to track and manage 12 datasets per year instead of 365.

How does this impact you?
In the data dump directory, you’ll see only 2 folders:

Each month, it will now be clear which dataset to use: the one with the most up-to-date patient-collected data and audited results. All previous datasets will remain available in the archive folder (U:\data\dump\archive) if needed.

If there are any questions or concerns, please reach out.

Thanks,
Nathaniel Edwards
