Year Two - Paul
Paul's IRB proposal has been approved, and the next step is for him to acquire the data. With the aid of an informatician, Paul will obtain the data of interest from the electronic medical record (EMR). This can be a lengthy process with many pitfalls. It is important for Paul to understand the issues involved in acquiring data from EMRs, which include but are not limited to:
- Multiple fields that convey similar information, some of which may disagree.
Paul must decide what to do in this situation. Is one of the fields known to be more accurate than the others? If the fields disagree, is there another source from which the true value can be determined?
- Free text fields.
Free text cannot be entered directly into a statistical analysis, so Paul needs to decide whether these fields contain any relevant information and, if so, how that information can be condensed into an analyzable form.
- Multiple entries for the same response.
EMRs are filled in by people, and different people often use different abbreviations. For example, one person may enter y for yes, another Y, and yet another yes. It is Paul's responsibility to "clean" the data prior to the statistical analysis so that inputs that mean the same thing are in fact identical.
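The cleaning step for inconsistent yes/no entries could be sketched as follows. This is only an illustration in Python, not a description of what Paul actually used: the mapping `YES_NO_MAP` and the function `clean_column` are hypothetical names, and the list of variants would have to come from inspecting the real data. Entries the mapping does not recognize are set aside for manual review rather than guessed at.

```python
# Hypothetical mapping of raw entries to one canonical value.
# The real list of variants must be built by inspecting the data.
YES_NO_MAP = {"y": "yes", "yes": "yes", "n": "no", "no": "no"}


def clean_column(values):
    """Map raw yes/no entries to 'yes' or 'no'.

    Blank and unrecognized entries become None. Unrecognized raw
    values are also returned separately so they can be reviewed by hand.
    """
    cleaned = []
    unrecognized = set()
    for value in values:
        key = (value or "").strip().lower()
        if key == "":
            cleaned.append(None)          # blank entry: treat as missing
        elif key in YES_NO_MAP:
            cleaned.append(YES_NO_MAP[key])
        else:
            cleaned.append(None)          # unknown entry: do not guess
            unrecognized.add(value)
    return cleaned, unrecognized


raw_entries = ["y", "Y", "yes", " Yes ", "N", "no", "", "maybe"]
cleaned, unrecognized = clean_column(raw_entries)
print(cleaned)
print(unrecognized)
```

Keeping the unrecognized values in a separate set lets Paul decide, with the informatician or a clinician, how each one should be coded before the mapping is extended.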
Now that Paul has dealt with the above issues, he needs to get his data set into an analyzable form. To do this, he brings his data into an Excel spreadsheet in which each row corresponds to one subject and each column corresponds to one variable. Along with this spreadsheet, Paul prepares a codebook that describes what the variable in each column stands for, how it is measured, and the possible values it can take. He also notes how missing values are coded in the data set. Lastly, Paul removes any patient identifiers before sending the data to the statistician.
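A data set laid out this way, with its codebook, might look like the sketch below. Everything in it is made up for illustration: the column names, the identifier columns `name` and `mrn`, the missing-value code `NA`, and the two example rows are all assumptions. Dropping columns is only the simplest part of removing identifiers; which fields count as identifiers is a decision for Paul and his IRB, not something this sketch settles.

```python
import csv
import io

# Hypothetical identifier columns to drop before sharing the data.
IDENTIFIER_COLUMNS = {"name", "mrn"}

# Hypothetical code used to mark missing values; it is stated in the codebook.
MISSING_CODE = "NA"

# Codebook: one entry per column that is sent to the statistician.
CODEBOOK = [
    {"variable": "subject_id", "description": "Study-assigned subject number",
     "measurement": "assigned at extraction", "values": "positive integer"},
    {"variable": "age_years", "description": "Age at first visit",
     "measurement": "years, from the EMR", "values": "integer, NA if missing"},
    {"variable": "smoker", "description": "Current smoker",
     "measurement": "recorded at first visit", "values": "yes, no, NA if missing"},
]


def deidentify(rows):
    """Return copies of the rows with the identifier columns removed."""
    return [{k: v for k, v in row.items() if k not in IDENTIFIER_COLUMNS}
            for row in rows]


def to_csv(rows, fieldnames):
    """Write one line per subject, one column per variable, as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Replace missing values (None) with the documented missing code.
        writer.writerow({k: MISSING_CODE if row.get(k) is None else row[k]
                         for k in fieldnames})
    return buffer.getvalue()


# Made-up example rows: one dictionary per subject.
rows = [
    {"subject_id": 1, "name": "Example Patient A", "mrn": "000001",
     "age_years": 54, "smoker": "yes"},
    {"subject_id": 2, "name": "Example Patient B", "mrn": "000002",
     "age_years": None, "smoker": "no"},
]

fieldnames = [entry["variable"] for entry in CODEBOOK]
csv_text = to_csv(deidentify(rows), fieldnames)
print(csv_text)
```

Taking the column list from the codebook means only documented variables reach the statistician, so a column without a codebook entry cannot slip into the output.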
-- ErinEsp - 07 May 2012