As reported by the MIT News Office on July 23, 2008, Roger G. Mark, a professor in the Electrical Engineering and Computer Science Department and the Harvard-MIT Division of Health Sciences and Technology (HST), and Gari D. Clifford, an HST principal research scientist, have developed new software that will enable researchers to access patient medical records for information valuable to future biomedical research and treatments without compromising patient identity.
As required by the U.S. Health Insurance Portability and Accountability Act (HIPAA), patient records that are to be shared within the research community must have any identifying information removed. Manual removal of identifying information is prohibitively expensive, time-consuming and prone to error, constraints that have led to considerable research into automated techniques for 'de-identifying' medical records.
When tested on a database of more than 1,800 nursing notes (296,400 words in total), the free and open-source software package deleted more than 94% of the confidential information while wrongly deleting only 0.2% of the useful content. This result, according to the authors, "is significantly better than one expert working alone, at least as good as two trained medical professionals checking each other's work and many, many times faster than either."
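For readers who want to see how two figures of this kind relate to each other, the following is a minimal sketch of one way to compute them. It assumes each word carries a label marking whether it is confidential and whether the software removed it. The function name `deid_scores` and the toy labels are hypothetical; they are not taken from the MIT package, and the study's exact counting method may differ.

```python
def deid_scores(is_confidential, was_removed):
    """Return (share of confidential words removed,
               share of useful words wrongly removed).

    Both arguments are equal-length lists of booleans, one entry per word.
    This word-level counting scheme is an assumption for illustration.
    """
    if len(is_confidential) != len(was_removed):
        raise ValueError("label lists must have the same length")

    pairs = list(zip(is_confidential, was_removed))
    confidential_total = sum(1 for c, _ in pairs if c)
    useful_total = len(pairs) - confidential_total

    confidential_removed = sum(1 for c, r in pairs if c and r)
    useful_removed = sum(1 for c, r in pairs if not c and r)

    # Guard against division by zero on degenerate inputs.
    removed_share = (confidential_removed / confidential_total
                     if confidential_total else 0.0)
    wrongly_removed_share = (useful_removed / useful_total
                             if useful_total else 0.0)
    return removed_share, wrongly_removed_share


# Toy labels for eight words (invented for this example, not study data).
is_confidential = [True, True, False, False, False, False, True, False]
was_removed = [True, False, False, True, False, False, True, False]

removed_share, wrongly_removed_share = deid_scores(is_confidential, was_removed)
print(f"confidential words removed: {removed_share:.1%}")
print(f"useful words wrongly removed: {wrongly_removed_share:.1%}")
```

In this toy case two of the three confidential words are removed and one of the five useful words is removed by mistake. The reported figures of more than 94% and 0.2% would correspond to the first and second values respectively under this assumed scheme.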
The MIT team is also freely providing the labeled de-identified data together with the software to allow others to improve their systems, and to allow the software to be adapted to other data types that may exhibit different qualities.
This study was published on July 24, 2008, in the open-access journal BMC Medical Informatics and Decision Making.