Many health systems are eager to use natural language processing (NLP) to access the vast patient insights recorded as unstructured text in clinical notes and records. Healthcare data and analytics teams, however, often lack experience with the unique challenges of working with text and, specifically, don’t know how to transform unstructured text into a format usable for NLP. Data engineers can follow four need-to-know principles to overcome the challenges of making unstructured text available for advanced NLP analysis:
- Text is bigger and more complex.
- Text comes from different data sources.
- Text is stored in multiple areas.
- Text user documentation patterns matter.