Skip to content

Natural language processing and healthcare: Chasing Mother Nature’s language

“Mother Nature is always speaking. She speaks in a language understood within the peaceful mind of the sincere observer.” Radhanath Swami, American Gaudiya Vaishnava guru and author


The COVID-19 pandemic rages on with an even more transmissible mutated form of SARS-CoV-2. Humans are struggling with execution of the vaccination process, despite the scientific miracle of a vaccine that was well ahead of traditional timelines. For future pandemics, the entire public health and clinical medicine strategy desperately need appropriate AI-inspired automation to expedite the myriad of processes that humans lack sufficient skillset and patience for. Alan Turing remains prescient in his philosophy that machines, not humans, need to neutralize machines (viruses in this pandemic functioning as such).

One very notable advance in AI during this difficult past year has been in the domain of natural language processing (NLP), which is defined as the utilization of computers for processing languages that are human-derived (“natural”) that enables humans to communicate with computers. NLP entails the fields of linguistics, computer science, cognitive science, psychology, philosophy, and logic, and consists of two language tasks: text comprehended and interpreted by the computer called natural language understanding (NLU) and text constructed for communication from the computer termed natural language generation (NLG). The use of NLP in healthcare ranges from extracting information from radiology and genomic reports to diagnosing diseases and matching patients to clinical trials, chatbots and virtual assistants, and even drug discovery or repurposing.

In June 2020, a leading AI research laboratory called OpenAI launched a game changing NLP model called generative pre-trained transformer 3, or GPT-3. This revolutionary NLP model uses deep learning to produce text that is human-like with its myriad of easy-to-use applications via its application programming interfaces (APIs). GPT-3 is a very significant advance over bidirectional encoder representations from transformers, or BERT, and its congener BioBERT (a pre-trained biomedical language model for biomedical text mining).

Several even more advanced and sophisticated multilingual NLP models are on the way (including the upcoming 1.6 trillion-parameter Google NLP model). In addition, more pre-trained NLP models in various NLP model hubs (such as Hugging Face, TensorFlow, PyTorch and others) are available for more ease of use even in healthcare.

Lastly, NLP has been improving with advances in both deep learning and transfer learning. Despite the promising future of NLP, however, accuracy of these NLP models will need to improve as this is a critical requirement for NLP work in the healthcare domain.

The NLP algorithms that are designed for text and sentences can now even be creatively used to interpret sequences and predict mutations in viruses such as the coronavirus in COVID-19. The intriguing concept that was introduced in Science recently is that the genetic fitness of a virus can be compared to a grammatically correct sentence. Perhaps Mother Nature’s genetic code is the most elegant and sophisticated “natural” language after all.

The Healthcare Data Conundrum (Part I)

“It’s the economy, stupid.”  James Carville, Bill Clinton’s strategist for 1992 presidential campaign   The above battle mantra during the presidential campaign of Bill Clinton

Show Buttons
Hide Buttons