Why ethical A.I. is nonsense

Robo Saint and Devil

It is well known that systems built with A.I. technology often show unethical behavior. They can be racist, sexist, or otherwise biased. The problem is not the algorithms: they are just applied mathematics and therefore quite incorruptible. But the models built with these algorithms are trained on data created by humans, and this is where the bias comes from. Humans often behave unethically, and it is therefore not surprising that A.I. shows corresponding behavior.

Read More

Is Hardware the Key to Advancing Natural Language Processing?

MIT SpAtten

Researchers at MIT have created an algorithm-based architecture called SpAtten that reduces attention computation and memory access in natural language processing (NLP) systems. If we think it’s hard to learn a new language, imagine the challenges hardware and software engineers face when using CPUs and GPUs to process extensive language data. Natural language processing (NLP) attempts to bridge this gap between language and computing.

Read More

Artificial intelligence in Journalism

AI Journalism

Artificial intelligence in journalism is now a reality. Artificial intelligence has entered almost all aspects of our lives, including journalism. Due to the advent of digital media, we unknowingly consume content based on artificial intelligence everywhere. Whether it’s YouTube’s recommended videos, your Facebook feed, or the kinds of advertisements you see on regular websites, they are all specially catered to you with the use of AI.

Read More

GPT-2 vs GPT-3: The OpenAI Showdown

Woman at a blackboard

The Generative Pre-Trained Transformer (GPT) is an innovation in the Natural Language Processing (NLP) space developed by OpenAI. These models are regarded as the most advanced of their kind and can even be dangerous in the wrong hands. GPT is an unsupervised generative model: it takes an input such as a sentence and tries to generate an appropriate response, and the data used for its training is not labelled.

Read More

The product-market fit of AI technologies

AI use cases Retail

A strategy is a roadmap that steers an organisation toward its objectives. From marketing strategies to product roadmap strategies, success rests on how well such plans are executed. Along the way, much has to be negotiated and metrics traded off against one another, yielding either a tactical remedy or a strategic panacea. While it is the team that does the work, the vision of the leadership ought to encompass and adapt to the various weathers of corporate ups and downs. This whole process…

Read More

Guide To TAPAS (TAble PArSing) – A technique to retrieve information from Tabular Data using NLP

One of the most common forms of data that exists today is tabular (structured) data. To extract information from tabular data, you typically use Python libraries like Pandas or SQL-like languages. Google has recently open-sourced one of its models, called ‘TAPAS’ (for TAble PArSing), which lets you ask questions about your data in natural language.

Read More

Everything you need to know about Google BERT

If you’ve been following developments in deep learning and natural language processing (NLP) over the past few years then you’ve probably heard of something called BERT; and if you haven’t, just know that techniques owing something to BERT will likely play an increasing part in all our digital lives. BERT is a state-of-the-art embedding model published by Google, and it represents a breakthrough in the field of NLP by providing excellent results on many NLP tasks, including question answering, text generation, sentence classification, and more. 

Read More

Getting started with 5 essential Natural Language Processing libraries

This article is an overview of how to get started with 5 popular Python NLP libraries, from those for linguistic data visualization, to data preprocessing, to multi-task functionality, to state of the art language modeling, and beyond.

Read More

Vision Transformers: Natural Language Processing (NLP) increases efficiency and model generality

Why do we hear so little about transformer models applied to computer vision tasks? What about attention in computer vision networks? Transformers are for natural language processing (NLP), right?

Read More

Databricks raises $1 billion funding round at $28 billion valuation

Databricks logo stacked

Databricks today announced the close of a $1 billion funding round, bringing the company’s post-money valuation to $28 billion, a company spokesperson told VentureBeat. News of the funding round — the largest to date for Databricks — was first reported in late January by Newcomer.

Read More

No. You still cannot have a Real Conversation with a Chatbot

Person and a Chatbot chatting

Sure, we can ask Siri or Alexa to answer a question or perform an action for us. But Siri and Alexa can only respond to pre-programmed questions and commands. They do not really understand what you are saying, and you cannot have a real conversation with a personal assistant like you can with another person.

Read More

Graphcore IPU gets a public Cloud Boost

Graphcore cloud services

Graphcore, the U.K. AI chip developer, is expanding its roster of cloud partners to include Cirrascale Cloud Partners, a deep learning infrastructure specialist. The result of the collaboration is a scalable AI cloud platform dubbed Graphcloud that provides access to Graphcore’s second-generation intelligent processing unit, or IPU, and accompanying software stack.

Read More

Six times bigger than GPT-3: Inside Google’s TRILLION parameter switch transformer model

OpenAI’s GPT-3 is arguably the most famous deep learning model created in the last few years. One of the things that impresses most about GPT-3 is its size. In some sense, GPT-3 is nothing but GPT-2 with a lot more parameters. With 175 billion parameters, GPT-3 was about four times bigger than its largest predecessor.

Read More

How AI transforms Copywriting

What if marketers could leverage artificial intelligence for copywriting to deliver content that resonates with specific audiences? What if, instead of relying on gut instinct alone, creative teams could be mathematically certain about the words and phrases to use in marketing campaigns? It is now possible to apply science to the art of copywriting, and many brands have already started bringing together man and machine to produce compelling copy and achieve better results.

Read More

Google trained a trillion-parameter AI language model

Parameters are the key to machine learning algorithms. They’re the part of the model that’s learned from historical training data. Generally speaking, in the language domain, the correlation between the number of parameters and sophistication has held up remarkably well. For example, OpenAI’s GPT-3 — one of the largest language models ever trained, at 175 billion parameters — can make primitive analogies, generate recipes, and even complete basic code.

Read More

AI models from Microsoft and Google already surpass human performance on the SuperGLUE language benchmark

In late 2019, researchers affiliated with Facebook, New York University (NYU), the University of Washington, and DeepMind proposed SuperGLUE, a new benchmark for AI designed to summarize research progress on a diverse set of language tasks. Building on the GLUE benchmark, which had been introduced one year prior, SuperGLUE includes a set of more difficult language understanding challenges, improved resources, and a publicly available leaderboard.

Read More

OpenAI debuts DALL-E for generating images from text

OpenAI today debuted two multimodal AI systems that combine computer vision and NLP: DALL-E, a system that generates images from text, and CLIP, a network trained on 400 million pairs of images and text. The photo above was generated by DALL-E from the text prompt “an illustration of a baby daikon radish in a tutu walking a dog.” DALL-E uses a 12-billion parameter version of GPT-3, and like GPT-3 is a Transformer language model. The name is meant to evoke the artist Salvador Dali and the robot WALL-E.

Read More

Clinical Natural Language Processing

Transfer Learning and Weak Supervision

Every day across the country, doctors are seeing patients and carefully documenting their conditions, social determinants of health, medical histories and more into electronic health records (EHRs). These documentation-heavy workflows produce rich data stores with the potential to radically improve patient care. The bulk of this data is not in discrete fields, but rather free text clinical notes. Traditional healthcare analytics depends predominantly on discrete data fields and occasionally regular expressions for free text data, missing a wealth of clinical data.

Photo by Hush Naidoo on Unsplash
Syndromic information about COVID-19 (i.e., fever, cough, shortness of breath) was valuable early in the pandemic to track spread before widespread testing was established. It continues to be valuable to better understand the progression of the disease and identify patients likely to experience worse outcomes. Syndromic data is not captured robustly in discrete data fields. Clinical progress notes, especially in the outpatient setting, provide early evidence of COVID-19 infections, enabling forecasting of upcoming hospital surges. In this article, we’ll examine how NLP enables these insights through transfer learning and weak supervision.

Photo by Martin Sanchez on Unsplash
Natural language processing (NLP) can extract coded data from clinical text, making previously “dark” data available for analytics and modelling. With recent algorithm improvements and simplified tooling, NLP is more powerful and accessible than ever before; however, it is not without logistical hurdles. Useful NLP engines require a great deal of labelled data to “learn” a data domain well. The specialized nature of clinical text precludes crowd-sourced labelling: it requires expertise, and the clinicians with that expertise are in high demand for much more pressing affairs, especially during a pandemic.
So how can health systems make use of their troves of free text data while respecting clinician time? A very practical approach is with transfer learning and weak supervision.
Modern NLP models no longer need to be trained from scratch. Many state-of-the-art language models are already pretrained on clinical text datasets. For COVID-19 syndromic data, we started with Bio_Discharge_Summary_BERT, available through the PyTorch-based Hugging Face transformers library. As described in the ClinicalBERT paper, the model is trained on the MIMIC-III dataset of discharge summaries. We used the transformer word embeddings from Bio_Discharge_Summary_BERT as a transfer learning base and fine-tuned a sequence tagging layer to classify entities with our specific symptom labels. For example, we were interested in “Shortness of Breath”; clinically, there are a lot of symptoms that can be classified under this umbrella (e.g., “dyspnea”, “winded”, “tachypneic”). Our classification problem was limited to approximately 20 symptom labels, yielding higher performance than a generalized clinical NER problem.
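The umbrella-label idea can be sketched in a few lines. The mapping and names below are purely illustrative, not the project’s actual label set:

```python
# Hypothetical mapping from clinical surface forms to umbrella symptom labels.
# Terms and label names are illustrative, not the project's actual label set.
SYMPTOM_LABELS = {
    "shortness of breath": "SOB",
    "dyspnea": "SOB",
    "winded": "SOB",
    "tachypneic": "SOB",
    "fever": "FEVER",
    "febrile": "FEVER",
    "cough": "COUGH",
}

def umbrella_label(term: str) -> str:
    """Map a surface form to its umbrella symptom label, or 'O' if unknown."""
    return SYMPTOM_LABELS.get(term.lower(), "O")

print(umbrella_label("Dyspnea"))  # -> SOB
```

Collapsing many surface forms into one class is what keeps the problem to roughly 20 labels instead of open-ended clinical NER.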
To train this sequence tagging layer, however, we came back to the data problem. Both MIMIC III and our internal clinical text datasets were unlabeled. The few publicly available, labelled clinical text datasets (e.g., N2C2 2010) were labeled with a different use case in mind. How could we get enough data labeled for our targeted use case that is sampled responsibly to prevent bias in the model?
Our strategy had three steps: selective sampling for annotation, weak supervision, and responsible AI fairness techniques.
We used selective sampling to leverage our clinicians’ time more efficiently. For COVID-19 symptoms, that meant only serving up notes to annotators that were likely to have symptom information in them. A prenatal appointment note or a behavioral health note is very unlikely to discuss fever, cough, runny nose, or shortness of breath. Strategically limiting the note pool we sent to annotators increased the labels per annotation hour spent by our clinicians. For annotation we provided our clinicians with a tool called Prodigy. The user interface was easy for them to use, and it is flexible enough to support different annotation strategies.
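A minimal sketch of such a selective-sampling screen — the cue list and function name here are hypothetical, not the actual filtering logic used:

```python
# Hypothetical keyword screen: only notes likely to mention symptom
# information are queued for clinician annotation.
SYMPTOM_CUES = {"fever", "cough", "congestion", "chills", "breath", "breathless"}

def worth_annotating(note_text: str) -> bool:
    """Return True if the note mentions any symptom cue word."""
    tokens = {t.strip(".,;:").lower() for t in note_text.split()}
    return bool(tokens & SYMPTOM_CUES)

notes = [
    "Patient reports fever and dry cough for three days.",
    "Routine prenatal visit; fetal heartbeat normal.",
]
queue = [n for n in notes if worth_annotating(n)]  # only the first note survives
```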

One of the main decision points when setting up an annotation strategy is determining what granularity you want your annotators to label at. Choosing too high a granularity, like “symptom”, would not give us the data we need for our use case, but getting too specific, like “unproductive cough” versus “productive cough”, would be a heavy burden for annotators with no additional benefit for us. For any annotation strategy, it is important to balance the burden on annotators against the reusability of the labelled dataset. The less we have to go back to the well the better, but if it takes a clinician two hours to annotate a single clinical note, we have not succeeded either. For our project, the first pass of annotation was for NER only. We did a later pass for the sentiment of each entity (i.e., Present, Absent, Hypothetical). Prodigy allows for targeted strategies using custom recipe scripts.
After gathering the Prodigy annotations from our clinicians, we created rules-based labelling patterns to use in SpaCy for weak supervision. Prodigy and SpaCy are made by the same development group, making integration straightforward. Weak supervision is another annotation strategy; however, instead of “gold standard” annotation from clinical subject matter experts, it uses an algorithm to annotate a much larger volume of text. Ideally, the decreased accuracy from using an algorithm is offset by the large number of documents that can be processed. Using an algorithm based on the labelling patterns below, we were able to generate a very large training dataset.

{"label":"SOB","pattern":[{"LOWER":{"IN":["short","shortness"]}},{"LOWER":"of","OP":"?"},{"LOWER":"breath"}]}
{"label":"SOB","pattern":[{"LOWER":"tachypnea"}]}
{"label":"SOB","pattern":[{"LOWER":"doe"}]}
{"label":"SOB","pattern":[{"LOWER":"winded"}]}
{"label":"SOB","pattern":[{"LOWER":"breathless"}]}
{"label":"SOB","pattern":[{"LOWER":"desaturations"}]}
{"label":"SOB","pattern":[{"LOWER":"gasping"}]}
{"label":"SOB","pattern":[{"LOWER":"cannot"},{"LOWER":"get"},{"LOWER":"enough"},{"LOWER":"air"}]}
{"label":"SOB","pattern":[{"LOWER":"out"},{"LOWER":"of"},{"LOWER":"breath"}]}

Since our selective sampling biased what notes we surfaced to our annotators, we needed to safeguard against bias in our weakly supervised dataset that would ultimately train the model. Machine learning in the clinical domain requires a higher degree of diligence to prevent bias in models. Responsible AI techniques are becoming mandatory in all industries, but as equality and justice are fundamental tenets of biomedical ethics, we took care to develop an unbiased note sampling approach for weak supervision. For each dataset, clinical notes were sampled in equal numbers across race and ethnicity, geographic location, gender, and age. The labelling patterns were then applied to the notes through SpaCy. The result was an annotated dataset in IOB format for 100,000 clinical notes.
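Equal-allocation sampling over one demographic attribute can be sketched as follows. Field names are hypothetical, and the actual pipeline stratified jointly across race and ethnicity, geographic location, gender, and age:

```python
import random
from collections import defaultdict

def stratified_sample(notes, attribute, n_per_group, seed=0):
    """Draw an equal number of notes from each value of a demographic attribute."""
    groups = defaultdict(list)
    for note in notes:
        groups[note[attribute]].append(note)
    rng = random.Random(seed)
    return [n for group in groups.values() for n in rng.sample(group, n_per_group)]

# Toy data: an imbalanced pool becomes a balanced sample (5 per group).
notes = [{"id": i, "gender": "F" if i % 3 else "M"} for i in range(30)]
sample = stratified_sample(notes, "gender", n_per_group=5)
```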

import json

import pandas as pd
import spacy
from pyspark.sql import functions as F
from pyspark.sql.types import ArrayType, StringType
from spacy.pipeline import EntityRuler

# patterns_file and spacy_model are configured elsewhere in the notebook.

def pandas_parse(x):
    # Load the labelling patterns (a single JSON file or one JSON object per line).
    with open(patterns_file) as f:
        patterns = json.load(f) if patterns_file.lower().endswith("json") else [json.loads(s) for s in f]
    for p in patterns:
        p["id"] = json.dumps(p)
    # Build a SpaCy pipeline whose only entity source is the rule-based matcher.
    spacy.util.set_data_path("/dbfs/FileStore/spacy/data")
    nlp = spacy.load(spacy_model, disable=["ner"])
    ruler = EntityRuler(nlp, patterns=patterns)
    nlp.add_pipe(ruler)
    return x.apply(lambda i: parse_text(i, nlp))

parse_pandas_udf = F.pandas_udf(pandas_parse, ArrayType(ArrayType(StringType())), F.PandasUDFType.SCALAR)

# IOB output
def parse_text(text, nlp):
    doc = nlp(text)
    tokens = []
    iob_tags = []
    for sent in doc.sents:
        # Skip very long sentences and sentences with no matched entities.
        if len(sent) < 210 and len(sent.ents) > 0:
            tokens += [e.text for e in sent]
            iob_tags += [f"{e.ent_iob_}-{e.ent_type_}" if e.ent_type_ else "O" for e in sent]
    return pd.DataFrame({"text": tokens, "iob_tags": iob_tags}).values.tolist()

At this point we were ready to train our sequence tagging layer. We used a framework called Flair to create a corpus from our IOB-labelled dataset. The corpus was then split into train, dev, and test sets, and Flair took it from there. The results were very promising.

F1-score (micro): 0.9964
F1-score (macro): 0.9783

By class:
ABDOMINAL_PAIN  tp: 977  - fp: 6  - fn: 5  - precision: 0.9939 - recall: 0.9949 - f1-score: 0.9944
ANXIETY         tp: 1194 - fp: 8  - fn: 8  - precision: 0.9933 - recall: 0.9933 - f1-score: 0.9933
CHILLS          tp: 343  - fp: 1  - fn: 0  - precision: 0.9971 - recall: 1.0000 - f1-score: 0.9985
CONGESTION      tp: 1915 - fp: 21 - fn: 6  - precision: 0.9892 - recall: 0.9969 - f1-score: 0.9930
COUGH           tp: 3293 - fp: 6  - fn: 6  - precision: 0.9982 - recall: 0.9982 - f1-score: 0.9982
COVID_EXPOSURE  tp: 16   - fp: 1  - fn: 1  - precision: 0.9412 - recall: 0.9412 - f1-score: 0.9412
DIARRHEA        tp: 1493 - fp: 6  - fn: 0  - precision: 0.9960 - recall: 1.0000 - f1-score: 0.9980
FATIGUE         tp: 762  - fp: 2  - fn: 7  - precision: 0.9974 - recall: 0.9909 - f1-score: 0.9941
FEVER           tp: 3859 - fp: 7  - fn: 2  - precision: 0.9982 - recall: 0.9995 - f1-score: 0.9988
HEADACHE        tp: 1230 - fp: 4  - fn: 5  - precision: 0.9968 - recall: 0.9960 - f1-score: 0.9964
MYALGIA         tp: 478  - fp: 3  - fn: 1  - precision: 0.9938 - recall: 0.9979 - f1-score: 0.9958
NAUSEA_VOMIT    tp: 1925 - fp: 7  - fn: 12 - precision: 0.9964 - recall: 0.9938 - f1-score: 0.9951
SOB             tp: 1959 - fp: 10 - fn: 10 - precision: 0.9949 - recall: 0.9949 - f1-score: 0.9949
SWEATS          tp: 271  - fp: 0  - fn: 1  - precision: 1.0000 - recall: 0.9963 - f1-score: 0.9982
TASTE_SMELL     tp: 8    - fp: 0  - fn: 6  - precision: 1.0000 - recall: 0.5714 - f1-score: 0.7273
THROAT          tp: 1030 - fp: 11 - fn: 2  - precision: 0.9894 - recall: 0.9981 - f1-score: 0.9937
WHEEZING        tp: 3137 - fp: 6  - fn: 0  - precision: 0.9981 - recall: 1.0000 - f1-score: 0.9990
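The gap between the micro and macro scores comes from rare classes: micro-F1 pools true/false positive counts across all classes, while macro-F1 averages per-class F1, so a small class like TASTE_SMELL pulls the macro average down. A quick sanity check against the reported counts:

```python
def f1(tp, fp, fn):
    """F1 from raw counts; returns 0.0 for empty classes."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# TASTE_SMELL counts reported above: tp=8, fp=0, fn=6
print(round(f1(8, 0, 6), 4))     # -> 0.7273
# FEVER counts reported above: tp=3859, fp=7, fn=2
print(round(f1(3859, 7, 2), 4))  # -> 0.9988
```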

Given that we trained a transformer language model on a weakly-supervised, rules-based dataset, one might reasonably ask, “why not just use the rules-based approach in production?” However, given that transformer language models (like BERT) use sub-word tokens and context-specific vectors, our trained model can identify symptoms not specified in the rules-based patterns file and can also correctly identify misspelled versions of our entities of interest (e.g., it correctly identifies “cuogh” as a [COUGH]).
With the rich data available in clinical free text notes and the logistical challenges of clinical note annotation, a very practical approach to healthcare NLP is with transfer learning and weak supervision.

Read More