A new patent granted to Microsoft by the United States Patent and Trademark Office reveals details of a computer-implemented method for detecting toxic electronic content. The patent, titled "Toxic Content Detection with Interpretability Feature", was granted to Microsoft on October 1 and credits Xiaoran Zhang, Emilia Stoica, and Clayton Holz as inventors.
The technique uses a probabilistic toxic keyword identifier to determine which keywords are indicative of toxic content.
Here's the background. In organizations, HR personnel frequently conduct surveys to solicit comments from employees and to assess and improve the health of the company. The responses may contain indicators of possible toxicity in the workplace, including comments that are rude, disrespectful, threatening, obscene, or insulting, or that contain identity-based hate. Such organizations would welcome solutions that help them quickly identify these toxic comments among potentially tens or hundreds of thousands of comments submitted by employees. If a survey response indicates toxicity, human resources can dig into the response for further action.
The patent discloses computer-implemented techniques for accurate and interpretable toxic content detection. The techniques use a probabilistic toxic keyword identifier to determine keywords that are indicative of toxic content. In one implementation, the toxic keywords are determined by comparing the term frequencies of keywords in a set of example toxic interpersonal electronic communications against their term frequencies in a set of example non-toxic interpersonal electronic communications. A keyword is deemed indicative of toxic content if its term frequency in the toxic examples is more than a threshold multiple of its term frequency in the non-toxic examples. In this way, a set of multiple keywords indicative of toxic content can be determined. Survey comments containing a keyword determined to be toxic are then flagged as potential toxic content in a user interface for human review.
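To make the frequency comparison concrete, here is a minimal Python sketch of the idea as described above. It is not Microsoft's implementation: the function names, the simple word tokenizer, the default threshold of 3.0, and the choice to treat a word never seen in the non-toxic examples as if it had appeared once (to avoid dividing by zero) are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def count_terms(docs):
    """Return per-term counts and the total token count for a list of texts."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts, sum(counts.values())


def toxic_keywords(toxic_docs, clean_docs, ratio_threshold=3.0):
    """Return terms whose term frequency in the toxic examples is more than
    ratio_threshold times their term frequency in the non-toxic examples."""
    toxic_counts, toxic_total = count_terms(toxic_docs)
    clean_counts, clean_total = count_terms(clean_docs)
    if toxic_total == 0 or clean_total == 0:
        return set()
    keywords = set()
    for term, n in toxic_counts.items():
        toxic_tf = n / toxic_total
        # Assumption: a term unseen in the non-toxic set is counted once there,
        # so rare words are not flagged merely for being absent.
        clean_tf = max(clean_counts[term], 1) / clean_total
        if toxic_tf > ratio_threshold * clean_tf:
            keywords.add(term)
    return keywords


def flag_comments(comments, keywords):
    """Pair each comment containing a toxic keyword with the keywords it matched."""
    flagged = []
    for comment in comments:
        hits = sorted(set(tokenize(comment)) & keywords)
        if hits:
            flagged.append((comment, hits))
    return flagged
```

Returning the matched keywords alongside each flagged comment is one way to show a human reviewer why a comment was flagged; the user interface for that review is left out of this sketch.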
Technology companies file many patents that never see the light of day in consumer-facing products, so any conjecture about how this patented technique will be applied is just that: conjecture. Microsoft could also broaden the use of the technique to analyze toxic behavior in chats and video transcriptions in Microsoft Teams.