Are you indulging in a toxic conversation?


As with every other story nowadays, ours also begins with the COVID-19 quarantine. This project is the result of bored but motivated AI enthusiasts. It may not be as impressive as the roof garden that Andrew built, or the new customization on the Indian Scout Bobber by Kate, but it's a start. Before we waste any more of your time, let us start with what exactly this project is, and why, if at all, you should give a frick about it.

This project is aimed at developing an application that detects toxicity in an online conversation. This project is inspired by the ongoing Kaggle toxicity detection competition, which focuses on using machine learning algorithms to identify toxicity in online conversations.

A toxic comment is defined as a rude, disrespectful, or profane comment that makes other people uncomfortable enough to leave the discussion.


Social platforms like YouTube offer a comment section so that people can have an open discussion about the content. However, what we find in the comment section these days is considered so bad that US officials have passed an act that includes disabling comments on YouTube videos made for kids. The comment section nowadays consists of trash talk, vulgar comments, and abusive, offensive conversations, leading to the spread of hate, emotional distress, and aggressive behavior, especially in the younger generation.

TL;DR: it is a project that trains a Deep Learning model to detect whether a person’s comment is toxic or not. Since none of us are descendants of Socrates (maybe we are, but we don’t care for now) or Social Science majors, we are not going to get into the discussion of what is toxic and what is not. We will use the definition that many online gaming platforms use to ban gamers, justly or unjustly.

A toxic person is simply someone who spreads hate.

If you have any issues with this definition, or if you want a more spirited debate about this, please reach out to one of our project members.


Project APOLLO

By the end of this article, you will be able to

  1. Run a toxicity detection web application.
  2. Learn how to build this application from scratch.

The implementation of this work is public on our GitHub repo. We advise you to clone it and follow along as you read this article.

To start with, we assume that the reader is familiar with the basics of Deep Learning and Python. If not, here are some basics of Deep Learning and Python to start with.


We tackled this problem in a supervised learning setup. The dataset used in this project is from Jigsaw/Conversation AI. It consists of ~2 million public comments from the Civil Comments platform. This dataset served as our training and validation set.

The data consists of the following toxicity attributes:

  • severe_toxicity
  • obscene
  • threat
  • insult
  • identity_attack
  • sexual_explicit

In this work, we do not classify the attributes of the toxic comments. Rather, we tackle the problem in a binary fashion: Toxic or Non-Toxic.
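To make the binary setup concrete, here is a minimal sketch of how a fractional toxicity score could be turned into a Toxic/Non-Toxic label. It assumes each comment carries a score between 0 and 1 (called `target` here) and that anything at or above 0.5 counts as toxic. Both the column name and the threshold are assumptions for illustration, not necessarily what our training code uses.

```python
def to_binary_label(toxicity_score: float, threshold: float = 0.5) -> int:
    """Map a toxicity score in [0, 1] to 1 (toxic) or 0 (non-toxic).

    The 0.5 threshold is an assumed default, not a value taken from the project.
    """
    if not 0.0 <= toxicity_score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {toxicity_score}")
    return int(toxicity_score >= threshold)


# Made-up rows that mimic the assumed layout: comment text plus a score.
rows = [
    {"comment_text": "Thanks for sharing, this was helpful.", "target": 0.0},
    {"comment_text": "You are an idiot.", "target": 0.83},
    {"comment_text": "Not sure I agree with this.", "target": 0.2},
]

labels = [to_binary_label(row["target"]) for row in rows]
print(labels)
```

The attribute columns (severe_toxicity, obscene, and so on) are simply ignored in this setup; only the single binary label is passed to the classifier.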


For our task, we used XLM-RoBERTa (XLM-R), which is a powerful transformer-based unsupervised model. It is a cross-lingual sentence encoder network that obtains state-of-the-art results on many cross-lingual understanding benchmarks.

XLM-R is trained on 2.5 TB of filtered CommonCrawl data in 100 languages. More information can be found here and the official implementation is available here.


We trained our network on a single 12 GB Nvidia GPU. We also used Google Colab to experiment with TPU settings. We trained our network for 2 epochs and achieved a validation accuracy of around 85%.

As the end goal of the project was to create a web application, the size of the trained network weights was a concern: the saved model was quite large. So we compressed it and reduced its size from 3 GB to 1 GB. The final weights are provided here to download.


Now that we have our trained model ready, we will move on to building our final application. In order to test our application, we scraped comments from various social media platforms, but we realized it might be unethical in some sense to release them publicly.

So, we limited the scope of this project to YouTube comments only.

We scrape the user comments from a given YouTube video URL and pass them to our trained model. We then display the predicted toxic/non-toxic metrics on our application homepage.
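The flow above can be sketched roughly as follows. This is an illustration only: `extract_video_id`, `summarize`, and `keyword_stub` are hypothetical names, the scraping step is left out, and the classifier is a trivial keyword stand-in for the trained XLM-R model so that the snippet runs on its own.

```python
from typing import Callable, Dict, Iterable, Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Pull the video ID out of a YouTube watch URL or a youtu.be short link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in {"youtube.com", "www.youtube.com", "m.youtube.com"}:
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
    return None


def summarize(comments: Iterable[str], classify: Callable[[str], bool]) -> Dict[str, float]:
    """Run the classifier over every comment and aggregate the results."""
    flags = [bool(classify(comment)) for comment in comments]
    total = len(flags)
    toxic = sum(flags)
    return {
        "total": total,
        "toxic": toxic,
        "non_toxic": total - toxic,
        "toxic_pct": round(100 * toxic / total, 1) if total else 0.0,
    }


def keyword_stub(comment: str) -> bool:
    """Placeholder for the real model: flags a comment if it contains 'idiot'."""
    return "idiot" in comment.lower()


video_id = extract_video_id("https://www.youtube.com/watch?v=abc123XYZ_-&t=10s")
stats = summarize(
    ["Great video!", "You are an idiot.", "Nice editing.", "Thanks!"],
    keyword_stub,
)
print(video_id, stats)
```

Keeping the classifier as a plain callable means the same summary code works whether it is handed the stub or the actual model's prediction function.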


As our final goal is to host this project as a web application, we used Flask for hosting the app and HTML with CSS for web styling.

Flask is a web framework that provides tools, interfaces, libraries and technologies to build a web application.
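For readers who have not used Flask before, a homepage of this kind could look roughly like the sketch below. It is not the code from our repo: `fetch_comments` and `predict_toxicity` are hypothetical placeholders that return canned data, and the page is an inline template instead of separate HTML and CSS files.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Inline template to keep the sketch in one file; a real app would use
# template files and a stylesheet.
PAGE = """
<form method="post">
  <input name="url" placeholder="YouTube video URL" value="{{ url }}">
  <button type="submit">Analyze</button>
</form>
{% if summary %}
<p>{{ summary.toxic }} of {{ summary.total }} comments flagged as toxic.</p>
{% endif %}
"""


def fetch_comments(url):
    """Hypothetical placeholder for the comment scraper: returns canned comments."""
    return ["Great video!", "Thanks for the upload."]


def predict_toxicity(comments):
    """Hypothetical placeholder for the trained model: marks everything non-toxic."""
    return [False for _ in comments]


@app.route("/", methods=["GET", "POST"])
def home():
    url = request.form.get("url", "")
    summary = None
    if request.method == "POST" and url:
        comments = fetch_comments(url)
        flags = predict_toxicity(comments)
        summary = {"total": len(flags), "toxic": sum(flags)}
    return render_template_string(PAGE, url=url, summary=summary)
```

Saved as `app.py`, this can be started with `flask --app app run` and opened in a browser; swapping the two placeholder functions for the real scraper and model is what turns it into an actual toxicity checker.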


  1. Clone the GitHub repository and download the trained weights from here.
  2. Install the prerequisites as mentioned in the GitHub README.
  3. Run the application and enter a YouTube URL to get the results.

Please note that, to maintain the anonymity of the users, the displayed names and profile pictures have been randomized and do not correspond to the real commenters.

You can download our code and try it yourselves. So use this time to learn something new and good. Our code is available on GitHub with all the required information.
