
We’re stepping up our fight against bullying


We’re getting better at identifying bullying in your child’s social media activity. Today we are releasing a new generation of bullying classifiers, built on an upgraded system that does a much better job of finding instances of bullying.

Our new system is not simply looking for direct bullying; we programmed it to also find instances of kids defending bullying, supporting bullies, and talking about bullying events that may have happened offline.

How we built this system

In order to properly identify instances of bullying, we put together a team trained in finding modern bullying, including the ever-evolving terminology of the internet.

Once we have found solid examples of bullying, we send them to our machine learning systems to teach them what bullying looks like on Facebook, Twitter, YouTube, and other channels. These systems then flag what they think are examples of bullying, and our data science team confirms or denies each flag, teaching the systems what to look for and what to ignore in the future.
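Here is a minimal sketch of that label-train-review loop in Python with scikit-learn. The example posts, labels, and flagging threshold are invented for illustration; they are not VISR’s actual data, models, or pipeline:

```python
# A hypothetical label -> train -> review loop. All data and the 0.5
# threshold below are made up for illustration purposes only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# 1. Expert-labeled seed examples (1 = bullying, 0 = not bullying).
texts = [
    "nobody likes you, just leave",       # bullying
    "great game last night!",             # not bullying
    "she deserved what happened to her",  # supporting a bully
    "see you at practice tomorrow",       # not bullying
]
labels = [1, 0, 1, 0]

# 2. Train a simple text classifier on the labeled seed set.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression())
model.fit(texts, labels)

# 3. The model flags new posts it thinks are bullying...
unlabeled = ["you're such a loser lol", "happy birthday!!"]
scores = model.predict_proba(unlabeled)[:, 1]
flagged = [(t, s) for t, s in zip(unlabeled, scores) if s > 0.5]

# 4. ...reviewers confirm or deny each flag, and confirmed labels are
# added to the training set before the next retrain.
for text, score in flagged:
    print(f"review: {score:.2f}  {text!r}")
```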

How we know it works

This all sounds great, but does it work? Algorithmic decisions can sound good in theory but turn out to be not so great in practice. This is why we run a huge number of tests.

We use 10-fold cross-validation on everything. That is, we first take all of our examples of bullying and divide them into two groups: i) a training group, and ii) a validation group used to determine that our training was successful. We then set the validation group aside and forget about it until we are done with our tests.

We then take the training group and split it into ten equal parts. For each of ten rounds, we train on nine of the parts and then test on the remaining one. Each test gives us a number that tells us how well the algorithm performed, so after all ten rounds we have ten numbers. We take the average of those ten numbers, and then we have a pretty good idea of the success of the algorithm. We compare different algorithms in this same way and find the best one. Finally, we take the very best algorithm and pull out the validation group of examples that we had set aside when we began. We run one final test on that group to be extra sure that we have the very best algorithm possible.
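A minimal sketch of that whole procedure using scikit-learn follows. The synthetic dataset and the two candidate models are placeholders for illustration, not VISR’s actual data or algorithms:

```python
# Holdout split + 10-fold cross-validation + final holdout test.
# Dataset and candidate models are placeholders, not VISR's.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score, train_test_split

X, y = make_classification(n_samples=1000, random_state=0)

# Set aside a validation group and "forget about it" until the end.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
}

# 10-fold cross-validation: ten train/test rounds, ten scores, one mean.
cv = KFold(n_splits=10, shuffle=True, random_state=0)
best_name, best_model, best_mean = None, None, -1.0
for name, model in candidates.items():
    scores = cross_val_score(model, X_train, y_train, cv=cv)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
    if scores.mean() > best_mean:
        best_name, best_model, best_mean = name, model, scores.mean()

# One final test on the untouched validation group.
best_model.fit(X_train, y_train)
print(f"best ({best_name}) holdout accuracy: "
      f"{best_model.score(X_val, y_val):.3f}")
```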

This process keeps giving us new surprises. We’ve found that when looking at anxious teenagers, it helps to know their age but doesn’t help to know their gender. By contrast, it helps to know the gender of a bully but doesn’t help very much to know the age. Why is that? We have no clue! But this is why we perform so many tests.
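One common way to run that kind of test is a feature ablation: cross-validate the model with and without a given feature and compare the scores. A hypothetical sketch on synthetic data, where every name and number is invented for illustration:

```python
# Hypothetical feature-ablation check: does knowing age help the model?
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 500
text_features = rng.normal(size=(n, 20))                # stand-in for text features
age = rng.integers(13, 18, size=(n, 1)).astype(float)   # hypothetical metadata
# Synthetic label that depends weakly on age, just for the demo.
y = (text_features[:, 0] + 0.1 * (age[:, 0] - 15) > 0).astype(int)

model = LogisticRegression(max_iter=1000)
score_without = cross_val_score(model, text_features, y, cv=10).mean()
score_with = cross_val_score(
    model, np.hstack([text_features, age]), y, cv=10).mean()
print(f"mean accuracy without age: {score_without:.3f}")
print(f"mean accuracy with age:    {score_with:.3f}")
# If the two scores match, the extra feature isn't helping this model.
```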

This article was written by David Van Bruwaene, lead NLP scientist at VISR, which provides a preventive wellness app designed to safeguard children online.

Illustration by Lillian Chan, a Toronto-based illustrator, from Canadian Family Magazine (March 2013).