This feels like a bad idea. I've personally found that most "toxic" comments don't exist without context, and this API does not seem to provide any way to deal with that.
It does seem pretty half-assed. It's basically an undergrad student course project: take an off-the-shelf ML algorithm, train it on an available corpus of "toxic" comments, and write up a report. Fair enough, that's a reasonable self-contained student project, especially if the write-up acknowledges the limitations.
But if it's coming from a company with Google-level resources, that's not quite as impressive, especially if the write-up, rather than acknowledging the limitations, oversells it. From some testing, it looks like this system operates at roughly 1970s levels of sophistication, equivalent to counting "bad" words (they might use some neural net under the hood, who knows, but the end result is the same). If you tell someone they are a bloody cunt, that's judged toxic. If you tell them you think they should end up deceased, that's judged not toxic at all. If you use any non-Latin alphabet, that's judged toxic (telling someone to have a nice day, but in Greek, is highly toxic to Google). Overall, not impressed.
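To make the "counting bad words" comparison concrete, here's a toy sketch (my own illustration, not Google's actual model or API) of a blocklist-based scorer that reproduces exactly the failure mode described above: it flags the overt insult but scores the veiled threat at zero, because no listed word appears.

```python
# Toy 1970s-style "toxicity" scorer: fraction of words found on a
# fixed blocklist. The blocklist contents are hypothetical.
BAD_WORDS = {"bloody", "cunt", "idiot", "moron"}

def toy_toxicity(comment: str) -> float:
    words = [w.strip(".,!?").lower() for w in comment.split()]
    if not words:
        return 0.0
    return sum(w in BAD_WORDS for w in words) / len(words)

# Flags the explicit insult...
print(toy_toxicity("you are a bloody cunt"))               # 0.4
# ...but misses the threat entirely: no blocklisted word present.
print(toy_toxicity("I think you should end up deceased"))  # 0.0
```

Whatever model actually sits behind the API, behaviour that is indistinguishable from this on simple probes is the point of the criticism.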
There are some other interesting examples in the Verge article on this. And also the observations from Ramsey Nasser on Twitter about how all Arabic statements always seem to be rated at least 30% toxic.
Self-censorship is too much work, so let's outsource it to a company? Seriously?
So I hit it with a smattering of comments from this forum.
Some I regarded as somewhat toxic, some I regarded as a respondent trying to be reasonable, some just technical.
The "just technical" one was rated neutral; the trollish response AND the reasoned reply were both rated toxic.
The "test yourself" API just came up with a number, with no hint as to which words were adding toxicity and which were neutralising it.