Analytics
Textgain was founded in 2015 as a spin-off of the University of Antwerp (Belgium). We specialize in the development of Artificial Intelligence that automatically detects and monitors harmful online societal trends and tensions, such as hate speech and disinformation. In 2016, Textgain gained significant attention for its efforts to detect jihadist propaganda on social media, and it has since expanded its software stack to detect online signs of radicalization in all its forms, including extreme-left and extreme-right rhetoric. In 2021, Textgain became the coordinator of the European Observatory of Online Hate, an initiative to monitor online hate speech across the European Union.
Textgain built a GDPR-compliant social media monitoring pipeline that filters posts against a list of keywords covering the most potentially polarising topics, producing a database for further analysis. On top of this, we built a bot detection algorithm customised for finding harmful bot content on the mainstream social media platform TikTok. To distinguish harmful bots from regular bots on TikTok, we applied our customised transformer-based toxicity detection algorithm to all texts in the database. The resulting network displays the hashtags appended to potential bot content on TikTok that is also marked as toxic.
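The first stage of such a pipeline, keyword filtering, can be sketched as follows. This is a minimal illustration only: the keyword list and post texts here are invented, as the actual list of polarising topics is not published.

```python
import re

# Hypothetical keyword list; the real polarising-topic list is not public.
KEYWORDS = ["election fraud", "great replacement", "vaccine hoax"]

# One pre-compiled, case-insensitive pattern matching any monitored keyword.
pattern = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)

def matches_keywords(text: str) -> bool:
    """Return True if a post mentions any monitored topic."""
    return pattern.search(text) is not None

# Posts that match the keyword list are kept for further analysis.
posts = [
    "Nothing to see here, just cooking videos",
    "They are hiding the ELECTION FRAUD from you!",
]
flagged = [p for p in posts if matches_keywords(p)]
```

Matching posts would then flow into the database on which the bot detection and toxicity models run.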
Glossary
-
The circles represent hashtags, and the lines represent connections between hashtags that
are used by the same bot(s). Hashtag circles that often appear together attract each other,
and vice versa, resulting in a grouping of those hashtags. The larger a circle, the more often
the hashtag appears in the data. The colour of a circle represents its community, which
is calculated using the Louvain method ( https://en.wikipedia.org/wiki/Louvain_method ). A
community is a hub in which the circles interact significantly more with each other than with
circles outside the community. The network as a whole displays bot message interaction
among TikTok posts, whereby the hashtags represent bot-targeted topics.
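The construction described above can be sketched with NetworkX: hashtags become nodes sized by frequency, co-use in the same comment becomes a weighted edge, and Louvain community detection produces the colour groups. The comment data below is a toy example, not real monitoring output.

```python
from collections import Counter
from itertools import combinations

import networkx as nx
from networkx.algorithms.community import louvain_communities

# Toy data: hashtags attached to individual suspected-bot comments.
comments = [
    ["#news", "#politics", "#eu"],
    ["#news", "#politics"],
    ["#fyp", "#dance"],
    ["#fyp", "#dance", "#music"],
]

G = nx.Graph()

# Circle size ~ how often the hashtag appears in the data.
freq = Counter(tag for tags in comments for tag in tags)
for tag, count in freq.items():
    G.add_node(tag, size=count)

# Edge ~ two hashtags used by the same bot comment; weight counts co-uses.
for tags in comments:
    for a, b in combinations(sorted(set(tags)), 2):
        w = G.edges[a, b]["weight"] + 1 if G.has_edge(a, b) else 1
        G.add_edge(a, b, weight=w)

# Louvain communities: groups of hashtags that interact more with each
# other than with the rest of the network (the circle colours).
communities = louvain_communities(G, weight="weight", seed=42)
```

A force-directed layout of `G` then reproduces the attraction/repulsion behaviour described in the glossary: heavily co-used hashtags cluster, and communities separate visually.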
-
Textgain, as the technological partner of IMSyPP, is tackling hate speech in a multidisciplinary
fashion, combining machine learning, computational social science and linguistic approaches
to support a data-driven approach to hate speech regulation, prevention and
awareness-raising. The goal of this initiative is the automated detection and sustainable
monitoring of hate speech. To this end, we developed near real-time hate speech detection
models tuned to language, culture and legislation, taking the context of the message
into account.
-
● 0 - APPROPRIATE: no target
● 1 - INAPPROPRIATE: contains terms that are obscene or vulgar, but the text is not
directed at any person specifically; has no target
● 2 - OFFENSIVE: includes offensive generalization, contempt, dehumanization, or
indirect offensive remarks
● 3 - VIOLENT: the author threatens, indulges in, desires or calls for physical violence
against a target; this also includes calling for, denying or glorifying war crimes and
crimes against humanity
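The 0-3 scale above can be encoded as a simple lookup. The `is_targeted` helper below is an assumption of ours, reflecting only the glossary's distinction that classes 0 and 1 have no target while 2 and 3 do; it is not part of the published IMSyPP scheme.

```python
# The four-point scale from the glossary above.
LABELS = {
    0: "APPROPRIATE",
    1: "INAPPROPRIATE",
    2: "OFFENSIVE",
    3: "VIOLENT",
}

def is_targeted(class_id: int) -> bool:
    """Hypothetical helper: classes 2 and 3 are directed at a target,
    while 0 and 1 are not (per the glossary's 'no target' notes)."""
    return class_id >= 2
```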
Words used in bot comments
In the word cloud below you can see the words used in suspected bot comments classified as toxic over the past month.
Hashtags used in bot comments
In the word cloud below you can see the hashtags used in suspected bot comments classified as toxic over the past 3 months.
Suspected bot comments
Number of suspected bot comments per month over the last 6 months.
Past month
Number of suspected bot comments per classification over the past month.
Past 3 months
Number of suspected bot comments per classification over the past 3 months.
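The monthly and per-classification counts behind these charts amount to simple aggregations. A minimal sketch, assuming a log of (date, class) pairs; the data here is invented for illustration:

```python
from collections import Counter
from datetime import date

# Toy log of suspected bot comments: (posting date, class on the 0-3 scale).
comment_log = [
    (date(2024, 1, 5), 2),
    (date(2024, 1, 20), 3),
    (date(2024, 2, 3), 2),
    (date(2024, 2, 10), 0),
]

# Comments per month (the 6-month bar chart) ...
per_month = Counter(d.strftime("%Y-%m") for d, _ in comment_log)

# ... and comments per classification (the past-month / past-3-months breakdowns).
per_class = Counter(c for _, c in comment_log)
```

Restricting `comment_log` to the desired window (past month or past 3 months) before counting yields each of the chart variants above.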