Analytics

Textgain was founded in 2015 as a spin-off of the University of Antwerp (Belgium). We specialize in the development of Artificial Intelligence that automatically detects and monitors harmful online societal trends and tensions, such as hate speech and disinformation. In 2016, Textgain gained significant attention for its efforts to detect jihadist propaganda on social media, and has since expanded its software stack to detect online signs of radicalization in all its forms, including extreme-left and extreme-right rhetoric. In 2021, Textgain became the coordinator of the European Observatory of Online Hate, an initiative to monitor online hate speech across the European Union.

Textgain built a GDPR-compliant social media monitoring pipeline that collects posts matching a list of keywords covering the most potentially polarising topics, producing a database for further analysis. We built a bot detection algorithm customised for finding harmful bot content on the mainstream social media platform TikTok. To distinguish harmful bots from regular bots on TikTok, we applied our customised transformer-based toxicity detection algorithm to all of the texts in the database. The resulting network displays the hashtags attached to potential bot content on TikTok that is also marked as toxic.
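The keyword-based collection step described above can be sketched as a simple pre-filter. This is a minimal illustration, not Textgain's actual pipeline: the keyword list and example posts are invented for demonstration.

```python
# Minimal sketch of a keyword-based pre-filter for a monitoring pipeline.
# The keyword list and posts below are hypothetical examples.

POLARISING_KEYWORDS = {"migration", "vaccine", "election"}

def matches_keywords(text: str, keywords: set[str] = POLARISING_KEYWORDS) -> bool:
    """Return True if the post mentions any monitored keyword."""
    # Strip common punctuation and hashtag markers before matching.
    tokens = {token.strip(".,!?#").lower() for token in text.split()}
    return not keywords.isdisjoint(tokens)

posts = [
    "New #election rumours spreading fast",
    "Just a video about my cat",
]
# Only posts matching the keyword list enter the database for analysis.
flagged = [post for post in posts if matches_keywords(post)]
```

In a real deployment the matching would be language-aware and far more extensive, but the shape of the step — filter first, analyse the resulting database afterwards — is the same.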

Glossary

  • The circles represent hashtags and the lines represent connections between hashtags that are used by the same bot(s). Hashtag circles that often appear together attract each other, and vice versa, resulting in a grouping of those hashtags. The bigger the circle, the more often the hashtag appears in the data. The colour of a circle represents its community, which is calculated using the Louvain method ( https://en.wikipedia.org/wiki/Louvain_method ). A community is a hub in which the circles interact significantly more with each other than with circles outside the community. The network as a whole displays bot message interaction among TikTok posts, whereby the hashtags represent bot-targeted topics.
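The hashtag network and its Louvain communities can be sketched with networkx, which ships an implementation of the Louvain method. The graph below is invented toy data, not the dashboard's actual network.

```python
import networkx as nx

# Sketch of the hashtag network described above: nodes are hashtags,
# edges link hashtags used by the same bot account. Toy data only.
G = nx.Graph()
G.add_edges_from([
    ("#news", "#politics"), ("#politics", "#vote"), ("#news", "#vote"),
    ("#dance", "#music"), ("#music", "#fun"), ("#dance", "#fun"),
    ("#vote", "#fun"),  # weak bridge between the two clusters
])

# Louvain community detection (networkx >= 2.8): each community is a set
# of hashtags that interact more with each other than with the rest.
communities = nx.community.louvain_communities(G, seed=42)

# Circle size in the visualisation scales with how often a hashtag
# appears; degree is a simple stand-in here.
sizes = dict(G.degree())
```

Each set in `communities` would be drawn in its own colour; a force-directed layout then pulls frequently co-occurring hashtags together, producing the visual grouping the glossary describes.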

  • Textgain, as technological partner of IMSyPP, is tackling hate speech in a multidisciplinary fashion, combining machine learning, computational social science and linguistic approaches to support a data-driven approach to hate speech regulation, prevention and awareness-raising. The goal of this initiative is automated detection and sustainable monitoring of hate speech. To that end, we developed near real-time hate speech detection models tuned to language, culture and legislation, taking into account the context of the message.
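The per-language model routing described above can be sketched as a dispatch table. Everything here is a hypothetical stand-in: the stub detectors and their keyword rules are invented and bear no relation to Textgain's real models.

```python
# Hedged sketch of routing messages to language-specific detection
# models. The detectors below are trivial stubs, not real classifiers.

def detect_dutch(text: str, context: str = "") -> str:
    # Stub standing in for a model trained on Dutch data.
    return "OFFENSIVE" if "haat" in text.lower() else "APPROPRIATE"

def detect_english(text: str, context: str = "") -> str:
    # Stub standing in for a model trained on English data.
    return "OFFENSIVE" if "hate" in text.lower() else "APPROPRIATE"

MODELS = {"nl": detect_dutch, "en": detect_english}

def classify(text: str, lang: str, context: str = "") -> str:
    """Dispatch to the model tuned for the message's language."""
    model = MODELS.get(lang, detect_english)  # fall back to English
    return model(text, context)
```

Tuning per language, culture and legislation then amounts to training and registering a separate model per locale, while the surrounding pipeline stays unchanged.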

  • Messages are labelled on a four-point scale:

    ● 0 - APPROPRIATE: no target

    ● 1 - INAPPROPRIATE: contains terms that are obscene or vulgar, but the text is not directed at any person specifically; has no target

    ● 2 - OFFENSIVE: includes offensive generalization, contempt, dehumanization and indirect offensive remarks

    ● 3 - VIOLENT: the author threatens, indulges in, desires or calls for physical violence against a target; this also includes calling for, denying or glorifying war crimes and crimes against humanity
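The 0-3 annotation scheme above maps naturally onto an ordered enum. The enum below restates the scale from the glossary; the `is_hate_speech` rule of thumb is an illustrative assumption, not part of the published scheme.

```python
from enum import IntEnum

# The four-point scale from the glossary, encoded as an ordered enum.
class SpeechClass(IntEnum):
    APPROPRIATE = 0    # no target
    INAPPROPRIATE = 1  # obscene/vulgar terms, but no specific target
    OFFENSIVE = 2      # generalization, contempt, dehumanization
    VIOLENT = 3        # threats or calls for physical violence

def is_hate_speech(label: SpeechClass) -> bool:
    """Illustrative cut-off: classes 2 and 3 target a person or group."""
    return label >= SpeechClass.OFFENSIVE
```

Using an `IntEnum` keeps the numeric annotation codes from the dataset while making comparisons like "class 2 or worse" readable.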

Words used in bot comments

In the word cloud below you can see the words used in suspected bot comments classified as toxic over the past month.
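Word clouds like the one described here are typically driven by simple word-frequency counts over the flagged comments. A minimal sketch, with invented example comments:

```python
from collections import Counter

# Sketch of computing a word cloud's input: word frequencies across
# comments flagged as toxic. The comments below are invented examples.
toxic_comments = [
    "total scam wake up people",
    "wake up this is a scam",
]
frequencies = Counter(
    word for comment in toxic_comments for word in comment.split()
)
# The most frequent words get the largest type in the cloud.
top_words = frequencies.most_common(3)
```

A production version would add stop-word removal and normalisation, but the frequency table is the core of the visualisation.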

Hashtags used in bot comments

In the word cloud below you can see the hashtags used in suspected bot comments classified as toxic over the past 3 months.

Suspected bot comments

Number of suspected bot comments per month over the last 6 months.

Line graph of the suspected bot comments over the last 6 months.

Past month

Number of suspected bot comments per classification over the past month.

Bar chart of the suspected bot comments per classification over the past month.

Past 3 months

Number of suspected bot comments per classification over the past 3 months.

Bar chart of the suspected bot comments per classification over the past 3 months.