GOTCHA BOT DETECTION: CONTEXT, TIME AND PLACE MATTERS
DOI: https://doi.org/10.1590/SciELOPreprints.5974
Keywords: Bot detection, machine learning algorithm, Brazil, computational propaganda
Abstract
Bot detection is increasingly relevant because automated accounts play a disproportionate role in spreading disinformation, controlling social interactions, influencing social media algorithms and manufacturing public opinion online for different purposes. Defining, describing and detecting automated manipulation techniques has proved a challenge as the technology quickly advances in reach and sophistication. Given the highly contextual character of social science research, the use of off-the-shelf detection tools raises questions about the applicability of machine learning systems to different cases, times and places. Our purpose is therefore to discuss the role of computational methods, focusing on the limitations and potential of machine learning systems for identifying bots on social media platforms. To do so, we analyze the performance of Botometer, a widely adopted detection tool, in a specific domain (Amazon Forest Fires) and language (Portuguese), and propose a supervised machine learning classifier, called Gotcha, based on Botometer's framework and trained for this specific dataset. We also examine how our classifier behaves and evolves over time, and perform tests to evaluate the generalization capabilities of the retrained model. Our results show that supervised methods do not perform well on datasets with characteristics, such as language and topic, on which the system was not directly trained. Hence, our study shows that a computational model that succeeds in testing does not always guarantee reliable results when applied to a specific real case. Our findings indicate the need for social scientists to confirm the reliability of tools created and tested only through the prism of computational studies before applying them to empirical social science research.
Copyright (c) 2023 Rose Marie Santini, Débora Salles, Fernando Ferreira, Felipe Grael
This work is licensed under a Creative Commons Attribution 4.0 International License.