Exploring Interoperability Between Local and Global Databases in Scientometrics: Lattes, Capes, and OpenAlex
DOI:
https://doi.org/10.1590/SciELOPreprints.12668
Keywords:
bibliometric coverage, Lattes CV, Scientometrics
Abstract
Numerous initiatives are currently underway to disambiguate research entities in databases worldwide. In this paper, we propose a methodology for disambiguating research entities using big data techniques, adopting an approach that goes from local to global databases. Our objective is to enhance the quality of data in the OpenAlex database by leveraging information from Brazilian databases, particularly data from the Lattes Platform and the Brazilian Federal Agency for Support and Evaluation of Graduate Education (CAPES). We compare similar names of authors and institutions, employing Digital Object Identifiers to link entities, along with an adaptation of the Levenshtein distance algorithm. The proposed method is straightforward to implement in tabular databases and facilitates disambiguation, thereby contributing to open science practices and providing an effective solution for research information systems. The findings indicate the potential for integrating local and global databases to address issues related to ambiguous names and incomplete metadata.
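The general idea in the abstract, linking records through a shared DOI and then comparing names with an edit distance, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify their adaptation of the Levenshtein distance, so the code below uses the classic algorithm with a simple length-normalized similarity. The record fields (`doi`, `author`, `local_id`, `global_id`), the function names, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents, and collapse whitespace (assumed preprocessing)."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_accents.lower().split())


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions, each of cost 1."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1 minus distance divided by the longer length."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def link_authors(local_records, global_records, threshold=0.8):
    """Pair local and global author records that share a DOI and have similar names.

    Both inputs are lists of dicts with hypothetical keys:
    local: doi, author, local_id; global: doi, author, global_id.
    """
    by_doi = {}
    for rec in global_records:
        by_doi.setdefault(rec["doi"].lower(), []).append(rec)

    links = []
    for loc in local_records:
        candidates = by_doi.get(loc["doi"].lower(), [])
        if not candidates:
            continue
        # Keep only the best-scoring candidate among authors of the same work.
        best = max(candidates, key=lambda g: similarity(loc["author"], g["author"]))
        score = similarity(loc["author"], best["author"])
        if score >= threshold:
            links.append((loc["local_id"], best["global_id"], score))
    return links


# Made-up records; the DOI and identifiers are placeholders, not real data.
local = [{"doi": "10.1234/EXAMPLE", "author": "Maria Souza", "local_id": "L1"}]
world = [
    {"doi": "10.1234/example", "author": "Maria Sousa", "global_id": "A1"},
    {"doi": "10.1234/example", "author": "João Pereira", "global_id": "A2"},
]
links = link_authors(local, world)
```

Restricting name comparison to records that already share a DOI keeps the number of pairwise comparisons small, which is one reason such a scheme fits tabular data well. A production pipeline would likely also need to handle abbreviated given names and name-order variants, which a plain edit distance scores poorly.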
Copyright (c) 2025 Alysson Fernandes Mazoni, Estevão Fernandes Macedo, Luís Fabiano Farias Borges, Esteban Fernandez Tuesta

This work is licensed under a Creative Commons Attribution 4.0 International License.
Funding data
- Fundação de Amparo à Pesquisa do Estado de São Paulo
  Grant numbers 2021/05823-1; 2019/04300-5
- Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
  Grant numbers 23038.007842/2022-84
Data statement
- The research data are available on request; the conditions are justified in the manuscript.


