Entity Resolution in Distributed Systems: From Fuzzy Matching to Knowledge Graph Integration

Authors

  • Veera Venakata Sathya Bhargav Nunna Amazon Web Services, USA
  • Radhakant Sahu Amazon Web Services, USA

DOI:

https://doi.org/10.32996/jcsts.2025.7.8.55

Keywords:

Entity resolution, record linkage, data integration, knowledge graphs, fuzzy matching

Abstract

Entity resolution addresses the challenge of identifying records that refer to the same real-world entity across distributed data systems, despite variations in how that entity is represented. In big data environments, a single entity such as "Apple Inc." may appear as "AAPL," "Apple Computer," or thousands of other variants, which degrades analytics accuracy and the quality of the data behind decisions. This paper provides an overview of entity resolution techniques, from traditional rule-based systems to modern AI-powered approaches. We examine the core components: blocking strategies for computational efficiency, similarity measures for record comparison, and classification algorithms for match determination. The field has evolved through five distinct generations, progressing from rigid deterministic matching to AI systems that use fuzzy logic, probabilistic modeling, and deep learning. We analyze key processes such as canonicalization, clustering algorithms, and cross-database linkage, alongside human-in-the-loop approaches for handling ambiguous cases. We also show why entity resolution matters for knowledge graph construction, where correct entity identification enables meaningful relationship discovery and semantic integration. Through enterprise case studies and implementation examples, we illustrate how systematic entity resolution turns disparate data sources into unified knowledge systems that support reliable decision-making.
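
The three core components named in the abstract (blocking, similarity measurement, and match classification) can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names, the suffix list, the three-character prefix blocking key, and the 0.5 threshold are all assumptions chosen for the example, and the similarity measure is the standard library's `difflib.SequenceMatcher` rather than any measure the paper prescribes.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Assumed list of corporate suffixes to ignore; purely illustrative.
SUFFIXES = {"inc", "corp", "co", "ltd"}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, drop assumed suffixes."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(tok for tok in cleaned.split() if tok not in SUFFIXES)


def blocking_key(name: str) -> str:
    """Hypothetical blocking key: first three characters of the normalized name."""
    return normalize(name)[:3]


def similarity(a: str, b: str) -> float:
    """String similarity in [0, 1] computed on the normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def resolve(records, threshold=0.5):
    """Return candidate pairs whose similarity meets the (assumed) threshold.

    Blocking: only records that share a blocking key are compared,
    which avoids comparing every record against every other record.
    Classification: a simple threshold on the similarity score.
    """
    blocks = defaultdict(list)
    for rec in records:
        blocks[blocking_key(rec)].append(rec)

    matches = []
    for block in blocks.values():
        for a, b in combinations(block, 2):
            if similarity(a, b) >= threshold:
                matches.append((a, b))
    return matches


if __name__ == "__main__":
    names = ["Apple Inc.", "Apple Computer", "apple inc",
             "Microsoft Corp.", "Microsoft Corporation", "AAPL"]
    for pair in resolve(names):
        print(pair)
```

In this sketch "AAPL" falls into a different block from the other Apple records and is never compared with them, so a prefix key plus string similarity alone does not link a ticker symbol to a company name. Resolving that kind of variant would need some other signal, for example an alias table, which is outside what this example shows.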

Published

2025-08-04

Section

Research Article

How to Cite

Veera Venakata Sathya Bhargav Nunna, & Radhakant Sahu. (2025). Entity Resolution in Distributed Systems: From Fuzzy Matching to Knowledge Graph Integration. Journal of Computer Science and Technology Studies, 7(8), 489-495. https://doi.org/10.32996/jcsts.2025.7.8.55