Vietnam | Computer Science and Mathematics | Volume 13 Issue 4, April 2025 | Pages: 35 - 39
Fuzzy Join Algorithms in Big Data
Abstract: Fuzzy join is a more sophisticated approach to data matching and has applications in many fields. Instead of marking records as matches or mismatches based on exact matching algorithms, fuzzy joins combine two data sets whose keys do not match exactly but are similar up to a certain threshold. The biggest challenge is finding the pairs of records with similarity greater than or equal to a given threshold within a given period of time. The paper presents several algorithms that apply the Hamming, Levenshtein, and Cosine distance measures to the MapReduce model, and proposes a new algorithm based on hedge algebra for fuzzy join on semantically ordered fuzzy datasets in big data.
Keywords: Fuzzy join, similarity algorithms, hedge algebra, big data, MapReduce
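
To make the threshold idea in the abstract concrete, the following is a minimal single-machine sketch of a fuzzy join using Levenshtein distance. It is an illustration under stated assumptions, not the paper's MapReduce algorithm or its hedge-algebra method: the function names (`levenshtein`, `similarity`, `fuzzy_join`), the normalization of distance by the longer string's length, and the sample keys are all hypothetical choices made for this example.

```python
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions, substitutions to turn a into b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest


def fuzzy_join(left, right, threshold):
    """Naive all-pairs join: keep pairs whose similarity meets the threshold."""
    return [(l, r) for l, r in product(left, right)
            if similarity(l, r) >= threshold]


# Hypothetical sample keys for illustration only.
pairs = fuzzy_join(["hanoi", "saigon"], ["ha noi", "hue"], threshold=0.8)
print(pairs)
```

This sketch compares every pair of keys, which costs time proportional to the product of the two input sizes. That quadratic cost is the reason a distributed model such as MapReduce is of interest for large inputs.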