Call for false positives: help us build out a great set of name tests

paco · May 1, 2025, 7:10pm

One approach: if you run entity resolution in Senzing on a collection of datasets, each resolved entity will have a list of values for each of its features, such as names.

In other words, one of the byproducts from running ER is that it produces a domain-specific thesaurus, plus data quality metrics for the features.

From those results you can see which name variants are related, and which of them are used most often.
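
To make that concrete, here is a rough sketch in plain Python. It does not call the Senzing API; it assumes you have already exported the resolved entities into a dict mapping an entity ID to the name values found in its records. The sample data and the `resolved_entities`, `name_thesaurus`, and `positive_pairs` names are all made up for illustration.

```python
from collections import Counter

# Hypothetical export (not real data): resolved entity ID -> every name
# value seen across the source records merged into that entity.
resolved_entities = {
    "E1": ["Robert Smith", "Bob Smith", "Robert Smith", "Smith, Robert"],
    "E2": ["Acme Corp", "ACME Corporation", "Acme Corp"],
}


def name_thesaurus(entities):
    """Map each entity ID to (name, count) tuples, most frequent first."""
    thesaurus = {}
    for entity_id, names in entities.items():
        # Count each distinct spelling, ignoring empty values.
        counts = Counter(name.strip() for name in names if name.strip())
        thesaurus[entity_id] = counts.most_common()
    return thesaurus


def positive_pairs(thesaurus):
    """Yield (most_used_name, variant) pairs taken from the same entity."""
    for variants in thesaurus.values():
        if not variants:
            continue
        preferred = variants[0][0]
        for variant, _count in variants[1:]:
            yield preferred, variant


thesaurus = name_thesaurus(resolved_entities)
pairs = list(positive_pairs(thesaurus))
```

Each pair comes from a single resolved entity, so it is a candidate "should match" case for a name test set, and the counts give a rough sense of which variant is used most. The pairs are only as reliable as the entity resolution behind them, so they would still need a manual review.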

Might this help with developing data and tests for name matching?
