Call for false positives: help us build out a great set of name tests

We do have quite a bit of that sort of data, although it also comes from merging entities across sanctions lists. (For example, one fun asset that we have, and should talk about more, is a pairwise match file of persons and companies that’s generated from the main OS data.)

What I’m trying to chase down at the moment is a more domain-inspired typology of name matching error types.

For example:

  • A screening system should consider John B. Roberts and John A. Roberts to be different people, especially if we know they’re in America.
  • LLC ORION and ORION OOO are the same Russian company.
  • Ben Netanyahu and Benjamin Netanyahu: is that a match, or does that get too broad?

We often get screening false positives sent in by people (unfortunately, we rarely get false negatives!), and I think those can serve as a test harness on the API to make sure we at least don’t make the same mistake twice 🙂
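To make that a bit more concrete, here is a rough sketch of what such a harness could look like. Everything in it is an assumption for illustration: the `NameCase` structure, the category labels, and `naive_matcher` are hypothetical and are not part of any existing API. The real matcher would be plugged in as the `matcher` callable.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class NameCase:
    """One reported name pair (hypothetical structure, for illustration only)."""
    category: str              # made-up label for the error type
    left: str
    right: str
    expected: Optional[bool]   # True = match, False = no match, None = undecided


# The three examples from this post, with made-up category labels.
CASES = [
    NameCase("middle-initial-conflict", "John B. Roberts", "John A. Roberts", False),
    NameCase("legal-form-equivalence", "LLC ORION", "ORION OOO", True),
    NameCase("given-name-short-form", "Ben Netanyahu", "Benjamin Netanyahu", None),
]


def run_cases(matcher: Callable[[str, str], bool],
              cases: List[NameCase]) -> List[NameCase]:
    """Return the cases where the matcher disagrees with the expected outcome.

    Cases with expected=None are still open questions and are skipped.
    """
    failures = []
    for case in cases:
        if case.expected is None:
            continue
        if matcher(case.left, case.right) != case.expected:
            failures.append(case)
    return failures


def _tokens(name: str) -> set:
    """Lower-case the name and split it into tokens, dropping trailing dots."""
    return {part.strip(".").lower() for part in name.split()}


def naive_matcher(left: str, right: str) -> bool:
    """Deliberately crude stand-in: match if the names share any token."""
    return bool(_tokens(left) & _tokens(right))


if __name__ == "__main__":
    for case in run_cases(naive_matcher, CASES):
        print(f"FAILED [{case.category}]: {case.left!r} vs {case.right!r}")
```

The crude stand-in matcher gets the ORION pair right but wrongly matches the two Roberts, so that case is reported as a failure. Each false positive sent in would become one more entry in `CASES`, and leaving `expected` as `None` lets us record the undecided ones (like the Netanyahu pair) without asserting anything yet.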
