I’ve created a custom dataset (World Bank Leadership) and deployed it to yente. It
works, but I’m seeing duplicate entities in search results.
For example:
- My dataset has: “Ajay Banga” (wb-lead-120ea…)
- OpenSanctions default has: “Ajaypal Singh Banga” (Q4699676)
// Entity 1: My custom dataset
{
  "id": "wb-lead-120ea365ea249e476b8507d585cfdb7e24fa21dd",
  "caption": "Ajay Banga",
  "schema": "Person",
  "datasets": ["worldbank_leadership"],
  "properties": {
    "name": ["Ajay Banga"],
    "topics": ["gov.igo", "role.pep"],
    "sourceUrl": ["https://www.worldbank.org/ext/en/who-we-are/leadership/ajay-banga"]
  },
  "target": true,
  "first_seen": "2025-11-17T18:09:43",
  "last_seen": "2025-11-17T18:09:43"
}
// Entity 2: OpenSanctions Default
{
  "id": "Q4699676",
  "caption": "Ajaypal Singh Banga",
  "schema": "Person",
  "datasets": ["wikidata"],
  "properties": {
    "name": ["Ajaypal Singh Banga"],
    "wikidataId": ["Q4699676"],
    "country": ["zz"]
  },
  "target": true
}
These are the same person but appear as separate results.
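To make the expectation concrete, here is a rough sketch of the kind of pairing I would like to end up with. It is only an illustration on the two records above: it uses Python's standard `difflib`, the helper names `name_similarity` and `likely_same_person` are hypothetical, and the 0.6 threshold is an arbitrary assumption. It is not how yente or OpenSanctions score or merge entities.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two case-folded names."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def likely_same_person(mine: dict, theirs: dict, threshold: float = 0.6) -> bool:
    """Hypothetical check: same schema and at least one similar name pair.

    The threshold is an assumption chosen for this example only.
    """
    if mine["schema"] != theirs["schema"]:
        return False
    best = max(
        name_similarity(a, b)
        for a in mine["properties"]["name"]
        for b in theirs["properties"]["name"]
    )
    return best >= threshold


# Trimmed copies of the two search results shown above.
mine = {
    "id": "wb-lead-120ea365ea249e476b8507d585cfdb7e24fa21dd",
    "schema": "Person",
    "properties": {"name": ["Ajay Banga"]},
}
theirs = {
    "id": "Q4699676",
    "schema": "Person",
    "properties": {"name": ["Ajaypal Singh Banga"]},
}

print(likely_same_person(mine, theirs))
```

A plain string ratio like this is obviously too crude for real deduplication (it ignores birth dates, countries, and identifiers), which is why I am asking about the supported way to do it.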
Question: How do I deduplicate my custom dataset against the default OpenSanctions
catalog, so that matching entities appear as a single result?