I’m working with a self-hosted OpenSanctions Yente server and several custom datasets. I understand how statements work for the public OpenSanctions datasets using the API, but I’m unclear on the intended approach when it comes to custom datasets.
Are there any recommended patterns or best practices for exposing statement-level data from a custom dataset running on self-hosted Yente?
If there’s existing documentation, examples, or design rationale around this that I’ve missed, I’d really appreciate being pointed in the right direction.
Thanks in advance — and thanks for all the work on OpenSanctions!
The very short of it is: yente doesn’t know about statement data. For what we usually want the self-hosted API to do (screen some entities, maybe run a search, traverse the knowledge graph), statement data seemed like overkill. Each statements release is 10GB and really requires building an ad-hoc local graph store to be useful. Of course, we’re now discovering things that would be nice to have in the API (like filtering the properties of combined entities by source, or filtering names by language), but such a migration still seems like a distant roadmap item.
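To make that concrete: a merged FollowTheMoney entity only hands you the values of a property, while statement records keep per-value provenance (source dataset, language), which is what those filters would need. A quick illustration with made-up data (the entity and statement values below are placeholders, not real records):

```python
# Why per-source / per-language filtering needs statements: a merged entity
# only exposes plain values, while each statement keeps the dataset and lang
# it came from. All data below is invented for illustration.
from followthemoney import model

entity = model.make_entity("Person")
entity.id = "ent-001"
entity.add("name", "Jane Doe")
entity.add("name", "Джейн Доу")
print(entity.get("name"))  # ['Jane Doe', 'Джейн Доу'] - no source or language attached

# The same values as statement-shaped records retain that context:
statements = [
    {"entity_id": "ent-001", "prop": "name", "value": "Jane Doe",
     "dataset": "my_custom_dataset", "lang": "eng"},
    {"entity_id": "ent-001", "prop": "name", "value": "Джейн Доу",
     "dataset": "other_source", "lang": "rus"},
]
russian_names = [s["value"] for s in statements if s["lang"] == "rus"]
print(russian_names)
```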
Regarding tools for building the statement data: the basic model is now in followthemoney.statement, but some of the tooling for building graphs out of it lives in nomenklatura.store (we use the LevelDB implementation there).
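Roughly, the flow looks like the sketch below. The class names, constructor signatures and Statement fields are from memory of the store layer and may differ between nomenklatura/followthemoney versions, so treat this as a sketch rather than gospel:

```python
# Sketch only: exact module paths, constructor signatures and Statement field
# names are approximate and may differ between library versions.
from pathlib import Path

from followthemoney.statement import Statement        # statement model
from nomenklatura.dataset import Dataset
from nomenklatura.resolver import Resolver
from nomenklatura.store.level import LevelDBStore     # LevelDB-backed store

dataset = Dataset.make({"name": "my_custom_dataset", "title": "My custom dataset"})
resolver = Resolver()  # default resolver; newer versions may want more config

# Open (or create) a local LevelDB store and write statements into it.
store = LevelDBStore(dataset, resolver, Path("custom-store.db"))
writer = store.writer()
writer.add_statement(
    Statement(
        entity_id="ent-001",
        prop="name",
        schema="Person",
        value="Jane Doe",
        dataset="my_custom_dataset",
        lang="eng",
    )
)
writer.flush()

# Read the combined entity back out via a view over the dataset.
view = store.view(dataset)
entity = view.get_entity("ent-001")
if entity is not None:
    print(entity.schema.name, entity.get("name"))
```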
As for the /statements API: it’s basically a SQL endpoint that runs against a table that our ETL jobs are pushing data into (see nomenklatura.db).
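So if you want something equivalent for a custom dataset today, the pragmatic route is to load your dataset’s statements export into your own SQL table and filter there. A minimal sketch (the file name and the exact set of export columns are assumptions on my part, check your export):

```python
# Minimal sketch: load a statements CSV export into SQLite and filter it,
# roughly what the hosted /statements endpoint does against its own table.
# File name and column selection are assumptions.
import csv
import sqlite3

conn = sqlite3.connect("statements.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS statements (
        entity_id TEXT, prop TEXT, schema TEXT, value TEXT,
        dataset TEXT, lang TEXT, first_seen TEXT, last_seen TEXT
    )"""
)

with open("my_custom_dataset.statements.csv", newline="", encoding="utf-8") as fh:
    rows = [
        (r["entity_id"], r["prop"], r["schema"], r["value"],
         r["dataset"], r.get("lang", ""), r.get("first_seen", ""), r.get("last_seen", ""))
        for r in csv.DictReader(fh)
    ]
conn.executemany("INSERT INTO statements VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
conn.commit()

# Example query: all name statements for one entity, with source and language.
cursor = conn.execute(
    "SELECT dataset, value, lang FROM statements WHERE entity_id = ? AND prop = 'name'",
    ("ent-001",),
)
for dataset, value, lang in cursor:
    print(dataset, value, lang)
```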