Note: Requires Elasticsearch 9.x — full index rebuild required
This release is mostly a maintenance and hardening release: yente now requires an Elasticsearch 9.x server, verifies the integrity of downloaded entity data via checksums, exposes index freshness as an OpenTelemetry gauge, and ships container images with signatures and bill-of-materials attestations for downstream supply-chain scanners.
If you haven’t moved your search cluster off ES 8 yet, do that before upgrading yente — see the upgrade guide. The transition path is: 8.x → 8.19.x → 9.x, then upgrade yente. v5.4 was the last release that talked to ES 8 servers.
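As a rough illustration of the transition path above, the following sketch reads the version an Elasticsearch server reports at its root endpoint and maps it to the next step. The helper names (`fetch_cluster_version`, `next_upgrade_step`) are hypothetical and not part of yente, and the sketch assumes the cluster is reachable without authentication.

```python
import json
import re
from typing import Tuple
from urllib.request import urlopen


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '8.19.3'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def next_upgrade_step(version: str) -> str:
    """Map a server version to the next step on the 8.x -> 8.19.x -> 9.x path."""
    major, minor = parse_major_minor(version)
    if major >= 9:
        return "upgrade yente"
    if major == 8 and minor >= 19:
        return "upgrade Elasticsearch to 9.x"
    if major == 8:
        return "upgrade Elasticsearch to 8.19.x"
    # Versions older than 8.x are outside the path described in these notes.
    return "not covered by this sketch"


def fetch_cluster_version(base_url: str) -> str:
    """Read version.number from the Elasticsearch root endpoint.

    Assumes an unauthenticated cluster; real deployments usually need
    credentials or an API key.
    """
    with urlopen(base_url) as response:
        info = json.load(response)
    return info["version"]["number"]
```

For example, `next_upgrade_step(fetch_cluster_version("http://localhost:9200"))` would report the next step for a local cluster; the URL is a placeholder.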
The changes in more detail:
- Entity data integrity checks. When the catalog metadata advertises a checksum for an entities resource, yente now verifies the downloaded data against it during indexing. A mismatch raises a clear error instead of silently indexing a possibly truncated or corrupt file. This behavior can be turned off via the new `YENTE_VERIFY_CHECKSUM=false` setting if needed. Thanks @jbothma for driving this work.
- Bounded match compute. The `/match` endpoint now retrieves and scores at most `YENTE_MAX_MATCH_CANDIDATES` candidates per query (default 500). Previously a large `limit` could pull in thousands of candidates per query, causing slow responses and occasional OOMs under load. The cap is set high enough that real-world matching results don’t change.
- HTTP 414 for oversized URLs. Requests whose URL (path + query string) exceeds `YENTE_MAX_URL_LENGTH` (default 60000 bytes) are now rejected with a 414 response. This is to avoid a bug in uvicorn that silently eats long GET requests.
- Index freshness as an OTel gauge. A new `indexed_dataset_version_time` gauge exposes each indexed dataset’s `last_export` as a Unix timestamp, read from the index `_meta` written at index time and refreshed after each catalog reload. Wire it into your monitoring to alert on stale indices. See the monitoring docs.
- Supply-chain artifacts on tag push. Every release now produces, alongside the multi-arch image: per-platform CycloneDX 1.6 + SPDX 2.3 SBOMs.
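To make the integrity check concrete, here is a minimal sketch of verifying a downloaded file against an advertised checksum. It is not yente's implementation: the function names, the `ChecksumMismatch` exception, and the choice of SHA-256 are all assumptions made for illustration, since these notes do not say which hash algorithm the catalog uses.

```python
import hashlib
from typing import Optional


class ChecksumMismatch(Exception):
    """Raised when downloaded data does not match the advertised checksum."""


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large exports need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected: Optional[str], algorithm: str = "sha256") -> None:
    """Raise ChecksumMismatch if the file's digest differs from `expected`.

    If no checksum is advertised (expected is None), nothing is verified.
    """
    if expected is None:
        return
    actual = file_digest(path, algorithm)
    if actual != expected.lower():
        raise ChecksumMismatch(
            f"{path}: expected {algorithm} {expected}, got {actual}"
        )
```

Failing loudly before indexing is the point of the design: a truncated download produces a different digest, so the error surfaces before any partial data reaches the index.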
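Because the freshness gauge reports `last_export` as a Unix timestamp, a staleness alert reduces to comparing that value against the current time. The sketch below shows the arithmetic only; the `stale_datasets` helper and its input mapping are hypothetical, and in practice the comparison would normally live in an alerting rule of your monitoring system.

```python
import time
from typing import Dict, List, Optional


def stale_datasets(
    last_export_by_dataset: Dict[str, float],
    max_age_seconds: float,
    now: Optional[float] = None,
) -> List[str]:
    """Return dataset names whose last export is older than max_age_seconds.

    `last_export_by_dataset` maps a dataset name to the Unix timestamp
    reported by the gauge for that dataset.
    """
    current = time.time() if now is None else now
    return sorted(
        name
        for name, exported_at in last_export_by_dataset.items()
        if current - exported_at > max_age_seconds
    )
```

Passing `now` explicitly keeps the function deterministic, which makes the threshold easy to test.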
As usual, this release contains updates across the whole stack: followthemoney 4.8.2 → 4.9.2, nomenklatura 4.9.0 → 4.10.0, rigour 2.1.0 → 2.1.2, fastapi to 0.137.1, uvicorn to 0.49.0, cryptography 48 → 49, plus the usual CI action and dev-dependency bumps.
Full Changelog: v5.4.0...v5.5.0 (opensanctions/yente on GitHub)