Bulk Data Download and Diff

Hi everyone — I’m planning to use the OpenSanctions data for an automated workflow and I have a few questions about the best way to handle downloads and updates:

  1. Data formats — What formats are recommended for programmatic use (e.g., JSON, CSV, FollowTheMoney)? Are there differences in content or completeness between them?

  2. Update frequency — How often do the main datasets get updated on data.opensanctions.org (e.g., daily)? Is this consistent across “default”, “sanctions”, “PEPs”, etc.?

  3. Incremental updates or diffs — Is there a way to fetch only incremental changes (diffs) between releases, or do I need to download the full dataset each time? If diffs are available, where are they documented or published?

Hi @imranbaloch and welcome to our community! Have you checked out our website, specifically the dataset sources page, where we list the update frequency of each source? If you click through to a specific source you can see the download file types. We have a nice explanation of how to use our deltas for incremental processing as well.

Feel free to peruse our documentation for frequently asked questions and explanations of all the different facets of our data.


Maybe to pile on to this:

  • Formats: JSON/FtM (FollowTheMoney) is the richest; the standard CSV file we provide is VERY limited. We can make other CSV exports on request.
  • Update frequency: specified on each dataset page; the main dataset is updated every 6 hours
  • Our delta (incremental update) mechanism: https://www.opensanctions.org/docs/bulk/delta/
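
To make the delta idea a bit more concrete, here is a rough Python sketch of how a consumer might apply one delta file to a local copy of the data. It is an illustration only, not the official client: it assumes each line of a delta file is a JSON object with an `op` field (`ADD`, `MOD` or `DEL`) and an `entity` object carrying at least an `id`, so please check the delta documentation linked above for the actual file layout and for how to find the delta files for each version. The function name `apply_delta` and the sample entities are made up for this example.

```python
import json
from typing import Dict, Iterable


def apply_delta(entities: Dict[str, dict], delta_lines: Iterable[str]) -> Dict[str, dict]:
    """Apply one delta file (given as lines of JSON) to an in-memory store, in place.

    Assumed record shape (verify against the delta docs):
    {"op": "ADD" | "MOD" | "DEL", "entity": {"id": "...", ...}}
    """
    for line in delta_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        op = record["op"]
        entity = record["entity"]
        entity_id = entity["id"]
        if op in ("ADD", "MOD"):
            # New or changed entity: store the full entity as given.
            entities[entity_id] = entity
        elif op == "DEL":
            # Removed entity: drop it if we have it.
            entities.pop(entity_id, None)
        else:
            raise ValueError(f"unknown delta operation: {op!r}")
    return entities


# Hypothetical local state, e.g. loaded from an earlier full JSON/FtM export.
store = {
    "ex-2": {"id": "ex-2", "schema": "Company", "properties": {"name": ["Acme"]}},
    "ex-3": {"id": "ex-3", "schema": "Person", "properties": {"name": ["John Roe"]}},
}

# Hypothetical delta lines, following the assumed record shape.
sample_delta = [
    '{"op": "ADD", "entity": {"id": "ex-1", "schema": "Person", "properties": {"name": ["Jane Doe"]}}}',
    '{"op": "MOD", "entity": {"id": "ex-2", "schema": "Company", "properties": {"name": ["Acme Ltd"]}}}',
    '{"op": "DEL", "entity": {"id": "ex-3"}}',
]

apply_delta(store, sample_delta)
print(sorted(store))  # ['ex-1', 'ex-2']
```

In a real workflow you would presumably start from one full download, remember which version you have, and then apply each newer delta in order; if you fall too far behind or miss a version, re-downloading the full dataset is the safe fallback.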