PoliLoom – Loom for weaving politicians’ data

Devlog #7: Deletion and Other Adventures

Hey everyone! Remember when I said we’d have a testing environment by the end of the week? Well, we hit an unexpected roadblock.

Turns out we were blocked by Wikidata. Literally blocked. But we’ll get to that story in a minute.

The Delete Button Philosophy

So we made some important architectural choices. The first one: implementing a proper deletion strategy.

Here’s how it works: when someone evaluates a statement negatively, we now actually delete it. But we only delete it locally after the deletion on Wikidata has gone through, ensuring we maintain data integrity on both sides.

We’re keeping all the positive evaluations, and we now store every single statement ID we get back from the Wikidata API, again to maintain data integrity. This is new: before, we didn’t expect to need these IDs, since we were only inserting new data. Turns out we definitely need them.
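For the curious, the ordering looks roughly like this. A minimal sketch: the wbremoveclaims API action is real, but the session handling and the delete_local helper are illustrative, not our actual code:

```python
import requests

API = "https://www.wikidata.org/w/api.php"

def delete_local(statement_guid: str) -> None:
    """Hypothetical helper: remove the statement row from our own database."""
    ...

def reject_statement(session: requests.Session, csrf_token: str,
                     statement_guid: str) -> None:
    # Remove the claim from Wikidata first...
    resp = session.post(API, data={
        "action": "wbremoveclaims",
        "claim": statement_guid,   # the statement ID we stored at import time
        "token": csrf_token,
        "format": "json",
    }).json()
    if "error" in resp:
        # Wikidata refused: keep our local copy so both sides stay in sync.
        raise RuntimeError(resp["error"])
    # ...and only then delete our own row.
    delete_local(statement_guid)
```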

The Statement ID Saga

During testing, we discovered something interesting: we often get new dates for positions where we’d want to merge the new data with existing database entries. So we added statement ID importing and deletion for existing entities, letting us accept new entries and discard existing ones.

The dates on an existing statement can be wrong while the rest of its metadata is accurate, like which party someone belongs to or the context of their position.

The Hidden Metadata Problem

Our import system was only capturing dates from Wikidata position statements, but these statements often have much more metadata: political party, electoral district, jurisdiction, and more.

When evaluators see an existing position in our interface, they’re only seeing the partial data we captured. If they reject it, they could unknowingly delete a Wikidata statement that has way more information than what they’re seeing. We could be destroying valuable metadata without anyone realizing it.

So we’ll have to import all position properties, not just dates, and make sure users can see exactly what they’re about to modify or delete.

The Leaderboard Question

With our new deletion strategy, we only track positive evaluations. That would support only part of a potential “Top accepted edits” leaderboard. Soft deletes would make this possible in the future, but honestly, there’s a lot to be done before we get to a leaderboard. It’s a possibility we’re keeping open, but not a priority right now.

The Wikidata Block Resolution :prohibited:

Remember that testing environment I promised? Here’s what actually happened:

We deployed to production, everything looked good, then discovered we couldn’t push any evaluations to Wikidata. We were getting rest-write-denied errors with no clear explanation.

After some detective work (and help from the Wikidata community on Phabricator), we discovered the issue: our entire IP range (148.251.0.0/16 - that’s 65,534 hosts!) was blocked as an “Open proxy” on the Hetzner network.

The journey to resolution:
  1. Filed a bug report on Phabricator (T404727)
  2. Appealed through the Open Proxies Stewards Block Wizard
  3. Submitted an unblock request following Wikidata’s IP block exemption procedure
  4. Worked with Wikidata administrators who found a clever solution

Big thanks to the Wikidata admin team for helping us navigate this.

The Small Improvements That Add Up

While communicating with the admins, we haven’t been idle. We’ve:

  1. Dramatically improved the position and birthplace mapping by importing lots of extra relations and building short descriptions for the reconciliation step
  2. Improved authentication to handle token expiration and error states better
  3. Redesigned the UI to show grouped statements, handle edge cases in the Wikidata statements more cleanly and allow for deletion of existing statements
  4. Done part of the work to support multiple languages

What’s Next?

The testing environment is live! :tada: Check it out at https://loom.everypolitician.org/

Right now I’m working on multilingual support, which is a bigger undertaking than it sounds. We’re:

  • Importing all languages and countries from Wikidata
  • Linking those to politicians based on their citizenship
  • Using those links to ensure we import the correct Wikipedia content in the right languages
  • Building a UI that lets you choose your preferred languages for evaluating, and will allow you to filter for politicians from your country

This should dramatically improve our coverage for non-English politicians and make the tool truly international.

We’ve successfully implemented a robust deletion system that maintains data integrity across both our platform and Wikidata. Getting there’s been quite the journey, but we’re now reliably processing and validating politician data.

Let’s see how this works across multiple languages!?

P.S. - If you find any issues or have feedback on the testing environment, please let us know!


Devlog #8: Speaking (Almost) Everyone’s Language

Hey everyone! I couldn’t help it, and changed almost every part of the codebase again this cycle. The good news? All of it still works.

The Wikipedia Link Problem

Politicians have a tonne of Wikipedia pages linked to them; we can’t afford to fetch them all, and it would be wasteful to try. So which links do you fetch? :thinking:

Our solution: Build a language inference system that checks:

  • Politician citizenship countries
  • Official languages those countries speak
  • The politician’s existing Wikipedia links
  • Size of Wikipedia language editions

Good thing we can extract all that from Wikidata :tada: We pick the three most probable languages based on what we know about someone. No matches? Fall back to the biggest Wikipedia editions (based on total link counts in our database).
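In pseudocode-ish Python, the idea is something like this. The weights and the edition_size_weight helper are made up for illustration; the real scoring lives in our import pipeline:

```python
def infer_languages(politician, fallback_editions: list[str]) -> list[str]:
    """Pick the three most probable Wikipedia languages for a politician."""
    scores: dict[str, float] = {}
    # Official languages of the citizenship countries are strong hints...
    for country in politician.citizenships:
        for lang in country.official_languages:
            scores[lang] = scores.get(lang, 0.0) + 2.0
    # ...and an existing Wikipedia link in a language is an even stronger one.
    for link in politician.wikipedia_links:
        scores[link.language] = scores.get(link.language, 0.0) + 3.0
    # Bigger editions break ties (edition_size_weight is hypothetical).
    for lang in scores:
        scores[lang] += edition_size_weight(lang)
    ranked = sorted(scores, key=scores.get, reverse=True)[:3]
    # No matches? Fall back to the biggest editions.
    return ranked or fallback_editions[:3]
```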

The Property Enlightenment

After offering the option to discard existing data from Wikidata, we realized we should be able to show all the metadata for statements; otherwise, data could be lost unknowingly. I always knew that our implementation, where we tracked certain fields in different models, was not The Way™️, but I did not know what The Way™️ was.

The breakthrough came while looking at qualifiers: an object that holds metadata on a statement about an entity, like start_date and end_date for a held position. I noticed that everything in entity properties lives in the qualifiers :exploding_head:. Everything. Positions, birthplaces, citizenships… they’re all just properties with their metadata tucked into qualifiers. Even birth dates are just properties with qualifiers, only with a value and precision instead of a linked entity.

So I rebuilt our entire pipeline around this realization. If the frontend just parses qualifiers, and enrichment stores extracted data as qualifiers, we can reduce everything to one unified Property model :sparkles:. So that’s exactly what I’ve done.
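Roughly, the unified shape looks like this. A minimal sketch: the field names and example IDs are illustrative, not our actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Property:
    entity_id: str        # the politician, e.g. "Q567"
    property_id: str      # e.g. "P39" (position held) or "P569" (date of birth)
    value: dict           # a linked entity {"id": "Q..."} or a time
                          # value {"time": "+1954-07-17", "precision": 11}
    qualifiers: dict = field(default_factory=dict)   # e.g. P580/P582 start/end times
    references: list = field(default_factory=list)
    statement_id: str | None = None   # Wikidata statement GUID, once known

# A held position and a birth date are the same shape, just different values:
position = Property("Q567", "P39", {"id": "Q12345"},  # placeholder position QID
                    qualifiers={"P580": [{"time": "+2005-11-22", "precision": 11}]})
birthdate = Property("Q567", "P569", {"time": "+1954-07-17", "precision": 11})
```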

We were already tracking citizenship to link politicians to countries. After the refactor? Adding citizenship to enrichment was literally one config entry and some generated prompts. Five minutes of work.

Queue + Frontend Upgrades

The background queue now pre-fetches politicians based on your filters. No more waiting during evaluation.

The UI now:

  • Renders dates from qualifiers client-side
  • Shows all qualifiers and references
  • Warns before deletions
  • Filters by any language/country combination from Wikidata
  • Stores filter preferences and sets defaults based on browser settings

Soft Delete Implementation

When we get a new Wikidata dump, we should keep track of what’s in it, so we can clear from our database the stuff that’s gone missing. “But what if you inserted stuff into Wikidata after the dump was created!?” I hear you think. Right, we should keep that in mind… that’s why we only mark entities and statements as soft-deleted when we haven’t seen them in the last 2 dumps.
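A sketch of the bookkeeping, with psycopg-style placeholders; the table and column names are assumptions, not the real schema:

```python
from datetime import datetime, timezone

def mark_seen(conn, entity_ids: list[str], dump_id: int) -> None:
    # Each import stamps the rows it actually saw in the dump.
    conn.execute(
        "UPDATE entities SET last_seen_dump = %s WHERE id = ANY(%s)",
        (dump_id, entity_ids),
    )

def soft_delete_missing(conn, current_dump: int) -> None:
    # Unseen in the last 2 dumps -> probably really gone from Wikidata,
    # not just newer than the dump we imported.
    conn.execute(
        "UPDATE entities SET deleted_at = %s "
        "WHERE deleted_at IS NULL AND last_seen_dump <= %s",
        (datetime.now(timezone.utc), current_dump - 2),
    )
```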

Why soft-delete? Well, I realized we need to remember what users already evaluated. Without it, enrichment would recreate statements users had already rejected. Now when you reject something, we remember. The next enrichment run checks: “Already evaluated?” If yes, it doesn’t create a statement.

I definitely did not implement soft-delete so we can have a cool leader-board. Definitely not. (Really) :baby_angel:.

What Shipped

  • Full multilingual support with local language prioritization
  • Working country/language filtering
  • Unified property system
  • Queue with pre-fetching
  • Full soft delete support with dump tracking
  • Systemd timers/services to run sync pipeline & backups
  • A fleshed-out test suite
  • Database performance improvements
  • All sorts of small UI improvements

What’s Next

We want to support editing of existing statements. At the moment, I’m a bit torn between two philosophies for this:

JSON editing with validation - Give users the raw power. They edit JSON directly, we validate hard. It’s honest about what’s happening under the hood. Wikidata veterans might feel at home. But this means dealing in raw P361 properties and Q110087116 identifiers, no labels… It’s powerful, but is power what we need here?

Drag-and-drop interface - Make it intuitive. Drag qualifiers between statements, click to edit values, visual feedback everywhere :rainbow:. We can constrain user actions to only make valid edits through good design, and we don’t have to support every edge case. The goal is to get people editing, not to expose every possible operation from day one.

The thing is, these aren’t just UI choices. They’re statements about our priorities and audience. JSON prioritizes flexibility and transparency and looking smart. Drag-and-drop prioritizes actually getting edits done and having fun.

I’m leaning toward starting with the friendly approach. Is that the best one? Maybe the answer is both. Maybe it’s neither. But we need to pick a direction and commit.

I need to sync with the team on how we handle edits, so if you have any good ideas, do share them! Also, we’re thinking about prioritization, which is a completely different story. Which politicians do you enrich and show first? There are multiple paths forward, each with trade-offs, so more on that later!

Thanks for reading this far and keep it up! :flexed_biceps:


Devlog #9: We’re ready

Hello everyone! Another update!

The last sprint was really about polishing things, so let’s get into it.

Search Infrastructure Rebuild

We’ve added labels to millions of entities across multiple languages. Before, we only used embeddings, but embeddings are not ideal for proper names. So now we’re using PostgreSQL trigram indexes with adaptive fuzzy matching. Typing “Merkl” finds “Angela Merkel”.
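Something along these lines; a sketch where the table and column names are assumptions and the thresholds are illustrative:

```python
# pg_trgm setup: the GIN index backs the % fuzzy-match operator.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX IF NOT EXISTS labels_value_trgm
    ON labels USING gin (value gin_trgm_ops);
"""

def search_labels(conn, query: str, limit: int = 10):
    # Adaptive part: short queries get a looser similarity threshold.
    conn.execute("SELECT set_limit(%s)", (0.2 if len(query) < 6 else 0.4,))
    return conn.execute(
        "SELECT entity_id, value, similarity(value, %(q)s) AS score"
        " FROM labels WHERE value %% %(q)s"   # %% escapes the % operator
        " ORDER BY score DESC LIMIT %(n)s",
        {"q": query, "n": limit},
    ).fetchall()
```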

Intelligent Enrichment Queue

Our enrichment strategy used to assume we would trigger the process for the whole dataset, but after some thought we went with a queue that we fill based on requested data. Before, users filtering for not-yet-enriched countries saw empty results, because we had only enriched so many politicians during testing. Now there are no more empty queues: when we don’t have any unevaluated politicians to show you, we trigger enrichment with the requested set of filters.

While at it, we made sure to parallelize the process where possible, making nice use of async Python.
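The fan-out looks roughly like this; find_unenriched and enrich are hypothetical helpers standing in for our actual pipeline:

```python
import asyncio

async def fill_queue(filters) -> None:
    politicians = await find_unenriched(filters)   # hypothetical query helper
    sem = asyncio.Semaphore(5)                     # be gentle on Wikipedia and the LLM

    async def enrich_one(politician):
        async with sem:
            await enrich(politician)               # fetch sources, extract, store

    await asyncio.gather(*(enrich_one(p) for p in politicians))
```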

Class Tree Filtering

Wikidata locations range from “hamlet” to “beach” to “archaeological site”. Searching every location subclass is computationally prohibitive.

We mapped the entire class tree and identified relevant branches: cities, regions, countries. We’re gearing up to filter edge cases like “former body of water” (a real Wikidata class). The strategy will differ between positions, where we will blacklist branches, and locations, where we will whitelist them; it’s an interesting problem that is more one of definition than of technology.
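The traversal itself is simple; the hard part is choosing the roots. A sketch where the load_subclass_tree helper and the chosen QIDs are illustrative:

```python
def collect_descendants(children: dict[str, set[str]], roots: set[str]) -> set[str]:
    """Walk the subclass-of (P279) tree down from the given root classes."""
    seen, stack = set(roots), list(roots)
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

children = load_subclass_tree()  # hypothetical: QID -> direct subclass QIDs

# Locations: whitelist. Only classes under a few sane branches survive.
allowed_locations = collect_descendants(children, {"Q515", "Q6256"})  # city, country, ...

# Positions: blacklist. Keep everything except pruned branches.
unwanted_roots: set[str] = set()  # e.g. the "former body of water" branch
blocked_positions = collect_descendants(children, unwanted_roots)
```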

User Experience Improvements

We’ve invested significant effort in making the interface more pleasant and intuitive. Everything should now be on brand. Next to that we’ve added a comprehensive guide that walks users through the system with examples, making it easier for newcomers to contribute effectively.

Lots of other minor fixes regarding preferences, authentication, queueing, etc.

Significant Performance Improvements

We’re pleased to report that lots of API routes now respond quite a bit faster thanks to careful database optimization. We strategically added indexes, separated labels into a dedicated model for improved ranking performance, and optimized some of our query patterns by switching from subqueries to joins.

Looking Forward

We’re exploring prioritization algorithms to determine which politicians to enrich first. Should we prioritize recently elected officials? Those with incomplete data? We’d also like some better filters; selecting for certain parliaments, for example, would be cool.

We’re excited for you to start playing with this thing. Thank you for your continued support and all the best!

Devlog #10: The Game Loop

Hey everyone! Time for another update.

Since the last post, we’ve been through two rounds of user testing. The first round led to a bunch of UI polish: guide improvements, better multiselect filters with counts, cleaner route structure, and lots of style tweaks. The second round drove most of this update. The goal throughout: make sure we don’t lose people along the way.

Learning From Real Users

Watching people use PoliLoom for the first time taught us a lot. Where do they hesitate? What makes them click away? What keeps them going?

The answer wasn’t more features. It was clarity. People needed to understand what they were doing and feel confident doing it. Every moment of confusion is a potential drop-off.

The old flow didn’t help. You’d log in and immediately land in the evaluation screen — politicians coming at you, no context, no goal. Filters existed but were buried in a menu most people never found.

Now you start with a choice. Pick a country, pick a language, then begin. You’re not drowning in data — you’re starting a session with a clear endpoint. Finish it, come back later, start another. Same data, completely different feeling.

The Tutorial

The biggest addition is a hands-on tutorial. Not a wall of text — an interactive walkthrough where you actually evaluate data and get immediate feedback.

Basic tutorial (13 steps):

  • Learn about source documents and extracted data
  • Practice accepting and rejecting
  • Work with multiple sources
  • Understand why specific beats generic

Advanced tutorial (6 steps):

  • Unlock the ability to deprecate existing Wikidata statements
  • Learn when to preserve valuable metadata

The tutorial catches mistakes and explains what went wrong. Get it right? Move on. Get it wrong? Try again with a hint. By the end, users understand the task and feel ready.

Advanced Mode

Speaking of deprecation — this is now opt-in. New users see a simpler interface: just accept or reject new extractions. Toggle advanced mode and you can also deprecate existing Wikidata statements that need replacing.

This came directly from user testing. The full interface was overwhelming for newcomers. Now there’s a clear progression.

Better Evidence

We now extract multiple supporting quotes per statement instead of just one. When the AI finds “Jane Doe was born on March 15, 1975” and also “Born: March 15, 1975” — you see both. More evidence, easier verification.

The highlighting algorithm got smarter too. Multiple quotes, multiple highlights.

The Small Stuff

Lots of polish:

  • “Review” became “Evaluate” — clearer verb
  • Consistent accept/reject terminology throughout
  • Button and input style refresh
  • The old static guide is gone (the tutorial replaced it)
  • Properties are now sorted by date

Backend Housekeeping

  • Preferences moved entirely to client-side localStorage — simpler architecture
  • Soft delete for extracted statements — we remember what users already rejected
  • Wikipedia permanent URLs and Wikipedia project tracking
  • Wider language support
  • Hierarchy logic rewritten with proper tests

Test suite overhaul — We refactored the entire test suite from database setup/teardown to transaction-based testing. The old approach was getting slow as we added more indexes. The switch to transactions also pushed us to clean up parts of the codebase that weren’t following proper transaction patterns — so the tests got (a lot) faster and the code quality improved.

What’s Next

A few directions we’re exploring:

Mobile interface — The current UI is desktop-focused. Making it work on phones would open up casual contribution. The design was made with this support in mind, but the actual styles are still missing.

User and community stats — Pages showing your contribution history, how much you’ve evaluated, and how the community is doing overall. Leaderboards, personal stats, maybe completion tracking per country or parliament.

Government data pipelines — This one’s been cooking for a while.

The Government Data Vision

We can’t filter by position or parliament effectively right now. The position data is incomplete (that’s literally why this tool exists). We can filter by country and language because most politicians have citizenship data, but “show me all Members of the European Parliament” doesn’t work yet.

The path forward: government publications.

Parliaments publish lists of their members. If we maintain a curated list of these source pages — linked to specific positions and countries — we can build scrapers that fetch them regularly. Different formats (simple lists, index pages with detail pages, etc.), same outcome: complete data for specific positions.

This gives us:

  • Reliable filters — Filter by parliament or position group, get everyone
  • Recurrent engagement — New data shows up with a “NEW” tag when scrapers run
  • Completion incentives — Be the first to verify a new parliament’s data
  • Targeted enrichment — When someone filters by a scraped position, we trigger Wikipedia enrichment for exactly that set

The manual work upfront is linking source pages to Wikidata positions and countries. We’ll do this ourselves initially, but it could become something users help with.

More on this as it develops.

Try It Out

The tutorial is live. If you haven’t used PoliLoom before, it’s a good time to start. If you have, consider toggling advanced mode and going through the advanced tutorial.

Thanks for following along!

Devlog #11: The Quiet Sprint

Hey everyone! It’s been a while; I sort of forgot to post after the last release, as I was also working on getting the main site out! Anyway: three months, 111 commits, 158 files changed. Let’s catch up.

Meilisearch Replaces pgvector + Trigrams

The biggest change: we ripped out our PostgreSQL trigram indexes and our SentenceTransformers embedding pipeline and replaced both with Meilisearch.

Previously we had two separate search systems. Trigram indexes for name matching (“Merkl” → “Angela Merkel”) and pgvector with SentenceTransformers for semantic similarity during enrichment (“Mayor of Labastide-Murat” → the right Wikidata position). Two systems, two sets of problems.

Meilisearch gives us hybrid search out of the box: keyword matching and semantic search in one query, using OpenAI embeddings. All entities (politicians, positions, locations, countries) are now indexed into Meilisearch during import, and the enrichment pipeline queries it directly. Positions use a higher semantic ratio (0.8) for better fuzzy matching, while politician search leans more on keywords.
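In the Python client, a hybrid query looks roughly like this; the index name, embedder name, and filter field are assumptions about our setup, not Meilisearch defaults:

```python
import meilisearch

client = meilisearch.Client("http://localhost:7700", "masterKey")
index = client.index("entities")

def find_position(text: str) -> list[dict]:
    result = index.search(text, {
        # semanticRatio 0.8: lean on the OpenAI embeddings for fuzzy
        # position names, with keyword matching contributing the rest.
        "hybrid": {"semanticRatio": 0.8, "embedder": "openai"},
        "filter": "entity_type = 'position'",
        "limit": 5,
    })
    return result["hits"]

# "Mayor of Labastide-Murat" -> the right Wikidata position, hopefully first.
hits = find_position("Mayor of Labastide-Murat")
```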

This also let us unify positions with the rest of the wiki entities. Positions used to be special-cased everywhere. They had their own embedding column, their own search path, their own indexing logic. Now they’re just another entity type that gets indexed and searched the same way as everything else. A lot of code disappeared.

The result: faster search, better results, simpler architecture. We dropped the pgvector extension, removed the SentenceTransformers dependency, and deleted the embedding generation pipeline entirely.

Crawl4ai → Playwright

We replaced crawl4ai with raw Playwright for page fetching. Crawl4ai was causing Pydantic deprecation warnings and pulling in a lot of dependencies for what we actually needed: fetch a page, capture its full content.

Our new page_fetcher.py is ~100 lines. Playwright renders the page, we capture an MHTML snapshot via CDP, convert it to HTML with unmhtml, done. We control the user agent, timeouts, and error handling directly. No magic, no wrapper library.
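Condensed to its core, the fetcher does something like this. The Playwright and CDP calls are real; the unmhtml step is shown as an illustrative wrapper since the exact API is that library’s business, and the real file adds the timeout and error handling mentioned above:

```python
from playwright.sync_api import sync_playwright

def fetch_page(url: str) -> str:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(user_agent="PoliLoom/1.0")
        page.goto(url, timeout=30_000)
        # Page.captureSnapshot returns the fully rendered page as MHTML,
        # inline resources included.
        cdp = page.context.new_cdp_session(page)
        mhtml = cdp.send("Page.captureSnapshot", {"format": "mhtml"})["data"]
        browser.close()
    return mhtml_to_html(mhtml)

def mhtml_to_html(mhtml: str) -> str:
    """Illustrative wrapper around unmhtml's MHTML -> HTML conversion."""
    ...
```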

Stats Dashboard

We built a community stats page showing enrichment and evaluation coverage by country. Bar charts show how many politicians have been enriched and evaluated per country, with separate counts for each.

Dark Mode

Full dark mode with a stateful toggle that persists across sessions. New color system for inputs, accents, and highlighting that works across both themes.

Login & Onboarding Flow

There’s now a proper Wikidata account check (you need an account to push statements) before you land in the evaluation interface. Session management got a proper cleanup: graceful ending, reset handling, and a fix for a refresh-token expiration bug that was silently logging people out.

Religious Entity Filtering

Turns out our position and location hierarchies included a lot of religious entities. 21,818 bishops, dioceses showing up as birthplaces, that sort of thing. We now filter (even more :sweat_smile:) religious administrative positions and religious locations from the class tree during import.

Enrichment Pipeline

A few targeted improvements:

  • Timer-based enrichment for filterless politicians - Enrichment is normally triggered by user filters (country, language), which are all based on citizenship data. Politicians without citizenship were never enriched. A systemd timer now periodically picks up those politicians and enriches them anyway, closing that coverage gap.
  • Dump cleanup - Rigorous handling of orphaned records from failed dump downloads, with proper tests.
  • Better polling - The frontend now shows enrichment status and polls for completion, so users aren’t staring at empty screens.

What’s Next

Right now, PoliLoom generates all the data to evaluate. That’s useful, but limited. We want users to be able to contribute information and their own sources. We’d like to be able to drop a link and have the system extract and present that data for evaluation the same way it does with Wikipedia.

To support this, we’re building the interface around two complementary views: politician pages (/politicians/:QID) and source pages (/sources/:uuid). Same extracted data, same evaluation flow; one anchored on a person, the other on a page. A politician view shows all sources for one person. A source view shows all politician properties found in one page.

Every politician gets a permalink that works both inside evaluation sessions and can be linked to from everypolitician.org. The evaluation sessions become a navigation layer on top of these pages. We can use the existing “Advanced Mode” to enable users to create statements themselves manually in addition to discarding existing statements.

To simplify navigating this, we’ll be adding a dual-purpose input to the header: search for politicians, or paste a URL to archive.

That’s it for now! Hope you have a great day :smiley:

Oh and if you didn’t yet, check out everypolitician.org!


Devlog #12: Your Turn

Hey everyone! 135 commits, 195 files changed. Let’s get into it.

This release marks sort of a structural shift. The focus was on user agency. A lot changed under the hood to make the surface feel simple.

Politician Permalinks

The old monolithic /evaluate page is gone. Every politician now has their own page at /politician/:QID – shareable, bookmarkable, and linkable from everypolitician.org. The session flow that was buried inside /evaluate now has its own routes, but the big change is that every politician you visit has a permanent URL.

User-Submitted Sources

Until now, PoliLoom generated all the data. Wikipedia in, extracted claims out. That changes now.

You can now paste any URL into a politician’s page, and PoliLoom will fetch it, archive it, run the extraction pipeline, and present the results for evaluation. The same flow as Wikipedia, but with whatever source you want. Government gazette, parliament website, news article. Drop the link, we do the rest.

Multi-Source Properties

This is the big architectural change under the hood. Previously, each source was a one-shot extraction – if a property already existed, the prompts told the LLM to skip it. Now we extract everything from every source, and the system figures out what’s new and what’s a confirmation.

When the same fact shows up in multiple sources, they get linked together – one property, multiple sources, each with their own supporting quotes. A birth date confirmed by both Wikipedia and a government page carries more weight than one found in a single source, and when you push it out to Wikidata, it will get both sources added as references.
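The merge itself is conceptually simple; a sketch with a simplified model shape, not our actual code:

```python
def record_extraction(existing: list, extracted, source) -> None:
    """Attach a freshly extracted fact to a matching existing property."""
    for prop in existing:
        if (prop.property_id == extracted.property_id
                and prop.value == extracted.value):
            prop.sources.append(source)   # same fact, new source: a confirmation
            return
    extracted.sources = [source]          # genuinely new fact
    existing.append(extracted)
```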

Manual Creation

Accept and reject was the whole game. Not anymore.

In advanced mode, users can now create properties from scratch. Positions held, birthplace, citizenship, birth and death dates – the property goes straight to Wikidata together with your evaluations.

For positions, birthplaces and citizenships, the search is powered by Meilisearch, so you get the same hybrid keyword+semantic matching we use in enrichment.

And talking about search, there’s now a search bar in the header that lets you jump to any politician’s page. It also lets you create a new one, which actually creates a new Wikidata entity – sets it as human, marks it as a politician – and drops you straight into their page.

On-Demand Enrichment

Navigate to a politician who hasn’t been enriched yet – via the OmniBox or a direct link – and the system triggers enrichment automatically. Their Wikipedia sources get fetched and processed in the background, and thanks to SSE events, the page updates live as results come in. No more empty pages.

Ah yes, SSE: we added server-sent events for live updates. When a source finishes processing, when enrichment completes, when evaluation counts change – the UI updates immediately. No polling, no refresh.
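Server-side, an SSE endpoint is pleasantly small. A minimal sketch; FastAPI is an assumption about the stack, and in reality the queue is fed by the enrichment pipeline:

```python
import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()
events: asyncio.Queue = asyncio.Queue()   # filled elsewhere by the pipeline

@app.get("/events")
async def sse():
    async def stream():
        while True:
            event = await events.get()    # e.g. {"type": "source_processed", ...}
            yield f"data: {json.dumps(event)}\n\n"
    return StreamingResponse(stream(), media_type="text/event-stream")
```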

Cheaper, Better Prompts

We switched to a more modern model for extraction, and the prompts got simpler. We want to keep the LLM’s job as simple as possible to improve the quality of its responses. This means tackling in code what we can, and only bothering the language model with the rest.

For example, we detect existing consecutive terms that overlap with an extracted term, which helps when Wikidata already has higher-resolution data than our extraction. Turns out that determining whether LLM-extracted tenures are “similar enough” to what we already have is not easy. We’re getting there though.
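The overlap check itself is the easy half; a sketch that ignores Wikidata date precision, which is where the real difficulty lives:

```python
from datetime import date

def tenures_overlap(start_a: date, end_a: date | None,
                    start_b: date, end_b: date | None) -> bool:
    """True when two terms overlap; an open end date means still in office.

    So an extracted 2005-2009 term matches an existing
    2005-11-22 to 2009-10-27 one instead of duplicating it.
    """
    return ((end_a is None or start_b <= end_a)
            and (end_b is None or start_a <= end_b))
```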

That’s it for now! Thanks for reading :slight_smile:
