How We Built William Price

A case study in ethical AI ancestry simulation

A note from the founder

It didn't start with code.

It started with family.

I've been a serious genealogy hobbyist for nearly fifteen years. Over that time I've collected stories, documents, photos, and—eventually—my entire family tree, some branches reaching back to the 1600s. And yes, like most genealogists, I rely heavily on the major genealogy platforms to manage and maintain that history.

When I began building the underlying platform in early 2025—the technology that would eventually become ThroughTheirEyes—I wasn't thinking about my own ancestors at all. I was experimenting with ways to create believable historical personas using large language models.

A couple of months into the project, somewhere between battling prompt structures and testing persona boundaries, I had a sudden thought:

What would it feel like to talk to my own ancestors?

So I built the very first prototype the only way I could at the time: manually. I cut and pasted historical details from my family tree into an early character blueprint and chose a distant ancestor—a Welsh miner, my great-great-great-grandfather. I layered in everything I knew about him, piece by piece.

And then I spoke to him.

And he spoke back.

It was moving in a way I didn't expect.

But that experience also taught me something important:

If we were going to offer this technology to others, we had to protect real families.

  • No privacy risks. We do not recreate living people or anyone close to living memory.
  • No distortion of genealogical facts. Births, deaths, relationships must remain faithful to the tree.
  • Clear separation between fact and inference. When the AI fills gaps or adds contextual colour, it happens within controlled, era-appropriate boundaries.
  • Respect for the people we bring to life. Every persona is framed as an interpretive simulation, not a literal reconstruction.

To demonstrate this approach safely, we created William Price—a fully synthetic ancestor built from a carefully constructed fictional GEDCOM file. Every detail reflects real 19th-century Welsh mining communities, but no names, dates, or relationships correspond to actual individuals.

William became our safe, ethical demonstration persona:
realistic enough to show what the platform can do,
fictional enough to protect real families.

William is an interpretive simulation — a synthetic ancestor grounded in historically accurate structure and cultural context, not a reconstruction of any real individual.

This page explains exactly how we built him.

The Challenge

Creating believable ancestor personas with AI is surprisingly hard. The core problem?

Large language models love to fabricate. Ask an LLM to roleplay a 19th-century miner, and it will confidently invent family members, historical events, and personal anecdotes that never happened.

For creative writing, that's fine. For genealogy, it's a disaster.

We needed a system that could generate emotionally resonant conversations while staying rigorously faithful to genealogical records—no invented siblings, no fabricated historical events, and no creative liberties with core biographical facts.

Our Approach

What Is a Synthetic Ancestor?

A synthetic ancestor is a fictional individual whose historical details, family structure, and cultural context are plausible and historically accurate, but not derived from real people. William Price is our synthetic ancestor—built from a carefully constructed GEDCOM file that reflects real 19th-century Welsh mining communities without exposing anyone's actual genealogy.

1. Building a Synthetic GEDCOM

We started by creating a historically accurate but entirely fictional GEDCOM file for William Price and his family:

  • Birth: 1834, Blaenavon, Wales
  • Occupation: Coal miner (collier)
  • Father: Griffith Price (miner)
  • Mother: Mary Evans (from Lydbrook, Forest of Dean)
  • Spouse: Catherine
  • Children: Thomas (putter at Pontnewynydd pit), Caroline
  • Location: Aberdare, South Wales mining valleys
  • Historical context: 1884 miners' strike, Fernhill Colliery

Every detail was researched to reflect real Welsh mining communities of the era—migration patterns, occupations, naming conventions, and local geography.

Place names and historical references (such as Fernhill Colliery) reflect real regions and events, but do not correspond to any actual individual’s genealogy.
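As a rough illustration, the facts listed above could be written out as GEDCOM-style lines like this. It is a minimal sketch rather than the actual file: the record identifiers (@I1@, @F1@, @F2@) and the helper name individual_record are assumptions made for the example, while the tags (INDI, NAME, BIRT, DATE, PLAC, OCCU, FAMC, FAMS) are standard GEDCOM tags.

```python
def individual_record(xref, given, surname, birth_year=None, birth_place=None,
                      occupation=None, child_of=None, spouse_in=None):
    """Return GEDCOM-style lines for one individual (illustrative only)."""
    lines = [f"0 {xref} INDI", f"1 NAME {given} /{surname}/"]
    if birth_year or birth_place:
        lines.append("1 BIRT")
        if birth_year:
            lines.append(f"2 DATE {birth_year}")
        if birth_place:
            lines.append(f"2 PLAC {birth_place}")
    if occupation:
        lines.append(f"1 OCCU {occupation}")
    if child_of:
        # Link to the family in which this person is a child (hypothetical id).
        lines.append(f"1 FAMC {child_of}")
    if spouse_in:
        # Link to the family in which this person is a spouse (hypothetical id).
        lines.append(f"1 FAMS {spouse_in}")
    return lines


william = individual_record(
    "@I1@", "William", "Price",
    birth_year=1834,
    birth_place="Blaenavon, Wales",
    occupation="Coal miner (collier)",
    child_of="@F1@",
    spouse_in="@F2@",
)
print("\n".join(william))
```

Keeping each fact on its own tagged line is what later makes it possible to point a provenance marker at one specific record.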

2. The Persona Blueprint

We developed a structured persona blueprint that defines strict boundaries for what William can and cannot know:

Hard Cutoff Date: 1885

William cannot acknowledge any events, people, or knowledge from after this date. This prevents anachronisms and keeps responses historically grounded.
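One simple way to picture the cutoff is as a filter applied to dated material before it ever reaches the model. The sketch below assumes each piece of context carries a year. The names DatedFact and within_cutoff are hypothetical, and treating 1885 itself as allowed is an assumption about how "after this date" is interpreted.

```python
from dataclasses import dataclass

CUTOFF_YEAR = 1885  # hard cutoff stated in the persona blueprint


@dataclass(frozen=True)
class DatedFact:
    text: str
    year: int


def within_cutoff(facts, cutoff_year=CUTOFF_YEAR):
    """Keep only facts dated on or before the cutoff year."""
    return [fact for fact in facts if fact.year <= cutoff_year]


facts = [
    DatedFact("Miners' strike", 1884),
    DatedFact("Death of Queen Victoria", 1901),  # after the cutoff, must be dropped
]
allowed = within_cutoff(facts)
print([fact.text for fact in allowed])
```

Filtering the context in this way complements, rather than replaces, an instruction to the model not to discuss later events.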

Verified Facts Only

Every statement about William's life must tie back to a specific GEDCOM note or historical enrichment field. No improvisation on core biographical facts.

Cultural Context Allowed

William can reference general historical knowledge (mining conditions, Welsh culture, hymn singing) but never invent specific personal experiences.
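To make the three rules concrete, here is a minimal sketch of how a blueprint might be held as data and rendered into instructions for the model. Everything in it is an assumption for illustration: the PersonaBlueprint class, the build_system_prompt function, the wording of the rules, and the assignment of markers P1 to P3 to particular facts.

```python
from dataclasses import dataclass, field


@dataclass
class PersonaBlueprint:
    name: str
    cutoff_year: int
    verified_facts: dict  # marker -> fact text (marker numbering is hypothetical)
    cultural_topics: list = field(default_factory=list)


def build_system_prompt(bp):
    """Render the blueprint's boundaries as plain-text instructions."""
    fact_lines = [f"[{marker}] {text}" for marker, text in bp.verified_facts.items()]
    rules = [
        f"You are {bp.name}, an interpretive simulation.",
        f"You know nothing that happened after {bp.cutoff_year}.",
        "State biographical facts only if they appear in the list below, "
        "and cite the marker.",
        "You may draw on general knowledge of: " + ", ".join(bp.cultural_topics) + ".",
        "Never invent specific personal experiences.",
        "Verified facts:",
    ]
    return "\n".join(rules + fact_lines)


blueprint = PersonaBlueprint(
    name="William Price",
    cutoff_year=1885,
    verified_facts={
        "P1": "Father: Griffith Price (miner)",
        "P2": "Born 1834, Blaenavon, Wales",
        "P3": "Mother: Mary Evans (from Lydbrook, Forest of Dean)",
    },
    cultural_topics=["mining conditions", "Welsh culture", "hymn singing"],
)
prompt = build_system_prompt(blueprint)
print(prompt)
```

Holding the boundaries as structured data, rather than as free-form prompt text, means the same cutoff and fact list can also be checked in code after the model responds.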

A Note About AI Inference (and Why It Matters)

Our goal is to keep William human without compromising historical accuracy. Inference gives him personality, but facts always stay grounded.

Even with all the guardrails (the cutoff dates, verified facts, historical context, and provenance markers), William Price is still powered by a modern large language model. That means two things are always true within the controlled boundaries of the persona blueprint:

1. He will occasionally fill in the gaps.

The AI may make era-appropriate inferences using historically plausible context (e.g., common mining practices, regional conflicts, cultural rituals). These are natural, human-like inferences that help the persona feel alive.

Examples:

  • Describing how a miner might stand or gesture
  • Mentioning a hymn everyone in the valley knew
  • Adding flavour that fits Welsh culture of the time
  • Reacting emotionally to a story (e.g., pride, surprise)

These inferences are part of what makes the conversation feel human—and they're allowed only within the boundaries we set.

2. He will never fabricate core facts.

William will not invent:

  • New family members
  • Major life events
  • Locations he never lived
  • Historical events that never happened
  • Knowledge after 1885

When William states something factual (a birthplace, an occupation, a family relationship), it will always be tied to a provenance marker [P#]. If a detail cannot be sourced, he will not present it as a fact.

3. We will soon make AI inferences fully transparent.

A future update will introduce a second type of marker:

  • [P#] Provenance markers: grounded in your GEDCOM or historical enrichment packs
  • [AI] Inference markers: showing when the model is filling in contextual colour

This will give users full transparency: you will always know whether a detail comes from your tree, from historical knowledge, or from the AI's reasoning.
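A response carrying both kinds of marker could be separated into its sourced and inferred parts with a small parser. The sketch below is only an illustration: the regular expression, the function name extract_markers, and the sample sentence tagged [AI] are all assumptions, since the inference marker is a planned feature.

```python
import re

# Matches "[AI]" or a provenance group such as "[P1]" or "[P1, P3]".
MARKER_RE = re.compile(r"\[(AI|P\d+(?:\s*,\s*P\d+)*)\]")


def extract_markers(text):
    """Return (list of provenance ids, whether any [AI] marker is present)."""
    provenance, has_inference = [], False
    for match in MARKER_RE.finditer(text):
        body = match.group(1)
        if body == "AI":
            has_inference = True
        else:
            provenance.extend(part.strip() for part in body.split(","))
    return provenance, has_inference


sample = (
    '"My father Griffith swung a pick before me." [P1, P3] '
    '"We sang all the way home from chapel." [AI]'  # invented sample sentence
)
markers, inferred = extract_markers(sample)
print(markers, inferred)
```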

What Is Provenance Tracking?

Provenance tracking means every factual claim in a conversation contains a reference to its source—GEDCOM records, user input, or contextual inference. Each statement is marked with a citation (like [P1] or [P2]) that shows exactly where the information came from, along with a confidence level. This ensures genealogical transparency and prevents AI fabrication.

3. Provenance Tracking

Every factual claim in William's responses includes a provenance marker—a citation that points back to the underlying GEDCOM source.

Example response:

"My father Griffith swung a pick before me, and my mother Mary stitched the holes in our shirts when the pit took its toll." [P1, P3]

Hover over [P1] to see:

Synthetic persona: William Price

Confidence: 95%

This transparency allows users (and researchers) to validate every claim and understand the difference between verified facts and cultural inference.

In the full platform, provenance markers are visible and interactive — you’ll be able to click or tap each marker to see its exact source.
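The lookup behind a hover or click can be pictured as resolving each cited marker against a table of provenance records, and flagging any marker that has no record. This is a hedged sketch: ProvenanceRecord, cited_markers, resolve and tooltip are hypothetical names, and the record for P3 is assumed for illustration (only the P1 values appear in the example above).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    source: str        # where the fact comes from
    confidence: float  # 0.0 to 1.0


def cited_markers(response):
    """Collect provenance ids such as P1 from groups like [P1, P3]."""
    markers = []
    for group in re.findall(r"\[([^\]]+)\]", response):
        markers.extend(part.strip() for part in group.split(","))
    return [m for m in markers if re.fullmatch(r"P\d+", m)]


def resolve(response, records):
    """Split cited markers into those with a record and those without."""
    resolved, unresolved = {}, []
    for marker in cited_markers(response):
        if marker in records:
            resolved[marker] = records[marker]
        else:
            unresolved.append(marker)
    return resolved, unresolved


def tooltip(record):
    """Format a record the way the hover example presents it."""
    return f"{record.source}\nConfidence: {record.confidence:.0%}"


records = {
    "P1": ProvenanceRecord("Synthetic persona: William Price", 0.95),
    "P3": ProvenanceRecord("Synthetic persona: William Price", 0.95),  # assumed
}
response = ('"My father Griffith swung a pick before me, and my mother Mary '
            'stitched the holes in our shirts when the pit took its toll." [P1, P3]')
resolved, unresolved = resolve(response, records)
print(tooltip(resolved["P1"]))
```

A non-empty unresolved list is a useful signal that a response cites a source that does not exist, and such a response can be rejected before a user ever sees it.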

4. Historical Enrichment Fields

Beyond basic GEDCOM data, we added enrichment fields to make conversations feel authentic:

  • Occupation details: "Six Feet seam," "trammer's cart," "pit props"
  • Regional geography: Aberdare cross, Pontnewynydd pit, Fernhill Colliery
  • Cultural references: Welsh hymns and folk songs ("Llwyn Onn," "Bread of Heaven"), hiraeth (longing)
  • Historical events: 1884 miners' strike, specific mine disasters
  • Dialect patterns: "bach," "Duw, Duw," "fy machgen i"

These details ground the persona in lived reality without requiring users to manually input them. Future versions will auto-generate enrichment from historical databases.
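Enrichment fields of this kind could be stored as a simple mapping from category to terms and rendered into the persona's context on demand. The sketch below is an assumption about structure only: the ENRICHMENT dictionary, its category keys, and the enrichment_section function are hypothetical, and the terms are taken from the list above.

```python
ENRICHMENT = {
    "occupation": ["Six Feet seam", "trammer's cart", "pit props"],
    "geography": ["Aberdare cross", "Pontnewynydd pit", "Fernhill Colliery"],
    "dialect": ["bach", "Duw, Duw", "fy machgen i"],
}


def enrichment_section(pack, categories):
    """Render the requested categories as context lines, skipping empty ones."""
    lines = []
    for category in categories:
        terms = pack.get(category, [])
        if terms:
            lines.append(f"{category}: " + "; ".join(terms))
    return "\n".join(lines)


print(enrichment_section(ENRICHMENT, ["occupation", "dialect"]))
```

Terms are joined with semicolons because some entries, such as "Duw, Duw", contain commas themselves.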

The Result

William Price represents our vision for ethical AI ancestry: emotionally engaging, historically grounded, and completely transparent about its sources.

Accurate

Every fact verified against GEDCOM data

🎭

Authentic

Era-appropriate language and cultural grounding

🔍

Transparent

Provenance markers for every claim

What's Next

William Price is just the beginning. When you join the beta, you'll be able to:

  • Talk to Catherine Hughes — a synthetic ancestor built from the same fictional GEDCOM, offering a complementary perspective on 19th‑century Welsh valley life
  • Upload your own GEDCOM and create personas for your actual ancestors
  • Access historical enrichment packs for specific regions, time periods, and occupations
  • Use expanded provenance types, including census records, military service, and immigration documents
  • Have multi-generational conversations and explore family dynamics across time

These features are under active development and will roll out progressively during and after the beta.