How It's Built

OpenScripture is not only a Bible reader. It is a data project that connects published translations, source-language morphology, manuscript signals, commentary, cross-references, and personal study choices into one careful reading experience.

The most significant work is happening beneath the surface: building translation-specific interlinear data so every supported Bible can eventually carry native original-language word data and trustworthy Word Lock behavior.

OpenScripture data path

Evidence to alignment to reader

1. Source data

Hebrew, Aramaic, Greek, English, notes, canons, and licensing terms all enter the system as separate evidence streams.

2. Alignment graph

OpenScripture turns those streams into reviewed links between source-language tokens and each translation's actual English wording.

3. Reader signals

The app then uses those links for interlinear study, Word Locks, translation differences, certainty signals, and AI translation context.

Why This Matters

Many Bible apps make digital text fast and searchable. OpenScripture is aiming at a different layer of accessibility: helping readers see why translations differ, how English words connect back to the source languages, and where manuscript evidence is more or less settled.

That means treating technical infrastructure as part of the mission. If the data is careful, the reader can be simple. A tap can open the original-language word. A small marker can explain a real translation difference. A personal Word Lock can help someone learn recurring vocabulary while reading naturally.

Feature by Feature

Each feature has a visible reader experience and a less visible data problem underneath it. The work is to make the hidden layer rigorous enough that the visible layer can feel calm, quick, and trustworthy.

The big data layer

Translation-specific interlinear data

How it is built

OpenScripture is building its own word-level alignment layer for every supported translation. The system starts with source-language morphology and Strong's-linked source data, then maps each source token to the actual English token or phrase used by a translation. Reviewed rows become runtime gold data; generated rows stay labelled as approximate until they pass audit.

Technical complications

Bible translation is not a neat word-for-word spreadsheet. One Hebrew or Greek word can become a phrase. English can add helper words that are not separate source tokens. Word order changes. Some translations follow different textual bases or versification systems. The alignment model has to preserve direct word anchors, phrase groups, supplied English, and untranslated source words without pretending the problem is simpler than it is.
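One way to picture the four cases named above is a small row type with an explicit link kind. This is a minimal sketch, not OpenScripture's actual schema: the names `LinkKind`, `ReviewStatus`, `AlignmentRow`, and `validate` are hypothetical, and the token positions in the usage lines are illustrative only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class LinkKind(Enum):
    DIRECT = "direct"              # a source token anchors specific English token(s)
    PHRASE = "phrase"              # a source token is rendered by an English phrase group
    SUPPLIED = "supplied"          # English helper words with no separate source token
    UNTRANSLATED = "untranslated"  # a source token with no English counterpart


class ReviewStatus(Enum):
    APPROXIMATE = "approximate"    # generated, not yet audited
    REVIEWED = "reviewed"          # passed audit


@dataclass(frozen=True)
class AlignmentRow:
    verse: str
    source_positions: Tuple[int, ...]   # token indexes in the source text
    english_positions: Tuple[int, ...]  # token indexes in this translation
    kind: LinkKind
    strongs: Optional[str] = None
    status: ReviewStatus = ReviewStatus.APPROXIMATE


def validate(row: AlignmentRow) -> List[str]:
    """Return structural problems; an empty list means the row is well-formed."""
    problems = []
    if row.kind is LinkKind.SUPPLIED and row.source_positions:
        problems.append("supplied English must not claim a source token")
    if row.kind is LinkKind.UNTRANSLATED and row.english_positions:
        problems.append("untranslated source must not claim English tokens")
    if row.kind in (LinkKind.DIRECT, LinkKind.PHRASE):
        if not row.source_positions or not row.english_positions:
            problems.append("direct and phrase links need both sides")
    return problems


# Illustrative rows for Genesis 1:1 (positions are assumptions for the example).
phrase = AlignmentRow("GEN 1:1", (0,), (0, 1, 2), LinkKind.PHRASE, "H7225")
marker = AlignmentRow("GEN 1:1", (3,), (), LinkKind.UNTRANSLATED, "H853")
print(validate(phrase), validate(marker))
```

Keeping supplied English and untranslated source words as their own kinds means neither side has to be forced into a false one-to-one link.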

Desired result

The desired result is native original-language word data beneath the reader, not a generic gloss pasted under every translation. A reader should be able to tap a word in the English text, see the matching Hebrew, Aramaic, or Greek data, switch interlinear bases, and trust that the app is describing that translation rather than a nearby approximation.

Learning by reading

Word Locks and personalised Bibles

How it is built

Word Locks are keyed to source-language identity, normally a Strong's number plus the aligned source token. In Composite mode, a reader can choose a preferred rendering for a word, and the app applies it wherever the alignment is suitable. Verse Locks and Word Locks share the same personalisation model, with a clear rule for which one wins when both touch the same verse.

Technical complications

This only works if the alignment is compact enough for substitution. If one source token accidentally owns a whole English clause, a Word Lock would damage the sentence. The pipeline therefore uses a substitution test: replacing one direct anchor should leave the surrounding English intelligible. Publisher policies also matter, so the write path itself has to enforce translation-specific restrictions rather than rely on the interface hiding the button.
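The compactness requirement can be sketched as a guard in the substitution step. The following is an illustration under stated assumptions, not the real pipeline: `Anchor`, `apply_word_locks`, and the `MAX_LOCK_SPAN` limit are hypothetical, anchors are assumed not to overlap, and the example tokens and lock renderings are invented for the demonstration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Anchor:
    strongs: str   # source-language identity the lock is keyed to
    start: int     # first English token index (inclusive)
    end: int       # last English token index (exclusive)
    direct: bool   # True only for compact direct anchors


MAX_LOCK_SPAN = 2  # assumed limit: longer spans are treated as unsafe to replace


def apply_word_locks(tokens: List[str], anchors: List[Anchor],
                     locks: Dict[str, str]) -> List[str]:
    """Replace anchored English tokens with the reader's preferred rendering."""
    out = list(tokens)
    # Work right to left so earlier indexes stay valid after each replacement.
    for anchor in sorted(anchors, key=lambda a: a.start, reverse=True):
        rendering = locks.get(anchor.strongs)
        if rendering is None:
            continue
        span = anchor.end - anchor.start
        if not anchor.direct or span < 1 or span > MAX_LOCK_SPAN:
            continue  # a clause-sized or indirect anchor would damage the sentence
        out[anchor.start:anchor.end] = [rendering]
    return out


tokens = ["The", "LORD", "is", "my", "shepherd"]
anchors = [Anchor("H3068", 0, 2, True), Anchor("H7462", 2, 5, False)]
locks = {"H3068": "Yahweh", "H7462": "pastor"}
print(" ".join(apply_word_locks(tokens, anchors, locks)))
```

In this example the first lock is applied because its anchor is direct and short, while the second is skipped because its anchor covers a whole clause.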

Desired result

A reader can build vocabulary in context. Instead of studying a word once in a separate lexicon, they can see that word reappear across Scripture with their chosen rendering, while the rest of the verse remains connected to published translation text.

Why translations disagree

Translation difference symbols

How it is built

OpenScripture stores precomputed divergence data by verse. Entries are classified by what kind of difference is present: source-text or canon split, theological or interpretive rendering, or translation philosophy. The reader sees circled markers in context, and the drawer explains the difference with the relevant renderings grouped by tradition.

Technical complications

The hard part is judgment. A visible wording difference is not automatically a meaningful disagreement. The pipeline has to avoid inflating ordinary style differences into manuscript issues, avoid hiding important textual variants, keep explanations short, and store only phrase-level renderings so copyright boundaries stay respected.
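A divergence entry with these constraints might look like the sketch below. The class names, the `MAX_PHRASE_WORDS` cap, and the tradition labels are assumptions made for illustration; the source does not state how the phrase-level limit is actually enforced.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class DivergenceKind(Enum):
    SOURCE_TEXT_OR_CANON = "source-text or canon split"
    INTERPRETIVE = "theological or interpretive rendering"
    PHILOSOPHY = "translation philosophy"


MAX_PHRASE_WORDS = 8  # assumed cap that keeps stored renderings phrase-level


@dataclass
class Divergence:
    verse: str
    kind: DivergenceKind
    explanation: str
    # Renderings grouped by tradition label (labels here are hypothetical).
    renderings: Dict[str, List[str]] = field(default_factory=dict)

    def add_rendering(self, tradition: str, phrase: str) -> None:
        """Store a short phrase; refuse anything long enough to be a full verse."""
        if len(phrase.split()) > MAX_PHRASE_WORDS:
            raise ValueError("rendering is longer than phrase level")
        self.renderings.setdefault(tradition, []).append(phrase)


entry = Divergence("ISA 7:14", DivergenceKind.INTERPRETIVE,
                   "Translations differ on how to render one Hebrew noun.")
entry.add_rendering("group-a", "the virgin")
entry.add_rendering("group-b", "the young woman")
print(entry.kind.value, entry.renderings)
```

Making the category a required field forces each entry to say what kind of difference it is, which is one way to keep ordinary style differences from being presented as manuscript issues.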

Desired result

Readers get a small signal exactly where it helps: this verse is translated differently, and here is why. The goal is not to push a preferred wording, but to help people notice the scholarly landscape behind familiar English phrases.

Manuscript context without overload

Textual certainty signals

How it is built

Textual certainty data is stored sparsely at word or passage level. The pipeline can draw candidates from documented reference variants, translation editorial brackets, and SBLGNT/MorphGNT signals, then attach scores and reasons to morphology word positions. The reader setting decides how strongly those signals appear.

Technical complications

Textual certainty is adjacent to translation disagreement, but it is not the same thing. A translation can differ because of style even when the source text is stable, or because the underlying manuscript reading is genuinely contested. The app keeps those signals separate. It also has to respect licensing limits around critical apparatus material, storing only what OpenScripture is allowed to store.
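Sparse storage plus a reader-controlled threshold can be sketched in a few lines. Everything named here is hypothetical: the 0.0 to 1.0 score scale, the setting names, the threshold values, and the example scores and word positions are assumptions for illustration, not OpenScripture's data.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

WordKey = Tuple[str, int]  # (verse reference, morphology word position)


@dataclass(frozen=True)
class CertaintySignal:
    score: float               # assumed scale: 0.0 heavily debated, 1.0 stable
    reasons: Tuple[str, ...]   # why this word or passage was flagged


# Assumed reader settings: a signal is shown when its score is below the limit.
THRESHOLDS = {"off": 0.0, "major": 0.5, "detailed": 0.8}


def visible_signals(signals: Dict[WordKey, CertaintySignal],
                    setting: str) -> Dict[WordKey, CertaintySignal]:
    """Return only the signals the reader's setting asks to see."""
    limit = THRESHOLDS[setting]
    return {key: sig for key, sig in signals.items() if sig.score < limit}


# Sparse store: words without an entry are simply treated as stable.
signals = {
    ("JHN 7:53", 0): CertaintySignal(0.3, ("translation editorial brackets",)),
    ("MRK 1:1", 6): CertaintySignal(0.7, ("documented reference variant",)),
}
for setting in ("off", "major", "detailed"):
    print(setting, sorted(visible_signals(signals, setting)))
```

Because these signals live in their own store, a verse can carry a translation difference marker without any certainty entry, and the reverse.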

Desired result

Stable readings stay quiet. More debated readings can be marked when the reader wants that level of detail. The result is a Bible reader that can surface manuscript uncertainty without turning every chapter into a specialist apparatus.

Experimental, labelled, and source-aware

AI Translation mode

How it is built

The AI Translation mode produces multiple style and emphasis combinations, using source-language morphology and permitted source material rather than simply paraphrasing a copyrighted English translation. Generated text carries confidence and decision metadata. Where alignment payloads exist, AI-generated text is expected to follow the same word-data contract as the rest of the reader.

Technical complications

AI output is only useful if it is labelled honestly and kept inside a disciplined data model. The pipeline needs provenance, regeneration triggers, source-word context, and clear separation from publisher-authored Bible text. It also needs to avoid treating a fluent model sentence as automatically aligned or authoritative.
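Provenance and regeneration triggers could be modelled by storing a fingerprint of the inputs beside each generated verse. This is a sketch under assumptions: `GeneratedVerse`, `fingerprint`, `needs_regeneration`, the model identifiers, and the choice of SHA-256 over a JSON payload are all illustrative rather than a description of the real system.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GeneratedVerse:
    verse: str
    style: str
    text: str
    confidence: float
    input_fingerprint: str       # hash of the inputs this text was generated from
    origin: str = "ai-generated"  # never stored as publisher-authored text


def fingerprint(source_tokens: List[str], style: str, model_id: str) -> str:
    """Hash the inputs so a change in any of them can be detected later."""
    payload = json.dumps(
        {"tokens": source_tokens, "style": style, "model": model_id},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_regeneration(stored: GeneratedVerse, source_tokens: List[str],
                       model_id: str) -> bool:
    """True when the source-word context or the model has changed."""
    return stored.input_fingerprint != fingerprint(source_tokens, stored.style, model_id)


tokens = ["G1722", "G746"]
stored = GeneratedVerse("JHN 1:1", "literal", "In the beginning", 0.8,
                        fingerprint(tokens, "literal", "model-a"))
print(needs_regeneration(stored, tokens, "model-a"))
print(needs_regeneration(stored, tokens + ["G3056"], "model-a"))
```

The fixed `origin` field is one simple way to keep generated text distinguishable from publisher text wherever the record travels.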

Desired result

Readers can explore how a passage might be rendered under different translation goals while still seeing where published translations, source-language data, and divergence signals provide firmer ground.

The quiet infrastructure

Translation ingestion, notes, and canons

How it is built

Each new translation has to pass through licensing, metadata, verse text ingestion, publisher notes, copyright notices, reader visibility, search, comparison, and word-data checks. Where a publisher provides notes, introductions, commentary, or cross-references, those sources are normalized so the drawer can show the right material for the verse the reader tapped.

Technical complications

Publishers deliver data in different formats. Word documents, USFM, JSON, public-domain files, study notes, cross-reference lists, and commentary all behave differently. Versification can differ. Canon scope can differ. Formatting such as italics, bold, paragraphing, quotation layout, and note anchors is part of the meaning, so the parser cannot simply flatten everything into plain text.
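To show what "not flattening" can mean in practice, here is a minimal sketch that turns a few USFM character markers into styled spans instead of discarding them. It handles only `\add`, `\it`, and `\bd` with no nesting, the style names are assumptions, and `parse_inline` is a hypothetical helper; a real USFM parser has to cover far more of the specification.

```python
import re
from typing import List, Tuple

# Character markers handled by this sketch; real USFM defines many more.
CHAR_MARKERS = {"add": "supplied", "it": "italic", "bd": "bold"}

# An opening marker is followed by one delimiter space; a closing marker ends in "*".
MARKER = re.compile(r"\\(add|it|bd)(\*| )?")


def parse_inline(usfm: str) -> List[Tuple[str, str]]:
    """Split verse text into (text, style) spans, keeping formatting as data."""
    spans: List[Tuple[str, str]] = []
    style = "plain"
    pos = 0
    for match in MARKER.finditer(usfm):
        text = usfm[pos:match.start()]
        if text:
            spans.append((text, style))
        # A closing marker returns to plain text; an opening marker sets the style.
        style = "plain" if match.group(2) == "*" else CHAR_MARKERS[match.group(1)]
        pos = match.end()
    if usfm[pos:]:
        spans.append((usfm[pos:], style))
    return spans


line = r"And God saw the light, that \add it was\add* good"
for text, style in parse_inline(line):
    print(repr(text), style)
```

Keeping the `supplied` span matters downstream, because translator-supplied words are exactly the English tokens that should not be linked to a source token.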

Desired result

The desired result is a broad, respectful reader across Protestant, Catholic, Orthodox, Ecumenical, Jewish, and Independent traditions, with each translation shown on its own terms and connected to the same study surfaces where the data allows.

The Pipeline Pattern

The same discipline shows up across translation ingestion, word alignment, divergence explanations, and certainty data. OpenScripture tries to keep the original source, the generated candidate, the review status, and the reader-facing claim separate until the evidence is strong enough.

  1. Ingest the publisher text and notes from the most authoritative available source.

  2. Normalize books, chapters, verses, notes, formatting, copyright terms, and reader visibility.

  3. Attach morphology, Strong's data, source tokens, and translation-specific English token positions.

  4. Generate candidate alignments and divergence explanations, then audit for semantic honesty.

  5. Promote reviewed data into the runtime tables that power the reader, drawer, interlinear view, and lock system.

  6. Keep uncertainty visible: approximate alignment stays approximate, reviewed data earns stronger language, and gaps remain labelled rather than hidden.
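The last two steps can be sketched as a split by review status plus a labelling rule. This is an illustration only: `Status`, `split_for_runtime`, `reader_label`, the row identifiers, and the label wording are hypothetical names chosen for the example.

```python
from enum import Enum
from typing import Iterable, List, Optional, Tuple


class Status(Enum):
    CANDIDATE = "candidate"  # generated, not yet audited
    REVIEWED = "reviewed"    # passed audit


def split_for_runtime(rows: Iterable[Tuple[str, Status]]) -> Tuple[List[str], List[str]]:
    """Separate reviewed rows from rows that must stay labelled approximate."""
    gold: List[str] = []
    approximate: List[str] = []
    for row_id, status in rows:
        (gold if status is Status.REVIEWED else approximate).append(row_id)
    return gold, approximate


def reader_label(status: Optional[Status]) -> str:
    """Choose reader-facing wording that never claims more than the evidence."""
    if status is None:
        return "no alignment data"  # a gap stays labelled rather than hidden
    if status is Status.REVIEWED:
        return "reviewed"
    return "approximate"


rows = [("row-1", Status.REVIEWED), ("row-2", Status.CANDIDATE), ("row-3", Status.REVIEWED)]
gold, approximate = split_for_runtime(rows)
print(gold, approximate)
print(reader_label(Status.CANDIDATE), "/", reader_label(None))
```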

A Contribution to Digital Bible Accessibility

The broader digital Bible accessibility movement is not only about putting more Bible text online. It is about making the depth behind the text easier to reach: original languages, translation philosophy, textual history, commentary, cross-references, and personal study patterns.

OpenScripture contributes by building a reader where those layers can be available without overwhelming the page. The work is slow because the details matter, but the payoff is a Bible experience that can become more open, more transparent, and more useful with each data layer added.

  • Make deep study tools understandable for ordinary readers, not only specialists.
  • Let many translation traditions sit beside each other without flattening their differences.
  • Expose the data limits honestly so digital convenience does not become false certainty.
  • Use modern software, careful licensing, and reviewable pipelines to make Scripture study more accessible over time.

Built for Readers, Designed for Evidence

The app should feel simple, but the simplicity is earned by the pipeline beneath it: careful ingestion, honest alignment, reviewable automation, and visible uncertainty where certainty would be dishonest.