Precision Infrastructure for Bio-Data.

The evidence substrate for living systems. Every biological fact — lab result, AI prediction, researcher hypothesis — structured, provenance-tagged, and cross-species queryable.

Evidence Layer · James, 58
Source Fact: 196 Blood Panels
Legend: Source Fact · Derived · Normalized · Model Output

James, 58 — 847 evidence assertions visualized across body systems

The Atomic Unit

Every biological fact is an assertion. Every assertion knows its origin.

Not a row in a table. Not a PDF attachment. An EvidenceAssertion — typed, timestamped, traceable. It knows what kind of claim it is, where it came from, how confident you should be, and what it connects to.

source_fact · VERIFIED

EGFR_EXON_19_DELETION

James, 58 — Homo sapiens · TAX_9606

Mutation Type: Deletion
Exon: 19
VAF: 47.3%
Coverage: 2,847×
Source: Illumina NovaSeq 6000
Provenance: Foundation Medicine · Liquid Biopsy
Collected: 2024-03-14 · Status: Current
  • → 4 derived_features
  • → 2 model_outputs
  • → 1 hypothesis
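As a rough illustration, the card above could be modeled as a small typed record. This is a minimal sketch, not the OpenBio schema: the class layout and every field name (`kind`, `attributes`, `downstream`, and so on) are assumptions made for the example, and only the values come from the card.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class EvidenceAssertion:
    """One typed, timestamped, traceable claim (hypothetical field names)."""
    kind: str          # what kind of claim it is, e.g. "source_fact"
    label: str         # what is being asserted
    subject: str       # who or what the claim is about
    taxon: str         # organism identifier as printed on the card
    source: str        # instrument that produced the measurement
    provenance: str    # who reported the result
    collected: date    # when the sample was collected
    attributes: dict = field(default_factory=dict)  # claim-specific values
    downstream: dict = field(default_factory=dict)  # counts of linked assertions


# Values copied from the EGFR card; the structure around them is assumed.
egfr = EvidenceAssertion(
    kind="source_fact",
    label="EGFR_EXON_19_DELETION",
    subject="James, 58",
    taxon="TAX_9606",
    source="Illumina NovaSeq 6000",
    provenance="Foundation Medicine · Liquid Biopsy",
    collected=date(2024, 3, 14),
    attributes={"mutation_type": "Deletion", "exon": 19,
                "vaf_percent": 47.3, "coverage": 2847},
    downstream={"derived_feature": 4, "model_output": 2, "hypothesis": 1},
)

print(egfr.kind, egfr.label, sum(egfr.downstream.values()))
```

Freezing the record is a design choice for the sketch: an assertion that cannot be edited in place is easier to trace than one that can.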

The Friction

Biological data is stuck in the dark ages of static PDFs and broken pipelines.

“James's oncologist ordered 14 tests across 3 systems over 6 months. By treatment decision time, no one could reconstruct which EGFR result came from which instrument run — or which interpretation was from the AI classifier and which from the pathologist.”

Infrastructure Gap Detected

76% of preclinical studies cannot be reproduced. The reason isn't fraud; it's missing provenance. OpenBio is the evidence substrate that closes this gap.

What if every biological fact carried its full story?

source_fact
HBA_HUMAN

Hemoglobin Card

Atomic Mass: 64,458 Da
Sequence Integrity: 99.98%
Verified Citations: 1,204
Source: UniProtKB/Swiss-Prot
Entry: P69905 (HBA_HUMAN)
Valid since: 1988-07-01 · Last reviewed 2024-01

Lineage: source_fact · 3 derived_features · 12 model_outputs

100%

Traceability across every digital asset generated within the OpenBio mesh.

The Trust Architecture.

Epistemic Boundaries

source_fact

Raw measurement from a system of record.

"James's CBC from Quest Diagnostics"

normalized_fact

Standardized and mapped to a shared ontology.

"EGFR mapped to HGNC:3236"

extracted_annotation

Derived from documents or images via parsing.

"Physician note, NLP-extracted"

derived_feature

Computed from one or more other assertions.

"Progression score from 3 lab values"

model_output

AI/ML prediction with attached confidence.

"Recurrence risk: 74% (AI classifier)"

hypothesis

Proposed interpretation, under active review.

"EGFR-TKI resistance via T790M"

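The six levels above behave like a closed set of tags, so one way to picture them is as an enumeration. The sketch below assumes the tag strings shown on this page are the canonical values; the `TrustLevel` class and the `parse_level` helper are hypothetical names, not part of any published API.

```python
from enum import Enum


class TrustLevel(Enum):
    """The six epistemic levels listed above, keyed by their tag strings."""
    SOURCE_FACT = "source_fact"
    NORMALIZED_FACT = "normalized_fact"
    EXTRACTED_ANNOTATION = "extracted_annotation"
    DERIVED_FEATURE = "derived_feature"
    MODEL_OUTPUT = "model_output"
    HYPOTHESIS = "hypothesis"


# One-line descriptions, copied from the section above.
DESCRIPTIONS = {
    TrustLevel.SOURCE_FACT: "Raw measurement from a system of record.",
    TrustLevel.NORMALIZED_FACT: "Standardized and mapped to a shared ontology.",
    TrustLevel.EXTRACTED_ANNOTATION: "Derived from documents or images via parsing.",
    TrustLevel.DERIVED_FEATURE: "Computed from one or more other assertions.",
    TrustLevel.MODEL_OUTPUT: "AI/ML prediction with attached confidence.",
    TrustLevel.HYPOTHESIS: "Proposed interpretation, under active review.",
}


def parse_level(tag: str) -> TrustLevel:
    """Convert a tag string to a TrustLevel; unknown tags raise ValueError."""
    return TrustLevel(tag)


level = parse_level("model_output")
print(level.name, "-", DESCRIPTIONS[level])
```

Rejecting unknown tags at parse time keeps a mistyped level from silently creating a seventh category.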
How Data Moves Through the Levels

  1. Ingestion

    Raw sequence data cleaning and metadata normalization.

    Produces source_fact · normalized_fact

  2. Verification

    Cryptographic hashing of biological records.

    Validates source_fact integrity

  3. Consensus

    Multi-node validation of experimental results.

    Elevates to normalized_fact · extracted_annotation

  4. Immutable

    Clinical-grade archiving of final biological truths.

    Archives all levels with full lineage
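The Verification step can be pictured as fingerprinting each record so that any later change is detectable. The page does not say which hash function or serialization is used; SHA-256 over a key-sorted JSON encoding is an assumption here, and `fingerprint` and `verify` are hypothetical helper names.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Return a SHA-256 hex digest of a canonical JSON encoding of the record."""
    # Sorting keys and fixing separators makes the digest independent of
    # dictionary ordering and whitespace.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(record: dict, expected_digest: str) -> bool:
    """Check that a record still matches the digest stored at ingestion."""
    return fingerprint(record) == expected_digest


record = {
    "kind": "source_fact",
    "label": "EGFR_EXON_19_DELETION",
    "vaf_percent": 47.3,
}
digest = fingerprint(record)

# A copy with one altered value no longer matches the stored digest.
tampered = dict(record, vaf_percent=12.0)

print(verify(record, digest))
print(verify(tampered, digest))
```

A digest like this shows that a record has not changed since it was hashed. It says nothing about whether the original measurement was correct, which is what the later stages are for.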

evidence_vectors

Same schema.
Every organism.

From bacteria to primates — every subject is encoded into the same evidence dimensions. Different shapes, identical axes.

  1. James

    TAX_9606 · human

    847 assertions

  2. Luna

    TAX_9615 · canine

    234 assertions

  3. M-4872

    TAX_9544 · macaque

    512 assertions

  4. HeLa-S3

    CELL_HELA · cell line

    1,247 pass.

  5. E. coli K-12

    TAX_83333 · microbe

    4,891 feat.

Axes: labs · genomic · temporal · imaging · drugs
EvidenceAssertion · shared schema
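"Different shapes, identical axes" can be read as: every subject gets a vector over the same five dimensions, and only the values differ. The sketch below takes the five axis labels from this page and counts assertions per axis. The `evidence_vector` function is hypothetical, and the sample tags are toy inputs, not real counts for any subject.

```python
from collections import Counter

# The five evidence dimensions named on the page, in a fixed order.
AXES = ("labs", "genomic", "temporal", "imaging", "drugs")


def evidence_vector(axis_tags):
    """Count assertions per axis and return the counts in AXES order."""
    counts = Counter(axis_tags)
    unknown = set(counts) - set(AXES)
    if unknown:
        raise ValueError(f"unknown axes: {sorted(unknown)}")
    return tuple(counts.get(axis, 0) for axis in AXES)


# Toy inputs for illustration only.
human = evidence_vector(["labs", "labs", "genomic", "imaging"])
microbe = evidence_vector(["genomic", "genomic", "genomic"])

print(dict(zip(AXES, human)))
print(dict(zip(AXES, microbe)))
```

Because both vectors share one axis order, they can be compared position by position even though a microbe has no lab panels or imaging.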

The Evidence Mesh

One substrate. Every organism.

The same EvidenceAssertion model spans the entire tree of life. Find an EGFR variant in James. Trace it to macaque trial data. Cross-reference E. coli gene expression. One query. Five organisms.

Query: EGFR-related assertions across subjects

James: 847
Luna: 234
M-4872: 512

The same EGFR exon 19 deletion in James links to macaque trial results in M-4872 through the evidence mesh.
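A cross-species query of this kind reduces to filtering one shared collection by a common key and grouping the hits by subject. The sketch below is an in-memory stand-in, not the OpenBio query interface: the `Assertion` fields, the `query_by_gene` function, and the placeholder rows are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    subject: str   # e.g. "James" or "M-4872"
    taxon: str     # organism identifier as printed on the page
    kind: str      # trust level tag
    gene: str      # gene the assertion refers to
    label: str     # short description of the claim


# Placeholder rows: subjects and taxa come from the page, the rest is toy data.
MESH = [
    Assertion("James", "TAX_9606", "source_fact", "EGFR", "EGFR_EXON_19_DELETION"),
    Assertion("M-4872", "TAX_9544", "source_fact", "EGFR", "toy macaque trial result"),
    Assertion("Luna", "TAX_9615", "source_fact", "OTHER", "toy unrelated assertion"),
]


def query_by_gene(mesh, gene):
    """Group all assertions about one gene by subject, across organisms."""
    hits = {}
    for assertion in mesh:
        if assertion.gene == gene:
            hits.setdefault(assertion.subject, []).append(assertion)
    return hits


for subject, found in query_by_gene(MESH, "EGFR").items():
    print(subject, found[0].taxon, len(found))
```

The query never branches on species. That only works because every row has the same fields regardless of organism.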

Built For

Infrastructure that serves everyone who touches biology.

Research Scientists

Query provenance-complete evidence across organisms. Know whether you're looking at a raw measurement or an AI inference. Compare findings cross-species. Never lose the chain of custody from instrument to publication.

847

assertions on a single human subject

AI & ML Engineers

Structured, typed biological data for training and inference pipelines. Every example knows its trust level — preventing source facts from being mixed with model outputs. The data substrate your agents deserve.

6

trust levels, queryable by filter
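Filtering by trust level is what keeps model outputs out of a training set built from measurements. The sketch below assumes each example carries its level under a `kind` key. Which levels count as safe to train on is a policy choice, and the default used here (`source_fact` and `normalized_fact`) is an assumption; `select_for_training` is a hypothetical helper.

```python
def select_for_training(examples, allowed=frozenset({"source_fact", "normalized_fact"})):
    """Keep only examples whose trust level is in the allowed set."""
    return [ex for ex in examples if ex["kind"] in allowed]


# Toy examples; labels reuse phrases from this page.
examples = [
    {"kind": "source_fact", "label": "EGFR_EXON_19_DELETION"},
    {"kind": "model_output", "label": "recurrence risk"},
    {"kind": "hypothesis", "label": "EGFR-TKI resistance via T790M"},
]

measured = select_for_training(examples)
print([ex["label"] for ex in measured])

# Widening the filter is explicit, so mixing levels is always a deliberate act.
with_models = select_for_training(examples, allowed={"source_fact", "model_output"})
print(len(with_models))
```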

Biotech Builders

An evidence API that speaks FHIR, OMOP, and GA4GH, and adds a unified provenance layer across them. Build on infrastructure instead of data engineering heroics. The layer above your existing systems, not a replacement.

5

organism types, one data model