Against the Black Box: How We Engineered Transparent Health Scoring for Vitera

◉ Product Engineering · Health Tech

Whoop shows you a number. Oura shows you a ring. Neither will tell you why. Vitera decomposes every score into its inputs, weights, and drivers — and ships it to the App Store on a production-grade AWS stack with a Bedrock agent reviewing every PR.

By: TSP Engineering Team · 12 min read · React Native · AppSync · DynamoDB · Bedrock
[Hero graphic · scoring-philosophy.diff]
Black Box: a score appears with no inputs, no weights, no way to interrogate it. Decomposable: every composite breaks down to its drivers — transparency isn't a feature, it's the product.
Stats: Live on App Store · 4 scoring dimensions · <1ms DynamoDB reads · AI-powered Bedrock PR review
TL;DR

Vitera is a production-grade iOS health analytics platform for athletes and high performers — now live on the App Store. Tech Stack Playbook designed and built it on React Native with Expo, backed by DynamoDB, AppSync GraphQL, and a Bedrock-powered PR review agent that ships in the CI/CD pipeline.

The product thesis: serious athletes don't need another opaque recovery number. They need a scoring engine they can interrogate — drill into inputs, see weights, understand drivers. Below: the live decomposable score, the architecture, and the DevSecOps pattern that makes it all shippable.

01 / THESIS · The Black Box Is the Bug, Not the Feature

The wearable market has an open secret: every major recovery score — Whoop's Recovery, Oura's Readiness, Apple's Vitals trend — is calculated from a proprietary algorithm the user is never allowed to see. The number arrives. It might feel right. It might feel deeply, obviously wrong. There's no inspection, no appeal, no way to reason about what changed.

That's fine for a casual user who wants a nudge to go easy today. It's inadequate for the audience Vitera is built for: competitive athletes, longevity-obsessed high performers, tennis players tracking serve mechanics across training blocks, people running biomechanical pipelines on their own match footage because no off-the-shelf tool can surface what they need. These users don't want to be told how they're recovering. They want to inspect the evidence.

So the design constraint wasn't "build a better score." It was: every score must be fully decomposable at runtime, with every input, every weight, and every contribution visible to the user who owns the data.

02 / DEMO · Drag a Slider. Watch the Score Rebuild.

This is a simplified version of Vitera's Recovery Score engine. Five weighted inputs feed a 0–100 composite. Drag any slider — the composite recalculates, the decomposition bars update, the tier badge re-evaluates. No opaque math, no hidden normalization. This is what "decomposable" means as a UI contract, not a marketing word.

◉ Interactive demo · Recovery Score — Decomposable Engine (live). Sample state: composite 80, tier "Primed".
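Under the hood, that contract is just a weighted sum that keeps its own receipts. A minimal TypeScript sketch of the idea — the weights match the Recovery formula in the next section, but the names and tier thresholds are illustrative, not Vitera's actual code:

```typescript
// Illustrative sketch of a decomposable weighted-score engine.
// Inputs are assumed to be pre-normalized to 0–100; tier cutoffs
// are placeholder values, not Vitera's calibrated thresholds.

interface ScoreInput {
  name: string;
  normalized: number;   // 0–100 after normalization
  weight: number;       // fraction of the composite
  contribution: number; // normalized × weight
}

interface RecoveryScore {
  composite: number; // 0–100
  tier: "PRIMED" | "READY" | "MODERATE" | "LOW";
  inputs: ScoreInput[];
}

const RECOVERY_WEIGHTS: Record<string, number> = {
  Sleep: 0.35,
  HRV: 0.25,
  ContrastTherapy: 0.2,
  RHRDelta: 0.15,
  RestDay: 0.05,
};

function tierFor(composite: number): RecoveryScore["tier"] {
  if (composite >= 80) return "PRIMED";
  if (composite >= 60) return "READY";
  if (composite >= 40) return "MODERATE";
  return "LOW";
}

function computeRecovery(normalized: Record<string, number>): RecoveryScore {
  // Build the decomposition first; the composite is derived from it,
  // never the other way around.
  const inputs: ScoreInput[] = Object.entries(RECOVERY_WEIGHTS).map(
    ([name, weight]) => ({
      name,
      normalized: normalized[name] ?? 0,
      weight,
      contribution: (normalized[name] ?? 0) * weight,
    })
  );
  const composite = Math.round(
    inputs.reduce((sum, i) => sum + i.contribution, 0)
  );
  return { composite, tier: tierFor(composite), inputs };
}
```

Because every `ScoreInput` carries its own `contribution`, the decomposition bars render directly from the same object the dial reads — transparency becomes a property of the data shape, not a separate view.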

03 / DIMENSIONS · Four Scores, Four Lenses on Performance

Recovery is one dimension. Vitera surfaces four — each with its own decomposable formula, each weighted for the way serious athletes actually train and recover. Tap a tab to see the formula and the signal it captures.

Recovery — readiness for today's output

Quantifies readiness based on what you actually did to recover, not just biometric proxies. Contrast therapy cycles, compression sessions, rest quality, and active recovery work are first-class inputs — not afterthoughts.

Recovery = 0.35·Sleep + 0.25·HRV + 0.20·ContrastTherapy + 0.15·RHRΔ + 0.05·RestDay

First platform to score cold plunge and sauna cycles as quantified recovery modalities — not just passive biometric inference.

Strain — cumulative training load

Measures load across every workout type — tennis matches, strength, running, conditioning — weighted by duration, heart rate zone distribution, and session density. Designed to show when you're building versus approaching overtraining.

Strain = Σᵢ(Durationᵢ × ZoneIntensityᵢ × SportWeightᵢ) × FrequencyMultiplier

Sport-specific weighting matters: a 90-minute tennis match produces different strain than a 90-minute Zone 2 run. Generic trackers flatten this; Vitera preserves it.
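The per-session summation can be sketched the same way. Everything below — sport weights, the density multiplier, the session shape — is an invented placeholder to show the structure, not Vitera's calibrated constants:

```typescript
// Illustrative strain accumulator: per-session load weighted by sport,
// then scaled by a session-density multiplier. All constants are
// placeholders for illustration.

interface Session {
  sport: "tennis" | "strength" | "running" | "conditioning";
  minutes: number;
  zoneIntensity: number; // 0–1, derived from HR-zone distribution
}

const SPORT_WEIGHT: Record<Session["sport"], number> = {
  tennis: 1.2,       // multi-directional, stop-start load
  strength: 1.0,
  running: 0.9,
  conditioning: 1.1,
};

function frequencyMultiplier(sessionsToday: number): number {
  // Back-to-back sessions compound load: +10% per extra session (placeholder).
  return 1 + 0.1 * Math.max(0, sessionsToday - 1);
}

function computeStrain(sessions: Session[]): number {
  const base = sessions.reduce(
    (sum, s) => sum + s.minutes * s.zoneIntensity * SPORT_WEIGHT[s.sport],
    0
  );
  return base * frequencyMultiplier(sessions.length);
}
```

With these placeholder weights, a 90-minute tennis match at the same intensity scores 1.2/0.9 ≈ 33% more strain than a 90-minute run — exactly the distinction the paragraph above says generic trackers flatten.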

Vitality — long-term health trajectory

The longevity lens. Tracks systemic health trajectory over weeks and months, not day-to-day noise: HRV baseline shifts, RHR trends, sleep architecture stability, sustained recovery patterns.

Vitality = 0.30·HRV_30d + 0.25·RHR_trend + 0.25·SleepArch_90d + 0.20·RecoveryConsistency

A Vitality score of 78 means something very different at 28 vs. 52. The composite is age-normalized in the production engine.

Sleep — architecture & efficiency

Detailed sleep analysis incorporating total sleep time, deep sleep %, REM sleep %, sleep consistency, and time-in-bed efficiency — surfaced as both a standalone score and a weighted input to Recovery and Vitality composites.

Sleep = 0.30·Duration + 0.25·DeepSleep% + 0.20·REM% + 0.15·Consistency + 0.10·Efficiency

Consistency matters as much as total duration. Seven hours with a stable bedtime beats eight hours with a drifting schedule — and the composite reflects that.
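To make the consistency claim concrete, here is the Sleep weighted sum applied to two illustrative nights. The normalized input values are invented for the example; only the weights come from the formula above:

```typescript
// Illustrative check of the "stable 7h beats drifting 8h" claim.
// Inputs are pre-normalized to 0–100; the specific values are made up.

interface SleepInputs {
  duration: number;
  deepPct: number;
  remPct: number;
  consistency: number;
  efficiency: number;
}

function computeSleep(s: SleepInputs): number {
  return (
    0.3 * s.duration +
    0.25 * s.deepPct +
    0.2 * s.remPct +
    0.15 * s.consistency +
    0.1 * s.efficiency
  );
}

// Seven hours on a stable schedule vs eight hours on a drifting one,
// with identical sleep architecture and efficiency.
const stableSeven = computeSleep({
  duration: 82, deepPct: 70, remPct: 70, consistency: 95, efficiency: 85,
});
const driftingEight = computeSleep({
  duration: 95, deepPct: 70, remPct: 70, consistency: 40, efficiency: 85,
});
```

With these inputs the stable night scores ~78.9 against ~74.5: the extra hour of duration cannot buy back the consistency deficit, which is the behavior the weighting is designed to produce.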

04 / ARCHITECTURE · The Stack That Makes It Shippable

A decomposable score is a product idea. Shipping it to the App Store with sub-millisecond drill-down queries is an engineering problem. Vitera runs on a lean, entirely managed AWS backend — every layer chosen for latency, type safety, and audit-grade score reproducibility.

◉ Architecture diagram · Vitera Data Flow — Apple Health → AWS → App

Why DynamoDB for health data

The access patterns fit the database: user-scoped score reads, time-ranged history queries, recovery-session writes. All of it partition-keyed, all of it served in single-digit milliseconds. Score drill-downs happen inside the dial animation frame — that's only possible when the backend never becomes the bottleneck.

GSIs handle the query fan-out: by date, by workout type, by score range, without table scans. Bursty morning check-in traffic + minimal overnight load = the exact shape DynamoDB's on-demand billing is built for.
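Those access patterns fall out of the key design. A sketch of what a single-table layout for user-scoped, time-ranged score reads might look like — the key shapes below are an assumed design for illustration, not Vitera's production schema:

```typescript
// Illustrative single-table key design for user-scoped, time-ranged
// score queries. Key shapes are assumptions, not the production schema.
//
//   pk = USER#<userId>            → all of a user's items share a partition
//   sk = SCORE#<dimension>#<date> → sorted by dimension, then ISO date

const scoreKey = (userId: string, dimension: string, isoDate: string) => ({
  pk: `USER#${userId}`,
  sk: `SCORE#${dimension}#${isoDate}`,
});

// Because ISO dates sort lexicographically, a time-range history query
// is a single BETWEEN on the sort key — no scan, no filter expression.
const recoveryRange = (userId: string, from: string, to: string) => ({
  KeyConditionExpression: "pk = :pk AND sk BETWEEN :from AND :to",
  ExpressionAttributeValues: {
    ":pk": `USER#${userId}`,
    ":from": `SCORE#RECOVERY#${from}`,
    ":to": `SCORE#RECOVERY#${to}`,
  },
});
```

The by-date and by-score-range fan-outs mentioned above would then live on GSIs with their own projected key shapes, so every query in the app stays a key lookup.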

Why AppSync for the API layer

Schema-first GraphQL is the right contract between a React Native client and a typed backend. Every score, every decomposition, every historical query fetches exactly the shape the UI needs — no over-fetching, no N+1 on drill-downs. Real-time subscriptions mean new health data ingested on-device surfaces in the dashboard without polling.

graphql · vitera schema excerpt
type RecoveryScore {
  id: ID!
  userId: ID!
  date: AWSDate!
  composite: Int!            # 0–100
  tier: RecoveryTier!        # PRIMED | READY | MODERATE | LOW
  inputs: [ScoreInput!]!     # decomposition — the whole point
  calculatedAt: AWSDateTime!
}

type ScoreInput {
  name: String!              # "Sleep" | "HRV" | "ContrastTherapy" | ...
  rawValue: Float!           # the measured input
  normalized: Float!         # 0–100 after normalization
  weight: Float!             # its contribution weight
  contribution: Float!       # normalized × weight
}

05 / DEVSECOPS · Every PR Gets Reviewed by Claude on Bedrock First

Shipping health data means raising the bar on what gets merged. Vitera runs a Bedrock-powered PR review agent as the first reviewer on every pull request — before a human opens the diff. The agent is grounded in project context (the data model, the scoring engine contracts, the architectural standards) and flags the specific classes of mistake that compound fastest in a mobile health app.

yaml · github actions · bedrock agent
name: bedrock-pr-review
on: [pull_request]

jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      id-token: write        # federated OIDC to AWS
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0     # full history so origin/main exists for the diff
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.BEDROCK_REVIEWER_ROLE }}
          aws-region: us-east-1

      - name: Claude-powered review
        run: |
          pnpm vitera-reviewer \
            --diff "$(git diff origin/main)" \
            --context .vitera/review-context.md \
            --model anthropic.claude-sonnet-4 \
            --checks quality,completeness,security,schema-drift

Four check classes the agent owns: quality (bugs, anti-patterns), completeness (does the PR actually implement the spec?), security (IAM drift, DynamoDB config linting), and schema-drift (GraphQL contract changes that break the client). Human reviewers open the PR after the agent has already done the pattern-matching pass — so the conversation starts at "is this the right design?" instead of "did you forget to handle null?"

◉ Key Insight

Transparency isn't a feature on the roadmap. It's the UI contract that forces every decision upstream — how scores are stored, how the API is shaped, what the database schema has to prove. Change the contract and everything downstream collapses. Keep the contract, and you get a product that serious athletes can actually reason about.

06 / OUTCOMES · What Shipped

Live
On App Store

Production-grade iOS application deployed via Expo's managed build and submission pipeline with full HealthKit integration.

4
Decomposable Scores

Recovery, Strain, Vitality, and Sleep — every composite drills down to visible inputs, weights, and contributions.

<1ms
Score Retrieval

DynamoDB-backed queries deliver instantaneous drill-downs. Decomposition animations never wait on I/O.

AI
PR Review in CI

Bedrock agent catches quality, completeness, security, and schema-drift issues before human review begins.

Stack

React Native · Expo · NativeWind · Apple HealthKit · AWS AppSync · GraphQL · DynamoDB · DynamoDB GSIs · Amazon Bedrock · Claude (Sonnet) · GitHub Actions · IAM · IaC Scanning

07 / TAKEAWAY · Why Engineering Rigor Belongs in Consumer Health

Consumer health apps have historically shipped with the engineering discipline of a weekend hackathon and the data sensitivity of a bank. That gap is the market. When we built Vitera the same way we'd build an enterprise data platform — typed schemas, auditable IaC, AI-reviewed PRs, sub-millisecond reads — the product got better not because of new features but because every score became a first-class, inspectable object. That's the thesis: health data deserves the same engineering rigor as financial data, and the people who depend on it deserve better than a black box.

If you're building something where users are betting decisions on model outputs — health, finance, performance, coaching — the transparency pattern generalizes. Decomposable outputs force better architecture upstream. The UI is where the contract gets enforced. It's also how trust gets earned. Our AI/ML engagements all land here eventually.

Building a data-heavy mobile or AI product that can't afford to be a black box?

We partner with product teams to ship production-grade mobile and AI platforms on AWS — with the architecture, observability, and DevSecOps to make every output auditable. No opaque models. No hand-wavy outputs. Just systems your users can actually reason about.

Book a strategy call  
Explore more