Platform
Orchestration
How the backend persists configuration, plans deployments, and orchestrates rollouts across a distributed fleet.
Design intent
When implementing orchestration across a fleet, define clear boundaries, make changes snapshot-based, and keep operational signals observable.
- Orchestration is a state machine with explicit transitions
- Reconciliation (desired vs observed) is how fleets stay consistent
- Idempotent retries prevent partial rollouts from getting stuck
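As an illustration of "explicit transitions", the sketch below models a rollout as a small state machine. The state names follow the plan → deploy → verify → complete flow described on this page. The ROLLED_BACK state, the ALLOWED table, and the transition helper are hypothetical names chosen for the example, not part of the platform's API.

```python
from enum import Enum


class RolloutState(Enum):
    PLANNED = "planned"
    DEPLOYING = "deploying"
    VERIFYING = "verifying"
    COMPLETE = "complete"
    ROLLED_BACK = "rolled_back"  # assumption: rollback is modeled as a terminal state


# Allowed transitions; anything not listed here is rejected.
ALLOWED = {
    RolloutState.PLANNED: {RolloutState.DEPLOYING},
    RolloutState.DEPLOYING: {RolloutState.VERIFYING, RolloutState.ROLLED_BACK},
    RolloutState.VERIFYING: {RolloutState.COMPLETE, RolloutState.ROLLED_BACK},
    RolloutState.COMPLETE: set(),
    RolloutState.ROLLED_BACK: set(),
}


def transition(current: RolloutState, target: RolloutState) -> RolloutState:
    """Return the new state, or raise if the transition is not listed in ALLOWED."""
    if target is current:
        return current  # replaying the same step is a no-op, so retries stay safe
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Rejecting unlisted transitions keeps a partial rollout from drifting into an undefined state, and treating a repeated step as a no-op is one simple way to make retries idempotent.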
What it is
The backend is the control plane: it persists configuration, versions designs, and orchestrates deployments to edge devices.
Architecture at a glance
- Endpoints (protocol sessions) → points (signals) → mappings (typed bindings) → control app ports
- Adapters isolate variable-latency protocol work from deterministic control execution paths
- Validation and data-quality checks sit between “connected” and “correct”
- This is a UI + backend + edge concern: changes affect real-world actuation
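The endpoint → point → mapping → port chain above can be sketched as a small data model. This is a minimal illustration only: the class names, field names, and the validate method are assumptions made for the example and do not reflect the platform's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    name: str
    protocol: str  # e.g. "modbus"


@dataclass(frozen=True)
class Point:
    name: str
    endpoint: Endpoint
    address: str
    data_type: str  # e.g. "REAL" or "BOOL"


@dataclass(frozen=True)
class Port:
    app: str
    name: str
    data_type: str


@dataclass(frozen=True)
class Mapping:
    """A typed binding between a protocol point and a control app port."""

    point: Point
    port: Port

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the types agree."""
        problems = []
        if self.point.data_type != self.port.data_type:
            problems.append(
                f"{self.point.name} is {self.point.data_type} "
                f"but port {self.port.name} expects {self.port.data_type}"
            )
        return problems
```

Checking the binding at design time is one place where "connected" can be separated from "correct" before anything reaches a device.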
Typical workflow
- Define endpoints and point templates (units, scaling, ownership)
- Bind points to app ports and validate types/limits
- Commission using a canary device and verify data quality (staleness/range)
- Roll out with rate limits and monitoring for flaps and errors
System boundary
Treat Orchestration as a repeatable interface between engineering intent (design) and runtime reality (deployments + signals). Keep site-specific details configurable so the same design scales across sites.
Example artifact
I/O mapping table (conceptual)
point_name, protocol, address, type, unit, scale, direction, owner
pump_speed, modbus, 40021, REAL, rpm, 0.1, read, device:pump-1
valve_cmd, modbus, 00013, BOOL, -, -, write, app:fb-network
Why it matters
- Fleet-wide consistency for deployments and configuration
- Automated rollout/rollback across sites
- Single source of truth for audit and compliance
Quick acceptance checks
- Define a deployment state machine (plan → deploy → verify → complete)
- Store desired state and reconcile against observed device state
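Reconciliation can be reduced to a diff between desired and observed state. The sketch below assumes both are simple maps from a device ID to a snapshot version. The function name and the data shape are hypothetical and chosen only to show the idea.

```python
def reconcile(desired: dict[str, str], observed: dict[str, str]) -> dict[str, str]:
    """Return {device_id: snapshot_version} for devices that still need a deployment.

    A device is included when it reports a different version than desired,
    or when it has not reported any version at all.
    """
    return {
        device: version
        for device, version in desired.items()
        if observed.get(device) != version
    }
```

Running this repeatedly is safe: once a device reports the desired version it drops out of the result, so the same loop can serve both the initial rollout and later drift correction.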
Common failure modes
- Units/scaling mismatch (values look “reasonable” but are wrong)
- Swapped addresses/endianness/encoding issues that only show under load
- Staleness: values stop changing but connectivity stays “green”
- Write conflicts from unclear single-writer ownership
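Two of the failure modes above, scaling mismatches and staleness, can be caught with a small per-point guard. The sketch below is an assumption-laden example: the QualityGuard class, its parameters, and the flag names are hypothetical, and the thresholds would need to be chosen per point.

```python
class QualityGuard:
    """Flags values that stopped changing or fall outside the expected range."""

    def __init__(self, scale: float, low: float, high: float, max_unchanged_s: float):
        self.scale = scale
        self.low = low
        self.high = high
        self.max_unchanged_s = max_unchanged_s
        self._last_raw = None
        self._last_change = None

    def check(self, raw: int, now: float) -> list[str]:
        """Return quality flags for one raw sample taken at time `now` (seconds)."""
        if raw != self._last_raw:
            # The value moved, so restart the staleness clock.
            self._last_raw = raw
            self._last_change = now
        flags = []
        if now - self._last_change > self.max_unchanged_s:
            flags.append("stale")
        value = raw * self.scale  # convert the raw register value to engineering units
        if not (self.low <= value <= self.high):
            flags.append("out_of_range")
        return flags
```

For a point like pump_speed with a scale of 0.1, a raw value of 15000 would be read as roughly 1500 rpm. A guard of this kind treats a value that never changes as suspect even while the connection looks healthy, so it only suits signals that are expected to vary.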
Acceptance tests
- Step input values and verify expected output actuation (end-to-end)
- Inject stale/noisy values and confirm guards flag or suppress them
- Confirm single-writer ownership with a write-conflict test
- Verify the deployed snapshot/version matches intent (no drift)
- Run a canary validation: behavior, health, and telemetry align with expectations
- Verify rollback works and restores known-good behavior
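The write-conflict test in this list can be exercised against a simple ownership registry. The following is a minimal sketch, assuming each writable point has exactly one owner string such as app:fb-network. WriteRegistry and OwnershipError are hypothetical names used for illustration.

```python
class OwnershipError(Exception):
    """Raised when a second writer tries to claim an already-owned point."""


class WriteRegistry:
    def __init__(self) -> None:
        self._owners: dict[str, str] = {}

    def claim(self, point: str, writer: str) -> None:
        """Register `writer` as the single writer of `point`.

        Claiming again with the same writer is allowed; a different writer is rejected.
        """
        owner = self._owners.setdefault(point, writer)
        if owner != writer:
            raise OwnershipError(f"{point} is owned by {owner}, not {writer}")
```

A test would claim a point once, confirm that a repeated claim by the same writer succeeds, and confirm that a claim by any other writer raises.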
In the platform
- Stores device/resource registry and application models
- Plans deployments from a snapshot to a target fleet
- Tracks rollout status and failures
Implementation checklist
- Define a deployment state machine (plan → deploy → verify → complete)
- Store desired state and reconcile against observed device state
- Track failures with categories (runtime/adapters/network/config)
- Automate rollback when health gates fail during rollout
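Failure categories become useful once they feed a health gate. The sketch below uses the four categories named in the checklist. The summarize and gate_failed helpers, and the idea of a maximum failure ratio, are assumptions made for this example.

```python
from collections import Counter
from enum import Enum


class FailureCategory(Enum):
    RUNTIME = "runtime"
    ADAPTERS = "adapters"
    NETWORK = "network"
    CONFIG = "config"


def summarize(failures: list[FailureCategory]) -> dict[str, int]:
    """Count failures per category for reporting."""
    return dict(Counter(f.value for f in failures))


def gate_failed(failures: list[FailureCategory], attempted: int,
                max_failure_ratio: float = 0.1) -> bool:
    """Return True when the share of failed devices exceeds the allowed ratio."""
    if attempted == 0:
        return False  # nothing attempted yet, so nothing to judge
    return len(failures) / attempted > max_failure_ratio
```

When gate_failed returns True, the orchestrator would stop the rollout and trigger rollback. The per-category counts help tell a site-wide network issue apart from a bad configuration.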
Rollout guidance
- Start with a canary site that matches real conditions
- Use health + telemetry gates; stop expansion on regressions
- Keep rollback to a known-good snapshot fast and rehearsed
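The guidance above can be sketched as a wave-based rollout loop that starts with a canary and stops expanding on the first regression. The deploy, healthy, and rollback callables are hypothetical hooks standing in for whatever the platform actually exposes.

```python
from typing import Callable


def staged_rollout(
    waves: list[list[str]],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> bool:
    """Deploy wave by wave; on a failed health gate, roll back everything deployed so far.

    Returns True if every wave passed its gate, False if the rollout was rolled back.
    """
    deployed: list[str] = []
    for wave in waves:
        for device in wave:
            deploy(device)
            deployed.append(device)
        if not all(healthy(device) for device in wave):
            # Undo in reverse order so the most recent changes are reverted first.
            for device in reversed(deployed):
                rollback(device)
            return False
    return True
```

Putting a single canary in the first wave means a regression is caught before the larger waves are touched.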
Common questions
Quick answers that help during commissioning and operations.
What does orchestration need to record for audits?
Who initiated a change, which snapshot was deployed, which targets were affected, the rollout timeline, and any health/telemetry outcomes.
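One possible shape for such an audit record is shown below. The field names mirror the items listed in the answer, but the AuditRecord class itself and its JSON layout are assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    initiated_by: str                       # who initiated the change
    snapshot_id: str                        # which snapshot was deployed
    targets: tuple[str, ...]                # which devices or sites were affected
    timeline: tuple[tuple[str, str], ...]   # (ISO 8601 timestamp, event) pairs
    outcomes: dict[str, str]                # health/telemetry result per target

    def to_json(self) -> str:
        """Serialize the record for append-only storage."""
        return json.dumps(asdict(self), sort_keys=True)
```

Freezing the dataclass prevents fields from being reassigned after the record is created, which suits an audit trail.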
How do we keep orchestration safe?
Use immutable artifacts, staged rollouts, and explicit gates. Avoid “push latest everywhere”.
What is the most common orchestration failure mode?
Partial rollouts with unclear state. Make transitions explicit and ensure retries are idempotent.
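A minimal sketch of an idempotent retry follows, assuming a deployment is identified by its snapshot ID and that transient failures surface as ConnectionError. The apply_snapshot and retry helpers are hypothetical names for this example.

```python
from typing import Callable


def apply_snapshot(fleet: dict[str, str], device: str, snapshot_id: str) -> bool:
    """Apply a snapshot to a device; return False if it was already applied."""
    if fleet.get(device) == snapshot_id:
        return False  # already at the requested snapshot, so nothing changes
    fleet[device] = snapshot_id
    return True


def retry(operation: Callable[[], object], attempts: int = 3) -> None:
    """Run `operation` until it succeeds or the attempts are exhausted."""
    last_error = None
    for _ in range(attempts):
        try:
            operation()
            return
        except ConnectionError as exc:
            last_error = exc  # assumption: only connection errors are worth retrying
    raise RuntimeError(f"gave up after {attempts} attempts") from last_error
```

Because apply_snapshot checks the current snapshot before writing, a retry after a lost acknowledgement does not apply the change a second time.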