Architecture
System architecture including service placement, replication topology, and the storage driver model.
Trilio Site Recovery for OpenStack (protector) is a tenant-driven disaster recovery service that coordinates failover and failback of Nova VMs between two independent OpenStack clouds using Pure Storage FlashArray replication. This page describes how the system is structured: where services run, how the coordination layer bridges the two sites, how storage replication is modeled, and why the architecture is designed the way it is. Understanding this structure helps you reason about failure modes, plan your deployment, and interpret the behavior you observe during DR operations.
protector-api
A RESTful API service that runs independently on each site. It validates inbound Keystone tokens, enforces RBAC policies, and routes requests to protector-engine via RPC. You interact with it through the OSC CLI plugin or Horizon. Because each site runs its own protector-api, you can issue DR operations from whichever site is reachable, including initiating an unplanned failover from the secondary site when the primary is down.
protector-engine
The business logic service that runs independently on each site. It owns all stateful DR workflows: creating and updating Protection Groups and Consistency Groups, adding VM members, interacting with Pure Storage arrays, managing OpenStack resources (Nova, Cinder, Neutron) across sites, and driving failover/failback state machines. The engine uses a pluggable StorageDriver interface to isolate array-specific logic from orchestration logic.
MariaDB (per site)
Each site maintains its own MariaDB instance storing the full Protector data model: sites, protection groups, consistency groups, members, volumes, DR operations, replication policies, and metadata sync state. There is no shared or replicated database between sites; instead, the coordination layer pushes metadata updates explicitly.
RabbitMQ (per site)
Each site has its own RabbitMQ broker used for internal RPC between protector-api and protector-engine. Cross-site communication does not flow through RabbitMQ.
OSC CLI plugin (protectorclient) and Horizon dashboard
These are the coordination layer that bridges the two sites. They authenticate independently to both sites' Keystone endpoints and orchestrate cross-site metadata synchronization; for example, pushing a protection group metadata update to the remote site's protector-api after a local change. Because all cross-site coordination flows through this client layer rather than between the service processes themselves, neither protector-api nor protector-engine needs a routable path to its peer service on the other site.
PureStorageDriver
The primary production StorageDriver implementation. It manages Pure Storage FlashArray resources in two modes depending on the replication type configured for a Protection Group:
- Async replication: The driver manages a Pure Storage Protection Group (PG) and orchestrates periodic snapshot replication to the peer array. Recovery uses the latest replicated snapshot.
- Sync replication: The driver manages an ActiveCluster Pod, which provides zero-RPO synchronous mirroring between arrays.
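The pluggable seam behind both drivers can be pictured as a small abstract base class. This is an illustrative sketch only: the method names below are hypothetical, chosen to mirror the behaviors described above, not the actual Protector interface.

```python
from abc import ABC, abstractmethod

class StorageDriver(ABC):
    """Hypothetical sketch of the pluggable storage driver seam."""

    @abstractmethod
    def create_group(self, name: str, replication_type: str) -> str:
        """Create the array-side construct: a Pure Protection Group for
        async replication, or an ActiveCluster Pod for sync."""

    @abstractmethod
    def add_volume(self, group_id: str, volume_id: str) -> None:
        """Bring a volume under the group's replication."""

    @abstractmethod
    def latest_replicated_snapshot(self, group_id: str) -> str:
        """Return the newest snapshot available on the peer array (async mode)."""

    @abstractmethod
    def failover(self, group_id: str) -> list[str]:
        """Promote replicated data on the peer array; return the IDs of
        the volumes made available there."""
```

Because the engine only ever talks to this interface, the SQLite-backed mock described below can implement the same methods and exercise the full failover and failback workflows without hardware.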
MockStorageDriver
A full behavioral simulation of the PureStorageDriver backed by SQLite instead of physical arrays. It implements the same StorageDriver interface and supports end-to-end DR workflows, including failover, failback, and test failover, without any FlashArray hardware. Use this driver in development, CI, or lab environments where physical arrays are not available.
Pure Storage FlashArray (per site)
The physical or virtual storage arrays that underpin Cinder block storage at each site. Volume replication, whether via Protection Group snapshots (async) or ActiveCluster Pods (sync), is handled natively by the arrays. Protector drives this replication through the StorageDriver interface; it does not implement its own data-plane replication.
Cinder Consistency Groups (per site)
Cinder Consistency Groups ensure that all volumes belonging to a Protection Group are snapshotted and replicated together in a crash-consistent manner. Every volume added to a Protection Group must use a Cinder volume type with replication_enabled='<is> True' and a matching replication_type property. Each site maintains its own Cinder Consistency Group that maps 1:1 to the Protector Protection Group on that site.
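The eligibility rule above reduces to a check on the volume type's extra specs. The helper below is an illustrative sketch, not Protector's code, and assumes the extra specs are available as a plain dict; the exact comparison the engine performs may differ.

```python
def is_protectable(extra_specs: dict, group_replication_type: str) -> bool:
    """Illustrative eligibility check for a volume's Cinder volume type.

    Sketches the rule stated above: replication must be enabled on the
    volume type, and its replication_type must match the Protection
    Group's configured mode.
    """
    if extra_specs.get("replication_enabled") != "<is> True":
        return False
    return extra_specs.get("replication_type") == group_replication_type


specs = {"replication_enabled": "<is> True", "replication_type": "async"}
assert is_protectable(specs, "async")
assert not is_protectable(specs, "sync")   # mode mismatch
assert not is_protectable({}, "async")     # non-replicated backend
```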
The following trace walks through creating a Protection Group, adding a VM, and executing a failover. It shows how control and data move across every layer.
1. Creating a Protection Group
You run openstack protector protection-group create with the OSC plugin. The plugin authenticates to both sites' Keystone endpoints using per-site service credentials and then issues a REST call to the primary site's protector-api. The engine validates that the named volume type exists on both sites and carries replication_enabled='<is> True'. It then creates a Cinder Consistency Group on both sites and writes the Protection Group record to the local MariaDB. The client layer pushes the initial metadata (version 1) to the secondary site's protector-api, which writes a replica record to its own MariaDB. Both sites now hold identical metadata at version 1.
2. Adding a VM member
You run openstack protector protection-group member-add. The engine queries Nova for the VM's full configuration (flavor, networks, security groups, keypair, floating IPs) and inspects all attached Cinder volumes. It validates that every volume uses the replication-enabled volume type. Each volume is added to the Cinder Consistency Group on the primary site, and a cg_volumes record is written to the local database with the captured attachment metadata. The client layer then pushes the updated metadata (version incremented) to the secondary site. The operation is blocked if the secondary site is unreachable at this point, preventing the sites from diverging.
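The strict push-or-fail behavior of a mutation can be sketched as follows. Every name here (local_db, remote_api, their methods, the exception) is a hypothetical stand-in for the client layer; the point is the invariant that a local change is only kept if the remote site accepts the new metadata version.

```python
class SiteUnreachable(Exception):
    """Raised when the remote protector-api cannot be reached."""


def apply_change(local_db, remote_api, pg_id: str, change: dict) -> int:
    """Illustrative client-layer sync for a Protection Group mutation.

    If the push to the secondary site fails, the local change is rolled
    back and the mutation is rejected, so the two sites cannot diverge.
    """
    version = local_db.current_version(pg_id) + 1
    local_db.commit(pg_id, version, change)
    try:
        remote_api.push_metadata(pg_id, version, change)
    except SiteUnreachable:
        local_db.rollback(pg_id, version)
        raise RuntimeError("secondary unreachable: modification blocked")
    return version
```

This is the mechanism behind the "modifications blocked when peer is unreachable" trade-off discussed later: correctness of the metadata replica is bought at the cost of management-plane availability during a partition.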
3. Volume replication (ongoing)
Once volumes are members of the Consistency Group, Pure FlashArray replicates them to the peer array continuously (sync Pod) or on the configured interval (async Protection Group snapshots). This is a storage-layer operation; Protector observes and reports on replication status but does not implement the data path.
4. Executing a failover
If the primary site fails, you authenticate to the secondary site and run openstack protector protection-group failover. Because the metadata replica on the secondary site contains the complete VM configuration, the engine can proceed without contacting the primary. The engine's dr_operations workflow proceeds in four phases:
- Preparation: The Protection Group status transitions to failing_over and a dr_operations record is created.
- Storage failover: The engine calls PureStorageDriver to identify the latest replicated snapshot on FlashArray B, extracts per-volume snapshots from the Protection Group snapshot, creates new Cinder volumes from them, and registers those volumes with Cinder on the secondary site.
- Instance recreation: For each VM member, the engine reads the captured metadata from the local database, applies the configured network and flavor mappings, and issues Nova API calls on the secondary site to boot new instances with the recovered volumes.
- Finalization: The Protection Group current_primary_site_id is updated to point to the secondary site, the status is set to failed_over, and the DR operation record is marked completed. A sync to the primary site is attempted; if the primary is still unreachable, the sync status is marked UNREACHABLE and the failover completes anyway. Any subsequent modification to the Protection Group will be blocked until the primary site recovers and a force-sync is performed.
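The four phases chain linearly, which can be made concrete with a short sketch. All object and method names here are hypothetical stand-ins; the sketch only shows how the phases sequence on the secondary site using locally held metadata.

```python
def run_failover(pg: dict, driver, nova, db) -> None:
    """Illustrative linearization of the four failover phases."""
    # Phase 1: preparation
    db.set_status(pg["id"], "failing_over")
    op_id = db.create_dr_operation(pg["id"], "failover")

    # Phase 2: storage failover, from the latest replicated snapshot
    snapshot = driver.latest_replicated_snapshot(pg["group_id"])
    volumes = driver.restore_volumes(snapshot)

    # Phase 3: instance recreation from the local metadata replica
    for member, vols in zip(pg["members"], volumes):
        nova.boot(member, vols)

    # Phase 4: finalization; the sync back to the old primary is
    # best-effort and may be marked UNREACHABLE
    pg["current_primary_site_id"] = pg["secondary_site_id"]
    db.set_status(pg["id"], "failed_over")
    db.complete_dr_operation(op_id)
```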
5. Failback
Once the primary site recovers, you run openstack protector protection-group failback. The engine quiesces workloads on the secondary site, takes a final snapshot, reverses the replication direction (if requested), replicates data back to FlashArray A, restores volumes on the primary site, and recreates instances there. The Protection Group current_primary_site_id is updated back to the original primary site, and the updated metadata version is synced to both sites.
Independent service stacks per site, no direct service-to-service communication
Running fully independent protector-api, protector-engine, MariaDB, and RabbitMQ stacks on each site means each site can operate autonomously. The alternative, a single centralized control plane, would make the DR service itself a single point of failure. If the central plane were co-located with the primary site, you would lose DR control precisely when you need it most. Independence ensures that an unplanned failover can be orchestrated entirely from the secondary site using locally-held metadata.
Client-layer coordination rather than service-to-service sync
Cross-site metadata synchronization is orchestrated by the OSC CLI plugin and Horizon dashboard, which authenticate to both sites and push updates explicitly. This avoids requiring routable network paths between the Protector service processes on the two sites, which simplifies firewall rules and reduces the trust surface. It also means the coordination logic is versioned and auditable at the client level. The trade-off is that the client must be reachable to both sites during normal operations; offline or agent-based sync is not supported.
Strict metadata sync: modifications blocked when peer is unreachable
The system refuses to allow any modification to a Protection Group if the remote site cannot be reached at the time of the change. This prevents the two sites from diverging into inconsistent metadata states that would cause split-brain during failover. The trade-off is that a network partition between sites (even without a site failure) will temporarily prevent Protection Group updates. This is a deliberate safety choice: correctness is prioritized over availability of the management plane.
1:1:1 mapping: Protection Group → Cinder Consistency Group → Pure Storage Protection Group (or Pod)
Enforcing a strict one-to-one-to-one relationship keeps the data model unambiguous and makes crash-consistency guarantees tractable. A single Cinder Consistency Group ensures all volumes are snapshotted atomically. Mapping that directly to a single Pure Storage Protection Group (async) or ActiveCluster Pod (sync) means there is no fan-out or aggregation logic that could introduce inconsistency. The trade-off is that you cannot share a Pure Storage Protection Group across multiple Protector Protection Groups, so storage-side resource counts scale linearly with the number of Protection Groups.
Pluggable StorageDriver interface with a production driver and a mock driver
Abstracting storage operations behind a StorageDriver interface decouples the DR orchestration logic from FlashArray-specific API calls. This makes it possible to ship a MockStorageDriver backed by SQLite that faithfully simulates all DR behaviors, including failover, failback, and test failover, without physical hardware. This is essential for development, automated testing, and customer evaluations. The alternative (hard-coding FlashArray calls throughout the engine) would make the system untestable without real arrays and would tightly couple the product to a single storage vendor.
Dynamic primary/secondary designation
The architecture treats both sites symmetrically. The concepts of "primary" and "secondary" are workload-relative and stored as current_primary_site_id on each Protection Group, not as a fixed property of the site itself. After a failover, the secondary site becomes the effective primary for that workload. This means the same failover and failback code paths work regardless of which site currently holds the workload, and there is no need for asymmetric configuration between sites.
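Because "primary" is a per-group pointer rather than a site property, failover and failback can share a single role-swap routine. A minimal illustration, with hypothetical field names modeled on current_primary_site_id:

```python
from dataclasses import dataclass

@dataclass
class ProtectionGroup:
    """Minimal model: the primary is a per-group pointer, not a fixed
    property of either site. Field names other than
    current_primary_site_id are illustrative."""
    site_a: str
    site_b: str
    current_primary_site_id: str

    def swap_primary(self) -> None:
        """One routine serves both failover and failback."""
        self.current_primary_site_id = (
            self.site_b if self.current_primary_site_id == self.site_a
            else self.site_a
        )

pg = ProtectionGroup("site-a", "site-b", current_primary_site_id="site-a")
pg.swap_primary()   # failover: site-b is now the effective primary
pg.swap_primary()   # failback: ownership returns to site-a
```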
Metadata sync requires both sites to be reachable for mutations
Any write operation on a Protection Group (adding a member, updating policy, initiating a planned failover) requires the remote site to be reachable so metadata can be kept in sync. If you experience a network partition between sites that does not involve a site failure, you will be unable to modify Protection Groups until connectivity is restored. A forced sync (openstack protector protection-group sync-force) is required after the remote site recovers before modifications are permitted again. This is a deliberate design choice; the alternative of allowing offline mutations would risk split-brain metadata on failover.
No agent-based or asynchronous metadata replication
Metadata synchronization is synchronous and client-driven. There is no background agent continuously reconciling state between sites. This means metadata consistency depends on the client completing its sync call successfully. If the CLI process is interrupted mid-operation, the sync log and version tracking allow you to detect and recover from the inconsistency, but there is no automatic retry mechanism.
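Detecting a divergence left by an interrupted sync reduces to comparing the two sites' version counters. A trivial illustrative check, assuming the counters are readable on each side:

```python
def sync_state(local_version: int, remote_version: int) -> str:
    """Classify metadata sync state from the per-site version counters.

    Equal counters mean the last client-driven sync completed; a
    mismatch means an operator must intervene (e.g. with a force-sync),
    since no background agent repairs it automatically.
    """
    return "in_sync" if local_version == remote_version else "diverged"

assert sync_state(4, 4) == "in_sync"
assert sync_state(5, 4) == "diverged"
```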
Cinder volume type constraints limit which VMs can be protected
All volumes attached to a VM must use a Cinder volume type with replication_enabled='<is> True' and a compatible replication_type property. VMs with volumes on non-replicated backends cannot be added to a Protection Group. This means you may need to live-migrate or re-type volumes before protecting certain workloads, which is an operational cost that should be planned for during initial deployment.
Linear scaling of Pure Storage Protection Groups
Because each Protector Protection Group maps to exactly one Pure Storage Protection Group (or Pod), the number of storage-side protection constructs scales linearly with the number of Protection Groups you create. Large deployments with many fine-grained Protection Groups will create a corresponding number of FlashArray-side resources. If your FlashArray has a limit on the number of Protection Groups or Pod members, you should account for this when sizing your deployment.
MockStorageDriver is for testing only
The MockStorageDriver simulates Pure FlashArray behavior using SQLite and is suitable for end-to-end DR testing, CI pipelines, and development environments. It does not perform real data replication and should never be used in a production deployment. If you configure the mock driver in production, volume data will not actually be replicated and a failover will produce VMs attached to empty or inconsistent volumes.
No built-in Keystone federation
The current authentication model relies on per-site service account credentials stored in the site configuration. User tokens issued by one site's Keystone are not valid on the other site. Cross-site API calls use service credentials rather than federated identity. Keystone federation (SAML/OIDC) is listed as a future enhancement but is not currently supported. If your organization requires federated identity across sites, this is an integration gap to plan around.