Service Installation
Installing and configuring protector-api and protector-engine on each site
This page walks you through installing and configuring the two Trilio Site Recovery services (protector-api and protector-engine) on each of your OpenStack sites. Because the two sites operate independently with no direct service-to-service communication, you must repeat this installation on both your primary and secondary clusters. After completing this guide, each site will have a running Protector service registered in Keystone, backed by a MariaDB database, and ready to be paired with its peer site for DR operations.
Before you begin, confirm the following on each site where you are installing the service:
OpenStack environment
- OpenStack Victoria or later
- Nova, Cinder, Neutron, and Keystone endpoints operational
- Admin credentials available (admin-openrc or equivalent)
Infrastructure
- MariaDB or MySQL database server accessible from the controller node
- Python 3.8 or later
- pip available in the target Python environment
- Ports open: 8788/tcp (Protector API), 3306/tcp (MariaDB)
Storage
- Pure Storage FlashArray with async (or sync) replication configured between the two arrays
- Pure Storage management IP reachable from the controller on each site
- Cinder Pure Storage backend driver configured on each site
Both sites
- Each OpenStack cluster must be able to reach the other site's Keystone endpoint and Protector API endpoint (port 8788)
- You need the auth URL, project name, username, and password for an admin account on each site
Repeat all steps in this guide on both sites. The primary and secondary designations are workload-relative: both sites run identical service configurations.
Perform all steps below on each site. Where commands differ between sites, the distinction is noted.
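As a quick preflight, the cross-site reachability requirements above can be checked from each controller with a small bash sketch. The hostnames site-a-controller and site-b-controller are placeholders for your own controllers:

```shell
#!/usr/bin/env bash
# Placeholder hostnames -- replace with your controllers.
SITE_A=site-a-controller
SITE_B=site-b-controller

# Return 0 if the TCP port accepts a connection within 3 seconds.
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Check Keystone (5000) and the Protector API (8788) on both sites.
for host in "$SITE_A" "$SITE_B"; do
  for port in 5000 8788; do
    if port_open "$host" "$port"; then
      echo "OK   $host:$port"
    else
      echo "FAIL $host:$port"
    fi
  done
done
```

Any FAIL line indicates a firewall or routing problem to resolve before continuing.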
Step 1: Create the database
Connect to MariaDB and create a dedicated database and user for the Protector service:
mysql -u root -p << EOF
CREATE DATABASE protector CHARACTER SET utf8;
GRANT ALL PRIVILEGES ON protector.* TO 'protector'@'localhost' IDENTIFIED BY 'PROTECTOR_DBPASS';
GRANT ALL PRIVILEGES ON protector.* TO 'protector'@'%' IDENTIFIED BY 'PROTECTOR_DBPASS';
FLUSH PRIVILEGES;
EOF
Replace PROTECTOR_DBPASS with a strong password. Use the same logical name (protector) on both sites, but each site connects to its own local database instance.
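One way to produce a strong value for PROTECTOR_DBPASS, assuming openssl is available on the controller, is:

```shell
# 16 random bytes rendered as 32 hex characters -- suitable for PROTECTOR_DBPASS
PROTECTOR_DBPASS=$(openssl rand -hex 16)
echo "${#PROTECTOR_DBPASS}"   # prints 32
```

Record the generated value; you will need it again in Step 5 when writing protector.conf.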
Step 2: Create the service user and endpoints in Keystone
Source your admin credentials, then create the service identity:
source ~/admin-openrc
# Create the protector user
openstack user create --domain default --password-prompt protector
# Grant the admin role in the service project
openstack role add --project service --user protector admin
# Register the service in the catalog
openstack service create --name protector \
--description "OpenStack Disaster Recovery Service" protector
# Create endpoints (adjust the controller hostname for each site)
openstack endpoint create --region RegionOne \
protector public http://controller:8788/v1/%\(tenant_id\)s
openstack endpoint create --region RegionOne \
protector internal http://controller:8788/v1/%\(tenant_id\)s
openstack endpoint create --region RegionOne \
protector admin http://controller:8788/v1/%\(tenant_id\)s
Replace controller with the actual hostname or IP of the controller node on each site.
Step 3: Create the system user and directories
Create a dedicated non-login system account and the required directories:
useradd --system --shell /bin/false protector
mkdir -p /var/log/protector
mkdir -p /var/lib/protector
mkdir -p /etc/protector
chown -R protector:protector /var/log/protector
chown -R protector:protector /var/lib/protector
chown -R protector:protector /etc/protector
Step 4: Install the Protector package
git clone https://github.com/your-org/openstack-protector.git
cd openstack-protector
pip install -r requirements.txt
python setup.py install
After installation, verify the management command is available:
protector-manage --version
Step 5: Write configuration files
Create the three configuration files below. Detailed descriptions of every option appear in the Configuration section.
/etc/protector/protector.conf
[DEFAULT]
debug = False
log_dir = /var/log/protector
state_path = /var/lib/protector
[api]
bind_host = 0.0.0.0
bind_port = 8788
workers = 4
[database]
connection = mysql+pymysql://protector:PROTECTOR_DBPASS@controller/protector
[keystone_authtoken]
www_authenticate_uri = http://controller:5000
auth_url = http://controller:5000
memcached_servers = controller:11211
auth_type = password
project_domain_name = Default
user_domain_name = Default
project_name = service
username = protector
password = PROTECTOR_PASS
[service_credentials]
default_trust_roles = member,_member_
[oslo_policy]
policy_file = /etc/protector/policy.yaml
Replace PROTECTOR_DBPASS, controller, and PROTECTOR_PASS with the values appropriate for each site.
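Before moving on, it can be worth sanity-checking the [database] connection string. This sketch extracts the host portion (the sed pattern assumes the mysql+pymysql://USER:PASS@HOST/DBNAME format shown above) so you can probe the database manually:

```shell
# Connection string as written in /etc/protector/protector.conf
CONN="mysql+pymysql://protector:PROTECTOR_DBPASS@controller/protector"

# Strip everything up to '@' and everything after the next '/': leaves the host
DBHOST=$(echo "$CONN" | sed -E 's|.*@([^/]+)/.*|\1|')
echo "$DBHOST"   # controller

# Then test connectivity with the mysql client (prompts for the password):
# mysql -h "$DBHOST" -u protector -p protector -e "SELECT 1"
```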
/etc/protector/policy.yaml
"context_is_admin": "role:admin"
"admin_or_owner": "is_admin:True or project_id:%(project_id)s"
"default": "rule:admin_or_owner"
# Protection Groups
"protector:protection_groups:index": "rule:default"
"protector:protection_groups:show": "rule:default"
"protector:protection_groups:create": "rule:default"
"protector:protection_groups:update": "rule:default"
"protector:protection_groups:delete": "rule:default"
# Members
"protector:members:index": "rule:default"
"protector:members:create": "rule:default"
"protector:members:delete": "rule:default"
# Operations
"protector:operations:index": "rule:default"
"protector:operations:show": "rule:default"
"protector:operations:action": "rule:default"
# Policies
"protector:policies:show": "rule:default"
"protector:policies:create": "rule:default"
/etc/protector/api-paste.ini
[composite:protector]
use = egg:Paste#urlmap
/: protectorversions
/v1: protectorapi_v1
[pipeline:protectorapi_v1]
pipeline = keystoneauth protectorapp_v1
[app:protectorversions]
paste.app_factory = protector.api.versions:VersionsController.factory
[app:protectorapp_v1]
paste.app_factory = protector.api.app:create_app
[filter:keystoneauth]
paste.filter_factory = keystonemiddleware.auth_token:filter_factory
Set ownership on all configuration files:
chown -R protector:protector /etc/protector
chmod 640 /etc/protector/protector.conf
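To confirm the ownership and mode took effect, inspect the file with stat:

```shell
# %U:%G prints owner:group, %a prints the octal mode
stat -c '%U:%G %a' /etc/protector/protector.conf
# Expected after the steps above: protector:protector 640
```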
Step 6: Apply required OpenStack policy changes
The Protector service acts on behalf of tenants using Keystone trusts. Cinder's default policy restricts several operations the service needs during failover. Add the following to Cinder's policy file on each site.
Standard deployments: edit /etc/cinder/policy.yaml:
# Required for DR failover: manage/unmanage volumes and discover service hosts
"volume_extension:volume_manage": "rule:admin_or_owner"
"volume_extension:volume_unmanage": "rule:admin_or_owner"
"volume_extension:services:index": "rule:admin_or_owner"
Kolla-Ansible deployments: create or update /etc/kolla/config/cinder/policy.yaml with the same content, then reconfigure:
kolla-ansible -i inventory reconfigure -t cinder
These changes are required because:
- volume_manage: Protector imports replicated volumes into Cinder on the target site after a failover
- volume_unmanage: Protector removes volumes from Cinder management during failback cleanup
- services:index: Protector discovers the correct Cinder volume service host to target the manage operation
Step 7: Initialize the database schema
protector-manage db sync
Run this command on each site after writing protector.conf. It is safe to re-run; subsequent executions apply only pending Alembic migrations.
Step 8: Install systemd service files
/etc/systemd/system/protector-api.service
[Unit]
Description=OpenStack Protector API Service
After=network.target
[Service]
Type=simple
User=protector
Group=protector
ExecStart=/usr/local/bin/protector-api --config-file /etc/protector/protector.conf
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
/etc/systemd/system/protector-engine.service
[Unit]
Description=OpenStack Protector Engine Service
After=network.target
[Service]
Type=simple
User=protector
Group=protector
ExecStart=/usr/local/bin/protector-engine --config-file /etc/protector/protector.conf
Restart=on-failure
RestartSec=10
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
Step 9: Enable and start the services
systemctl daemon-reload
systemctl enable protector-api
systemctl enable protector-engine
systemctl start protector-api
systemctl start protector-engine
# Verify both services are active
systemctl status protector-api
systemctl status protector-engine
Step 10: Verify the API is reachable
curl http://controller:8788/
A successful response returns the available API versions. Repeat this health check on both sites before proceeding to register the sites with each other.
The primary configuration file is /etc/protector/protector.conf. The sections and options below govern service behavior.
[DEFAULT]
| Option | Default | Description |
|---|---|---|
| debug | False | Set to True to enable verbose debug logging. Do not use in production. |
| log_dir | /var/log/protector | Directory where protector-api.log and protector-engine.log are written. |
| state_path | /var/lib/protector | Directory for ephemeral state files. Must be writable by the protector user. |
[api]
| Option | Default | Description |
|---|---|---|
| bind_host | 0.0.0.0 | Interface the API process listens on. Use a specific IP to restrict access. |
| bind_port | 8788 | TCP port the API listens on. Both sites must expose the same port to each other. |
| workers | 4 | Number of API worker processes. Tune based on available CPU cores. |
[database]
| Option | Description |
|---|---|
| connection | SQLAlchemy connection string for the local MariaDB/MySQL instance. Format: mysql+pymysql://USER:PASS@HOST/DBNAME. Each site connects to its own independent database; there is no shared database between sites. |
[keystone_authtoken]
This section configures the Keystonemiddleware token validation pipeline. All options follow the standard OpenStack auth_token middleware convention.
| Option | Description |
|---|---|
| www_authenticate_uri | Public Keystone endpoint, returned to clients that need to authenticate. |
| auth_url | Keystone endpoint the service uses to validate tokens internally. |
| memcached_servers | Optional token cache. Omit to disable caching. |
| auth_type | Must be password. |
| project_name | Service project. Conventionally service. |
| username / password | Credentials of the protector service user created in Step 2. |
[service_credentials]
| Option | Default | Description |
|---|---|---|
| default_trust_roles | member,_member_ | Roles the service requests when creating Keystone trusts on behalf of tenants. Both member and _member_ are listed for compatibility across OpenStack releases. These roles must exist in Keystone and, combined with the Cinder policy changes in Step 6, must be sufficient for the service to perform DR operations within tenant scope. |
[oslo_policy]
| Option | Default | Description |
|---|---|---|
| policy_file | /etc/protector/policy.yaml | Path to the RBAC policy file. The default policy grants all operations to admins and to the resource owner (admin_or_owner). Modify this file to enforce finer-grained access control. |
API microversioning
The Protector API uses the OpenStack-API-Version: protector <version> header for microversioning. The base version is 1.0 and the current version is 1.2. Clients that do not send this header receive the base version response. You do not configure this in protector.conf; it is negotiated per request.
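For example, a client can pin a request to microversion 1.2 by sending the header explicitly (controller is the same placeholder hostname used in earlier steps):

```shell
# Build the microversion header and send it with a request
VER="1.2"
HDR="OpenStack-API-Version: protector ${VER}"
curl -s -H "$HDR" http://controller:8788/
```

Omitting the -H flag yields the base-version (1.0) response instead.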
Once both sites have running Protector services, the typical operational flow is:
- Configure clouds.yaml so that the OSC CLI plugin (protectorclient) can authenticate to both sites simultaneously.
- Register both sites with each Protector service.
- Prepare replication-enabled Cinder volume types on both sites.
- Create Protection Groups and add VMs.
- Execute DR operations (test failover, planned failover, failback).
This page covers only steps 1 and 2. For the full workflow, see the DR Workflow guide.
Configure multi-site credentials
Create ~/.config/openstack/clouds.yaml with an entry for each site:
clouds:
site-a:
auth:
auth_url: http://site-a-controller:5000/v3
project_name: admin
username: admin
password: password
user_domain_name: Default
project_domain_name: Default
region_name: RegionOne
site-b:
auth:
auth_url: http://site-b-controller:5000/v3
project_name: admin
username: admin
password: password
user_domain_name: Default
project_domain_name: Default
region_name: RegionOne
With this file in place, every openstack command accepts --os-cloud site-a or --os-cloud site-b to select the target site. The protectorclient plugin uses both entries when it needs to coordinate metadata across sites.
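The cloud entry can also be selected through the OS_CLOUD environment variable, which python-openstackclient honors when --os-cloud is not passed:

```shell
# Select a default site for all subsequent commands in this shell
export OS_CLOUD=site-a
openstack token issue                       # targets site-a
openstack --os-cloud site-b token issue     # the flag overrides the variable
```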
Register sites
After installation, register each site with the Protector service. You run this command once from a host that can reach both sites:
# Register the primary site
openstack --os-cloud site-a protector site create \
--name site-a \
--description "Primary datacenter" \
--site-type primary \
--auth-url http://site-a-controller:5000/v3 \
--region-name RegionOne
# Register the secondary site
openstack --os-cloud site-a protector site create \
--name site-b \
--description "Secondary datacenter" \
--site-type secondary \
--auth-url http://site-b-controller:5000/v3 \
--region-name RegionOne
The site-type values (primary and secondary) express the initial designation for these site records. In practice, primary and secondary are workload-relative: they swap on failover. Both sites run identical service configurations.
Validate that the service can reach each site's OpenStack endpoints:
openstack --os-cloud site-a protector site validate site-a
openstack --os-cloud site-a protector site validate site-b
Understand metadata synchronization behavior
Protector keeps a complete copy of all Protection Group metadata on both sites at all times. When you modify a Protection Group (add a member, change a policy, execute a failover), the service:
- Updates the local metadata and increments the version number.
- Checks that the peer site is reachable.
- Pushes the updated metadata to the peer site.
- Confirms the peer has accepted and written the update.
If the peer site is unreachable, the modification is blocked. This is intentional: allowing changes without synchronization would cause the two sites to diverge, making future failovers unreliable. If you encounter a blocked operation after a connectivity interruption, restore connectivity and then run openstack protector protection-group sync-force <pg-name> before retrying.
Example 1: Verify both services are running after installation
Run on each site after completing Step 9.
systemctl status protector-api protector-engine
Expected output (abbreviated):
● protector-api.service - OpenStack Protector API Service
Loaded: loaded (/etc/systemd/system/protector-api.service; enabled)
Active: active (running) since Mon 2025-01-01 08:00:00 UTC
● protector-engine.service - OpenStack Protector Engine Service
Loaded: loaded (/etc/systemd/system/protector-engine.service; enabled)
Active: active (running) since Mon 2025-01-01 08:00:01 UTC
If either service shows failed or activating, check the journal output shown in the Troubleshooting section.
Example 2: Health check the API endpoint
Confirm the API is accepting requests and returning version information:
curl -s http://controller:8788/
Expected output:
{
"versions": [
{
"id": "v1",
"status": "CURRENT",
"min_version": "1.0",
"max_version": "1.2"
}
]
}
Example 3: Confirm database schema was applied
After running protector-manage db sync, verify the expected tables exist:
mysql -u protector -p protector -e "SHOW TABLES;"
Expected output (table names may vary by release):
+-------------------------+
| Tables_in_protector |
+-------------------------+
| alembic_version |
| consistency_groups |
| cg_volumes |
| dr_operations |
| pg_members |
| protection_groups |
| replication_policies |
| sites |
+-------------------------+
Example 4: Register both sites and validate connectivity
This example assumes clouds.yaml is configured with site-a and site-b entries.
# Register Site A
openstack --os-cloud site-a protector site create \
--name site-a \
--description "Primary datacenter - Boston" \
--site-type primary \
--auth-url http://10.0.1.10:5000/v3 \
--region-name RegionOne
# Register Site B
openstack --os-cloud site-a protector site create \
--name site-b \
--description "Secondary datacenter - Seattle" \
--site-type secondary \
--auth-url http://10.0.2.10:5000/v3 \
--region-name RegionOne
# Validate both sites
openstack --os-cloud site-a protector site validate site-a
openstack --os-cloud site-a protector site validate site-b
Expected output for each validate call:
+--------------------+--------+
| Field | Value |
+--------------------+--------+
| name | site-a |
| status | active |
| keystone_reachable | True |
| nova_reachable | True |
| cinder_reachable | True |
| neutron_reachable | True |
+--------------------+--------+
If any endpoint shows False, resolve the connectivity issue before proceeding to create Protection Groups.
Use a consistent diagnostic approach for each issue: check systemctl status, then the service log at /var/log/protector/, then the systemd journal with journalctl -u <service> -n 100.
Service fails to start: protector-api or protector-engine enters failed state
Symptom: systemctl status protector-api shows Active: failed.
Likely causes and fixes:
- Configuration syntax error: run protector-api --config-file /etc/protector/protector.conf --help to surface parse errors before starting the service.
- Database unreachable: verify the connection string in [database] and test it manually: mysql -h <host> -u protector -p protector. Ensure MariaDB is running and port 3306 is open.
- Port already in use: check for a conflicting process with ss -tlnp | grep 8788. Change bind_port in [api] if needed.
- Missing directories: confirm /var/log/protector and /var/lib/protector exist and are owned by the protector user.
protector-manage db sync fails with Access denied
Symptom: (1044, "Access denied for user 'protector'@'%' to database 'protector'")
Likely cause: The database grants were not applied, or the hostname in the connection string does not match the GRANT statement.
Fix: Reconnect as root and re-issue the GRANT statements from Step 1, then retry db sync.
Keystone authentication errors in protector-api.log
Symptom: Log entries containing 401 Unauthorized or keystonemiddleware.auth_token [-] Unable to validate token.
Likely causes and fixes:
- Incorrect credentials ā Verify
usernameandpasswordin[keystone_authtoken]match the Keystone user:openstack user show protector. - Wrong
auth_urlā Confirm the URL points to the Keystone endpoint on the same site. Each site has its ownauth_url. - Service user missing role ā Re-run:
openstack role add --project service --user protector admin.
API returns 503 or is unreachable after startup
Symptom: curl http://controller:8788/ times out or returns Connection refused.
Likely causes and fixes:
- Service not running: confirm systemctl status protector-api shows active (running).
- Binding to wrong interface: if bind_host is set to a specific IP, ensure that IP is assigned to the controller (ip addr show). Use 0.0.0.0 to bind all interfaces.
- Firewall blocking port: check iptables -L -n | grep 8788 or firewall-cmd --list-ports. Open the port if needed.
protector site validate reports one or more endpoints unreachable
Symptom: cinder_reachable: False or similar after registering sites.
Likely cause: Network path between the Protector controller and the remote site's OpenStack endpoints is not open, or the endpoint URL registered in Keystone is incorrect.
Fix: From the Protector controller, test connectivity directly:
curl -s http://<remote-site-controller>:5000/v3
curl -s http://<remote-site-controller>:8776/
Resolve firewall or routing issues, then re-run protector site validate.
Cinder volume_manage operations fail during failover with Policy doesn't allow
Symptom: DR operation log shows HTTP 403 when the engine attempts to manage a volume on the target site.
Likely cause: The Cinder policy changes from Step 6 were not applied, or were applied to the wrong site.
Fix: Confirm the policy entries exist in /etc/cinder/policy.yaml (or the Kolla-Ansible equivalent) on the target site, then restart the Cinder API and volume services:
systemctl restart cinder-api cinder-volume
Modification to a Protection Group is blocked with "remote site unreachable"
Symptom: Any write operation on a Protection Group returns an error stating the remote site cannot be reached and the operation is blocked.
Likely cause: This is expected behavior. Protector blocks modifications when it cannot synchronize metadata to the peer site, to prevent divergence.
Fix: Restore connectivity to the peer site, then force a metadata sync before retrying your operation:
openstack protector protection-group sync-status <pg-name>
openstack protector protection-group sync-force <pg-name>