Maintenance

Best practices for upgrades, backups, and ongoing maintenance

Regular maintenance keeps your Drasi deployment healthy and secure. This guide covers upgrade procedures, backup strategies, and routine maintenance tasks.

Upgrade Procedures

Planning an Upgrade

Before upgrading:

Review release notes for breaking changes
Test in a non-production environment first
Schedule a maintenance window if needed
Notify stakeholders of potential disruption
Prepare a rollback plan

Upgrading Drasi

Use the Drasi CLI to upgrade:

# Check current version
drasi version

# Upgrade to latest version
drasi upgrade

# Upgrade to specific version
drasi upgrade --version v0.2.0
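Recording the installed version before an upgrade makes the rollback target unambiguous. The following is a minimal sketch, assuming `drasi version` prints its result to standard output; the function name and the log file name are hypothetical.

```shell
#!/bin/bash
# Append a UTC timestamp and the output of `drasi version` to a log file,
# so the pre-upgrade version is on record.
# The default file name is a hypothetical choice.
record_drasi_version() {
  local log_file="${1:-drasi-version-history.log}"
  local version_output
  version_output="$(drasi version)" || return 1
  printf '%s\n%s\n---\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$version_output" >> "$log_file"
}
```

Calling `record_drasi_version` immediately before `drasi upgrade` leaves a dated entry that can be consulted if a rollback is needed.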

Rolling Upgrades

For minimal disruption, Drasi supports rolling upgrades:

Components upgrade one at a time
Health checks ensure new pods are ready
Traffic shifts gradually to new versions

Monitor during upgrade:

kubectl get pods -n drasi-system -w
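Watching pods interactively works for a manual upgrade, but a scripted upgrade needs a pass/fail signal. The sketch below wraps `kubectl wait` for that purpose; the function name and the default timeout are assumptions, not Drasi defaults.

```shell
#!/bin/bash
# Block until every pod in the namespace reports Ready, or fail on timeout.
# Returns non-zero on failure so a caller can decide whether to roll back.
wait_for_drasi_ready() {
  local namespace="${1:-drasi-system}"
  local timeout="${2:-300s}"
  if kubectl wait --for=condition=Ready pods --all -n "$namespace" --timeout="$timeout"; then
    echo "All pods in $namespace are Ready"
  else
    echo "Pods in $namespace were not Ready within $timeout" >&2
    # Show current pod status to help diagnose which component is stuck.
    kubectl get pods -n "$namespace" >&2
    return 1
  fi
}
```

Pods that belong to finished Jobs never become Ready, so if the namespace contains such pods this check may time out even when the upgrade succeeded.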

Rollback

If issues occur after upgrade:

# Rollback to previous version
drasi rollback

# Or reinstall a specific version (back up your resources first, since uninstall removes the installation)
drasi uninstall
drasi init --version v0.1.0

Backup and Recovery

What to Backup

Critical data to preserve:

Component	Data	Backup Method
Sources	Configuration	`kubectl get sources -n drasi-system -o yaml`
Queries	Definitions	`kubectl get continuousqueries -n drasi-system -o yaml`
Reactions	Configuration	`kubectl get reactions -n drasi-system -o yaml`
Secrets	Credentials	`kubectl get secrets -n drasi-system -o yaml`

Backup Script

Create a comprehensive backup:

#!/bin/bash
# Stop on the first failed command so a partial backup is never archived
set -euo pipefail

BACKUP_DIR="drasi-backup-$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

# Back up all Drasi resources
kubectl get sources -n drasi-system -o yaml > "$BACKUP_DIR/sources.yaml"
kubectl get continuousqueries -n drasi-system -o yaml > "$BACKUP_DIR/queries.yaml"
kubectl get reactions -n drasi-system -o yaml > "$BACKUP_DIR/reactions.yaml"
kubectl get secrets -n drasi-system -o yaml > "$BACKUP_DIR/secrets.yaml"
kubectl get configmaps -n drasi-system -o yaml > "$BACKUP_DIR/configmaps.yaml"

# Archive; the result contains secrets, so restrict access and store it securely
tar -czf "$BACKUP_DIR.tar.gz" "$BACKUP_DIR"
chmod 600 "$BACKUP_DIR.tar.gz"
echo "Backup saved to $BACKUP_DIR.tar.gz"
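A backup is only useful if the archive actually contains the expected files. The following sketch checks an archive produced by the script above for a non-empty copy of each manifest; the function name is hypothetical, and the check says nothing about whether the YAML inside is valid.

```shell
#!/bin/bash
# Check that a backup archive holds a non-empty copy of each expected file.
# Prints the name of every missing or empty file and returns non-zero if any.
verify_drasi_backup() {
  local archive="$1"
  local backup_dir name status=0
  # The archive stores files under a directory named after the archive itself.
  backup_dir="$(basename "$archive" .tar.gz)"
  for name in sources queries reactions secrets configmaps; do
    # Extract the member to stdout; grep succeeds only if there is content.
    if ! tar -xzOf "$archive" "$backup_dir/$name.yaml" 2>/dev/null | grep -q .; then
      echo "missing or empty: $name.yaml"
      status=1
    fi
  done
  return "$status"
}
```

For example, `verify_drasi_backup drasi-backup-20240115.tar.gz` prints nothing and returns 0 when all five files are present and non-empty.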

Restore Procedure

To restore from backup:

# Extract backup
tar -xzf drasi-backup-20240115.tar.gz

# Apply resources (secrets first, so sources and reactions can reference them)
kubectl apply -f drasi-backup-20240115/secrets.yaml
kubectl apply -f drasi-backup-20240115/sources.yaml
kubectl apply -f drasi-backup-20240115/queries.yaml
kubectl apply -f drasi-backup-20240115/reactions.yaml
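Before applying a backup to a live cluster, it can be worth validating every manifest first. The sketch below runs a server-side dry run over all files and only then applies them in the same order as the commands above; the function name is hypothetical.

```shell
#!/bin/bash
# Restore backed-up manifests in dependency order: secrets first, then the
# sources, queries, and reactions that may reference them.
# Every file is validated with a server-side dry run before anything is applied.
restore_drasi_backup() {
  local backup_dir="$1"
  local name
  for name in secrets sources queries reactions; do
    kubectl apply --dry-run=server -f "$backup_dir/$name.yaml" || return 1
  done
  for name in secrets sources queries reactions; do
    kubectl apply -f "$backup_dir/$name.yaml" || return 1
  done
}
```

Running `restore_drasi_backup drasi-backup-20240115` stops at the first manifest the API server rejects, before any resource has been changed.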

Routine Maintenance

Daily Tasks

Review monitoring dashboards for anomalies
Check alert history for overnight issues
Verify change processing is current (no lag)

Weekly Tasks

Review resource utilization trends
Check for pending security updates
Verify backups completed successfully
Review error logs for patterns

Monthly Tasks

Audit access permissions
Review and update alert thresholds
Test backup restoration
Review capacity planning

Log Management

Log Retention

Configure log retention to balance storage costs and troubleshooting needs:

# For Kubernetes logging
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
data:
  retention.conf: |
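    [OUTPUT]
        Name  es
        # Logstash_Format must be On for the prefix and date format to apply
        Logstash_Format On
        Logstash_Prefix drasi
        Logstash_Prefix_Separator -
        Logstash_DateFormat %Y.%m
        # Writes one index per month (drasi-YYYY.MM); enforce the 30-day
        # retention in Elasticsearch by deleting expired indices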
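The output configuration above only controls how indices are named; old indices still have to be removed on the Elasticsearch side. As a sketch of the selection step, the function below picks monthly indices older than a cutoff month. It assumes index names of the form `drasi-YYYY.MM`, as produced by the prefix, separator, and date format shown; how the names are listed and how the selected indices are deleted (lifecycle policy, scheduled job, or manual call) is left open.

```shell
#!/bin/bash
# Read index names (one per line, e.g. drasi-2024.01) on stdin and print
# those whose month is strictly earlier than the cutoff (format YYYY.MM).
# Zero-padded YYYY.MM strings sort chronologically, so a string comparison
# is sufficient. Names that do not match the pattern are ignored.
list_expired_indices() {
  local cutoff="$1" index
  while IFS= read -r index; do
    if [[ "$index" =~ ^drasi-([0-9]{4}\.[0-9]{2})$ ]] &&
       [[ "${BASH_REMATCH[1]}" < "$cutoff" ]]; then
      echo "$index"
    fi
  done
}
```

Because indices are monthly, retention is only enforced at month granularity: a cutoff of `2024.02` keeps everything written in February 2024 or later.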

Log Aggregation

Aggregate logs from all components for easier analysis:

Use centralized logging (ELK, Loki, CloudWatch)
Add structured logging fields
Create log-based alerts

Credential Rotation

Planning Rotation

Create a rotation schedule:

Credential	Rotation Frequency	Procedure
Database passwords	90 days	Update secret, restart source
API keys	90 days	Update secret, restart reaction
TLS certificates	Before expiry	Update secret, restart components
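The 90-day schedule in the table can be checked mechanically. The following is a minimal sketch of the age comparison only; the function name is hypothetical, and both timestamps are expected as Unix epoch seconds.

```shell
#!/bin/bash
# Succeed (exit 0) when a credential created at created_epoch is at least
# max_age_days old at now_epoch. The default of 90 days matches the
# rotation frequency in the table above.
is_rotation_due() {
  local created_epoch="$1" now_epoch="$2" max_age_days="${3:-90}"
  local age_days=$(( (now_epoch - created_epoch) / 86400 ))
  [ "$age_days" -ge "$max_age_days" ]
}
```

One possible source for the creation time is the secret's `metadata.creationTimestamp`, read with `kubectl get secret <name> -n drasi-system -o jsonpath='{.metadata.creationTimestamp}'` and converted to epoch seconds (for example with GNU `date -d`). That timestamp reflects when the secret object was created, not when its contents were last changed, so it is only reliable when each rotation creates a new secret.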

Rotation Procedure

Create the new credential in the external system
Create a new Kubernetes secret that holds it:

kubectl create secret generic new-db-creds \
  --from-literal=username=user \
  --from-literal=password=newpassword \
  -n drasi-system

Update the source or reaction configuration to reference the new secret
Verify connectivity with the new credentials
Delete the old credential and its secret after confirming the new one works
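Passing a password with `--from-literal` leaves it in shell history. As an alternative for the secret creation step, the sketch below reads the password from a file and uses the common `--dry-run=client -o yaml | kubectl apply -f -` idiom, which creates the secret if it is absent and updates it otherwise. The function name and its parameters are hypothetical.

```shell
#!/bin/bash
# Create or update a generic secret idempotently: render the manifest
# client-side, then pipe it to `kubectl apply`.
# The password is read from a file, so only the file path (not the
# password) appears on the command line.
apply_db_secret() {
  local secret_name="$1" username="$2" password_file="$3"
  local namespace="${4:-drasi-system}"
  kubectl create secret generic "$secret_name" \
    --from-literal=username="$username" \
    --from-file=password="$password_file" \
    -n "$namespace" \
    --dry-run=client -o yaml | kubectl apply -f -
}
```

For example, `apply_db_secret new-db-creds user ./db-password.txt`. Note that `--from-file` stores the file contents byte for byte, so a trailing newline in the file becomes part of the password.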

Capacity Planning

Monitoring Growth

Track these metrics over time:

Change rate from sources
Query result volume
Storage utilization
Resource consumption

Planning Ahead

Review growth trends quarterly
Plan scaling 2-3 months ahead
Budget for resource increases
Test scaling in non-production
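To decide whether scaling work falls inside the 2-3 month planning horizon, a rough projection is often enough. The sketch below assumes linear growth, which is a simplification; the function name is hypothetical, and all three arguments must use the same unit (for example GiB of storage).

```shell
#!/bin/bash
# Estimate how many whole months remain before usage reaches capacity,
# assuming usage grows by a constant amount each month.
# Prints "never" when growth is zero or negative, and 0 when already full.
months_until_full() {
  local current="$1" capacity="$2" monthly_growth="$3"
  awk -v cur="$current" -v cap="$capacity" -v g="$monthly_growth" 'BEGIN {
    if (g <= 0) { print "never"; exit }
    remaining = cap - cur
    if (remaining <= 0) { print 0; exit }
    print int(remaining / g)
  }'
}
```

With 60 units used out of 100 and growth of 15 units per month, `months_until_full 60 100 15` prints `2`, which means scaling should already be planned.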

Health Checks

Manual Health Check

Periodic manual verification:

# Check all components are running
kubectl get pods -n drasi-system

# Verify sources are connected
kubectl get sources -n drasi-system

# Check queries are running
kubectl get continuousqueries -n drasi-system

# Verify reactions are healthy
kubectl get reactions -n drasi-system
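Reading the pod list by eye becomes error-prone as the deployment grows. The following sketch filters the standard `kubectl get pods --no-headers` output down to pods that need attention; the function names are hypothetical, and the parsing assumes the default column layout (NAME, READY, STATUS, ...).

```shell
#!/bin/bash
# Print the names of pods that need attention. Expects the output of
# `kubectl get pods --no-headers` on stdin (columns: NAME READY STATUS ...).
# A pod is reported when its status is neither Running nor Completed,
# or when it is Running but not all of its containers are ready.
list_unhealthy_pods() {
  awk '{
    if ($3 == "Completed") next
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Run the check against a namespace; returns non-zero if any pod is unhealthy.
check_drasi_pods() {
  local namespace="${1:-drasi-system}" unhealthy
  unhealthy="$(kubectl get pods -n "$namespace" --no-headers | list_unhealthy_pods)"
  if [ -n "$unhealthy" ]; then
    echo "Unhealthy pods in $namespace:"
    echo "$unhealthy"
    return 1
  fi
  echo "All pods in $namespace are healthy"
}
```

Calling `check_drasi_pods` with no arguments checks `drasi-system`; the non-zero exit status makes it easy to chain into an alerting command.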

Automated Health Checks

Set up automated health checks:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: drasi-health-check
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: health-check
            image: bitnami/kubectl
            command:
            - /bin/sh
            - -c
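            - |
              # --no-headers keeps the header row from matching grep -v.
              # The pod's service account needs permission to list pods.
              if kubectl get pods -n drasi-system --no-headers | grep -Ev 'Running|Completed'; then
                echo "ALERT: Some pods are not running"
                # Send alert
              fi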
          restartPolicy: OnFailure
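A CronJob pod that calls `kubectl get pods` needs a service account with permission to read pods; the default service account usually lacks it. The sketch below creates a narrowly scoped account with imperative `kubectl create` commands. The resource name `drasi-health-check` and the function name are hypothetical.

```shell
#!/bin/bash
# Create a service account that may only get and list pods in one namespace,
# for use by a health-check CronJob. Stops at the first command that fails.
setup_health_check_rbac() {
  local namespace="${1:-drasi-system}"
  local name="drasi-health-check"
  kubectl create serviceaccount "$name" -n "$namespace" &&
  kubectl create role "$name" --verb=get,list --resource=pods -n "$namespace" &&
  kubectl create rolebinding "$name" --role="$name" \
    --serviceaccount="$namespace:$name" -n "$namespace"
}
```

After running `setup_health_check_rbac`, the CronJob would reference the account by setting `serviceAccountName: drasi-health-check` in its pod `spec`, and the CronJob itself would need to be created in the same namespace.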

Documentation

Maintaining Runbooks

Keep runbooks updated with:

Current configuration
Common issues and resolutions
Contact information
Escalation procedures

Change Documentation

Document all changes:

What changed
Why it changed
Who made the change
Rollback procedure

Next Steps

Set up Monitoring
Configure Scaling for growth
Review Security patterns
