# Maintenance
Best practices for upgrades, backups, and ongoing maintenance
Regular maintenance keeps your Drasi deployment healthy and secure. This guide covers upgrade procedures, backup strategies, and routine maintenance tasks.
## Upgrade Procedures

### Planning an Upgrade
Before upgrading:
- Review release notes for breaking changes
- Test in non-production environment first
- Schedule maintenance window if needed
- Notify stakeholders of potential disruption
- Prepare rollback plan
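As part of the rollback plan, it helps to snapshot the current version and pod images before you start, so you have a known-good reference. A minimal sketch (the directory layout and file names are suggestions):

```bash
#!/bin/bash
# Capture pre-upgrade state for comparison and rollback reference.
SNAPSHOT_DIR="pre-upgrade-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$SNAPSHOT_DIR"

# Record the currently installed version
drasi version > "$SNAPSHOT_DIR/version.txt"

# Record pod status and container images before the upgrade
kubectl get pods -n drasi-system -o wide > "$SNAPSHOT_DIR/pods.txt"
kubectl get pods -n drasi-system \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].image}{"\n"}{end}' \
  > "$SNAPSHOT_DIR/images.txt"

echo "Snapshot saved to $SNAPSHOT_DIR"
```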
### Upgrading Drasi

Use the Drasi CLI to upgrade:

```bash
# Check the current version
drasi version

# Upgrade to the latest version
drasi upgrade

# Upgrade to a specific version
drasi upgrade --version v0.2.0
```
### Rolling Upgrades
For minimal disruption, Drasi supports rolling upgrades:
- Components upgrade one at a time
- Health checks ensure new pods are ready
- Traffic shifts gradually to new versions
Monitor pod status during the upgrade:

```bash
kubectl get pods -n drasi-system -w
```
### Rollback

If issues occur after an upgrade:

```bash
# Roll back to the previous version
drasi rollback

# Or reinstall a specific version
drasi uninstall
drasi init --version v0.1.0
```
## Backup and Recovery

### What to Backup
Critical data to preserve:
| Component | Data | Backup Method |
|---|---|---|
| Sources | Configuration | `kubectl get sources -o yaml` |
| Queries | Definitions | `kubectl get continuousqueries -o yaml` |
| Reactions | Configuration | `kubectl get reactions -o yaml` |
| Secrets | Credentials | `kubectl get secrets -o yaml` |
### Backup Script

Create a comprehensive backup:

```bash
#!/bin/bash
BACKUP_DIR="drasi-backup-$(date +%Y%m%d)"
mkdir -p "$BACKUP_DIR"

# Back up all Drasi resources
kubectl get sources -n drasi-system -o yaml > "$BACKUP_DIR/sources.yaml"
kubectl get continuousqueries -n drasi-system -o yaml > "$BACKUP_DIR/queries.yaml"
kubectl get reactions -n drasi-system -o yaml > "$BACKUP_DIR/reactions.yaml"
kubectl get secrets -n drasi-system -o yaml > "$BACKUP_DIR/secrets.yaml"
kubectl get configmaps -n drasi-system -o yaml > "$BACKUP_DIR/configmaps.yaml"

# Archive the directory
tar -czf "$BACKUP_DIR.tar.gz" "$BACKUP_DIR"
echo "Backup saved to $BACKUP_DIR.tar.gz"
```
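To ensure backups run consistently (and to bound their growth), the script can be scheduled from an admin host. The paths below are illustrative:

```
# Run the backup script nightly at 02:00; prune archives beyond the newest 14
0 2 * * * /opt/drasi/backup.sh
30 2 * * * ls -1t /opt/drasi/drasi-backup-*.tar.gz | tail -n +15 | xargs -r rm --
```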
### Restore Procedure

To restore from a backup:

```bash
# Extract the backup
tar -xzf drasi-backup-20240115.tar.gz

# Apply resources, secrets first so other resources can reference them
kubectl apply -f drasi-backup-20240115/secrets.yaml
kubectl apply -f drasi-backup-20240115/sources.yaml
kubectl apply -f drasi-backup-20240115/queries.yaml
kubectl apply -f drasi-backup-20240115/reactions.yaml
```
## Routine Maintenance

### Daily Tasks
- Review monitoring dashboards for anomalies
- Check alert history for overnight issues
- Verify change processing is current (no lag)
### Weekly Tasks
- Review resource utilization trends
- Check for pending security updates
- Verify backups completed successfully
- Review error logs for patterns
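The weekly backup verification can be partly automated. The sketch below flags the newest archive produced by the backup script above as stale when it is more than a week old (it assumes the `drasi-backup-YYYYMMDD.tar.gz` naming and GNU `date`):

```bash
#!/bin/bash
# Flag the backup as stale if the newest drasi-backup-YYYYMMDD.tar.gz
# archive is over 7 days old.
latest=$(ls -1 drasi-backup-*.tar.gz 2>/dev/null | sort | tail -n 1)
if [ -z "$latest" ]; then
  echo "ALERT: no backup archives found"
else
  stamp=${latest#drasi-backup-}   # strip the prefix...
  stamp=${stamp%.tar.gz}          # ...and the suffix, leaving YYYYMMDD
  age_days=$(( ( $(date +%s) - $(date -d "$stamp" +%s) ) / 86400 ))
  if [ "$age_days" -gt 7 ]; then
    echo "ALERT: latest backup ($latest) is $age_days days old"
  else
    echo "OK: latest backup is $latest"
  fi
fi
```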
### Monthly Tasks
- Audit access permissions
- Review and update alert thresholds
- Test backup restoration
- Review capacity planning
## Log Management

### Log Retention

Configure log retention to balance storage costs against troubleshooting needs:
```yaml
# Example Fluent Bit configuration for Kubernetes logging
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
data:
  retention.conf: |
    [OUTPUT]
        Name                      es
        Logstash_Prefix           drasi
        Logstash_Prefix_Separator -
        Logstash_DateFormat       %Y.%m
        # Indexes are created per month; enforce retention (e.g. 30 days)
        # with your log store's lifecycle policy
```
### Log Aggregation
Aggregate logs from all components for easier analysis:
- Use centralized logging (ELK, Loki, CloudWatch)
- Add structured logging fields
- Create log-based alerts
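With structured (JSON) logs in place, error entries can be filtered per component directly from `kubectl logs`. The label selector and the `level`/`time`/`msg` field names below are illustrative; adjust them to your deployment's labels and log schema:

```bash
# Filter error-level entries out of a component's structured (JSON) logs.
kubectl logs -n drasi-system -l app=my-component --since=1h \
  | jq -r 'select(.level == "error") | [.time, .msg] | @tsv'
```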
## Credential Rotation

### Planning Rotation
Create a rotation schedule:
| Credential | Rotation Frequency | Procedure |
|---|---|---|
| Database passwords | 90 days | Update secret, restart source |
| API keys | 90 days | Update secret, restart reaction |
| TLS certificates | Before expiry | Update secret, restart components |
### Rotation Procedure

1. Create the new credential in the external system
2. Update the Kubernetes secret:

   ```bash
   kubectl create secret generic new-db-creds \
     --from-literal=username=user \
     --from-literal=password=newpassword \
     -n drasi-system
   ```

3. Update the source/reaction configuration to reference the new secret
4. Verify connectivity with the new credentials
5. Delete the old credential after confirmation
## Capacity Planning

### Monitoring Growth
Track these metrics over time:
- Change rate from sources
- Query result volume
- Storage utilization
- Resource consumption
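A simple starting point for tracking resource consumption over time is sampling `kubectl top` on a schedule (this requires metrics-server in the cluster; the CSV layout is a suggestion):

```bash
#!/bin/bash
# Append a timestamped resource-usage sample to a CSV for trend analysis.
# Requires metrics-server to be installed in the cluster.
OUT=drasi-usage.csv
ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
kubectl top pods -n drasi-system --no-headers \
  | while read -r pod cpu mem; do
      echo "$ts,$pod,$cpu,$mem" >> "$OUT"
    done
```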
### Planning Ahead
- Review growth trends quarterly
- Plan scaling 2-3 months ahead
- Budget for resource increases
- Test scaling in non-production
## Health Checks

### Manual Health Check

Periodically verify components by hand:

```bash
# Check that all components are running
kubectl get pods -n drasi-system

# Verify sources are connected
kubectl get sources -n drasi-system

# Check queries are running
kubectl get continuousqueries -n drasi-system

# Verify reactions are healthy
kubectl get reactions -n drasi-system
```
### Automated Health Checks

Set up automated health checks, for example with a CronJob:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: drasi-health-check
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          # Needs a service account with permission to list pods in drasi-system
          containers:
          - name: health-check
            image: bitnami/kubectl
            command:
            - /bin/sh
            - -c
            - |
              # --no-headers avoids matching the header row; Completed pods
              # from finished jobs are expected and excluded
              if kubectl get pods -n drasi-system --no-headers | grep -v -e Running -e Completed; then
                echo "ALERT: Some pods are not running"
                # Send alert
              fi
          restartPolicy: OnFailure
```
## Documentation

### Maintaining Runbooks
Keep runbooks updated with:
- Current configuration
- Common issues and resolutions
- Contact information
- Escalation procedures
### Change Documentation
Document all changes:
- What changed
- Why it changed
- Who made the change
- Rollback procedure
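One lightweight way to capture these points is a dated entry in a shared change log. The entry below is only a template with example values:

```
## 2024-01-15 - Increased query container memory limit
- What: memory limit raised from 512Mi to 1Gi
- Why: OOMKilled events during peak change volume
- Who: jane.doe
- Rollback: re-apply the previous limit and restart the component
```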
## Next Steps
- Set up Monitoring
- Configure Scaling for growth
- Review Security patterns