Introduction
Zerto provides robust disaster recovery (DR) and replication capabilities, but like any complex system, it can run into issues. This guide focuses on troubleshooting Zerto Virtual Manager (ZVM), Virtual Replication Appliances (VRAs), journal replication failures, and failover/failback issues.
We’ll cover:
✅ How to analyze Zerto logs
✅ Common ZVM, VRA, and journal replication failures
✅ Troubleshooting failover & failback
✅ Network & performance tuning for optimal RPO
1. Understanding Zerto Logs & Diagnostics
Key Log Locations in Zerto
To diagnose issues, start by reviewing logs. Here’s where to find them:
Component | Log File Location |
---|---|
Zerto Virtual Manager (ZVM) | C:\Program Files\Zerto\Zerto Virtual Replication\Logs\ |
Virtual Replication Appliance (VRA) | /var/log/zerto/ on the VRA appliance running on the ESXi/Hyper-V host |
ZCC (Zerto Cloud Connector) | C:\Program Files\Zerto\Zerto Cloud Connector\Logs\ |
Zerto Journal Logs | Inside VRA logs, under journal.log |
How to Read Zerto Logs
- Look for errors like:
  Failed to establish a connection with VRA on Host-12
  This usually means a network or firewall issue.
- Use the Zerto Diagnostics Utility (ZertoDiag.exe) to automate log collection and analysis:
  C:\Program Files\Zerto\Zerto Virtual Replication\ZertoDiag.exe
- Compare timestamps between ZVM and VRA logs to trace failures.
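The timestamp comparison can be scripted once the logs are collected. The following is a minimal Python sketch, assuming each log line begins with a `YYYY-MM-DD HH:MM:SS` timestamp; actual Zerto log formats vary by version, so `TS_PATTERN` and the `ERROR_HINTS` keywords are placeholders to adapt, and the function names are illustrative.

```python
import re
from datetime import datetime

# Assumed timestamp prefix; adjust to match your actual log lines.
TS_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
# Hypothetical keywords worth flagging; extend as needed.
ERROR_HINTS = ("failed", "error", "timeout")


def parse_lines(source, lines):
    """Return (timestamp, source, text) tuples for lines that look like errors."""
    events = []
    for line in lines:
        match = TS_PATTERN.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        if any(hint in line.lower() for hint in ERROR_HINTS):
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append((ts, source, line.strip()))
    return events


def merge_timeline(zvm_lines, vra_lines):
    """Interleave ZVM and VRA error lines in chronological order."""
    events = parse_lines("ZVM", zvm_lines) + parse_lines("VRA", vra_lines)
    return sorted(events, key=lambda event: event[0])
```

A merged timeline is only meaningful if the ZVM and VRA clocks are synchronized (for example via NTP); otherwise the ordering of events across the two sources can be misleading.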
2. ZVM, VRA & Journal Replication Failures
Common ZVM & VRA Issues
Issue | Cause | Resolution |
---|---|---|
ZVM UI not loading | Service failure, port conflict | Restart ZVMService & check firewall settings |
VRA not communicating | Network issue, firewall blocking ports | Check TCP 9080/9081 & vCenter connectivity |
Journal space exhausted | Incorrect retention settings | Increase journal size or adjust retention policy |
Replication paused | Storage latency, high I/O | Optimize storage IOPS & bandwidth |
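Before resizing a journal, it can help to estimate how much space the retention window actually needs. The following is a rough back-of-the-envelope sketch, not Zerto's own sizing formula: it assumes a steady average write rate and ignores compression and journal overhead. The function name, parameters, and the `headroom` safety factor are illustrative assumptions.

```python
def estimate_journal_gb(avg_write_mb_per_s, history_hours, headroom=1.2):
    """Rough journal size in GB for a given change rate and retention window.

    Assumes a constant write rate; headroom is an arbitrary safety factor.
    """
    seconds = history_hours * 3600
    total_mb = avg_write_mb_per_s * seconds * headroom
    return total_mb / 1024


# Example: 2 MB/s of changed data kept for 24 hours.
print(round(estimate_journal_gb(2, 24), 1))
```

If the estimate is far above the configured journal size, shortening the retention window is the cheaper fix; if it is close, adding capacity may be enough.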
Step-by-Step Fix for VRA Connectivity Issues
1️⃣ Ping VRA IP from ZVM:
ping <VRA-IP>
2️⃣ Test Port Connectivity:
telnet <VRA-IP> 9081
3️⃣ Restart VRA Service:
service zerto restart
4️⃣ If unresolved, redeploy VRA and reconfigure host settings in Zerto UI.
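On systems where telnet is not installed, the port test in step 2 can be approximated with a short script. This is a minimal sketch using only Python's standard library; the function names are illustrative, and the default ports simply mirror the ones named in this guide, so confirm the correct port list for your Zerto version.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_vra(host, ports=(9080, 9081)):
    """Map each port to its reachability; defaults mirror the ports in this guide."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling `check_vra("<VRA-IP>")` from the ZVM server returns a dictionary such as `{9080: True, 9081: False}`, which makes it easy to see which port is being blocked. A `False` result only shows that the TCP handshake failed; it does not distinguish a firewall drop from a stopped service.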
3. Failover & Failback Troubleshooting
Common Failover & Failback Issues
Issue | Cause | Fix |
---|---|---|
Failover stuck at 99% | VM tools issue | Ensure VMware Tools is installed & running |
Reverse Protection Fails | Storage or IP config mismatch | Check vCenter mapping & correct storage paths |
Failover VMs won’t power on | IP conflict, missing port groups | Verify DR site networking & static IPs |
Steps to Fix a Stuck Failover
1️⃣ Check Recovery Site Storage:
- Ensure target datastores are available and not in read-only mode.
2️⃣ Verify Network Configurations:
- Check VM NIC mappings under Recovery Settings.
3️⃣ Manually Power On the VM in vCenter:
- If the Zerto UI fails to complete the failover, manually start the VM and check for issues.
4️⃣ Check Zerto Logs for Failover Events:
grep -ri "failover" /var/log/zerto/
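The same search can be run on logs that have been copied to a workstation where grep is not available. The sketch below is a minimal Python equivalent of a recursive, case-insensitive search; the function name and the default keyword are illustrative, and it assumes the log files are plain text.

```python
import os


def find_in_logs(log_dir, keyword="failover"):
    """Yield (path, line_number, text) for lines containing keyword, ignoring case."""
    needle = keyword.lower()
    for root, _dirs, files in os.walk(log_dir):
        for name in sorted(files):
            path = os.path.join(root, name)
            try:
                with open(path, "r", errors="replace") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line.lower():
                            yield path, number, line.rstrip("\n")
            except OSError:
                continue  # unreadable file; skip rather than abort the scan
```

Because the function is a generator, results can be printed as they are found, for example `for path, number, text in find_in_logs("/var/log/zerto/"): print(path, number, text)`.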
4. Network & Performance Tuning for Zerto
Optimizing Bandwidth & Latency
- Enable WAN compression in Zerto UI for faster replication.
- Keep round-trip latency between sites below 5 ms where a low RPO is required.
- Use dedicated VLANs for replication traffic.
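To check the latency target in practice, a TCP connect time can serve as a rough stand-in for round-trip latency. The following is a minimal sketch, assuming a reachable TCP port on the remote site; the function names are illustrative, and connect time includes handshake overhead, so treat the result as an approximation rather than a precise measurement.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port, samples=5, timeout=2.0):
    """Return the median TCP connect time to host:port in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=timeout):
            elapsed = time.perf_counter() - start
        timings.append(elapsed * 1000)
    return statistics.median(timings)


def meets_latency_target(latency_ms, target_ms=5.0):
    """Compare a measured latency against the 5 ms target named in this guide."""
    return latency_ms < target_ms
```

The median is used instead of the mean so that a single slow sample does not skew the result. `tcp_connect_ms` raises `OSError` if the port is unreachable, which is itself a useful signal that the replication path needs attention.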
Fixing Slow Replication Performance
Issue | Solution |
---|---|
High replication lag | Check network congestion & prioritize replication traffic |
Slow RPO improvement | Increase VRA memory allocation |
Packet drops during replication | Verify MTU settings are consistent end to end; enable Jumbo Frames only if every hop supports them |
Conclusion
By following these step-by-step troubleshooting techniques, you can quickly diagnose and resolve Zerto replication failures, failover issues, and performance bottlenecks.