Connectivity Test Documentation
Overview
The test_connectivity.py script checks ping and SSH connectivity for each Ansible host, detects when a fallback IP should be used instead of the primary one, and reports diagnostics with suggested fixes.
Features
- Comprehensive Testing: Tests both ping and SSH connectivity
- Fallback Detection: Identifies when fallback IPs should be used
- Smart Diagnostics: Provides specific error messages and recommendations
- Multiple Output Formats: Console, quiet mode, and JSON export
- Actionable Recommendations: Suggests specific commands to fix issues
Usage
Basic Usage
# Test all hosts
make test-connectivity
# Or run directly
python3 test_connectivity.py
Advanced Options
# Quiet mode (summary only)
python3 test_connectivity.py --quiet
# Export results to JSON
python3 test_connectivity.py --json results.json
# Custom hosts file
python3 test_connectivity.py --hosts-file inventories/staging/hosts
# Custom timeout
python3 test_connectivity.py --timeout 5
Output Interpretation
Status Icons
- ✅ SUCCESS: Host is fully accessible via primary IP
- 🔑 SSH KEY: SSH key authentication issue
- 🔧 SSH SERVICE: SSH service not running
- ⚠️ SSH ERROR: Other SSH-related errors
- 🔄 USE FALLBACK: Should switch to fallback IP
- ❌ BOTH FAILED: Both primary and fallback IPs failed
- 🚫 NO FALLBACK: Primary IP failed, no fallback available
- ❓ UNKNOWN: Unexpected connectivity state
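One way the ping and SSH results could be reduced to a single status is sketched below. This is an illustration, not the script's actual logic: the function name `classify`, the decision order, and every label except "success" (which appears in the JSON example in this document) are assumptions. The SSH key and SSH service cases are folded into one `ssh_error` label because telling them apart requires the SSH error text.

```python
from typing import Optional


def classify(primary_ping: bool, primary_ssh: bool,
             fallback_ip: Optional[str],
             fallback_ping: bool = False,
             fallback_ssh: bool = False) -> str:
    """Map raw check results to a coarse status label (labels are illustrative)."""
    if primary_ping and primary_ssh:
        return "success"
    if primary_ssh:
        # SSH answers although ping does not: an unexpected combination.
        return "unknown"
    if primary_ping:
        # Host is reachable, but SSH failed (key, service, or other error).
        return "ssh_error"
    if fallback_ip is None:
        return "no_fallback"
    if fallback_ping and fallback_ssh:
        return "use_fallback"
    return "both_failed"
```

The fallback results are only consulted once the primary IP has failed both checks, which mirrors the order of the icon list above.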
Common Issues and Solutions
SSH Key Issues
🔑 Fix SSH key issues (2 hosts):
make copy-ssh-key HOST=dev01
make copy-ssh-key HOST=debianDesktopVM
Solution: Run the suggested make copy-ssh-key commands
Fallback Recommendations
🔄 Switch to fallback IPs (1 hosts):
sed -i 's/vaultwardenVM ansible_host=100.100.19.11/vaultwardenVM ansible_host=10.0.10.142/' inventories/production/hosts
Solution: Run the suggested sed command or use make auto-fallback
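The sed command above rewrites a single ansible_host value. The same edit can be sketched in Python for any host that declares ansible_host_fallback. The function name `switch_to_fallback` is hypothetical, and this is not necessarily what make auto-fallback does internally.

```python
import re


def switch_to_fallback(inventory_text: str, hostname: str) -> str:
    """Return inventory_text with hostname's ansible_host set to its fallback IP."""
    out = []
    for line in inventory_text.splitlines(keepends=True):
        fields = line.split()
        if fields and fields[0] == hostname:
            match = re.search(r"\bansible_host_fallback=(\S+)", line)
            if match:
                fallback = match.group(1)
                # "ansible_host=" cannot match inside "ansible_host_fallback=",
                # so only the primary address is rewritten.
                line = re.sub(r"\bansible_host=\S+",
                              lambda _m: "ansible_host=" + fallback,
                              line, count=1)
        out.append(line)
    return "".join(out)
```

Lines for other hosts, and hosts without a fallback entry, pass through unchanged.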
Critical Issues
🚨 Critical issues (4 hosts):
bottom: ✗ bottom: Primary IP 10.0.10.156 failed, no fallback available
Solution: Check network connectivity, host status, or add fallback IPs
Integration with Ansible Workflow
Before Running Ansible
# Test connectivity first
make test-connectivity
# Fix any issues, then run Ansible
make apply
Automated Fallback
# Automatically switch to working IPs
make auto-fallback
# Then run your Ansible tasks
make apply
Configuration
Hosts File Format
Each host line sets ansible_host and may add an optional ansible_host_fallback variable:
vaultwardenVM ansible_host=100.100.19.11 ansible_host_fallback=10.0.10.142 ansible_user=ladmin
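A host line in this format can be split into the fields the script reports. The parser below is a minimal sketch: `parse_host_line` is a hypothetical name, the output keys are borrowed from the JSON example in this document, and values containing spaces or quotes are not handled.

```python
from typing import Dict, Optional


def parse_host_line(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Parse one INI-style inventory line; return None for non-host lines."""
    line = line.strip()
    # Skip blanks, comments (# or ;) and [group] headers.
    if not line or line.startswith(("#", ";", "[")):
        return None
    name, *pairs = line.split()
    variables = dict(p.split("=", 1) for p in pairs if "=" in p)
    return {
        "hostname": name,
        "primary_ip": variables.get("ansible_host"),
        "fallback_ip": variables.get("ansible_host_fallback"),
        "user": variables.get("ansible_user"),
    }
```

A host without ansible_host_fallback yields `fallback_ip` of None, which corresponds to the "no fallback available" case.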
Timeout Settings
- Ping timeout: 3 seconds (configurable with --timeout)
- SSH timeout: 5 seconds (hardcoded for reliability)
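How the script invokes ping and ssh is not shown here. As a sketch, the two timeouts could be applied with standard flags as below. The function names are hypothetical, `-W` is given in seconds as on Linux iputils ping (other platforms differ), and `BatchMode` and `ConnectTimeout` are standard OpenSSH options.

```python
from typing import List

PING_TIMEOUT_S = 3  # default from this document; --timeout would override it
SSH_TIMEOUT_S = 5   # fixed value from this document


def ping_command(ip: str, timeout_s: int = PING_TIMEOUT_S) -> List[str]:
    # -c 1 sends a single probe; -W waits timeout_s seconds for the reply
    # (Linux iputils semantics, assumed here).
    return ["ping", "-c", "1", "-W", str(timeout_s), ip]


def ssh_command(user: str, ip: str, timeout_s: int = SSH_TIMEOUT_S) -> List[str]:
    # BatchMode=yes disables interactive prompts, so a missing key fails
    # immediately instead of waiting for a password.
    return ["ssh", "-o", "BatchMode=yes",
            "-o", f"ConnectTimeout={timeout_s}",
            f"{user}@{ip}", "true"]
```

Either list can be passed to `subprocess.run(cmd, capture_output=True)`, with the return code deciding whether the check passed.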
Troubleshooting
Common Problems
- "Permission denied (publickey)"
  - Run: make copy-ssh-key HOST=hostname
- "Connection refused"
  - Check if SSH service is running on target host
  - Verify firewall settings
- "Host key verification failed"
  - Add host to known_hosts: ssh-keyscan hostname >> ~/.ssh/known_hosts
- "No route to host"
  - Check network connectivity
  - Verify IP addresses are correct
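The error-to-fix mapping above can be expressed as a small lookup over the SSH error text. This is a sketch under assumptions: `suggest_fix` and `HINTS` are hypothetical names, and the matching is a plain case-insensitive substring test rather than whatever the script does internally.

```python
from typing import List, Tuple

# (substring of the SSH error, suggested action), taken from the list above.
HINTS: List[Tuple[str, str]] = [
    ("permission denied (publickey",
     "Run: make copy-ssh-key HOST=<hostname>"),
    ("connection refused",
     "Check that the SSH service is running and review firewall settings"),
    ("host key verification failed",
     "Add the host to known_hosts: ssh-keyscan <hostname> >> ~/.ssh/known_hosts"),
    ("no route to host",
     "Check network connectivity and verify the IP address"),
]


def suggest_fix(stderr: str) -> str:
    """Return the first matching hint for an SSH error message."""
    text = stderr.lower()
    for needle, hint in HINTS:
        if needle in text:
            return hint
    return "No known fix; inspect the full SSH error output"
```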
Debug Mode
To rule out slow responses while debugging, rerun with a longer timeout:
python3 test_connectivity.py --timeout 10
JSON Output Format
When using --json, the script writes a list with one record per host:
[
{
"hostname": "vaultwardenVM",
"group": "vaultwarden",
"primary_ip": "100.100.19.11",
"fallback_ip": "10.0.10.142",
"user": "ladmin",
"primary_ping": true,
"primary_ssh": true,
"fallback_ping": true,
"fallback_ssh": true,
"status": "success",
"recommendation": "✓ vaultwardenVM is fully accessible via primary IP 100.100.19.11"
}
]
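A results file in this shape is easy to post-process for logging or alerting. The sketch below uses only field names from the example above. The helper names are hypothetical, and it assumes any status other than "success" needs attention, since the other status values are not listed here.

```python
import json
from typing import Any, Dict, List


def load_results(path: str) -> List[Dict[str, Any]]:
    """Read the list of per-host records written by --json."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def failing_hosts(results: List[Dict[str, Any]]) -> List[str]:
    """Return 'hostname: recommendation' for every host that is not a success."""
    return [
        f"{record['hostname']}: {record.get('recommendation', '')}"
        for record in results
        if record.get("status") != "success"
    ]
```

For example, `failing_hosts(load_results("results.json"))` returns an empty list when every host is reachable on its primary IP.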
Best Practices
- Run before Ansible operations to catch connectivity issues early
- Use quiet mode in scripts: python3 test_connectivity.py --quiet
- Export JSON results for logging and monitoring
- Fix SSH key issues before running Ansible
- Use auto-fallback for automated IP switching
Integration with CI/CD
# In your CI pipeline
if ! make test-connectivity; then
  echo "Connectivity issues detected"
  exit 1
fi
make apply