Monitoring & Maintenance
This guide covers monitoring the bot's health and performing regular maintenance tasks.
Health Monitoring
Real-time Status Monitoring
Using PM2
If the bot is deployed with PM2, use PM2's built-in monitoring:
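A minimal sketch of the PM2 commands most useful for a quick look at the bot. The process name `telegram-bot` is a placeholder, not a name taken from this project; substitute whatever name was passed to `pm2 start`.

```bash
# Hypothetical PM2 process name; replace with the name used at deploy time.
BOT_NAME="${BOT_NAME:-telegram-bot}"

# One-shot summary: the process table, then details for the bot
# (uptime, restart count, memory, script path).
pm2_overview() {
  pm2 status
  pm2 describe "$BOT_NAME"
}

# Print the last 100 log lines without attaching to the live stream.
pm2_recent_logs() {
  pm2 logs "$BOT_NAME" --lines 100 --nostream
}
```

For a live view of CPU and memory, `pm2 monit` opens an interactive dashboard; it is left out of the functions because it does not return until closed.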
Using Systemd
For systemd deployments:
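The equivalent checks under systemd might look like the sketch below. The unit name `telegram-bot.service` is an assumption; use the name of the unit file that was actually installed. The `NRestarts` property requires systemd 235 or newer.

```bash
# Hypothetical unit name; use the name of the installed .service file.
BOT_UNIT="${BOT_UNIT:-telegram-bot.service}"

# Show current state, main PID, and the most recent log lines.
bot_status() {
  systemctl status "$BOT_UNIT" --no-pager
}

# Print how many times systemd has restarted the service.
bot_restart_count() {
  systemctl show "$BOT_UNIT" --property=NRestarts --value
}

# Print the last hour of journal entries for the bot.
bot_recent_journal() {
  journalctl -u "$BOT_UNIT" --since "1 hour ago" --no-pager
}
```

To follow the log live instead, run `journalctl -u telegram-bot.service -f`.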
Key Metrics to Monitor
1. Process Health
- Uptime: The process should run continuously
- Restarts: Frequent restarts indicate an underlying problem
- Memory Usage: Should stay under 1 GB
- CPU Usage: Should be minimal except during monitoring cycles
2. Bot Responsiveness
- Command response time < 1 second
- Monitoring cycle completion time
- Queue size (if using Redis)
3. Database Health
- Database file size
- Query execution time
- Number of active subscriptions
4. External Services
- Telegram API connectivity
- Blockchain RPC availability
- Redis connection (if used)
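For deployments that use Redis, the connection and queue-size metrics listed above can be checked from the shell. This is a sketch only: the queue key `bot:queue` is hypothetical, and it assumes the queue is stored as a Redis list.

```bash
# CLI binary to call; overridable so the checks can be pointed at a wrapper.
REDIS_CLI="${REDIS_CLI:-redis-cli}"

# Return 0 if the Redis server answers PING with PONG.
redis_alive() {
  [ "$("$REDIS_CLI" ping 2>/dev/null)" = "PONG" ]
}

# Print the length of a Redis list used as a work queue.
# The default key name is hypothetical.
queue_size() {
  "$REDIS_CLI" llen "${1:-bot:queue}"
}
```

A queue size that keeps growing between checks suggests the bot is falling behind its monitoring cycle.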
Monitoring Tools
Application Logs
The bot uses the Winston logger with daily log rotation:
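With daily rotation, each day's output usually lands in a file whose name contains the date. The helpers below sketch how to locate today's file and count error lines in it. The `./logs` directory and the `app-YYYY-MM-DD.log` naming scheme are assumptions; check the bot's Winston transport settings for the real values.

```bash
# Assumed log directory and naming scheme (e.g. logs/app-2024-05-01.log);
# adjust both to match the bot's Winston transport settings.
LOG_DIR="${LOG_DIR:-./logs}"

# Print the path of today's application log under the assumed naming scheme.
todays_log() {
  printf '%s/app-%s.log\n' "$LOG_DIR" "$(date +%Y-%m-%d)"
}

# Count lines containing "error" (case-insensitive) in a log file.
# Prints 0 when the file is missing or has no matches.
count_errors() {
  if [ -f "$1" ]; then
    grep -ci 'error' "$1" || true
  else
    echo 0
  fi
}
```

Typical use would be `count_errors "$(todays_log)"`.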
Performance Monitoring
Memory Usage Tracking
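One way to track memory from outside the process is to read its resident set size with `ps` and compare it with the 1 GB guideline from the key metrics. The sketch below assumes the bot's PID is already known (for example from PM2 or systemd).

```bash
# Limit taken from the "under 1 GB" guideline, expressed in MB.
MEM_LIMIT_MB="${MEM_LIMIT_MB:-1024}"

# Print the resident set size of a process in whole megabytes.
# Returns 1 if the process does not exist.
rss_mb() {
  kb="$(ps -o rss= -p "$1" | tr -d ' ')"
  [ -n "$kb" ] || return 1
  echo $(( kb / 1024 ))
}

# Return 0 if the given MB value is under the limit, 1 otherwise.
memory_ok() {
  [ "$1" -lt "$MEM_LIMIT_MB" ]
}
```

For example, `memory_ok "$(rss_mb 4321)" || echo "memory above limit"` prints a warning when PID 4321 exceeds the limit.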
Response Time Monitoring
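The bot's own command handling time has to be measured inside the application, but the network part of it can be approximated from the shell. The sketch below times a single HTTP request with `curl` and compares the result with the 1-second target. It measures round-trip time to a URL only, not the bot's processing time.

```bash
# Print total request time in seconds for a URL (e.g. 0.187).
http_latency() {
  curl -s -o /dev/null --max-time 10 -w '%{time_total}' "$1"
}

# Return 0 if a latency in seconds is below the threshold (default 1 second).
under_threshold() {
  awk -v t="$1" -v max="${2:-1}" 'BEGIN { exit !(t < max) }'
}
```

A reasonable target is the Telegram Bot API's `getMe` method, at `https://api.telegram.org/bot<token>/getMe` with the bot's own token substituted.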
Database Monitoring
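The helpers below sketch the three database metrics listed earlier: file size, integrity, and subscription count. They assume the bot stores its data in a SQLite file, which this guide does not state explicitly; the path `./data/bot.db` and the table name `subscriptions` are both hypothetical.

```bash
# Assumed SQLite file path; adjust to the bot's configured database location.
DB_PATH="${DB_PATH:-./data/bot.db}"

# Print the database file size in bytes.
db_size_bytes() {
  wc -c < "$DB_PATH" | tr -d ' '
}

# Run SQLite's built-in consistency check; prints "ok" for a healthy file.
db_integrity() {
  sqlite3 "$DB_PATH" 'PRAGMA integrity_check;'
}

# Count rows in a hypothetical "subscriptions" table.
db_subscription_count() {
  sqlite3 "$DB_PATH" 'SELECT COUNT(*) FROM subscriptions;'
}
```

Recording `db_size_bytes` once a day gives a simple growth trend to alert on.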
Regular Maintenance Tasks
Daily Tasks
1. Log Review
2. Performance Check
- Monitor command response times
- Check memory usage trends
- Verify monitoring is running
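Two of the daily checks lend themselves to small shell helpers. The sketch assumes the monitoring loop writes to a log file on every cycle, so a stale file suggests the loop has stopped; the path passed in is whatever log file the deployment uses.

```bash
# Return 0 if the file was modified within the last N minutes (default 10).
log_is_fresh() {
  [ -n "$(find "$1" -mmin "-${2:-10}" 2>/dev/null)" ]
}

# Print the last N lines (default 20) that mention "error", case-insensitive.
recent_errors() {
  grep -i 'error' "$1" | tail -n "${2:-20}"
}
```

The `-mmin` test is supported by GNU and BSD `find`, which covers most Linux and macOS hosts.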
Weekly Tasks
1. Database Optimization
2. Log Cleanup
3. Cache Cleanup
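The first two weekly tasks could be scripted roughly as follows, assuming a SQLite database and a directory of `*.log` files. Both paths are placeholders, and the 14-day retention period is an arbitrary example. Cache cleanup is not sketched because it depends on which cache backend the deployment uses.

```bash
DB_PATH="${DB_PATH:-./data/bot.db}"   # assumed SQLite file
LOG_DIR="${LOG_DIR:-./logs}"          # assumed log directory

# Reclaim free pages and refresh the query planner's statistics.
optimize_db() {
  sqlite3 "$DB_PATH" 'VACUUM; ANALYZE;'
}

# Delete *.log files older than N days (default 14), printing each path removed.
clean_old_logs() {
  find "$LOG_DIR" -type f -name '*.log' -mtime "+${1:-14}" -print -delete
}
```

`VACUUM` rewrites the whole database file, so it is best run while the bot is idle or stopped.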
Monthly Tasks
1. Security Review
- Review admin access list
- Check for unusual activity patterns
- Update dependencies
2. Performance Analysis
3. Backup Verification
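Dependency review and backup verification can be partly automated. The sketch below assumes an npm-managed project and a SQLite backup file; neither the backup location nor the backup procedure is specified in this guide, so the file path is supplied by the caller.

```bash
# Report outdated packages and known vulnerabilities. Both commands exit
# non-zero when they find something, so their status is ignored here.
dependency_report() {
  npm outdated || true
  npm audit || true
}

# Return 0 if a backup file exists, is non-empty, and passes SQLite's check.
verify_backup() {
  [ -s "$1" ] && [ "$(sqlite3 "$1" 'PRAGMA integrity_check;')" = "ok" ]
}
```

An integrity check shows only that the file is a readable database; restoring it to a scratch location and starting the bot against it is a stronger test.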
Automated Monitoring
Health Check Script
Create health-check.sh:
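What follows is one possible shape for the script, not the project's original. It assumes a PM2 deployment and that `pm2 pid <name>` prints the process ID, with an empty result or `0` for a process that is not running. The 90% disk threshold is an arbitrary example.

```bash
#!/usr/bin/env bash
# health-check.sh: sketch of a health check for a PM2-managed bot.
# Usage: ./health-check.sh <pm2-process-name>

MAX_DISK_PERCENT="${MAX_DISK_PERCENT:-90}"   # assumed alert threshold

# Return 0 if PM2 reports a live PID for the named process.
process_running() {
  pid="$(pm2 pid "$1" 2>/dev/null | tr -d '[:space:]')"
  [ -n "$pid" ] && [ "$pid" != "0" ]
}

# Print the usage percentage of the filesystem holding the current directory.
disk_percent() {
  df -P . | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Run all checks, print one line per failure, and return the failure count.
run_checks() {
  failures=0
  if ! process_running "$1"; then
    echo "FAIL: process '$1' is not running"
    failures=$((failures + 1))
  fi
  if [ "$(disk_percent)" -ge "$MAX_DISK_PERCENT" ]; then
    echo "FAIL: disk usage at or above ${MAX_DISK_PERCENT}%"
    failures=$((failures + 1))
  fi
  return "$failures"
}

# With no arguments the file only defines functions, so it can be sourced.
if [ "$#" -gt 0 ]; then
  run_checks "$@" && echo "OK: all checks passed"
fi
```

The script exits with the number of failed checks, so a scheduler or wrapper can treat any non-zero status as unhealthy.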
Cron Jobs
Add to crontab:
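The entries below are an illustrative schedule rather than the project's original: a health check every 5 minutes and a purge of old logs every Sunday at 03:00. The install directory `/opt/telegram-bot`, the process name passed to `health-check.sh`, and the 14-day retention are all assumptions.

```bash
# Assumed install directory; health-check.sh and logs/ are taken to live in it.
BOT_DIR="${BOT_DIR:-/opt/telegram-bot}"

# Print crontab lines: a health check every 5 minutes and a weekly log purge.
cron_entries() {
  cat <<EOF
*/5 * * * * $BOT_DIR/health-check.sh telegram-bot >> $BOT_DIR/logs/health.log 2>&1
0 3 * * 0 find $BOT_DIR/logs -name '*.log' -mtime +14 -delete
EOF
}
```

To install, append the output to the current crontab with `(crontab -l 2>/dev/null; cron_entries) | crontab -`, or paste the two lines in by hand via `crontab -e`.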
Monitoring Best Practices
1. Set Up Alerts
- High memory usage (>80%)
- Bot offline/crashed
- High error rate
- Database growth
2. Regular Reviews
- Daily: Check logs for errors
- Weekly: Review performance metrics
- Monthly: Analyze usage patterns
3. Proactive Maintenance
- Keep logs rotated
- Optimize database regularly
- Monitor disk space
- Update dependencies
4. Documentation
- Keep an incident log
- Document configuration changes
- Track performance baselines
- Note recurring issues
Troubleshooting Performance Issues
High Memory Usage
- Check for memory leaks:
- Reduce cache size:
- Restart periodically:
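Two of these steps can be sketched in shell. Sampling resident memory at fixed intervals gives a crude leak signal, and PM2 can enforce restarts through its `--max-memory-restart` and `--cron-restart` options. The script path and process name passed to `start_with_limits` are supplied by the caller, and the 04:00 schedule is an arbitrary example. Reducing cache size is a change to the bot's own configuration and is not shown.

```bash
# Print "<epoch-seconds> <rss-kb>" for a PID every INTERVAL seconds, COUNT times.
# A second column that keeps rising over hours of samples suggests a leak.
sample_rss() {
  pid="$1"; interval="${2:-60}"; count="${3:-10}"
  i=0
  while [ "$i" -lt "$count" ]; do
    rss="$(ps -o rss= -p "$pid" | tr -d ' ')"
    [ -n "$rss" ] || return 1
    echo "$(date +%s) $rss"
    i=$((i + 1))
    if [ "$i" -lt "$count" ]; then sleep "$interval"; fi
  done
}

# Start the bot under PM2 with a restart once memory passes 1 GB
# and a scheduled restart every day at 04:00.
start_with_limits() {
  pm2 start "$1" --name "$2" --max-memory-restart 1G --cron-restart "0 4 * * *"
}
```

Scheduled restarts hide a leak rather than fix it, so they work best as a stopgap while the samples are being analyzed.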
Slow Response Times
- Check RPC latency:
- Analyze database queries:
- Review monitoring intervals:
  - Increase intervals if checks run too frequently
  - Use batch processing
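The first two checks can be approximated from the shell. This guide does not name the blockchain, so the sketch assumes an Ethereum-style JSON-RPC endpoint and uses `eth_blockNumber` as a cheap probe; other chains need a different method name. The database helper assumes SQLite, and both the database path and the query are supplied by the caller.

```bash
# Print the round-trip time in seconds for one JSON-RPC call.
# Assumes an Ethereum-style endpoint; swap the method for other chains.
rpc_latency() {
  curl -s -o /dev/null --max-time 10 -w '%{time_total}' \
    -H 'Content-Type: application/json' \
    -d '{"jsonrpc":"2.0","id":1,"method":"eth_blockNumber","params":[]}' \
    "$1"
}

# Show how SQLite plans to execute a query. Rows that start with "SCAN"
# usually mean a full table scan that an index could avoid.
explain_query() {
  sqlite3 "$1" "EXPLAIN QUERY PLAN $2"
}
```

Running `rpc_latency` several times against each configured endpoint helps separate a slow provider from a slow bot.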