# Project N.O.M.A.D. - Monitoring Architecture

## Overview

The Nomad Homelab Edition includes built-in monitoring capabilities and integrates with standard homelab monitoring stacks.

## Monitoring Layers

```
┌─────────────────────────────────────────────────────────────┐
│                       Nomad Dashboard                       │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐        │
│  │   NAS    │ │  Server  │ │Container │ │ Network  │        │
│  │  Health  │ │ Metrics  │ │  Status  │ │ Devices  │        │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘        │
│       │            │            │            │              │
│  ┌────▼────────────▼────────────▼────────────▼─────┐        │
│  │             Nomad App (Aggregator)              │        │
│  └──────────────────────┬──────────────────────────┘        │
└─────────────────────────┼───────────────────────────────────┘
                          │
          ┌───────────────┼───────────────┐
          │               │               │
    ┌─────▼─────┐    ┌────▼────┐   ┌──────▼──────┐
    │  Agent 1  │    │ Agent 2 │   │   Agent N   │
    │ (Server)  │    │  (NAS)  │   │  (Remote)   │
    │   :9100   │    │  :9100  │   │    :9100    │
    └───────────┘    └─────────┘   └─────────────┘
```

## Built-in Monitoring

### System Resource Monitoring

The Nomad application provides built-in system information through the system controller:

- **CPU**: Model, core count, usage
- **Memory**: Total, used, free, usage percentage
- **Disk**: Mount points, usage, filesystem types
- **Docker**: Container status, image versions, resource usage

### Docker Container Monitoring

Nomad monitors its own container stack and any other Docker containers on the host via the Docker socket:

- Container health status
- Image versions and update availability
- Resource consumption
- Log access (via the Dozzle integration in the original install)

### Health Endpoints

| Endpoint | Service | Purpose |
|----------|---------|---------|
| `GET /api/health` | nomad-app | Application health |
| `GET /nginx-health` | nomad-nginx | Reverse proxy health |
| `GET /health` | nomad-agent | Agent health |

## Agent-Based Monitoring

### Architecture

The Nomad monitoring agent runs on remote homelab nodes and reports metrics in two ways:

1. **Push model**: The agent sends JSON metrics to `POST /api/agent/report` on the Nomad server
2. **Pull model**: Prometheus scrapes the agent's `/metrics` endpoint

### Agent Metrics

```
nomad_agent_cpu_usage_percent        - CPU utilization
nomad_agent_cpu_count                - CPU core count
nomad_agent_memory_total_bytes       - Total RAM
nomad_agent_memory_used_bytes        - Used RAM
nomad_agent_memory_usage_percent     - RAM utilization
nomad_agent_uptime_seconds           - System uptime
nomad_agent_docker_containers        - Docker container count
nomad_agent_docker_container_running - Per-container status
```

### Deployment

Deploy an agent on each node you want to monitor:

```bash
# Run the agent with read-only access to the Docker socket and host /proc and /sys
docker run -d --name nomad-agent \
  --restart unless-stopped \
  -p 9100:9100 \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  -e NOMAD_SERVER_URL=http://nomad-server:8080 \
  -e NODE_NAME="$(hostname)" \
  nomad-agent
```

## Prometheus Integration

### Full Monitoring Stack

For a complete monitoring setup, add Prometheus and Grafana to your Docker Compose stack:

```yaml
# docker-compose.monitoring.yml
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: nomad-prometheus
    restart: unless-stopped
    ports:
      - "9090:9090"
    volumes:
      - ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - prometheus-data:/prometheus
    networks:
      - nomad-internal  # must be defined elsewhere, e.g. in the main compose file

  grafana:
    image: grafana/grafana:latest
    container_name: nomad-grafana
    restart: unless-stopped
    ports:
      - "3000:3000"
    volumes:
      - grafana-data:/var/lib/grafana
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin  # placeholder; change before exposing Grafana
    networks:
      - nomad-internal

volumes:
  prometheus-data:
  grafana-data:
```

### Prometheus Configuration

```yaml
# monitoring/prometheus.yml
global:
  scrape_interval: 30s
  evaluation_interval: 30s

scrape_configs:
  - job_name: 'nomad-agents'
    static_configs:
      - targets:
          - 'nomad-agent:9100'  # Local agent
          - 'server2:9100'      # Remote server
          - 'nas:9100'          # NAS agent
```

### Node Exporter (Optional)

For deeper host-level metrics, add the Prometheus Node Exporter alongside the Nomad agent. The host port is `9101` because the agent already uses `9100`:

```yaml
  node-exporter:
    image: prom/node-exporter:latest
    container_name: nomad-node-exporter
    restart: unless-stopped
    ports:
      - "9101:9100"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
```

## Observability

### Structured Logging

The Nomad application uses structured JSON logging:

```json
{
  "level": "info",
  "timestamp": "2026-03-13T12:00:00.000Z",
  "msg": "HTTP request completed",
  "method": "GET",
  "url": "/api/health",
  "status": 200,
  "duration": "12ms"
}
```

Configure the log level via the `LOG_LEVEL` environment variable: `debug`, `info`, `warn`, or `error`.

### Log Access

```bash
# Application logs
docker compose logs -f nomad-app

# Worker logs
docker compose logs -f nomad-worker

# All service logs
docker compose logs -f

# Nginx access logs (on host)
tail -f "${NOMAD_DATA_DIR}/logs/nginx/access.log"
```

### Alerting Recommendations

For homelab alerting, integrate with:

| Tool | Use Case |
|------|----------|
| **Uptime Kuma** | Service uptime monitoring |
| **Grafana Alerting** | Metric-based alerts |
| **Ntfy** | Push notifications |
| **Gotify** | Self-hosted notifications |

Example Uptime Kuma monitor:

- **URL**: `http://nomad-app:8080/api/health`
- **Interval**: 60 seconds
- **Expected status**: 200
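
If Uptime Kuma is not available, the same check can be approximated with a short script. The following is a minimal sketch, not part of Nomad: `HealthResult`, `check_health`, and `monitor` are hypothetical names, and the only details taken from this document are the health URL, the 60-second interval, and the expected status of 200. It uses only the Python standard library.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class HealthResult:
    url: str
    healthy: bool
    status: Optional[int]  # None when no HTTP response was received
    detail: str


def check_health(url, expected_status=200, timeout=5.0,
                 opener=urllib.request.urlopen):
    """Issue one GET request and compare the status code with the expected one."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # urlopen raises on 4xx/5xx responses; the status code is still available.
        status = exc.code
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar errors.
        return HealthResult(url, False, None, f"unreachable: {exc}")
    healthy = status == expected_status
    detail = "ok" if healthy else f"expected {expected_status}, got {status}"
    return HealthResult(url, healthy, status, detail)


def monitor(url, interval=60.0, sleep=time.sleep,
            opener=urllib.request.urlopen):
    """Yield one HealthResult per interval until the caller stops iterating."""
    while True:
        yield check_health(url, opener=opener)
        sleep(interval)
```

Iterating over `monitor("http://nomad-app:8080/api/health")` and printing each result gives one check per minute until interrupted. The `opener` and `sleep` parameters exist only so the logic can be exercised without a network or real delays; forwarding an unhealthy result to Ntfy or Gotify is left out because their endpoints depend on the local setup.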