mirror of
https://github.com/Crosstalk-Solutions/project-nomad.git
synced 2026-05-12 16:10:11 +02:00
Closes #810.

## Bug A: HSA_OVERRIDE_GFX_VERSION=11.0.0 was unconditional

PR #804 set HSA_OVERRIDE_GFX_VERSION=11.0.0 for any AMD GPU. The inline comment claimed this was harmless on supported discrete cards (gfx1030 RX 6800, etc.) — empirically false. With the override, Ollama crashes during GPU discovery on gfx1030 and falls back to CPU silently. Affects every NOMAD user with an RX 6800 or other RDNA 2 discrete card.

The correct value depends on the gfx version:

- gfx1030, gfx1100, gfx1101, gfx1102: officially supported by ROCm — no override
- gfx1031..gfx1036 (RDNA 2 variants + iGPUs like Rembrandt 680M): 10.3.0
- gfx1103, gfx1150, gfx1151 (Phoenix 780M, Strix 890M, Strix Halo): 11.0.0

### Resolution chain in `_resolveAmdHsaOverride()`

1. KV `ai.amdHsaOverride` — manual override; accepts 'none' to disable, or a semver-style value to force.
2. Marker file `/app/storage/.nomad-amd-gfx` — written by install_nomad.sh based on lspci codename. Mapped to override via `_mapGfxToHsaOverride()`.
3. Default: `11.0.0` — preserves prior behavior so existing iGPU users (780M / 890M, the dominant AMD population today) don't regress on upgrade.

Discrete RDNA 2 users on existing installs can opt out via `ai.amdHsaOverride='none'` and force-reinstall AI Assistant, OR re-run install_nomad.sh to refresh the marker file.

The helper is used in both the `createContainer` (initial install) and `updateContainer` (image update) paths, replacing the unconditional push.

## Bug B: BenchmarkService had no AMD discrete detection path

`BenchmarkService.getHardwareInfo()` had three GPU detection fallbacks:

1. `si.graphics()` — empty inside Docker for AMD
2. nvidia-smi — NVIDIA only
3. AMD APU regex from CPU model — integrated only

Result: AMD discrete cards (RX 6800, RX 7900 XTX, etc.) showed up as "GPU: Not detected" on the leaderboard despite ROCm working, corrupting leaderboard data quality for that population.

Fix: after the existing fallbacks, call `SystemService.getSystemInfo()` and read `graphics.controllers[0].model`. That path already handles AMD via the marker file + Ollama log probe added in PR #804, so we're reusing existing plumbing rather than duplicating detection logic.

## install_nomad.sh changes

The existing AMD detection block already runs lspci. Added a codename parse step that maps Navi 21/22/23/24, Rembrandt, Phoenix1/Phoenix2, Strix/Strix Point/Strix Halo, and Navi 31/32/33 to gfx versions, then writes `/opt/project-nomad/storage/.nomad-amd-gfx`. Unknown codenames write nothing (the admin handles the missing-marker case via the backward-compat default).

## Validation

Both bugs were originally surfaced and validated empirically on RX 6800 / gfx1030 / Ubuntu 24.04 + kernel 6.17 + ollama/ollama:rocm during the #810 filing. Validation grid from that report:

| Run                                         | NOMAD Score | tok/s | GPU detected         |
|---------------------------------------------|-------------|-------|----------------------|
| Pre-fix (Bug A active)                      | n/a         | 0     | yes, but library=cpu |
| HSA_OVERRIDE removed, Bug B unfixed         | 73.8        | 221.6 | "Not detected"       |
| Both fixes hot-patched (this PR's behavior) | 73.7        | 216.0 | AMD Radeon RX 6800   |

Local checks: `npm run typecheck` clean, `npm run build` clean.
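For quick local sanity-checking, the Bug A mapping above can be sketched in shell. This is an illustrative stand-in for the admin's TypeScript `_mapGfxToHsaOverride()`; the bash function name and the empty-string convention for "no override" are assumptions of this sketch, not the shipped implementation:

```shell
#!/bin/bash
# Hypothetical shell sketch of the gfx -> HSA_OVERRIDE_GFX_VERSION policy.
# Empty output means "set no override at all" (card is on the ROCm allowlist).
map_gfx_to_hsa_override() {
  case "$1" in
    gfx1030|gfx1100|gfx1101|gfx1102)                 echo "" ;;       # official ROCm support: no override
    gfx1031|gfx1032|gfx1033|gfx1034|gfx1035|gfx1036) echo "10.3.0" ;; # RDNA 2 variants + iGPUs
    gfx1103|gfx1150|gfx1151)                         echo "11.0.0" ;; # Phoenix / Strix APUs
    *)                                               echo "11.0.0" ;; # unknown: backward-compat default
  esac
}

map_gfx_to_hsa_override gfx1030   # empty: RX 6800 needs no override
map_gfx_to_hsa_override gfx1035   # 10.3.0 (Rembrandt 680M)
```

Returning empty for the allowlisted cards mirrors the PR's point that forcing 11.0.0 on gfx1030 breaks discovery; the real resolution chain (KV, then marker file, then default) sits in front of this mapping.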
665 lines
31 KiB
Bash
#!/bin/bash

# Project N.O.M.A.D. Installation Script

###################################################################################################################################################################################################

# Script  | Project N.O.M.A.D. Installation Script
# Version | 1.0.0
# Author  | Crosstalk Solutions, LLC
# Website | https://crosstalksolutions.com

###################################################################################################################################################################################################
#                                                                                                                                                                                                 #
#                                                                                           Color Codes                                                                                           #
#                                                                                                                                                                                                 #
###################################################################################################################################################################################################

RESET='\033[0m'
YELLOW='\033[1;33m'
WHITE_R='\033[39m' # Same as GRAY_R for terminals with white background.
GRAY_R='\033[39m'
RED='\033[1;31m' # Light Red.
GREEN='\033[1;32m' # Light Green.

###################################################################################################################################################################################################
#                                                                                                                                                                                                 #
#                                                                                      Constants & Variables                                                                                      #
#                                                                                                                                                                                                 #
###################################################################################################################################################################################################

WHIPTAIL_TITLE="Project N.O.M.A.D Installation"
NOMAD_DIR="/opt/project-nomad"
MANAGEMENT_COMPOSE_FILE_URL="https://raw.githubusercontent.com/Crosstalk-Solutions/project-nomad/refs/heads/main/install/management_compose.yaml"
START_SCRIPT_URL="https://raw.githubusercontent.com/Crosstalk-Solutions/project-nomad/refs/heads/main/install/start_nomad.sh"
STOP_SCRIPT_URL="https://raw.githubusercontent.com/Crosstalk-Solutions/project-nomad/refs/heads/main/install/stop_nomad.sh"
UPDATE_SCRIPT_URL="https://raw.githubusercontent.com/Crosstalk-Solutions/project-nomad/refs/heads/main/install/update_nomad.sh"
script_option_debug='true'
accepted_terms='false'
local_ip_address=''

###################################################################################################################################################################################################
#                                                                                                                                                                                                 #
#                                                                                            Functions                                                                                            #
#                                                                                                                                                                                                 #
###################################################################################################################################################################################################

header() {
  if [[ "${script_option_debug}" != 'true' ]]; then clear; clear; fi
  echo -e "${GREEN}#########################################################################${RESET}\\n"
}

header_red() {
  if [[ "${script_option_debug}" != 'true' ]]; then clear; clear; fi
  echo -e "${RED}#########################################################################${RESET}\\n"
}

check_has_sudo() {
  if sudo -n true 2>/dev/null; then
    echo -e "${GREEN}#${RESET} User has sudo permissions.\\n"
  else
    # Print the banner first: header_red may clear the screen, which would
    # otherwise wipe the message below before the user can read it.
    header_red
    echo "User does not have sudo permissions"
    echo -e "${RED}#${RESET} This script requires sudo permissions to run. Please run the script with sudo.\\n"
    echo -e "${RED}#${RESET} For example: sudo bash $(basename "$0")"
    exit 1
  fi
}

check_is_bash() {
  # Use POSIX [ ] here (not [[ ]]) so this check still runs, rather than
  # erroring out, when the script is invoked via sh/dash.
  if [ -z "${BASH_VERSION:-}" ]; then
    header_red
    echo -e "${RED}#${RESET} This script requires bash to run. Please run the script using bash.\\n"
    echo -e "${RED}#${RESET} For example: bash $(basename "$0")"
    exit 1
  fi
  echo -e "${GREEN}#${RESET} This script is running in bash.\\n"
}

check_is_debian_based() {
  if [[ ! -f /etc/debian_version ]]; then
    header_red
    echo -e "${RED}#${RESET} This script is designed to run on Debian-based systems only.\\n"
    echo -e "${RED}#${RESET} Please run this script on a Debian-based system and try again."
    exit 1
  fi
  echo -e "${GREEN}#${RESET} This script is running on a Debian-based system.\\n"
}

check_is_x86_64() {
  local arch
  arch="$(uname -m)"
  if [[ "${arch}" != "x86_64" && "${arch}" != "amd64" ]]; then
    echo -e "${YELLOW}#${RESET} WARNING: Detected architecture '${arch}'. NOMAD officially supports x86_64 only.\\n"
    echo -e "${YELLOW}#${RESET} ARM64/aarch64 support is tracked in PR #419 and is not yet ready.\\n"
    echo -e "${YELLOW}#${RESET} Continuing on an unsupported architecture will likely fail and may leave\\n"
    echo -e "${YELLOW}#${RESET} partial Docker images and files behind that you'll need to clean up manually.\\n"
    echo -e "${YELLOW}#${RESET} Continuing in 10 seconds... press Ctrl+C now to abort.\\n"
    sleep 10
    return
  fi
  echo -e "${GREEN}#${RESET} Architecture check passed (${arch}).\\n"
}

ensure_dependencies_installed() {
  local missing_deps=()

  # Check for curl
  if ! command -v curl &> /dev/null; then
    missing_deps+=("curl")
  fi

  # Check for gpg (required for NVIDIA container toolkit keyring)
  if ! command -v gpg &> /dev/null; then
    missing_deps+=("gpg")
  fi

  # Check for whiptail (used for dialogs, though not currently active)
  # if ! command -v whiptail &> /dev/null; then
  #   missing_deps+=("whiptail")
  # fi

  if [[ ${#missing_deps[@]} -gt 0 ]]; then
    echo -e "${YELLOW}#${RESET} Installing required dependencies: ${missing_deps[*]}...\\n"
    sudo apt-get update
    sudo apt-get install -y "${missing_deps[@]}"

    # Verify installation
    for dep in "${missing_deps[@]}"; do
      if ! command -v "$dep" &> /dev/null; then
        echo -e "${RED}#${RESET} Failed to install $dep. Please install it manually and try again."
        exit 1
      fi
    done
    echo -e "${GREEN}#${RESET} Dependencies installed successfully.\\n"
  else
    echo -e "${GREEN}#${RESET} All required dependencies are already installed.\\n"
  fi
}

check_is_debug_mode(){
  # Check if the script is being run in debug mode
  if [[ "${script_option_debug}" == 'true' ]]; then
    echo -e "${YELLOW}#${RESET} Debug mode is enabled, the script will not clear the screen...\\n"
  else
    clear; clear
  fi
}

generateRandomPass() {
  local length="${1:-32}" # Default to 32
  local password

  # Generate random password using /dev/urandom
  password=$(tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length")

  echo "$password"
}

ensure_docker_installed() {
  if ! command -v docker &> /dev/null; then
    echo -e "${YELLOW}#${RESET} Docker not found. Installing Docker...\\n"

    # Update package database
    sudo apt-get update

    # Install prerequisites
    sudo apt-get install -y ca-certificates curl

    # Create directory for keyrings
    # sudo install -m 0755 -d /etc/apt/keyrings

    # # Download Docker's official GPG key
    # sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
    # sudo chmod a+r /etc/apt/keyrings/docker.asc

    # # Add the repository to Apt sources
    # echo \
    #   "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
    #   $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
    #   sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

    # # Update the package database with the Docker packages from the newly added repo
    # sudo apt-get update

    # # Install Docker packages
    # sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

    # Download the Docker convenience script
    curl -fsSL https://get.docker.com -o get-docker.sh

    # Run the Docker installation script
    sudo sh get-docker.sh

    # Check if Docker was installed successfully
    if ! command -v docker &> /dev/null; then
      echo -e "${RED}#${RESET} Docker installation failed. Please check the logs and try again."
      exit 1
    fi

    echo -e "${GREEN}#${RESET} Docker installation completed.\\n"
  else
    echo -e "${GREEN}#${RESET} Docker is already installed.\\n"

    # Check if Docker service is running
    if ! systemctl is-active --quiet docker; then
      echo -e "${YELLOW}#${RESET} Docker is installed but not running. Attempting to start Docker...\\n"
      sudo systemctl start docker
      if ! systemctl is-active --quiet docker; then
        echo -e "${RED}#${RESET} Failed to start Docker. Please check the Docker service status and try again."
        exit 1
      else
        echo -e "${GREEN}#${RESET} Docker service started successfully.\\n"
      fi
    else
      echo -e "${GREEN}#${RESET} Docker service is already running.\\n"
    fi
  fi
}

check_docker_compose() {
  # Check if 'docker compose' (v2 plugin) is available
  if ! docker compose version &>/dev/null; then
    echo -e "${RED}#${RESET} Docker Compose v2 is not installed or not available as a Docker plugin."
    echo -e "${YELLOW}#${RESET} This script requires 'docker compose' (v2), not 'docker-compose' (v1)."
    echo -e "${YELLOW}#${RESET} Please read the Docker documentation at https://docs.docker.com/compose/install/ for instructions on how to install Docker Compose v2."
    exit 1
  fi
}

setup_nvidia_container_toolkit() {
  # This function attempts to set up NVIDIA GPU support but is non-blocking.
  # Any failures will result in warnings but will NOT stop the installation process.

  echo -e "${YELLOW}#${RESET} Checking for NVIDIA GPU...\\n"

  # Safely detect NVIDIA GPU
  local has_nvidia_gpu=false
  if command -v lspci &> /dev/null; then
    if lspci 2>/dev/null | grep -i nvidia &> /dev/null; then
      has_nvidia_gpu=true
      echo -e "${GREEN}#${RESET} NVIDIA GPU detected.\\n"
    fi
  fi

  # Also check for nvidia-smi
  if ! $has_nvidia_gpu && command -v nvidia-smi &> /dev/null; then
    if nvidia-smi &> /dev/null; then
      has_nvidia_gpu=true
      echo -e "${GREEN}#${RESET} NVIDIA GPU detected via nvidia-smi.\\n"
    fi
  fi

  if ! $has_nvidia_gpu; then
    echo -e "${YELLOW}#${RESET} No NVIDIA GPU detected. Skipping NVIDIA container toolkit installation.\\n"
    return 0
  fi

  # Check if nvidia-container-toolkit is already installed
  if command -v nvidia-ctk &> /dev/null; then
    echo -e "${GREEN}#${RESET} NVIDIA container toolkit is already installed.\\n"
    return 0
  fi

  echo -e "${YELLOW}#${RESET} Installing NVIDIA container toolkit...\\n"

  # Install dependencies per https://docs.ollama.com/docker - wrapped in error handling
  if ! curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey 2>/dev/null | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg 2>/dev/null; then
    echo -e "${YELLOW}#${RESET} Warning: Failed to add NVIDIA container toolkit GPG key. Continuing anyway...\\n"
    return 0
  fi

  if ! curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list 2>/dev/null \
    | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
    | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list > /dev/null 2>&1; then
    echo -e "${YELLOW}#${RESET} Warning: Failed to add NVIDIA container toolkit repository. Continuing anyway...\\n"
    return 0
  fi

  if ! sudo apt-get update 2>/dev/null; then
    echo -e "${YELLOW}#${RESET} Warning: Failed to update package list. Continuing anyway...\\n"
    return 0
  fi

  if ! sudo apt-get install -y nvidia-container-toolkit 2>/dev/null; then
    echo -e "${YELLOW}#${RESET} Warning: Failed to install NVIDIA container toolkit. Continuing anyway...\\n"
    return 0
  fi

  echo -e "${GREEN}#${RESET} NVIDIA container toolkit installed successfully.\\n"

  # Configure Docker to use NVIDIA runtime
  echo -e "${YELLOW}#${RESET} Configuring Docker to use NVIDIA runtime...\\n"

  if ! sudo nvidia-ctk runtime configure --runtime=docker 2>/dev/null; then
    echo -e "${YELLOW}#${RESET} nvidia-ctk configure failed, attempting manual configuration...\\n"

    # Fallback: Manually configure daemon.json
    local daemon_json="/etc/docker/daemon.json"
    local config_success=false

    if [[ -f "$daemon_json" ]]; then
      # Backup existing config (best effort)
      sudo cp "$daemon_json" "${daemon_json}.backup" 2>/dev/null || true

      # Check if nvidia runtime already exists
      if ! grep -q '"nvidia"' "$daemon_json" 2>/dev/null; then
        # Add nvidia runtime to existing config using jq if available
        if command -v jq &> /dev/null; then
          if sudo jq '. + {"runtimes": {"nvidia": {"path": "nvidia-container-runtime", "runtimeArgs": []}}}' "$daemon_json" > /tmp/daemon.json.tmp 2>/dev/null; then
            if sudo mv /tmp/daemon.json.tmp "$daemon_json" 2>/dev/null; then
              config_success=true
            fi
          fi
          # Clean up temp file if move failed
          sudo rm -f /tmp/daemon.json.tmp 2>/dev/null || true
        else
          echo -e "${YELLOW}#${RESET} jq not available, skipping manual daemon.json configuration...\\n"
        fi
      else
        config_success=true # Already configured
      fi
    else
      # Create new daemon.json with nvidia runtime (best effort)
      if echo '{"runtimes":{"nvidia":{"path":"nvidia-container-runtime","runtimeArgs":[]}}}' | sudo tee "$daemon_json" > /dev/null 2>&1; then
        config_success=true
      fi
    fi

    if ! $config_success; then
      echo -e "${YELLOW}#${RESET} Manual daemon.json configuration unsuccessful. GPU support may require manual setup.\\n"
    fi
  fi

  # Restart Docker service
  echo -e "${YELLOW}#${RESET} Restarting Docker service...\\n"
  if ! sudo systemctl restart docker 2>/dev/null; then
    echo -e "${YELLOW}#${RESET} Warning: Failed to restart Docker service. You may need to restart it manually.\\n"
    return 0
  fi

  # Verify NVIDIA runtime is available
  echo -e "${YELLOW}#${RESET} Verifying NVIDIA runtime configuration...\\n"
  sleep 2 # Give Docker a moment to fully restart

  if docker info 2>/dev/null | grep -q "nvidia"; then
    echo -e "${GREEN}#${RESET} NVIDIA runtime successfully configured and verified.\\n"
  else
    echo -e "${YELLOW}#${RESET} Warning: NVIDIA runtime not detected in Docker info. GPU acceleration may not work.\\n"
    echo -e "${YELLOW}#${RESET} You may need to manually configure /etc/docker/daemon.json and restart Docker.\\n"
  fi

  echo -e "${GREEN}#${RESET} NVIDIA container toolkit configuration completed.\\n"
}

get_install_confirmation(){
  echo -e "${YELLOW}#${RESET} This script will install Project N.O.M.A.D. and its dependencies on your machine."
  echo -e "${YELLOW}#${RESET} If you already have Project N.O.M.A.D. installed with customized config or data, please be aware that running this installation script may overwrite existing files and configurations. It is highly recommended to back up any important data/configs before proceeding."
  read -rp "Are you sure you want to continue? (y/N): " choice
  case "$choice" in
    y|Y )
      echo -e "${GREEN}#${RESET} User chose to continue with the installation."
      ;;
    * )
      echo "User chose not to continue with the installation."
      exit 0
      ;;
  esac
}

accept_terms() {
  printf "\n\n"
  echo "License Agreement & Terms of Use"
  echo "__________________________"
  printf "\n\n"
  echo "Project N.O.M.A.D. is licensed under the Apache License 2.0. The full license can be found at https://www.apache.org/licenses/LICENSE-2.0 or in the LICENSE file of this repository."
  printf "\n"
  echo "By accepting this agreement, you acknowledge that you have read and understood the terms and conditions of the Apache License 2.0 and agree to be bound by them while using Project N.O.M.A.D."
  echo -e "\n\n"
  read -rp "I have read and accept the License Agreement & Terms of Use (y/N)? " choice
  case "$choice" in
    y|Y )
      accepted_terms='true'
      ;;
    * )
      echo "License Agreement & Terms of Use not accepted. Installation cannot continue."
      exit 1
      ;;
  esac
}

create_nomad_directory(){
  # Ensure the main installation directory exists
  if [[ ! -d "$NOMAD_DIR" ]]; then
    echo -e "${YELLOW}#${RESET} Creating directory for Project N.O.M.A.D at $NOMAD_DIR...\\n"
    sudo mkdir -p "$NOMAD_DIR"
    sudo chown "$(whoami):$(whoami)" "$NOMAD_DIR"

    echo -e "${GREEN}#${RESET} Directory created successfully.\\n"
  else
    echo -e "${GREEN}#${RESET} Directory $NOMAD_DIR already exists.\\n"
  fi

  # Also ensure the directory has a /storage/logs/ subdirectory
  sudo mkdir -p "${NOMAD_DIR}/storage/logs"

  # Create an admin.log file in the logs directory
  sudo touch "${NOMAD_DIR}/storage/logs/admin.log"
}

download_management_compose_file() {
  local compose_file_path="${NOMAD_DIR}/compose.yml"

  echo -e "${YELLOW}#${RESET} Downloading docker-compose file for management...\\n"
  if ! curl -fsSL "$MANAGEMENT_COMPOSE_FILE_URL" -o "$compose_file_path"; then
    echo -e "${RED}#${RESET} Failed to download the docker compose file. Please check the URL and try again."
    exit 1
  fi
  echo -e "${GREEN}#${RESET} Docker compose file downloaded successfully to $compose_file_path.\\n"

  # Declare and assign separately so a failed generateRandomPass is not
  # masked by the exit status of 'local'.
  local app_key db_root_password db_user_password
  app_key=$(generateRandomPass)
  db_root_password=$(generateRandomPass)
  db_user_password=$(generateRandomPass)

  # If MySQL data directory exists from a previous install attempt, remove it.
  # MySQL only initializes credentials on first startup when the data dir is empty.
  # If stale data exists, MySQL ignores the new passwords above and uses the old ones,
  # causing "Access denied" errors when the admin container tries to connect.
  if [[ -d "${NOMAD_DIR}/mysql" ]]; then
    echo -e "${YELLOW}#${RESET} Removing existing MySQL data directory to ensure credentials match...\\n"
    sudo rm -rf "${NOMAD_DIR}/mysql"
  fi

  # Inject dynamic env values into the compose file
  echo -e "${YELLOW}#${RESET} Configuring docker-compose file env variables...\\n"
  sed -i "s|URL=replaceme|URL=http://${local_ip_address}:8080|g" "$compose_file_path"
  sed -i "s|APP_KEY=replaceme|APP_KEY=${app_key}|g" "$compose_file_path"

  sed -i "s|DB_PASSWORD=replaceme|DB_PASSWORD=${db_user_password}|g" "$compose_file_path"
  sed -i "s|MYSQL_ROOT_PASSWORD=replaceme|MYSQL_ROOT_PASSWORD=${db_root_password}|g" "$compose_file_path"
  sed -i "s|MYSQL_PASSWORD=replaceme|MYSQL_PASSWORD=${db_user_password}|g" "$compose_file_path"

  echo -e "${GREEN}#${RESET} Docker compose file configured successfully.\\n"
}

download_helper_scripts() {
  local start_script_path="${NOMAD_DIR}/start_nomad.sh"
  local stop_script_path="${NOMAD_DIR}/stop_nomad.sh"
  local update_script_path="${NOMAD_DIR}/update_nomad.sh"

  echo -e "${YELLOW}#${RESET} Downloading helper scripts...\\n"
  if ! curl -fsSL "$START_SCRIPT_URL" -o "$start_script_path"; then
    echo -e "${RED}#${RESET} Failed to download the start script. Please check the URL and try again."
    exit 1
  fi
  chmod +x "$start_script_path"

  if ! curl -fsSL "$STOP_SCRIPT_URL" -o "$stop_script_path"; then
    echo -e "${RED}#${RESET} Failed to download the stop script. Please check the URL and try again."
    exit 1
  fi
  chmod +x "$stop_script_path"

  if ! curl -fsSL "$UPDATE_SCRIPT_URL" -o "$update_script_path"; then
    echo -e "${RED}#${RESET} Failed to download the update script. Please check the URL and try again."
    exit 1
  fi
  chmod +x "$update_script_path"

  echo -e "${GREEN}#${RESET} Helper scripts downloaded successfully to $start_script_path, $stop_script_path, and $update_script_path.\\n"
}

start_management_containers() {
  echo -e "${YELLOW}#${RESET} Starting management containers using docker compose...\\n"
  if ! sudo docker compose -p project-nomad -f "${NOMAD_DIR}/compose.yml" up -d; then
    echo -e "${RED}#${RESET} Failed to start management containers. Please check the logs and try again."
    exit 1
  fi
  echo -e "${GREEN}#${RESET} Management containers started successfully.\\n"
}

get_local_ip() {
  local_ip_address=$(hostname -I | awk '{print $1}')
  if [[ -z "$local_ip_address" ]]; then
    echo -e "${RED}#${RESET} Unable to determine local IP address. Please check your network configuration."
    exit 1
  fi
}

verify_gpu_setup() {
  # This function only displays GPU setup status and is completely non-blocking.
  # It never exits or returns error codes - purely informational.

  echo -e "\\n${YELLOW}#${RESET} GPU Setup Verification\\n"
  echo -e "${YELLOW}===========================================${RESET}\\n"

  # Check if NVIDIA GPU is present
  if command -v nvidia-smi &> /dev/null; then
    echo -e "${GREEN}✓${RESET} NVIDIA GPU detected:"
    nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null | while read -r line; do
      echo -e "  ${WHITE_R}$line${RESET}"
    done
    echo ""
  else
    echo -e "${YELLOW}○${RESET} No NVIDIA GPU detected (nvidia-smi not available)\\n"
  fi

  # Check if NVIDIA Container Toolkit is installed
  if command -v nvidia-ctk &> /dev/null; then
    echo -e "${GREEN}✓${RESET} NVIDIA Container Toolkit installed: $(nvidia-ctk --version 2>/dev/null | head -n1)\\n"
  else
    echo -e "${YELLOW}○${RESET} NVIDIA Container Toolkit not installed\\n"
  fi

  # Check if Docker has NVIDIA runtime
  if docker info 2>/dev/null | grep -q "nvidia"; then
    echo -e "${GREEN}✓${RESET} Docker NVIDIA runtime configured\\n"
  else
    echo -e "${YELLOW}○${RESET} Docker NVIDIA runtime not detected\\n"
  fi

  # Check for AMD GPU — restrict to display controller classes to avoid false positives
  # from AMD CPU host bridges, PCI bridges, and chipset devices.
  local has_amd_gpu='false'
  local amd_gfx_version=''
  if command -v lspci &> /dev/null; then
    if lspci 2>/dev/null | grep -iE "VGA|3D controller|Display" | grep -iE "amd|radeon" &> /dev/null; then
      has_amd_gpu='true'
      echo -e "${GREEN}✓${RESET} AMD GPU detected — ROCm acceleration will be configured automatically when AI Assistant is installed.\\n"

      # Map AMD codename → gfx version so the admin can pick the right HSA_OVERRIDE_GFX_VERSION.
      # gfx1030/1100/1101/1102 are on AMD's official ROCm allowlist and need NO override —
      # forcing one (e.g. 11.0.0) breaks GPU discovery on these. Other variants do need it.
      local amd_devices
      amd_devices=$(lspci -vmm 2>/dev/null | awk -F'\t' '/^Class:.*(VGA|3D|Display)/{c=1} c && /^Device:/{print $2; c=0}')
      if echo "${amd_devices}" | grep -iq 'Navi 21'; then
        amd_gfx_version='gfx1030'
      elif echo "${amd_devices}" | grep -iq 'Navi 22'; then
        amd_gfx_version='gfx1031'
      elif echo "${amd_devices}" | grep -iq 'Navi 23'; then
        amd_gfx_version='gfx1032'
      elif echo "${amd_devices}" | grep -iq 'Navi 24'; then
        amd_gfx_version='gfx1034'
      elif echo "${amd_devices}" | grep -iq 'Rembrandt'; then
        amd_gfx_version='gfx1035'
      elif echo "${amd_devices}" | grep -iEq 'Phoenix1?|Phoenix2'; then
        amd_gfx_version='gfx1103'
      elif echo "${amd_devices}" | grep -iEq 'Strix Halo'; then
        amd_gfx_version='gfx1151'
      elif echo "${amd_devices}" | grep -iEq 'Strix( Point)?'; then
        amd_gfx_version='gfx1150'
      elif echo "${amd_devices}" | grep -iq 'Navi 31'; then
        amd_gfx_version='gfx1100'
      elif echo "${amd_devices}" | grep -iq 'Navi 32'; then
        amd_gfx_version='gfx1101'
      elif echo "${amd_devices}" | grep -iq 'Navi 33'; then
        amd_gfx_version='gfx1102'
      fi
    fi
  fi

  # Write detected GPU type to a marker file the admin container can read. The admin
  # container lacks lspci and AMD GPUs don't register a Docker runtime, so this is the
  # only reliable way for the admin to know an AMD GPU is present at install time.
  local gpu_marker_path="${NOMAD_DIR}/storage/.nomad-gpu-type"
  if command -v nvidia-smi &> /dev/null; then
    echo 'nvidia' | sudo tee "${gpu_marker_path}" > /dev/null 2>&1 || true
  elif [[ "${has_amd_gpu}" == 'true' ]]; then
    echo 'amd' | sudo tee "${gpu_marker_path}" > /dev/null 2>&1 || true
  else
    sudo rm -f "${gpu_marker_path}" 2>/dev/null || true
  fi

  # Companion marker used by the admin to pick the right HSA_OVERRIDE_GFX_VERSION for
  # the detected card. Absence of this file means "unknown gfx" — the admin falls back
  # to its built-in default. Always rewrite (or remove) on install to keep state fresh.
  local amd_gfx_marker_path="${NOMAD_DIR}/storage/.nomad-amd-gfx"
  if [[ -n "${amd_gfx_version}" ]]; then
    echo "${amd_gfx_version}" | sudo tee "${amd_gfx_marker_path}" > /dev/null 2>&1 || true
  else
    sudo rm -f "${amd_gfx_marker_path}" 2>/dev/null || true
  fi

  echo -e "${YELLOW}===========================================${RESET}\\n"

  # Summary
  if command -v nvidia-smi &> /dev/null && docker info 2>/dev/null | grep -q "nvidia"; then
    echo -e "${GREEN}#${RESET} GPU acceleration is properly configured! The AI Assistant will use your GPU.\\n"
  elif [[ "${has_amd_gpu}" == 'true' ]]; then
    echo -e "${GREEN}#${RESET} GPU acceleration will be enabled (AMD/ROCm) when AI Assistant is installed from the dashboard.\\n"
  else
    echo -e "${YELLOW}#${RESET} GPU acceleration not detected. The AI Assistant will run in CPU-only mode.\\n"
    if command -v nvidia-smi &> /dev/null && ! docker info 2>/dev/null | grep -q "nvidia"; then
      echo -e "${YELLOW}#${RESET} Tip: Your GPU is detected but Docker runtime is not configured.\\n"
      echo -e "${YELLOW}#${RESET} Try restarting Docker: ${WHITE_R}sudo systemctl restart docker${RESET}\\n"
    fi
  fi
}

success_message() {
  echo -e "${GREEN}#${RESET} Project N.O.M.A.D installation completed successfully!\\n"
  echo -e "${GREEN}#${RESET} Installation files are located at ${NOMAD_DIR}\\n\n"
  echo -e "${GREEN}#${RESET} Project N.O.M.A.D's Command Center should automatically start whenever your device reboots. However, if you need to start it manually, you can always do so by running: ${WHITE_R}${NOMAD_DIR}/start_nomad.sh${RESET}\\n"
  echo -e "${GREEN}#${RESET} You can now access the management interface at http://localhost:8080 or http://${local_ip_address}:8080\\n"
  echo -e "${GREEN}#${RESET} Thank you for supporting Project N.O.M.A.D!\\n"
}

###################################################################################################################################################################################################
#                                                                                                                                                                                                 #
#                                                                                           Main Script                                                                                           #
#                                                                                                                                                                                                 #
###################################################################################################################################################################################################

# Pre-flight checks (bash check runs first: the other checks rely on bash-only syntax)
check_is_bash
check_is_debian_based
check_is_x86_64
check_has_sudo
ensure_dependencies_installed
check_is_debug_mode

# Main install
get_install_confirmation
accept_terms
ensure_docker_installed
check_docker_compose
setup_nvidia_container_toolkit
get_local_ip
create_nomad_directory
download_helper_scripts
download_management_compose_file
start_management_containers
verify_gpu_setup
success_message

# free_space_check() {
#   if [[ "$(df -B1 / | awk 'NR==2{print $4}')" -le '5368709120' ]]; then
#     header_red
#     echo -e "${YELLOW}#${RESET} You only have $(df -B1 / | awk 'NR==2{print $4}' | awk '{ split( "B KB MB GB TB PB EB ZB YB" , v ); s=1; while( $1>1024 && s<9 ){ $1/=1024; s++ } printf "%.1f %s", $1, v[s] }') of disk space available on \"/\"... \\n"
#     while true; do
#       read -rp $'\033[39m#\033[0m Do you want to proceed with running the script? (y/N) ' yes_no
#       case "$yes_no" in
#         [Nn]*|"")
#           free_space_check_response="Cancel script"
#           free_space_check_date="$(date +%s)"
#           echo -e "${YELLOW}#${RESET} OK... Please free up disk space before running the script again..."
#           cancel_script
#           break;;
#         [Yy]*)
#           free_space_check_response="Proceed at own risk"
#           free_space_check_date="$(date +%s)"
#           echo -e "${YELLOW}#${RESET} OK... Proceeding with the script.. please note that failures may occur due to not enough disk space... \\n"; sleep 10
#           break;;
#         *) echo -e "\\n${RED}#${RESET} Invalid input, please answer Yes or No (y/n)...\\n"; sleep 3;;
#       esac
#     done
#     if [[ -n "$(command -v jq)" ]]; then
#       if [[ "$(dpkg-query --showformat='${version}' --show jq 2> /dev/null | sed -e 's/.*://' -e 's/-.*//g' -e 's/[^0-9.]//g' -e 's/\.//g' | sort -V | tail -n1)" -ge "16" && -e "${eus_dir}/db/db.json" ]]; then
#         jq '.scripts."'"${script_name}"'" += {"warnings": {"low-free-disk-space": {"response": "'"${free_space_check_response}"'", "detected-date": "'"${free_space_check_date}"'"}}}' "${eus_dir}/db/db.json" > "${eus_dir}/db/db.json.tmp" 2>> "${eus_dir}/logs/eus-database-management.log"
#       else
#         jq '.scripts."'"${script_name}"'" = (.scripts."'"${script_name}"'" | . + {"warnings": {"low-free-disk-space": {"response": "'"${free_space_check_response}"'", "detected-date": "'"${free_space_check_date}"'"}}})' "${eus_dir}/db/db.json" > "${eus_dir}/db/db.json.tmp" 2>> "${eus_dir}/logs/eus-database-management.log"
#       fi
#       eus_database_move
#     fi
#   fi
# }