Skip to end of metadata
Go to start of metadata

You are viewing an old version of this content. View the current version.

Compare with Current View Version History

« Previous Version 34 Next »

Managing Log Levels in Swarm Cluster Using a Shell Script

The Swarm Log Level Management Script (castor-change-log-level.sh) is designed to manage and dynamically adjust log levels on a DataCore Swarm cluster, providing options to change the log level temporarily and automatically revert it after a specified duration. It supports background execution via screen or tmux, making it ideal for long-running operations that require detachment from the terminal.

Script Features

  • Set and Revert Log Levels: Temporarily change the log level and revert after a specified duration.

  • Flexible JSON Parsing: Uses jq for JSON parsing if available; defaults to grep otherwise.

  • Detachable Session Execution: Runs in a detachable session using screen or tmux, allowing the script to continue running independently of the terminal session.

  • Log Size Monitoring: Reports the log size generated during the temporary log level change.

  • Countdown Display: Shows a countdown for the specified duration.

Requirements

  • jq (optional): Used for parsing JSON responses; falls back to grep if unavailable.

  • screen or tmux (optional): Required for background execution.

  • Permissions: Ensure sufficient permissions to execute on the DataCore Swarm server and access required files.

Script Usage

./castor-change-log-level.sh -d <node_ip> -p <admin:password> -i <new_log_level> [-t <duration_in_seconds>] [--detach] [-v]

Parameter

Description

-d, --swarm_ip

IP address of the Swarm API endpoint (or set SCSP_HOST environment variable).

-p, --credentials

Admin credentials in the format admin:password.

-i, --log.level

New log level to set (values: 10, 15, 20, 30, 40, 50).

-t, --time

Duration in seconds to keep the new log level (optional).

--detach

Runs the script in a detachable session using screen or tmux, allowing continued operation if terminal session ends.

-v, --verbose

Enables verbose mode to display debug information.

Instruction for Use

Example 1: Set log level to 20 and keep it for 10 seconds

./castor-change-log-level.sh -d 192.168.8.84 -p admin:datacore -i 20 -t 10

Example 2: Set node(s) log level to debug (10) and keep it for 10 seconds

./castor-change-log-level.sh -d 192.168.8.84,192.168.8.86 -p admin:datacore -i debug -t 10

Example 3: Run in background mode with verbose logging

./castor-change-log-level.sh -d 192.168.8.84 -p admin:datacore -i 20 -t 30 --detach -v

Example 4: Running the Script at a Specific Time

The at command can used to schedule the scrip to run at a later time. This is useful when you need to start collecting debug logs at a specific hour.

Example 5: Schedule the script to run at 3:00 AM and collect debug logs for 1 hour

  1. Ensure the at service is installed, enabled and running

    dnf -y install at atd
    systemctl enable --now at
  2. Schedule the script execution using at:

    echo "/root/dist/castor-change-log-level.sh -i 10 -t 3600" | at 03:00 AM 2/16/2025

This schedules the script to run at 3:00 AM on Feb 16 2025 and collects log at debug level for 1 hour.

  1. To verify scheduled job:

    atq
    7	Sun Feb 16 03:00:00 2025 a root
  2. To remove a scheduled job (replace JOB_ID with the actual job number from atq output):

    atrm 7

Behavior

  1. Log Level Change: Sets the log level to the specified value. If the current log level matches the requested level, the script skips the update.

  2. Countdown: During the specified duration, the script displays a countdown every second.

  3. Revert Log Level: After the countdown, the log level reverts to the initial value.

  4. Log Size Report: Provides approximately log size generated during the temporary log level change.

  5. Debug Mode: When -v is specified, debug messages display the script's internal operations.

Output Messages

Message

Description

Swarm IP:

Displays the specified Swarm IP address.

Credentials:

Credentials are masked for security.

Cluster Name:

Displays the cluster name retrieved from the Swarm API.

New log level:

Shows the new log level requested.

Current log level:

Displays the current log level.

Updating log level to X...

Indicates the beginning of the log level update process.

Log level changed successfully...

Confirms that the log level was successfully updated.

Keeping log level at X for Y...

Shows the temporary period for which the new log level is retained, with a countdown.

Time's up! Reverting log level...

Indicates that the temporary period has ended and the script is reverting the log level.

Approximate X new logs generated...

Provides information on the amount of logging activity generated during the temporary log level.

Example Output

[root@scs dist]# ./castor-change-log-level.sh -p admin:datacore -i 10 -t 300
Swarm IP: 192.168.1.84
Credentials: [hidden for security]
Cluster Name: gatewayadmindomain

New log level: 10
Current log level is 30.
Updating log level to 10...
Log level changed successfully from 30 → 10.
Keeping log level at 10 for 300 second(s)...
Countdown: 00:00:01 remaining...

Time's up! Reverting log level back to 30...
Approximate 69.4MB new logs were generated at log level 10. Current castor.log size is 371.3MB after 00:05:00.
Log level reverted successfully back to 30.

[root@scs dist]#

Error Handling

  • Missing Parameters: Missing parameters prompt a usage message.

  • Invalid Duration: If a non-numeric duration is provided, you’re prompted to enter a valid duration in seconds.

  • Connection Issues: If unable to connect to the Swarm API, check the IP, credentials, and network access.

Notes

  • Credentials are masked in the output for security.

  • Log file sizes are shown in human-readable format (GB, MB, KB, B).

This script provides administrators with an effective way to adjust and monitor Swarm logging, supporting both temporary and permanent log level changes for troubleshooting and performance monitoring.

Script Source Code

Latest version: castor-change-log-level.sh

#!/bin/bash
# -----------------------------------------------------------------------------------------------------------------------------
# Script: castor-change-log-level.sh
# -----------------------------------------------------------------------------------------------------------------------------
# Description:
# This script changes the log level for the Castor cluster or node(s) using the Swarm API.
# The script supports changing the log level for the entire cluster or individual node(s).
# The script can run in a detachable session using 'screen' or 'tmux'.
# -----------------------------------------------------------------------------------------------------------------------------
# Written by Milton Suen (milton.suen@datacore.com) Oct 31, 2024
# Revision History:
# v1.0.0 - Update to support running the script in a detachable session using screen or tmux.
# v1.1.0 - 2025-02-20 Add support node(s) level log level change.
# v1.2.0 - 2025-02-26 SUPSCR-208: Auto detect CSN or SCS to adjust castor.log file path.
# v1.2.1 - 2025-02-26 Address the issue of the script not display correct when the castor.log file is rotated.
# v1.2.2 - 2025-02-26 SUPSCR-209: Enforced proper credential formatting: credentials must be in the username:password format.
# -----------------------------------------------------------------------------------------------------------------------------
# Current Version: 1.2.2
# -----------------------------------------------------------------------------------------------------------------------------
# KB: https://perifery.atlassian.net/wiki/spaces/public/pages/3872161835/Setting+and+Managing+Swarm+Log+Levels+with+script
# -----------------------------------------------------------------------------------------------------------------------------

# Define colors
RED='\033[0;31m'
BOLD_RED='\033[1;31m'
GREEN='\033[0;32m'
BOLD_GREEN='\033[1;32m'
UNDERLINE_BOLD_GREEN='\033[4;32m'
YELLOW='\033[0;33m'
BOLD_YELLOW='\033[1;33m'
BLUE='\033[0;34m'
BOLD_BLUE='\033[1;34m'
MAGENTA='\033[0;35m'
BOLD_MAGENTA='\033[1;35m'
CYAN='\033[0;36m'
BOLD_CYAN='\033[1;36m'
RESET='\033[0m' # Reset color to default

# Function to display usage information
usage() {
    echo "Usage: ./castor-change-log-level10.sh -d swarm_ip -p admin:password [-i new_log_level | -L new_node_log_level] [-t duration_in_seconds] [-D]"
    echo "  -d, --swarm_ip           IP address of the Swarm API endpoint. Supports single or multiple IPs separated by \",\", \";\" or \" \"."
    echo "                           If multiple IPs are provided, the script will update the log level for all nodes."
    echo "                           (Alternatively, set the SCSP_HOST environment variable to the Swarm IP.)"
    echo "  -p, --credentials        Credentials in the format admin:password"
    echo "  -i, --log.level          New cluster log level to set (5, 10, 15, 20, 30, 40, 50, chatter, debug, announce, info, error, critical, default)"
    echo "  -L, --node.log.level     New node log level to set (0, 5, 10, 15, 20, 30, 40, 50, chatter, debug, announce, info, error, critical, default)"
    echo "                           **Either -i or -L must be specified, but not both.**"
    echo "  -t, --time               (Optional) Duration in seconds to keep the new log level (must be greater than 0)"
    echo "  -D, --detach             (Optional) Detach the script from the current terminal and run in a detachable session using screen or tmux"
    exit 1
}

# Default options
detachable=false
debug=false
output_log="script_output.log"						# Log file for capturing detachable session output
log_level_type="cluster"									# Default log level type
default_log_level=30											# Default log level
log_file="/var/log/datacore/castor.log"		# Default log file location

# Global associative array for log levels
declare -A log_levels=(
    [5]="chatter"
    [10]="debug"
    [15]="audit"
    [20]="info"
    [30]="warning"
    [40]="error"
    [45]="defect"
    [50]="critical"
    [60]="announce"
    [30]="default"
)

# Global associative array for log level names
declare -A log_level_names=(
    ["chatter"]=5
    ["debug"]=10
    ["audit"]=15
    ["info"]=20
    ["warning"]=30
    ["error"]=40
    ["defect"]=45
    ["critical"]=50
    ["announce"]=60
    ["default"]=30
)

# Function to display debug messages if debug mode is enabled
debug_msg() {
    if $debug; then
        echo -e "[DEBUG] $1" >> "$output_log"
    fi
}

# Function to check if either 'screen' or 'tmux' is installed
check_screen_or_tmux() {
    if ! command -v screen &>/dev/null && ! command -v tmux &>/dev/null; then
        echo "Error: Neither 'screen' nor 'tmux' is installed. Cannot run in detachable mode."
        detachable=false  # Disable detachable session
        debug_msg "detachable=${YELLOW}false${RESET}"
    fi
}

# Function to format file size
format_size() {
    local size=$1
    if (( size >= 1073741824 )); then
        echo "$(awk "BEGIN {printf \"%.1fGB\", $size/1073741824}")"
    elif (( size >= 1048576 )); then
        echo "$(awk "BEGIN {printf \"%.1fMB\", $size/1048576}")"
    elif (( size >= 1024 )); then
        echo "$(awk "BEGIN {printf \"%.1fKB\", $size/1024}")"
    else
        echo "${size}B"
    fi
}

# Function to format duration
format_duration() {
    local duration=$1
    local hours=$((duration / 3600))
    local minutes=$(( (duration % 3600) / 60 ))
    local seconds=$((duration % 60))
    printf "%02d:%02d:%02d" $hours $minutes $seconds
}

# Function to check if jq is available and set up JSON parsing method
check_jq() {
    if [[ -x "/usr/local/bin/jq" ]]; then
        echo "/usr/local/bin/jq"
    elif [[ -x "$(pwd)/jq" ]]; then
        echo "$(pwd)/jq"
    elif command -v jq &>/dev/null; then
        echo "jq"
    else
        echo "grep"
    fi
}

jq_or_grep=$(check_jq)

# Function to determine the log file path
determine_log_file() {
    if [[ -f "/var/log/datacore/castor.log" ]]; then
        echo "/var/log/datacore/castor.log"
    elif [[ -f "/var/log/caringo/castor.log" ]]; then
        echo "/var/log/caringo/castor.log"
    else
        echo "Error: Log file not found in /var/log/datacore/castor.log or /var/log/caringo/castor.log"
        exit 1
    fi
}

log_file=$(determine_log_file)

# Parse input arguments
while [[ "$#" -gt 0 ]]; do
    case $1 in
        -d|--swarm_ip) swarm_ip="$2"; debug_msg "Set swarm_ip to $swarm_ip"; shift 2 ;;
        -p|--credentials)
            credentials="$2"
            if ! [[ "$credentials" =~ ^[^:]+:[^:]+$ ]]; then
                echo -e ""
                echo -e "Error: Credentials must be in the format ${BOLD_GREEN}username:password${RESET}"
                exit 1
            fi
            if [[ "$credentials" =~ [^a-zA-Z0-9:] ]]; then
                echo -e ""
                echo -e "Warning: Password contains special characters. Please enclose the credentials with single quotes (${BOLD_YELLOW}'${BOLD_GREEN}username:password${RESET}${BOLD_YELLOW}'${RESET})."
                exit 1
            fi
            debug_msg "Set credentials"; shift 2 ;;
        -i|--log.level)
            if [[ -n "$new_log_level" ]]; then
                echo "Error: Options -i (cluster log leve) and -L (node log level) cannot be used together."
                usage
            fi
            if [[ ${log_level_names[$2]} ]]; then
                new_log_level=${log_level_names[$2]}
            elif [[ ${log_levels[$2]} ]]; then
                new_log_level=$2
            else
                echo "Invalid log level: $2"
                exit 1
            fi
            new_log_level_name=${log_levels[$new_log_level]}
            log_level_type="cluster"
            default_log_level=30
            debug_msg "Set new_log_level to $new_log_level ($new_log_level_name)"
            shift 2
            ;;
        -L|--node.log.level)
            if [[ -n "$new_log_level" ]]; then
                echo "Error: Options -i (cluster log leve) and -L (node log level) cannot be used together."
                usage
            fi
            if [[ ${log_level_names[$2]} ]]; then
                new_log_level=${log_level_names[$2]}
            elif [[ ${log_levels[$2]} || $2 -eq 0 ]]; then
                new_log_level=$2
                new_log_level_name="default"
            else
                echo "Invalid log level: $2"
                exit 1
            fi
            log_level_type="node"
            default_log_level=0
            debug_msg "Set new_log_level to $new_log_level ($new_log_level_name)"
            shift 2
            ;;
        -t|--time)
            if [[ -n "$2" && "$2" != -* && "$2" -gt 0 ]]; then
                duration="$2"
                debug_msg "Set duration to $duration"
                shift 2
            else
                echo "Error: Duration must be a number greater than 0."
                exit 1
            fi
            ;;
        -D|--detach) detachable=true; debug_msg "Set detachable to true"; shift ;;
        --debug) debug=true; debug_msg "Set debug to true"; shift ;;
        *) usage ;;
    esac
done

debug_msg "Set log_level_type to ${YELLOW}$log_level_type${RESET}"

# Set default values if not provided
if [[ -z "$new_log_level" ]]; then
    if [[ "$log_level_type" == "cluster" ]]; then
        new_log_level=$default_log_level
    else
        new_log_level=0
    fi
    new_log_level_name="default"
    new_log_level_name=${log_levels[$new_log_level]}
    debug_msg "Set default new_log_level to $new_log_level ($new_log_level_name)"
fi

# Check if 'screen' or 'tmux' is installed
check_screen_or_tmux

# If swarm_ip is not provided, try using SCSP_HOST environment variable
if [[ -z "$swarm_ip" ]]; then
    if [[ -n "$SCSP_HOST" ]]; then
        swarm_ip="$SCSP_HOST"
        debug "Using Swarm IP from SCSP_HOST: $swarm_ip"
    else
        echo "Error: swarm_ip not provided and SCSP_HOST is not set."
        usage
    fi
fi

# Check if required arguments are provided
if [[ -z "$credentials" || -z "$new_log_level" ]]; then
    usage
fi

# Split the swarm_ip into an array of IP addresses if it contains delimiters
IFS=';, ' read -r -a ip_array <<< "$swarm_ip"

# Retrieve cluster name and handle JSON parsing using the first IP address
debug_msg "Retrieving the cluster name from Swarm API using IP: ${ip_array[0]}"
if [[ "$jq_or_grep" == "grep" ]]; then
    clusterName=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters" | grep -oP '"name":\s*"\K[^"]+')
else
    clusterName=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters" | "$jq_or_grep" -r '._embedded.clusters[0].name')
fi

if [[ -z "$clusterName" ]]; then
    echo "Failed to retrieve the cluster name. Please check your inputs."
    exit 1
fi
debug_msg "Cluster Name: $clusterName"

# Main logic function to run the script tasks
main_script() {
    local swarm_ip="$1"
    local credentials="$2"
    local new_log_level="$3"
    local new_log_level_name="$4"
    local duration="$5"
    local log_level_type="$6"
    # local log_file="/var/log/datacore/castor.log"
    local initial_size=$(stat -c%s "$log_file" 2>/dev/null || echo 0)
    local current_log_level
    local clusterName="$7"
    local jq_or_grep="$8"
    local debug="$9"
    debug_msg "**********************************************************"
    debug_msg "local variables"
    debug_msg "Swarm IP: $swarm_ip"
    debug_msg "Credentials: $credentials"
    debug_msg "New Log Level: $new_log_level ($new_log_level_name)"
    debug_msg "Duration: $duration"
    debug_msg "Log Level Type: $log_level_type"
    debug_msg "Log File: $log_file"
    debug_msg "Initial Log File Size: $initial_size"
    debug_msg "Current Log Level: $current_log_level"
    debug_msg "Cluster Name: $clusterName"
    debug_msg "jq_or_grep: $jq_or_grep"
    debug_msg "Debug: $debug"
    debug_msg "**********************************************************"

    # Split the swarm_ip into an array of IP addresses
    IFS=';, ' read -r -a ip_array <<< "$swarm_ip"
    debug_msg "IP Array: ${ip_array[*]}"

    # Display initial information
    if [[ "$log_level_type" == "cluster" ]]; then
        echo -e "Swarm IP: ${GREEN}${ip_array[0]}${RESET}"
    else
        echo -e "Swarm IPs: ${GREEN}${ip_array[*]}${RESET}"
    fi
    # echo -e "Swarm IP: ${GREEN}$swarm_ip${RESET}"
    echo -e "Credentials: ${GREEN}[hidden for security]${RESET}"
    echo -e "Cluster Name: ${GREEN}$clusterName${RESET}"

    debug_msg "Starting main_script function..."
    debug_msg "Log level type: $log_level_type"

    # Store the original log levels
    declare -A original_log_levels
    declare -A original_log_level_names

    if [[ $log_level_type == "cluster" ]]; then
        debug_msg "Setting cluster log level to $new_log_level ($new_log_level_name)"
        # Retrieve current log level
        if [[ "$jq_or_grep" == "grep" ]]; then
            current_log_level=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" | grep -oP '"value":\s*\K[0-9]+')
        else
            current_log_level=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" | "$jq_or_grep" -r '.value')
        fi
        current_log_level_name=${log_levels[$current_log_level]}
        debug_msg "Current cluster log level: ${BOLD_GREEN}$current_log_level${RESET}"
        debug_msg "Current log level: ${BOLD_GREEN}$current_log_level_name${RESET}"
        echo -e ""
        echo -e "New cluster log level: ${BOLD_GREEN}$new_log_level_name${RESET}"
        echo -e "Current cluster log level is ${GREEN}$current_log_level_name.${RESET}"

        # Skip update if new level matches the current level
        if [[ "$current_log_level" -eq "$new_log_level" ]]; then
            echo ""
            echo -e "Cluster log level is already set to ${BOLD_GREEN}$new_log_level_name${RESET}. No changes made."
            return
        fi

        # Update the cluster log level using PUT
        debug_msg "Updating cluster log level to $new_log_level_name"
        debug_msg "Cluster Name: $clusterName"
        response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \
            "http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" \
            -d "{\"value\": $new_log_level}")
        debug_msg "Response: $response"

        if [[ "$jq_or_grep" == "grep" ]]; then
            updated_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
        else
            updated_log_level=$(echo "$response" | "$jq_or_grep" -r '.value')
        fi
        debug_msg "Updated cluster log level: $updated_log_level"

        if [[ "$updated_log_level" -eq "$new_log_level" ]]; then
            echo -e "Log level changed successfully from ${GREEN}$current_log_level${RESET} → ${BOLD_GREEN}$new_log_level${RESET}."
        else
            echo -e "Failed to update log level. Response: ${RED}$response${RESET}"
            exit 1
        fi

        # Countdown and revert log level
        if [[ -n "$duration" && "$duration" -gt 0 ]]; then
            echo -e "Keeping log level at ${YELLOW}$new_log_level${RESET} for ${YELLOW}$duration${RESET} second(s)..."
            echo -e ""
            for ((i=duration; i>0; i--)); do
                printf -v countdown "%02d:%02d:%02d" $((i/3600)) $(( (i%3600) / 60 )) $((i%60))
                echo -ne "Countdown: ${YELLOW}$countdown${RESET} remaining...\r"
                sleep 1
            done
            echo -e "\n\nTime's up! Reverting log level back to ${GREEN}$current_log_level${RESET}..."

            # Revert the log level back to the original value
            response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \
                "http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" \
                -d "{\"value\": $current_log_level}")
            debug_msg "Response: $response"

            if [[ "$jq_or_grep" == "grep" ]]; then
                reverted_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
            else
                reverted_log_level=$(echo "$response" | "$jq_or_grep" -r '.value')
            fi
            debug_msg "Reverted cluster log level: $reverted_log_level"

            final_size=$(stat -c%s "$log_file" 2>/dev/null || echo 0)
            size_diff=$(( final_size - initial_size ))
            size_diff_formatted=$(format_size "$size_diff")
            duration_formatted=$(format_duration "$duration")
	    if (( size_diff < 0 )); then
		echo -e "castor.log file was rotated."
	    else
                echo -e "Approximate ${BOLD_GREEN}$size_diff_formatted${RESET} new logs were generated at log level ${BOLD_GREEN}$new_log_level${RESET}. Current castor.log size is ${BOLD_GREEN}$(format_size "$final_size")${RESET} after $duration_formatted."
	    fi

            if [[ "$reverted_log_level" -eq "$current_log_level" ]]; then
                echo -e "Log level reverted successfully back to ${BOLD_GREEN}$current_log_level${RESET}."
            else
                echo -e "Failed to revert log level. Response: ${RED}$response${RESET}"
                exit 1
            fi
        else
            echo "Log level change is permanent until manually modified."
        fi
    elif [[ "$log_level_type" == "node" ]]; then
         # First loop: Change the node log level
        for ip in "${ip_array[@]}"; do
            # Retrieve current node log level
            debug_msg "Retrieving current node log level for IP: $ip"
            debug_msg "curl -s -u \"$credentials\" \"http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel\""
            current_log_level=$(curl -s -u "$credentials" "http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel" | $jq_or_grep -r '.value')
            original_log_levels["$ip"]=$current_log_level
            if [[ "$current_log_level" == "0" ]]; then
                current_log_level_name="default"
            else
                current_log_level_name=${log_levels[$current_log_level]}
            fi
            original_log_level_names["$ip"]=$current_log_level_name
            if [[ "$new_log_level" == "0" ]]; then
                new_log_level_name="default"
            else
                new_log_level_name=${log_levels[$new_log_level]}
            fi
            debug_msg "Current log level for IP $ip: $current_log_level ($current_log_level_name)"
            echo ""
            echo -e "New node log level: ${BOLD_GREEN}$new_log_level_name${RESET}"
            echo -e "Current node log level for IP $ip is ${GREEN}$current_log_level_name.${RESET}"

            # Skip update if new level matches the current level
            if [[ "$current_log_level" -eq "$new_log_level" ]]; then
                echo ""
                echo -e "Node log level for IP $ip is already set to ${BOLD_GREEN}$new_log_level_name${RESET}. No changes made."
                continue
            fi

            # Update the node log level using PUT
            debug_msg "Updating node log level to $new_log_level_name for IP $ip..."
            debug_msg "curl --user \"$credentials\" -sS -X PUT -H \"Content-Type: application/json\" \"http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$new_log_level\""
            response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \
                "http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$new_log_level")
            debug_msg "Response: $response"

            if [[ "$jq_or_grep" == "grep" ]]; then
                updated_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
            else
                updated_log_level=$(echo "$response" | "$jq_or_grep" -r '.value')
            fi
            debug_msg "Updated log level for IP $ip: $updated_log_level"

            if [[ "$updated_log_level" -eq "$new_log_level" ]]; then
                echo -e "Node log level for IP $ip changed successfully from ${GREEN}$current_log_level_name${RESET} → ${BOLD_GREEN}$new_log_level_name${RESET}."
            else
                echo -e "Failed to update node log level for IP $ip. Response: ${RED}$response${RESET}"
                exit 1
            fi
        done

        # Second loop: Countdown if duration is provided
        if [[ -n "$duration" && "$duration" -gt 0 ]]; then
            echo -e ""
            echo -e "Keeping node(s) log level at ${YELLOW}$new_log_level_name${RESET} for ${YELLOW}$duration${RESET} second(s)..."
            echo -e ""
            for ((i=duration; i>0; i--)); do
                printf -v countdown "%02d:%02d:%02d" $((i/3600)) $(( (i%3600) / 60 )) $((i%60))
                echo -ne "Countdown: ${YELLOW}$countdown${RESET} remaining...\r"
                sleep 1
            done
            echo -e "\n\nTime's up! Reverting node log level back to original levels..."

            # Third loop: Revert the node log level
            for ip in "${ip_array[@]}"; do
                current_log_level=${original_log_levels["$ip"]}
                current_log_level_name=${original_log_level_names["$ip"]}
                debug_msg "Reverting node log level back to $current_log_level_name for IP $ip..."
                debug_msg "curl --user \"$credentials\" -sS -X PUT -H \"Content-Type: application/json\" \"http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$current_log_level\""
                response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \
                    "http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$current_log_level")
                debug_msg "Response: $response"

                if [[ "$jq_or_grep" == "grep" ]]; then
                    reverted_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
                else
                    reverted_log_level=$(echo "$response" | "$jq_or_grep" -r '.value')
                fi

                final_size=$(stat -c%s "$log_file" 2>/dev/null || echo 0)
                size_diff=$(( final_size - initial_size ))
                size_diff_formatted=$(format_size "$size_diff")
                duration_formatted=$(format_duration "$duration")

                if [[ "$reverted_log_level" -eq "$current_log_level" ]]; then
                    echo -e "Node log level for IP $ip reverted successfully back to ${BOLD_GREEN}${current_log_level_name}${RESET}."
                else
                    echo -e "Failed to revert node log level for IP $ip. Response: ${RED}$response${RESET}"
                    exit 1
                fi
            done

            # Combine the log output for multiple IP addresses into a single summary
	    if (( size_diff < 0 )); then
		echo -e "castor.log file was rotated."
	    else
                echo -e ""
                echo -e "Approximate ${BOLD_GREEN}$size_diff_formatted${RESET} new logs were generated at log level ${BOLD_GREEN}$new_log_level_name${RESET} for IP ${ip_array[*]}. Current castor.log size is ${BOLD_GREEN}$(format_size "$final_size")${RESET} after $duration_formatted."
	    fi
        else
            echo "Log level change is permanent until manually modified."
        fi
    fi
}

# Run in detachable or directly
if $detachable; then
    # Pass the main_script function to the screen session and store the output in a file
    if command -v screen &>/dev/null; then
        # screen -dmS indexer_script bash -c "$(declare -f main_script format_size format_duration check_jq debug); main_script \"$swarm_ip\" \"$credentials\" \"$new_log_level\" \"$new_log_level_name\" \"$duration\" \"$log_level_type\" \"$clusterName\" \"$jq_or_grep\" | tee \"$output_log\""
        screen -dmS indexer_script bash -c "$(declare -f main_script debug_msg format_size format_duration check_jq); main_script \"$swarm_ip\" \"$credentials\" \"$new_log_level\" \"$new_log_level_name\" \"$duration\" \"$log_level_type\" \"$clusterName\" \"$jq_or_grep\" \"$debug\" | tee \"$output_log\""
        screen -r indexer_script
    elif command -v tmux &>/dev/null; then
        # tmux new-session -d -s indexer_script "$(declare -f main_script format_size format_duration check_jq debug); main_script \"$swarm_ip\" \"$credentials\" \"$new_log_level\" \"$new_log_level_name\" \"$duration\" \"$log_level_type\" \"$clusterName\" \"$jq_or_grep\" | tee \"$output_log\""
        tmux new-session -d -s indexer_script "$(declare -f main_script debug_msg format_size format_duration check_jq); main_script \"$swarm_ip\" \"$credentials\" \"$new_log_level\" \"$new_log_level_name\" \"$duration\" \"$log_level_type\" \"$clusterName\" \"$jq_or_grep\" \"$debug\" | tee \"$output_log\""
        tmux attach-session -t indexer_script
    else
        echo "Error: Neither screen nor tmux available. Run without --detachable."
        exit 1
    fi

    # Wait for the screen session to complete and then display the output log
    sleep 1
    while screen -list | grep -q "indexer_script"; do
        sleep 1
    done

    echo ""
    cat "$output_log"
else
    # main_script "$swarm_ip" "$credentials" "$new_log_level" "$duration" "$clusterName" "$jq_or_grep" | tee "$output_log"
    debug_msg "***********************************"
    debug_msg "Parameters passed to main_script:"
    debug_msg "Swarm IP: $swarm_ip"
    debug_msg "Credentials: $credentials"
    debug_msg "New Log Level: $new_log_level"
    debug_msg "Duration: $duration"
    debug_msg "Cluster Name: $clusterName"
    debug_msg "jq_or_grep: $jq_or_grep"
    debug_msg "***********************************"
    debug_msg "Running main_script function..."
    main_script "$swarm_ip" "$credentials" "$new_log_level" "$new_log_level_name" "$duration" "$log_level_type" "$clusterName" "$jq_or_grep" | tee "$output_log"
fi

  • No labels