Managing Log Levels in Swarm Cluster Using a Shell Script
The Swarm Log Level Management Script (castor-change-log-level.sh
) is designed to manage and dynamically adjust log levels on a DataCore Swarm cluster, providing options to change the log level temporarily and automatically revert it after a specified duration. It supports background execution via screen
or tmux
, making it ideal for long-running operations that require detachment from the terminal.
Script Features
Set and Revert Log Levels: Temporarily change the log level and revert after a specified duration.
Flexible JSON Parsing: Uses
jq
for JSON parsing if available; defaults togrep
otherwise.Detachable Session Execution: Runs in a detachable session using
screen
ortmux
, allowing the script to continue running independently of the terminal session.Log Size Monitoring: Reports the log size generated during the temporary log level change.
Countdown Display: Shows a countdown for the specified duration.
Requirements
jq (optional): Used for parsing JSON responses; falls back to
grep
if unavailable.screen or tmux (optional): Required for background execution.
Permissions: Ensure sufficient permissions to execute on the DataCore Swarm server and access required files.
Script Usage
./castor-change-log-level.sh -d <node_ip> -p <admin:password> -i <new_log_level> [-t <duration_in_seconds>] [--detach] [-v]
Parameter | Description |
---|---|
| IP address of the Swarm API endpoint (or set |
| Admin credentials in the format |
| New log level to set (values: |
| Duration in seconds to keep the new log level (optional). |
| Runs the script in a detachable session using |
Instruction for Use
Example 1: Set cluster log level to 20
and keep it for 10
minutes
./castor-change-log-level.sh -d 192.168.8.84 -p admin:datacore -i 20 -t 600
Example 2: Set node(s) log level to debug (10) and keep it for 10 minutes
./castor-change-log-level.sh -d 192.168.8.84,192.168.8.86 -p admin:datacore -i debug -t 600
Example 3: Run in background mode
./castor-change-log-level.sh -d 192.168.8.84 -p admin:datacore -i 20 -t 30 --detach
Running the Script at a Specific Time
The at
command can used to schedule the script to run at a later time. This is useful when you need to start collecting debug logs at a specific hour.
Example 4: Schedule the script to run at 3:00 AM and collect debug logs for 1 hour
Ensure the at service is installed, enabled and running
dnf -y install at atd systemctl enable --now at
Schedule the script execution using
at
:echo "/root/dist/castor-change-log-level.sh -i 10 -t 3600" | at 03:00 AM 2/16/2025
This schedules the script to run at 3:00 AM on Feb 16 2025 and collects log at debug level for 1 hour.
To verify scheduled job:
atq 7 Sun Feb 16 03:00:00 2025 a root
To remove a scheduled job (replace
JOB_ID
with the actual job number fromatq
output):atrm 7
Behavior
Log Level Change: Sets the log level to the specified value. If the current log level matches the requested level, the script skips the update.
Countdown: During the specified duration, the script displays a countdown every second.
Revert Log Level: After the countdown, the log level reverts to the initial value.
Log Size Report: Provides approximately log size generated during the temporary log level change.
Output Messages
Message | Description |
---|---|
| Displays the specified Swarm IP address. |
| Credentials are masked for security. |
| Displays the cluster name retrieved from the Swarm API. |
| Shows the new log level requested. |
| Displays the current log level. |
| Indicates the beginning of the log level update process. |
| Confirms that the log level was successfully updated. |
| Shows the temporary period for which the new log level is retained, with a countdown. |
| Indicates that the temporary period has ended and the script is reverting the log level. |
| Provides information on the amount of logging activity generated during the temporary log level. |
Example Output
[root@scs dist]# ./castor-change-log-level.sh -p admin:datacore -i 10 -t 300 Swarm IP: 192.168.1.84 Credentials: [hidden for security] Cluster Name: gatewayadmindomain New log level: 10 Current log level is 30. Updating log level to 10... Log level changed successfully from 30 → 10. Keeping log level at 10 for 300 second(s)... Countdown: 00:00:01 remaining... Time's up! Reverting log level back to 30... Approximate 69.4MB new logs were generated at log level 10. Current castor.log size is 371.3MB after 00:05:00. Log level reverted successfully back to 30. [root@scs dist]#
Error Handling
Missing Parameters: Missing parameters prompt a usage message.
Invalid Duration: If a non-numeric duration is provided, you’re prompted to enter a valid duration in seconds.
Connection Issues: If unable to connect to the Swarm API, check the IP, credentials, and network access.
Notes
Credentials are masked in the output for security.
Log file sizes are shown in human-readable format (GB, MB, KB, B).
This script provides administrators with an effective way to adjust and monitor Swarm logging, supporting both temporary and permanent log level changes for troubleshooting and performance monitoring.
Script Source Code
Latest version: castor-change-log-level.sh
#!/bin/bash # ----------------------------------------------------------------------------------------------------------------------------- # Script: castor-change-log-level.sh # ----------------------------------------------------------------------------------------------------------------------------- # Description: # This script changes the log level for the Castor cluster or node(s) using the Swarm API. # The script supports changing the log level for the entire cluster or individual node(s). # The script can run in a detachable session using 'screen' or 'tmux'. # ----------------------------------------------------------------------------------------------------------------------------- # Written by Milton Suen (milton.suen@datacore.com) Oct 31, 2024 # Revision History: # v1.0.0 - Update to support running the script in a detachable session using screen or tmux. # v1.1.0 - 2025-02-20 Add support node(s) level log level change. # v1.2.0 - 2025-02-26 SUPSCR-208: # - Enforced proper credential formatting: credentials must be in the username:password format. # - Fixed help message shows script name without hard code it. # v1.2.1 - 2025-02-26 SUPSCR-209: Auto detect CSN or SCS to adjust castor.log file path. # v1.2.2 - 2025-02-27 Address the issue of the script not display correct when the castor.log file is rotated. # v1.2.3 - 2025-02-27 Address the issue of log level not display correct within detach session. # v1.2.4 - 2025-02-27 Bug fix: The default node-level log level is 0 (unset), which differs from the cluster-level default of 30. # v1.2.5 - 2025-02-28 SUPSCR-208: # - Credentials validation and password prompt enhancements. # - screen or tmux required with detachable mode, script will stopped with '-D' option if neither is installed. # - Fixed enter password multiple times issue. # - Hide password on debug output messages. # ----------------------------------------------------------------------------------------------------------------------------- # Current Version: 1.2.5 # ----------------------------------------------------------------------------------------------------------------------------- # KB: https://perifery.atlassian.net/wiki/spaces/public/pages/3872161835/Setting+and+Managing+Swarm+Log+Levels+with+script # ----------------------------------------------------------------------------------------------------------------------------- # Define colors RED='\033[0;31m' BOLD_RED='\033[1;31m' GREEN='\033[0;32m' BOLD_GREEN='\033[1;32m' UNDERLINE_BOLD_GREEN='\033[4;32m' YELLOW='\033[0;33m' BOLD_YELLOW='\033[1;33m' BLUE='\033[0;34m' BOLD_BLUE='\033[1;34m' MAGENTA='\033[0;35m' BOLD_MAGENTA='\033[1;35m' CYAN='\033[0;36m' BOLD_CYAN='\033[1;36m' RESET='\033[0m' # Reset color to default SCRIPT_NAME=$(basename "$0") # Function to display usage information usage() { echo "Usage: ./$SCRIPT_NAME -d swarm_ip -p admin:password [-i new_log_level | -L new_node_log_level] [-t duration_in_seconds] [-D]" echo " -d, --swarm_ip IP address of the Swarm API endpoint. Supports single or multiple IPs separated by \",\", \";\" or \" \"." echo " If multiple IPs are provided, the script will update the log level for all nodes." echo " (Alternatively, set the SCSP_HOST environment variable to the Swarm IP.)" echo " -p, --credentials Credentials in the format admin:password" echo " -i, --log.level New cluster log level to set (5, 10, 15, 20, 30, 40, 50, chatter, debug, announce, info, error, critical, default)" echo " -L, --node.log.level New node log level to set (0, 5, 10, 15, 20, 30, 40, 50, chatter, debug, announce, info, error, critical, default)" echo " **Either -i or -L must be specified, but not both.**" echo " -t, --time (Optional) Duration in seconds to keep the new log level (must be greater than 0)" echo " -D, --detach (Optional) Detach the script from the current terminal and run in a detachable session using screen or tmux" exit 1 } # Default options detachable=false debug=false output_log="castor-change-log-level_output.log" # Log file for capturing detachable session output log_level_type="cluster" # Default log level type default_log_level=30 # Default log level log_file="/var/log/datacore/castor.log" # Default log file location SCRIPTDIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd ) JQLOCATION=$SCRIPTDIR/jq MAX_RETRIES=3 # Maximum number of password retries attempts username="admin" # Default username password="" # Default password # Global associative array for log levels declare -A log_levels=( [5]="chatter" [10]="debug" [15]="audit" [20]="info" [30]="warning" [40]="error" [45]="defect" [50]="critical" [60]="announce" [30]="default" ) # Global associative array for log level names declare -A log_level_names=( ["chatter"]=5 ["debug"]=10 ["audit"]=15 ["info"]=20 ["warning"]=30 ["error"]=40 ["defect"]=45 ["critical"]=50 ["announce"]=60 ["default"]=30 ) # Function to get the current timestamp timestamp() { date +"%Y-%m-%d_%H:%M:%S.%4N" } # Function to display debug messages if debug mode is enabled debug_msg() { if $debug; then local caller_line=${BASH_LINENO[0]:-unknown} # Fallback to "unknown" if empty echo -e "$(timestamp) [DEBUG] (Line $caller_line) $*" fi } # Function to check if either 'screen' or 'tmux' is installed check_screen_or_tmux() { if ! command -v screen &>/dev/null && ! command -v tmux &>/dev/null; then echo -e "" echo -e "---------------------------------------------------------------------------------------" echo -e " ${YELLOW}Warning${RESET}: Neither '${BOLD_GREEN}screen${RESET}' nor '${BOLD_GREEN}tmux${RESET}' is installed. Cannot run in detachable mode." echo -e " Please install either '${BOLD_GREEN}screen${RESET}' or '${BOLD_GREEN}tmux${RESET}' to run the script in a detachable session." # echo -e "" # echo -e " Proceeding with detachable mode disabled." echo -e "---------------------------------------------------------------------------------------" echo -e "" detachable=false # Disable detachable session debug_msg "detachable=${YELLOW}false${RESET}" debug_msg "Stopping the script..." exit 1 fi } # Function to format file size format_size() { local size=$1 if (( size >= 1073741824 )); then echo "$(awk "BEGIN {printf \"%.1fGB\", $size/1073741824}")" elif (( size >= 1048576 )); then echo "$(awk "BEGIN {printf \"%.1fMB\", $size/1048576}")" elif (( size >= 1024 )); then echo "$(awk "BEGIN {printf \"%.1fKB\", $size/1024}")" else echo "${size}B" fi } # Function to format duration format_duration() { local duration=$1 local hours=$((duration / 3600)) local minutes=$(( (duration % 3600) / 60 )) local seconds=$((duration % 60)) printf "%02d:%02d:%02d" $hours $minutes $seconds } # Function to check if jq is available and set up JSON parsing method check_jq() { if [[ -x "/usr/local/bin/jq" ]]; then echo "/usr/local/bin/jq" elif [[ -x "$(pwd)/jq" ]]; then echo "$(pwd)/jq" elif command -v jq &>/dev/null; then echo "jq" else echo "grep" fi } if [[ -f "$JQLOCATION" ]]; then jq_or_grep=$JQLOCATION else jq_or_grep=$(check_jq) fi debug_msg "jq_or_grep: $jq_or_grep" #jq_or_grep=$(check_jq) # Function to determine the log file path determine_log_file() { if [[ -f "/var/log/datacore/castor.log" ]]; then echo "/var/log/datacore/castor.log" elif [[ -f "/var/log/caringo/castor.log" ]]; then echo "/var/log/caringo/castor.log" else echo "Error: Log file not found in /var/log/datacore/castor.log or /var/log/caringo/castor.log" exit 1 fi } log_file=$(determine_log_file) # Function to check if credentials are valid check_credentials() { local CREDENTIALS="$1" local SWARM_IP="$2" debug_msg "Credentials: [hidden for security]" debug_msg "Swarm IP: $SWARM_IP" # API endpoint for validating user credentials local VALIDATE_URL="http://${SWARM_IP}:91/api/validateUser" debug_msg "Validate URL: $VALIDATE_URL" # Make the API request debug_msg "Validating user credentials..." debug_msg "curl -s -u \"[hidden for security]\" -X GET \"$VALIDATE_URL\" -H 'Content-Type: application/json'" RESPONSE=$(curl -s -u "$CREDENTIALS" -X GET "$VALIDATE_URL" -H 'Content-Type: application/json') debug_msg "Validate User Response: $RESPONSE" # Check if the response contains "isValid": true if echo "$RESPONSE" | "$jq_or_grep" -e '.isValid == true' > /dev/null 2>&1; then debug_msg "Authentication successful for user '${CREDENTIALS%%:*}'" return 0 # Success elif echo "$RESPONSE" | "$jq_or_grep" -e '.isValid == false' > /dev/null 2>&1; then debug_msg "Authentication failed for user '${CREDENTIALS%%:*}'" return 1 # Failure else debug_msg "Error: Unable to validate credentials. Please check your inputs." return 1 # Failure fi } # Function to print credentials - hide password print_credentials() { local CREDENTIALS="$1" local USERNAME="${CREDENTIALS%%:*}" echo -e "${GREEN}$USERNAME${RESET}":"${GREEN}[hidden for security]${RESET}" } # Parse input arguments while [[ "$#" -gt 0 ]]; do case $1 in -d|--swarm_ip) swarm_ip="$2"; debug_msg "Set swarm_ip to $swarm_ip"; shift 2 ;; -p|--credentials) credentials="$2" if ! [[ "$credentials" =~ ^[^:]+(:[^:]+)?$ ]]; then echo -e "" echo -e "Error: Credentials must be in the format ${BOLD_GREEN}username:password${RESET} or ${BOLD_GREEN}username${RESET}" exit 1 fi if [[ "$credentials" =~ [^a-zA-Z0-9:] ]]; then echo -e "" echo -e "Warning: Password contains special characters. Please enclose the credentials with single quotes (${BOLD_YELLOW}'${BOLD_GREEN}username:password${RESET}${BOLD_YELLOW}'${RESET})." exit 1 fi debug_msg "Set credentials"; shift 2 ;; -i|--log.level) if [[ -n "$new_log_level" ]]; then echo "Error: Options -i (cluster log leve) and -L (node log level) cannot be used together." usage fi if [[ ${log_level_names[$2]} ]]; then new_log_level=${log_level_names[$2]} elif [[ ${log_levels[$2]} ]]; then new_log_level=$2 else echo "Invalid log level: $2" exit 1 fi new_log_level_name=${log_levels[$new_log_level]} log_level_type="cluster" default_log_level=30 debug_msg "Set new_log_level to $new_log_level ($new_log_level_name)" shift 2 ;; -L|--node.log.level) if [[ -n "$new_log_level" ]]; then echo "Error: Options -i (cluster log leve) and -L (node log level) cannot be used together." usage fi if [[ ${log_level_names[$2]} ]]; then new_log_level=${log_level_names[$2]} elif [[ ${log_levels[$2]} || $2 -eq 0 ]]; then new_log_level=$2 new_log_level_name="default" else echo "Invalid log level: $2" exit 1 fi new_log_level_name=${log_levels[$new_log_level]} log_level_type="node" default_log_level=0 debug_msg "Set new_log_level to $new_log_level ($new_log_level_name)" shift 2 ;; -t|--time) if [[ -n "$2" && "$2" != -* && "$2" -gt 0 ]]; then duration="$2" debug_msg "Set duration to $duration" shift 2 else echo "Error: Duration must be a number greater than 0." exit 1 fi ;; -D|--detach) detachable=true; debug_msg "Set detachable to true"; shift ;; --debug) debug=true; debug_msg "Set debug to true with ${YELLOW}--debug${RESET}"; shift ;; *) usage ;; esac done debug_msg "Set log_level_type to ${YELLOW}$log_level_type${RESET}" # Set default values if not provided debug_msg "Checking for default log level values..." # Change the default log level based on Log Level Type debug_msg "Changing default log level based on Log Level Type: ${YELLOW}$log_level_type${RESET}" if [[ -n "$new_log_level" ]]; then if [[ "$log_level_type" == "node" ]]; then # new_log_level=0 debug_msg "Checing node default log level" default_log_level=0 if [[ $new_log_level == 30 ]]; then new_log_level=$default_log_level new_log_level_name="default" fi debug_msg "node - new_log_level: to $new_log_level" debug_msg "node - new_log_level_name: to $new_log_level_name" elif [[ "$log_level_type" == "cluster" ]]; then default_log_level=30 fi debug_msg "New log level: ${GREEN}$new_log_level${RESET}" debug_msg "New log level name: ${GREEN}$new_log_level_name${RESET}" debug_msg "Set default_log_level to: $default_log_level" fi # Check if 'screen' or 'tmux' is installed if [[ "$detachable" == true ]]; then check_screen_or_tmux fi # If swarm_ip is not provided, try using SCSP_HOST environment variable if [[ -z "$swarm_ip" ]]; then if [[ -n "$SCSP_HOST" ]]; then swarm_ip="$SCSP_HOST" debug "Using Swarm IP from SCSP_HOST: $swarm_ip" else echo "Error: swarm_ip not provided and SCSP_HOST is not set." usage fi fi # Check if required arguments are provided if [[ -z "$credentials" || -z "$new_log_level" ]]; then usage fi # Split the swarm_ip into an array of IP addresses if it contains delimiters IFS=';, ' read -r -a ip_array <<< "$swarm_ip" # Validate credentials before proceeding debug_msg "Validating credentials..." if [[ "$credentials" =~ ^[^:]+$ ]]; then CREDENTIALS="$credentials" ATTEMPT=$MAX_RETRIES USERNAME="" PASSWORD="" debug_msg "CREDENTIALS: $(print_credentials "$CREDENTIALS")" debug_msg "ATTEMPT: $ATTEMPT" debug_msg "USERNAME: $USERNAME" debug_msg "PASSWORD: $PASSWORD" # Check if credentials contain a colon (username:password format) if [[ "$CREDENTIALS" == *":"* ]]; then USERNAME="${CREDENTIALS%%:*}" PASSWORD="${CREDENTIALS#*:}" # Validate credentials check_credentials "$CREDENTIALS" "${ip_array[0]}" if [[ $? -ne 0 ]]; then debug_msg "Invalid credentials. Please check your username and password." echo "" echo "${RED}Error${RESET}: Invalid credentials. Please check your username and password." exit 1 fi debug_msg "Validate credentials" credentials=$CREDENTIALS return 0 else debug_msg "Credentials do not contain password" debug_msg "Username: $(print_credentials "$CREDENTIALS")" USERNAME="$CREDENTIALS" # Prompt for password while [[ $ATTEMPTS -lt $MAX_RETRIES ]]; do echo "" read -sp "Enter password for user $USERNAME: " PASSWORD echo "" echo "" CREDENTIALS="$USERNAME:$PASSWORD" debug_msg "CREDENTIALS: $(print_credentials "$CREDENTIALS"))" check_credentials "$CREDENTIALS" "${ip_array[0]}" if [[ $? -eq 0 ]]; then debug_msg "CREDENTIALS: $(print_credentials "$CREDENTIALS"))" debug_msg "Valid credentials" credentials=$CREDENTIALS break else echo "" echo "${RED}Error${RESET}: Invalid credentials. Please check your username and password." fi ((ATTEMPTS++)) done fi else if ! check_credentials "$credentials" "${ip_array[0]}"; then echo -e "" echo -e "${YELLOW}Warning${RESET}: ${RED}Invalid credentials${RESET}. Please check your username and password in the format ${GREEN}username:password${RESET}." echo -e "" usage exit 1 fi fi # Retrieve cluster name and handle JSON parsing using the first IP address debug_msg "Retrieving the cluster name from Swarm API using IP: ${ip_array[0]}" if [[ "$jq_or_grep" == "grep" ]]; then clusterName=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters" | grep -oP '"name":\s*"\K[^"]+') else clusterName=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters" | "$jq_or_grep" -r '._embedded.clusters[0].name') fi if [[ -z "$clusterName" ]]; then echo "Failed to retrieve the cluster name. Please check your inputs." exit 1 fi debug_msg "Cluster Name: $clusterName" # Main logic function to run the script tasks main_script() { local swarm_ip="$1" local credentials="$2" local new_log_level="$3" local new_log_level_name="$4" local duration="$5" local log_level_type="$6" #local log_file="/var/log/datacore/castor.log" if [[ -z "$log_file" ]]; then log_file=$(determine_log_file) fi local initial_size=$(stat -c%s "$log_file" 2>/dev/null || echo 0) local current_log_level local clusterName="$7" local jq_or_grep="$8" local detachable="$9" if [[ "$detachable" ]]; then local debug="${10}" eval "$(echo "${11}" | sed 's/declare -A/declare -A/')" eval "$(echo "${12}" | sed 's/declare -A/declare -A/')" fi debug_msg "**********************************************************" debug_msg "local variables" debug_msg "Log Level Type: ${GREEN}$log_level_type${RESET}" debug_msg "Default Log Level: ${GREEN}$default_log_level${RESET}" debug_msg "Swarm IP: $swarm_ip" debug_msg "Credentials: $(print_credentials "$credentials")" debug_msg "New Log Level: $new_log_level" debug_msg "New log Level Name: $new_log_level_name" debug_msg "Duration: $duration" debug_msg "Log File: $log_file" debug_msg "Initial Log File Size: $initial_size" debug_msg "Current Log Level: $current_log_level" debug_msg "Cluster Name: $clusterName" debug_msg "jq_or_grep: $jq_or_grep" debug_msg "Detach: $detachable" debug_msg "Debug: $debug" debug_msg "**********************************************************" # Split the swarm_ip into an array of IP addresses IFS=';, ' read -r -a ip_array <<< "$swarm_ip" debug_msg "IP Array: ${ip_array[*]}" # Display initial information if [[ "$log_level_type" == "cluster" ]]; then echo -e "Swarm IP: ${GREEN}${ip_array[0]}${RESET}" else echo -e "Swarm IPs: ${GREEN}${ip_array[*]}${RESET}" fi # echo -e "Swarm IP: ${GREEN}$swarm_ip${RESET}" echo -e "Credentials: ${GREEN}[hidden for security]${RESET}" echo -e "Cluster Name: ${GREEN}$clusterName${RESET}" debug_msg "Starting main_script function..." debug_msg "Log level type: $log_level_type" # Store the original log levels declare -A original_log_levels declare -A original_log_level_names if [[ $log_level_type == "cluster" ]]; then debug_msg "Setting cluster log level to $new_log_level ($new_log_level_name)" # Retrieve current log level if [[ "$jq_or_grep" == "grep" ]]; then current_log_level=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" | grep -oP '"value":\s*\K[0-9]+') else current_log_level=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" | "$jq_or_grep" -r '.value') fi current_log_level_name=${log_levels[$current_log_level]} debug_msg "Current cluster log level: ${BOLD_GREEN}$current_log_level${RESET}" debug_msg "Current log level name: ${BOLD_GREEN}$current_log_level_name${RESET}" echo -e "" echo -e "New cluster log level: ${BOLD_GREEN}$new_log_level_name${RESET}" echo -e "Current cluster log level is ${GREEN}$current_log_level_name.${RESET}" # Skip update if new level matches the current level if [[ "$current_log_level" -eq "$new_log_level" ]]; then echo "" echo -e "Cluster log level is already set to ${BOLD_GREEN}$new_log_level_name${RESET}. No changes made." return fi # Update the cluster log level using PUT debug_msg "Updating cluster log level to $new_log_level_name" debug_msg "Cluster Name: $clusterName" debug_msg "Credentials: $(print_credentials "$credentials")" debug_msg "curl --user \"$(print_credentials "$credentials")\" -sS -X PUT -H \"Content-Type: application/json\" \"http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level\" -d \"{\\\"value\\\": $new_log_level}\"" response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \ "http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" \ -d "{\"value\": $new_log_level}") debug_msg "Response: $response" if [[ "$jq_or_grep" == "grep" ]]; then updated_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+') else updated_log_level=$(echo "$response" | "$jq_or_grep" -r '.value') fi debug_msg "Updated cluster log level: $updated_log_level" if [[ "$updated_log_level" -eq "$new_log_level" ]]; then echo -e "Log level changed successfully from ${GREEN}$current_log_level${RESET} -> ${BOLD_GREEN}$new_log_level${RESET}." else echo -e "Failed to update log level. Response: ${RED}$response${RESET}" exit 1 fi # Countdown and revert log level if [[ -n "$duration" && "$duration" -gt 0 ]]; then echo -e "Keeping log level at ${YELLOW}$new_log_level_name${RESET} for ${YELLOW}$duration${RESET} second(s)..." echo -e "" for ((i=duration; i>0; i--)); do printf -v countdown "%02d:%02d:%02d" $((i/3600)) $(( (i%3600) / 60 )) $((i%60)) echo -ne "Countdown: ${YELLOW}$countdown${RESET} remaining...\r" sleep 1 done echo -e "\n\nTime's up! Reverting log level back to ${GREEN}$current_log_level${RESET}..." # Revert the log level back to the original value response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \ "http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" \ -d "{\"value\": $current_log_level}") debug_msg "Response: $response" if [[ "$jq_or_grep" == "grep" ]]; then reverted_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+') else reverted_log_level=$(echo "$response" | "$jq_or_grep" -r '.value') fi debug_msg "Reverted cluster log level: $reverted_log_level" final_size=$(stat -c%s "$log_file" 2>/dev/null || echo 0) debug_msg "Initial log file size: $initial_size" debug_msg "Final log file size: $final_size" size_diff=$(( final_size - initial_size )) size_diff_formatted=$(format_size "$size_diff") duration_formatted=$(format_duration "$duration") if (( size_diff < 0 )); then echo -e "castor.log file was rotated." else echo -e "Approximate ${BOLD_GREEN}$size_diff_formatted${RESET} new logs were generated at log level ${BOLD_GREEN}$new_log_level${RESET}. Current castor.log size is ${BOLD_GREEN}$(format_size "$final_size")${RESET} after $duration_formatted." fi if [[ "$reverted_log_level" -eq "$current_log_level" ]]; then echo -e "Log level reverted successfully back to ${BOLD_GREEN}$current_log_level${RESET}." else echo -e "Failed to revert log level. Response: ${RED}$response${RESET}" exit 1 fi else echo "Log level change is permanent until manually modified." fi elif [[ "$log_level_type" == "node" ]]; then # First loop: Change the node log level local same_log_level=false for ip in "${ip_array[@]}"; do # Retrieve current node log level debug_msg "Retrieving current node log level for IP: $ip" debug_msg "curl -s -u \"$(print_credentials "$credentials")\" \"http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel\"" current_log_level=$(curl -s -u "$credentials" "http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel" | $jq_or_grep -r '.value') debug_msg "Current log level for IP $ip: $current_log_level" original_log_levels["$ip"]=$current_log_level debug_msg "Original log level for IP $ip: ${original_log_levels["$ip"]}" if [[ "$current_log_level" == "0" ]]; then current_log_level_name="default" else current_log_level_name=${log_levels[$current_log_level]} fi original_log_level_names["$ip"]=$current_log_level_name debug_msg "Original log level name: ${original_log_level_names["$ip"]}" if [[ "$new_log_level" == "0" ]]; then new_log_level_name="default" else new_log_level_name=${log_levels[$new_log_level]} fi debug_msg "Current log level for IP $ip: $current_log_level ($current_log_level_name)" echo "" echo -e "New node log level: ${BOLD_GREEN}$new_log_level_name${RESET}" echo -e "Current node log level for IP $ip is ${GREEN}$current_log_level_name.${RESET}" # Skip update if new level matches the current level if [[ "$current_log_level" -eq "$new_log_level" ]]; then same_log_level=true echo "" echo -e "Node log level for IP $ip is already set to ${BOLD_GREEN}$new_log_level_name${RESET}. No changes made." continue else same_log_level=false fi if [[ "$same_log_level" == false ]]; then # Update the node log level using PUT debug_msg "Updating node log level to $new_log_level_name for IP $ip..." debug_msg "curl --user \"$(print_credentials "$credentials")\" -sS -X PUT -H \"Content-Type: application/json\" \"http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$new_log_level\"" response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \ "http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$new_log_level") debug_msg "Response: $response" if [[ "$jq_or_grep" == "grep" ]]; then updated_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+') else updated_log_level=$(echo "$response" | "$jq_or_grep" -r '.value') fi debug_msg "Updated log level for IP $ip: $updated_log_level" if [[ "$updated_log_level" -eq "$new_log_level" ]]; then echo -e "Node log level for IP $ip changed successfully from ${GREEN}$current_log_level_name${RESET} -> ${BOLD_GREEN}$new_log_level_name${RESET}." else echo -e "Failed to update node log level for IP $ip. Response: ${RED}$response${RESET}" exit 1 fi fi done # Second loop: Countdown if duration is provided if [[ "$same_log_level" == false ]]; then\ if [[ -n "$duration" && "$duration" -gt 0 ]]; then echo -e "" echo -e "Keeping node(s) log level at ${YELLOW}$new_log_level_name${RESET} for ${YELLOW}$duration${RESET} second(s)..." echo -e "" for ((i=duration; i>0; i--)); do printf -v countdown "%02d:%02d:%02d" $((i/3600)) $(( (i%3600) / 60 )) $((i%60)) echo -ne "Countdown: ${YELLOW}$countdown${RESET} remaining...\r" sleep 1 done echo -e "\n\nTime's up! Reverting node log level back to original levels..." # Third loop: Revert the node log level for ip in "${ip_array[@]}"; do current_log_level=${original_log_levels["$ip"]} current_log_level_name=${original_log_level_names["$ip"]} debug_msg "Reverting node log level back to $current_log_level_name for IP $ip..." debug_msg "curl --user \"$(print_credentials "$credentials")\" -sS -X PUT -H \"Content-Type: application/json\" \"http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$current_log_level\"" response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \ "http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$current_log_level") debug_msg "Response: $response" if [[ "$jq_or_grep" == "grep" ]]; then reverted_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+') else reverted_log_level=$(echo "$response" | "$jq_or_grep" -r '.value') fi final_size=$(stat -c%s "$log_file" 2>/dev/null || echo 0) size_diff=$(( final_size - initial_size )) debug_msg "Initial log file size: $initial_size" debug_msg "Final log file size: $final_size" size_diff_formatted=$(format_size "$size_diff") duration_formatted=$(format_duration "$duration") if [[ "$reverted_log_level" -eq "$current_log_level" ]]; then echo -e "Node log level for IP $ip reverted successfully back to ${BOLD_GREEN}${current_log_level_name}${RESET}." else echo -e "Failed to revert node log level for IP $ip. Response: ${RED}$response${RESET}" exit 1 fi done # Combine the log output for multiple IP addresses into a single summary if (( size_diff < 0 )); then echo -e "castor.log file was rotated." else echo -e "" echo -e "Approximate ${BOLD_GREEN}$size_diff_formatted${RESET} new logs were generated at log level ${BOLD_GREEN}$new_log_level_name${RESET} for IP ${ip_array[*]}. Current castor.log size is ${BOLD_GREEN}$(format_size "$final_size")${RESET} after $duration_formatted." fi else echo "Log level change is permanent until manually modified." fi fi fi } # Run in detachable or directly if $detachable; then # Pass the main_script function to the screen session and store the output in a file debug_msg "**********************************************************" | tee -a "$output_log" debug_msg "Detach mode - Parameters passed to main_script:" | tee -a "$output_log" debug_msg " Swarm IP: $swarm_ip" | tee -a "$output_log" debug_msg " Credentials: $(print_credentials "$credentials")" | tee -a "$output_log" debug_msg " New Log Level: $new_log_level" | tee -a "$output_log" debug_msg " New log Level Name: $new_log_level_name" | tee -a "$output_log" debug_msg " Duration: $duration" | tee -a "$output_log" debug_msg " Log Level Type: $log_level_type" | tee -a "$output_log" debug_msg " Log File: $log_file" | tee -a "$output_log" debug_msg " Initial Log File Size: $initial_size" | tee -a "$output_log" debug_msg " Current Log Level: $current_log_level" | tee -a "$output_log" debug_msg " Cluster Name: $clusterName" | tee -a "$output_log" debug_msg " jq_or_grep: $jq_or_grep" | tee -a "$output_log" debug_msg " Detach: $detachable" | tee -a "$output_log" debug_msg " Debug: $debug" | tee -a "$output_log" debug_msg "**********************************************************" | tee -a "$output_log" # Convert associative arrays to strings and pass them to the screen session log_levels_string=$(declare -p log_levels) log_level_names_string=$(declare -p log_level_names) debug_msg "log_levels_string: $log_levels_string" | tee -a "$output_log" debug_msg "log_level_names_string: $log_level_names_string" | tee -a "$output_log" if command -v screen &>/dev/null; then echo -e "Running in ${YELLOW}screen${RESET} detachable mode..." | tee -a "$output_log" # if [[ "$debug" ]]; then # echo -e "Debug mode enabled. Check the output log for debug messages." # sleep 5 # fi screen -dmS indexer_script bash -c "$(declare -f main_script timestamp debug_msg format_size format_duration check_jq determine_log_file print_credentials); main_script \"$swarm_ip\" \"$credentials\" \"$new_log_level\" \"$new_log_level_name\" \"$duration\" \"$log_level_type\" \"$clusterName\" \"$jq_or_grep\" \"$detachable\" \"$debug\" \"${log_levels_string}\" \"${log_level_names_string}\" | tee \"$output_log\"" screen -r indexer_script # Wait for the screen session to complete and then display the output log sleep 1 while screen -list | grep -q "indexer_script"; do sleep 1 done elif command -v tmux &>/dev/null; then echo -e "Running in ${YELLOW}tmux${RESET} detachable mode..." | tee -a "$output_log" # if [[ "$debug" ]]; then # echo -e "Debug mode enabled. Check the output log for debug messages." # sleep 5 # fi tmux new-session -d -s indexer_script "$(declare -f main_script timestamp debug_msg format_size format_duration check_jq determine_log_file print_credentials); main_script \"$swarm_ip\" \"$credentials\" \"$new_log_level\" \"$new_log_level_name\" \"$duration\" \"$log_level_type\" \"$clusterName\" \"$jq_or_grep\" \"$detachable\" \"$debug\" \"${log_levels_string}\" \"${log_level_names_string}\" | tee \"$output_log\"" tmux attach-session -t indexer_script else echo "Error: Neither screen nor tmux available. Run without --detachable." exit 1 fi echo "" cat "$output_log" else # main_script "$swarm_ip" "$credentials" "$new_log_level" "$duration" "$clusterName" "$jq_or_grep" | tee "$output_log" debug_msg "**********************************************************" debug_msg "Parameters passed to main_script:" debug_msg "Swarm IP: $swarm_ip" debug_msg "Credentials: $(print_credentials "$credentials")" debug_msg "New Log Level: $new_log_level" debug_msg "Duration: $duration" debug_msg "Cluster Name: $clusterName" debug_msg "jq_or_grep: $jq_or_grep" debug_msg "**********************************************************" debug_msg "Running main_script function..." main_script "$swarm_ip" "$credentials" "$new_log_level" "$new_log_level_name" "$duration" "$log_level_type" "$clusterName" "$jq_or_grep" | tee "$output_log" fi