Setting and Managing Swarm Log Levels with Script
- 1 Managing Log Levels in Swarm Cluster Using a Shell Script
- 1.1 Script Features
- 1.2 Requirements
- 1.3 Script Usage
- 1.3.1 Handling Passwords with Special Characters
- 1.3.2 Instruction for Use
- 1.3.2.1 Example 1: Set cluster log level to 30 (permanent change)
- 1.3.2.2 Example 2: Set cluster log level to 20 and keep it for 10 minutes
- 1.3.2.3 Example 3: Set node(s) log level to debug (10) and keep it for 10 minutes
- 1.3.2.4 Example 4: Run in detachable mode
- 1.3.2.5 Example 5: Secure password prompt if the password is not provided
- 1.3.3 Running the Script at a Specific Time
- 1.3.4 Reattaching to a Detached Session
- 1.3.4.1 For screen users:
- 1.3.4.2 For tmux users:
- 1.4 Behavior
- 1.4.1 Output Messages
- 1.4.2 Example Output
- 1.4.3 Error Handling
- 1.5 Notes
- 1.6 Script Source Code
Managing Log Levels in Swarm Cluster Using a Shell Script
The Swarm Log Level Management Script (castor-change-log-level.sh) is designed to manage and dynamically adjust log levels on a DataCore Swarm cluster, providing options to change the log level temporarily and automatically revert it after a specified duration. It supports background execution via screen or tmux, making it ideal for long-running operations that require detachment from the terminal.
Script Features
Set and Revert Log Levels: Temporarily change the log level and revert after a specified duration.
Flexible JSON Parsing: Uses jq for JSON parsing if available; falls back to grep otherwise.
Detachable Session Execution: Runs in a detachable session using screen or tmux, allowing the script to continue running independently of the terminal session. If both screen and tmux are installed, the script uses screen.
Supports Cluster and Node Log Levels: Allows setting the log level at both the cluster and individual node levels.
Log File Auto-Detection: Automatically detects and adjusts the castor.log file path for CSN and SCS environments.
Log Size Monitoring: Reports the log size generated during the temporary log level change.
Countdown Display: Shows a countdown for the specified duration.
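The jq-or-grep fallback can be pictured with a short sketch. It assumes a response shaped like {"value": 30}, which is the field the script source reads; the sample JSON and the variable names are illustrative and were not captured from a real cluster.

```shell
#!/bin/bash
# Illustrative response body; a real Swarm API reply may carry more fields.
response='{"name": "log.level", "value": 30}'

if command -v jq >/dev/null 2>&1; then
    # Preferred path: let jq parse the JSON properly.
    level=$(jq -r '.value' <<< "$response")
else
    # Fallback: \K discards the text matched so far, leaving only the digits.
    # The -P option requires GNU grep built with PCRE support.
    level=$(grep -oP '"value":\s*\K[0-9]+' <<< "$response")
fi

echo "Parsed log level: $level"
```

The grep path only works for simple, flat responses, which is why jq is the better choice when it is installed.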
Requirements
jq: Used for parsing JSON responses; falls back to grep if unavailable.
screen or tmux (optional): Required for background execution.
Permissions: Ensure sufficient permissions to execute on the DataCore Swarm server and access required files.
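Before running the script, it can help to check which of these tools are present on the server. The following is a minimal sketch; first_available is a hypothetical helper written for this example and is not part of the script.

```shell
#!/bin/bash
# first_available is a hypothetical helper: print the first command found on PATH.
first_available() {
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# jq is preferred over grep, and screen over tmux, matching the order described above.
json_tool=$(first_available jq grep) || json_tool="none"
session_tool=$(first_available screen tmux) || session_tool="none"

echo "JSON parsing: $json_tool"
echo "Detachable sessions: $session_tool"
```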
Script Usage
Handling Passwords with Special Characters
If your password contains special characters such as @, #, $, & or !, you need to escape them or enclose the password in single quotes (') when passing it as a parameter. Alternatively, use the secure password prompt.
Example 1: Escape special characters in the password
./castor-change-log-level.sh -d 192.168.8.84 -p admin:pa\@ss\$word -i 20 -t 600
Example 2: Enclose the password in single quotes
./castor-change-log-level.sh -d 192.168.8.84 -p 'admin:pa@ss$word' -i 20 -t 600
Example 3: Use the secure password prompt (Recommended)
./castor-change-log-level.sh -p admin -d 192.168.8.84 -i 10 -t 300
The script will securely prompt you to enter the password without exposing it in the command line.
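To see why the quoting style matters, the sketch below assigns the same hypothetical password three ways and prints what the shell actually stores. The password and the variable names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical password for illustration only.
word=""                            # an unset or empty variable behaves the same here

single='admin:pa@ss$word'          # single quotes: every character is literal
escaped=admin:pa\@ss\$word         # backslashes: same result as single quotes
double="admin:pa@ss$word"          # double quotes: $word is expanded and lost

printf 'single : %s\n' "$single"
printf 'escaped: %s\n' "$escaped"
printf 'double : %s\n' "$double"
```

The double-quoted form silently drops everything from the $ onward, which is the kind of mismatch that leads to an "invalid credentials" error.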
The script is invoked as follows:
./castor-change-log-level.sh -d <node_ip> -p '<admin:password>' [-i new_log_level | -L new_node_log_level] [-t <duration_in_seconds>] [-D | --detach] [--debug]
Parameter | Description |
---|---|
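-d, --swarm_ip | IP address of the Swarm API endpoint (or set the SCSP_HOST environment variable). |
-p, --credentials | Admin credentials in the format admin:password. |
-i, --log.level | New cluster log level to set (values: 5, 10, 15, 20, 30, 40, 50, or an alias such as debug, info or default). |
-L, --node.log.level | New node log level to set (values: 0, 5, 10, 15, 20, 30, 40, 50, or an alias such as debug, info or default). |
-t, --time | Duration in seconds to keep the new log level (optional). |
-D, --detach | Runs the script in a detachable session using screen or tmux. |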
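According to the script's help text, -d also accepts several IP addresses separated by commas, semicolons or spaces. A minimal sketch of how such a list is split in bash, using the same read call that appears in the script source (the sample addresses are illustrative):

```shell
#!/bin/bash
# Example value for -d; the addresses are the ones used in this article.
swarm_ip='192.168.8.84,192.168.8.86;192.168.8.89'

# Split on ';', ',' or space into an array. IFS is changed for this read only.
IFS=';, ' read -r -a ip_array <<< "$swarm_ip"

echo "Number of nodes: ${#ip_array[@]}"
for ip in "${ip_array[@]}"; do
    echo "Node: $ip"
done
```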
Instruction for Use
Example 1: Set cluster log level to 30 (permanent change)
./castor-change-log-level.sh -d 192.168.8.84 -p admin:datacore -i 30
Example 2: Set cluster log level to 20 and keep it for 10 minutes
./castor-change-log-level.sh -d 192.168.8.84 -p admin:datacore -i 20 -t 600
Example 3: Set node(s) log level to debug (10) and keep it for 10 minutes
./castor-change-log-level.sh -d 192.168.8.84,192.168.8.86 -p admin:datacore -L debug -t 600
Example 4: Run in detachable mode
./castor-change-log-level.sh -d 192.168.8.84 -p admin:datacore -i 20 -t 30 -D
Example 5: Secure password prompt if the password is not provided
./castor-change-log-level.sh -p admin -d 192.168.8.84 -i 10 -t 300
You will be prompted to enter the password securely.
./castor-change-log-level.sh -p admin -d 192.168.8.84 -i 10 -t 300
Enter password for user admin:
Running the Script at a Specific Time
The at command can be used to schedule the script to run at a later time. This is useful when you need to start collecting debug logs at a specific hour.
Example: Schedule the script to run at 3:00 AM on 03/02/2025 and collect debug logs from nodes 192.168.8.84 and 192.168.8.89 for 1 hour
Ensure the at package is installed and the atd service is enabled and running:
dnf -y install epel-release
dnf -y install at
systemctl enable --now atd
Schedule the script execution using at:
echo "/root/dist/castor-change-log-level.sh -p admin:datacore -d '192.168.8.84,192.168.8.89' -L 10 -t 3600" | at 03:00 AM 03/02/2025
This schedules the script to run at 3:00 AM on Mar 02, 2025 and collect debug-level logs for 1 hour.
To verify the scheduled job:
atq
7 Sun Mar 02 03:00:00 2025 a root
To remove a scheduled job (replace 7 with the actual job number from the atq output):
atrm 7
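Because -t expects seconds and the node list needs its own quoting inside the at command line, it can be easier to assemble the command in variables first. The sketch below only builds and prints the line; it follows the option order from the usage line (-d for the node IPs, -L for the node log level). The path and credentials are the sample values used in this article.

```shell
#!/bin/bash
script=/root/dist/castor-change-log-level.sh
nodes='192.168.8.84,192.168.8.89'
hours=1
duration=$((hours * 3600))       # -t takes seconds

# Single quotes around the node list survive inside the double-quoted string.
job="$script -p admin:datacore -d '$nodes' -L 10 -t $duration"
echo "$job"

# To schedule it, pipe the line to at, for example:
#   echo "$job" | at 03:00 AM 03/02/2025
```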
Reattaching to a Detached Session
For more details on using screen and tmux, refer to their official documentation.
If you accidentally close your terminal (e.g. PuTTY, iTerm2) while the script is running in a detached session, you can reattach using the following commands:
For screen users:
List active screen sessions:
screen -ls
Reattach to the session:
screen -r <session_name>
For tmux users:
List active tmux sessions:
tmux ls
Reattach to the session:
tmux a -t <session_name>
If the session has ended or was not found, you may need to restart the script manually.
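For orientation, this is roughly how a command is started in a detached session with either tool. It is a sketch under assumptions: detach_command is a hypothetical helper that only prints the command it would run, and the session name is made up. The script's own detach logic may differ.

```shell
#!/bin/bash
# detach_command is a hypothetical helper that only prints the command it would run.
detach_command() {
    local session=$1
    shift
    if command -v screen >/dev/null 2>&1; then
        # screen: -dm starts detached, -S names the session
        echo "screen -dmS $session $*"
    elif command -v tmux >/dev/null 2>&1; then
        # tmux: new-session -d starts detached, -s names the session
        echo "tmux new-session -d -s $session '$*'"
    else
        echo "neither screen nor tmux is installed" >&2
        return 1
    fi
}

detach_command loglevel ./castor-change-log-level.sh -d 192.168.8.84 -p admin:datacore -i 20 -t 600 || true
```

The session name chosen here is what screen -ls or tmux ls would list when you reattach.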
Behavior
Log Level Change: Sets the log level to the specified value. If the current log level matches the requested level, the script skips the update.
Countdown: During the specified duration, the script displays a countdown every second.
Revert Log Level: After the countdown, the log level reverts to the initial value.
Log Size Report: Reports the approximate size of the logs generated during the temporary log level change.
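The sequence above (skip if unchanged, set, count down, revert) can be sketched without a cluster. In this example set_level is a hypothetical stand-in for the API call, and the duration is shortened to two seconds.

```shell
#!/bin/bash
# set_level is a hypothetical stand-in for the API call that changes the log level.
current_level=30
set_level() { current_level=$1; }

new_level=10
duration=2                       # seconds; kept short for the demo
original_level=$current_level    # remember what to restore

if [[ "$current_level" -eq "$new_level" ]]; then
    echo "Log level is already $new_level. No changes made."
else
    set_level "$new_level"
    for ((i = duration; i > 0; i--)); do
        # \r rewrites the same terminal line on every tick
        printf 'Countdown: %02d:%02d:%02d remaining...\r' $((i / 3600)) $(((i % 3600) / 60)) $((i % 60))
        sleep 1
    done
    printf '\n'
    set_level "$original_level"
    echo "Log level reverted to $current_level."
fi
```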
Output Messages
Message | Description |
---|---|
Swarm IP: | Displays the specified Swarm IP address. |
Cluster Name: | Displays the cluster name retrieved from the Swarm API. |
New log level: | Shows the new log level requested. |
Current log level is ... | Displays the current log level. |
| Indicates the beginning of the log level update process. |
Log level changed successfully ... | Confirms that the log level was successfully updated. |
Keeping log level at ... | Shows the temporary period for which the new log level is retained, with a countdown. |
Time's up! Reverting log level back to ... | Indicates that the temporary period has ended and the script is reverting the log level. |
Approximate ... new logs were generated ... | Provides information on the amount of logging activity generated during the temporary log level. |
Example Output
[root@scs dist]# ./castor-change-log-level.sh -p admin:datacore -i 10 -t 300
Swarm IP: 192.168.1.84
Cluster Name: msuen-scs1.suen.work
New log level: debug
Current log level is default.
2025-03-02T05:15:49.901Z Log level changed successfully from 30 → 10.
Keeping log level at debug for 300 second (00:05:00) ...
Countdown: 00:00:01 remaining...
Time's up! Reverting log level back to 30...
Approximate 88.9MB new logs were generated at log level 10. Current castor.log size is 371.3MB after 300 seconds (00:05:00).
2025-03-02T05:20:50.483Z Log level reverted successfully back to 30.
[root@scs dist]#
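The "Approximate ... new logs" figure is the difference between the log file size before and after the temporary period. A minimal sketch of that calculation, using a temporary file in place of castor.log (the file and the 2048-byte write are illustrative; stat -c is the GNU form):

```shell
#!/bin/bash
log=$(mktemp)                          # temporary stand-in for castor.log
initial_size=$(stat -c%s "$log")       # size in bytes before the change
head -c 2048 /dev/zero >> "$log"       # simulate 2048 bytes of new log data
final_size=$(stat -c%s "$log")
growth=$((final_size - initial_size))

if ((growth < 0)); then
    # A smaller file than at the start means the log was rotated in between.
    echo "Log file was rotated."
else
    awk -v b="$growth" 'BEGIN { printf "Approximate %.1fKB new logs were generated\n", b / 1024 }'
fi
rm -f "$log"
```

The figure is approximate because other activity writes to the same file during the period and a rotation resets the baseline.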
Error Handling
Missing Parameters: Missing parameters prompt a usage message.
Invalid Log Levels: If an unsupported log level is specified, the script reports "Invalid log level" and exits.
Invalid Duration: If the duration is missing, non-numeric or not greater than 0, the script prints an error followed by the usage message.
Connection Issues: If unable to connect to the Swarm API, check the IP, credentials, and network access.
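The duration check can be pictured as a strict positive-integer test. This is a sketch, not the script's exact code; is_valid_duration is a hypothetical helper.

```shell
#!/bin/bash
# is_valid_duration is a hypothetical helper: succeed only for a positive whole number.
is_valid_duration() {
    [[ "$1" =~ ^[1-9][0-9]*$ ]]
}

for value in 600 0 abc 10m; do
    if is_valid_duration "$value"; then
        echo "$value: valid duration"
    else
        echo "$value: invalid, expected seconds greater than 0"
    fi
done
```

A regular expression is used instead of a numeric comparison because bash treats non-numeric strings in arithmetic tests as variable names, which can hide typing mistakes.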
Notes
Credentials are masked in the output for security.
Log file sizes are shown in human-readable format (GB, MB, KB, B).
This script provides administrators with an effective way to adjust and monitor Swarm logging, supporting both temporary and permanent log level changes for troubleshooting and performance monitoring.
Script Source Code
Latest version: castor-change-log-level.sh
#!/bin/bash
# -----------------------------------------------------------------------------------------------------------------------------
# Script: castor-change-log-level.sh
# -----------------------------------------------------------------------------------------------------------------------------
# Description:
# This script changes the log level for the Castor cluster or node(s) using the Swarm API.
# The script supports changing the log level for the entire cluster or individual node(s).
# The script can run in a detachable session using 'screen' or 'tmux'.
# -----------------------------------------------------------------------------------------------------------------------------
# Written by Milton Suen (milton.suen@datacore.com) Oct 31, 2024
# Revision History:
# v1.0.0 - Update to support running the script in a detachable session using screen or tmux.
# v1.1.0 - 2025-02-20 Add support node(s) level log level change.
# v1.2.0 - 2025-02-26 SUPSCR-208:
# - Enforced proper credential formatting: credentials must be in the username:password format.
# - Fixed help message shows script name without hard code it.
# v1.2.1 - 2025-02-26 SUPSCR-209: Auto detect CSN or SCS to adjust castor.log file path.
# v1.2.2 - 2025-02-27 Address the issue of the script not displaying correctly when the castor.log file is rotated.
# v1.2.3 - 2025-02-27 Address the issue of the log level not displaying correctly within a detached session.
# v1.2.4 - 2025-02-27 Bug fix: The default node-level log level is 0 (unset), which differs from the cluster-level default of 30.
# v1.2.5 - 2025-02-28 SUPSCR-208:
# - Credentials validation and password prompt enhancements.
# - screen or tmux required with detachable mode, script will stopped with '-D' option if neither is installed.
# - Fixed an issue where users were prompted to enter password multiple times issue.
# - Passwords are now hidden in debug output messages.
# v1.2.6 - 2025-02-28 SUPSCR-208:
# - Disabled credential display on the screen.
# - Bug fix: Resolved the "debug: command not found" error when removing the -d [IP address] option.
# v1.3.0 - 2025-02-28 Minor bug fix and enhancement.
# v1.3.1 - 2025-03-01 SUPSCR-208:
# - Fixed an issue where credentials enclosed in single quotes (') were not processed correctly.
# v1.3.2 - 2025-03-02 Minor bug fix and enhancement.
# v1.3.3 - 2025-03-02 Fixed tmux command too long issue.
# v1.3.4 - Enhanced password handling to better support password with special characters.
# -----------------------------------------------------------------------------------------------------------------------------
# Current Version: 1.3.4
# -----------------------------------------------------------------------------------------------------------------------------
# KB: https://perifery.atlassian.net/wiki/spaces/KB/pages/4143939606/Setting+and+Managing+Swarm+Log+Levels+with+Script
# -----------------------------------------------------------------------------------------------------------------------------
# Define colors
RED='\033[0;31m'
BOLD_RED='\033[1;31m'
GREEN='\033[0;32m'
BOLD_GREEN='\033[1;32m'
UNDERLINE_BOLD_GREEN='\033[4;32m'
YELLOW='\033[0;33m'
BOLD_YELLOW='\033[1;33m'
BLUE='\033[0;34m'
BOLD_BLUE='\033[1;34m'
MAGENTA='\033[0;35m'
BOLD_MAGENTA='\033[1;35m'
CYAN='\033[0;36m'
BOLD_CYAN='\033[1;36m'
RESET='\033[0m' # Reset color to default
SCRIPT_NAME=$(basename "$0")
# Function to display usage information
usage() {
echo -e ""
echo -e "${BOLD_GREEN}Usage:${RESET} ./$SCRIPT_NAME -d swarm_ip -p admin:password [-i new_log_level | -L new_node_log_level] [-t duration_in_seconds] [-D | --detach]"
echo -e " -d, --swarm_ip IP address of the Swarm API endpoint. Supports single or multiple IPs separated by \",\", \";\" or \" \"."
echo -e " If multiple IPs are provided, the script will update the log level for all nodes."
echo -e " (Alternatively, set the SCSP_HOST environment variable to the Swarm IP.)"
echo -e " -p, --credentials Credentials in the format admin:password"
echo -e " -i, --log.level Cluster log level to set (5, 10, 15, 20, 30, 40, 50)"
echo -e " Aliases: chatter, debug, announce, info, error, critical, default."
echo -e " -L, --node.log.level Node log level to set (0, 5, 10, 15, 20, 30, 40, 50)"
echo -e " Aliases: chatter, debug, announce, info, error, critical, default."
echo -e " **Either -i or -L must be specified, but not both.**"
echo -e " -t, --time (Optional) Duration in seconds to keep the new log level (must be greater than 0)"
echo -e " -D, --detach (Optional) Detach the script from the current terminal and run in a detachable session using screen or tmux"
echo -e ""
exit 1
}
# Default options
detachable=false
debug=false
output_log="castor-change-log-level_output.log" # Log file for capturing detachable session output
# log_level_type="cluster" # Default log level type
# default_log_level=30 # Default log level
log_file="/var/log/datacore/castor.log" # Default log file location
SCRIPTDIR=$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )
JQLOCATION=$SCRIPTDIR/jq
MAX_RETRIES=3 # Maximum number of password retries attempts
username="admin" # Default username
password="" # Default password
# Global associative array for log levels
declare -A log_levels=(
[5]="chatter"
[10]="debug"
[15]="audit"
[20]="info"
[30]="warning"
[40]="error"
[45]="defect"
[50]="critical"
[60]="announce"
)
# Global associative array for log level names
declare -A log_level_names=(
["chatter"]=5
["debug"]=10
["audit"]=15
["info"]=20
["warning"]=30
["error"]=40
["defect"]=45
["critical"]=50
["announce"]=60
)
# Separate default values for cluster and node levels
declare -A default_log_levels=(
["cluster"]=30
["node"]=0
)
# Separate default values for cluster and node level names
declare -A default_log_level_names=(
["cluster"]="default"
["node"]="default"
)
# Function to get the current timestamp
timestamp() {
date -u +"%Y-%m-%dT%H:%M:%S.%3NZ"
}
# Function to display debug messages if debug mode is enabled
debug_msg() {
if $debug; then
local caller_func=${FUNCNAME[1]:-main} # Fall back to "main" if empty
local caller_line=${BASH_LINENO[0]:-unknown} # Fallback to "unknown" if empty
echo -e "$(timestamp) [DEBUG] ($caller_func:$caller_line) $*"
fi
}
# Function to check if either 'screen' or 'tmux' is installed
check_screen_or_tmux() {
if ! command -v screen &>/dev/null && ! command -v tmux &>/dev/null; then
echo -e ""
echo -e "---------------------------------------------------------------------------------------"
echo -e " ${YELLOW}Warning${RESET}: Neither '${BOLD_GREEN}screen${RESET}' nor '${BOLD_GREEN}tmux${RESET}' is installed. Cannot run in detachable mode."
echo -e " Please install either '${BOLD_GREEN}screen${RESET}' or '${BOLD_GREEN}tmux${RESET}' to run the script in a detachable session."
echo -e "---------------------------------------------------------------------------------------"
detachable=false # Disable detachable session
usage
fi
}
# Function to format file size
format_size() {
local size=$1
if (( size >= 1073741824 )); then
echo "$(awk "BEGIN {printf \"%.1fGB\", $size/1073741824}")"
elif (( size >= 1048576 )); then
echo "$(awk "BEGIN {printf \"%.1fMB\", $size/1048576}")"
elif (( size >= 1024 )); then
echo "$(awk "BEGIN {printf \"%.1fKB\", $size/1024}")"
else
echo "${size}B"
fi
}
# Function to format duration
format_duration() {
local duration=$1
local hours=$((duration / 3600))
local minutes=$(( (duration % 3600) / 60 ))
local seconds=$((duration % 60))
printf "%02d:%02d:%02d" $hours $minutes $seconds
}
# Function to check if jq is available and set up JSON parsing method
check_jq() {
for jq_path in "/usr/local/bin/jq" "$(pwd)/jq"; do
[[ -x "$jq_path" ]] && echo "$jq_path" && return
done
command -v jq &>/dev/null && echo "jq" || echo "grep"
}
if [[ -f "$JQLOCATION" ]]; then
jq_or_grep=$JQLOCATION
else
jq_or_grep=$(check_jq)
fi
debug_msg "jq_or_grep: $jq_or_grep"
#jq_or_grep=$(check_jq)
# Function to determine the log file path
determine_log_file() {
for log in "/var/log/datacore/castor.log" "/var/log/caringo/castor.log"; do
if [[ -f "$log" ]]; then
echo "$log"
return
fi
done
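# Report on stderr so the message is not captured as the log file path
echo -e "${RED}Error: Log file not found.${RESET}" >&2
return 1
}
# Exit here on failure: an 'exit' inside the function would only leave the $(...) subshell
log_file=$(determine_log_file) || exit 1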
# Function to check if credentials are valid
check_credentials() {
local CREDENTIALS="$1"
local SWARM_IP="$2"
debug_msg "Credentials: [hidden for security]"
debug_msg "Swarm IP: $SWARM_IP"
# API endpoint for validating user credentials
local VALIDATE_URL="http://${SWARM_IP}:91/api/validateUser"
debug_msg "Validate URL: $VALIDATE_URL"
# Make the API request
debug_msg "Validating user credentials..."
debug_msg "curl -s -u \"********\" -X GET \"$VALIDATE_URL\" -H 'Content-Type: application/json'"
RESPONSE=$(curl -s -u "$CREDENTIALS" -X GET "$VALIDATE_URL" -H 'Content-Type: application/json')
debug_msg "Validate User Response: $RESPONSE"
# Check whether the response contains "isValid": true
local is_valid
if [[ "$jq_or_grep" == "grep" ]]; then
# grep cannot evaluate jq filters, so match the raw JSON text instead
is_valid=$(echo "$RESPONSE" | grep -oP '"isValid":\s*\K(true|false)')
else
is_valid=$(echo "$RESPONSE" | "$jq_or_grep" -r '.isValid' 2>/dev/null)
fi
if [[ "$is_valid" == "true" ]]; then
debug_msg "Authentication successful for user '${CREDENTIALS%%:*}'"
return 0 # Success
elif [[ "$is_valid" == "false" ]]; then
debug_msg "Authentication failed for user '${CREDENTIALS%%:*}'"
return 1 # Failure
else
debug_msg "Error: Unable to validate credentials. Please check your inputs."
return 1 # Failure
fi
}
# Function to print credentials - hide password
print_credentials() {
local CREDENTIALS="$1"
local USERNAME="${CREDENTIALS%%:*}"
echo -e "${GREEN}$USERNAME${RESET}":"${GREEN}********${RESET}"
}
# Parse input arguments
while [[ "$#" -gt 0 ]]; do
case $1 in
-d|--swarm_ip) swarm_ip="$2"; debug_msg "Set swarm_ip to $swarm_ip"; shift 2 ;;
-p|--credentials)
credentials="$2"
shift 2 ;;
-i|--log.level)
if [[ -n "$new_log_level" ]]; then
echo "Error: Options -i (cluster log level) and -L (node log level) cannot be used together."
usage
fi
log_level_type="cluster"
if [[ ${log_level_names[$2]} ]]; then
new_log_level=${log_level_names[$2]}
elif [[ "$2" == "default" ]]; then
new_log_level=${default_log_levels[$log_level_type]}
new_log_level_name="default"
elif [[ ${log_levels[$2]} ]]; then
new_log_level=$2
else
echo "Invalid log level: $2"
exit 1
fi
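# Keep the "default" label if it was set above; otherwise look up the level name
new_log_level_name=${new_log_level_name:-${log_levels[$new_log_level]}}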
default_log_level=30
debug_msg "Set new_log_level to $new_log_level ($new_log_level_name)"
shift 2
;;
-L|--node.log.level)
if [[ -n "$new_log_level" ]]; then
echo "Error: Options -i (cluster log level) and -L (node log level) cannot be used together."
usage
fi
log_level_type="node"
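if [[ ${log_level_names[$2]} ]]; then
new_log_level=${log_level_names[$2]}
elif [[ "$2" == "default" || "$2" == "0" ]]; then
# 0 (unset) is the node-level default. Compare as a string: "$2 -eq 0" would
# also be true for any non-numeric input.
new_log_level=${default_log_levels[$log_level_type]}
new_log_level_name="default"
elif [[ ${log_levels[$2]} ]]; then
new_log_level=$2
else
echo "Invalid log level: $2"
exit 1
fi
# Keep the "default" label if it was set above; otherwise look up the level name
new_log_level_name=${new_log_level_name:-${log_levels[$new_log_level]}}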
default_log_level=0
debug_msg "Set new_log_level to $new_log_level ($new_log_level_name)"
shift 2
;;
-t|--time)
if [[ "$2" =~ ^[1-9][0-9]*$ ]]; then # Accept only a positive whole number
duration="$2"
debug_msg "Set duration to $duration"
shift 2
else
echo -e ""
echo -e "---------------------------------------------------------------------------------------"
echo -e " ${RED}Error${RESET}: Duration must be a number greater than ${BOLD_GREEN}0${RESET}."
echo -e " Please specify the duration using the ${BOLD_GREEN}-t${RESET} ${GREEN}<seconds>${RESET} or ${BOLD_GREEN}--time${RESET} ${GREEN}<seconds>${RESET} option."
echo -e "---------------------------------------------------------------------------------------"
usage
fi
;;
-D|--detach) detachable=true; debug_msg "Set detachable to true"; shift ;;
--debug) debug=true; debug_msg "Set debug to true with ${YELLOW}--debug${RESET}"; shift ;;
*) usage ;;
esac
done
debug_msg "Set log_level_type to ${YELLOW}$log_level_type${RESET}"
# Check if 'screen' or 'tmux' is installed
if [[ "$detachable" == true ]]; then
if [[ -z "$duration" ]]; then
echo -e ""
echo -e "---------------------------------------------------------------------------------------"
echo -e " ${RED}Error${RESET}: Duration must be specified when running in detachable mode."
echo -e " Please specify the duration using the ${BOLD_GREEN}-t${RESET} or ${BOLD_GREEN}--time${RESET} option."
echo -e "---------------------------------------------------------------------------------------"
usage
fi
check_screen_or_tmux
fi
# If swarm_ip is not provided, try using SCSP_HOST environment variable
if [[ -z "$swarm_ip" ]]; then
if [[ -n "$SCSP_HOST" ]]; then
swarm_ip="$SCSP_HOST"
debug_msg "Using Swarm IP from SCSP_HOST: $swarm_ip"
else
echo "Error: swarm_ip not provided and SCSP_HOST is not set."
usage
fi
fi
# Check if required arguments are provided
if [[ -z "$credentials" || -z "$new_log_level" ]]; then
usage
fi
# Split the swarm_ip into an array of IP addresses if it contains delimiters
IFS=';, ' read -r -a ip_array <<< "$swarm_ip"
# Validate credentials before proceeding
debug_msg "Validating credentials..."
#if [[ "$credentials" =~ ^[^:]+$ ]]; then
if [[ -n "$credentials" ]]; then
CREDENTIALS="$credentials"
ATTEMPTS=0 # Number of password attempts made so far
USERNAME=""
PASSWORD=""
debug_msg "CREDENTIALS: $(print_credentials "$CREDENTIALS")"
debug_msg "ATTEMPTS: $ATTEMPTS"
# # Capture the original command as executed by the user
# ORIGINAL_INPUT=$(ps -o args= -p $$)
# debug_msg "ORIGINAL_INPUT: $ORIGINAL_INPUT"
# Check if credentials contain a colon (username:password format)
if [[ "$CREDENTIALS" == *":"* ]]; then
debug_msg "Checking for colon in credentials..."
debug_msg "Credentials contain colon"
USERNAME="${CREDENTIALS%%:*}"
PASSWORD="${CREDENTIALS#*:}"
debug_msg "Username: $USERNAME"
debug_msg "Password: [hidden for security]"
# Define special characters that may cause issues
debug_msg "Defining special characters..."
SPECIAL_CHARS="!@#\$%^&*()_+{}|:<>?~\`-=[]\\;',./\""
debug_msg "Special characters: $SPECIAL_CHARS"
# Check for an empty password after the colon.
# This usually means the password started with '$' and was not quoted, so bash
# expanded it as an (unset) variable before the script received it.
if [[ -z "$PASSWORD" ]]; then
debug_msg "Password is empty, bash may interpret the password as an environment variable."
echo -e ""
echo -e "${RED}Error${RESET}: Password cannot start with '${BOLD_YELLOW}\$'${RESET}."
echo -e "Bash may interpret the password as an environment variable."
echo -e "To prevent issues, enclose the password in single quotes (${BOLD_YELLOW}'${RESET})."
echo -e ""
echo -e "${GREEN}Example${RESET}:"
echo -e " -p ${BOLD_YELLOW}'${RESET}${GREEN}admin:\$password123${RESET}${BOLD_YELLOW}'${RESET}"
echo -e ""
exit 1
fi
# Validate credentials
check_credentials "$CREDENTIALS" "${ip_array[0]}"
if [[ $? -ne 0 ]]; then
debug_msg "Invalid credentials. Please check your username and password."
echo -e ""
echo -e "${RED}Error${RESET}: Invalid credentials. Please check your username and password."
echo -e "If your password contains special characters, enclose it in single quotes (${BOLD_YELLOW}'${RESET}) to prevent misinterpretation by the system."
echo -e "${GREEN}Special characters include${RESET}: ${YELLOW}$SPECIAL_CHARS${RESET}"
echo -e ""
echo -e "${GREEN}Example${RESET}:"
echo -e " -p ${BOLD_YELLOW}'${RESET}${GREEN}admin:password!@#${RESET}${BOLD_YELLOW}'${RESET}"
echo -e ""
exit 1
fi
debug_msg "Validate credentials"
credentials=$CREDENTIALS
else
debug_msg "Credentials do not contain password"
debug_msg "Username: $(print_credentials "$CREDENTIALS")"
USERNAME="$CREDENTIALS"
# Prompt for password
while [[ $ATTEMPTS -lt $MAX_RETRIES ]]; do
echo ""
read -rsp "Enter password for user $USERNAME: " PASSWORD # -r keeps backslashes in the password intact
echo ""
echo ""
if [[ -z "$PASSWORD" ]]; then
echo -e "${RED}Error${RESET}: Password cannot be empty."
((ATTEMPTS++))
continue
fi
CREDENTIALS="$USERNAME:$PASSWORD"
debug_msg "CREDENTIALS: $(print_credentials "$CREDENTIALS")"
check_credentials "$CREDENTIALS" "${ip_array[0]}"
if [[ $? -eq 0 ]]; then
debug_msg "CREDENTIALS: $(print_credentials "$CREDENTIALS")"
debug_msg "Valid credentials"
credentials=$CREDENTIALS
break
else
echo -e ""
echo -e "${RED}Error${RESET}: Invalid credentials. Please check your username and password."
echo -e ""
fi
((ATTEMPTS++))
done
if [[ $ATTEMPTS -ge $MAX_RETRIES ]]; then
echo -e ""
echo -e "${RED}Error${RESET}: Maximum number of password attempts reached. Exiting script."
exit 1
fi
fi
else
if ! check_credentials "$credentials" "${ip_array[0]}"; then
echo -e ""
echo -e "${YELLOW}Warning${RESET}: ${RED}Invalid credentials${RESET}."
echo -e ""
exit 1
fi
fi
# Retrieve cluster name and handle JSON parsing using the first IP address
debug_msg "Retrieving the cluster name from Swarm API using IP: ${ip_array[0]}"
if [[ "$jq_or_grep" == "grep" ]]; then
clusterName=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters" | grep -oP '"name":\s*"\K[^"]+')
else
clusterName=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters" | "$jq_or_grep" -r '._embedded.clusters[0].name')
fi
if [[ -z "$clusterName" ]]; then
echo "Failed to retrieve the cluster name. Please check your inputs."
exit 1
fi
debug_msg "Cluster Name: $clusterName"
# Main logic function to run the script tasks
main_script() {
local swarm_ip="$1"
local credentials="$2"
local new_log_level="$3"
local new_log_level_name="$4"
local duration="$5"
local log_level_type="$6"
#local log_file="/var/log/datacore/castor.log"
if [[ -z "$log_file" ]]; then
log_file=$(determine_log_file)
fi
local initial_size=$(stat -c%s "$log_file" 2>/dev/null || echo 0)
local current_log_level
local clusterName="$7"
local jq_or_grep="$8"
local detachable="$9"
if [[ "$detachable" ]]; then
local debug="${10}"
eval "$(echo "${11}" | sed 's/declare -A/declare -A/')"
eval "$(echo "${12}" | sed 's/declare -A/declare -A/')"
fi
# Positional parameters above 9 need braces: "$13" expands as "${1}3"
local default_log_levels="${13}"
local default_log_levels_name="${14}"
debug_msg "**********************************************************"
debug_msg "local variables"
debug_msg "Log Level Type: ${GREEN}$log_level_type${RESET}"
debug_msg "Default Log Level: ${GREEN}$default_log_level${RESET}"
debug_msg "Swarm IP: $swarm_ip"
debug_msg "Credentials: $(print_credentials "$credentials")"
debug_msg "New Log Level: $new_log_level"
debug_msg "New log Level Name: $new_log_level_name"
debug_msg "Duration: $duration"
debug_msg "Log File: $log_file"
debug_msg "Initial Log File Size: $initial_size"
debug_msg "Current Log Level: $current_log_level"
debug_msg "Cluster Name: $clusterName"
debug_msg "jq_or_grep: $jq_or_grep"
debug_msg "Detach: $detachable"
debug_msg "Debug: $debug"
debug_msg "**********************************************************"
# Split the swarm_ip into an array of IP addresses
IFS=';, ' read -r -a ip_array <<< "$swarm_ip"
debug_msg "IP Array: ${ip_array[*]}"
# Display initial information
if [[ "$log_level_type" == "cluster" ]]; then
echo -e "Swarm IP: ${GREEN}${ip_array[0]}${RESET}"
else
echo -e "Swarm IPs: ${GREEN}${ip_array[*]}${RESET}"
fi
# echo -e "Swarm IP: ${GREEN}$swarm_ip${RESET}"
debug_msg "Credentials: ${GREEN}[hidden for security]${RESET}"
echo -e "Cluster Name: ${GREEN}$clusterName${RESET}"
debug_msg "Starting main_script function..."
debug_msg "Log level type: $log_level_type"
# Store the original log levels
declare -A original_log_levels
declare -A original_log_level_names
default_log_level=${default_log_levels[$log_level_type]}
default_log_level_name=${default_log_level_names[$log_level_type]}
debug_msg "Default log level: $default_log_level"
debug_msg "Default log level name: $default_log_level_name"
if [[ $log_level_type == "cluster" ]]; then
debug_msg "Setting cluster log level to $new_log_level ($new_log_level_name)"
# Retrieve current log level
if [[ "$jq_or_grep" == "grep" ]]; then
current_log_level=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" | grep -oP '"value":\s*\K[0-9]+')
else
current_log_level=$(curl --user "$credentials" -sS "http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" | "$jq_or_grep" -r '.value')
fi
current_log_level_name=${log_levels[$current_log_level]}
if [[ $current_log_level == 30 ]]; then
current_log_level_name="default"
fi
if [[ $new_log_level == 30 ]]; then
new_log_level_name="default"
fi
debug_msg "Current cluster log level: ${BOLD_GREEN}$current_log_level${RESET}"
debug_msg "Current log level name: ${BOLD_GREEN}$current_log_level_name${RESET}"
echo -e ""
echo -e "New cluster log level: ${BOLD_GREEN}$new_log_level_name${RESET} (${BOLD_GREEN}$new_log_level${RESET})"
echo -e "Current cluster log level is ${BOLD_GREEN}$current_log_level_name${RESET} (${BOLD_GREEN}$current_log_level${RESET})."
# Skip update if new level matches the current level
if [[ "$current_log_level" -eq "$new_log_level" ]]; then
echo ""
echo -e "Cluster log level is already set to ${BOLD_GREEN}$new_log_level_name${RESET} (${BOLD_GREEN}$new_log_level${RESET}). No changes made."
return
fi
# Update the cluster log level using PUT
debug_msg "Updating cluster log level to $new_log_level_name"
debug_msg "Cluster Name: $clusterName"
debug_msg "Credentials: $(print_credentials "$credentials")"
debug_msg "curl --user \"$(print_credentials "$credentials")\" -sS -X PUT -H \"Content-Type: application/json\" \"http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level\" -d \"{\\\"value\\\": $new_log_level}\""
response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \
"http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" \
-d "{\"value\": $new_log_level}")
debug_msg "Response: $response"
if [[ "$jq_or_grep" == "grep" ]]; then
updated_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
else
updated_log_level=$(echo "$response" | "$jq_or_grep" -r '.value')
fi
debug_msg "Updated cluster log level: $updated_log_level"
if [[ "$updated_log_level" -eq "$new_log_level" ]]; then
echo -e "${GREEN}$(timestamp)${RESET} Log level changed successfully from ${BOLD_GREEN}$current_log_level_name${RESET} (${BOLD_GREEN}$current_log_level${RESET}) -> ${BOLD_GREEN}$new_log_level_name${RESET} (${BOLD_GREEN}$new_log_level${RESET})."
else
echo -e "${GREEN}$(timestamp)${RESET} Failed to update log level. Response: ${RED}$response${RESET}"
exit 1
fi
# Countdown and revert log level
if [[ -n "$duration" && "$duration" -gt 0 ]]; then
echo -e "Keeping log level at ${YELLOW}$new_log_level_name${RESET} (${YELLOW}$new_log_level${RESET}) for ${YELLOW}$duration${RESET} seconds (${YELLOW}$(format_duration $duration)${RESET}) ..."
echo -e ""
for ((i=duration; i>0; i--)); do
printf -v countdown "%02d:%02d:%02d" $((i/3600)) $(( (i%3600) / 60 )) $((i%60))
echo -ne "Countdown: ${YELLOW}$countdown${RESET} remaining...\r"
sleep 1
done
echo -e "\n\nTime's up! Reverting log level back to ${GREEN}$current_log_level_name${RESET} (${BOLD_GREEN}$current_log_level${RESET}) ..."
# Revert the log level back to the original value
# echo -e "Level log revert on $(timestamp)"
response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \
"http://${ip_array[0]}:91/api/storage/clusters/$clusterName/settings/log.level" \
-d "{\"value\": $current_log_level}")
debug_msg "Response: $response"
if [[ "$jq_or_grep" == "grep" ]]; then
reverted_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
else
reverted_log_level=$(echo "$response" | "$jq_or_grep" -r '.value')
fi
debug_msg "Reverted cluster log level: $reverted_log_level"
final_size=$(stat -c%s "$log_file" 2>/dev/null || echo 0)
debug_msg "Initial log file size: $initial_size"
debug_msg "Final log file size: $final_size"
size_diff=$(( final_size - initial_size ))
size_diff_formatted=$(format_size "$size_diff")
duration_formatted=$(format_duration "$duration")
if (( size_diff < 0 )); then
echo -e "castor.log file was rotated."
else
echo -e ""
echo -e "Approximate ${BOLD_GREEN}$size_diff_formatted${RESET} new logs were generated at log level ${BOLD_GREEN}$new_log_level_name${RESET} (${BOLD_GREEN}$new_log_level${RESET}). Current castor.log size is ${BOLD_GREEN}$(format_size "$final_size")${RESET} after ${YELLOW}$duration${RESET} seconds (${YELLOW}$duration_formatted${RESET})."
echo -e ""
fi
if [[ "$reverted_log_level" -eq "$current_log_level" ]]; then
echo -e "${GREEN}$(timestamp)${RESET} Log level reverted successfully back to ${BOLD_GREEN}$current_log_level_name${RESET} (${BOLD_GREEN}$current_log_level${RESET})."
echo -e ""
else
echo -e "${GREEN}$(timestamp)${RESET} Failed to revert log level. Response: ${RED}$response${RESET}"
echo -e ""
exit 1
fi
else
echo -e "${GREEN}$(timestamp)${RESET} Log level change is permanent until manually modified."
fi
elif [[ "$log_level_type" == "node" ]]; then
# First loop: Change the node log level
local same_log_level=false
for ip in "${ip_array[@]}"; do
# Retrieve current node log level
debug_msg "Retrieving current node log level for IP: $ip"
debug_msg "curl -s -u \"$(print_credentials "$credentials")\" \"http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel\""
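# Parse the value with grep when jq is unavailable (same fallback as the PUT responses below)
if [[ "$jq_or_grep" == "grep" ]]; then
current_log_level=$(curl -s -u "$credentials" "http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel" | grep -oP '"value":\s*\K[0-9]+')
else
current_log_level=$(curl -s -u "$credentials" "http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel" | "$jq_or_grep" -r '.value')
fi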
debug_msg "Current log level for IP $ip: $current_log_level"
original_log_levels["$ip"]=$current_log_level
debug_msg "Original log level for IP $ip: ${original_log_levels["$ip"]}"
if [[ "$current_log_level" == "0" ]]; then
current_log_level_name="default"
else
current_log_level_name=${log_levels[$current_log_level]}
fi
original_log_level_names["$ip"]=$current_log_level_name
debug_msg "Original log level name: ${original_log_level_names["$ip"]}"
if [[ "$new_log_level" == "0" ]]; then
new_log_level_name="default"
else
new_log_level_name=${log_levels[$new_log_level]}
fi
debug_msg "Current log level for IP $ip: $current_log_level ($current_log_level_name)"
echo ""
echo -e "New node log level: ${BOLD_GREEN}$new_log_level_name${RESET} (${BOLD_GREEN}$new_log_level${RESET})"
echo -e "Current node log level for IP ${BOLD_GREEN}$ip${RESET} is ${BOLD_GREEN}$current_log_level_name${RESET} (${BOLD_GREEN}$current_log_level${RESET})."
# Skip update if new level matches the current level
if [[ "$current_log_level" -eq "$new_log_level" ]]; then
same_log_level=true
echo ""
echo -e "Node log level for IP ${BOLD_GREEN}$ip${RESET} is already set to ${BOLD_GREEN}$new_log_level_name${RESET} (${BOLD_GREEN}$new_log_level${RESET}). No changes made."
continue
else
same_log_level=false
fi
if [[ "$same_log_level" == false ]]; then
# Update the node log level using PUT
debug_msg "Updating node log level to $new_log_level_name for IP $ip..."
debug_msg "curl --user \"$(print_credentials "$credentials")\" -sS -X PUT -H \"Content-Type: application/json\" \"http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$new_log_level\""
response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \
"http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$new_log_level")
debug_msg "Response: $response"
if [[ "$jq_or_grep" == "grep" ]]; then
updated_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
else
updated_log_level=$(echo "$response" | "$jq_or_grep" -r '.value')
fi
debug_msg "Updated log level for IP $ip: $updated_log_level"
if [[ "$updated_log_level" -eq "$new_log_level" ]]; then
echo -e "${GREEN}$(timestamp)${RESET} Node log level for IP ${BOLD_GREEN}$ip${RESET} changed successfully from ${BOLD_GREEN}$current_log_level_name${RESET} (${GREEN}$current_log_level${RESET}) -> ${BOLD_GREEN}$new_log_level_name${RESET} (${BOLD_GREEN}$new_log_level${RESET})."
else
echo -e "${GREEN}$(timestamp)${RESET} Failed to update node log level for IP ${BOLD_GREEN}$ip${RESET}. Response: ${RED}$response${RESET}"
exit 1
fi
fi
done
# Second loop: Countdown if duration is provided
if [[ "$same_log_level" == false ]]; then
if [[ -n "$duration" && "$duration" -gt 0 ]]; then
echo -e ""
echo -e "Keeping node(s) log level at ${YELLOW}$new_log_level_name${RESET} (${YELLOW}$new_log_level${RESET}) for ${YELLOW}$duration${RESET} seconds (${YELLOW}$(format_duration $duration)${RESET}) ..."
echo -e ""
for ((i=duration; i>0; i--)); do
printf -v countdown "%02d:%02d:%02d" $((i/3600)) $(( (i%3600) / 60 )) $((i%60))
echo -ne "Countdown: ${YELLOW}$countdown${RESET} remaining...\r"
sleep 1
done
echo -e "\n\nTime's up! Reverting node log level back to original levels..."
# Third loop: Revert the node log level
for ip in "${ip_array[@]}"; do
current_log_level=${original_log_levels["$ip"]}
current_log_level_name=${original_log_level_names["$ip"]}
debug_msg "Reverting node log level back to $current_log_level_name for IP $ip..."
debug_msg "curl --user \"$(print_credentials "$credentials")\" -sS -X PUT -H \"Content-Type: application/json\" \"http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$current_log_level\""
# echo -e "Node level log revert on $(timestamp) for IP $ip"
response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \
"http://$ip:91/api/storage/nodes/_self/settings/log.nodeLogLevel?value=$current_log_level")
debug_msg "Response: $response"
if [[ "$jq_or_grep" == "grep" ]]; then
reverted_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
else
reverted_log_level=$(echo "$response" | "$jq_or_grep" -r '.value')
fi
final_size=$(stat -c%s "$log_file" 2>/dev/null || echo 0)
size_diff=$(( final_size - initial_size ))
debug_msg "Initial log file size: $initial_size"
debug_msg "Final log file size: $final_size"
size_diff_formatted=$(format_size "$size_diff")
duration_formatted=$(format_duration "$duration")
if [[ "$reverted_log_level" -eq "$current_log_level" ]]; then
echo -e "${GREEN}$(timestamp)${RESET} Node log level for IP ${BOLD_GREEN}$ip${RESET} reverted successfully back to ${BOLD_GREEN}${current_log_level_name}${RESET} (${BOLD_GREEN}$current_log_level${RESET})."
else
echo -e "${GREEN}$(timestamp)${RESET} Failed to revert node log level for IP ${BOLD_GREEN}$ip${RESET}. Response: ${RED}$response${RESET}"
exit 1
fi
done
# Combine the log output for multiple IP addresses into a single summary
if (( size_diff < 0 )); then
echo -e "castor.log file was rotated."
else
echo -e ""
echo -e "Approximate ${BOLD_GREEN}$size_diff_formatted${RESET} new logs were generated at log level ${BOLD_GREEN}$new_log_level_name${RESET} (${BOLD_GREEN}$new_log_level${RESET}) for IP ${ip_array[*]}. Current castor.log size is ${BOLD_GREEN}$(format_size "$final_size")${RESET} after ${YELLOW}$duration${RESET} seconds (${YELLOW}$duration_formatted${RESET})."
echo -e ""
fi
else
echo -e "${GREEN}$(timestamp)${RESET} Log level change is permanent until manually modified."
fi
fi
fi
}
# Run in a detachable session (screen or tmux) or directly in the current terminal
if $detachable; then
# Pass the main_script function to the screen session and store the output in a file
debug_msg "**********************************************************" | tee -a "$output_log"
debug_msg "Detach mode - Parameters passed to main_script:" | tee -a "$output_log"
debug_msg " Swarm IP: $swarm_ip" | tee -a "$output_log"
debug_msg " Credentials: $(print_credentials "$credentials")" | tee -a "$output_log"
debug_msg " New Log Level: $new_log_level" | tee -a "$output_log"
debug_msg " New Log Level Name: $new_log_level_name" | tee -a "$output_log"
debug_msg " Duration: $duration" | tee -a "$output_log"
debug_msg " Log Level Type: $log_level_type" | tee -a "$output_log"
debug_msg " Log File: $log_file" | tee -a "$output_log"
debug_msg " Initial Log File Size: $initial_size" | tee -a "$output_log"
debug_msg " Current Log Level: $current_log_level" | tee -a "$output_log"
debug_msg " Cluster Name: $clusterName" | tee -a "$output_log"
debug_msg " jq_or_grep: $jq_or_grep" | tee -a "$output_log"
debug_msg " Detach: $detachable" | tee -a "$output_log"
debug_msg " Debug: $debug" | tee -a "$output_log"
debug_msg "**********************************************************" | tee -a "$output_log"
# Convert associative arrays to strings and pass them to the screen session
log_levels_string=$(declare -p log_levels)
log_level_names_string=$(declare -p log_level_names)
debug_msg "log_levels_string: $log_levels_string" | tee -a "$output_log"
debug_msg "log_level_names_string: $log_level_names_string" | tee -a "$output_log"
if command -v screen &>/dev/null; then
echo -e "Running in ${YELLOW}screen${RESET} detachable mode..." | tee -a "$output_log"
screen -dmS castor_log_script bash -c "$(declare -f main_script timestamp debug_msg format_size format_duration check_jq determine_log_file print_credentials); main_script \"$swarm_ip\" \"$credentials\" \"$new_log_level\" \"$new_log_level_name\" \"$duration\" \"$log_level_type\" \"$clusterName\" \"$jq_or_grep\" \"$detachable\" \"$debug\" \"${log_levels_string}\" \"${log_level_names_string}\" \"${default_log_levels}\" \"${default_log_levels_name}\" | tee \"$output_log\""
screen -r castor_log_script
# Wait for the screen session to complete and then display the output log
sleep 1
while screen -list | grep -q "castor_log_script"; do
sleep 1
done
elif command -v tmux &>/dev/null; then
echo -e "Running in ${YELLOW}tmux${RESET} detachable mode..." > "$output_log" # Truncate log file to remove old entries
# Create a temp script file
temp_script="$SCRIPTDIR/castor_log_script.sh"
# Write the script to a file to avoid printing function definitions
cat <<EOF > "$temp_script"
#!/bin/bash
$(declare -f main_script timestamp debug_msg format_size format_duration check_jq determine_log_file print_credentials)
main_script "$swarm_ip" "$credentials" "$new_log_level" "$new_log_level_name" "$duration" "$log_level_type" "$clusterName" "$jq_or_grep" "$detachable" "$debug" "${log_levels_string}" "${log_level_names_string}" "${default_log_levels}" "${default_log_levels_name}" | tee -a "$output_log"
rm -f "$temp_script" # Remove the script file
tmux kill-session -t castor_log_script
EOF
# Ensure the script is executable
chmod +x "$temp_script"
# Start tmux session and execute the script inside
tmux new-session -d -s castor_log_script "bash $temp_script"
# Attach session only if it's still running
while tmux has-session -t castor_log_script 2>/dev/null; do
tmux attach-session -t castor_log_script
done
# if tmux has-session -t castor_log_script 2>/dev/null; then
# tmux attach-session -t castor_log_script
# fi
else
echo "Error: Neither screen nor tmux available. Run without --detachable."
exit 1
fi
echo ""
cat "$output_log"
else
# main_script "$swarm_ip" "$credentials" "$new_log_level" "$duration" "$clusterName" "$jq_or_grep" | tee "$output_log"
debug_msg "**********************************************************"
debug_msg "Parameters passed to main_script:"
debug_msg "Swarm IP: $swarm_ip"
debug_msg "Credentials: $(print_credentials "$credentials")"
debug_msg "New Log Level: $new_log_level"
debug_msg "Duration: $duration"
debug_msg "Cluster Name: $clusterName"
debug_msg "jq_or_grep: $jq_or_grep"
debug_msg "**********************************************************"
debug_msg "Running main_script function..."
main_script "$swarm_ip" "$credentials" "$new_log_level" "$new_log_level_name" "$duration" "$log_level_type" "$clusterName" "$jq_or_grep" | tee "$output_log"
fi
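The script above repeats two small patterns: extracting the numeric value field from the API's JSON response (with jq when available, otherwise with grep -oP), and formatting the remaining seconds as HH:MM:SS for the countdown. The sketch below isolates both as standalone helpers so they can be tried without a Swarm cluster. The function names extract_value and format_countdown are hypothetical and do not appear in the script, and the sample JSON body is an assumed shape based only on the '.value' lookup the script performs.

```shell
# Hypothetical helper (not part of the script): extract the numeric "value"
# field from a JSON response. Uses jq when it is installed; otherwise falls
# back to the same grep -oP pattern the script uses (requires GNU grep).
extract_value() {
    if command -v jq >/dev/null 2>&1; then
        printf '%s' "$1" | jq -r '.value'
    else
        printf '%s' "$1" | grep -oP '"value":\s*\K[0-9]+'
    fi
}

# Hypothetical helper (not part of the script): format a number of seconds
# as HH:MM:SS, using the same arithmetic as the countdown loop.
format_countdown() {
    printf '%02d:%02d:%02d\n' $(( $1 / 3600 )) $(( ($1 % 3600) / 60 )) $(( $1 % 60 ))
}

# Assumed response shape; the real API may return additional fields.
extract_value '{"name": "log.level", "value": 30}'   # prints 30
format_countdown 600                                  # prints 00:10:00
```

Keeping the parsing in one helper would also let every call site, including the initial read of the node log level, share the same jq/grep fallback.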