...
Set and Revert Log Levels: Temporarily changes the log level and reverts it after a specified duration.
Flexible JSON Parsing: Uses jq for JSON parsing if available; defaults to grep otherwise.
Background Execution: Optionally runs in the background using screen or tmux.
Log Size Monitoring: Reports the log size generated during the temporary log level change.
Countdown Display: Shows a countdown for the specified duration.
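The jq-or-grep fallback can be illustrated with a small sketch. The function name `extract_value` and the sample payload are hypothetical; the only thing taken from the script is that the setting's number is read from a `value` field. The grep branch assumes GNU grep, since it relies on `-P`.

```shell
#!/bin/bash
# Read a JSON document on stdin and print its numeric "value" field.
# Prefers jq; falls back to GNU grep with PCRE (-P) when jq is missing.
extract_value() {
    if command -v jq >/dev/null 2>&1; then
        jq -r '.value'
    else
        # \K drops the matched prefix so only the digits are printed.
        grep -oP '"value":\s*\K[0-9]+'
    fi
}

# Hypothetical payload for illustration; a real response may carry more fields.
sample='{"name": "log.level", "value": 30}'
level=$(printf '%s' "$sample" | extract_value)
echo "Parsed log level: $level"
```

Both branches print the same digits for this input. The grep branch is tied to this exact field layout, which is the trade-off for not requiring jq.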
...
Code Block |
---|
./castor-change-log-level.sh -d <swarm_ip> -p <admin:password> -i <new_log_level> [-t <duration_in_seconds>] [--background] [-v] |
Parameter | Description |
---|---|
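-d, --swarm_ip | IP address of the Swarm API endpoint (or set the SCSP_HOST environment variable). |
-p, --credentials | Admin credentials in the format admin:password. |
-i, --log.level | New log level to set (values: ...). |
-t, --time | Duration in seconds to keep the new log level (optional). |
--background | Runs the script in a detached session using screen or tmux. |
-v, --verbose | Enables verbose mode to display debug information. |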
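The script's usage text says the endpoint address may come from `-d` or from the `SCSP_HOST` environment variable. A minimal sketch of that precedence follows; `resolve_swarm_ip` is a hypothetical helper name, and 10.0.0.5 is a made-up address used only for the demonstration.

```shell
#!/bin/bash
# Print the endpoint to use: an explicit -d value wins, then SCSP_HOST.
# Returns non-zero when neither source is available.
resolve_swarm_ip() {
    local cli_value="$1"
    if [[ -n "$cli_value" ]]; then
        echo "$cli_value"
    elif [[ -n "$SCSP_HOST" ]]; then
        echo "$SCSP_HOST"
    else
        echo "Error: swarm_ip not provided and SCSP_HOST is not set." >&2
        return 1
    fi
}

SCSP_HOST=192.168.1.84
resolve_swarm_ip ""          # no -d given: falls back to SCSP_HOST
resolve_swarm_ip "10.0.0.5"  # -d given: the explicit value is used
```

With `SCSP_HOST` set this way, the `-d` option can be left off the command line entirely.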
...
Log Level Change: Sets the log level to the specified value. If the current log level matches the requested level, the script skips the update.
Countdown: During the specified duration, the script displays a countdown every second.
Revert Log Level: After the countdown, the log level reverts to the initial value.
Log Size Report: Reports the approximate size of the logs generated during the temporary log level change.
Debug Mode: When -v is specified, debug messages display the script's internal operations.
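The change, countdown, and revert sequence described above can be sketched without a live cluster. Here `set_level` is a hypothetical stand-in that only records the level in a shell variable; the real script issues an HTTP PUT instead. The function name `temporary_level` is also an assumption.

```shell
#!/bin/bash
current_level=30

# Stand-in for the real API call (assumption): just remember the level.
set_level() { current_level="$1"; }

temporary_level() {
    local new_level="$1" duration="$2" original="$current_level" i
    # Skip the update when the requested level is already active.
    if [[ "$original" -eq "$new_level" ]]; then
        echo "Log level is already set to $new_level. No changes made."
        return 0
    fi
    set_level "$new_level"
    # Redraw the same terminal line once per second until time runs out.
    for ((i = duration; i > 0; i--)); do
        printf 'Countdown: %02d:%02d:%02d remaining...\r' \
            $((i / 3600)) $(((i % 3600) / 60)) $((i % 60))
        sleep 1
    done
    printf '\n'
    # Restore the level that was active before the change.
    set_level "$original"
    echo "Log level reverted to $current_level."
}

temporary_level 10 2
```

The carriage return (`\r`) is what keeps the countdown on a single line instead of scrolling.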
...
Code Block |
---|
[root@scs dist]# ./castor-change-log-level.sh -p admin:datacore -i 10 -t 300
Swarm IP: 192.168.1.84
Credentials: [hidden for security]
Cluster Name: gatewayadmindomain

New log level: 10
Current log level is 30.
Updating log level to 10...
Log level changed successfully from 30 → 10.
Keeping log level at 10 for 300 second(s)...
Countdown: 00:00:01 remaining...

Time's up! Reverting log level back to 30...
Approximate 69.4MB new logs were generated at log level 10. Current castor.log size is 371.3MB after 00:05:00.
Log level reverted successfully back to 30.
[root@scs dist]#
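The size figures in the sample output come from subtracting the log file's size before the change from its size afterwards, then scaling the byte count. The sketch below is a variant of the script's `format_size` that passes the number to awk with `-v`; the byte counts are hypothetical and chosen to divide evenly.

```shell
#!/bin/bash
# Convert a byte count into a short human-readable string (binary units).
format_size() {
    local size=$1
    if (( size >= 1073741824 )); then
        awk -v s="$size" 'BEGIN { printf "%.1fGB\n", s / 1073741824 }'
    elif (( size >= 1048576 )); then
        awk -v s="$size" 'BEGIN { printf "%.1fMB\n", s / 1048576 }'
    elif (( size >= 1024 )); then
        awk -v s="$size" 'BEGIN { printf "%.1fKB\n", s / 1024 }'
    else
        echo "${size}B"
    fi
}

initial_size=10485760   # hypothetical: 10 MiB before the level change
final_size=12058624     # hypothetical: 11.5 MiB after the level change
size_diff=$(( final_size - initial_size ))

echo "Approximate $(format_size "$size_diff") new logs; current size $(format_size "$final_size")."
# prints: Approximate 1.5MB new logs; current size 11.5MB.
```

Because the calculation only compares two file sizes, a log rotation during the window would shrink the file and make the difference negative, so the reported figure is best read as an estimate.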
...
This script provides administrators with an effective way to adjust and monitor Swarm logging, supporting both temporary and permanent log level changes for troubleshooting and performance monitoring.
Script Source Code
Latest version: castor-change-log-level.sh
Code Block |
---|
#!/bin/bash
# Written by Milton Suen (milton.suen@datacore.com) Oct 31, 2024
# Revision: Update to support running the script in a background session using screen or tmux.

# Function to display usage information
usage() {
    echo "Usage: $0 -d swarm_ip -p admin:password -i new_log_level [-t duration_in_seconds] [--background] [-v]"
    echo "  -d, --swarm_ip      IP address of the Swarm API endpoint (or set SCSP_HOST environment variable)"
    echo "  -p, --credentials   Credentials in the format admin:password"
    echo "  -i, --log.level     New log level to set"
    echo "  -t, --time          Duration in seconds to keep the new log level (optional)"
    echo "  --background        Run the script in a detached session using screen or tmux"
    echo "  -v, --verbose       Enable verbose mode for debug messages"
    exit 1
}

# Default options
background=false
verbose=false
output_log="script_output.log"  # Log file for capturing background session output

# Function to display debug messages if verbose mode is enabled
debug() {
    if $verbose; then
        echo "[DEBUG] $1"
    fi
}

# Function to check if either 'screen' or 'tmux' is installed
check_screen_or_tmux() {
    if ! command -v screen &>/dev/null && ! command -v tmux &>/dev/null; then
        echo "Error: Neither 'screen' nor 'tmux' is installed. Cannot run in background mode."
        background=false  # Disable background session
    fi
}

# Function to format file size
format_size() {
    local size=$1
    if (( size >= 1073741824 )); then
        echo "$(awk "BEGIN {printf \"%.1fGB\", $size/1073741824}")"
    elif (( size >= 1048576 )); then
        echo "$(awk "BEGIN {printf \"%.1fMB\", $size/1048576}")"
    elif (( size >= 1024 )); then
        echo "$(awk "BEGIN {printf \"%.1fKB\", $size/1024}")"
    else
        echo "${size}B"
    fi
}

# Function to format duration
format_duration() {
    local duration=$1
    local hours=$((duration / 3600))
    local minutes=$(( (duration % 3600) / 60 ))
    local seconds=$((duration % 60))
    printf "%02d:%02d:%02d" $hours $minutes $seconds
}

# Function to check if jq is available and set up JSON parsing method
check_jq() {
    if [[ -x "/usr/local/bin/jq" ]]; then
        echo "/usr/local/bin/jq"
    elif [[ -x "$(pwd)/jq" ]]; then
        echo "$(pwd)/jq"
    elif command -v jq &>/dev/null; then
        echo "jq"
    else
        echo "grep"
    fi
}

jq_or_grep=$(check_jq)

# Parse input arguments
while [[ "$#" -gt 0 ]]; do
    case $1 in
        -d|--swarm_ip) swarm_ip="$2"; shift 2 ;;
        -p|--credentials) credentials="$2"; shift 2 ;;
        -i|--log.level) new_log_level="$2"; shift 2 ;;
        -t|--time)
            if [[ -n "$2" && "$2" != -* ]]; then
                duration="$2"
                shift 2
            else
                read -p "Enter duration in seconds: " duration
                shift
            fi
            ;;
        --background) background=true; shift ;;
        -v|--verbose) verbose=true; shift ;;
        *) usage ;;
    esac
done

# Check if 'screen' or 'tmux' is installed
check_screen_or_tmux

# If swarm_ip is not provided, try using SCSP_HOST environment variable
if [[ -z "$swarm_ip" ]]; then
    if [[ -n "$SCSP_HOST" ]]; then
        swarm_ip="$SCSP_HOST"
        debug "Using Swarm IP from SCSP_HOST: $swarm_ip"
    else
        echo "Error: swarm_ip not provided and SCSP_HOST is not set."
        usage
    fi
fi

# Check if required arguments are provided
if [[ -z "$credentials" || -z "$new_log_level" ]]; then
    usage
fi

# Retrieve cluster name and handle JSON parsing
debug "Retrieving the cluster name from Swarm API."
if [[ "$jq_or_grep" == "grep" ]]; then
    clusterName=$(curl --user "$credentials" -sS "http://$swarm_ip:91/api/storage/clusters" | grep -oP '"name":\s*"\K[^"]+')
else
    clusterName=$(curl --user "$credentials" -sS "http://$swarm_ip:91/api/storage/clusters" | "$jq_or_grep" -r '._embedded.clusters[0].name')
fi

if [[ -z "$clusterName" ]]; then
    echo "Failed to retrieve the cluster name. Please check your inputs."
    exit 1
fi

debug "Cluster Name: $clusterName"

# Main logic function to run the script tasks
main_script() {
    local swarm_ip="$1"
    local credentials="$2"
    local new_log_level="$3"
    local duration="$4"
    local clusterName="$5"
    local log_file="/var/log/datacore/castor.log"
    local initial_size=$(stat -c%s "$log_file" 2>/dev/null || echo 0)
    local current_log_level
    local jq_or_grep="$6"

    # Display initial information
    echo "Swarm IP: $swarm_ip"
    echo "Credentials: [hidden for security]"
    echo "Cluster Name: $clusterName"
    debug "Starting main_script function..."

    # Retrieve current log level
    if [[ "$jq_or_grep" == "grep" ]]; then
        current_log_level=$(curl --user "$credentials" -sS "http://$swarm_ip:91/api/storage/clusters/$clusterName/settings/log.level" | grep -oP '"value":\s*\K[0-9]+')
    else
        current_log_level=$(curl --user "$credentials" -sS "http://$swarm_ip:91/api/storage/clusters/$clusterName/settings/log.level" | "$jq_or_grep" -r '.value')
    fi

    echo ""
    echo "New log level: $new_log_level"
    echo "Current log level is $current_log_level."

    # Skip update if new level matches the current level
    if [[ "$current_log_level" -eq "$new_log_level" ]]; then
        echo ""
        echo "Log level is already set to $new_log_level. No changes made."
        return
    fi

    # Update the log level using PUT
    echo "Updating log level to $new_log_level..."
    response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \
        "http://$swarm_ip:91/api/storage/clusters/$clusterName/settings/log.level" \
        -d "{\"value\": $new_log_level}")

    if [[ "$jq_or_grep" == "grep" ]]; then
        updated_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
    else
        updated_log_level=$(echo "$response" | "$jq_or_grep" -r '.value')
    fi

    if [[ "$updated_log_level" -eq "$new_log_level" ]]; then
        echo "Log level changed successfully from $current_log_level → $new_log_level."
    else
        echo "Failed to update log level. Response: $response"
        exit 1
    fi

    # Countdown and revert log level
    if [[ -n "$duration" && "$duration" -gt 0 ]]; then
        echo "Keeping log level at $new_log_level for $duration second(s)..."
        for ((i=duration; i>0; i--)); do
            printf -v countdown "%02d:%02d:%02d" $((i/3600)) $(( (i%3600) / 60 )) $((i%60))
            echo -ne "Countdown: $countdown remaining...\r"
            sleep 1
        done

        echo -e "\n\nTime's up! Reverting log level back to $current_log_level..."
        response=$(curl --user "$credentials" -sS -X PUT -H "Content-Type: application/json" \
            "http://$swarm_ip:91/api/storage/clusters/$clusterName/settings/log.level" \
            -d "{\"value\": $current_log_level}")

        if [[ "$jq_or_grep" == "grep" ]]; then
            reverted_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
        else
            reverted_log_level=$(echo "$response" | "$jq_or_grep" -r '.value')
        fi

        final_size=$(stat -c%s "$log_file" 2>/dev/null || echo 0)
        size_diff=$(( final_size - initial_size ))
        size_diff_formatted=$(format_size "$size_diff")
        duration_formatted=$(format_duration "$duration")

        echo "Approximate $size_diff_formatted new logs were generated at log level $new_log_level. Current castor.log size is $(format_size "$final_size") after $duration_formatted."

        if [[ "$reverted_log_level" -eq "$current_log_level" ]]; then
            echo "Log level reverted successfully back to $current_log_level."
        else
            echo "Failed to revert log level. Response: $response"
            exit 1
        fi
    else
        echo "Log level change is permanent until manually modified."
    fi
}

# Run in background or directly
if $background; then
    # Pass the main_script function to the screen session and store the output in a file
    if command -v screen &>/dev/null; then
        screen -dmS indexer_script bash -c "$(declare -f main_script format_size format_duration check_jq debug); main_script \"$swarm_ip\" \"$credentials\" \"$new_log_level\" \"$duration\" \"$clusterName\" \"$jq_or_grep\" | tee \"$output_log\""
        screen -r indexer_script
    elif command -v tmux &>/dev/null; then
        tmux new-session -d -s indexer_script "$(declare -f main_script format_size format_duration check_jq debug); main_script \"$swarm_ip\" \"$credentials\" \"$new_log_level\" \"$duration\" \"$clusterName\" \"$jq_or_grep\" | tee \"$output_log\""
        tmux attach-session -t indexer_script
    else
        echo "Error: Neither screen nor tmux available. Run without --background."
        exit 1
    fi

    # Wait for the screen session to complete and then display the output log
    sleep 1
    while screen -list | grep -q "indexer_script"; do
        sleep 1
    done
    echo ""
    cat "$output_log"
else
    main_script "$swarm_ip" "$credentials" "$new_log_level" "$duration" "$clusterName" "$jq_or_grep" | tee "$output_log"
fi
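The background branch ships its functions into the detached session by embedding the output of `declare -f` in the command string. The sketch below isolates that technique with a hypothetical function, `show_verbose`. It suggests that function bodies travel this way while plain shell variables do not, so anything a shipped function reads would have to be passed separately.

```shell
#!/bin/bash
verbose=true    # plain shell variable, not exported

show_verbose() {
    echo "verbose=${verbose:-unset}"
}

# declare -f prints the function's source, so a fresh bash can re-define it.
bash -c "$(declare -f show_verbose); show_verbose"
# prints: verbose=unset

# Passing the variable through the child's environment restores it.
verbose="$verbose" bash -c "$(declare -f show_verbose); show_verbose"
# prints: verbose=true
```

The same consideration applies to any setting a function relies on implicitly, which is one reason to prefer explicit arguments, as `main_script` does for its six parameters.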
...