Managing Log Levels in Swarm Cluster Using a Shell Script
Overview
This knowledge base entry describes the set_swarm_log_level.sh script, which sets and manages the log level of a Swarm cluster through the Swarm API. The script simplifies changing the log level and can automatically revert the change after a specified duration.
Purpose
The purpose of this script is to:
Change the log level of a Swarm cluster to a specified value.
Optionally revert the log level back to its original setting after a defined duration.
Provide an easy-to-use interface for managing log levels without needing extensive API knowledge.
Script Functionality
Input Parameters: The script accepts parameters for the Swarm API IP address, credentials, new log level, and optional duration for temporary changes.
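Validation and Error Handling: It checks that the required parameters are present and that the duration is a whole number of seconds, and it exits with an error message if the cluster name or current log level cannot be retrieved or if an update fails.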
Logging Level Management: Retrieves the current log level, updates it if necessary, and optionally reverts it after a countdown.
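The retrieve-and-compare step depends on pulling the numeric "value" field out of the API response. The sketch below illustrates that extraction on its own. It is only an illustration: the sample JSON body is an assumption about the response shape rather than captured output from a real cluster, extract_log_level is a hypothetical helper name, and sed is used here in place of the script's grep -oP so the snippet also runs where GNU grep is unavailable.

```shell
# Hypothetical helper: print the numeric "value" field from a settings response.
# Prints nothing if the field is absent or not numeric.
extract_log_level() {
    printf '%s\n' "$1" | sed -n 's/.*"value":[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Assumed response shape, for illustration only.
sample_response='{"name": "log.level", "value": 30}'

current_log_level=$(extract_log_level "$sample_response")
new_log_level=20

if [ -z "$current_log_level" ]; then
    echo "Could not read the current log level."
elif [ "$current_log_level" -eq "$new_log_level" ]; then
    echo "Log level is already $new_log_level. No change needed."
else
    echo "Would change log level from $current_log_level to $new_log_level."
fi
```

An empty result is treated as a failure, which mirrors how the script exits when it cannot read the current level.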
Key Features
Dynamic Duration Input: If the -t option is given without a value, the script prompts the user to enter the duration in seconds.
User Feedback: The script provides informative output regarding the current state and actions being taken, including success and error messages.
Countdown Timer: If a duration is specified, the script displays the remaining time as a countdown in hh:mm:ss format.
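The countdown display comes down to integer division and remainders. As a small worked sketch of that arithmetic (the function name matches the script's format_duration, but this version is rewritten here for illustration):

```shell
# Convert a number of seconds to hh:mm:ss using integer arithmetic.
format_duration() {
    printf '%02d:%02d:%02d' \
        $(( $1 / 3600 )) \
        $(( ($1 % 3600) / 60 )) \
        $(( $1 % 60 ))
}

format_duration 300;  echo    # 00:05:00
format_duration 3725; echo    # 01:02:05
```

For 3725 seconds, 3725 / 3600 gives 1 hour, the remaining 125 seconds give 2 minutes, and 5 seconds are left over.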
Script Source Code
#!/bin/bash

# Function to display usage information
usage() {
    echo "Usage: $0 -d swarm_ip -p admin:password -i new_log_level [-t duration_in_seconds]"
    echo "  -d, --swarm_ip      IP address of the Swarm API endpoint"
    echo "  -p, --credentials   Credentials in the format admin:password"
    echo "  -i, --log.level     New log level to set"
    echo "  -t, --time          Duration in seconds to keep the new log level (optional)"
    exit 1
}

# Function to format a size in bytes as B, KB, MB, or GB
format_size() {
    local size=$1
    if (( size >= 1073741824 )); then
        awk -v s="$size" 'BEGIN {printf "%.1fGB", s/1073741824}'
    elif (( size >= 1048576 )); then
        awk -v s="$size" 'BEGIN {printf "%.1fMB", s/1048576}'
    elif (( size >= 1024 )); then
        awk -v s="$size" 'BEGIN {printf "%.1fKB", s/1024}'
    else
        echo "${size}B"
    fi
}

# Function to format a duration in seconds as hh:mm:ss
format_duration() {
    local duration=$1
    local hours=$((duration / 3600))
    local minutes=$(( (duration % 3600) / 60 ))
    local seconds=$((duration % 60))
    printf "%02d:%02d:%02d" "$hours" "$minutes" "$seconds"
}

# Parse input arguments
while [[ "$#" -gt 0 ]]; do
    case $1 in
        -d|--swarm_ip) swarm_ip="$2"; shift ;;
        -p|--credentials) credentials="$2"; shift ;;
        -i|--log.level) new_log_level="$2"; shift ;;
        -t|--time)
            # Prompt for the duration if -t was given without a value
            if [[ -n "$2" && "$2" != -* ]]; then
                duration="$2"
                shift
            else
                read -r -p "Enter duration in seconds: " duration
            fi
            ;;
        *) usage ;;
    esac
    shift
done

# Check if required arguments are provided
if [[ -z "$swarm_ip" || -z "$credentials" || -z "$new_log_level" ]]; then
    usage
fi

# Retrieve the cluster name (uses the supplied credentials, not a hard-coded pair)
clusterName=$(curl -u "$credentials" -sS "http://$swarm_ip:91/api/storage/clusters" | grep -oP '"name":\s*"\K[^"]+')
if [[ -z "$clusterName" ]]; then
    echo "Failed to retrieve the cluster name. Please check your inputs."
    exit 1
fi

# Validate the duration if it is set
if [[ -n "$duration" ]]; then
    if ! [[ "$duration" =~ ^[0-9]+$ ]]; then
        echo "Error: Duration must be a positive integer value in seconds."
        exit 1
    fi
fi

# Display input parameters
echo "Swarm IP: $swarm_ip"
echo "Credentials: [hidden for security]"
echo "Cluster Name: $clusterName"

# Identify the log file location
log_file=""
if [[ -f "/var/log/caringo/castor.log" ]]; then
    log_file="/var/log/caringo/castor.log"
elif [[ -f "/var/log/datacore/castor.log" ]]; then
    log_file="/var/log/datacore/castor.log"
fi

# Display log file information and record its initial size
if [[ -n "$log_file" ]]; then
    echo "Log file located at: $log_file"
    initial_size=$(stat -c%s "$log_file")
    initial_size_formatted=$(format_size "$initial_size")
    echo "Initial log file size: $initial_size_formatted"
else
    echo "Warning: Log file not found in expected directories."
fi

# Get the current log level
echo ""
echo "Retrieving the current log level..."
current_log_level=$(curl -u "$credentials" -sS "http://$swarm_ip:91/api/storage/clusters/$clusterName/settings/log.level" | grep -oP '"value":\s*\K[0-9]+')

# Check if the current log level was retrieved successfully
if [[ -z "$current_log_level" ]]; then
    echo "Failed to retrieve the current log level. Please check your inputs."
    exit 1
fi

echo "New log level: $new_log_level"
echo "Current log level is $current_log_level."

# Check if the new log level is the same as the current log level
if [[ "$current_log_level" -eq "$new_log_level" ]]; then
    echo ""
    echo "Log level is already set to $new_log_level. No changes made."
    exit 0
fi

# Update the log level using PUT
echo "Updating log level to $new_log_level..."
response=$(curl -u "$credentials" -sS -X PUT -H "Content-Type: application/json" \
    "http://$swarm_ip:91/api/storage/clusters/$clusterName/settings/log.level" \
    -d "{\"value\": $new_log_level}")

# Verify if the log level was updated
updated_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
if [[ "$updated_log_level" -eq "$new_log_level" ]]; then
    echo "Log level changed successfully from $current_log_level → $new_log_level."
else
    echo "Failed to update log level. Response: $response"
    exit 1
fi

# If duration is specified, wait and revert after the specified time
if [[ -n "$duration" && "$duration" -gt 0 ]]; then
    echo "Keeping log level at $new_log_level for $duration second(s)..."
    echo ""

    # Countdown loop, redrawn in place once per second
    for ((i=duration; i>0; i--)); do
        printf -v countdown "%s" "$(format_duration "$i")"
        echo -ne "Countdown: $countdown remaining...\r"
        sleep 1
    done

    echo -e "\n\nTime's up! Reverting log level back to $current_log_level..."

    # Report log growth only if a log file was found earlier
    if [[ -n "$log_file" ]]; then
        final_size=$(stat -c%s "$log_file")
        final_size_formatted=$(format_size "$final_size")
        size_diff=$(( final_size - initial_size ))
        size_diff_formatted=$(format_size "$size_diff")
        duration_formatted=$(format_duration "$duration")
        echo "Approximately $size_diff_formatted of new log data was generated at log level $new_log_level. Current castor.log size is $final_size_formatted after $duration_formatted."
        echo ""
    fi

    # Revert to original log level
    response=$(curl -u "$credentials" -sS -X PUT -H "Content-Type: application/json" \
        "http://$swarm_ip:91/api/storage/clusters/$clusterName/settings/log.level" \
        -d "{\"value\": $current_log_level}")
    reverted_log_level=$(echo "$response" | grep -oP '"value":\s*\K[0-9]+')
    if [[ "$reverted_log_level" -eq "$current_log_level" ]]; then
        echo "Log level reverted successfully back to $current_log_level."
    else
        echo "Failed to revert log level. Response: $response"
        exit 1
    fi
else
    echo "Log level change is permanent until manually modified."
fi
Usage Instructions
To use the script, run it from the command line with the appropriate parameters:
./set_swarm_log_level.sh -d <swarm_ip> -p <admin:password> -i <new_log_level> [-t <duration_in_seconds>]
Examples
Change Log Level Permanently:
./set_swarm_log_level.sh -d 192.168.8.84 -p admin:datacore -i 30
Change Log Level Temporarily:
./set_swarm_log_level.sh -d 192.168.8.84 -p admin:datacore -i 20 -t 300
Prompt for Duration:
./set_swarm_log_level.sh -d 192.168.8.84 -p admin:datacore -i 10 -t
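Whether the duration is passed on the command line or typed at the prompt, the script accepts it only if it consists entirely of digits. A minimal sketch of an equivalent check follows; is_valid_duration is a hypothetical helper name, and the case pattern stands in for the script's regular-expression test.

```shell
# Hypothetical helper: succeed only if the argument is a non-empty string of digits.
is_valid_duration() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
        *)           return 0 ;;
    esac
}

for candidate in 300 0 5m -10 1.5; do
    if is_valid_duration "$candidate"; then
        echo "'$candidate' accepted"
    else
        echo "'$candidate' rejected"
    fi
done
```

A duration of 0 passes this check, but the script only starts the countdown for values greater than 0, so a duration of 0 results in a permanent change.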
Troubleshooting
Ensure correct permissions and network access to the Swarm API.
Verify input parameters and check for typos.
Ensure that curl and GNU grep (the script relies on grep -P to parse API responses) are installed and available in the environment.
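A quick way to confirm the environment before running the script is to check that each required command is on the PATH. The following is a sketch under assumptions: check_dependencies is a hypothetical helper, and the command list reflects what the script as shown calls (curl, grep, awk, stat).

```shell
# Hypothetical helper: report any listed command that is not on PATH.
# Returns 0 if all are present, 1 otherwise.
check_dependencies() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "Missing required command: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Commands the script calls; adjust the list if the script is modified.
if check_dependencies curl grep awk stat; then
    echo "All required commands are available."
fi

# The script's patterns need Perl-compatible regex support in grep.
if ! printf '1\n' | grep -qP '\d' 2>/dev/null; then
    echo "This grep does not support -P; the script's parsing will fail." >&2
fi
```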
Conclusion
The set_swarm_log_level.sh script is a useful tool for administrators managing Swarm clusters. By following the guidelines in this KB entry, users can adjust log levels as needed and restore the original level when troubleshooting is complete.