/
Setting and Managing Swarm Log Levels with script

Setting and Managing Swarm Log Levels with script

Managing Log Levels in Swarm Cluster Using a Shell Script

The Swarm Log Level Management Script (castor-change-log-level.sh) is designed to manage and dynamically adjust log levels on a DataCore Swarm cluster, providing options to change the log level temporarily and automatically revert it after a specified duration. It supports background execution via screen or tmux, making it ideal for long-running operations that require detachment from the terminal.

Script Features

  • Set and Revert Log Levels: Temporarily change the log level and revert after a specified duration.

  • Flexible JSON Parsing: Uses jq for JSON parsing if available; defaults to grep otherwise.

  • Detachable Session Execution: Runs in a detachable session using screen or tmux, allowing the script to continue running independently of the terminal session.

  • Log Size Monitoring: Reports the log size generated during the temporary log level change.

  • Countdown Display: Shows a countdown for the specified duration.

Requirements

  • jq (optional): Used for parsing JSON responses; falls back to grep if unavailable.

  • screen or tmux (optional): Required for background execution.

  • Permissions: Ensure sufficient permissions to execute on the DataCore Swarm server and access required files.

Script Usage

./castor-change-log-level.sh -d <node_ip> -p <admin:password> -i <new_log_level> [-t <duration_in_seconds>] [--detach] [-v]

Parameter

Description

Parameter

Description

-d, --swarm_ip

IP address of the Swarm API endpoint (or set SCSP_HOST environment variable).

-p, --credentials

Admin credentials in the format admin:password.

-i, --log.level

New log level to set (values: 10, 15, 20, 30, 40, 50).

-t, --time

Duration in seconds to keep the new log level (optional).

--detach

Runs the script in a detachable session using screen or tmux, allowing continued operation if terminal session ends.

Instruction for Use

Example 1: Set cluster log level to 20 and keep it for 10 minutes

./castor-change-log-level.sh -d 192.168.8.84 -p admin:datacore -i 20 -t 600

Example 2: Set node(s) log level to debug (10) and keep it for 10 minutes

./castor-change-log-level.sh -d 192.168.8.84,192.168.8.86 -p admin:datacore -i debug -t 600

Example 3: Run in background mode

Running the Script at a Specific Time

The at command can used to schedule the script to run at a later time. This is useful when you need to start collecting debug logs at a specific hour.

Example 4: Schedule the script to run at 3:00 AM and collect debug logs for 1 hour

  1. Ensure the at service is installed, enabled and running

  2. Schedule the script execution using at:

This schedules the script to run at 3:00 AM on Feb 16 2025 and collects log at debug level for 1 hour.

  1. To verify scheduled job:

  2. To remove a scheduled job (replace JOB_ID with the actual job number from atq output):

Behavior

  1. Log Level Change: Sets the log level to the specified value. If the current log level matches the requested level, the script skips the update.

  2. Countdown: During the specified duration, the script displays a countdown every second.

  3. Revert Log Level: After the countdown, the log level reverts to the initial value.

  4. Log Size Report: Provides approximately log size generated during the temporary log level change.

Output Messages

Message

Description

Message

Description

Swarm IP:

Displays the specified Swarm IP address.

Credentials:

Credentials are masked for security.

Cluster Name:

Displays the cluster name retrieved from the Swarm API.

New log level:

Shows the new log level requested.

Current log level:

Displays the current log level.

Updating log level to X...

Indicates the beginning of the log level update process.

Log level changed successfully...

Confirms that the log level was successfully updated.

Keeping log level at X for Y...

Shows the temporary period for which the new log level is retained, with a countdown.

Time's up! Reverting log level...

Indicates that the temporary period has ended and the script is reverting the log level.

Approximate X new logs generated...

Provides information on the amount of logging activity generated during the temporary log level.

Example Output

Error Handling

  • Missing Parameters: Missing parameters prompt a usage message.

  • Invalid Duration: If a non-numeric duration is provided, you’re prompted to enter a valid duration in seconds.

  • Connection Issues: If unable to connect to the Swarm API, check the IP, credentials, and network access.

Notes

  • Credentials are masked in the output for security.

  • Log file sizes are shown in human-readable format (GB, MB, KB, B).

This script provides administrators with an effective way to adjust and monitor Swarm logging, supporting both temporary and permanent log level changes for troubleshooting and performance monitoring.

Script Source Code

Latest version: castor-change-log-level.sh

 

Related content

SWARM Containerized Stack Installation
SWARM Containerized Stack Installation
Read with this
swarmctl -L Node log level
swarmctl -L Node log level
More like this
swarmctl -i Log level
swarmctl -i Log level
More like this
Script to periodically upload SCS log files
Script to periodically upload SCS log files
More like this
Turn on component level logging in Swarm
Turn on component level logging in Swarm
More like this
swarmctl -z Component Log Level
swarmctl -z Component Log Level
More like this

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.