Info

Deprecated

The Legacy Admin Console (port 90) is still available but has been replaced by the Swarm Storage UI. (v10.0)

...

The top row of the Node Status page provides summary information about the node and the associated volumes, such as uptime and storage usage statistics:

...

  • Streams: Counts the total number of managed data components (such as replicas and segments), not logical objects (such as video files).

  • Trapped: Calculates the space pending reclamation by the Swarm defragmentation process. This process is controlled by several Swarm parameters (see the Settings Reference).

...

To shut down or restart a node:

  • Click Shutdown Node or Restart Node in the Swarm Admin Console.

A node that is shut down or rebooted by an Administrator appears with a Maintenance state on other nodes in the cluster.

...

Identify one or all volumes on a node using the links on the right side of the Swarm Admin Console under Restart Node.

The Identify function lets you select a particular volume and enable the corresponding drive LED, which can be helpful in identifying a failed or failing drive. Select the targeted volume and the amount of time the light stays enabled.

...

Info

Note

Retire succeeds if objects can be replicated elsewhere in the cluster. As a result, the Retire action does not remove an object until it can guarantee at least two replicas exist in the cluster or the existing number of replicas matches the policy.replicas min parameter value.

...

Messages display in the node status area when a drive is removed from or inserted into a running node. This feature, referred to as hot plugging (adding a new drive) or hot swapping (replacing a failed drive), allows removing failed drives for analysis or adding storage capacity to a node at any time.

...

The Node Info status section contains general information about the hardware installed on the node, as well as time server information and current uptime. Use this status information to determine if a node requires additional hardware resources.

For example, if the Index Utilization and Buffer Utilization values rise to 80% or more, the Swarm Admin Console generates an alert indicating the node may require additional RAM to maintain cluster performance. Additionally, if the Time value does not match the value shown on the other cluster nodes, the node may not be communicating properly with an NTP server.
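The two checks described above can be sketched in a few lines of Python. This is only an illustration, not part of Swarm: the dictionary keys, the `review_node_info` function, and the 5-second clock tolerance are all assumptions made for the example, and the values would have to be copied by hand from the Node Info section of each node. The sketch flags each utilization value separately once it reaches 80%.

```python
from datetime import datetime, timedelta

ALERT_THRESHOLD_PCT = 80.0             # from the text: alert at 80% or more
MAX_CLOCK_SKEW = timedelta(seconds=5)  # assumed tolerance, not a Swarm setting


def review_node_info(node, peers):
    """Return a list of warnings for one node.

    `node` and each entry in `peers` are plain dicts with hypothetical keys:
    'name', 'index_utilization', 'buffer_utilization' (percent), 'time' (datetime).
    """
    warnings = []
    # Flag each utilization value that has reached the alert threshold.
    for key in ("index_utilization", "buffer_utilization"):
        if node[key] >= ALERT_THRESHOLD_PCT:
            warnings.append(
                f"{node['name']}: {key} at {node[key]:.0f}%, may need more RAM"
            )
    # Report the first peer whose clock disagrees beyond the tolerance.
    for peer in peers:
        if abs(node["time"] - peer["time"]) > MAX_CLOCK_SKEW:
            warnings.append(
                f"{node['name']}: clock differs from {peer['name']}, check NTP"
            )
            break
    return warnings


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, 0)
    node = {"name": "node1", "index_utilization": 85.0,
            "buffer_utilization": 40.0, "time": now}
    peers = [{"name": "node2", "index_utilization": 10.0,
              "buffer_utilization": 10.0, "time": now + timedelta(minutes=3)}]
    for warning in review_node_info(node, peers):
        print(warning)
```

With the sample values, the sketch reports the high index utilization and the three-minute clock difference.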

...

Scroll to the bottom of the Node Info section to access these links to additional reports:

  • SNMP Repository (the SNMP repository dump)

  • Object Counts (the Python classes in use)

  • Uncollectable Garbage

  • HTML Templates

  • Loggers... (the settings window for changing the logging levels)

  • Dmesg dump (the last 1000 messages logged in the Linux kernel ring buffer, for diagnosing a Swarm issue when a system panic or error occurs)

  • Hwinfo dump (the Linux hardware detection tool output)
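The Dmesg dump only reaches back a limited distance because the Linux kernel log is a bounded buffer: once it is full, each new message displaces the oldest one. The following Python sketch imitates that behavior with a message-count limit of 1000 to match the dump described above. It is a simplification for illustration only; the `make_log_buffer` helper is hypothetical, and the real kernel buffer is sized in bytes rather than in messages.

```python
from collections import deque

MAX_MESSAGES = 1000  # the Dmesg dump shows the last 1000 messages


def make_log_buffer(capacity=MAX_MESSAGES):
    """Return a bounded buffer that drops its oldest entry when full."""
    return deque(maxlen=capacity)


if __name__ == "__main__":
    buf = make_log_buffer()
    for i in range(1500):
        buf.append(f"message {i}")
    # Only the newest 1000 entries remain: "message 500" through "message 1499".
    print(len(buf), buf[0], buf[-1])
```

The practical consequence is to capture the dump soon after a problem occurs, before later kernel messages push the relevant entries out.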

...

For example, some cluster features (such as the Capacity column value in the Swarm Admin Console) do not update until the HP cycles are completed separately on each node. The HP Cycle time parameter increases exponentially as the number of objects increases on the node. Additionally, if the SCSP Last read bid and SCSP Last write bid parameters are high, the node may not be servicing new requests.

...

Hardware status reporting is dependent on hardware that supports and populates IPMI sensors, SMART status, and, in some cases, manufacturer-specific components such as SAS. Depending on the hardware, not all status fields are populated. The hardware status values are independently scanned and populated for each node, allowing variations in supported utilities on a node-by-node basis.

...

Scroll to the bottom of the Hardware Status section to access these links to additional reports:

  • Test Network - Pings all nodes in the cluster to ensure all nodes can communicate with each other using TCP/IP and UDP (see details below).

  • Test Volumes - Pings the volumes on the local hard drives and provides a response time (in milliseconds).

  • Dmesg Dump - Displays the last 1000 messages logged in the Linux kernel ring buffer. These messages can help troubleshoot and diagnose a Swarm issue when a system panic or error occurs.

  • Hwinfo dump (the Linux hardware detection tool output)

  • Send Health Report (script that sends the hardware health report to the configured destination)

Test Network

Test Network performs two sets of tests:

  • First, it sends 100 UDP multicasts to the cluster and computes the results:

      • Which nodes responded

      • How many responses returned

      • How long the responses took, on average

  • Next, it fetches the status page (port 80) via TCP for all responding nodes (once for each node). It tracks the total time for each of those round trips.

The data in the Network Test Results window lets you compare the responding nodes with the list of nodes expected in the cluster. Use it to evaluate UDP packet loss and TCP connectivity within the cluster.
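As a rough illustration of that comparison, the Python sketch below takes a list of expected nodes and a per-node count of UDP responses and reports missing nodes, unexpected responders, and the loss percentage for each responder. The function name, the input shape, and the assumption that response counts are available per node are all hypothetical; the figures would be transcribed from the Network Test Results window, and only the count of 100 multicasts comes from the description above.

```python
SENT_MULTICASTS = 100  # the text says Test Network sends 100 UDP multicasts


def summarize_udp_results(expected_nodes, responses, sent=SENT_MULTICASTS):
    """Compare responders with the expected nodes and compute per-node loss.

    `expected_nodes` is an iterable of node addresses believed to be in the cluster.
    `responses` maps node address -> number of replies received (assumed shape).
    """
    expected = set(expected_nodes)
    responded = set(responses)
    # Loss is the share of multicasts that drew no reply from a given node.
    loss_pct = {
        node: 100.0 * (sent - min(count, sent)) / sent
        for node, count in responses.items()
    }
    return {
        "missing": sorted(expected - responded),     # expected but silent
        "unexpected": sorted(responded - expected),  # replied but not expected
        "loss_pct": loss_pct,
    }


if __name__ == "__main__":
    summary = summarize_udp_results(
        ["10.0.0.1", "10.0.0.2", "10.0.0.3"],
        {"10.0.0.1": 100, "10.0.0.2": 93, "10.0.0.4": 100},
    )
    print(summary)
```

In the sample data, 10.0.0.3 never answered, 10.0.0.4 answered without being on the expected list, and 10.0.0.2 lost 7 of the 100 multicasts.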

...