This section describes how to boot a cluster from the network using the Intel Preboot Execution Environment (PXE) specification. This booting process (commonly referred to as network booting) is supported by most NICs. PXE is one way to boot the storage cluster nodes.

Platform Server

If using Legacy Swarm Platform, skip this section: network booting is already set up.

  • To enable nodes to boot from a USB flash drive, see Initializing a Storage Cluster.

  • To enable nodes to boot using a configuration file server, see the section below.

  • To enable nodes to PXE boot, perform these steps:

  1. Configure the DHCP server with next-server and filename parameters.

  2. Configure PortFast on the switch ports leading to the storage cluster nodes.

  3. Configure the TFTP server with PXE bootstrap, configuration, and Swarm files.

  4. Set up the nodes' BIOS configurations for network booting.

Requirement

Increase the size of the initrd RAM disk to 160 MB on the PXE boot server to prevent PXE boot failures (the boot examples in this section use ramdisk_size=160000). This does not apply if using Platform Server.

Setting Up the DHCP Server for PXE Booting

Warning

Swarm can erase all non-Swarm data on hosts that boot accidentally from the network. When setting up the DHCP server, verify it provides network booting information to the correct network hosts.

The following example shows configuration lines for the Internet Systems Consortium (ISC) DHCP server, which is commonly available on UNIX systems. The next-server parameter defines the IP address of the Trivial File Transfer Protocol (TFTP) server, and the filename parameter defines the bootstrap loader program to download:

group {
   next-server 172.16.1.10;
   filename "/pxelinux.0";
   # Hosts allowed to network boot into Swarm
   host clusternode1 { hardware ethernet 00:90:cb:bf:45:26; }
   host clusternode2 { hardware ethernet 00:90:b2:92:09:e4; }
   host clusternode3 { hardware ethernet 00:90:0d:46:7a:b4; }
}

In this example, the Swarm nodes are explicitly defined by MAC address so that other servers or workstations on the network cannot boot into Swarm unattended.
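When the list of nodes grows, the group block can be generated from an inventory of MAC addresses instead of being edited by hand. The following Python sketch shows one way to do that. The function name render_pxe_group and the host dictionary are illustrative assumptions, not part of Swarm or ISC DHCP.

```python
import re

# Accepts lowercase colon-separated MAC addresses such as 00:90:cb:bf:45:26.
MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def render_pxe_group(tftp_ip, bootfile, hosts):
    """Render an ISC dhcpd group block that limits PXE booting to known MACs.

    hosts maps a host name to its MAC address. The names and addresses used
    below are placeholders copied from the example above.
    """
    lines = [
        "group {",
        f"   next-server {tftp_ip};",
        f'   filename "{bootfile}";',
        "   # Hosts allowed to network boot into Swarm",
    ]
    for name, mac in hosts.items():
        mac = mac.lower()
        if not MAC_RE.match(mac):
            raise ValueError(f"invalid MAC address for {name}: {mac}")
        lines.append(f"   host {name} {{ hardware ethernet {mac}; }}")
    lines.append("}")
    return "\n".join(lines)


example = render_pxe_group(
    "172.16.1.10",
    "/pxelinux.0",
    {"clusternode1": "00:90:cb:bf:45:26", "clusternode2": "00:90:B2:92:09:E4"},
)
print(example)
```

Rejecting malformed MAC addresses before writing the file reduces the chance that a typo leaves a node out of the group or lets an unintended host in.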

Configuring PortFast on Switch Ports

PortFast is a switch port configuration parameter that enables a port to bypass the listening and learning Spanning Tree states so the port immediately forwards traffic.

If a storage cluster node is connected to a network switch, verify PortFast is configured on the switch ports leading to each node. Without PortFast, the delay introduced by the Spanning Tree listening and learning states can prevent netboot from delivering the Swarm image to a PXE-enabled node in a timely manner.

Configuring the TFTP Server

The TFTP server transfers configuration or boot files between systems in a local environment. Configure the TFTP server to load the Swarm software onto the cluster nodes after configuring the DHCP server.

To set up the TFTP server:

  1. Install and configure TFTP server software on the boot server.

  2. Create the /tftpboot directory hierarchy.

  3. Copy the kernel and fsimage files to the /tftpboot/profiles/castor directory.

See DHCP and Boot Server Redundancy below.

Installing and Configuring TFTP

TFTP server software is available in both free and commercial packages. UNIX distributions commonly include TFTP server software with the standard setup. The tftp-hpa package for UNIX can integrate with Swarm. Source code can be obtained from the Linux Kernel Archives website located at kernel.org/pub/software/network/tftp.

TFTP server software is also available as a binary package in many Linux distributions.

Creating the tftpboot Directory Hierarchy

Configure the server to access the network boot file directory after installing the TFTP server. This directory is typically labeled /tftpboot because TFTP is almost exclusively used for booting network devices.

A sample template is included in the samples/NetworkBoot directory of the Swarm software distribution.

Copying Kernel and fsimage

The Swarm software distribution media includes the kernel and fsimage files, which contain the Swarm embedded operating system. Copy these files to the tftpboot/profiles/castor directory on the TFTP server so they load onto each Swarm node during bootup.

The tftpboot directory on the TFTP server should contain these files after copying the directory template and the Swarm software files:

File Name                          Description
tftpboot/pxelinux.0                Boot loader program
tftpboot/profiles/castor/fsimage   Swarm software
tftpboot/profiles/castor/kernel    Swarm operating system kernel
tftpboot/pxelinux.cfg/default      PXELINUX configuration file

See the documentation and ZIP file in the samples/Network-Boot directory on the Swarm distribution media for help with using the PXELINUX boot loader.
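Before pointing nodes at the boot server, it can help to confirm that the directory tree is complete. The following Python sketch checks for the four files listed above. The function name missing_boot_files is hypothetical, and the /tftpboot root is the conventional location mentioned earlier, which may differ on a given TFTP server.

```python
from pathlib import Path

# Relative paths taken from the file list above.
EXPECTED_FILES = [
    "pxelinux.0",
    "profiles/castor/fsimage",
    "profiles/castor/kernel",
    "pxelinux.cfg/default",
]


def missing_boot_files(tftp_root):
    """Return the expected boot files that are absent or empty under tftp_root."""
    root = Path(tftp_root)
    missing = []
    for rel in EXPECTED_FILES:
        path = root / rel
        # A zero-length file usually indicates an interrupted copy.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # Adjust the root if the TFTP server serves files from another directory.
    for rel in missing_boot_files("/tftpboot"):
        print(f"missing or empty: {rel}")
```

The script prints nothing when all four files are present and non-empty.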

DHCP and Boot Server Redundancy

Configure both a primary and secondary DHCP server when setting up the DHCP server. This configuration eliminates a single point of failure if one of the servers goes offline for any reason.

  • See "Failover with ISC DHCP" at madboa.com/geek/dhcp-failover to set up the ISC DHCP daemon for redundancy.

  • Use the primary and secondary DHCP servers as TFTP servers to provide redundancy at the network booting layer.
    When setting up the DHCP servers, set the next-server parameter on each server to a separate IP address. Whichever DHCP server (primary or secondary) answers the DHCP query then also handles the PXE boot.

  • Verify the TFTP boot servers are located in the same broadcast domain (or VLAN) as the Swarm nodes, or enable a DHCP relay server on the VLAN, to prevent any network interruptions.

Setting Up a Configuration File Server

Platform Server

Skip this section if using Legacy Swarm Platform: centralized configuration is already set up.

Swarm supports centralized node configuration files on an HTTP or FTP server. This method allows booting from a network or a standard USB flash drive. A centralized configuration file server simplifies storage cluster administration by supporting configuration file updates and providing a method to group similar node configurations together.

To implement a configuration file server, set the castor_cfg kernel configuration parameter to a URL that targets the configuration list file, as described below.

PXE Boot Example

This is an example PXELINUX configuration file located in the tftpboot/pxelinux.cfg directory on the TFTP boot server.

default profiles/castor/kernel
append initrd=profiles/castor/fsimage ramdisk_size=160000 root=/dev/ram0 castor_cfg=http://172.16.1.200/castor/cfg-list.txt

USB Boot Loader Example

This is an example section of the syslinux.cfg located in the root directory on the USB flash drive:

label normal
	kernel kernel
	append initrd=fsimage ramdisk_size=160000 root=/dev/ram0 castor_cfg=http://172.16.1.200/castor/cfg-list.txt

Configuration List File Example

The castor_cfg kernel configuration parameter specifies a file containing a list of URLs for all configuration files to be loaded by a Swarm node. Swarm configuration files are evaluated in the order in which they are listed in the configuration list file.

A Swarm configuration setting can be defined more than once, in which case the last definition is used. This allows configuration files to be layered: values that apply to the whole cluster come first, followed by values for a group of similar nodes, and finally values specific to one node.

Example of URLs in a configuration list file:

http://172.16.1.200/castor/cluster.cfg 
http://172.16.1.200/castor/subcluster.cfg 
http://172.16.1.200/castor/testnode.cfg

Each of the configuration files in the list file uses the same format as the Swarm node.cfg file.

See the /caringo/node.cfg.sample file in the Swarm software distribution.

See Managing Configuration Settings.
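The "last definition is used" rule can be illustrated with a short Python sketch that merges files in list order. This is only a model of the layering behavior described above, not Swarm's actual parser. It assumes simple name = value lines with # comments, and the setting names (example.level, example.role) are made up for illustration.

```python
def parse_settings(text):
    """Parse simple 'name = value' lines, skipping blanks and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def merge_layers(layers):
    """Merge parsed files in list order; a later definition replaces an earlier one."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical file contents, in the order they appear in the list file.
cluster_cfg = "# cluster-wide defaults\nexample.level = 30\nexample.role = cluster\n"
subcluster_cfg = "example.role = subcluster\n"
testnode_cfg = "example.level = 10\n"

effective = merge_layers(
    parse_settings(text) for text in (cluster_cfg, subcluster_cfg, testnode_cfg)
)
print(effective)
```

In this sketch the node ends up with example.level from testnode.cfg and example.role from subcluster.cfg, because each is the last file in the list to define that setting.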

Disabling Monitor Power-Saving Activation

To prevent the monitor power-saving feature from activating while a monitor is connected to a Swarm storage node, add the following kernel option to the APPEND line in either the syslinux.cfg file on the Swarm USB key or the PXE boot configuration file.

This parameter tells the kernel to stop blanking the console:

consoleblank=0

The blank timer defaults to 10 minutes. A value of 0 disables it. Examples are listed below:

PXE Boot Example

This is a PXELINUX configuration file from the tftpboot/pxelinux.cfg directory on the TFTP boot server with the console power saver disabled.

default profiles/castor/kernel
append initrd=profiles/castor/fsimage consoleblank=0 ramdisk_size=160000 root=/dev/ram0 castor_cfg=http://172.16.1.200/castor/cfg-list.txt

USB Boot Loader Example

This is a section of the syslinux.cfg contained in the root directory on the USB flash drive with the console power savings disabled.

label normal
	kernel kernel
	append initrd=fsimage consoleblank=0 ramdisk_size=160000 root=/dev/ram0 castor_cfg=http://172.16.1.200/castor/cfg-list.txt
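When many boot configuration files need the same change, the option can be added to an existing APPEND line with a small script. The following Python sketch sets or replaces a single name=value kernel option. The function name set_kernel_option is hypothetical. The sketch normalizes whitespace, so any leading indentation on the line must be restored by the caller.

```python
def set_kernel_option(append_line, name, value):
    """Return append_line with name=value set, replacing an existing name= token."""
    tokens = append_line.split()
    option = f"{name}={value}"
    for i, token in enumerate(tokens):
        if token.startswith(name + "="):
            tokens[i] = option  # replace the existing definition in place
            break
    else:
        tokens.append(option)  # not present yet, so add it at the end
    return " ".join(tokens)


line = (
    "append initrd=profiles/castor/fsimage ramdisk_size=160000 root=/dev/ram0 "
    "castor_cfg=http://172.16.1.200/castor/cfg-list.txt"
)
updated = set_kernel_option(line, "consoleblank", "0")
print(updated)
```

Running the function a second time on its own output leaves the line unchanged, so the script can be applied repeatedly without duplicating the option.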