What are the file descriptor/connection messages in syslog?

The per-node file descriptor count can be tracked over time to detect:

  1. Apps that abandon open sockets in a socket pool, or
  2. Apps that open too many connections to Swarm (i.e., more than they reasonably need), or
  3. Periods of high usage (i.e., spikes in network connections).

We have had customers who experienced problems #1 and #2 in their environments. In the case of #1, abandoned connections, a graphical plot of the file descriptors shows a steady increase over time while the client-side activity reported by the SNMP counters remains stable. In one particular case, the application software had a socket pool that failed to mark a previously used socket as returned, so the application kept opening new sockets to service new requests. Once the customer corrected the socket accounting, the sockets began to be reused and the file descriptor count remained steady.
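The signature of problem #1 (file descriptors climbing while activity stays flat) can be checked numerically as well as by eye. The following is only a minimal sketch, assuming you have already collected two equally spaced series per node: the file descriptor counts and a request count derived from the SNMP counters. The function names and the threshold values are hypothetical and would need tuning for a real environment.

```python
def slope(values):
    """Least-squares slope of values against their sample index."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def looks_like_socket_leak(fd_counts, request_counts,
                           min_fd_growth=0.05, max_activity_change=0.01):
    """Return True when descriptors grow steadily but activity is flat.

    Growth is expressed per sample, relative to the series mean.
    Both thresholds are illustrative assumptions, not Swarm defaults.
    """
    if len(fd_counts) < 2 or len(request_counts) < 2:
        return False
    fd_mean = sum(fd_counts) / len(fd_counts)
    req_mean = sum(request_counts) / len(request_counts)
    if fd_mean == 0 or req_mean == 0:
        return False
    fd_growth = slope(fd_counts) / fd_mean
    activity_change = abs(slope(request_counts)) / req_mean
    return fd_growth >= min_fd_growth and activity_change <= max_activity_change


# Hypothetical samples: descriptors rise by 10 per interval, requests hover near 500.
leaking = looks_like_socket_leak([100, 110, 120, 130, 140, 150],
                                 [500, 498, 503, 501, 499, 502])
healthy = looks_like_socket_leak([100, 101, 99, 100, 100, 101],
                                 [500, 498, 503, 501, 499, 502])
```

A relative slope is used rather than a raw difference so that the same thresholds behave similarly on nodes with very different baseline descriptor counts.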

In the case of #2, the customer had configured their PHP Apache application to use socket pools. However, they did not take into account that Apache forks many processes per web server and that each process has its own connection pool. The total number of connections was therefore the connection pool size multiplied by the number of Apache processes multiplied by the number of web servers. When the application was started, the file descriptor count jumped suddenly to a few thousand per Swarm node, while the SNMP counters showed almost no activity.
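The multiplication described above is easy to underestimate, so a rough calculation before deployment can help. This is a sketch only: the numbers are hypothetical, not taken from the customer's environment, and it assumes connections are spread evenly across the Swarm nodes, which may not hold in practice.

```python
def pooled_connections(pool_size, processes_per_server, web_servers, swarm_nodes):
    """Estimate idle pooled connections in total and per Swarm node.

    Assumes every Apache process fills its own pool and that the
    connections are distributed evenly across nodes (an assumption).
    """
    total = pool_size * processes_per_server * web_servers
    per_node = total / swarm_nodes
    return total, per_node


# Hypothetical deployment: pools of 10, 150 Apache processes per server,
# 8 web servers, 4 Swarm nodes.
total, per_node = pooled_connections(10, 150, 8, 4)
print(f"total connections: {total}, per Swarm node: {per_node:.0f}")
```

With these illustrative values a modest pool size of 10 still yields thousands of open descriptors per node before any request is served.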

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.