One of the common problems with sophisticated clusters used for virtual machine deployment is fencing. Proper fencing ensures high availability and preserves data integrity. Read on to see how CARI.net is implementing fencing for virtual machine clusters and to see our power fencing agent.
When a host machine stops responding, the virtual machines on it must be moved to another host. This is easy with a shared storage system in place, but there is a risk that the virtual machine is still accessing the shared storage. If the virtual machine is running on two hosts, both accessing the same data, there is a high risk of data corruption.
To prevent this, the original host must be “fenced”. This means it is completely cut off from the network and the shared storage, so it can no longer write to either. There are various ways to do this. At CARI.net, our frontline fencing is done via IPMI. This is an inexpensive option, as many new server boards have IPMI built in. This technique has a number of weaknesses, however, so a secondary fencing technique is important. On larger clusters, a network-controlled power distribution unit makes an excellent fence, but such units are very expensive.
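To make the idea of a frontline and a secondary fence concrete, here is a minimal sketch in Python. It is not our cluster's code: the function names and the two stand-in methods are hypothetical and only illustrate the ordering. The point is that recovery should only continue once some method has confirmed the host is fenced.

```python
from typing import Callable, Sequence, Tuple

# A fencing method is a (name, callable) pair. The callable takes a host name
# and returns True only if it could confirm the host was fenced.
FenceMethod = Tuple[str, Callable[[str], bool]]


def fence_node(node: str, methods: Sequence[FenceMethod]) -> str:
    """Try each method in order and return the name of the first that succeeds."""
    for name, method in methods:
        try:
            if method(node):
                return name
        except Exception:
            # A method that raises counts as a failure; fall through to the next one.
            continue
    # Nothing confirmed the fence, so it is not safe to restart the VMs elsewhere.
    raise RuntimeError(f"all fencing methods failed for {node}")


# Hypothetical stand-ins for real fencing calls (assumptions for illustration).
def ipmi_power_off(node: str) -> bool:
    return False  # pretend the IPMI controller did not answer


def pdu_outlet_off(node: str) -> bool:
    return True  # pretend the PDU switched the outlet off


used = fence_node("host1", [("ipmi", ipmi_power_off), ("pdu", pdu_outlet_off)])
print(used)  # pdu
```

If every method fails, the sketch raises instead of guessing, because moving the virtual machines without a confirmed fence is exactly the data corruption risk described above.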
To bring down costs and make highly available private clouds more accessible to our customers, CARI.net has been experimenting with lower-cost fencing devices. One that came to our attention is the Tripp Lite PDUMH20ATNET. This unit is relatively inexpensive, rack mounted, and provides 16 power receptacles that can be controlled via telnet or SNMP. Unfortunately, there was no fenced agent for it. The fenced daemon is the fencing component of the Red Hat Cluster Suite, and it requires an agent to be written for each fencing device or method it controls.
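For readers who have not written one, fence agents in the Red Hat cluster stack conventionally receive their parameters from fenced as name=value lines on standard input. The sketch below shows only that parsing step in Python. It is not taken from our agent, and the option names in the sample (ipaddr, port, action) are illustrative assumptions rather than a list of what any particular agent accepts.

```python
import io
from typing import Dict, TextIO


def parse_fence_options(stream: TextIO) -> Dict[str, str]:
    """Read name=value lines from a stream, skipping blank lines and # comments."""
    options: Dict[str, str] = {}
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            # Not a name=value pair; ignore it in this sketch.
            continue
        options[key.strip()] = value.strip()
    return options


# Sample input standing in for what the cluster would write to the agent's stdin.
sample = io.StringIO("# hypothetical options\nipaddr=192.0.2.10\n\nport=3\naction=off\n")
opts = parse_fence_options(sample)
print(opts)  # {'ipaddr': '192.0.2.10', 'port': '3', 'action': 'off'}
```

A real agent would read from sys.stdin instead of the sample stream, then act on the requested outlet and exit with a status that tells the cluster whether the fence succeeded.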
I have written an agent for this device, and we are now using it on a small internal cloud. We hope to use the same architecture for customer private clouds in the future. We have also decided to open-source the agent. It is included here: fence_tripplite_snmp. The agent is documented in its comments; just open it in a text editor. Please feel free to report any bugs or send any questions to edorthe at cari dot net.