In an environment where high availability is crucial, you won't get by without cross-machine replication in the long run. While ordinary RAID arrays, backups, and networked file systems are reasonable ways to protect yourself from data loss and to allow access to the same data from multiple hosts, they still leave you with a single point of failure: the storage itself. There are plenty of rather expensive solutions out there that come with two machines and multiple NICs in one box, accessing the same RAID array (for example NetApp®). But for most environments this is overkill, or just not affordable.

GlusterFS

In this mini tutorial, I'd like to describe how to create replicated, clustered storage with two machines. This is probably the most common use case. It removes the single point of failure (the single storage backend).

This short tutorial is based on CentOS 5.4 x64, but the GlusterFS team also provides binaries for other Red Hat-based systems and Debian (including derivatives), as well as the source code. So it should be easy enough to apply this to any Linux distribution of your choice. So let's go...

Obtaining and Installing the Binaries

All relevant binaries can be found here: http://ftp.gluster.com/pub/gluster/glusterfs/2.0/

At the time of writing, 2.0.7 was current, so let's use it here:

$ curl -O http://ftp.gluster.com/pub/gluster/glusterfs/2.0/2.0.7/CentOS/glusterfs-server-2.0.7-1.x86_64.rpm
$ curl -O http://ftp.gluster.com/pub/gluster/glusterfs/2.0/2.0.7/CentOS/glusterfs-client-2.0.7-1.x86_64.rpm
$ curl -O http://ftp.gluster.com/pub/gluster/glusterfs/2.0/2.0.7/CentOS/glusterfs-common-2.0.7-1.x86_64.rpm

In addition, some libraries have to be installed (they are part of the CentOS base repository):

$ yum -y install libibverbs fuse fuse-libs

That's it already. Now we can actually install the RPMs:

$ rpm -Uhv glusterfs-*-2.0.7-1.x86_64.rpm

Note: On the GlusterFS clients you can obviously leave out the server RPM.
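To avoid retyping the three download URLs, the list can be generated per machine type. This is only a sketch: the function name rpm_urls and the role names "server" and "client" are made up for this example, while the base URL and version are the ones used above.

```shell
# Base URL and package version as used in this tutorial.
base_url="http://ftp.gluster.com/pub/gluster/glusterfs/2.0/2.0.7/CentOS"
version="2.0.7-1.x86_64"

# rpm_urls ROLE
# ROLE is "server" or "client" (hypothetical names for this sketch).
# Servers get all three RPMs, clients skip the server RPM.
rpm_urls() {
  case "$1" in
    server) set -- server client common ;;
    client) set -- client common ;;
    *) echo "usage: rpm_urls server|client" >&2; return 1 ;;
  esac
  for pkg in "$@"; do
    echo "$base_url/glusterfs-$pkg-$version.rpm"
  done
}

# Print the URLs a client machine would need.
rpm_urls client
```

The output can then be fed to curl, for example with rpm_urls server | xargs -n 1 curl -O.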
Configuring the Servers (Nodes)

For more in-depth configuration details, see either /usr/share/doc/glusterfs or the documentation.

On both servers you need to create a volume, which strictly speaking is nothing more than a directory. However, I'd suggest mounting an Ext3 partition on a separate directory and using that for the Gluster volume. Make sure that the volumes on both nodes have the same physical size. The documentation is somewhat sparse on the exact requirements, so let's not push our luck by deviating from the recommended setup.

In this example, I assume that
It took me quite a while to find the right combination and, more importantly, the right order of the different I/O, caching, and threading layers. Initially I had a configuration where I achieved only about 2 MB/sec write speed, but 40 MB/sec read speed (over Gbit Ethernet). Now, with the configuration shown here, I am at 38 MB/sec in both directions. Quite good for replicated storage via Gbit Ethernet, I think. For small files I reach up to 180 MB/sec, but that's probably not accurate, as caching and write-behind are being used. But it does sound exciting.

On the two nodes, deploy the following configuration file as /etc/glusterfs/glusterfsd.vol.

Server 1

    ### Export volume "brick" with the contents of the "/gfs" directory.
    volume posix
      type storage/posix                  # POSIX FS translator
      option directory /gfs               # Export this directory
    end-volume

    ### Add POSIX record locking support to the storage brick
    volume brick
      type features/posix-locks
      option mandatory on                 # enables mandatory locking on all files
      subvolumes posix
    end-volume

    volume vol1
      type performance/io-threads
      option thread-count 8
      subvolumes brick
    end-volume

    ### Add network serving capability to above brick.
    volume server
      type protocol/server
      option transport-type tcp                   # For TCP/IP transport
      option transport.socket.listen-port 6996    # Default is 6996
      #option client-volume-filename /etc/glusterfs/glusterfs-client.vol
      subvolumes vol1
      option auth.addr.vol1.allow *               # allow access to the "vol1" volume
    end-volume

Server 2

That would be almost the same file. Just make sure to replace all occurrences of vol1 with vol2, as the client will refer to the volumes by name.

The server part is done here. Fire up both instances:

$ service glusterfsd start

Have a look at the log files in /var/log/glusterfs to make sure that everything went well.

Configuring the Client(s)

To set up a client for your two nodes, install the RPMs as described earlier (except the server-related one).
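Since the client refers to the two server volumes by name, it is worth making sure the Server 2 file really differs from the Server 1 file only in that name before writing the client configuration. A minimal sketch of the rename step described above, assuming the Server 1 file contains the string vol1 only where the volume name is meant (the function name derive_server2_volfile is made up for this example):

```shell
# Derive the Server 2 volfile text from the Server 1 volfile text by renaming
# every occurrence of vol1 to vol2. Reads stdin, writes stdout.
derive_server2_volfile() {
  sed 's/vol1/vol2/g'
}

# Example on a short excerpt of the Server 1 file:
printf '%s\n' 'volume vol1' 'subvolumes vol1' 'option auth.addr.vol1.allow *' \
  | derive_server2_volfile
```

On a real file this would be used as derive_server2_volfile < server1.vol > server2.vol, where both file names are placeholders; the result then goes to /etc/glusterfs/glusterfsd.vol on the second node.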
Change /etc/glusterfs/glusterfs.vol to look like this:

    ### Add client feature and attach to remote subvolume of server1
    volume client1
      type protocol/client
      option transport-type tcp                   # for TCP/IP transport
      option remote-host 10.0.100.101             # IP address of the remote brick
      option transport.socket.remote-port 6996    # default server port is 6996
      option remote-subvolume vol1                # name of the remote volume
    end-volume

    ### Add client feature and attach to remote subvolume of server2
    volume client2
      type protocol/client
      option transport-type tcp                   # for TCP/IP transport
      option remote-host 10.0.100.102             # IP address of the remote brick
      option transport.socket.remote-port 6996    # default server port is 6996
      option remote-subvolume vol2                # name of the remote volume
    end-volume

    ## Add replicate feature.
    volume replicate
      type cluster/replicate
      subvolumes client1 client2
    end-volume

    volume writebehind
      type performance/write-behind
      option aggregate-size 1MB
      option window-size 2MB
      option flush-behind off
      subvolumes replicate
    end-volume

    volume iocache
      type performance/io-cache
      option page-size 256KB
      option page-count 2
      subvolumes writebehind
    end-volume

Make sure to amend the IP addresses according to your setup.
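Because the two protocol/client sections differ only in their name, IP address, and remote volume, they can also be generated rather than edited by hand. The following is just a sketch; client_stanza is a helper invented for this example, and it emits the same options as the file above with port 6996 hard-coded.

```shell
# client_stanza NAME HOST REMOTE_VOLUME
# Prints one protocol/client volume definition in the same shape as the
# client1/client2 sections of /etc/glusterfs/glusterfs.vol shown above.
client_stanza() {
  cat <<EOF
volume $1
  type protocol/client
  option transport-type tcp
  option remote-host $2
  option transport.socket.remote-port 6996
  option remote-subvolume $3
end-volume
EOF
}

# The two sections used in this tutorial:
client_stanza client1 10.0.100.101 vol1
client_stanza client2 10.0.100.102 vol2
```

Only the two client sections are produced; the replicate, write-behind, and io-cache sections still have to follow them in the file.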
Now let's mount the storage, say on /mnt:

$ glusterfs --volfile=/etc/glusterfs/glusterfs.vol /mnt

Have a look at both the client's and the servers' log files again to make sure everything is OK. If it is, happy storing! Job done.
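As a quick extra check next to the log files, you can test whether the mount point actually shows up in the kernel's mount table. This is a sketch that assumes a Linux client with /proc/mounts; the helper name is_mounted and its optional second argument are made up for this example.

```shell
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds if MOUNTPOINT appears as a mount point (second field) in
# MOUNTS_FILE, which defaults to /proc/mounts.
is_mounted() {
  awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Check the mount point used in this tutorial.
if is_mounted /mnt; then
  echo "/mnt is mounted"
else
  echo "/mnt is not mounted"
fi
```

This only tells you that something is mounted on /mnt, not that both servers are reachable, so the log files remain the place to look for replication problems.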