Building a two-node IBM GPFS cluster on IBM AIX
01 August 2013
This article is a step-by-step guide for deploying a two-node IBM General Parallel File System
(IBM GPFS) V3.5 cluster on IBM AIX 7.1.
Overview
The purpose of this article is to provide a step-by-step guide for installing and configuring a simple
two-node GPFS cluster on AIX. The following diagram provides a visual representation of the
cluster configuration.
GPFS
GPFS provides a true "shared file system" capability, with excellent performance and scalability.
GPFS allows concurrent access for a group of computers to a common set of file data over a
common storage area network (SAN) infrastructure, a network, or a mix of connection types.
GPFS provides storage management, information lifecycle management tools, and centralized
administration and allows for shared access to file systems from remote GPFS clusters providing a
global namespace.
GPFS offers data tiering, replication, and many other advanced features. The configuration can be
as simple or complex as you want.
Each AIX system is configured with seven SAN disks. One disk is used for the AIX operating
system (rootvg) and the remaining six disks are used by GPFS.
# lspv
hdisk0          00c334b6af00e77b                    rootvg          active
hdisk1          none                                none
hdisk2          none                                none
hdisk3          none                                none
hdisk4          none                                none
hdisk5          none                                none
hdisk6          none                                none
The SAN disks (to be used with GPFS) are assigned to both nodes (that is, they are shared
between both partitions). Both AIX partitions are configured with virtual Fibre Channel adapters
and access their shared storage through the SAN, as shown in the following figure.
The following attributes, shown in the table below, were changed for each hdisk, using the chdev
command.
Table 1. Disk attributes changed with chdev

AIX device name   Size in GB   algorithm     queue_depth   reserve_policy
hdisk0            50           round_robin   32            no_reserve
hdisk1            50           round_robin   32            no_reserve
hdisk2            50           round_robin   32            no_reserve
hdisk3            50           round_robin   32            no_reserve
hdisk4            50           round_robin   32            no_reserve
hdisk5            50           round_robin   32            no_reserve
hdisk6            50           round_robin   32            no_reserve
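For example, the attributes for each disk can be set with a single chdev command per hdisk. A sketch for hdisk1 is shown below; note that if a disk is in use, chdev may need the -P flag to defer the change until the next reboot.

# chdev -l hdisk1 -a algorithm=round_robin -a queue_depth=32 -a reserve_policy=no_reserve
hdisk1 changed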
The lsattr command can be used to verify that each attribute is set to the correct value:
# lsattr -El hdisk6 -a queue_depth -a algorithm -a reserve_policy
algorithm      round_robin Algorithm      True
queue_depth    32          Queue DEPTH    True
reserve_policy no_reserve  Reserve Policy True
The next step is to configure Secure Shell (SSH) so that both nodes can communicate with
each other. When building a GPFS cluster, you must ensure that the nodes in the cluster have
SSH configured so that they do not require password authentication. This requires
configuring Rivest-Shamir-Adleman (RSA) key pairs for the root user's SSH
configuration, in both directions, for all nodes in the GPFS cluster.
The GPFS mm commands require this authentication in order to work. If the keys are not
configured correctly, the commands will prompt for the root password each time and might
fail. A good way to test this is to ensure that the ssh command works unhindered
by a request for root's password.
You can refer to a step-by-step guide for configuring SSH keys on AIX; the basic procedure is sketched below.
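A minimal sketch of the key exchange, assuming OpenSSH defaults (generate a key pair on each node, then append each node's public key to the other node's authorized_keys file, in both directions):

# ssh-keygen -t rsa        # accept the default file location and an empty passphrase
# cat ~/.ssh/id_rsa.pub | ssh root@aixlpar2a 'cat >> ~/.ssh/authorized_keys'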
You can confirm that the nodes can communicate with each other (unhindered) using SSH with the
following commands on each node:
aixlpar1# ssh aixlpar1a date
aixlpar1# ssh aixlpar2a date
aixlpar2# ssh aixlpar2a date
aixlpar2# ssh aixlpar1a date
With SSH working, configure the WCOLL (Working Collective) environment variable for the root
user. For example, create a text file that lists each of the nodes, one per line:
# vi /usr/local/etc/gpfs-nodes.list
aixlpar1a
aixlpar2a
Add the following entry to the root user's .kshrc file. This allows the root user to execute
commands on all nodes in the GPFS cluster using the dsh or mmdsh commands.
export WCOLL=/usr/local/etc/gpfs-nodes.list
The root user's PATH should be modified to ensure that all GPFS mm commands are available to
the system administrator. Add the following entry to the root user's .kshrc file.
export PATH=$PATH:/usr/sbin/acct:/usr/lpp/mmfs/bin
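After sourcing the updated .kshrc, you can confirm that the mm commands resolve from the new PATH (root's home directory on AIX is /):

# . /.kshrc
# which mmlscluster
/usr/lpp/mmfs/bin/mmlscluster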
The /etc/hosts file should be consistent across all nodes in the GPFS cluster. Each IP address for
each node must be added to /etc/hosts on each cluster node. This is recommended, even when
Domain Name System (DNS) is configured on each node. For example:
# GPFS_CLUSTER1 Cluster - Test
# GPFS Admin network - en0
10.1.5.110   aixlpar1a   aixlpar1
10.1.5.120   aixlpar2a   aixlpar2
# GPFS Daemon - Private network - en1
10.1.7.110   aixlpar1p
10.1.7.120   aixlpar2p
The GPFS (3.5.0.0) base file sets are installed first, using SMIT (or installp). The lslpp
command can be used to verify that the file sets are installed:

# lslpp -l gpfs.*
  Fileset                      Level  State      Description
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  gpfs.base                  3.5.0.0  COMMITTED  GPFS File Manager
  gpfs.msg.en_US             3.5.0.0  COMMITTED  GPFS Server Messages - U.S.
                                                 English
Path: /etc/objrepos
  gpfs.base                  3.5.0.0  COMMITTED  GPFS File Manager
Path: /usr/share/lib/objrepos
  gpfs.docs.data             3.5.0.0  COMMITTED  GPFS Server Manpages and
                                                 Documentation
The latest GPFS updates are installed next. Again, you can use SMIT (or installp) to update the
file sets to the latest level. The lslpp command can be used to verify that the GPFS file sets have
been updated.
aixlpar1 : /tmp/cg/gpfs_fixes_3510 # inutoc .
aixlpar1 : /tmp/cg/gpfs_fixes_3510 # ls -ltr
total 580864
-rw-r--r--    1 30007    bin          910336 Feb  9 00:10 U858102.gpfs.docs.data.bff
-rw-r--r--    1 30007    bin        47887360 May  8 08:48 U859646.gpfs.base.bff
-rw-r--r--    1 30007    bin        99655680 May  8 08:48 U859647.gpfs.gnr.bff
-rw-r--r--    1 30007    bin          193536 May  8 08:48 U859648.gpfs.msg.en_US.bff
-rw-r--r--    1 root     system         4591 May 10 05:15 changelog
-rw-r--r--    1 root     system         3640 May 10 05:42 README
-rw-r-----    1 root     system        55931 May 15 10:23 GPFS-3.5.0.10-power-AIX.readme.html
-rw-r-----    1 root     system    148664320 May 15 10:28 GPFS-3.5.0.10-power-AIX.tar
-rw-r--r--    1 root     system         8946 May 15 14:48 .toc
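The update itself can be applied with installp. The exact invocation was not captured in the output below; a typical form, which applies and commits all filesets from the fixes directory, would be:

# installp -acgXd /tmp/cg/gpfs_fixes_3510 all

The pre-commit verification and summary from the update follow: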
+-----------------------------------------------------------------------------+
                          Pre-commit Verification...
+-----------------------------------------------------------------------------+
Verifying requisites...done
Results...

SUCCESSES
---------
  Filesets listed in this section passed pre-commit verification
  and will be committed.

  Selected Filesets
  -----------------
  gpfs.base 3.5.0.10
  gpfs.msg.en_US 3.5.0.9

  << End of Success Section >>

+-----------------------------------------------------------------------------+
                                  Summaries:
+-----------------------------------------------------------------------------+

Installation Summary
--------------------
Name                        Level           Part        Event       Result
-------------------------------------------------------------------------------
gpfs.msg.en_US              3.5.0.9         USR         APPLY       SUCCESS
gpfs.base                   3.5.0.10        USR         APPLY       SUCCESS
gpfs.base                   3.5.0.10        ROOT        APPLY       SUCCESS
gpfs.base                   3.5.0.10        USR         COMMIT      SUCCESS
gpfs.base                   3.5.0.10        ROOT        COMMIT      SUCCESS
gpfs.msg.en_US              3.5.0.9         USR         COMMIT      SUCCESS

After the update, the lslpp command confirms the new file set levels:

# lslpp -l gpfs.*
Path: /usr/lib/objrepos
  gpfs.base                 3.5.0.10  COMMITTED  GPFS File Manager
Path: /usr/share/lib/objrepos
  gpfs.docs.data             3.5.0.3  COMMITTED  GPFS Server Manpages and
                                                 Documentation
The cluster is created using the mmcrcluster command.* The GPFS cluster name is
GPFS_CLUSTER1. The primary cluster configuration server is aixlpar1p and the secondary is
aixlpar2p (NSD servers are discussed in the next section). We have specified that ssh and
scp will be used for cluster communication and administration.
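The node file passed to -N lists one node per line with its designations. Its exact contents are not shown in the original output; given that both nodes later report as quorum nodes, a plausible version is:

# cat /tmp/cg/gpfs-nodes.txt
aixlpar1p:quorum-manager
aixlpar2p:quorum-manager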
aixlpar1 : /tmp/cg # mmcrcluster -C GPFS_CLUSTER1 -N /tmp/cg/gpfs-nodes.txt -p aixlpar1p \
  -s aixlpar2p -r /usr/bin/ssh -R /usr/bin/scp
Mon Apr 29 12:01:21 EET 2013: mmcrcluster: Processing node aixlpar2
Mon Apr 29 12:01:24 EET 2013: mmcrcluster: Processing node aixlpar1
mmcrcluster: Command successfully completed
mmcrcluster: Warning: Not all nodes have proper GPFS license designations.
Use the mmchlicense command to designate licenses as needed.
mmcrcluster: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
*Note: To ensure that GPFS daemon communication occurs over the private GPFS network,
we specified the GPFS daemon node names (that is, the host names ending with p) during
cluster creation. There are two types of communication to consider in a GPFS cluster:
administrative commands and daemon communication. Administrative commands use a remote
shell (ssh, rsh, or other) and socket-based communications. It is considered a best
practice to ensure that all GPFS daemon communication is performed over a private network.
Refer to the GPFS developerWorks wiki for further information and discussion on GPFS
network configuration considerations and practices.
To use a separate network for administration command communication, you can change the
"Admin node name" using the mmchnode command. In this example, the separate network address
is designated by "a" (for Administration) at the end of the node name, aixlpar1a for example.
# mmchnode --admin-interface=aixlpar1a -N aixlpar1p
# mmchnode --admin-interface=aixlpar2a -N aixlpar2p
The mmcrcluster command warned us that not all nodes have the appropriate GPFS license
designation. We use the mmchlicense command to assign a GPFS server license to both nodes
in the cluster.
aixlpar1 : / # mmchlicense server --accept -N aixlpar1a,aixlpar2a
The following nodes will be designated as possessing GPFS server licenses:
aixlpar2a
aixlpar1a
mmchlicense: Command successfully completed
mmchlicense: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
The cluster is now configured. The mmlscluster command can be used to display cluster
information.
# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         GPFS_CLUSTER1.aixlpar1p
  GPFS cluster id:           8831612751005471855
  GPFS UID domain:           GPFS_CLUSTER1.aixlpar1p
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
At this point, you can use the mmdsh command to verify that the SSH communication is working as
expected on all GPFS nodes. This runs a command on all the nodes in the cluster. If there is an
SSH configuration problem, this command highlights the issues.
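For example, a trivial command run through mmdsh should return from every node without a single password prompt:

# mmdsh date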
The mmcrnsd command is used to create NSD devices for GPFS. First, we create a text file that
contains a list of each of the hdisk names, their GPFS designation (data, metadata, both*), and the
NSD name.
hdisk1:::dataAndMetadata::nsd01::
hdisk2:::dataAndMetadata::nsd02::
hdisk3:::dataAndMetadata::nsd03::
hdisk4:::dataAndMetadata::nsd04::
hdisk5:::dataAndMetadata::nsd05::
hdisk6:::dataAndMetadata::nsd06::
*Note: Refer to the GPFS Concepts, Planning, and Installation document for guidance on
selecting NSD device usage types.
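For reference, the colon-separated fields in each descriptor line above are, under the traditional (pre-stanza) descriptor format that GPFS 3.5 still accepts:

# DiskName:PrimaryServer:BackupServer:DiskUsage:FailureGroup:DesiredName:StoragePool
hdisk1:::dataAndMetadata::nsd01::

The server fields are left empty here because the disks are SAN-attached and directly accessible from both nodes.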
Then, run the mmcrnsd command to create the NSD devices.
# mmcrnsd -F /tmp/cg/gpfs-disks.txt
mmcrnsd: Processing disk hdisk1
mmcrnsd: Processing disk hdisk2
mmcrnsd: Processing disk hdisk3
mmcrnsd: Processing disk hdisk4
mmcrnsd: Processing disk hdisk5
mmcrnsd: Processing disk hdisk6
mmcrnsd: Propagating the cluster configuration data to all
affected nodes. This is an asynchronous process.
The lspv command now shows the NSD name associated with each AIX hdisk.
# lspv
hdisk0          00c334b6af00e77b                    rootvg          active
hdisk1          none                                nsd01
hdisk2          none                                nsd02
hdisk3          none                                nsd03
hdisk4          none                                nsd04
hdisk5          none                                nsd05
hdisk6          none                                nsd06
The mmlsnsd command displays information for each NSD, in particular which GPFS file system is
associated with each device. At this point, we have not created a GPFS file system. So each disk
is currently free. You'll notice that under NSD servers each device is shown as directly attached.
This is expected for SAN-attached disks.
# mmlsnsd

 File system   Disk name    NSD servers
---------------------------------------------------------------------------
 (free disk)   nsd01        (directly attached)
 (free disk)   nsd02        (directly attached)
 (free disk)   nsd03        (directly attached)
 (free disk)   nsd04        (directly attached)
 (free disk)   nsd05        (directly attached)
 (free disk)   nsd06        (directly attached)
Next, the GPFS file systems are created with the mmcrfs command. The six NSDs are split
across two file systems: nsd01 to nsd03 for /gpfs (device gpfs0) and nsd04 to nsd06 for
/gpfs1 (device gpfs1). Each file system is created with a maximum of two metadata and two
data replicas (-M 2 -R 2); gpfs1 also uses a 1 MB block size (-B 1M).

# cat /tmp/cg/gpfs-disks.txt
nsd01:::dataAndMetadata:-1::system
nsd02:::dataAndMetadata:-1::system
nsd03:::dataAndMetadata:-1::system

# cat /tmp/cg/gpfs1-disks.txt
nsd04:::dataAndMetadata:-1::system
nsd05:::dataAndMetadata:-1::system
nsd06:::dataAndMetadata:-1::system

# mmcrfs /gpfs gpfs0 -F /tmp/cg/gpfs-disks.txt -M 2 -R 2
# mmcrfs /gpfs1 gpfs1 -F /tmp/cg/gpfs1-disks.txt -M 2 -R 2 -B 1M
The mmlsnsd command displays the NSD configuration per file system. NSD devices 1 to 3 are
assigned to the gpfs0 device and devices 4 to 6 are assigned to gpfs1.
# mmlsnsd

 File system   Disk name    NSD servers
---------------------------------------------------------------------------
 gpfs0         nsd01        (directly attached)
 gpfs0         nsd02        (directly attached)
 gpfs0         nsd03        (directly attached)
 gpfs1         nsd04        (directly attached)
 gpfs1         nsd05        (directly attached)
 gpfs1         nsd06        (directly attached)
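The file systems must be mounted on every node before they can be used. If they were not mounted automatically, the standard GPFS mount command mounts all GPFS file systems on all nodes; a sketch:

# mmmount all -a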
With the file systems mounted, the df command shows the new space (output trimmed to the
GPFS file systems; the rootvg file systems are omitted):

# df -g | grep gpfs
/dev/gpfs0     150.00    147.69    2%     4041     7% /gpfs
/dev/gpfs1     150.00    147.81    2%     4041     3% /gpfs1
The mmdsh command can be used here to quickly check the file system status on all the nodes.
aixlpar1 : / # mmdsh df -g | grep gpfs
aixlpar2:  /dev/gpfs0     150.00    147.69    2%     4041     7% /gpfs
aixlpar2:  /dev/gpfs1     150.00    147.81    2%     4041     3% /gpfs1
aixlpar1:  /dev/gpfs1     150.00    147.81    2%     4041     3% /gpfs1
aixlpar1:  /dev/gpfs0     150.00    147.69    2%     4041     7% /gpfs
The mmdf command reports disk and inode usage for each file system:

# mmdf gpfs0
disk                disk size  failure holds    holds           free KB             free KB
name                    in KB    group metadata data     in full blocks        in fragments
--------------- ------------- -------- -------- ----- -------------------- -------------------
Disks in storage pool: system (Maximum disk size allowed is 422 GB)
nsd01                     50G       -1 yes      yes       49.27G ( 99%)          872K ( 0%)
nsd02                     50G       -1 yes      yes       49.27G ( 99%)          936K ( 0%)
nsd03                     50G       -1 yes      yes       49.27G ( 99%)          696K ( 0%)
                -------------                         -------------------- -------------------
(pool total)             150G                             147.8G ( 99%)        2.445M ( 0%)

                =============                         ==================== ===================
(total)                  150G                             147.8G ( 99%)        2.445M ( 0%)

Inode Information
-----------------
Number of used inodes:            4040
Number of free inodes:           62008
Number of allocated inodes:      66048
Maximum number of inodes:        66048

The equivalent output for gpfs1 (per-disk lines omitted here) ends with:

# mmdf gpfs1
                =============                         ==================== ===================
(total)                  150G                             148.7G ( 99%)        4.812M ( 0%)

Inode Information
-----------------
Number of used inodes:            4040
Number of free inodes:          155704
Number of allocated inodes:     159744
Maximum number of inodes:       159744
You can use the mmgetstate command to view the status of the GPFS daemons on all the nodes in
the cluster.
# mmgetstate -aLs

 Node number  Node name   Quorum  Nodes up  Total nodes  GPFS state   Remarks
------------------------------------------------------------------------------------
      1       aixlpar2a     1*        2          2       active       quorum node
      2       aixlpar1a     1*        2          2       active       quorum node

 Summary information
---------------------
Number of nodes defined in the cluster:          2
Number of local nodes active in the cluster:     2
Number of remote nodes joined in this cluster:   0
Number of quorum nodes defined in the cluster:   2
Number of quorum nodes active in the cluster:    2
Quorum = 1*, Quorum achieved
Summary
Congratulations! You've just configured your first GPFS cluster. In this article, you've learnt how to
build a simple two-node GPFS cluster on AIX. This type of configuration can be easily deployed to
support clustered workloads with high availability requirements, for example, WebSphere MQ
multi-instance queue managers. GPFS offers many configuration options, and you can spend a lot
of time planning a GPFS cluster. If you are seriously considering a GPFS deployment, I encourage
you to read all of the available GPFS documentation listed in the Resources section of this article.
Resources
The following resources were referenced during the creation of this article.
IBM GPFS Wiki
IBM GPFS FAQ
IBM General Parallel File System (GPFS) 3.5
IBM General Parallel File System for Power Version 3.4
Setting up a multicluster environment using General Parallel File System
Testing and support statement for WebSphere MQ multi-instance queue managers
GPFS and TSM Backups
GPFS Backup questions (mmbackup)