
ISAM Appliance Clustering

SSH Tunnels
DATA REPLICATION AND COMMUNICATION ACCESS FOR CLUSTER SERVICES

Thomas Ermis, Nick Lloyd, Dave Hooks
IBM Security L2 Support

January 16, 2018


Agenda

• Appliance Cluster Pattern Overview

• Processes in a Cluster (What runs where)

• How is the data replicated?

• How are the services contacted?

• Load Balancing Best Practices

• What happens when the Primary Master is lost?

2 IBM Security
Appliance Clustering Pattern

• This Open Mic assumes an intermediate understanding of clustering. You should be familiar
with the following concepts:
- Master, Regular, and Restricted Nodes.
- The services provided in an appliance cluster: ISAM Runtime, Internal LDAP, SSL Keystores, GeoLocation
database, ConfigDB/HVDB, and Distributed Session Cache (DSC).

• We will be discussing the common dual data center pattern in which DC1 hosts the
Primary/Tertiary masters and DC2 hosts the Secondary/Quaternary masters.

• The internal postgres database is used for both the CONFIGDB and the HVDB in a small-volume
deployment. The best practice for large-scale production deployments is to use an external
database for both the CONFIGDB and HVDB.

• The embedded OpenLDAP is used as the ISAM Primary Registry, with an SDS 6.3.1 LDAP
federated in for Basic Users.

Dual data-center Cluster Pattern

DC1                                 DC2
Primary Master                      Secondary Master
Tertiary Master (DSC ONLY)          Quaternary Master (DSC ONLY)
Non-Restricted Node (Internal)      Non-Restricted Node (Internal)
Restricted Node (DMZ)               Restricted Node (DMZ)

Services in a Cluster (ISAM Runtime)

• Better known as the Policy Server (PDMGRD).

• The PDMGRD and read-write OpenLDAP processes are treated as one atomic unit.

• If PDMGRD is stopped, then OpenLDAP is also stopped. From now on we will just use the term
PDMGRD.

• The PDMGRD is only active on the Primary Master.

• At any given time there is only one PDMGRD running. There is no automatic failover. This is by
design to avoid ending up with conflicting ISAM Policy databases.

• The PDMGRD consists of roughly 12 files, depending on the setup. Updates are not pushed out
using SSH tunnels. The cluster manager monitors the files for changes and pushes out the
complete file to the cluster when a change is made.

• ISAM components communicate with PDMGRD (Policy Server only) directly over
ManagementIP:7135.
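
The monitor-and-push behaviour described above can be pictured with a small sketch. It is illustrative only: the cluster manager's internals are not shown in this presentation, so the function names (snapshot, changed_files) and the use of SHA-256 digests are assumptions, not the appliance's implementation.

```python
import hashlib
from pathlib import Path


def snapshot(directory):
    """Map each file name in `directory` to the SHA-256 digest of its content."""
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(directory).iterdir())
        if path.is_file()
    }


def changed_files(before, after):
    """Return the names that are new or whose digest differs between snapshots.

    Only the file name is needed because the whole file is pushed;
    no per-record delta is computed in this sketch.
    """
    return sorted(
        name for name, digest in after.items() if before.get(name) != digest
    )
```

Comparing two snapshots taken a few seconds apart yields the list of files that would be copied in full to the other nodes.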

Services in a Cluster (Internal OpenLDAP)

• As noted on the previous slide, the read-write LDAP is tied to the Primary Master along with the Policy
Server.

• The read-write is only active on the Primary Master.

• It runs on 127.0.0.1:389 for internal access only.

• It runs on ManagementIP:636 for external access.

• Only the read-write is accessible over 636.

• At any given time there is only one read-write LDAP. Remember, this is tied to the Policy Server.
By design of the ldap.conf file there is read-only HA to a local copy running on 127.0.0.1:390 on
each non-restricted node.

• A restricted node has an SSH Tunnel to the Secondary Master via 127.0.0.1:390.

• When a non-restricted node is added to the cluster it receives a filesystem copy of the necessary
files. After that, the local replica is kept in sync via an SSH Tunnel back to the Primary via
127.0.0.1:389.

• Components communicate to the Primary via an SSH Tunnel.

Services in a Cluster (SSL Certificate Keystores)

• While not really a process, these are considered a service and we do get asked about them.

• They can only be updated on the Primary.

• These are kept in sync via filesystem replication. There are no SSH Tunnels involved.

• All nodes regardless of type receive a copy.

Services in a Cluster (GeoLocation Database)

• While not really a process, this is considered a service and we do get asked about it.

• It can only be updated on the Primary.

• It is kept in sync via filesystem replication. There are no SSH Tunnels involved.

• All nodes regardless of type receive a copy.

Services in a Cluster (CONFIGDB)

• The read-write postgres runs on the Primary using 127.0.0.1:2020.

• Config can only be updated on the Primary.

• Every node with AAC/Federation activated runs a local copy in stand-by mode.

• It contacts the Primary with an SSH Tunnel of 2029:127.0.0.1:2020.

• Data replication is accomplished via SSH Tunnels.

• Read-only HA is accomplished by using this local copy.

• There is no automatic read-write promotion. A node must be promoted to be the new Primary.
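
The tunnel notation used on these slides (for example 2029:127.0.0.1:2020) has the same shape as an OpenSSH local port forward: local port, target host as seen from the far end, target port. A minimal sketch of splitting such a spec follows; the Forward type and parse_forward helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Forward(NamedTuple):
    local_port: int    # port opened on the node that owns the tunnel
    remote_host: str   # address as seen from the far end of the tunnel
    remote_port: int   # port of the service on the far end


def parse_forward(spec):
    """Parse 'local_port:remote_host:remote_port' into a Forward."""
    local_port, remote_host, remote_port = spec.split(":")
    return Forward(int(local_port), remote_host, int(remote_port))


configdb = parse_forward("2029:127.0.0.1:2020")
print(configdb)
```

Read this way, a client on the node connects to 127.0.0.1:2029 and the traffic emerges on the far node at 127.0.0.1:2020, where the CONFIGDB listens.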

Services in a Cluster (HVDB)

• The read-write postgres runs on the Primary using 127.0.0.1:2024.

• There is a read-only backup on the Secondary only. There are no copies on each node, unlike the
CONFIGDB. It contacts the Primary with an SSH Tunnel of 2033:127.0.0.1:2024.

• Data replication is accomplished via SSH Tunnels on different ports.

• HA is accomplished by two SSH Tunnels to the Primary and Secondary.
- Secondary -> 2033:127.0.0.1:2024 -> Primary
- Primary -> 2046:127.0.0.1:2024 -> Secondary

• Data replication is accomplished via SSH Tunnels between a process that syncs the two
databases.

• There is no automatic read-write promotion. The Secondary must be promoted to be the new
Primary.

Services in a Cluster (Distributed Session Cache)

• The DSC only runs on the Master Nodes using 127.0.0.1:2026.

• The read-write runs on the Primary.

• The others are hot standbys, ready to take over as read-write. Failover is automatic.

• Data replication is accomplished via an inter-DSC server replicator service. Each process listens
on its ManagementIP:2027. This is a common gotcha when setting up the DSC: because everything else
uses an SSH tunnel, often only port 22 is opened at firewalls. Don’t forget 2027 for the
Master Nodes!!!

• The failover daisy-chains from Primary to Secondary to Tertiary to Quaternary.

• Reverse Proxy instances will have the following set when the DSC is enabled:
- server = 9,http://127.0.0.1:2035/DSess/services/DSess -> Primary
- server = 9,http://127.0.0.1:2036/DSess/services/DSess -> Secondary
- server = 9,http://127.0.0.1:2037/DSess/services/DSess -> Tertiary
- server = 9,http://127.0.0.1:2038/DSess/services/DSess -> Quaternary

• The above are all SSH Tunnels to the Primary, Secondary, Tertiary, and Quaternary.
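
To make the port-to-master mapping concrete, here is a small sketch that reads a "server = ..." entry and reports which master its local tunnel port points at. The helper name parse_dsc_server is hypothetical, and the leading number (9) is carried through as an opaque value because the slides do not explain it.

```python
from urllib.parse import urlsplit

# Local tunnel port -> master role, as listed on the slide.
DSC_PORT_ROLES = {
    2035: "Primary",
    2036: "Secondary",
    2037: "Tertiary",
    2038: "Quaternary",
}


def parse_dsc_server(line):
    """Parse 'server = <n>,<url>' and return (n, port, role)."""
    key, _, value = line.partition("=")
    if key.strip() != "server":
        raise ValueError(f"not a server entry: {line!r}")
    leading_value, _, url = value.strip().partition(",")
    port = urlsplit(url).port
    return int(leading_value), port, DSC_PORT_ROLES.get(port, "unknown")


print(parse_dsc_server("server = 9,http://127.0.0.1:2036/DSess/services/DSess"))
```

Running this over a reverse proxy's entries is a quick way to confirm that all four masters are listed.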

Services in a Cluster (Primary Master)
ISAM Runtime:
Policy Server on ManagementIP:7135.
No automatic takeover. A node must be promoted to Primary.

LDAP:
Read-Write LDAP on 127.0.0.1:389 and ManagementIP:636.
No automatic takeover. A node must be promoted to Primary.

CONFIGDB:
Read-Write 127.0.0.1:2020
SSH Tunnel (2029:127.0.0.1:2020 -> Secondary) for Read-Only HA.

HVDB:
Read-Write on 127.0.0.1:2024.
SSH Tunnel (2046:127.0.0.1:2024 -> Secondary) for Read-Only HA.
SSH Tunnel (2034:127.0.0.1:2025 -> Secondary) for Data Replication.

DSC:
Read-Write DSC on 127.0.0.1:2026.
Inter-DSC server replicator service on ManagementIP:2027.
SSH Tunnel (2036:127.0.0.1:2026 -> Secondary) for HA.
SSH Tunnel (2037:127.0.0.1:2026 -> Tertiary) for HA.
SSH Tunnel (2038:127.0.0.1:2026 -> Quaternary) for HA.

Services in a Cluster (Secondary Master)
ISAM Runtime:
Replicated files. Can become Primary Master if promoted.

LDAP:
Read-Only LDAP on 127.0.0.1:390.
SSH Tunnel (389:127.0.0.1:389 -> Primary) for Read-Write and Data Replication.

CONFIGDB:
Read-Only 127.0.0.1:2020
SSH Tunnel (2029:127.0.0.1:2020 -> Secondary) for HA and Data Replication.

HVDB:
Read-Only on 127.0.0.1:2024.
SSH Tunnel (2033:127.0.0.1:2024 -> Primary) for Read-Only HA.
SSH Tunnel (2034:127.0.0.1:2025 -> Primary) for Data Replication.

DSC:
Hot Standby DSC on 127.0.0.1:2026.
Inter-DSC server replicator service on ManagementIP:2027.
SSH Tunnel (2035:127.0.0.1:2026 -> Primary) for HA.
SSH Tunnel (2037:127.0.0.1:2026 -> Tertiary) for HA.
SSH Tunnel (2038:127.0.0.1:2026 -> Quaternary) for HA.

Services in a Cluster (Tertiary Master)
ISAM Runtime:
Replicated files. Can become Primary Master if promoted, but the HVDB will be lost.

LDAP:
Read-Only LDAP on 127.0.0.1:390.
SSH Tunnel (389:127.0.0.1:389 -> Primary) for Read-Write and Data Replication.

CONFIGDB:
AAC/Federation is not activated so no services.

HVDB:
AAC/Federation is not activated so no services.

DSC:
Hot Standby DSC on 127.0.0.1:2026.
Inter-DSC server replicator service on ManagementIP:2027.
SSH Tunnel (2035:127.0.0.1:2026 -> Primary) for HA.
SSH Tunnel (2036:127.0.0.1:2026 -> Secondary) for HA.
SSH Tunnel (2038:127.0.0.1:2026 -> Quaternary) for HA.

Services in a Cluster (Quaternary Master)
ISAM Runtime:
Replicated files. Can become Primary Master if promoted, but the HVDB will be lost.

LDAP:
Read-Only LDAP on 127.0.0.1:390.
SSH Tunnel (389:127.0.0.1:389 -> Primary) for Read-Write and Data Replication.

CONFIGDB:
AAC/Federation is not activated so no services.

HVDB:
AAC/Federation is not activated so no services.

DSC:
Hot Standby DSC on 127.0.0.1:2026.
Inter-DSC server replicator service on ManagementIP:2027.
SSH Tunnel (2035:127.0.0.1:2026 -> Primary) for HA.
SSH Tunnel (2036:127.0.0.1:2026 -> Secondary) for HA.
SSH Tunnel (2037:127.0.0.1:2026 -> Tertiary) for HA.

Services in a Cluster (Restricted Node)
ISAM Runtime:
Replicated files. Cannot become Primary Master because a restricted node cannot be promoted.

LDAP:
SSH Tunnel (389:127.0.0.1:389 -> Primary) for Read-Write and Data Replication.
SSH Tunnel (390:127.0.0.1:390 -> Secondary) for Read-Only HA.

CONFIGDB:
Read-Only 127.0.0.1:2020
SSH Tunnel (2029:127.0.0.1:2020 -> Secondary) for HA and Data Replication.

HVDB:
SSH Tunnel (2033:127.0.0.1:2024 -> Primary) for Read-Write.
SSH Tunnel (2046:127.0.0.1:2024 -> Secondary) for Read-Only HA.

DSC:
SSH Tunnel (2035:127.0.0.1:2026 -> Primary) for HA.
SSH Tunnel (2036:127.0.0.1:2026 -> Secondary) for HA.
SSH Tunnel (2037:127.0.0.1:2026 -> Tertiary) for HA.
SSH Tunnel (2038:127.0.0.1:2026 -> Quaternary) for HA.

Services in a Cluster (Non-Restricted Node)
ISAM Runtime:
Replicated files. Can become Primary Master if promoted.

LDAP:
Read-Only LDAP on 127.0.0.1:390.
SSH Tunnel (389:127.0.0.1:389 -> Primary) for Read-Write and Data Replication.

CONFIGDB:
Read-Only 127.0.0.1:2020.
SSH Tunnel (2029:127.0.0.1:2020 -> Secondary) for HA and Data Replication.

HVDB:
SSH Tunnel (2033:127.0.0.1:2024 -> Primary) for Read-Write.
SSH Tunnel (2046:127.0.0.1:2024 -> Secondary) for Read-Only HA.

DSC:
SSH Tunnel (2035:127.0.0.1:2026 -> Primary) for HA.
SSH Tunnel (2036:127.0.0.1:2026 -> Secondary) for HA.
SSH Tunnel (2037:127.0.0.1:2026 -> Tertiary) for HA.
SSH Tunnel (2038:127.0.0.1:2026 -> Quaternary) for HA.
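
The per-node tables above boil down to a short firewall checklist. The sketch below is one way to express it, under stated assumptions: the role names and the required_ports helper are hypothetical, and only three ports are modelled (22 for the SSH tunnels, 2027 for the inter-DSC replicator between masters, 7135 for direct Policy Server access on the Primary). LDAP port 636 is left out because the slides describe it as external access rather than inter-node traffic.

```python
MASTERS = {"primary", "secondary", "tertiary", "quaternary"}


def required_ports(src_role, dst_role):
    """Return the TCP ports src must reach on dst's management IP (sketch)."""
    ports = {22}                      # every tunnel in the tables rides on SSH
    if src_role in MASTERS and dst_role in MASTERS:
        ports.add(2027)               # inter-DSC replicator is not tunnelled
    if dst_role == "primary":
        ports.add(7135)               # Policy Server is contacted directly
    return sorted(ports)


print(required_ports("secondary", "primary"))
print(required_ports("restricted", "secondary"))
```

Treat the result as a starting point for a firewall review, not as a complete rule set.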

Load Balancing Best Practices
How can I minimize traffic flow between data centers?

ISAM Runtime
There is no load-balancing because there is only ever one active Policy Server.
Reverse Proxy communication is minimal: it contacts the Policy Server at startup and when receiving updates.
Java Applications which manage users should be using the RgyDirect API.
Java Applications which manage policy, ACLs, etc. should run in the data center with the active
Policy Server.

LDAP
For read-write there is no way to do this. All write operations will go to the Primary Master.
For read-only it is possible to update the ldap.conf and set:

replica = 127.0.0.1,390,readonly,9

Now, read-only operations will use the local replica and not leave the system.
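
For readers who script checks against ldap.conf, the replica entry can be split into its four comma-separated fields. This is a sketch only; the field names (host, port, kind, preference) and the parse_replica helper are labels chosen here for illustration, so check the product documentation for the authoritative meaning of each field.

```python
from typing import NamedTuple


class Replica(NamedTuple):
    host: str
    port: int
    kind: str        # "readonly" in the slide's example
    preference: int  # last field of the entry


def parse_replica(line):
    """Parse 'replica = host,port,kind,preference' into a Replica."""
    key, _, value = line.partition("=")
    if key.strip() != "replica":
        raise ValueError(f"not a replica entry: {line!r}")
    host, port, kind, preference = [field.strip() for field in value.split(",")]
    return Replica(host, int(port), kind, int(preference))


local = parse_replica("replica = 127.0.0.1,390,readonly,9")
print(local)
```

A host of 127.0.0.1 and a port of 390 confirm that read-only traffic stays on the local replica.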

CONFIGDB/HVDB
There is no load-balancing because there is only ever one active read-write.
The SSH Tunnels as described will be used. There is no way to change this.
Use an external DB replicated between the data centers to minimize traffic.

Load Balancing Best Practices (DSC)
DSC
Can I mess around with these settings?

server = 9,http://127.0.0.1:2035/DSess/services/DSess
server = 9,http://127.0.0.1:2036/DSess/services/DSess
server = 9,http://127.0.0.1:2037/DSess/services/DSess
server = 9,http://127.0.0.1:2038/DSess/services/DSess

Not really but sort of. There is no load-balancing because there is only ever one Active DSC Master.

Now, there is a health check that happens to 2036, 2037, 2038 to make sure the hot standby is up.
You could do something like this for a reverse proxy in DC2:

server = 9,http://127.0.0.1:2035/DSess/services/DSess (Primary)
server = 9,http://127.0.0.1:2036/DSess/services/DSess (Secondary if whole DC1 goes down)
# server = 9,http://127.0.0.1:2037/DSess/services/DSess (Why bother because if DC1 is down…)
server = 9,http://127.0.0.1:2038/DSess/services/DSess

This will minimize traffic from a DC2 Reverse Proxy back to DC1 but Support does not recommend
this.

What Happens When The Primary Master Fails?
Should the Primary fail, or DC1 be lost, the following happens:

The Secondary Master DSC stops receiving data on the Inter-DSC Replicator, can still ping the ERE
(External Reference Entity), and switches to active mode.

The connection to http://127.0.0.1:2035/DSess/services/DSess fails, so the connection to
http://127.0.0.1:2036/DSess/services/DSess is now used. The health check reports that it is now
running in active mode.
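
The switch from 2035 to 2036 can be pictured as walking the configured list and taking the first server whose health check reports active mode. The sketch below assumes exactly that simple ordering; pick_dsc_server and the is_active callback are hypothetical stand-ins, not the reverse proxy's actual logic.

```python
DSC_SERVERS = [
    "http://127.0.0.1:2035/DSess/services/DSess",  # Primary
    "http://127.0.0.1:2036/DSess/services/DSess",  # Secondary
    "http://127.0.0.1:2037/DSess/services/DSess",  # Tertiary
    "http://127.0.0.1:2038/DSess/services/DSess",  # Quaternary
]


def pick_dsc_server(servers, is_active):
    """Return the first URL, in configured order, that reports active mode."""
    for url in servers:
        if is_active(url):
            return url
    return None


# Simulate the loss of DC1: Primary and Tertiary are gone, Secondary is active.
reachable_and_active = {DSC_SERVERS[1]}
print(pick_dsc_server(DSC_SERVERS, lambda url: url in reachable_and_active))
```

With the Primary unreachable and the Secondary promoted by the DSC failover, the sketch selects the 2036 tunnel, which matches the behaviour described above.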

Restricted nodes contact LDAP over 390:127.0.0.1:390 -> Secondary.

Non-restricted nodes use their local copy on 127.0.0.1:390.

The CONFIGDB and HVDB are used on the Secondary in read-only mode.

Impacts:

Current users with sessions are not affected.

Current users with sessions and current AAC policy, such as an already registered device, are not affected.

New users that just need to log in will succeed.

New users that require a write operation to the internal LDAP, such as an imported service account expired password flow, will fail.

New or current users who hit policy that requires a write operation to the HVDB, such as device registration, will fail.
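
The impacts above follow one rule: a request succeeds if every service it must write to is still writable. A minimal sketch of that rule, with hypothetical names (WRITABLE_AFTER_PRIMARY_LOSS, request_succeeds) and the simplifying assumption that reads are always served by a surviving copy:

```python
# Write availability while the Primary Master is down and no node has been
# promoted yet, as described on the slides.
WRITABLE_AFTER_PRIMARY_LOSS = {
    "dsc": True,        # fails over automatically to the Secondary
    "ldap": False,      # only read-only replicas remain
    "configdb": False,  # read-only copy
    "hvdb": False,      # read-only copy on the Secondary
    "policy": False,    # Policy Server has no automatic failover
}


def request_succeeds(writes_to):
    """True if every service the request must write to is still writable."""
    return all(WRITABLE_AFTER_PRIMARY_LOSS[service] for service in writes_to)


print(request_succeeds(["dsc"]))          # plain login: new session only
print(request_succeeds(["dsc", "hvdb"]))  # device registration
print(request_succeeds(["ldap"]))         # expired-password flow
```

Promoting a new Primary restores the write paths that are marked False here.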

Things to check before opening a Case:

• Some of the settings such as LDAP:389 and DSC:2035 are not using SSL. Is this a security
exposure? No, because when the traffic goes across the network it is encrypted via the SSH
Tunnel.

• We have a node that is just not behaving. It will not sync, the SSH Tunnels are missing, etc. What
should we collect for a Case? Sometimes the best thing to do is remove the node from the cluster
and then add it back. No configuration on the node is lost other than that related to clustering.

• Make sure Denial of Service attack checking is not enabled on the firewalls and routers between
nodes. As you can see, there can be quite a lot of traffic flowing over port 22.

• The DSC is not working correctly. Did you remember to open up Inter-DSC Replicator port 2027?
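
A quick way to rule out the firewall items on this list is a plain TCP connect test from one node's network segment to another node's management IP. The sketch below uses only the Python standard library; port_open is a hypothetical helper name, and a successful connect only shows that the port is reachable, not that the service behind it is healthy.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and "no route to host".
        return False
```

For example, calling port_open with a master's management IP and port 2027, and again with port 22, covers the two ports most often blocked between masters.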

ISAM Clustering Error Messages In A Support File

The Cluster Manager log is located at /var/isam/clustermgr/log/msg__cluster_mgr.log. Check for errors such as:

Primary is unable to contact a node:

2018-01-14-08:27:13.010-06:00I----- 0x38A70036 WebSEAL-Mgmt-API ERROR wga Common AMWASSHTunnel.cpp 611 0x7fe7de35e700
WGAWA0054E An error occurred while executing the system call: /usr/sbin/ssh 192.168.254.80 mesa_config isam.cluster validate schema 192.168.254.70 5 (0xff) ssh: connect to host 192.168.254.80 port 22: No route to host

The DSCD log is located at /var/dsc/log/dscd.log. Check for errors such as:

Primary is unable to contact a node:

2018-01-14-15:22:47.001-06:00I----- 0x38A0A302 /opt/dsc/bin/dscd ERROR wds admin AMWSMSSocket.cpp 1262 0x7f3a90545700
DPWDS0770E Function call, gsk_secure_soc_read, failed error: 000001a4 GSK_ERROR_SOCKET_CLOSED.

2018-01-14-15:22:47.003-06:00I----- 0x38A0A29C /opt/dsc/bin/dscd ERROR wds server AMWSMSReplicator.cpp 367 0x7f3a90545700
DPWDS0668E Unable to read replicator protocol header from the remote node
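
When a support file contains many such entries, a short script can pull out the fields that matter for triage. The sketch below is based only on the sample lines shown here; the regular expressions and the field names (timestamp, code, program, severity, source, line, message_id) are assumptions about the layout, not a documented log format.

```python
import re

# Header line: timestamp, hex code, program, severity, ..., source file, line.
HEADER = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2}-\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{2}:\d{2})"
    r"I-----\s+(?P<code>0x[0-9A-Fa-f]+)\s+(?P<program>\S+)\s+(?P<severity>[A-Z]+)\s+"
    r".*?(?P<source>\S+\.cpp)\s+(?P<line>\d+)\s"
)
# Message IDs in the samples: five letters, four digits, one severity letter.
MESSAGE_ID = re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b")


def parse_entry(header_line, message_line):
    """Combine a header line and its message line into one dict, or None."""
    match = HEADER.match(header_line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["line"] = int(entry["line"])
    ids = MESSAGE_ID.findall(message_line)
    entry["message_id"] = ids[0] if ids else None
    return entry
```

Grouping the parsed entries by message_id makes it easy to see whether a node is failing on SSH connectivity (WGAWA0054E) or on DSC replication (DPWDS0770E, DPWDS0668E).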

Clustering Resources
Open Mic: ISAM 9 Clustering, High Availability and Disaster Recovery: Part 1

https://www.youtube.com/watch?v=8wxDI8vv53o&index=17&list=PLFip581NcL2W4ZJETwsIdcOtCIGNy9yP_

Cluster Support in Knowledge Center

https://www.ibm.com/support/knowledgecenter/en/SSPREK_9.0.4/com.ibm.isam.doc/admin/concept/alps_cluster_support_container.html

But wait, there’s more…

Is it possible to gather all this information?

Yes, we are happy to announce the creation of the ISAM Support GitHub Repository at:

https://github.com/IBM-Security/isam-support

This is a collection of diagnostic tools, configuration examples, and best practices which we feel will
benefit ISAM admins for self-help.

Having some cluster issues and want to check the health of the processes and tunnels?

The next slide shows you how to use the script used to create this presentation.

Example Usage
[nlloyd@isam case-data]$ git clone https://github.com/IBM-Security/isam-support
Initialized empty Git repository in /home/nlloyd/case-data/isam-support/.git/
remote: Counting objects: 25, done.
remote: Compressing objects: 100% (11/11), done.
remote: Total 25 (delta 1), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (25/25), done.

[nlloyd@isam case-data]$ cd isam-support/diagnostic/appliance-cluster/

[nlloyd@isam appliance-cluster]$ chmod +x health-check.sh
[nlloyd@isam appliance-cluster]$ ./health-check.sh /home/nlloyd/case-data/TS000012345/support/
================================================================================
SYSTEM INFO
================================================================================
HostName="isam9020.level2.org"
9.0.2.1
20170116-1957

================================================================================
ALL SSH TUNNELS
================================================================================
/usr/bin/ssh -L 2036:127.0.0.1:2026 cluster@192.168.254.80
/usr/bin/ssh -L 2037:127.0.0.1:2026 cluster@192.168.254.90
/usr/bin/ssh -L 2038:127.0.0.1:2026 cluster@192.168.254.100
/usr/bin/ssh -L 2034:127.0.0.1:2025 cluster@192.168.254.80
/usr/bin/ssh -L 2046:127.0.0.1:2024 cluster@192.168.254.80
/usr/bin/ssh -L 2030:127.0.0.1:2021 cluster@192.168.254.80
/usr/bin/ssh -L 2029:127.0.0.1:2020 cluster@192.168.254.80

================================================================================
DSC TUNNELS
================================================================================
/usr/bin/ssh -L 2036:127.0.0.1:2026 cluster@192.168.254.80
/usr/bin/ssh -L 2037:127.0.0.1:2026 cluster@192.168.254.90
/usr/bin/ssh -L 2038:127.0.0.1:2026 cluster@192.168.254.100
….
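
The tunnel listing above is easier to review when grouped by destination node. The following sketch parses lines of the form shown in the report; the regular expression and the tunnels_by_destination helper are assumptions based on this sample output only, and duplicates are collapsed because the same tunnel appears in more than one section of the report.

```python
import re
from collections import defaultdict

# Matches e.g. "/usr/bin/ssh -L 2036:127.0.0.1:2026 cluster@192.168.254.80"
TUNNEL = re.compile(r"ssh -L (\d+):([\d.]+):(\d+) (\w+)@([\d.]+)")


def tunnels_by_destination(report):
    """Map each destination IP to a sorted list of (local_port, remote_port)."""
    grouped = defaultdict(set)
    for local, _host, remote, _user, dest in TUNNEL.findall(report):
        grouped[dest].add((int(local), int(remote)))
    return {dest: sorted(pairs) for dest, pairs in grouped.items()}
```

A master that is missing from the result, or a destination with fewer tunnels than the per-node tables predict, is a good place to start troubleshooting.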

QUESTIONS?

IBM Security Learning Academy
www.SecurityLearningAcademy.com

New content published daily! Learning at no cost!

Learning Videos ● Hands-on Labs ● Live Events

THANK YOU
FOLLOW US ON:

ibm.com/security

securityintelligence.com
xforce.ibmcloud.com

@ibmsecurity

youtube/user/ibmsecuritysolutions

© Copyright IBM Corporation 2018. All rights reserved. The information contained in these materials is provided for informational purposes only, and is provided AS IS without warranty of any kind, express
or implied. IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, these materials. Nothing contained in these materials is intended to, nor shall have the effect of,
creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software. References in
these materials to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and / or capabilities referenced in these materials
may change at any time at IBM’s sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way. IBM, the IBM logo,
and other IBM products and services are trademarks of the International Business Machines Corporation, in the United States, other countries or both. Other company, product, or service names may be
trademarks or service marks of others.

Statement of Good Security Practices: IT system security involves protecting systems and information through prevention, detection and response to improper access from within and outside your
enterprise. Improper access can result in information being altered, destroyed, misappropriated or misused or can result in damage to or misuse of your systems, including for use in attacks on others. No IT
system or product should be considered completely secure and no single product, service or security measure can be completely effective in preventing improper use or access. IBM systems, products and
services are designed to be part of a lawful, comprehensive security approach, which will necessarily involve additional operational procedures, and may require other systems, products or services to be most
effective.

IBM DOES NOT WARRANT THAT ANY SYSTEMS, PRODUCTS OR SERVICES ARE IMMUNE FROM, OR WILL MAKE YOUR ENTERPRISE IMMUNE FROM, THE MALICIOUS OR ILLEGAL CONDUCT
OF ANY PARTY.
