
Architecture of DB2 LUW

DB2 LUW:
The DB2 LUW architecture is a three-model design:
1. Process Model
2. Memory Model
3. Storage Model

1. Process Model

Knowledge of the DB2 process model will help you to understand how the database manager and its associated
components interact, and this can help you to troubleshoot problems that might arise.
The process model that is used by all DB2 database servers facilitates communication between database servers and
clients. It also ensures that database applications are isolated from resources, such as database control blocks and
critical database files.
The DB2 database server must perform many different tasks, such as processing database application requests or
ensuring that log records are written out to disk. Each task is typically performed by a separate engine dispatchable
unit (EDU).
There are many advantages to using a multithreaded architecture for the DB2 database server. A new thread
requires less memory and fewer operating system resources than a process, because some operating system
resources can be shared among all threads within the same process. Moreover, on some platforms, the context
switch time for threads is less than that for processes, which can improve performance. Using a threaded model on
all platforms makes the DB2 database server easier to configure, because it is simpler to allocate more EDUs when
needed, and it is possible to dynamically allocate memory that must be shared by multiple EDUs.
For each database being accessed, separate EDUs are started to deal with various database tasks such as
prefetching, communication, and logging. Database agents are a special class of EDU that are created to handle
application requests for a database.
In general, you can rely on the DB2 database server to manage the set of EDUs. However, there are DB2 tools that
look at the EDUs. For example, you can use the db2pd command with the -edus option to list all EDU threads that
are active.
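As a sketch (these commands assume a configured DB2 instance in your shell environment and the appropriate monitoring authority; "SAMPLE" is a placeholder database name):

```shell
# List all active EDU threads for the instance
db2pd -edus

# List the agents working on behalf of applications for one database
db2pd -db SAMPLE -agents
```

The output of `-edus` shows each thread's EDU ID, name (for example db2agent or db2loggw), and CPU time, which is useful when correlating operating-system thread activity back to DB2 work.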
Each client application connection has a single coordinator agent that operates on a database. A coordinator
agent works on behalf of an application, and communicates to other agents using private memory, interprocess
communication (IPC), or remote communication protocols, as needed.
The DB2 architecture provides a firewall so that applications run in a different address space than the DB2 database
server. The firewall protects the database and the database manager from applications, stored procedures, and
user-defined functions (UDFs). The firewall maintains the integrity of the data in the databases, because it prevents
application programming errors from overwriting internal buffers or database manager files. The firewall also
improves reliability, because application errors cannot crash the database manager.
Database server threads and processes
The system controller (db2sysc on UNIX and db2syscs.exe on Windows operating systems) must exist if the database
server is to function. The following threads and processes carry out a variety of tasks:

- db2acd, an autonomic computing daemon that hosts the health monitor, automatic maintenance utilities, and the administrative task scheduler. This process was formerly known as db2hmon.
- db2aiothr, manages asynchronous I/O requests for a database partition (UNIX only)
- db2alarm, notifies EDUs when their requested timer has expired (UNIX only)
- db2cart, for archiving log files (when the userexit database configuration parameter is enabled)
- db2disp, the client connection concentrator dispatcher
- db2fcms, the fast communications manager sender daemon
- db2fcmr, the fast communications manager receiver daemon
- db2fmd, the fault monitor daemon
- db2fmtlg, for formatting log files (when the logretain database configuration parameter is enabled and the userexit database configuration parameter is disabled)
- db2licc, manages installed DB2 licenses
- db2panic, the panic agent, which handles urgent requests after agent limits have been reached
- db2pdbc, the parallel system controller, which handles parallel requests from remote database partitions (used only in a partitioned database environment)
- db2resync, the resync agent that scans the global resync list
- db2sysc, the main system controller EDU; it handles critical DB2 server events
- db2thcln, recycles resources when an EDU terminates (UNIX only)
- db2wdog, the watchdog on UNIX and Linux operating systems that handles abnormal terminations

2. Memory Model

DB2 divides and manages memory in four different memory sets:
- Instance shared memory
- Database shared memory
- Application group shared memory
- Agent private memory

Each memory set consists of various memory pools (also referred to as heaps).
db2mtrk - Memory tracker command
Provides complete report of memory status, for instances, databases, agents, and applications. This command
outputs the following memory pool allocation information:

Current size

Maximum size (hard limit)

Largest size (high water mark)

Type (identifier indicating function for which memory will be used)

Agent who allocated pool (only if the pool is private)

Application

The same information is also available from the Snapshot monitor.


>>db2mtrk -i -d -p -m -r interval count -v
-i: instance-level memory
-d: database-level memory
-p: private memory
-m: maximum value for each pool
-r: repeat mode
interval: number of seconds to wait between reports
count: number of times to repeat
-v: verbose output
Command Parameters
-i
Show instance level memory.
-d
Show database level memory.
-a
Show application memory usage.
-p
Deprecated. Show private memory.
Replaced with -a parameter to show application memory usage.
-m
Show maximum values for each pool.
-w
Show high watermark values for each pool.
-r
Repeat mode
interval
Number of seconds to wait between subsequent calls to the memory tracker (in repeat mode).
count
Number of times to repeat.
-v
Verbose output.
-h
Show help screen. If you specify -h, only the help screen appears. No other information is displayed.
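For illustration (these invocations require an active DB2 instance, and the reported sizes are environment-specific), two typical uses look like this:

```shell
# One-shot verbose report of instance- and database-level memory pools
db2mtrk -i -d -v

# Repeat mode: instance-level report every 30 seconds, 5 times,
# including the maximum (hard limit) for each pool
db2mtrk -i -m -r 30 5
```

The repeat mode is handy for watching whether a pool such as the buffer pool or sort heap grows toward its configured limit during a workload.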

Important dbm cfg parameters for the memory model:

- The number of databases within an instance is defined by the parameter NUMDB.
- The dbm cfg parameter that represents instance-level memory is INSTANCE_MEMORY.
- The db cfg parameter that represents database-level memory is DATABASE_MEMORY.
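As a sketch (assuming a running instance; "SAMPLE" is a placeholder database name), these parameters can be inspected and changed from the CLP:

```shell
# View the instance-level settings (dbm cfg)
db2 get dbm cfg | grep -E "NUMDB|INSTANCE_MEMORY"

# View the database-level setting (db cfg)
db2 get db cfg for SAMPLE | grep DATABASE_MEMORY

# Let DB2 manage instance memory automatically
db2 update dbm cfg using INSTANCE_MEMORY AUTOMATIC
</imports>```

Setting INSTANCE_MEMORY and DATABASE_MEMORY to AUTOMATIC hands tuning over to DB2's self-tuning memory manager, which is the usual starting point unless you have a specific reason to set hard limits.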

3. Storage model

When you are designing a new database in DB2 on Linux, UNIX, and Windows (DB2 LUW), one of the most important aspects of your design is the layout of your data. It is important to get this right the first time, because changing layouts is time consuming and difficult. There is a lot of good information on the web already, but I wanted to add some practical observations that Graham Murphy and I have made in recently implemented or redesigned systems. This article focuses on OLTP (On-line Transaction Processing) and reporting systems. Data warehouses and databases with heavy analytical use are beyond the scope of this document.

Since most of the main concepts are already covered very well by the IBM Database Storage DB2/LUW Best
Practices Guide, I highly recommend that you read it. In this article I will cover some more detailed
recommendations and some alternatives that may be helpful.

File Systems
One of the first items you need to consider is the number and types of file systems that you need. The reason you should work on this first is that it typically takes a while to get storage allocated, especially in large organizations where there is a separate storage management group. Formal requests need to be well considered by the omnipotent storage team before they can condescend to bestow disk space on the unwashed masses. Most organizations now get an allocation from the central SAN system. Storage is usually presented to servers in an object called a logical unit (LUN).
Before making that request it is a good idea to have a discussion with the storage team about how data is stored and allocated. In some organizations standard-sized LUNs are used, and in others custom-sized LUNs can be ordered. Here are recommendations for the different types of SAN storage.
Data on All Disks
In newer disk subsystems there seems to be a trend toward spreading the storage for LUNs across all disks in the physical disk device. An example is IBM's XIV storage, but other manufacturers are doing this too. This is the simplest case for you. If you are getting your LUNs from this type of system and you can get custom sizes, then request 5 LUNs for your database data. There is nothing really magical about this number, but it strikes a nice balance between ease of administration and spreading data. If you want a few more, that is fine, but don't go lower. If your organization issues storage in fixed sizes, then order enough LUNs for the amount of space you need. Finally, when the LUNs are presented to the operating system, it is good practice to create one file system on each LUN.
Data on Individual Arrays
On most other types of storage, LUNs are allocated from individual RAID arrays. There is a very good discussion of how to arrange this storage in the IBM Database Storage DB2/LUW Best Practices Guide, so I will not repeat it here. If possible, you should get one LUN from each RAID array and create one file system per LUN.
Unknown
In many organizations the DBAs and others will be deemed unworthy of knowing what is behind the curtain of the SAN and will not be told. In this case you just have to ask for enough LUNs to meet your needs and hope for the best. The good news is that this often does provide adequate performance for many small and medium-sized systems. Again, you should create one file system per LUN.
The IBM Database Storage DB2/LUW Best Practices Guide goes in depth on the types of RAID arrays to create for DB2, and I highly recommend that you read it. One important thing that I did not see there is how to create your file systems from LUNs. It is good if you can create one file system per LUN, but sometimes this is not practical for various reasons. If you find yourself in this situation, do not despair. Just remember, when you create the file system, to stripe it across the LUNs and NOT to concatenate them. If you concatenate the LUNs, then as data is added it is placed in only one LUN until that LUN fills, and only then does it move on to each subsequent LUN. This is very bad: it places the newest, and probably hottest, data on one or a few LUNs, creating a hotspot.
Tablespaces
One of the things that I've been hearing lately is that it is OK to put all of your tables into one or a very few tablespaces. This is simply NOT TRUE if you need high performance. A good rule of thumb is to put any table with more than about 5-10 MB of data into its own tablespace. Further, it is a good idea to put the indexes for these tables into individual tablespaces. That is, you would put all of the indexes for a larger table into a tablespace created solely for that table's indexes. You can place all of the smaller tables into one tablespace, and the indexes for all of those tables into another. Graham and I recently worked with a customer whose OLTP database was having performance problems and had all of its tables in a single tablespace. Once he broke all of the larger tables and their indexes out into their own tablespaces, performance improved dramatically. When he was done, the system had well over 100 tablespaces.
For almost all production data and index tablespaces you should use Large (not Regular) DMS storage. Regular
tablespaces may go away in future releases. Remember with DMS and Automatic Storage you can now specify a
start size and let the tablespace automatically extend as needed.
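As a minimal sketch (tablespace name, container paths, and sizes are all illustrative, and the statement assumes a 4K-page buffer pool is available as the default):

```shell
# Large DMS tablespace with a start size and automatic extension,
# with one container on each of two data file systems
db2 "CREATE LARGE TABLESPACE TS_ORDERS
       PAGESIZE 4K
       MANAGED BY DATABASE
       USING (FILE '/data1/db/ts_orders.dbf' 100 M,
              FILE '/data2/db/ts_orders.dbf' 100 M)
       AUTORESIZE YES INCREASESIZE 50 M MAXSIZE 10 G"
```

AUTORESIZE YES gives the start-small-and-grow behavior described above; MAXSIZE caps the growth so a runaway table cannot fill the file systems.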
If you have Large OBject (LOB) data in your database, you should design your tablespaces in one of two ways. If your LOBs are small enough to fit onto the data page with the other data and are frequently accessed, then you should store the LOBs inline. That means the LOB column is just part of the row on the data page, like all other columns, which saves I/O when accessing the LOB data. If the LOBs are large, they should be put into their own tablespaces using the LONG IN clause of the CREATE TABLE command.
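A sketch of both approaches in one table (the table, column, and tablespace names TS_DATA, TS_IDX, and TS_LOBS are placeholders you would have created beforehand):

```shell
# "summary" is kept inline on the data page (small, frequently read);
# "body" is a large LOB routed to its own tablespace via LONG IN
db2 "CREATE TABLE documents (
       id       INTEGER NOT NULL PRIMARY KEY,
       summary  CLOB(2K) INLINE LENGTH 2048,
       body     CLOB(100M)
     ) IN TS_DATA INDEX IN TS_IDX LONG IN TS_LOBS"
```

INLINE LENGTH must fit within the page size minus row overhead, so on a 4K page only genuinely small LOBs are candidates for inlining.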

Putting the Tablespaces on File Systems


You should create each tablespace across all data file systems on your server. That is, each tablespace should have one container (directory) on each data file system. Avoid putting tablespaces in your backup and transaction log file systems. I am aware of two recently redesigned systems that use SAN storage that stripes each LUN across all disks in the storage unit. For both of these databases, five data LUNs were created, with one file system placed on each LUN. Both systems have many tablespaces, every tablespace is striped across all 5 data file systems, and both perform well. Both are high-volume OLTP systems with significant reporting run against them too.

Striping all tablespaces across all file systems can be made easier with Automatic Storage. With automatic storage
you define the available file systems to the database and then DB2 takes care of placing each tablespace across
those file systems as they are created.
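As a sketch of the automatic storage approach (database name and the five storage paths are placeholders matching the five-file-system layout discussed above):

```shell
# Define the data file systems once, at database creation
db2 "CREATE DATABASE MYDB AUTOMATIC STORAGE YES
       ON /data1, /data2, /data3, /data4, /data5
       DBPATH ON /dbhome"

# Tablespaces created this way are spread across all storage paths
db2 "CREATE LARGE TABLESPACE TS_SALES MANAGED BY AUTOMATIC STORAGE"
```

Every automatic storage tablespace then gets a container on each path, so the striping happens without you listing containers in every CREATE TABLESPACE statement.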
Page Size
For OLTP databases use the 4K page size. End of discussion! When using LARGE tablespaces, as should always be done these days, 4K tablespaces can grow to 2 terabytes. For strictly reporting or Operational Data Store (ODS) databases, 8K or 16K pages might be more appropriate so that you get more rows per page. Compression may also improve the performance of reporting databases.

Extent and Prefetch Sizes


The IBM Database Storage DB2/LUW Best Practices Guide has a good description of extent size and provides a well-accepted formula for calculating it. However, there is an interesting alternative that is gaining acceptance in some quarters for high-volume OLTP databases: use a small extent size of 2 pages. If you need very high performance in your OLTP database, you may want to experiment with the traditional versus the small extent size and see which performs better for your workload. I would lean toward the traditional calculation for reporting and ODS workloads.
Again, the IBM Database Storage DB2/LUW Best Practices Guide has a good description of prefetch size and provides a well-accepted formula for calculating it. Graham has provided me with a formula that can give better performance in some cases. This alternative formula is:
PREFETCH = Nbr_File_Systems * Extent_Size * Nbr_Channels_to_Disk_System
Where:
- Nbr_File_Systems is the number of file systems under the tablespace
- Extent_Size is the extent size for the tablespace
- Nbr_Channels_to_Disk_System is the number of channels to the disk subsystem. Some HBA cards have multiple channels. The best way to get this figure is to ask the system administrator for the server.
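Plugging in some illustrative numbers (five file systems, a 32-page extent, and two channels; your values will differ):

```shell
# Alternative prefetch-size formula with hypothetical inputs
NBR_FILE_SYSTEMS=5   # file systems under the tablespace
EXTENT_SIZE=32       # extent size, in pages
NBR_CHANNELS=2       # channels to the disk subsystem
PREFETCH=$((NBR_FILE_SYSTEMS * EXTENT_SIZE * NBR_CHANNELS))
echo "PREFETCHSIZE = $PREFETCH pages"   # PREFETCHSIZE = 320 pages
```

The result would then be supplied as the PREFETCHSIZE value when creating or altering the tablespace.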
