
The Inevitable Unicode Project

CONTENTS
The Inevitable Unicode Project
What is Unicode?
Why should I adopt Unicode?
Planning System Resources for Unicode System
What should you know about Endians in Unicode conversion?
What resources do I need to convert my non-Unicode system to Unicode?
Converting MDMP systems to Unicode means more complexity?
Unicode conversion project phases
   A. Planning
   B. Execution
   C. Finalization
   I) Unicode Preparation steps
   II) Unicode conversion
   III) Post-conversion steps
Summary
About the Author

Tikkana Akurati, UnicodeSAP.com
Upgrade & Unicode Specialist

The Inevitable Unicode Project Page 1 of 11


What is Unicode?
The following description from Unicode.org best describes what Unicode is.
“Computers store letters and other characters by assigning a number for each one. Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.”

The old standard code pages have the following disadvantages:
• They cover only a subset of all characters.
• Different code pages are incompatible with each other.
• Data exchange between code pages is restricted.
• There are simply too many code pages.
Unicode can support all the languages in the world in one code page! Its Basic Multilingual Plane holds about 65,000 characters, and the remaining planes leave room for roughly 1 million more.

Why should I adopt Unicode?
Whether you like it or not, Unicode is here to stay, and you will be drawn to adopt it for the following reasons and situations:
• From ECC 6.0 (NetWeaver 7.0) onward, new installations are only possible with Unicode.
• If your company supports multiple regions of the world, you are most likely using MDMP in your SAP system. MDMP systems deploy more than one system code page on the application server. This method has an inherent flaw: data can become corrupt if a user logs on in one language and tries to work with data in another language. Multiple Display/Multiple Processing (MDMP) support is dropped by SAP starting with NW2004s.
• Many companies are adopting Service Oriented Architecture standards, providing Web Services to enable global interoperability. These standards require Unicode.
• SAP provides full Unicode support starting from Web Application Server (Web AS) 6.20. Current releases of NetWeaver and mySAP Business Suite run Unicode-enabled.
• Unicode is mandatory for SAP systems deploying Java applications.
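
The MDMP corruption risk mentioned in the list above can be reproduced outside SAP. The following Python sketch is an illustration only, not SAP code: it assumes two single-byte code pages, ISO 8859-1 (Western European) and ISO 8859-5 (Cyrillic), as stand-ins for the code pages of an MDMP system, and shows the same stored bytes being read back under the wrong code page.

```python
# Illustration only: two single-byte code pages stand in for an MDMP setup.
# The choice of ISO 8859-1 and ISO 8859-5 is an assumption for this sketch.
WESTERN = "iso8859_1"
CYRILLIC = "iso8859_5"

text = "Müller"                  # entered by a user logged on in German
stored = text.encode(WESTERN)    # the bytes as they sit in the database

# A user logged on in Russian reads the same bytes with the Cyrillic code page.
misread = stored.decode(CYRILLIC)

# With Unicode, one encoding covers both scripts and the round trip is lossless.
mixed = "Müller / Москва"
roundtrip = mixed.encode("utf-8").decode("utf-8")

print(repr(text), "->", repr(misread))
print(roundtrip == mixed)
```

The bytes never change; only their interpretation does. If the second user saves the record, the misread characters become the stored data, which is the kind of corruption a single Unicode code page avoids.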



Planning System Resources for Unicode System
Plan for the increased computing resources your system needs to run as a Unicode-converted system. I put together a few figures below to give a visual sense of the increased demand on resources.
• CPU: Increase CPU power by up to 30% due to double-byte handling in Unicode.

Figure 1: Increased CPU requirement for Unicode systems

• Memory: Increase memory by 50%, as Unicode SAP application servers are based on UTF-16 internally.

Figure 2: Increased memory requirement for Unicode systems

• DB Size: Increase database size by 10% to 30%, depending on which code page representation is used by the database.

Figure 3: Increased disk size requirement for databases in Unicode systems

This is because, in a Unicode system, one character is not always equal to 1 byte. For example, in UTF-16 (the 16-bit Unicode Transformation Format), 1 character = 2 bytes (4 bytes for supplementary characters stored as surrogate pairs), and in UTF-8, 1 character = 1 to 4 bytes.

Note: In my experience, even though it is recommended to plan for the increased database size, the demand for additional disk space for the database is not seen immediately during the conversion. In fact, there is usually a reduction of the database size of between 10% and 30%, which can be attributed to the implicit database reorganization during the export / import process and better use of Oracle extents in the new, reduced tablespaces.

• Network: No impact on the network, due to efficient compression techniques employed by SAP between DB and application servers.
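
The character-to-byte relationship behind the database size estimate can be checked directly. The Python sketch below is only an illustration: the sample characters are arbitrary choices, and the helper name `encoded_sizes` is made up for this example. It prints each character's Unicode number (code point) and the bytes it occupies in UTF-8 and UTF-16.

```python
# Sketch: how many bytes the same characters need in different encodings.
# The sample characters are arbitrary choices for illustration.
samples = ["A", "ü", "中", "😀"]

def encoded_sizes(ch: str) -> dict:
    """Return the code point and the byte length of ch in UTF-8 and UTF-16."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "utf8_bytes": len(ch.encode("utf-8")),
        # utf-16-le is used so that no byte order mark is counted.
        "utf16_bytes": len(ch.encode("utf-16-le")),
    }

for ch in samples:
    print(ch, encoded_sizes(ch))
```

Plain ASCII text doubles in size in UTF-16 but stays at one byte per character in UTF-8, which is one reason the growth depends on the representation the database uses.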



What should you know about Endians in Unicode conversion?
When converting to Unicode, the export code page must correspond to the Endianness of the target system. But what is an Endian, and how many Endians are out there? Fortunately there are only two. The word “Endian” has to do with which end of a multi-byte number comes first in memory. If the most significant byte (MSB), the big “end”, is stored at the lowest address and the least significant byte at the highest address, you are dealing with Big Endian. Little Endian is the opposite arrangement: the least significant byte of the number is stored at the lowest address and the most significant byte at the highest address. For example, 0xAABBCC would be stored as follows:

Address   Big Endian   Little Endian
2         CC           AA
1         BB           BB
0         AA           CC

The export code page should match the Endianness of the machine that runs R3load to import data.

Here is a table that helps with which code page to use:

Code page   Endian type     Machine Architecture
4103        Little Endian   Alpha, Intel X86 (and clones), X86_64, Itanium (Windows + Linux), Solaris_X86_64
4102        Big Endian      IBM 390, AS/400, PowerPC (AIX), Linux on zSeries (S/390), Linux on Power, Solaris_SPARC, HP PA-RISC, Itanium (HP-UX)

What resources do I need to convert my non-Unicode system to Unicode?
Unicode conversion projects need thorough planning and meticulous execution. A good project plan can help you immensely in the long run. To convert a large system, you need at least one more system that is similar to your non-Unicode system. Your non-Unicode system is usually called the Source and the other system is called the Target. You convert and export your source system to dump files on a shared disk space and import them into the target system. The export and import can run in parallel if you use Unicode conversion tools like Migration Monitor (MigMon) and/or Distribution Monitor (DistMon) to speed up the process. Strong experience using these tools is essential for a successful conversion.

Here is a quick list of the essential resources you need:
1) Additional system for preparing an empty target Unicode system. (You will import your Unicode converted data into this system.)
2) Additional disk space for the common export area and for running language scans for a source MDMP system.
3) ABAP team to convert your non-Unicode programs to Unicode compatibility.
4) Language team to map the vocabulary in different languages in your system to the proper languages/code pages.
5) Testing team to test interfaces and/or perform scenario-based verifications.
6) A dedicated Unicode conversion specialist. (If you are going to use CU&UC, Combined Upgrade and Unicode Conversion, then knowledge of upgrades is also essential.)

Converting MDMP systems to Unicode means more complexity?
Unicode preparation of MDMP systems has the additional steps of scanning the database for vocabulary that needs to be mapped to different languages. These scans can take a long time depending on the size and nature of your MDMP system. The assignment of languages to the collected vocabulary can be done through automated methods and manual effort. SAP provides tools like SPUMG / SPUM4 and SUMG for such vocabulary work. Vocabulary mapping is often an underestimated task in MDMP-to-Unicode projects. To give an idea of how long this manual vocabulary work may take, here is an example.

In one MDMP-to-Unicode conversion project, the vocabulary-related resource numbers were as follows:

Database size: 2 TB
Vocabulary collected: 183,475 unique words
Vocabulary team size: 10
Time taken to map all vocabulary: 4 weeks

Figure 4: Screenshot of vocabulary screen from SPUMG transaction in an MDMP Unicode conversion project.
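
The figures from the MDMP example project can be turned into a rough throughput number for planning. The short Python calculation below is only a back-of-the-envelope sketch: it assumes a 5-day working week and an even spread of work across the team, neither of which is stated in the project data, and the function names are made up for illustration.

```python
# Back-of-the-envelope estimate from the example project figures.
# Assumption (not from the project data): 5 working days per week,
# with the work spread evenly across the team.
WORKING_DAYS_PER_WEEK = 5

def words_per_person_day(words: int, team_size: int, weeks: float) -> float:
    """Average number of words one person mapped per working day."""
    person_days = team_size * weeks * WORKING_DAYS_PER_WEEK
    return words / person_days

def weeks_needed(words: int, team_size: int, rate: float) -> float:
    """Weeks a team would need at a given per-person daily rate."""
    return words / (team_size * rate * WORKING_DAYS_PER_WEEK)

rate = words_per_person_day(183_475, team_size=10, weeks=4)
print(f"{rate:.0f} words per person per day")

# Hypothetical what-if: the same vocabulary with a team of 5.
print(f"{weeks_needed(183_475, team_size=5, rate=rate):.1f} weeks")
```

Real mapping speed varies with the languages involved and with how much of the vocabulary the automated methods resolve, so treat the result as a starting point for sizing the language team rather than a benchmark.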

Unicode conversion project phases

The project mainly consists of three phases:
A. Planning
B. Execution or Unicode conversion
C. Finalization

A. Planning
In this phase, you start building your project team, check the prerequisites for Unicode conversion, gather requirements for system resources, plan the downtime for conversion and start enabling customer developments for Unicode compatibility.

B. Execution
This is the phase where the conversion happens and where the export / import of your non-Unicode system occurs during a downtime.
The project phase “B. Execution of Unicode conversion”, also referred to as the “Realization” phase, is the longest phase in the project. There are three main activities that should happen in this phase:
I. Unicode preparation steps
   o Vocabulary fixes
   o ABAP program fixes
   o 3rd party tools and interfaces fixes
II. Unicode conversion
III. Post-conversion steps

C. Finalization
During this phase your Unicode system is given to teams for functional and technical testing and validation. The testing can involve custom testing scripts focusing on different languages and the vocabulary that exists in your system.
Test and re-test. You should validate your system with respect to the ABAP changes that happened during the project and all interfaces that connect to your Unicode system. A “scenario based” verification methodology can be employed to validate data consistency.

I) Unicode Preparation steps
Activities in this phase are performed on the source system. There are three major activities in this phase that take most of the time.
• Vocabulary fixes: You should collect the vocabulary in your system that needs its code page determined. The language/vocabulary scans are performed in this phase and can take up considerable computer time to run. See Figure 5 for the scans that you will need to go through in transaction SPUMG (or SPUM4 if your source system is 4.6C MDMP). Once the vocabulary is scanned, the vocabulary work is performed by your language/vocabulary team in this phase.
Basis Tip: Plan for extra space to increase the PSAPTEMP and PSAPUNDO tablespaces so that the scans complete without failure.

• ABAP program fixes: You should also convert all your ABAP programs to Unicode-compatible syntax in this phase. The ABAP enablement of your programs can take considerable time as well, depending on how many custom programs you have in the system. SAP provides the transaction UCCHECK to identify all the programs that need to be adjusted for Unicode compatibility.



• 3rd party tools and interfaces fixes: You should also address the compatibility of your future Unicode system with the existing interfaces and third-party tools. Thorough testing should be done for compatibility issues with a test Unicode system in your environment. Some commonly found third-party tools are Vertex, BSI, RWD, BMC and IXOS. Identify Unicode-compatible releases of these to install in your Unicode system.

Figure 5: SPUMG transaction screenshot showing all the tabs for the language scans in a Unicode conversion of an MDMP system.



II) Unicode conversion
This phase involves the export and import of your non-Unicode system to the target Unicode system; hence the system downtime begins in this phase. It is best to use tools like Distribution Monitor, Migration Monitor, Table Splitter, OrderBy and Package Splitter to speed up the conversion process. Please note that the actual Unicode conversion of your data happens during the export by R3load. Plan for extra disk space to store the export dump files and other Unicode tools.
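
Because the conversion happens during the export, the export code page has to be chosen before R3load runs, following the Endianness rule described earlier in this article. The Python sketch below illustrates that rule under stated assumptions: the helper names `export_code_page` and `byte_layout` are hypothetical and not part of any SAP tool, and the code page numbers are simply the two given in this article. It also reproduces the 0xAABBCC byte layout.

```python
import sys

# Code page numbers as given in this article's table.
CODE_PAGE_BY_BYTEORDER = {"little": 4103, "big": 4102}

def export_code_page(byteorder: str) -> int:
    """Hypothetical helper: map 'little' or 'big' to the export code page."""
    return CODE_PAGE_BY_BYTEORDER[byteorder]

def byte_layout(value: int, length: int, byteorder: str) -> list:
    """Bytes of value as hex strings, index 0 being the lowest address."""
    return [f"{b:02X}" for b in value.to_bytes(length, byteorder)]

print(byte_layout(0xAABBCC, 3, "big"))
print(byte_layout(0xAABBCC, 3, "little"))

# sys.byteorder describes the machine running this script, so run it on
# the target (import) host to see which code page applies there.
print(sys.byteorder, export_code_page(sys.byteorder))
```

The check only makes sense on the host that will run the import, since it is that machine's byte order, not the source system's, that decides between the two code pages.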

Figure 6: Depiction of two application servers performing parallel export/import from Source DB to Target DB



III) Post-conversion steps
This phase involves steps after the
import of the Unicode data into your
target system. The steps are
performed in the target system. If your
source system is MDMP, then your
language team will need to perform
final vocabulary adjustments in a
transaction called SUMG on the target
system.

Summary
Unicode conversion projects are often underestimated. A Unicode conversion project should be treated as being as complex as, if not more complex than, a typical upgrade project. This article provided a high-level overview of a Unicode conversion project and some key project steps. There are many more topics that you will come across during a Unicode conversion project, for example archiving, data reduction, performance tuning for reduced conversion downtime, and adoption of a new tablespace naming convention and/or custom tablespace names. The side benefits of a Unicode project are a reduced number of tablespaces and a defragmented database. In addition, a Unicode conversion enables your company to benefit from new technologies and support all languages in the world.



Resources
> https://service.sap.com/Unicode
> https://service.sap.com/globalization

About the Author


Tikkana Akurati holds a Master's degree in Computer and Information Sciences. He has 20 years of experience in IT. Tikkana has played many roles in the area of SAP Basis for more than 11 years. His current focus is on SAP upgrades and Unicode conversions. He has completed numerous Unicode conversion projects for SAP America. He has recently completed two Upgrade and Unicode Conversion projects for BayForce for a client in the Midwest region.
To contact Mr. Akurati or discuss how we can assist you with your upcoming projects, please call Kim Snow, Vice President of Delivery at BayForce, at 813-908-8593 or send an email to ksnow@bayforce.com. Mr. Akurati can also be reached at 847-804-2970 or via email at Tikkana@yahoo.com or Tikkana@UnicodeSAP.com.
