
A complete guide to OpenSQL statements - Step-by-step tutorial with screenshots
Posted by Tamas Holics in ABAP Development on May 21, 2014 10:00:54 AM
In my first blog post on SCN, I'd like to give an overview of the OpenSQL query syntax.
Starting from the simplest SELECT statement, I'd like to demonstrate all language elements
of OpenSQL, gradually building up a very complex database query. For each step, I'll
provide a business example and a screenshot so you can see the result of each SELECT
query. I have used the Flight demo database available in every SAP system, so you can
test the queries for yourself.


Note: in this post I will not discuss performance-related topics in detail. That would make
this post way too long. Maybe in a future post.


So let's begin!


Example 1: the simplest SELECT statement


There are three mandatory parts of a SELECT statement, basically defining what you want
to read from which table and where to put the result:


1, After the SELECT keyword, we must specify the fields of the so-called result set
("resulting set" in the SAP help). Here we define which fields we want to see in the result of
the selection. In Example 1, we entered "*", a special character that means that all fields
from the defined table(s) will be returned. This is a handy feature, but keep in mind that if
you select all fields from a table that consists of 250 columns and you use only five of them,
you waste a lot of CPU, memory and network resources. In this case it is better to list the
five fields explicitly (especially if you select a lot of records).


2, The FROM clause defines which table(s) to read the data from. Here you must specify at
least one table or view that exists in the Data Dictionary. In the first example we will only
access one table, and the later examples will use more tables.


3, The INTO clause defines where to put the results; the target is usually called a "work
area". Here you must define the data object that will hold the result set using one of the
options below:


- INTO x, where x is a structure.
- INTO TABLE y, where y is an internal table.
- APPENDING TABLE z, where z is an internal table. In this case the result set is appended to
the internal table (the existing contents are not cleared).
- INTO (a,b,c, ... ), where a, b, c etc. are variables with elementary types.



There are several instructions in the SAP help regarding the prerequisites of work areas
(data type etc.) and assignment rules (automatic conversions etc.) which I don't want to
copy and paste here. In any case, I'd recommend either using a structure identical to the field
selection, or using the "CORRESPONDING FIELDS OF" addition. This addition assigns the
selected fields to the fields of the work area based on the field name (not from left to right).


For all constructs except INTO TABLE, if the result set is empty, the target remains
unchanged.
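
To make the INTO variants more tangible, here is a minimal sketch in classic OpenSQL syntax. The variable names (ls_booking, lt_bookings, lv_carrid, lv_connid) are only illustrative, and SELECT SINGLE is used here as an assumption to read exactly one row into a structure or a field list without a SELECT ... ENDSELECT loop.

```abap
" Illustrative declarations, not taken from the original post.
DATA: ls_booking  TYPE sbook,
      lt_bookings TYPE STANDARD TABLE OF sbook,
      lv_carrid   TYPE sbook-carrid,
      lv_connid   TYPE sbook-connid.

" INTO x: one row is read into a structure.
SELECT SINGLE * FROM sbook
  INTO ls_booking
  WHERE carrid = 'LH'.

" INTO TABLE y: the internal table is replaced by the result set.
SELECT * FROM sbook
  INTO TABLE lt_bookings
  WHERE carrid = 'LH'.

" APPENDING TABLE z: rows are added, existing contents are kept.
SELECT * FROM sbook
  APPENDING TABLE lt_bookings
  WHERE carrid = 'AA'.

" INTO (a, b): elementary variables, filled from left to right.
SELECT SINGLE carrid connid FROM sbook
  INTO (lv_carrid, lv_connid)
  WHERE carrid = 'LH'.

" CORRESPONDING FIELDS OF: fields are assigned by name.
SELECT carrid connid FROM sbook
  INTO CORRESPONDING FIELDS OF TABLE lt_bookings
  WHERE carrid = 'LH'.
```

In the last statement only CARRID and CONNID are filled in each row of lt_bookings; all other fields keep their initial values.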


Hint: the INTO clause can be omitted in a special case, when you use a group function to
count the number of lines that match the WHERE clause. In this case system variable
SY-DBCNT will hold the number of records found. See Example 5.


Note: The INTO clause is missing from the screenshots because the tool I've used
automatically generates it. In any case, all the examples would use the INTO TABLE variant to
fetch all matching records at once.


Business requirement: we want to select all data for 100 bookings from the booking table.
Really simple, right?


How to achieve this? Simply define "*" after the SELECT keyword, specify table SBOOK in
the FROM clause and limit the number of records fetched using the UP TO N ROWS
clause. This can be used to define the maximum number of lines we want the query to
return. This is typically used for existence checks (UP TO 1 ROWS), but in this example we
have limited the query to return at most 100 records. Unless you specify the ORDER BY
clause, this will return 100 arbitrary records, so you cannot know which 100 it will return.
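
Written out with an explicit INTO clause, a query of this kind could look like the following sketch (the table name lt_bookings is my own choice for illustration):

```abap
" Internal table with the same line type as the database table.
DATA lt_bookings TYPE STANDARD TABLE OF sbook.

" Read at most 100 arbitrary bookings with all their fields.
SELECT * FROM sbook
  INTO TABLE lt_bookings
  UP TO 100 ROWS.
```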



Screenshot 1: Simply select 100 bookings from table SBOOK


Example 2: Adding a WHERE clause and a table join


The WHERE clause


Usually SELECT statements do not read all the contents of a table (typical exceptions are
small customizing tables holding a few records), but return only specific entries. The
WHERE clause is used to filter the result set, or in other words tell the database which
records to retrieve. Here you define a logical expression that the database will evaluate for
each row in the database.


Business requirement: return only bookings of Lufthansa (field CARRID must contain 'LH');
cancelled bookings are not needed (CANCELLED must be equal to space).


The database will evaluate this condition for each record in the table, and if the condition is
true, the record will be placed into the result set. You can write much more complex logical
conditions, as we will see in the following examples. Any number of logical expressions can
be combined into one logical expression using the keywords AND or OR, and the result of a
logical expression can be negated using the keyword NOT. Keep the order of evaluation in mind
(ensure the proper use of parentheses). Simple operators used to compare field values are
EQ, NE, GT, LT, GE and LE (equivalent to =, <>, >, <, >=, <=).


Table joins


The second interesting addition in this example is the table join. Often data required for a
specific business process is stored in several tables. Although it could be an option to select
data from each table using separate SELECT commands and combine the results using
ABAP code executed on the application server, many times it is more convenient and
performant to use only one SELECT statement to read all the tables at once.


Business requirement: read customer data from table SCUSTOM for the customer of each
booking.


This is achieved using a so-called table join: a construct that instructs the database to
access a further table based on a condition. This so-called join condition (or join
expression) represents the logical link between the two tables. In this example it is the
customer number: for each booking in table SBOOK the database will search for the
customer data in table SCUSTOM based on the customer number.


Because we use "*" to define the fields of the result set and we read from two tables in this
example, the result set contains all the fields from both tables. If there is a field with the
same name in both tables, only one will be returned (the one from the last table in the
FROM clause - each join overwrites the field contents).


Note: the tool I've used for the screenshots automatically generates a separate field in this
case. This is the reason why you can see duplicate fields.


The syntax of a join condition is almost the same as in a WHERE clause, with some
notable differences which I don't want to copy and paste here from the SAP Help (you cannot
use subqueries, you must use AND to link logical expressions etc.).


There are two kinds of table joins in OpenSQL: inner joins and outer joins. We will discuss
the difference in the next example.


Hint: SELECT statements using table joins bypass SAP buffering.
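
As a rough sketch of how a WHERE clause and an inner join fit together, the statement below reads non-cancelled Lufthansa bookings together with the customer name and city. Unlike the screenshot it lists the fields explicitly instead of using "*", so that the target structure stays small; the type ty_booking and the table lt_result are illustrative names only.

```abap
" Line type holding fields from both joined tables.
TYPES: BEGIN OF ty_booking,
         carrid   TYPE sbook-carrid,
         connid   TYPE sbook-connid,
         fldate   TYPE sbook-fldate,
         bookid   TYPE sbook-bookid,
         customid TYPE sbook-customid,
         name     TYPE scustom-name,
         city     TYPE scustom-city,
       END OF ty_booking.

DATA lt_result TYPE STANDARD TABLE OF ty_booking.

" The ON condition links each booking to its customer;
" the WHERE clause keeps only non-cancelled Lufthansa bookings.
SELECT sbook~carrid sbook~connid sbook~fldate sbook~bookid
       sbook~customid scustom~name scustom~city
  FROM sbook INNER JOIN scustom
    ON scustom~id = sbook~customid
  INTO TABLE lt_result
  UP TO 100 ROWS
  WHERE sbook~carrid    = 'LH'
    AND sbook~cancelled = space.
```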



Screenshot 2: adding a WHERE clause and a table join


Example 3: Adding data from another two tables


Fortunately OpenSQL allows joining multiple tables at once: a maximum of 25 tables
can be joined in a SELECT statement. In this example we add data from two more
tables: T005T, which holds country data, and GEOT005S, which contains geographic
information of regions.


Business requirement: Display the country name instead of country code and display the
latitude and longitude of the region of the customer.


One special thing in this example is the join condition of table T005T. This table is language
dependent, so a language key is needed to get the textual description of a country in a
specific language. Remember: the join condition tells the database how to get a record
from table B based on a record from table A (A being the left-hand side and B being the
right-hand side table). This one is special because we use the logon language of the
current user from field SY-LANGU. All fields of the SY structure can be used in join
conditions (current date, time, system ID etc.) as well as in WHERE and HAVING clauses.


What would happen if we omitted the language key specification? The database would
return multiple rows with the same booking, customer and region information. Why? Simply
because there is more than one entry in table T005T for the same country. Let's say we
have two entries for country DE: it is "Deutschland" in German and "Germany" in
English. In the case of a German customer, the database engine would evaluate the join
condition (T005T-LAND1 = SCUSTOM-COUNTRY) for both records, and both would be
true, so two rows would be returned: one with English text and one with German text.
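
A possible shape of such a four-table join is sketched below. I am assuming here that GEOT005S is keyed by LAND1 and REGIO and has the columns LATITUDE and LONGITUDE; please check the table in SE11 on your system. The names ty_row and lt_result are illustrative.

```abap
TYPES: BEGIN OF ty_row,
         bookid    TYPE sbook-bookid,
         name      TYPE scustom-name,
         landx     TYPE t005t-landx,
         latitude  TYPE geot005s-latitude,   " assumed column name
         longitude TYPE geot005s-longitude,  " assumed column name
       END OF ty_row.

DATA lt_result TYPE STANDARD TABLE OF ty_row.

SELECT sbook~bookid scustom~name t005t~landx
       geot005s~latitude geot005s~longitude
  FROM sbook
    INNER JOIN scustom
      ON scustom~id = sbook~customid
    INNER JOIN t005t
      ON  t005t~land1 = scustom~country
      AND t005t~spras = sy-langu        " one text per country
    INNER JOIN geot005s
      ON  geot005s~land1 = scustom~country
      AND geot005s~regio = scustom~region
  INTO TABLE lt_result
  UP TO 100 ROWS
  WHERE sbook~carrid    = 'LH'
    AND sbook~cancelled = space.
```

Removing the line with sy-langu from the T005T join condition would produce one result row per language maintained for the country.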



Screenshot 3: multiple table joins. Notice that all lines contain a latitude and a longitude.


Example 4: Another kind of table join - Left outer join


Here comes the difference between the two kinds of table joins in OpenSQL: the inner join
and the outer join (for example SELECT * FROM A INNER JOIN B / SELECT * FROM A LEFT
OUTER JOIN B).


Basically the difference is the behavior in case there is no corresponding entry in table B for
a record in table A. In case of an inner join, there would be no record placed into the result
set. In case of a left outer join, there would be a record in the result set, but all fields coming
from table B would be empty.


This behavior is very easy to see in this example: we added geographic coordinates to
our query in the previous example using an inner join. Table GEOT005S contains
coordinates for country regions. Whenever a record was not found in table GEOT005S for
the region of a customer, the whole line was dropped from the result set. This is why you
can only see customers with non-empty latitude and longitude.


In the current example we add the latitude and longitude of the region of the customer using
a left outer join. As you can see in the screenshot, coordinates are returned for only one
customer. For all other records, the DB did not find a suitable record in
GEOT005S, so the coordinates are empty. If GEOT005S were accessed using an
INNER JOIN, these records would be excluded from the result set.
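
The difference comes down to one keyword in the FROM clause. The sketch below is a shortened variant (customer and coordinates only), again under the assumption that GEOT005S has the columns LAND1, REGIO, LATITUDE and LONGITUDE; ty_row and lt_result are illustrative names.

```abap
TYPES: BEGIN OF ty_row,
         bookid    TYPE sbook-bookid,
         name      TYPE scustom-name,
         region    TYPE scustom-region,
         latitude  TYPE geot005s-latitude,   " assumed column name
         longitude TYPE geot005s-longitude,  " assumed column name
       END OF ty_row.

DATA lt_result TYPE STANDARD TABLE OF ty_row.

" LEFT OUTER JOIN: a booking stays in the result set even if no
" GEOT005S row matches; latitude and longitude are then initial.
SELECT sbook~bookid scustom~name scustom~region
       geot005s~latitude geot005s~longitude
  FROM sbook
    INNER JOIN scustom
      ON scustom~id = sbook~customid
    LEFT OUTER JOIN geot005s
      ON  geot005s~land1 = scustom~country
      AND geot005s~regio = scustom~region
  INTO TABLE lt_result
  UP TO 100 ROWS
  WHERE sbook~carrid    = 'LH'
    AND sbook~cancelled = space.
```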



Screenshot 4: table join using a left outer join. Notice that now customers with empty
latitude and longitude appear on the list.


Note: if you have carefully checked the screenshot, you may have noticed that there is a
customer with a region defined ('IL') but no coordinates displayed. The reason
could be that there is no entry in GEOT005S for country 'US', region 'IL', but that is not the
case. Somehow the standard flight demo database contains incorrect entries with an
unnecessary space. This is the reason why the DB does not find the matching record: a
value of 'IL' with an extra space is not equal to 'IL'.



Screenshot 4 b: incorrect entries for region IL in table GEOT005S


I corrected these entries using this simple update statement:



After correcting the entries in table SCUSTOM, the query fills the coordinates for all
customers in region IL:



Screenshot 4 c: table join using a left outer join after correcting the problematic database
entries. Notice that now every customer who has a region defined has the coordinates
filled.


Example 5: adding a simple group function


Many times the business is not interested in all individual data records, but wants to see an
aggregation based on a specific field. For example, the sum of all sales per salesperson,
costs per project etc. In this case we have two options: either retrieve all relevant records
from the database and perform the calculations on the application server using our own
ABAP code, or perform the calculation using the database engine.


Business requirement: we want to see the number of bookings that match our selection
criteria. (bookings of Lufthansa that are not cancelled).


In order to achieve this using the database, we have to define the aggregate function (also
called a group function) COUNT in the SELECT clause. A GROUP BY clause can be added to
break the count down by one or more fields.


The GROUP BY clause combines groups of rows in the result set into a single row. In this
very simple case, we want to see the total number of bookings, not broken down by any
other field.


The COUNT( * ) function determines the number of rows in the result set or in the current
group. Right now we don't have any groups defined, so in our case it returns the total
number of bookings.



Screenshot 5: a simple group function counting the number of bookings that match the
selection criteria.


Note: The COUNT function can also be used to determine the number of different values
of a specific field in the result set. In this case, simply put the field name in the parentheses
and add the DISTINCT keyword. For example, to count how many countries the customers
come from, use COUNT( DISTINCT country ) in the SELECT clause.


It is important to note that the group functions are always performed after the evaluation of
the WHERE clause. The database engine first reads the records matching the WHERE
clause, then forms groups (see next example) and then performs the group function.
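
The counting variants described in this example could be written as in the following sketch; the variable names lv_bookings and lv_countries are illustrative.

```abap
DATA: lv_bookings  TYPE i,
      lv_countries TYPE i.

" Total number of non-cancelled Lufthansa bookings.
SELECT COUNT( * ) FROM sbook
  INTO lv_bookings
  WHERE carrid    = 'LH'
    AND cancelled = space.

" Same count without INTO: the result is available in sy-dbcnt.
SELECT COUNT( * ) FROM sbook
  WHERE carrid    = 'LH'
    AND cancelled = space.

" Number of different countries among all customers.
SELECT COUNT( DISTINCT country ) FROM scustom
  INTO lv_countries.
```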


Using the GROUP BY clause and aggregate functions ensures that aggregates and groups
are assembled by the database system, not the application server. This can considerably
reduce the volume of data that has to be transferred from the database to the application
server. Of course on the other side, this needs more resources from the database.


Hint: When GROUP BY is used, the SELECT statement bypasses SAP buffering.


Example 6: defining groups for the group functions


We've learned that group functions perform certain calculations on groups of database
records. If we do not explicitly specify a group, then all the records in the result set are
considered as one big group, as in the previous example.


Business requirement: display the number of (non-cancelled Lufthansa) bookings per
customer.


In this case we form groups of database records based on the customer number by listing
field ID in the GROUP BY clause. Also, we have to enter the ID field of table SCUSTOM in the
SELECT clause before the group function.
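
A minimal sketch of such a grouped count is shown below; ty_count and lt_counts are illustrative names, and the columns are assigned to the target structure from left to right.

```abap
TYPES: BEGIN OF ty_count,
         id       TYPE scustom-id,
         bookings TYPE i,
       END OF ty_count.

DATA lt_counts TYPE STANDARD TABLE OF ty_count.

" One result row per customer ID, with the number of
" non-cancelled Lufthansa bookings of that customer.
SELECT scustom~id COUNT( * )
  FROM sbook INNER JOIN scustom
    ON scustom~id = sbook~customid
  INTO TABLE lt_counts
  UP TO 100 ROWS
  WHERE sbook~carrid    = 'LH'
    AND sbook~cancelled = space
  GROUP BY scustom~id.
```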



Screenshot 6: Grouping rows by customer ID. The COUNT group function is performed on
every group to count the number of bookings for each customer.


As you can see, the database engine now returns as many records as there are customer IDs
in the result set, with the number of relevant bookings next to them. Exactly what
we wanted.


Note: in all of our previous examples, we've used the "*" sign in the SELECT clause (field
list). However, here we have to explicitly define the fields needed to create groups of
records. It is mandatory to add the table name and the "~" sign before the name of a
database field, if more than one table has a field with the same name.


Example 7: Adding extra information


Business requirement: display additional fields in the result list (name, city, country, region
code, coordinates).


How to do it? Simply add them to the SELECT clause after the customer number and before the
COUNT function. Remember to add each field to the GROUP BY clause too, otherwise you
will encounter a syntax error. The fields you use to form groups must be in the SELECT
clause, and nothing else may be in the SELECT clause unless it is either in the GROUP BY
clause or a group function.



Screenshot 6: additional fields are displayed in the list.


Note: You might wonder why the number of bookings did not change, although in the previous
example I told you that the fields listed before the group function define the groups.
The reason is that we have added fields from tables that have a 1:1 relation to the
customer. A customer has only one name, a city is in one country and region and has one
pair of coordinates. If we had chosen to add the Flight Class field (First class /
Business / Economy), then the result set could contain more than one line per customer: as
many lines per customer as the kinds of flights he/she had. We will see how this works in
Example 15.


Example 8: Defining the order of records in the result set


You can use the ORDER BY clause in a query to define the sort order of the records
returned. Simply list all the fields that you want to use as sort criteria. You can use the
keywords ASCENDING and DESCENDING for each field to specify the sort mode
(ASCENDING is the default so it can be omitted).


Business requirement: display the customers with the most bookings.


What do we do now? We sort the list by the result of the COUNT function in descending
order and then by the name of the customer.
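
Sorting by the result of a group function works through an alternative column name defined with AS, as in the sketch below. The alias "bookings" and the names ty_count and lt_counts are my own choices for illustration.

```abap
TYPES: BEGIN OF ty_count,
         id       TYPE scustom-id,
         name     TYPE scustom-name,
         bookings TYPE i,
       END OF ty_count.

DATA lt_counts TYPE STANDARD TABLE OF ty_count.

" The aggregate gets the alias "bookings", which is then used
" in ORDER BY: most bookings first, then by customer name.
SELECT scustom~id scustom~name COUNT( * ) AS bookings
  FROM sbook INNER JOIN scustom
    ON scustom~id = sbook~customid
  INTO TABLE lt_counts
  UP TO 100 ROWS
  WHERE sbook~carrid    = 'LH'
    AND sbook~cancelled = space
  GROUP BY scustom~id scustom~name
  ORDER BY bookings DESCENDING scustom~name.
```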


Screenshot 7: the result set is sorted using the ORDER BY clause.


As you can see, there are several customers who have more than ten bookings (non-
cancelled, Lufthansa).


Note: If all key fields are in the field list and a single database table is specified after FROM
(not a view or join expression), the addition PRIMARY KEY can be used to sort the result
set in ascending order based on the primary key of the table.


Example 9: Filtering based on a group function


Business requirement: the boss is only interested in customers having more than ten non-
cancelled Lufthansa bookings.


How do we do this? I guess the first idea would be to add a new condition to the WHERE
clause to filter records where the COUNT function is higher than ten. However, this will not
work because of the way OpenSQL (and SQL in general) evaluates a query.


The reason is that the WHERE clause filters the database records before the groups are
created by the database engine. After the groups are created, the group functions are
calculated and the result set is created. The WHERE clause cannot be used to filter based
on the group function results.


In cases like these, the HAVING clause must be used. This is similar to the WHERE clause,
but the difference is that it is evaluated after the groups are created and group functions are
performed. Simply put: to filter based on the result of group functions, the HAVING
clause must be used (also called a group condition).
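
As a sketch, the group condition is simply appended after the GROUP BY clause. The names ty_count and lt_counts are illustrative.

```abap
TYPES: BEGIN OF ty_count,
         id       TYPE scustom-id,
         name     TYPE scustom-name,
         bookings TYPE i,
       END OF ty_count.

DATA lt_counts TYPE STANDARD TABLE OF ty_count.

" WHERE filters single rows before grouping;
" HAVING filters whole groups after COUNT has been calculated.
SELECT scustom~id scustom~name COUNT( * )
  FROM sbook INNER JOIN scustom
    ON scustom~id = sbook~customid
  INTO TABLE lt_counts
  UP TO 100 ROWS
  WHERE sbook~carrid    = 'LH'
    AND sbook~cancelled = space
  GROUP BY scustom~id scustom~name
  HAVING COUNT( * ) > 10.
```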




Screenshot 8: using the HAVING clause to filter the result set based on group functions.


As you can see on the screenshot, now only 23 records are returned, although we have allowed
up to 100 records by using the UP TO N ROWS addition. This means that there are 23
customers having more than ten non-cancelled Lufthansa bookings.


Note: If you don't specify any groups using the GROUP BY clause, the HAVING clause will
consider the whole result set grouped into one line. For a quick example, assume we have
10 records in table SCARR. The query SELECT COUNT( * ) FROM scarr HAVING COUNT( * ) GT 0
will return one single line with the number of records in the table, but the query
SELECT COUNT( * ) FROM scarr HAVING COUNT( * ) GT 10 will not return any lines
(empty result set).


Example 10: using subqueries


The possibility to combine multiple SELECT statements into one is a very handy feature in
OpenSQL. This is mostly used when you don't know the exact filter criteria at
design time or when you have to use a comparison to a dynamically calculated value.


Business requirement: the list must include only customers who have never cancelled a
flight (any airline, not only Lufthansa). At first glance, you could ask whether this is
already done, since we have "cancelled EQ space" in our WHERE clause. It is not,
because this only influences our group function so that only non-cancelled bookings
are counted. This means that if a customer has 20 bookings with one cancelled, he/she will
be on our list with 19 bookings. According to our requirement, we don't want this customer
to be on our list, so how do we achieve that?


An easy way to solve this is to add a so-called subquery to our WHERE clause which
checks if there is a cancelled booking for the customer.


Subqueries in general


Basically a subquery is a SELECT statement in parentheses used in a logical expression.
There is no need to have an INTO clause because the result of the subquery will be
processed by the database engine.


How are the results of subqueries evaluated? This depends on whether the subquery is correlated
or not. If a subquery uses fields from the surrounding SELECT statement in its WHERE
condition, it is called a correlated subquery. In this case, the result of the subquery will be
evaluated for each line in the result set of the surrounding SELECT statement (in whose
WHERE clause the subquery is placed). This implies that the subquery is executed for each record in
the result set of the surrounding SELECT statement. On the other hand, a subquery without
any reference to the surrounding SELECT statement is executed only once.


There are different operators which one can use with subqueries. We will use the EXISTS
operator (negated) in this example. I'll discuss the others in Example 13.


Subqueries can be nested which means that you can put a subquery in the WHERE clause
of a subquery. We will see an example of this later in Example 13.


Now take the current example: we need to check if a customer has a cancelled booking, so
we have to create a relation between the customer ID in the result set of our outer SELECT
statement and the subquery (so we use a correlated subquery). This is done by adding a
condition to the WHERE clause of the subquery to match the customer IDs.
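
One way to write such a correlated subquery is sketched below. The alias sq for the inner SBOOK and the names ty_count and lt_counts are illustrative.

```abap
TYPES: BEGIN OF ty_count,
         id       TYPE scustom-id,
         name     TYPE scustom-name,
         bookings TYPE i,
       END OF ty_count.

DATA lt_counts TYPE STANDARD TABLE OF ty_count.

" The subquery refers to sbook~customid of the outer query, so it
" is evaluated per booking: the booking is only kept if the same
" customer has no cancelled booking at all (for any airline).
SELECT scustom~id scustom~name COUNT( * )
  FROM sbook INNER JOIN scustom
    ON scustom~id = sbook~customid
  INTO TABLE lt_counts
  UP TO 100 ROWS
  WHERE sbook~carrid    = 'LH'
    AND sbook~cancelled = space
    AND NOT EXISTS ( SELECT * FROM sbook AS sq
                       WHERE sq~customid  = sbook~customid
                         AND sq~cancelled = 'X' )
  GROUP BY scustom~id scustom~name
  HAVING COUNT( * ) > 10.
```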


Screenshot 9: using a subquery. As you can see, now only 20 records are returned, so
three customers had a cancelled flight on our previous list.


Table aliases


Notice that here we must use so-called table aliases because we select from the same
table in both the surrounding query and the subquery. This means that we must somehow
explicitly define the table name for the fields to be compared, otherwise the DB engine
would not know which table field we refer to. This is done with the use of table aliases.
Basically you can give a name to your tables and refer to them using the alias. Here I've
defined "sq" as an alias for the table in the subquery. You have to use the "~" character between the
table alias and the field name (and of course between the table name and field name as in
the previous examples).


Notes:
- Subqueries cannot be used when accessing pool tables or cluster tables.
- The ORDER BY clause cannot be used in a subquery.
- If a subquery is used, the Open SQL statement bypasses SAP buffering.

Note: subqueries can be used in the HAVING clause too as seen in example 14.

Hint: we could have solved the requirement without a correlated subquery. In this case, the
subquery would select all the customers who had a cancelled booking, and the surrounding
SELECT statement would check every customer if it is in the result set of the subquery:



Simplified example:

SELECT ... WHERE customid NOT IN ( SELECT customid FROM sbook WHERE cancelled EQ 'X' ).
is equivalent to
SELECT ... WHERE NOT EXISTS (
SELECT bookid FROM sbook AS sq WHERE sq~customid EQ sbook~customid AND sq~cancelled EQ 'X' ).

Example 11: Special operators

In all the previous examples we have only used the EQ (same as =) operator and the GT
(same as >) operator to compare field values. However, there are more complex ones too.

BETWEEN

This operator is very simple: it is basically >= and <= together.

Business requirement: exclude from our list only those customers who have a cancelled booking
in the first quarter of 2005.

The syntax for this is field BETWEEN value1 AND value2.
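
A small stand-alone sketch of the operator: it selects cancelled bookings in the first quarter of 2005. I am assuming here that the quarter refers to the flight date (FLDATE); the original query may have used a different date field. The name lt_bookings is illustrative.

```abap
DATA lt_bookings TYPE STANDARD TABLE OF sbook.

" BETWEEN includes both interval limits; dates are written
" in the internal format YYYYMMDD.
SELECT * FROM sbook
  INTO TABLE lt_bookings
  WHERE cancelled = 'X'
    AND fldate BETWEEN '20050101' AND '20050331'.
```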

Hint: from the business perspective we expect more customers in the result set, since we
exclude fewer customers due to a more restrictive subquery.


Screenshot 10: using the BETWEEN operator. As you can see, there are 21 customers on
our list, so there is one customer who appears again (had cancellation before or after Q1 of
2005).

LIKE

This expression is a very flexible tool for character string comparisons based on a pattern.
The pattern can be defined using wildcard characters: "%" represents any character string
(even an empty one) and "_" represents any single character. The LIKE expression is case
sensitive, and blank characters at the end of the pattern are ignored (LIKE '__' is equal to
LIKE '__ ').

Business requirement (quite strange): restrict our list further to only contain customers
with a name starting with "A".

We add name LIKE 'A%' to the WHERE clause to achieve this.


Screenshot 11: using the LIKE operator to filter the list based on the name of the customer.

Note: What to do if we want to search for records that contain "_" in a specific field? Since
this is a reserved symbol, we have to use the addition ESCAPE, which allows an escape
character to be defined. This escape character cancels the special function of wildcard
characters (simply place the escape character before the wildcard character to be
cancelled).


A quick example: select all material numbers which contain an underscore:

Wrong approach (returns every material number):
SELECT matnr FROM mara WHERE matnr LIKE '%_%'



Good approach:
SELECT matnr FROM mara WHERE matnr LIKE '%~_%' ESCAPE '~'

Hint: It is not possible to specify a table field as a pattern.

Business requirement (even more strange): now we want to only see customers with a
name starting with "A" and having "d" as the third letter.
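
Taken on its own, this filter could look like the following sketch (ty_customer and lt_customers are illustrative names):

```abap
TYPES: BEGIN OF ty_customer,
         id   TYPE scustom-id,
         name TYPE scustom-name,
       END OF ty_customer.

DATA lt_customers TYPE STANDARD TABLE OF ty_customer.

" 'A' first, "_" stands for exactly one arbitrary character,
" 'd' third, "%" stands for any (also empty) rest of the name.
SELECT id name FROM scustom
  INTO TABLE lt_customers
  WHERE name LIKE 'A_d%'.
```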


Screenshot 12: using the LIKE operator with both special characters "%" and "_". As you
can see, we still have three customers who match all our selection criteria.



IN

This operator allows you to compare a field to a set of fixed values. This comes in handy as it
can replace a longer and much less readable expression:

field IN ('01', '03', '05') is equal to the much longer
field EQ '01' OR field EQ '03' OR field EQ '05'

Business requirement: extend our selection to customers of American Airlines and United
Airlines too.
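
Applied to the airline code, the operator could be used as in this sketch. I am assuming the carrier IDs 'AA' and 'UA' for American Airlines and United Airlines, as delivered with the flight demo data; lt_bookings is an illustrative name.

```abap
DATA lt_bookings TYPE STANDARD TABLE OF sbook.

" Shorthand for carrid = 'LH' OR carrid = 'AA' OR carrid = 'UA'.
SELECT * FROM sbook
  INTO TABLE lt_bookings
  UP TO 100 ROWS
  WHERE carrid IN ('LH', 'AA', 'UA')
    AND cancelled = space.
```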


Screenshot 13: using the IN operator to count bookings of other airlines too. As expected,
we now have many more customers who match the less restrictive selection criteria.

Note: the IN operator can be used with a selection table too, as seen in Example 17.

Example 12: Other group functions

So far we have only used the COUNT group function to count the number of bookings that
match our selection criteria. There are a total of five group functions available in OpenSQL.
The remaining four that we haven't seen yet are all mathematical calculations: SUM, MIN,
MAX and AVG, which calculate the total, minimum, maximum and average of a field
respectively.

There are some restrictions related to the use of group functions:

- If the addition FOR ALL ENTRIES is used in front of WHERE, or if cluster or pool tables are listed
after FROM, no other aggregate expressions apart from COUNT( * ) can be used.
- Columns of the type STRING or RAWSTRING cannot be used with aggregate functions.
- Null values are not included in the calculation for the aggregate functions. The result is a null value
only if all the rows in the column in question contain the null value.


Business requirement: the boss wants to see the total, average, minimum and maximum of
the booking price for each customer (in the currency of the airline).
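
A sketch of a query using all five group functions is shown below. It uses LOCCURAM and LOCCURKEY (price and currency in the local currency of the airline) and reads from SBOOK only; the component names of ty_stats are illustrative. Note that AVG returns a floating point value, hence TYPE f.

```abap
TYPES: BEGIN OF ty_stats,
         customid TYPE sbook-customid,
         currency TYPE sbook-loccurkey,
         total    TYPE sbook-loccuram,
         average  TYPE f,
         minimum  TYPE sbook-loccuram,
         maximum  TYPE sbook-loccuram,
         bookings TYPE i,
       END OF ty_stats.

DATA lt_stats TYPE STANDARD TABLE OF ty_stats.

" Columns are assigned to the structure from left to right.
SELECT customid loccurkey
       SUM( loccuram ) AVG( loccuram )
       MIN( loccuram ) MAX( loccuram )
       COUNT( * )
  FROM sbook
  INTO TABLE lt_stats
  UP TO 100 ROWS
  WHERE carrid    = 'LH'
    AND cancelled = space
  GROUP BY customid loccurkey.
```

Grouping by the currency key as well avoids adding up amounts in different currencies if the selection covers more than one airline.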



Screenshot 14: using all group functions (MIN, MAX, SUM, AVG, COUNT).


Note: just like with the COUNT function, the DISTINCT keyword can be used to perform the
group function only on distinct values (so the result of SUM( DISTINCT A ) for two records
having value 10 in a field A would be 10).


Note: the data type for AVG and SUM must be numerical. The data type of MIN, MAX and
SUM is the data type of the corresponding table field in the ABAP Dictionary. Aggregate
expressions with the function AVG have the data type FLTP, and those with COUNT have
the data type INT4.


Hint: the tool I've used for demonstration replaces the field types of these aggregate
functions for a more user friendly display (instead of the exponential notation).



Example 13: Nesting subqueries


Subqueries can be nested, which means that you can put a subquery in the WHERE clause
of a subquery. A maximum of ten SELECT statements are allowed within one OpenSQL
query (a SELECT statement may have a maximum of nine subqueries).


Business requirement: exclude only customers who have cancelled bookings in Q1 of 2005
where the language of the travel agency at which the cancellation was made is English. Pretty
awkward, but I had to figure out something.


We can implement this logic using a nested subquery: we add this criterion to the WHERE
clause of the outer subquery (select all agency numbers where the language is English).
This is different from our previous subquery because of the keyword we use for evaluating
it.


Logical expressions for subqueries


- EXISTS: this is what we have used in our first subquery. This returns TRUE if the
subquery returns any records (one or more) in its result set, otherwise this returns FALSE.


- EQ, GT, GE, LT, LE: these operators can be used to compare a field with the result of the
subquery. If the subquery returns more than one row, obviously the database engine will not
know which one to use for comparison: a non-catchable exception will occur.


- In order to use subqueries that return multiple rows, you either have to use the IN operator
(checks if the field is equal to any of the values returned by the subquery) or one of the ALL,
ANY, and SOME keywords together with EQ, GT, GE, LT or LE. These influence the
comparison in a pretty self-explanatory way: the comparison will be carried out with all the
records returned by the subquery, and the comparison will return TRUE if all (ALL) or at
least one (ANY, SOME) records return TRUE for the comparison. There is no difference
between keywords SOME and ANY.
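
The operators above could be combined as in the following simplified sketch. The first statement nests an IN subquery on STRAVELAG (assuming language key 'E' for English) inside a NOT EXISTS subquery; the second shows ALL with a comparison on the luggage weight, a made-up requirement used only to illustrate the syntax. The names lt_customers, lt_ids and sq are illustrative.

```abap
DATA: lt_customers TYPE STANDARD TABLE OF scustom,
      lt_ids       TYPE STANDARD TABLE OF sbook-bookid.

" Customers without a cancelled booking made at an agency
" whose language is English (subquery nested in a subquery).
SELECT * FROM scustom
  INTO TABLE lt_customers
  WHERE NOT EXISTS ( SELECT * FROM sbook
                       WHERE customid  = scustom~id
                         AND cancelled = 'X'
                         AND agencynum IN ( SELECT agencynum
                                              FROM stravelag
                                              WHERE langu = 'E' ) ).

" Lufthansa bookings with more luggage than every single
" American Airlines booking (ALL: true for all subquery rows).
SELECT bookid FROM sbook
  INTO TABLE lt_ids
  WHERE carrid = 'LH'
    AND luggweight > ALL ( SELECT sq~luggweight
                             FROM sbook AS sq
                             WHERE sq~carrid = 'AA' ).
```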


What result do we expect? Since our outer subquery is used to filter out customers with
cancelled bookings, a more restrictive subquery (achieved by the nested subquery) would
mean more customers on our list. Practically: the fewer agencies we include in our search for
cancellations, the more customers we get on the list.



Screenshot 15: nesting subqueries. As you can see, we actually have one more customer
on our list, who cancelled his booking at an agency where the language is not English.


Example 14: HAVING and GROUP BY in a subquery


Business requirement: only exclude customers who have at least three cancellations
(Lufthansa flight in Q1 of 2005 at an English speaking agency).


Since we have to count the number of these bookings, we have to use group function
COUNT and group the bookings by the customer ID. This way we get the number of
matching bookings per customer. Then we simply add the HAVING clause to make sure we
only exclude customers having more than two cancellations from our main query.
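
Reduced to its core, a subquery with GROUP BY and HAVING could look like the sketch below. It is a simplified, non-correlated variant that ignores the airline, date and agency conditions of the requirement; lt_customers is an illustrative name.

```abap
DATA lt_customers TYPE STANDARD TABLE OF scustom.

" The subquery returns the IDs of all customers with more than
" two cancelled bookings; the outer query excludes exactly those.
SELECT * FROM scustom
  INTO TABLE lt_customers
  WHERE id NOT IN ( SELECT customid FROM sbook
                      WHERE cancelled = 'X'
                      GROUP BY customid
                      HAVING COUNT( * ) > 2 ).
```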


We can expect to have more customers in our result set, since we have a more restrictive
subquery that we use to filter out customers.




Screenshot 16: using the GROUP BY and HAVING clauses in a subquery. As you can see,
we have two more customers on our list (who have one or two matching cancelled
bookings).


Example 15: Let's extend the GROUP BY clause


Business requirement: include only customers who have more than 10 bookings for the
same airline. It doesn't matter which, but it should be more than ten.


So far we have counted all the bookings of the customers who satisfy all our criteria (for
example having more than 10 bookings of any airline). This could be like someone having 5
bookings for Lufthansa, 5 for American Airlines and 2 for United Airlines (the total being
higher than 10). Now we want to see something like 11 for Lufthansa.


It is very simple to solve this by adding the airline code (CARRID) to the field list and to the
GROUP BY clause of our main query. Remember, the database engine creates groups of records
based on the fields listed in the GROUP BY clause, which are the same fields we list before the
group function in the field list (SELECT clause). If we add the airline
code here, groups will be made per airline for each customer and the COUNT function will
count the number of bookings per customer and airline.


What changes do we expect in our result set? There should be far fewer customers on our
list, because they must have more than ten bookings for the same airline.
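
A minimal sketch of the extended grouping, reduced to table SBOOK and without the other
selection criteria of the main query. The report name and the structure ty_result are
hypothetical.

```abap
REPORT z_demo_group_by_airline. " hypothetical report name

TYPES: BEGIN OF ty_result,
         customid TYPE sbook-customid,
         carrid   TYPE sbook-carrid,
         bookings TYPE i,
       END OF ty_result.

DATA: lt_result TYPE STANDARD TABLE OF ty_result,
      ls_result TYPE ty_result.

* One group per customer AND airline: CARRID appears both in the
* field list and in the GROUP BY clause.
SELECT customid carrid COUNT( * )
  FROM sbook
  INTO TABLE lt_result
  GROUP BY customid carrid
  HAVING COUNT( * ) > 10.

LOOP AT lt_result INTO ls_result.
  WRITE: / ls_result-customid, ls_result-carrid, ls_result-bookings.
ENDLOOP.
```

Removing CARRID from both clauses gives back the previous behaviour, where the bookings
are counted per customer across all airlines.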



Screenshot 17: adding the carrier ID to the GROUP BY clause (and the SELECT clause as
well).


The result shows exactly this: we only have three (very loyal) customers who match our
selection criteria. Notice that the highest number of bookings is now only 12, while in the
previous example it was 19.


Example 16: Going back to the LEFT OUTER JOIN


In order to have some coordinates displayed in the result list, we make two changes:


- change the WHERE condition of the main query: instead of checking the name of the
customer and the airline code, we select customers from the US. This way we will have
many more customers on our list (100, the limit set by the UP TO n ROWS addition) and,
since they are from the US, we will see region codes for some of them (coordinates
are maintained for the US regions).


- remove the HAVING clause to include customers with fewer than 11 matching bookings.



Screenshot 18: first change: removing the check for more than ten bookings.



Screenshot 19: second change: select customers from the US.


Now you can see the behaviour of the LEFT OUTER JOIN again: coordinates are filled for
all records where the region code is filled and coordinates are found for the region in table
GEOT005S. All other customers still appear on the list, only without coordinates.
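
A simplified sketch of such a join, without the booking counts of the main query. The
report name is hypothetical, and the GEOT005S field names (LAND1, REGIO, LONGITUDE,
LATITUDE) and the join condition are assumptions that you should check in the Data
Dictionary of your system.

```abap
REPORT z_demo_left_outer_join. " hypothetical report name

TYPES: BEGIN OF ty_customer,
         id        TYPE scustom-id,
         name      TYPE scustom-name,
         region    TYPE scustom-region,
         longitude TYPE geot005s-longitude, " assumed field name
         latitude  TYPE geot005s-latitude,  " assumed field name
       END OF ty_customer.

DATA: lt_customers TYPE STANDARD TABLE OF ty_customer,
      ls_customer  TYPE ty_customer.

* Every US customer is returned, whether or not a matching
* region entry exists in GEOT005S.
SELECT c~id c~name c~region g~longitude g~latitude
  FROM scustom AS c
  LEFT OUTER JOIN geot005s AS g
    ON  g~land1 = c~country
    AND g~regio = c~region
  INTO TABLE lt_customers
  UP TO 100 ROWS
  WHERE c~country = 'US'.

LOOP AT lt_customers INTO ls_customer.
  WRITE: / ls_customer-id, ls_customer-region,
           ls_customer-longitude, ls_customer-latitude.
ENDLOOP.
```

For customers without a match, the database returns null values for the GEOT005S columns,
which arrive in the ABAP structure as initial values.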


Example 17: Using a selection table in the WHERE clause


Selection tables are used to define complex selections on a field. They are mostly used
together with selection screens (using the SELECT-OPTIONS statement). Any selection the
user makes on the user interface is converted to a complex logical expression using the
operators that we have worked with in this tutorial (EQ, GT, LE, GE, LT, NE, BETWEEN,
NOT etc.). This conversion is made by the OpenSQL engine automatically.


In order to compare the values of a table field with a selection table, you have to use the
IN operator.


Business requirement: only count bookings of Business Class and Economy Class (C and
Y) in a complex time range.
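
A minimal sketch of this requirement with selection tables that are filled in the program
instead of on a selection screen. The report name and the concrete date values are only
illustrative.

```abap
REPORT z_demo_selection_table. " hypothetical report name

DATA: r_sbook_fldate TYPE RANGE OF sbook-fldate,
      r_sbook_class  TYPE RANGE OF sbook-class,
      ls_fldate      LIKE LINE OF r_sbook_fldate,
      ls_class       LIKE LINE OF r_sbook_class,
      lv_count       TYPE i.

* Include (sign I) the first quarter of 2005 ...
ls_fldate-sign   = 'I'.
ls_fldate-option = 'BT'.
ls_fldate-low    = '20050101'.
ls_fldate-high   = '20050331'.
APPEND ls_fldate TO r_sbook_fldate.

* ... but exclude (sign E) one single day (illustrative value).
CLEAR ls_fldate.
ls_fldate-sign   = 'E'.
ls_fldate-option = 'EQ'.
ls_fldate-low    = '20050214'.
APPEND ls_fldate TO r_sbook_fldate.

* Business Class (C) and Economy Class (Y).
ls_class-sign   = 'I'.
ls_class-option = 'EQ'.
ls_class-low    = 'C'.
APPEND ls_class TO r_sbook_class.
ls_class-low    = 'Y'.
APPEND ls_class TO r_sbook_class.

* Each selection table is compared with its field using IN.
SELECT COUNT( * )
  FROM sbook
  INTO lv_count
  WHERE fldate IN r_sbook_fldate
    AND class  IN r_sbook_class.

WRITE: / 'Matching bookings:', lv_count.
```

Keep in mind that an empty selection table in an IN condition means "no restriction" on
that field.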



Screenshot 20: performing complex selections using selection tables. Notice the IN
keyword used as a comparison operator.


Note: the tool used for this demonstration offers a UI to define the selection tables as on
selection screens. Also, the generated WHERE clause is visible on the right side, next to
the selection tables R_SBOOK_FLDATE and R_SBOOK_CLASS.


Example 18: The FOR ALL ENTRIES IN construct


This construct is widely used in ABAP and is similar to a table join in that it is used to
read data from a table for records we already have (typically selected from another table or
returned by a Function Module). However, there are big differences between the two
constructs.


The FOR ALL ENTRIES IN internal table construct allows you to use an internal table as
the basis of your selection, but not like a selection table from Example 17. If you use this
addition, you can (actually, must) refer to the fields of the internal table in the WHERE
clause to compare them with the fields of the database table(s) being read. Naturally, the
fields used in the comparison must have compatible data types.


As this construct is ABAP specific, there is a mechanism that translates the OpenSQL
command to one or more native SQL statements. The WHERE clause(s) passed to the
database engine are generated from the contents of the internal table and the WHERE
clause you define. There are several profile parameters that
influence this conversion, which you can check in SAP Note 48230 - Parameterization for
SELECT ... FOR ALL ENTRIES statement.


The main difference between table joins and this construct is that table joins are carried out
by the database server and all data is passed to the application server at once. On the other
hand, in case of a FOR ALL ENTRIES IN construct, the entire WHERE clause is evaluated
for each individual row of the internal table. The result set of the SELECT statement is the
union of the result sets produced by the individual evaluations. It is very important to note
that duplicate records are automatically removed from the result set (but on the application
server and not on the database server).


Syntactically, the difference is that in the WHERE clause you have to use the - sign instead
of the ~ sign between the name of the internal table and the name of its field.


Very important note: If the referenced internal table is empty, the entire WHERE clause is
ignored and all lines from the database are placed in the result set. Always make a check
on the internal table before executing a select query using this construct.


Business requirement: read airline information for all airlines that appear on our list.


How to implement this? We already have a SELECT statement from the previous example,
so create a second SELECT statement using the FOR ALL ENTRIES IN construct. Simply
add the carrier ID as a link in the WHERE clause (similar to the join condition in case of
table joins) and that's it.
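
A minimal sketch of the two steps. The first SELECT only stands in for the query of the
previous example, and the report name and the structure ty_booking are hypothetical.

```abap
REPORT z_demo_for_all_entries. " hypothetical report name

TYPES: BEGIN OF ty_booking,
         customid TYPE sbook-customid,
         carrid   TYPE sbook-carrid,
       END OF ty_booking.

DATA: lt_bookings TYPE STANDARD TABLE OF ty_booking,
      lt_carriers TYPE STANDARD TABLE OF scarr,
      ls_carrier  TYPE scarr.

* Step 1: fill the internal table (stand-in for the previous query).
SELECT customid carrid
  FROM sbook
  INTO TABLE lt_bookings
  UP TO 100 ROWS
  WHERE class = 'C'.

* Step 2: read the airlines for these records. The check is essential:
* with an empty internal table the WHERE clause would be ignored.
IF lt_bookings IS NOT INITIAL.
  SELECT * FROM scarr
    INTO TABLE lt_carriers
    FOR ALL ENTRIES IN lt_bookings
    WHERE carrid = lt_bookings-carrid. " note the - sign, not ~
ENDIF.

LOOP AT lt_carriers INTO ls_carrier.
  WRITE: / ls_carrier-carrid, ls_carrier-carrname.
ENDLOOP.
```

Even if the same airline appears in many lines of lt_bookings, it is returned only once,
because duplicate records are removed from the result set.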



Screenshot 21: using the FOR ALL ENTRIES IN internal table construct.


Note: the tool I've used for demonstration uses outer_table as the name of the internal
table. Its contents come from the select query of Example 17.


Note: As of release 6.10, the same internal table can be specified after FOR ALL ENTRIES
and after INTO. However, be careful because in this case all fields of the internal table that
are not filled by the SELECT query will be cleared.


Note: performance-wise there are endless discussions on SCN about whether a table join or
the FOR ALL ENTRIES IN construct is better. It really depends on the buffering settings of
the tables you select from, the fields you use for joins and selections, the indexes that are
available, the number of records in both tables, profile parameters of the SAP system etc. In
general I prefer joins, since they are designed especially for the purpose of reading data
from multiple tables at the same time and they are executed on the database layer. Of course
certain situations speak against a table join. Also, you have no choice if you use a
BAPI/Function Module/Class method to get records from the database: in that case you
cannot use a table join, so you have to use the FOR ALL ENTRIES IN construct.


Other keywords


SINGLE and FOR UPDATE


If you use the SINGLE addition, the result set will contain at most one record. If the
remaining clauses of the SELECT statement would return more than one line, only the first
will be returned.


The FOR UPDATE addition can only be used together with the SINGLE addition. It sets an
exclusive lock on the selected record. However, it is rarely used, and I also
prefer to call the lock function modules separately.


Note: The addition SINGLE is not permitted in a subquery, and the ORDER BY clause cannot
be used together with it.
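
A minimal sketch of a single-record read (the report name is hypothetical). Since the full
key of SCARR is specified, at most one record can match anyway.

```abap
REPORT z_demo_select_single. " hypothetical report name

DATA ls_carrier TYPE scarr.

* The target is a flat work area, not an internal table.
SELECT SINGLE * FROM scarr
  INTO ls_carrier
  WHERE carrid = 'LH'.

* sy-subrc is 0 if a record was found and 4 if not.
IF sy-subrc = 0.
  WRITE: / ls_carrier-carrid, ls_carrier-carrname.
ELSE.
  WRITE: / 'Airline not found'.
ENDIF.
```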


CLIENT SPECIFIED


This addition switches off the automatic client handling of Open SQL. When using the
addition CLIENT SPECIFIED, the first column of the client-dependent database tables can
be specified in the WHERE and ORDER BY clauses.
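
A minimal sketch of a cross-client read. The report name is hypothetical and client 000 is
only an illustrative value.

```abap
REPORT z_demo_client_specified. " hypothetical report name

DATA: lt_carriers TYPE STANDARD TABLE OF scarr,
      ls_carrier  TYPE scarr.

* Without CLIENT SPECIFIED, only records of the logon client are read
* and MANDT must not be used in the WHERE clause.
SELECT * FROM scarr CLIENT SPECIFIED
  INTO TABLE lt_carriers
  WHERE mandt = '000'.

LOOP AT lt_carriers INTO ls_carrier.
  WRITE: / ls_carrier-mandt, ls_carrier-carrid, ls_carrier-carrname.
ENDLOOP.
```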


BYPASSING BUFFER


This addition causes the SELECT statement to bypass SAP buffering and to read directly
from the database and not from the buffer on the application server.
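
A minimal sketch (hypothetical report name). Whether SCARR is actually buffered depends on
its technical settings in your system; the addition is harmless if it is not.

```abap
REPORT z_demo_bypassing_buffer. " hypothetical report name

DATA ls_carrier TYPE scarr.

* Read the record from the database even if the table is buffered
* on the application server.
SELECT SINGLE * FROM scarr BYPASSING BUFFER
  INTO ls_carrier
  WHERE carrid = 'LH'.

IF sy-subrc = 0.
  WRITE: / ls_carrier-carrid, ls_carrier-carrname.
ENDIF.
```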


ENDSELECT


A SELECT statement may retrieve database records one by one (functioning as a loop,
with keyword INTO and a work area), or all at once (this is called array fetch and is used
with keyword INTO TABLE/APPENDING TABLE). In the first case the ENDSELECT statement
closes the loop started with SELECT. Both constructs retrieve the same result.
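
The two variants side by side, as a minimal sketch with a hypothetical report name:

```abap
REPORT z_demo_endselect. " hypothetical report name

DATA: ls_carrier  TYPE scarr,
      lt_carriers TYPE STANDARD TABLE OF scarr.

* Variant 1: SELECT loop. Each pass places one record in the
* work area; ENDSELECT closes the loop.
SELECT * FROM scarr INTO ls_carrier.
  WRITE: / ls_carrier-carrid, ls_carrier-carrname.
ENDSELECT.

* Variant 2: array fetch. All records are placed in the internal
* table by one statement, so no ENDSELECT is needed.
SELECT * FROM scarr INTO TABLE lt_carriers.

LOOP AT lt_carriers INTO ls_carrier.
  WRITE: / ls_carrier-carrid, ls_carrier-carrname.
ENDLOOP.
```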


Out of scope for this blog post


Database hints


Basically, hints are used to specify how the database engine should execute our query.
If you omit them, the DB engine will use its own optimizer to determine the best strategy
to execute the SELECT statement. Using hints to override this default strategy is quite
frequent outside of SAP, but it is seldom done in ABAP.


One reason is that not many developers know that it is possible to use hints with OpenSQL
statements (not only with native ones). Also, there are certain drawbacks (problems during
DB upgrade or change of DB server) and there is more possibility for human errors.


There is a very good overview of database hints in SAP Note 129385 - Database hints in
Open SQL


Dynamic token specification


It is possible to assemble OpenSQL statements during runtime. So instead of coding a
static SELECT statement, you can use character-type variables to hold the SELECT,
FROM, WHERE etc. clauses. This may come in handy in certain cases, but it has
disadvantages regarding performance, maintainability and code readability. Also, there are
certain SAP release dependent restrictions on the allowed syntax elements. There are
some nice materials on SCN and other sites that deal with this topic.
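
A minimal sketch with a dynamic FROM and WHERE clause. The report name and the variable
names are hypothetical, and the target table is still typed statically here to keep the
example short.

```abap
REPORT z_demo_dynamic_tokens. " hypothetical report name

DATA: lv_table    TYPE string,
      lv_where    TYPE string,
      lt_carriers TYPE STANDARD TABLE OF scarr,
      ls_carrier  TYPE scarr.

* The clauses are plain character strings assembled at runtime.
lv_table = 'SCARR'.
lv_where = `CARRID = 'LH'`.

* Errors in dynamic clauses are only detected at runtime,
* so the statement is wrapped in a TRY block.
TRY.
    SELECT * FROM (lv_table)
      INTO TABLE lt_carriers
      WHERE (lv_where).
  CATCH cx_sy_dynamic_osql_error.
    WRITE: / 'Invalid dynamic clause'.
ENDTRY.

LOOP AT lt_carriers INTO ls_carrier.
  WRITE: / ls_carrier-carrid, ls_carrier-carrname.
ENDLOOP.
```

Never build such clauses directly from unchecked user input, because this opens the door
to SQL injection.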


Conclusion


As you can see, even though Open SQL is a very limited subset of the modern SQL
language, it still allows you to execute quite complex queries. In most cases the whole
business logic of a complex report cannot be mapped into a single select query. However, if
you know what possibilities you have, you can write much more elegant and compact
program code with better performance.


Thanks for reading and have fun using Open SQL.
