John B. Goodenough
Susan L. Gerhart
SofTech, Inc., Waltham, Mass.
What are the possible sources of failure in a program?
What test data should be selected to demonstrate that failures do not arise from these sources?
proofs of correctness
Abstract
This paper examines the theoretical and practical role of testing in software development. We prove a fundamental theorem showing that properly structured tests are capable of demonstrating the absence of errors in a program. The theorem's proof hinges on our definition of test reliability and validity, but its practical utility hinges on being able to show when a test is actually reliable. We explain what makes tests unreliable (for example, we show by example why testing all program statements, predicates, or paths is not usually sufficient to insure test reliability), and we outline a possible approach to developing reliable tests. We also show how the analysis required to define reliable tests can help in checking a program's design and specifications as well as in preventing and detecting implementation errors.
1.
1.1
of this paper is:
questions
examined
Durham,
Testing Concepts
Introduction
The purpose
Fundamental
through-
DAAA25-74N.C.
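Given a program $F$ with domain $D$, output requirement $\mathrm{OK}(d) = \mathrm{OUT}(d, F(d))$, and test data selection criterion $C$:

(1) $\mathrm{SUCCESSFUL}(T) = (\forall t \in T)\ \mathrm{OK}(t)$
(2) $\mathrm{RELIABLE}(C) = (\forall T_1, T_2 \subseteq D)\,\bigl(\mathrm{COMPLETE}(T_1, C) \land \mathrm{COMPLETE}(T_2, C) \supset (\mathrm{SUCCESSFUL}(T_1) \equiv \mathrm{SUCCESSFUL}(T_2))\bigr)$
(3) $\mathrm{VALID}(C) = (\forall d \in D)\,\bigl(\lnot \mathrm{OK}(d) \supset (\exists T \subseteq D)(\mathrm{COMPLETE}(T, C) \land \lnot \mathrm{SUCCESSFUL}(T))\bigr)$

Fundamental Theorem
$(\exists T \subseteq D)(\exists C)\,\bigl(\mathrm{COMPLETE}(T, C) \land \mathrm{RELIABLE}(C) \land \mathrm{VALID}(C) \land \mathrm{SUCCESSFUL}(T)\bigr) \supset (\forall d \in D)\ \mathrm{OK}(d)$

Figure 1. Formal definitions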
criterion is reliable if and only if every T satisfying COMPLETE(T, C) is processed successfully by F, or if every such T is unsuccessfully processed (see Figure 1). In short, to be reliable, C must insure selection of tests that are consistent in their ability to reveal errors, as opposed to necessarily being able to detect all errors. Note that if C is reliable, it is only necessary to test one complete set of test data; no further information will be derived from testing other complete sets of test data.
Theorem
of T e s t i n g
strates that tests satisfying COMPLETE(T, C), where C satisfies RELIABLE and VALID, are "thorough" in the appropriate sense. Note that proving a data selection criterion to be reliable and valid, and then finding and successfully executing a complete test satisfying this criterion, is just a way of proving the correctness of the program. In effect, the theorem states that in some cases, a test is a proof of correctness.
The proof of a selection criterion's validity and reliability is easy in some cases. For example, if C is defined so the only complete test is an exhaustive one, i.e., if $\mathrm{COMPLETE}(T, C) \supset (T = D)$, then C obviously satisfies RELIABLE and VALID. Another interesting example is when C is unsatisfiable by any $d \in D$. Then T will be empty, i.e., no testing will be done. In this case, such a C clearly satisfies RELIABLE(C). Proof of C's validity, however, is more difficult. In fact, such a proof exists if and only if the program contains no errors. Hence in this case, proof of C's validity is equivalent to a direct proof of the program's correctness.
A proof of validity is trivial, however, if C does not exclude any member of D from some set of test data, i.e., if it can be shown that for all $d \in D$, there exists a T containing d and satisfying COMPLETE(T, C). In this case, some testing must be performed, and in general, the proof of C's reliability is not trivial. The remainder of the paper will concentrate on what must be known about programs to insure reliability in this case, and in Section 5 we will give some guidelines for finding non-trivial reliable test data selection criteria. Also in Section 5, we will give a specific example of a type of data selection criterion and its corresponding definitions of COMPLETE, RELIABLE, and VALID. But to motivate this example, we first need to look at sources of errors in programs, both in general (Section 1.2) and with reference to example programs (Section 2).
The formal definitions of RELIABLE and VALID given in Figure 1 merely state precisely what we already have said informally about RELIABLE(C) and VALID(C). To show the utility of the definitions, we use them in stating and proving the Fundamental Theorem on which all testing is based (see Figure 1). Its proof is simple:

Assume there exists some $d \in D$ for which F fails (i.e., $\lnot \mathrm{OK}(d)$). Then VALID(C) implies there exists a complete set of test data, T, that is not successful. RELIABLE(C) implies that if one complete test fails, all fail. But this contradicts the theorem's premise, i.e., that there exists a complete test that is successfully executed. Q.E.D.
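
The definitions above can be checked by brute force when the domain is small and finite. The following is a minimal Python sketch under that assumption; the function names (`subsets`, `successful`, `reliable`, `valid`, `theorem_holds`), the representation of a criterion as a predicate `complete(T)`, and the toy program in the usage lines are all illustrative choices, not part of the paper.

```python
from itertools import combinations

def subsets(domain):
    """All subsets T of a finite domain D, as frozensets."""
    items = list(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def successful(test, ok):
    """SUCCESSFUL(T): every t in T satisfies OK(t)."""
    return all(ok(t) for t in test)

def complete_tests(domain, complete):
    """Every T that satisfies COMPLETE(T, C) for the given criterion."""
    return [t for t in subsets(domain) if complete(t)]

def reliable(domain, complete, ok):
    """RELIABLE(C): all complete tests succeed, or all of them fail."""
    outcomes = {successful(t, ok) for t in complete_tests(domain, complete)}
    return len(outcomes) <= 1

def valid(domain, complete, ok):
    """VALID(C): every failing d is exposed by some unsuccessful complete test."""
    tests = complete_tests(domain, complete)
    return all(ok(d) or any(not successful(t, ok) for t in tests)
               for d in domain)

def theorem_holds(domain, complete, ok):
    """Premise of the Fundamental Theorem implies OK(d) for all d."""
    premise = (reliable(domain, complete, ok)
               and valid(domain, complete, ok)
               and any(successful(t, ok)
                       for t in complete_tests(domain, complete)))
    return (not premise) or all(ok(d) for d in domain)

# Hypothetical example: the program should square d but doubles it instead.
D = range(4)
ok = lambda d: 2 * d == d * d            # holds only for d = 0 and d = 2
exhaustive = lambda t: t == frozenset(D)  # the only complete test is D itself
any_single = lambda t: len(t) == 1        # any one-element test is complete
only_zero = lambda t: t == frozenset({0}) # only {0} is complete
```

With these toy criteria, `exhaustive` is both reliable and valid, `any_single` is valid but not reliable (the tests {0} and {1} disagree), and `only_zero` is reliable but not valid (no complete test can expose the failure at d = 1).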
1.2 Types of Program Errors
Example 1: A Simple Text Reformatter
N1: The program does not terminate when the end of the given text is reached, although Naur provides for termination (through the undefined language construct "Alarm") if a word containing more than MAXPOS characters (an oversize word) is seen. (Note that in this case the specification cannot be satisfied.) Non-termination on end of text will, of course, be discovered with any test data not containing an oversize word. The effect of processing an oversize word, however, would only be seen if test data contained such a word. A reliable test methodology will insure that oversize words are presented to the program (even though they are not mentioned in the specification), since such data are not excluded from the program's input domain and the program will not necessarily process oversize words correctly if it processes shorter words correctly.
    bufpos := 0;
    outcharacter(LF);
    fill := 0;
next character:
    incharacter(CW);
    if CW = BLANK ∨ CW = LF
    then begin
        if fill + 1 + bufpos ≤ MAXPOS
        then begin
            outcharacter(BLANK);
            fill := fill + 1 end
        else begin
            outcharacter(LF);
            fill := 0 end;
        for k := 1 step 1 until bufpos do
            outcharacter(buffer[k]);
        fill := fill + bufpos;
        bufpos := 0
    end
    else
        if bufpos = MAXPOS
        then Alarm
        else begin
            bufpos := bufpos + 1;
            buffer[bufpos] := CW end;
    go to next character;
Figure
 1  Alarm := false;
 2  bufpos := 0;
 3  fill := 0;
 4  repeat
 5      incharacter(CW);
 6      if CW = BL ∨ CW = NL ∨ CW = ET
 7      then
 8          if bufpos ≠ 0
 9          then begin
10              if fill + bufpos < MAXPOS ∧ fill ≠ 0
11              then begin
12                  outcharacter(BL);
13                  fill := fill + 1 end
14              else begin
15                  outcharacter(NL);
16                  fill := 0 end;
17              for k := 1 step 1 until bufpos do
18                  outcharacter(buffer[k]);
19              fill := fill + bufpos;
20              bufpos := 0 end
21      else
22          if bufpos = MAXPOS
23          then Alarm := true
24          else begin
25              bufpos := bufpos + 1;
26              buffer[bufpos] := CW end
27  until Alarm ∨ CW = ET;
Figure
Z
Corrected
assumption about the form of the input (i.e., no consecutive BLANKs or NL characters) and so do not reveal the error. How can this sort of error be discovered through a systematic approach to testing?
N2: The last word in the text will not be output unless it is followed by a BLANK or NL.
N3: A blank will appear before the first word on the first line except when the first word is exactly MAXPOS characters long. This can cause a violation of constraint (2) for the first line. The reason for this error is clear. After creating action clusters for a word buffer, Naur makes the following assertion: "The input character preceding the one held in buffer[1] was a BLANK or NL. This has not been output" (p. 252). This assertion is false for the first word and is never disproved. Thus a boundary type of case (first character, first word) causes a proof error. Of course this error will be found for any test data whose first word is less than MAXPOS characters long.
N6: If the first word of the input text is preceded by a BL or NL, the output will contain either two blanks preceding the first word or a line containing just two blanks, violating constraint (2).
N7: The specifications use NL as the newline character but the program uses LF. This error probably arises from failure to proofread, but could also be traced to a failure to specify the character set of the problem. If LF and NL are distinct characters, then any input text containing a NL or LF character will reveal the problem.
N5: No provision is made for processing successive adjacent breaks (e.g., two blanks). This error arises because a word is defined as the characters (other than NL or BLANK) appearing between successive NL or BLANK characters, and words of zero length are simply not considered in any discussion of the program's assertions. Specification (2) requires as many words as possible on a line, and this specification makes no sense if zero-length words are permitted. So an important case has not been considered in either the program, the input description, or the program's informal proof. Naur's suggested test data for this program appear to maintain this implicit
1.
2.
If this program had been coded and run on some test data, such as that used to illustrate the program output (p. 251), errors N1, N2, N3, and N7 would have been detected. Errors N4, N5, and N6 would not have been revealed.
N7:
Since we wish to use this example to illustrate other types of possible errors and then again in Section 4 to illustrate a general method for selecting test data, we will clean up the specifications and program:
The corrected version of Naur's program appears in Figure 3. First let's look at the corrections for errors N1 through N7.
N1, N2: The endless loop constructed with a goto in Naur's program has been replaced with a repeat-until having Alarm as a Boolean variable and ET as an end-of-text indicator.
N3: The condition fill ≠ 0 has been conjoined with the condition fill + bufpos < MAXPOS to prevent the output of a BL before the first word and to produce a NL instead. The condition fill = 0 holds only for the first line, since every other line will contain at least one word. Equally well, we could have initialized fill to the value MAXPOS, insuring that fill + bufpos < MAXPOS would be false the first time.
N4: Line 2 of Naur's program, outcharacter(NL), is removed so that only one NL character will precede the first word of the output text.
N5, N6: An extra predicate, bufpos ≠ 0, prevents output when two consecutive breaks occur; the first break forces the word to be output and bufpos to be reset to zero.
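
To make the effect of these corrections concrete, the corrected program can be transliterated into Python roughly as follows. This is a sketch, not the authors' code: the encoding of ET as the control character `"\x03"`, the use of a Python list for the word buffer, and returning the output text together with the Alarm flag (instead of calling outcharacter) are assumptions made for illustration. The input is assumed to end with ET.

```python
BL, NL, ET = " ", "\n", "\x03"  # ET encoding is an assumption of this sketch

def reformat(text, maxpos):
    """Transliteration of the corrected reformatter.

    text   -- iterable of single characters, assumed to end with ET
    maxpos -- maximum line length (MAXPOS)
    Returns (output_text, alarm).
    """
    out = []        # characters passed to outcharacter
    buffer = []     # current word; len(buffer) plays the role of bufpos
    fill = 0        # characters already placed on the current output line
    alarm = False
    for cw in text:                                   # incharacter(CW)
        if cw in (BL, NL, ET):                        # a break or end of text
            if buffer:                                # bufpos != 0
                if fill + len(buffer) < maxpos and fill != 0:
                    out.append(BL)                    # word fits on this line
                    fill += 1
                else:
                    out.append(NL)                    # start a new line
                    fill = 0
                out.extend(buffer)
                fill += len(buffer)
                buffer = []
        elif len(buffer) == maxpos:                   # oversize word
            alarm = True
        else:
            buffer.append(cw)
        if alarm or cw == ET:                         # until Alarm or CW = ET
            break
    return "".join(out), alarm
```

For example, with MAXPOS = 3 the input A, BL, B, NL, C, NL, ET yields a leading NL followed by "A B" on the first line and "C" on the second, and the input A, A, A, A, ET raises the alarm without producing output.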
The decision table format makes it easier to see how various sets of test data satisfy various test data selection criteria. Below the program are written four sets of test data. Each set assumes MAXPOS = 3 and meets some test data selection criterion based on the program's structure. D1 exercises all statements (by exercising rules 3, 5, 9 and 10), D2 all statements and composite predicates, and D3 all statements and individual predicates.*
* An individual predicate is considered to be exercised when it is necessarily evaluated for some data and it takes on both true and false values. For example, data satisfying rules 1-4 are not considered to exercise C2 because C2 need not be evaluated. C2 is exercised by data satisfying one of rules 5-8 and either rule 9 or rule 10.
Table 1
DECISION TABLE REPRESENTATION OF PROGRAM

Initial conditions: ¬C3 ∧ ¬C5 ∧ Alarm = FALSE

Conditions
C1: CW = BL ∨ CW = NL
C2: CW = ET
C3: bufpos ≠ 0
C4: fill + bufpos < MAXPOS
C5: fill ≠ 0
C6: bufpos = MAXPOS

Actions
A1 a: outcharacter(BL)
   b: fill := fill + 1
A2 a: outcharacter(NL)
   b: fill := 0
A3 a: for k := 1 until bufpos outcharacter(buffer[k])
   b: fill := fill + bufpos
   c: bufpos := 0
A4:   Alarm := TRUE
A5 a: bufpos := bufpos + 1
   b: buffer[bufpos] := CW
A6 a: incharacter(CW)
   b: Repeat table
A7:   Exit table

[The Y/N condition entries and X action marks for rules 1-10, and the "Rule Exercised" column of the test data, are not legible in this copy.]

Test Data
D1.1  A, A, A, A, ET
D1.2  A, A, A, BL, B, BL, C, ET
D2.1  A, A, A, A, ET
D2.2  A, A, A, BL, B, BL, C, NL, ET
D3.1  A, A, A, A, ET
D3.2  A, BL, B, NL, C, NL, ET
D4.1  A, A, A, A, ET
D4.2  A, BL, BL, B, BL, C, NL, ET
D4.3  A, ET
D4.4  A, BL, B, ET
D4.5  A, BL, B, NL, C, BL, D, D, D, ET
We wish to show that none of these exercising criteria are completely reliable, i.e., it is possible to exercise a program containing an error using any of these criteria without necessarily discovering the error. This shows that tests based solely on a program's internal structure are unreliable; their success is poor evidence that a program contains no errors. Five examples of errors are summarized in Table 3. The errors can be characterized as follows:
1) an incorrect predicate (fill + bufpos ≤ MAXPOS instead of fill + bufpos < MAXPOS) causing rules 1 or 5 to be selected under circumstances where rules 3 or 7 should have been selected. This is an inappropriate path selection type of construction error. It will not be detected by any of the four test data sets. To
Table 2
RULES EXERCISED TO SATISFY VARIOUS COMPLETENESS CRITERIA

1. Composite Predicates

Composite Predicate    Rules To Be Exercised
                       TRUE          FALSE
C1 ∨ C2                (1-8)         (9, 10)
Alarm ∨ C2             (5-9)         (1-4, 10)
C3                     (1-3, 5-7)    (4, 8)
C4 ∧ C5                (1, 5)        (2, 3, 6, 7)
C6                     (9)           (10)

2. Individual Predicates

Individual Predicate   Rules To Be Exercised
                       TRUE          FALSE
C1BL                   (1-4)         (1-10)
C1NL                   (1-4)         (5-10)
C2                     (5-8)         (9, 10)
C3                     (1-3, 5-7)    (4, 8)
C4                     (1, 2, 5, 6)  (3, 7)
C5                     (1, 5)        (2, 6)
C6                     (9)           (10)
Alarm                  (9)           (10)

[The "Rules Exercised By D1", "Rules Exercised By D2", and "Rules Exercised By D3" columns are not legible in this copy.]
Table 3
EFFECT OF ERRORS

1. Error in program: Change fill + bufpos < MAXPOS to fill + bufpos ≤ MAXPOS.
   Error type: inappropriate path selection.
   Effect on table: Condition C4 changed similarly.
2. Error in program: Omit line 13, fill := fill + 1.
   Error type: missing action.
   Effect on table: No mark on Action A1b.
3. Error in program: Omit bufpos ≠ 0 test (line 8). (This is Naur's error N5, 6.)
   Error type: missing path.
   Effect on table: Condition C3 line will be removed and therefore rules 4 and 8 can be dropped.
4. Error in program: Omit fill ≠ 0 from line 10. (This is Naur's error N3.)
   Error type: inappropriate path selection.
   Effect on table: Condition C5 removed from table. Rules 2 and 6 are eliminated, being subsumed under rules 1 and 5.
5. Error in program: Omit CW = ET from line 6.
   Error type: inappropriate path selection.
   Effect on table: Rules 5-8 eliminated and rule 9 is modified. See Table 4.
Table 4
DECISION TABLE REPRESENTATION OF PROGRAM CONTAINING ERROR

[Conditions C1-C6 and actions A1-A7 are listed as in Table 1; the rule columns (1-7, 8', 9', 10) and their Y/N and X entries are not legible in this copy.]
replaced with fill := MAXPOS). This error is easily detected by any data whose first word is less than MAXPOS characters long. Note that D1 and D2 do not detect this error precisely because the first word is exactly MAXPOS characters long. In fact, data set D2 exercises all individual predicates, interior loop paths, and statements in the erroneous program without showing this error. It is also readily possible to devise data to exercise all rules in a decision table representation of the erroneous program without revealing the error.
3) a missing predicate (bufpos ≠ 0) yielding a missing path type of error that would not be detected with the D1 data set. (This is one of Naur's errors.) Although the other data sets would detect this error, different data sets can be constructed that will not detect the error and yet will satisfy the various exercising criteria for the erroneous program. For example, D1 as it stands would exercise all statements and composite predicates in the modified program (see Table 2 with C3 eliminated). Also, test D4.2 could then be eliminated and the other D4 data would then be sufficient to exercise all rules, interior loop paths, and individual predicates without revealing the error.
This analysis of five possible errors shows the unreliability of test data selected merely to exercise all statements, composite predicates, individual predicates, loop iteration paths, or rules in a program's decision table representation. These examples of errors show that:
Table 5
ALL FEASIBLE RULES IMPLIED BY TABLE 1

[Table body (conditions C1-C6, actions A1-A7, rule columns, and the rule sequences exercised by each test) is not legible in this copy.]
1) To detect errors reliably, it is in general necessary to execute a statement under more than one combination of conditions to verify that its effect is appropriate under all circumstances. Exercising any statement just once is usually inadequate.
2) Equally well, the same path through a loop will usually have to be exercised more than once before the right combination of conditions is found to reveal a missing path or inappropriate path selection error. For example, error 5 (see Table 3) requires exercising rule 8' of Table 4 with bufpos ≠ 0 to show the error. This path should also be exercised with bufpos = 0 to guard against some other error, e.g., error 3, even though bufpos's value is not even tested when executing rule 8'.
In short, the secret of reliable testing is to find all conditions relevant to a program's correct operation and to exercise all possible combinations of these conditions. In Section 4, we will illustrate how relevant conditions can be discovered.
In short, a reliable test is designed not so much to exercise program paths as to exercise paths under circumstances such that an error is detectable if one exists. Tests based solely on the internal structure of a program are likely to be unreliable.
2.3 Example 2: An Exam Scheduler
The problem is to construct a timetable for university examinations such that
1) The number of sessions is kept small, though no absolute minimum is stipulated.
2) Each exam is scheduled for one session.
3) No exam is scheduled for more than one session.
4) No session involves more than K exams.
5) No session involves more than S students.
6) No student takes more than one exam in a session.
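
Constraints 2) through 6) are mechanically checkable for any proposed timetable, which is what makes them usable as an output requirement in a test. The following Python sketch shows one way to do so; the data representation (sessions as sets of exam names, registrations as a mapping from student to exams) and the names `check_timetable`, `max_exams`, and `max_students` are assumptions made for illustration. Constraint 1) is not checked because the specification gives no bound for it.

```python
def check_timetable(sessions, exams, registrations, max_exams, max_students):
    """Return a set of violations of constraints 2-6.

    sessions      -- list of sets of exam names, one set per session
    exams         -- set of all exams to be scheduled (may include empty exams)
    registrations -- dict mapping student name to the set of exams taken
    max_exams     -- K, the limit on exams per session
    max_students  -- S, the limit on students per session
    """
    violations = set()
    for exam in exams:
        count = sum(1 for s in sessions if exam in s)
        if count == 0:
            violations.add((2, exam))          # exam not scheduled at all
        elif count > 1:
            violations.add((3, exam))          # exam in more than one session
    for i, session in enumerate(sessions):
        if len(session) > max_exams:
            violations.add((4, i))             # too many exams in session i
        sitting = [st for st, taken in registrations.items() if taken & session]
        if len(sitting) > max_students:
            violations.add((5, i))             # too many students in session i
        for st, taken in registrations.items():
            if len(taken & session) > 1:
                violations.add((6, i, st))     # student has a clash in session i
    return violations

# Hypothetical data for illustration.
exams = {"e1", "e2", "e3"}
registrations = {"ann": {"e1", "e2"}, "bob": {"e2", "e3"}}
good = [{"e1", "e3"}, {"e2"}]
bad = [{"e1", "e2"}, {"e2"}]
```

A checker of this kind only decides whether a given output is acceptable; it says nothing about which inputs (for instance, exams with no registered students) need to be presented to the scheduler.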
proof of correctness approach to software reliability.
3. Views on Testing
3.1 "Exhaustive" Testing
1) P e r f o r m a n c e .
E x a c t l y 500 e x a m s a r e
a l w a y s p r o c e s s e d e v e n if t h e r e a r e f e w e r
t h a n 500 e x a m s t o b e s c h e d u l e d . S i n c e t h e
program is recursive and searches all comb i n a t i o n s of u n s c h e d u l e d e x a m s , the p e r f o r m
anc~ will be degraded by processing "empty
e x a m s " ( e x a m s f o r w h i c h no s t u d e n t i s r e g i s tered).
Z) V i o l a t i o n o f c o n s t r a i n t s .
Empty exams still
h a v e to b e s c h e d u l e d in s o m e s e s s i o n , s o i t
i s p o s s i b l e t h a t t h e l i m i t a t i o n i m p l i c i t i n (1)
could be violated unnecessarily.
That is,
the p r o g r a m a s it s t a n d s m a y be u n a b l e to
produce a solution where it should if empty
exams were not processed.
Is there a way to test a program which is equivalent to exhaustive testing (in the sense of being reliable and valid)?
Is there a practical approximation to exhaustive testing?
It should be noted that exhaustive testing of a program's input domain is a process not necessarily guaranteed to terminate. There are some programs whose behavior (e.g., whether they stop) is impossible to verify by testing or any other means and some programs have infinite input domains, so exhaustive testing can never be completed.
Although some programs cannot be exhaustively tested, a basic hypothesis for the reliability and validity of testing is that the input domain of a program can be partitioned into a finite number of equivalence classes such that a test of a representative of each class will, by induction, test the entire class, and hence, the equivalent of exhaustive testing of the input domain can be performed. If such tests all terminate and if the partitioning is appropriate, a completely reliable test of the program will have been performed.
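
As a rough illustration of the partitioning hypothesis, the Python sketch below selects one representative per equivalence class from a finite sample of inputs. The function `representatives` and the example classifier (word length relative to MAXPOS, borrowed from Example 1) are hypothetical. Whether a chosen partition is appropriate, i.e., whether success on a representative really implies success on its whole class, is exactly what must be argued separately and is not established by this code.

```python
def representatives(domain, classify):
    """Pick the first element seen for each equivalence class.

    domain   -- iterable of candidate inputs
    classify -- function mapping an input to a label for its class
    """
    reps = {}
    for d in domain:
        reps.setdefault(classify(d), d)   # keep only the first member per class
    return reps

MAXPOS = 3

def word_class(word):
    """Hypothetical partition of words by length relative to MAXPOS."""
    if len(word) < MAXPOS:
        return "shorter"
    if len(word) == MAXPOS:
        return "exact"
    return "oversize"

sample = ["a", "ab", "abc", "abcd", "abcde"]
reps = representatives(sample, word_class)
```

Here five sample words collapse to three representatives, one per class.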
This is not, of course, a novel idea. Hoare [in Buxton (1970), p. 21] has pointed out that the essence of testing is to establish the base proposition of an inductive proof (we pursue this idea further in Section 5). This pinpoints the fundamental problem of testing: the inference from the success of one set of test data that others will also succeed, and that the success of one
Than Testing?
3. 3
Just because testing is not a completely reliable means of demonstrating program correctness does not mean it's sensible to rely solely on proofs. Proofs aren't completely reliable either. Proofs can only provide assurance of correctness if all the following are true:
The Effect of Testing Considerations on Program Development
with
c. The program is completely and formally implemented in such a way a proof can be performed or checked mechanically.
d. The specifications are correct in that if every program in the system is correct with respect to its specifications, then the entire system performs as desired.
a) Testing being inevitable, it is good practice to identify testing needs, e.g., weak or critical links in a system, early in program design (e.g., see Hansen (1973)).
These requirements are far beyond the state of the art of program specification and mechanical theorem proving, and we must be satisfied in practice with informal specifications, axiomatizations, and proofs. Then problems arise when proofs have errors, specifications are incomplete, ambiguous, or unformalizable (as in Example 2), and systems are not axiomatizable. The two examples already discussed have clearly shown that an incomplete attempt at a program proof does not assure a program will not fail. These examples are very realistic in terms of the state of the art of program proving for real programs.
b) Programs should be structured so logical testing of various abstractions of the program can reduce actual testing of the final program.
c) Specifications must be precise enough to be testable.
d) The need to get special information just to verify the successful execution of a test run affects program design (e.g., testing an operating system scheduler requires access to information not ordinarily available).
Conclusions
a) Design and specification, where it is necessary to exclude cases of data where the program is not expected to operate or to identify cases in the input or output which require special treatment. In Example 1, these would include successive break characters and termination characters, and in Example 2 the empty exam.
b) Program construction, where the solution to the problem dictates that certain cases require
Tests
Our purpose in this part of the paper is 1) to illustrate the problems and issues faced in developing reliable and valid tests for a particular program so that later, in Section 5, when we discuss formal criteria for reliability and validity, examples illustrating the formal criteria will be at hand; and 2) to show informally what may, with further research, become a practical approach to defining reliable tests.
4.1 An Overview of the Method
Deriving
A program's specifications are an important source of test predicates because data satisfying such predicates are able to detect missing path and inappropriate path selection construction errors. Such data can also detect design and specification errors, as we shall see. We will use Naur's specifications for Example 1 to illustrate test case development from specifications. Then, using knowledge of an actual implementation (either that shown in Figure 2 or Figure 3), we will show how to eliminate some test predicates without impairing test reliability. (It may also happen that some test predicates will have to be added when information about an actual implementation becomes available, but this does not occur here.)
The first part of Naur's specification states:
(1) "Given a text, consisting of words separated by BLANKs or NL (new line) characters ...."
This clause attempts to describe the program's input domain. We begin to develop a condition
Table 6
FIRST CONDITION TABLE FOR INPUT DOMAIN

[Table body is not legible in this copy.]

Table 7
REVISED CONDITION TABLE FOR INPUT DOMAIN

[Table body and the constraints between conditions are not legible in this copy.]
table by looking for conditions relevant to the input domain that are also relevant to the processing required by the specification, i.e., we try to extract conditions from the specification that we, as programmers, would consider relevant to deciding when it is appropriate to perform certain actions. For example, if we think in terms of scanning the input text character by character, then it's possible and reasonable to describe the input in terms of the character currently being scanned and the one immediately preceding it. This approach gives rise to the condition table shown in Table 6. The notation there shows that the values of PrevChar (condition C1) and CurChar (condition C2) constitute conditions relevant to the program's correct operation. In particular, the possible values for C1 and C2 are partitioned into three subsets, the value BL (for BLANK), the value NL, and all other character values, Ot. Any condition may, of course, be undefined (represented in the table by a question mark), and so there are four possibilities to be considered for each condition, yielding 16 predicates in all.
for dealing with various types of input media (e.g., cards or paper tape) and this interpretation subsumes the more restricted case of single break characters, so we adopt it. Of course, the ambiguity should be checked with the specification writer, since our reasoning for permitting data satisfying these predicates may not be valid in the context of the actual use of the program. It can be seen here how preparation of a condition table is a natural way to check a specification for completeness and lack of ambiguity, as well as for identifying characteristics of test data that should be presented to the program during the test phase.
Next we need to check the test predicates for reliability, i.e., if we select data to cover each predicate, will we be forced to select data capable of revealing all errors in an implementation? To answer this question, we must decide whether any conditions relevant to the correct operation of the program are missing from the table and whether value sets have been partitioned appropriately. We must also decide whether the test predicates are independent, i.e., whether the sequence in which test predicates are exercised when processing test data is potentially significant to the correctness of the program. For example, we can see that data chosen to exercise all the predicates in Table 6 will have to include data where words are separated by at least two break characters, but all predicates can be exercised without necessarily having exactly one break character between any words. Is the occurrence of a break of length one significant to the correct operation of a program? If a program correctly processes text containing breaks of length two or greater, will it necessarily correctly process text containing breaks of length one? What about breaks of length one and two preceding the first word in
The next part of Naur's specification states:
NL... "
This clause states a constraint on when the action of outputting a NL character is valid. Since we have already distinguished BLANKs and NL characters in Table 7, no new test conditions are suggested. Examining the clause with respect to Table 7, however, shows that a line break can never be made before the first word in the text unless the first word is preceded by one or more break characters. Naur's program itself violates this clause because it always puts out a line break at the beginning of the output. Undoubtedly, the intent of the clause was not to forbid an initializing line break in the output, but rather to say that a word in the input text must be contained completely on a single line in the output; here is another specification error (failure to correctly express the intent of the designer) readily detected by case analysis.

Test predicates can be eliminated (i.e., their exercising made unnecessary) if it can be shown that data satisfying the predicates are treated the same by the actual program and from the viewpoint of the specifications. For example, distinguishing between BL and NL in Table 7 is unnecessary. Cursory examination of the programs in either Figure 1 or Figure 2 shows that every time CW is tested for equality with BL it is also tested for equality with NL; the specifications, moreover, do not require different effects depending on whether a BL or NL is the value of a break character. So we can safely partition the value set for CurChar into BL or NL, ET, and Ot and the value set for PrevChar into BL or NL, and Ot without reducing the reliability of this set of test predicates for these particular programs. This will reduce the number of test predicates in Table 7 from 32 to 16 and the number of test runs required from ten (one for each ET condition) to six. This illustrates how the amount of testing can be safely reduced after a set of test predicates has been developed independently of program structure. Proofs of simple program properties can reduce the amount of testing required.

4.3 Summary

The major problem in our approach as illustrated so far is that conditions are considered as they spring to mind. This may mean wasting effort on conditions unlikely to be connected with errors in the actual implementation to be tested. Nonetheless, such conditions do test something about a program. The quality of the test predicates cannot be decided at the exact moment of their conception. Test predicate analysis must be carried out in its entirety and then checked for overall reliability before deleting individual test predicates.
actual program structure and what a program seems to do. It avoids the flaws of testing methods that focus solely on the internal structure of a program, but is not necessarily divorced from a program's internal structure since all program predicates must ultimately be represented in the condition table if the table is to define a reliable set of test predicates.

The type of reasoning used in selecting test predicates is much like that used in creating assertions, and hence the approach focuses attention on the abstract properties of the program and its specifications. A test predicate analysis of a program may be a practical first step toward program proving with the advantage that both testing and proving could be performed sequentially or in parallel.

5. The Theory of Testing

The methodology illustrated in the preceding section constitutes an informal application of the theoretical concepts defined in Section 1, i.e., Section 4 demonstrates the use of COMPLETE(T, C), RELIABLE(C), and VALID(C), as a means of devising thorough tests when the test data selection criterion, C, consists of a set of test predicates. In this Section, we will point out how our use of this sort of test data selection criterion satisfies these previously defined theoretical concepts.

said to be true. With this definition in mind, COMPLETE(T, C) is defined as follows:
if $T \subseteq D$, COMPLETE(T, C) = $(\forall c \in C)(\exists t \in T)\, c(t) \land (\forall t \in T)(\exists c \in C)\, c(t)$
This definition states that every test predicate belonging to C must be satisfied by at least one $t \in T$, and every t must satisfy at least one test predicate. Both requirements are necessary, since proof of C's validity will require proving the correctness of a program for data that satisfy no test predicate, and hence, are included in no set of test data T.

The examples in Section 2 showed that it is readily possible to choose data that exercise test predicates in an overlapping manner (e.g., see Table 5). This suggests a natural partitioning of the input domain into related equivalence classes, $E(C')$, defined as follows:
Let $C' \subseteq C$.
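The definition of COMPLETE(T, C) given above translates directly into an executable check. The following is a minimal Python sketch, assuming test predicates are modeled as Boolean functions of a single datum. The character predicates are hypothetical illustrations and are not the predicates of Table 7, and the sketch does not check the side condition that T is a subset of D.

```python
def complete(tests, criterion):
    """COMPLETE(T, C): every predicate in C is satisfied by some t in T,
    and every t in T satisfies at least one predicate in C."""
    tests = list(tests)
    every_predicate_covered = all(
        any(c(t) for t in tests) for c in criterion
    )
    every_test_relevant = all(
        any(c(t) for c in criterion) for t in tests
    )
    return every_predicate_covered and every_test_relevant


# Hypothetical predicates over single characters (illustration only).
def is_blank(ch):
    return ch == " "


def is_newline(ch):
    return ch == "\n"


def is_other(ch):
    return ch not in (" ", "\n")


C = [is_blank, is_newline, is_other]

print(complete([" ", "\n", "a"], C))                 # all predicates covered
print(complete([" ", "a"], C))                       # is_newline never satisfied
print(complete([" ", "\n", "a"], [is_blank, is_newline]))  # "a" satisfies nothing
```

The third call shows why the second clause matters: a datum that satisfies no test predicate cannot belong to any complete test, so its correct processing has to be established separately.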
[Figure 4 diagram: lattice of the subsets of $C = \{c_1, c_2, c_3\}$, from $C_{123} = C$ down to the empty set, with the corresponding equivalence classes from $E(C_{123})$ down to $E(\emptyset)$.]
Figure 4
Structure showing relationships between equivalence classes induced by a set of test predicates, C. If C is reliable, the correct processing of data drawn from one equivalence class (e.g., $E(C_{12})$) proves the correctness of the program for data drawn from certain other classes (e.g., $E(C_1)$ and $E(C_2)$), i.e., success propagates downwards in the lattice of equivalence classes. Similarly, failure propagates upward. For example, the incorrect processing of data drawn from $E(C_3)$ implies data drawn from $E(C_{23})$, $E(C_{13})$, and $E(C_{123})$ will also be processed incorrectly.
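The propagation rules stated in the caption of Figure 4 can be sketched in Python, assuming each equivalence class is identified by the set of predicate names its data satisfy. The names c1, c2, c3 follow the figure; the function names are hypothetical, and the sketch presumes C is reliable and ignores whether a given class actually contains any data.

```python
from itertools import combinations


def subsets(names):
    """All subsets of a collection of predicate names, as frozensets."""
    names = sorted(names)
    return [
        frozenset(s)
        for r in range(len(names) + 1)
        for s in combinations(names, r)
    ]


def success_propagates_to(c_prime):
    """Classes below E(c_prime): nonempty proper subsets of c_prime.
    If C is reliable, success on E(c_prime) implies success on these."""
    top = frozenset(c_prime)
    return {s for s in subsets(top) if s and s != top}


def failure_propagates_to(c_prime, all_names):
    """Classes above E(c_prime): proper supersets of c_prime within C.
    If C is reliable, failure on E(c_prime) implies failure on these."""
    bottom = frozenset(c_prime)
    return {s for s in subsets(all_names) if bottom < s}


print(success_propagates_to({"c1", "c2"}))
print(failure_propagates_to({"c3"}, {"c1", "c2", "c3"}))
```

The two calls reproduce the caption's examples: success on the class for c1 and c2 together carries down to the classes for c1 alone and c2 alone, and failure on the class for c3 carries up to the three classes that also include c3.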
$\supset$ VALID(C)
This means C is valid if a program is correct for all data satisfying no test predicate. With respect to the condition table technique, this validity criterion requires that constraints among conditions be properly described; every condition combination excluded as impossible must actually be impossible. Moreover, the domain of values associated with some variable must be correctly described, e.g., CW in Section 4 must at most take on the values BL, NL, ET, and Other. Although a non-empty $E(\emptyset)$ can be viewed as warning that a program can execute a reliable test successfully and still contain errors because it is invalid, it can also be viewed merely as indicating that the correctness of the program for such data is to be assessed by means other than the tests implied by C. Separate verification, for example, by proof, may be easier than verification by testing, or it may be that the data in $E(\emptyset)$ have already been verified by prior tests. It may be that a conscious decision has been made not to exercise certain test predicates because it is "obvious" the program will correctly process data satisfying them. This can reduce the total testing effort. In any event, the program's correctness for data belonging to $E(\emptyset)$ is not determined by the tests defined by C.
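Whether any data fall outside every test predicate can be checked mechanically on a finite sample of the domain. The following minimal Python sketch assumes that an equivalence class collects the data satisfying exactly one particular set of predicates, so that the class for the empty set holds the data satisfying no predicate. The character predicates, and the use of the ETX control character to stand in for ET, are illustrative assumptions.

```python
def satisfied_by(d, criterion):
    """Names of the test predicates that datum d satisfies."""
    return frozenset(name for name, pred in criterion.items() if pred(d))


def equivalence_classes(domain, criterion):
    """Group data by the exact set of predicates they satisfy."""
    classes = {}
    for d in domain:
        classes.setdefault(satisfied_by(d, criterion), []).append(d)
    return classes


def uncovered_data(domain, criterion):
    """Data satisfying no test predicate, i.e. the class for the empty set."""
    return equivalence_classes(domain, criterion).get(frozenset(), [])


# Hypothetical value classes for the current character CW.
criterion = {
    "BL": lambda ch: ch == " ",
    "NL": lambda ch: ch == "\n",
    "ET": lambda ch: ch == "\x03",   # assumption: ETX stands in for ET
    "Ot": lambda ch: ch.isalnum(),   # assumption: "other" means alphanumeric
}

domain = [" ", "\n", "\x03", "a", "-"]
print(uncovered_data(domain, criterion))  # data that no complete test can include
```

Here the hyphen satisfies no predicate, so the postulated value set for CW is too narrow for this sample, and the program's behavior on such data would have to be verified by other means.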
It may sometimes be impractical to conduct complete tests. In this case, if C is known to be valid because $E(\emptyset)$ is empty, then test data should be chosen just from equivalence classes that satisfy test predicates likely to be encountered in practice. While such a test will not necessarily be valid, it will still be reliable if C is reliable, and estimates of overall program reliability can be devised by estimating the frequency with which data satisfying unexercised test predicates will be encountered in actual use of the program. This sort of analysis can set a lower bound on the estimated frequency of failure in the program's actual operation, even when a program is tested only incompletely.
The formal definition of RELIABLE(C) implied by the use of test predicates is rather complex, but the essential idea is to define test predicates so success or failure when executing a program F with a particular datum d depends only on what individual test predicates d satisfies, not the particular combination satisfied. This is the sense in which test predicates must be independent. For example, using the definition of C given in Figure 4, if $d_{12} \in E(C_{12})$ and $OK(d_{12})$, then the definition of COMPLETE says that there is no need to test F with data belonging to $E(C_1)$ or $E(C_2)$. Hence $OK(d_{12})$ implies $OK(d_1)$ and $OK(d_2)$, where $d_1 \in E(C_1)$ and $d_2 \in E(C_2)$, if C is reliable. Similarly, $OK(d_1)$ and $OK(d_2)$ implies $OK(d_{12})$, since one complete test could include $d_{12}$ and another, both $d_1$ and $d_2$; regardless of which complete test is actually performed, the success of either complete test must imply the success of all others if C is reliable. Note also that the definition of COMPLETE implies that to be reliable, neither the sequence of test predicate execution nor the frequency of execution can be relevant to the successful processing of test data because data are said to satisfy a test predicate whether the data cause the predicate to be exercised one time or many times, and no matter what sequences of predicates are exercised before or after a particular one. If frequency or sequence of test predicate execution is potentially relevant to an error's existence, separate test predicates must be defined to cover all relevant sequencing or frequency possibilities. (This was the reason we added "break length" and "first word?" to the condition table created in Section 4.)
The formal definition of RELIABLE(C), when C is a set of test predicates, therefore is:
RELIABLE(C)
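The general notion of reliability from Section 1 (every complete test is processed successfully, or every complete test is processed unsuccessfully) can be checked by brute force on a small finite domain. The following Python sketch is only an illustration under stated assumptions: the integer domain, the two criteria, and the `ok` function (which pretends the program fails on the single datum -3) are all hypothetical, and enumerating every subset is feasible only for tiny domains.

```python
from itertools import combinations


def is_complete(tests, criterion):
    """COMPLETE(T, C) for predicates modeled as Boolean functions."""
    return all(any(c(t) for t in tests) for c in criterion) and all(
        any(c(t) for c in criterion) for t in tests
    )


def is_reliable(domain, criterion, ok):
    """True if all complete tests agree: all succeed or all fail."""
    outcomes = set()
    for size in range(1, len(domain) + 1):
        for tests in combinations(domain, size):
            if is_complete(tests, criterion):
                outcomes.add(all(ok(t) for t in tests))
    return len(outcomes) <= 1


domain = list(range(-3, 4))


def ok(d):
    # Hypothetical program behavior: correct everywhere except at -3.
    return d != -3


# Coarse criterion: negative data are lumped into one class.
coarse = [lambda d: d < 0, lambda d: d >= 0]
# Finer criterion: the troublesome value gets its own test predicate.
fine = [lambda d: d == -3, lambda d: -3 < d < 0, lambda d: d >= 0]

print(is_reliable(domain, coarse, ok))  # some complete tests miss -3
print(is_reliable(domain, fine, ok))    # every complete test includes -3
```

The coarse criterion is unreliable because the negative values are not all "treated the same" by the program; partitioning them so that the failing value has its own predicate forces every complete test to reveal the error.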
from failure to find all conditions relevant to the successful processing of data. For example, the set of test predicates described by a condition table will fail to be reliable if the domain of values associated with some variable is not correctly partitioned into relevant classes of values (e.g., the test predicates developed in Section 4 would be unreliable if the only relevant values of CW were not BL, NL, ET, and Other, i.e., if these were not the only values relevant when certain actions are to be performed) or if some variable or condition were completely missing from the table (e.g., the predicate fill+bufpos = MAXPOS). Note the difference here between a reliability error and a validity error. Reliability requires partitioning the value set for CW correctly into sets of values "treated the same" by the program. Validity requires the postulated value set for CW include all possible values for CW.
It's impossible to formulate general necessary conditions for reliability since reliability means that all errors in a program will be detected by a complete test. If a program in fact has no errors, any set of test predicates is reliable. For example, exercising all statements is not a necessary condition for reliability as long as unexercised statements are never responsible for an error. But suppose that test predicates are defined so data belonging to a particular equivalence class sometimes exercise a particular statement and sometimes do not. Then proving the reliability of these test predicates will be harder because it must be shown that the statements not exercised by some data in the equivalence class never cause an error. Otherwise, if an error is associated with the occasionally executed statement, some data in the class will execute successfully and some will not, and this contradicts RELIABLE(C). So, in general, we can only describe certain characteristics that if not satisfied either make a proof of reliability harder or make it more likely the test predicates are not reliable for the program to be tested. For example, it appears that a set of test predicates must at least satisfy the following conditions to have a reasonable chance of being reliable:
1) every individual branching condition in the program must be represented by an equivalent condition in the table;
2) every potential termination condition in the program [Sites(1974)], e.g., overflow, must be represented by a condition in the table;
3) every variable mentioned in a condition must have been partitioned correctly into classes that are "treated the same" by the program;
4) every condition relevant to the correct operation of the program that is implied by the specification, knowledge of the program's data structures, or knowledge of the general method being implemented by the program must be represented as a condition in the table;
5) the test predicates must be independent, e.g., all data satisfying a particular test predicate must exercise the same path in the program and test the same branch predicates.
Note that constraints 1, 2, and 3 are satisfied only with knowledge of the details of a program's implementation. Satisfying constraint 5 requires knowing something of the program's internal structure. Verifying constraint 4 may require both internal and external knowledge of a program. Undoubtedly, more constraints than these five need to be satisfied to insure reliability, but this is a subject for future work.

Clearly, satisfying all these constraints can lead to lots of test predicates requiring a large set of test data for a complete test. If an unreasonable amount of test data is required, a judicious combination of direct program proving and empirical judgement can reduce the size of a complete test. As long as test predicates are eliminated only after all conditions potentially relevant to the correct operation of the program have been identified, any reductions in test validity are at least being made consciously rather than unwittingly.

6. Summary
2. Test Weaknesses - The weakness of testing lies in concluding that from the successful execution of selected test data, a program is correct for all data. Criteria for test data selection based solely on internal program structure are clearly too weak to insure confidence in the results of a successful test, since such testing methods are too weak to necessarily reveal all design errors or many types of construction errors. Reliability is the key to meaningful testing and lack of reliability is the reason for the current weakness of testing as a method for insuring software correctness. Further work is needed to show when a test is reliable.

3. Test Methodology - We believe most software errors result from failing to see or deal correctly with all conditions and combinations of conditions relevant to the desired operation of a program. An effective methodology for reliable program development must focus on uncovering these conditions and their combinations. This is the motivation for our use of condition tables; it makes it easier to find and analyze condition combinations. Experimental use of condition tables and further study of the method is needed, however, before its full benefits can be realized in practice. In any event, a systematic approach to test data selection is clearly necessary to obtain effective test data with reasonable effort. Most systematic methods proposed to date have been based solely on knowledge of a program's internal structure, and yet such knowledge is insufficient by itself to yield adequately reliable tests.

4. Theory of Testing - We have made a start toward a theory of testing by defining some basic properties of thorough tests: reliability, validity, and completeness. We have also developed a specific version of the theory in which the test data selection criterion consists of sets of test predicates organized in a condition table. Other theoretical developments could proceed by defining other types of test data selection criteria.

References
Boehm, B. Software and its impact: a quantitative assessment. Datamation 19 (May 1973), 48-59.
Boehm, B.W. et al. Some experience with automated aids to the design of large-scale reliable software. In this proceedings, April 1975.
Buxton, J.N. and Randell, B. Software Engineering Techniques, Scientific Affairs Division, NATO, Brussels 39, Belgium, April 1970.
Dahl, O.-J., Dijkstra, E.W., and Hoare, C.A.R. Structured Programming. Academic Press, New York, 1972.
Elmendorf, W.R. Controlling the functional testing of an operating system. IEEE Trans. Sys. Sci. and Cyb. SSC-5, 4 (October 1969), 284-289.
1973), 145-150.
King, P.J.H. The interpretation of limited entry decision table format and relationships among conditions. Computer Journal 12, 4 (November 1969), 320-326.
London, R.L. Software reliability through proving programs correct. IEEE Int. Symp. on Fault Tolerant Computing, March 1971.
Naur, P. Programming by action clusters. BIT 9, 3 (1969a), 250-258.
Naur, P., and Randell, B. (eds.) Software Engineering, Scientific Affairs Division, NATO, Brussels 39, Belgium, January 1969b.
Pooch, U.W. Translation of decision tables. Computing Surveys 6, 2 (June 1974), 125-151.
Poole, P.C. Debugging and testing. In Bauer, F.L. Advanced Course on Software Engineering, Springer-Verlag, New York, 1973, 278-318.
Rubey, R.J. et al. Comparative Evaluation of PL/I, AD 6690~, April 1968.