Henry Muccini Ph.D. in Computer Science
Definition of Software Testing

The purpose of software testing is to increase the software engineer's confidence in the proper functioning of the software. As defined in [AntoSwebok], "software testing consists of the dynamic verification of the behavior of a program on a finite set of test cases, suitably selected from the usually infinite executions domain, against the specified expected behavior". Testing cannot show the absence of defects; it can only show that defects are present. Since the objective of testing is to find errors, a successful test is one that uncovers a previously undiscovered error, and a good test is one that has a high probability of doing so. The tester's aim is to design tests that systematically uncover different classes of errors with a minimum amount of time and effort. The primary role of testing is to provide techniques that reveal bugs that would be too costly (or impossible) to find with other verification and validation techniques. What must be clear is that exhaustive testing is usually not possible.

We may distinguish fault-directed tests, which are exercised to reveal faults, from conformance-directed tests, which are meant to demonstrate conformance to required capabilities.

To understand testing, we need to define the concepts of failure and fault. A failure is the manifested inability of a system or component to perform a required function; it is evidenced by incorrect output, abnormal termination or unmet time and space constraints. A software fault, instead, is missing or incorrect code. Many different failures can result from a single fault, and the same failure can be caused by different faults.

An important factor in testing software systems is the cost of the testing process, and an important decision point is when to conclude it. To make an effective evaluation, it should be possible to predict the cost of discovering (or of not discovering) the remaining defects.
The test activity consists of designing test cases, executing the software with these test cases and observing the results produced by these executions. In other words, the tester has to define the input data, run the system using these inputs and analyze the resulting outputs. Test cases may be generated manually or automatically, in various forms, depending on what we want to test.

The test activity is no longer associated only with source code: in the past, software testing was synonymous with testing programs, while nowadays it is seen as an activity that should encompass the whole development process. This implies that we can perform different kinds of testing along the development process [AntoSwebok]: we can test system specifications [Jo,ZH+,KF+], formally [ZH+,VDM,HP] or informally defined; architectural specifications [DebraWolf96,TSE01], in an architectural high-level design; component-based systems [Testing_FOSE,DavidCOTStest,MaryJeanCOTS], in component-based development; and object-oriented systems [BinderBook], at the source code level.

All these test selection techniques deserve consideration. Indeed, recent studies have shown that different test selection techniques target different classes of faults (e.g., [cap13Lyu]), and that the combined use of diverse techniques is always more effective than concentrating the effort on only one technique (even the one proved to be the most efficacious) [Littlewood].

Naturally, testing has a number of limitations: as said, it can never guarantee that the software is fault-free (because it is not exhaustive), and the results of executing specific test cases cannot usually be generalized.
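The distinction between a fault and a failure, and the reason why the results of specific test cases cannot be generalized, can be sketched with a small Python example. Everything in it is hypothetical and introduced only for illustration: the function `absolute_value`, the fault deliberately planted in it, the helper `run_tests` and the two test suites.

```python
def absolute_value(x):
    """Intended to return |x|; contains a deliberately planted fault."""
    if x < -1:  # fault (assumed for illustration): the condition should be x < 0
        return -x
    return x


def run_tests(tests):
    """Run each (input, expected output) pair and return the inputs that failed."""
    return [x for x, expected in tests if absolute_value(x) != expected]


# A finite suite that never triggers the fault: every test passes.
passing_suite = [(-5, 5), (0, 0), (7, 7)]

# One more input turns the same fault into an observable failure.
revealing_suite = passing_suite + [(-1, 1)]

print(run_tests(passing_suite))    # []
print(run_tests(revealing_suite))  # [-1]
```

The fault is present in both runs, but only the second suite produces a failure. The first suite passing says nothing about the inputs it did not exercise.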
Testing Techniques

Formal and Informal Specification-based Testing

Specification-based testing checks that the Implementation Under Test (IUT) fulfills the specifications: the specifications, either formal or informal, are carefully analyzed in order to "capture" all and only the important properties against which the IUT has to be tested.

Several testing techniques have been proposed that deal with informal (or semi-formal) specifications. One approach to generating specification-based tests consists in partitioning the input domain of the function being tested and selecting test data from each class. The assumption is that any element of a class exposes errors as well as any other element of that class, and hence a correct result for a single element provides confidence that all elements in the class behave correctly [Jo,KF+,CategoryPartitionMethod]. In the "Condition Table Method" [GG75,Jo], a condition table is built in which columns represent conditions that can occur during program execution. The method examines the program specification, trying to identify those conditions that have a significant impact on the execution behavior of the program. The "Cause-Effect Graphing" strategy [Pf], originally defined by Elmendorf [Cause-effect], is based on the assumption that causes and effects may be related and represented using a graph. Functions are identified, together with the causes that influence each function's behavior and the function's effects. A graph is built combining causes and effects, and test cases are produced by considering the combinations of causes that produce each effect.

The importance of the use of formal methods in software specification and design does not need to be stressed.
Several authors have highlighted the advantages of formal methods in testing as well, and different techniques have been proposed: to generate test plans using a regular grammar derived from the functional specification [Bauer], to select tests from algebraic specifications [Bernot], and to derive tests from a "Z" specification [Hall] or from a model-based specification [VDM]. Several approaches have been proposed to test Finite State Machines (FSMs), Labeled Transition Systems (LTSs) or Input/Output Transition Systems derived from formal specifications or from object-oriented programs [TestOO,BinderBook].

Unit, Integration and System Testing

Unit testing checks each module for the presence of bugs. A module may be a procedure, a piece of code, one component (in object-oriented systems) or one system as a whole. The purpose of unit testing is to ensure that each as-built module behaves according to the specification defined during detailed design. Unit tests are usually used to check interfaces (parameters passed in the correct order, number of parameters equal to number of arguments, parameters and arguments matching), local data structures (improper typing, incorrect variable names, inconsistent data types) and boundary conditions.

To build complex systems, units need to be integrated so that they cooperate. Integration testing connects sets of previously tested modules to ensure that the sets behave as well as the modules did when tested independently. Its purpose is to ensure that each as-built component behaves according to the specification defined during preliminary design. The communication interfaces among integrated components, in particular, need to be tested to avoid communication errors. Integration testing may be carried out with a non-incremental approach, putting everything together at once, or by integrating and testing incrementally. Top-down, bottom-up, mixed and sandwich are the best-known integration testing techniques.
In top-down integration testing the high-level control routines are tested first, possibly with the middle-level control structures present only as stubs. It can proceed in a depth-first or a breadth-first manner. In depth-first integration each module is tested in increasing detail, replacing more and more levels of detail with actual code rather than stubs. Alternatively, breadth-first integration proceeds by refining all the modules at the same level of control throughout the application. In practice a combination of the two techniques would be used.

The other major category of integration testing is bottom-up integration testing, where an individual module is tested from a test harness. Once a set of individual modules has been tested, the modules are combined into collections, known as builds, which are then tested by a second test harness. This process can continue until the build consists of the entire application. In practice a combination of top-down and bottom-up testing would be used.

System testing checks that the entire software system, embedded in its actual hardware environment, behaves according to the requirements document. It may happen, in fact, that components that have been correctly integrated do not work properly when deployed in a particular environment. For software-based systems, recovery testing, security testing, stress testing and performance testing can be carried out. Stress testing is designed to test the software in abnormal situations: it attempts to find the limits at which the system will fail through an abnormal quantity or frequency of inputs. The test is expected to succeed when the system is stressed with higher rates of inputs or maximum use of memory or system resources. Performance testing is usually applied to real-time, embedded systems in which poor performance may have a serious impact on normal execution. Performance testing checks the run-time performance of the system and may be coupled with stress testing.
Moreover, performance is not strictly related to functional requirements: functional tests may fail while performance tests succeed.

The metaphor I like to use to explain how unit, integration and system testing differ considers a unit as a "musician" playing her instrument. Unit testing consists in verifying that the musician produces high-quality music. Putting together musicians who each play good music is not enough to guarantee that the group (i.e., the integration of units) produces good music: they may play different pieces, or the same piece with different rhythms. The group may then play in different environments (e.g., theater, stadium, arena) with different results (i.e., system testing).

Functional and Structural Testing

In functional testing, also called black-box testing, the internal structure and behavior of the program are not considered. What is assumed is the availability of the (formal or informal) specification of the system under test, and the objective is to analyze whether the system behaves correctly with respect to it, i.e., to find discrepancies between the actual behavior of the implemented system's functions and the desired behavior described in the functional specifications. Test sets are derived that fully exercise all functional requirements. This analysis technique is particularly useful when the source code is not available (for example, in testing Components Off The Shelf (COTS) [COTS]).

A functional test consists of analyzing how the system reacts to external inputs. The computed outputs are then checked against the expected ones (for the selected inputs) and possible discrepancies are revealed. How test cases are identified strongly depends on how the specification is defined: it could be formal or informal. In order to minimize the number of test cases, the input domain is usually partitioned into equivalence classes so that elements in the same class behave similarly.
This testing technique is called the "equivalence partition method" [CategoryPartitionMethod]. As numerous errors occur at the boundaries of the input domain, Boundary Value Analysis may complement equivalence partitioning by choosing values at the extremes of each class. The major drawbacks of black-box testing are its dependence on the correctness of the specification and the necessity of using every possible input as a test case in order to get good confidence of acceptable behavior.

Structural testing, also known as white-box testing, differs from functional testing in that it assumes visibility into internal data and structures. In structural testing the structure of the program is examined and test data are derived from it. Structural testing compares the behavior of the program under test against the apparent intention of the source code, i.e., it analyzes the software structure to track which parts of the code have been executed during testing. Coverage analysis generally consists of measurements based on several white-box techniques to ensure that the code is extensively exercised. In this respect, it is really useful for a criterion to provide some metrics to determine the thoroughness of verification. Traditional white-box analysis techniques use a control flow graph representation of the program, in which nodes correspond to sequentially executed statements while edges represent the flow of control between statements. The aim of white-box testing criteria is to cover as much of the control flow graph as possible while limiting the number of selected test cases. Designers have been applying white-box techniques for a long time, and several coverage criteria are applied:

Statement coverage: This criterion reports whether each executable statement is encountered. It selects a test set T such that, by executing a program P for each test in T, each elementary statement of P is executed at least once.
The chief disadvantage of statement coverage is that it is insensitive to some control structures [Bullseye].

Branch coverage: It measures the coverage of all blocks and case statements that affect the control flow. Boolean expressions are evaluated for both true and false outcomes. This criterion selects a test set T such that, by executing P for each test in T, each edge of P's control flow graph is traversed at least once. This measure has the advantage of simplicity, but may ignore branches within boolean expressions (see condition coverage below) or relevant combinations of branches (considered in path-based criteria).

Condition coverage: It is similar to branch (decision) coverage, but it measures the subexpressions independently of each other, allowing for a better analysis of the control flow. This criterion selects a test set T such that, by executing P for each test in T, each edge of P's control flow graph is traversed at least once and all possible values of the constituents of compound conditions are exercised at least once.

Path coverage: It measures the percentage of all possible paths through the code that are exercised. Path coverage is similar to decision coverage, but it handles multiple sequential decisions. As the number of paths soon becomes infeasible, several variations of this criterion are considered that limit the number of loop iterations. This criterion selects a test set T such that, by executing P for each test in T, all paths leading from the initial to the final node of P's control flow graph are traversed at least once.

Data-flow coverage: Data-flow testing [LK83,FW88,BeBook] offers a family of criteria for unit testing of programs. In data-flow testing, a definition of a variable is a location where a value is stored into memory, and a use is a location where the value of the variable is accessed, either for computation (c-use) or in a predicate (p-use). The goal of data-flow testing is to generate tests that execute program subpaths from a definition to a use.
Traditional data-flow analysis techniques work on control flow graphs.

Let us use a simple example to explain the different white-box techniques graphically. Let P be the following program:

    function P return INTEGER
    begin
       X, Y : INTEGER;
       READ(X); READ(Y);
       while (X > 10) loop
          X := X - 10;
          exit when X = 10;
       end loop;
       if (Y < 20 and then X mod 2 = 0) then
          Y := Y + 20;
       else
          Y := Y - 20;
       end if;
       return 2 * X + Y;
    end P;

The corresponding control flow graph is shown in Figure...

Some white-box tests follow:
- Statement coverage: (X = 20, Y = 10) and (X = 20, Y = 30)
- Branch coverage: (X = 20, Y = 10) and (X = 15, Y = 30)
- Condition coverage: (X = 20, Y = 10) and (X = 5, Y = 30) and (X = 21, Y = 10)
- Path coverage: (X = 5, Y = 10) (X = 15, Y = 10) (X = 25, Y = 10) (X = 35, Y = 10) …

Many other coverage criteria are used in practice. For more details please refer to [Bullseye,isdmag,McCabe,CtCoverage]. How coverage criteria have been adapted to concurrent programs may be found in [Structural]. An analysis of the costs of coverage criteria may be found in [Marick].
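The statement and branch test sets listed for P can be checked mechanically. What follows is only a sketch: a Python rendering of P that takes X and Y as parameters instead of reading them, and that records the outcome of each decision as it runs. The names `p`, `trace`, `ALL_OUTCOMES` and `outcomes_covered` are assumptions made for this illustration and do not appear in the original program.

```python
def p(x, y, trace):
    """Python rendering of program P; `trace` collects (decision, outcome) pairs."""
    while True:
        enter = x > 10                   # loop condition: while (X > 10)
        trace.add(("while", enter))
        if not enter:
            break
        x -= 10
        leave = x == 10                  # exit when X = 10
        trace.add(("exit", leave))
        if leave:
            break
    taken = y < 20 and x % 2 == 0        # Python `and` short-circuits like `and then`
    trace.add(("if", taken))
    if taken:
        y += 20
    else:
        y -= 20
    return 2 * x + y


# Every decision of P paired with both of its possible outcomes.
ALL_OUTCOMES = {(d, o) for d in ("while", "exit", "if") for o in (True, False)}


def outcomes_covered(tests):
    """Run P on each (X, Y) pair and return the decision outcomes exercised."""
    trace = set()
    for x, y in tests:
        p(x, y, trace)
    return trace


statement_set = [(20, 10), (20, 30)]
branch_set = [(20, 10), (15, 30)]

missing_statement = ALL_OUTCOMES - outcomes_covered(statement_set)
missing_branch = ALL_OUTCOMES - outcomes_covered(branch_set)

print(sorted(missing_statement))  # [('exit', False), ('while', False)]
print(sorted(missing_branch))     # []
```

The statement coverage set executes every statement of P, yet it never leaves the loop through its condition and never lets the exit test fail. The branch coverage set exercises both outcomes of all three decisions.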