Monday, July 11, 2011

The Art of Software Testing by Glenford Myers : Chapter 6

In this chapter, the author focuses on the part of testing that starts when human testing ends. The author talks about the origin of errors in software programming. Since software development is a process of communicating information about the program and translating it from one form to another, the vast majority of software errors are attributable to breakdowns, noise, and mistakes in communication. The author then describes the flow of the software development process, suggesting that more precision be built into the development process and that a separate verification step follow each process. The author also suggests matching distinct testing processes to distinct development processes, i.e. focusing each testing process on a particular translation step. The author then discusses testing at different stages of the development cycle:
  1. Function Testing: It is a process of finding discrepancies between the program and its external specification.
  2. System Testing:
    • System testing is the process of attempting to demonstrate that the program does not meet its objectives.
    • System testing is impossible if the project has not produced a written set of measurable objectives for its product.
    • The author then mentions the following categories of test cases:
    • Facility Testing: It is the determination of whether each facility mentioned in the objectives was actually implemented.
    • Volume Testing: The program is subjected to heavy volume of data.
    • Stress Testing: It involves subjecting the program to heavy loads, i.e. peak volumes of data over a short span of time.
    • Usability Testing: The author provides following aspects of usability testing
      • Does the system contain an excessive number of options?
      • Does the system return an acknowledgment to all inputs?
      • Is the program easy to use?
      • Where accuracy is vital, is sufficient redundancy present?
      • Are the outputs of the program meaningful?
    • Security Testing: It is a process of attempting to devise test cases to subvert the program's security checks.
    • Performance Testing: Test cases should be designed to show that the program does not meet its performance objectives like response times and throughput rates.
    • Storage Testing: Test cases should be designed to show that the program does not meet storage objectives in terms of main and secondary storage and spill/temporary files.
    • Configuration Testing: The program should be tested with each kind of hardware device and with the minimum and maximum configuration. Each possible configuration of the program should also be tested.
    • Compatibility/Conversion Testing: Test cases should be designed to make sure the program meets the objectives of compatibility with, and conversion procedures from, the existing system.
    • Installability Testing: The installation procedures of the system should themselves be tested.
    • Reliability Testing: If the objectives of the program contain specific statements about reliability, tests should be devised to demonstrate that the program does not meet them.
    • Recovery Testing: Tests should be designed to see the system's recovery from programming errors, hardware failures and data errors.
    • Serviceability Testing: Serviceability objectives, defined in terms of service aids, mean time to debug a problem, maintenance procedures, and quality of internal-logic documentation, should also be tested.
    • Documentation Testing: User documentation should be subject to an inspection for accuracy and clarity. Examples in the documentation should be part of test cases used to test the program.
    • Procedure Testing: Any human procedures involved in large programs should be tested.
    • The author clearly specifies that the system test should not be performed by the programmers who wrote the program, nor by the organization that developed it.
  3. Acceptance Testing: This is carried out by conducting tests to show that the program does not do what it is contracted to do.
  4. Installation Testing: This may include test cases to check that all the files have been created and have the necessary contents, and that all parts of the system exist and are working.
The author then discusses test planning and lists the following components of a good test plan:
  1. Objectives: Each testing phase should have an objective.
  2. Completion criteria: Criteria for specifying the completion of each test phase needs to be specified.
  3. Schedules: The schedules for when the test cases will be designed, written and executed should be created.
  4. Responsibilities: The responsibilities of people regarding testing and fixing of errors should be clearly defined.
  5. Test-case libraries and standards: Systematic methods of identifying, writing and storing test cases are necessary.
  6. Tools: The required test tools must be identified, including who is responsible for their development or acquisition, how they are to be used, and when they are needed.
  7. Computer time: Each testing phase's required computer time should be calculated.
  8. Hardware configuration: The hardware configuration and resources needed to perform the tests should be specified.
  9. Integration: System integration plan should be in place that defines the order of integration and the functional capability of each version of the system.
  10. Tracking procedures: There should be tracking of errors and the estimation of progress with respect to schedule, resources and completion criteria.
  11. Debugging procedures: Mechanisms must be in place to track the progress of corrections and adding them to the system.
  12. Regression testing: Regression testing is important because changes and error corrections tend to be more error-prone than the original code. The purpose of the regression testing is to determine if the change has regressed other aspects of the program.
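A minimal sketch of the regression idea in Python (the discount function and its cases are invented for illustration): a library of saved test cases is rerun after every change, and any case whose result differs signals a regression.

```python
def apply_discount(price, rate):
    # Hypothetical function under test; a change here triggers a rerun.
    return round(price * (1 - rate), 2)

# Saved test cases from earlier test phases, rerun after every change.
REGRESSION_CASES = [
    ((100.0, 0.2), 80.0),   # typical value
    ((100.0, 0.0), 100.0),  # boundary: no discount
    ((100.0, 1.0), 0.0),    # boundary: full discount
]

def run_regression():
    # Return the cases whose behaviour has regressed (empty = all pass).
    return [(args, expected, apply_discount(*args))
            for args, expected in REGRESSION_CASES
            if apply_discount(*args) != expected]

print(run_regression())  # → []
```

In practice the same pattern is usually delegated to a test framework, but the essential mechanism is just a stored case library plus automated re-execution.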
The author then discusses test-completion criteria; two commonly used criteria are:
  1. Scheduled time for testing expires.
  2. When all the test cases execute without detecting any errors.
The author describes how both of these criteria are useless, and provides three better criteria:
  1. The first is based on the use of particular test-case-design methodologies:
    • Completion of module testing is achieved when the test cases are derived from a multicondition-coverage criterion and from boundary-value analysis of the module's interface specification, and all the test cases are eventually unsuccessful (i.e. they find no errors).
    • Completion of function testing is achieved when the test cases are derived from cause-effect graphing, boundary-value analysis, and error guessing, and all the test cases are eventually unsuccessful.
  2. The second criterion is to state the completion requirements in positive terms, e.g. testing of a module is not complete until x errors are found or y months of testing have elapsed. The author discusses the problem of estimating the number of errors in a program and provides several ways to do so, such as:
    • Experience with previous programs
    • Apply predictive models
    • Use industry-wide averages
  3. The third criterion is to plot the number of errors found per unit time during the test phase. By examining the shape of the curve, it can be determined whether to finish or continue the testing.
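The third criterion can be sketched as a simple calculation (the weekly counts and the one-third threshold below are illustrative assumptions, not from the book): once the recent error-detection rate has dropped well below its peak, the curve suggests testing can stop.

```python
# Hypothetical weekly error counts collected during the test phase.
errors_per_week = [3, 7, 12, 9, 5, 2, 1]

def detection_rate_falling(counts, window=3):
    """Crude completion signal: the average detection rate over the
    last `window` weeks is well below the peak rate seen so far."""
    recent = sum(counts[-window:]) / window
    peak = max(counts)
    return recent < peak / 3

print(detection_rate_falling(errors_per_week))  # → True
```

A real project would plot the full curve and judge its shape, but the declining-rate signal is the heart of the criterion.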
The author again emphasizes hiring an independent test agency to test the program.

The Art of Software Testing by Glenford Myers : Chapter 5

The author discusses unit (module) testing in this chapter and lists the following advantages of module testing:

  • It makes the combinatorics of managing testing easier.
  • It makes debugging easier.
  • It speeds up testing since multiple modules can be tested simultaneously.
The author then takes an example and generates test cases for it.

The author then discusses incremental testing, where one module at a time is integrated with already-tested modules and then tested. In non-incremental testing, all the modules are first tested in isolation and then integrated to create the program. The author then provides the following observations comparing the two:
  1. Non-incremental testing requires more work.
  2. Programming errors related to mismatching interfaces or incorrect assumptions among modules will be detected earlier if incremental testing is used.
  3. Debugging becomes easier.
  4. Incremental testing results in thorough testing.
  5. Non-incremental approach uses less machine-time.
  6. The non-incremental approach allows more parallel activity in the earlier phases.
The author then discusses two approaches to incremental testing:
  • Top-down approach: The author points out the following shortcomings of this approach:
    • When a new module is introduced, it may be impossible to create certain test situations.
    • If the newly introduced module is far from the modules performing I/O operations, it is difficult to determine the inputs that will exercise all the desired tests on the new module.
    • Since the displayed output comes from a module a large distance away from the module being tested, it is difficult to correlate the output with the behaviour of the module under test.
    • Even though it seems possible, it cannot in practice be combined with top-down design.
    • Sometimes a module's testing may be left incomplete before moving on to the next module.
  • Bottom-up approach: The author points out the following disadvantage of this approach:
    • A working program does not exist until the last (top) module is added.
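The scaffolding behind the two approaches can be sketched in Python (all module and function names here are hypothetical): top-down testing replaces an unwritten lower module with a stub that returns canned answers, while bottom-up testing exercises a finished lower module through a driver before any caller exists.

```python
# Top-down: the real upper module calls a stub standing in for a
# lower module that has not been written yet.
def lookup_tax_rate_stub(region):
    return 0.10  # canned answer instead of a real lookup

def compute_total(price, region, lookup=lookup_tax_rate_stub):
    return price * (1 + lookup(region))

# Bottom-up: a driver feeds test inputs directly to a finished
# lower module before any calling module exists.
def lookup_tax_rate(region):
    return {"A": 0.10, "B": 0.20}[region]

def driver():
    return lookup_tax_rate("A") == 0.10 and lookup_tax_rate("B") == 0.20

print(driver())                              # → True
print(round(compute_total(100.0, "A"), 2))   # → 110.0
```

The stub and the driver are throw-away code: their only purpose is to let the module under test run before the rest of the program exists.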
The author then provides a list of guidelines for performing the actual test:
  • The set of test cases needs to be reviewed or inspected before the test is performed.
  • Use automated test tools.
  • Look at the side effects of a module.
  • The testing of a module should be done by the programmer of a calling module.
  • Debugging should always be done by the programmer.
  • If a high number of errors are found in a subset of the modules, such modules should be subjected to further module testing.

The Art of Software Testing by Glenford Myers : Chapter 4

The author discusses how to design test cases in this chapter. The problem is to determine which subset of the possible test cases has the highest probability of detecting the most errors. The author then develops a strategy that combines black-box-oriented test-case-design methodologies with white-box testing of the program's logic.

The author then discusses following methods of test-case design:
  1. Logic-Coverage testing: The author first discusses statement coverage, the criterion that every statement in the program is executed at least once. The author provides an example program and shows that it is a very weak criterion. A stronger criterion for logic coverage is called decision coverage or branch coverage; it states that every decision takes on a true and a false outcome at least once. Decision coverage generally satisfies statement coverage, except when
    • The program has multiple entry points
    • The program has no decisions
    The author then discusses a stronger method called condition coverage, where each condition in a decision takes on all possible outcomes at least once.
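The difference between decision coverage and condition coverage can be illustrated with a small made-up predicate. Note that condition coverage alone does not guarantee the decision as a whole ever becomes true:

```python
def classify(a, b):
    if a > 0 and b > 0:   # one decision containing two conditions
        return "both positive"
    return "not both positive"

# Decision coverage: the whole predicate is true once and false once,
# yet the condition `b > 0` is never false.
decision_cases = [(1, 1), (-1, 1)]

# Condition coverage: each condition is true once and false once,
# yet the decision as a whole is never true.
condition_cases = [(1, -1), (-1, 1)]

print([classify(a, b) for a, b in decision_cases])
# → ['both positive', 'not both positive']
print([classify(a, b) for a, b in condition_cases])
# → ['not both positive', 'not both positive']
```

This is why stronger criteria that combine both views (decision/condition and multiple-condition coverage) exist.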
  2. Equivalence partitioning: Since exhaustive-input test of a program is impossible, selecting the subset of test cases with the highest probability of finding the most errors is required. Test case design by equivalence classes proceed in two steps:
    • Identifying the equivalence classes: Each input condition is partitioned into two or more groups: valid equivalence classes (representing valid inputs) and invalid equivalence classes (representing all other possible inputs). The author then provides some guidelines for finding equivalence classes:
      • If an input condition specifies a range of values, identify one valid equivalence class and two invalid equivalence classes.
      • If an input condition specifies the number of values, identify one valid equivalence class and two invalid equivalence classes.
      • If an input condition specifies a set of input values, identify one valid equivalence class for each value and one invalid equivalence class.
      • If a condition specifies a "must be" situation, identify one valid equivalence class and one invalid equivalence class.
    • Defining the test cases: This proceeds as follows:
      • Assign a unique number to each equivalence class.
      • Until all valid equivalence classes have been covered by test cases, write a new test case covering as many of the uncovered valid equivalence classes as possible.
      • Until all invalid equivalence classes have been covered by test cases, write a test case that covers one and only one of the uncovered invalid equivalence classes.
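As a small illustration (the input condition is invented), the two-step process might yield classes and cases like these:

```python
# Hypothetical input condition: "the item count is 1 to 999".
#   Class 1 (valid):   1 <= count <= 999
#   Class 2 (invalid): count < 1
#   Class 3 (invalid): count > 999
def accepts(count):
    return 1 <= count <= 999

cases = [
    (500, True),    # covers valid class 1
    (0, False),     # covers invalid class 2 (one invalid class per case)
    (1000, False),  # covers invalid class 3
]
print(all(accepts(value) == expected for value, expected in cases))  # → True
```

Each invalid class gets its own test case so that one rejected input cannot mask the handling of another.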
  3. Boundary value analysis: Boundary conditions are those situations directly on, above or beneath the edges of input equivalence classes and output equivalence classes. The author then gives guidelines for boundary value analysis:
    • If an input condition specifies a range of values, write test cases for the valid ends of the range and invalid-input test cases for situations just beyond the ends.
    • If an input condition specifies a number of values, write test cases for the minimum and maximum number of values, and for one beneath and one beyond these values.
    • If the output of the program specifies a range of values, write test cases that produce outputs at the ends of the range, and attempt to produce outputs just beyond it.
    • If the output specifies a number of values, write test cases that produce the minimum and maximum number of output values, and attempt to produce one fewer and one more.
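For a hypothetical range condition such as "score must be 0 to 100", the range guideline yields cases on each valid end and just beyond them:

```python
def valid_score(score):
    # Hypothetical input condition: an integer score from 0 to 100.
    return 0 <= score <= 100

# Boundary-value cases: valid ends of the range plus values just beyond.
boundary_cases = [(0, True), (100, True), (-1, False), (101, False)]

print(all(valid_score(s) == expected for s, expected in boundary_cases))  # → True
```

Unlike equivalence partitioning, which would accept any representative of the valid class (say 50), boundary-value analysis deliberately picks the edges, where off-by-one errors concentrate.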
  4. Cause-Effect Graphing: This technique can explore combinations of input circumstances. The author then describes a process to derive test cases using this method, and applies it to an example.
  5. Error Guessing: It is not possible to give an exact procedure for this technique since it is intuition-based and ad hoc. The author then provides an example of how to guess errors.
The author then provides a final strategy which is a combination of the methods provided earlier as follows:
  1. Use cause-effect graphing if the specification contains combinations of input conditions.
  2. Use boundary-value analysis.
  3. Identify valid and invalid equivalence classes for the input and output.
  4. Use error-guessing techniques to add additional test cases.
  5. Examine the program's logic and add test cases as needed to meet a decision-coverage, condition-coverage, or multiple-condition-coverage criterion.

The Art of Software Testing by Glenford Myers : Chapter 3

The author introduces non-computer based or human testing in this chapter. Methods related to this kind of testing are applied between the time the program is coded and the time the computer-based testing begins. This provides two advantages:
  • Errors are found earlier in the development cycle, when the cost of correcting them is lower and the probability of correcting them correctly is higher.
  • Programmers correcting errors found during computer-based testing are under greater pressure, which pushes them to make more mistakes; errors found by human testing are corrected before that pressure builds.
The author then discusses the similarities and differences between code inspections and walkthroughs, and provides several points that work in favour of these processes:
  1. They involve people including the author and so the program is tested not just by the author but by others.
  2. The errors found using these processes result in lower debugging costs, since the precise nature of the error is located, whereas computer-based testing exposes only a symptom of an error.
The author specifies that modifying a program is a more error-prone process than writing a new program, and so program modifications should also be subjected to these testing processes.

The author then discusses code inspections: the composition of the team, the roles people play, and the general procedure. There are two main parts:
  1. The programmer is requested to narrate statement by statement the logic of the program.
  2. The program is analyzed with respect to a checklist of historically common programming errors.
The author suggests an ideal session length of 90 to 120 minutes. The author emphasizes that the programmer must remain open-minded for the whole process to be effective. Code inspection identifies the error-prone sections of the code, allowing computer-based testing to focus more on those areas when it begins.

The author then provides a common checklist for code-inspection:
  1. Is an uninitialized variable referenced?
  2. Is array subscript value within the defined bounds?
  3. Is array subscript an integer value?
  4. Is the lifetime of reference variable within the lifetime of referenced storage?
  5. If a data structure is referenced in multiple procedures or functions, is the structure defined identically in each procedure?
  6. When indexing into a string, are the limits of the string exceeded?
  7. Are there any "off by one" errors in indexing operations or in subscript references to arrays?
  8. Are there any variables with similar names?
  9. Is mixed-mode arithmetic handled correctly?
  10. Is arithmetic between different-size variables of the same data type (e.g. long with short) handled correctly?
  11. Can the divisor in a division operation become zero?
  12. For expressions with more than one operator, are the assumptions about precedence of operators and order of evaluation correct?
  13. Are operands of a boolean operator boolean?
  14. Will every loop always terminate?
  15. Have all files been opened before use?
  16. Are end-of-file conditions detected and handled correctly?
  17. Are I/O error conditions handled correctly?
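A couple of the checklist items can be illustrated in Python (the functions are invented examples of code written defensively, so an inspection against the checklist would pass them):

```python
def average(values):
    # Checklist item 11: guard the divisor against becoming zero.
    if not values:
        return 0.0
    return sum(values) / len(values)

def last_item(items):
    # Checklist item 7: index len(items) - 1, not len(items),
    # avoiding the classic off-by-one error.
    return items[len(items) - 1]

print(average([2, 4, 6]))    # → 4.0
print(last_item([1, 2, 3]))  # → 3
```

During an inspection the team would walk through each function asking the checklist questions; here both questions have satisfactory answers.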
The author then discusses the code walkthrough, describes the participants and their roles, and explains the key difference from code inspection: the participants mentally execute a set of paper test cases. The comments made during a walkthrough should be targeted at the program, not the programmer. The rest of the process is similar to code inspection.

The author then discusses desk checking and notes that it is not as effective as code inspections or walkthroughs.

The author then talks about peer ratings. The participants are told to evaluate anonymous programs in terms of their overall quality, maintainability, extensibility, clarity and readability. The participants are asked to rate the programs on a scale of 1 to 7 in terms of different questions such as:
  • Was the program easy to understand?
  • Would it be easy to modify this program?
  • Would you be proud to have written this program?

The Art of Software Testing by Glenford Myers : Chapter 2

The issues of economics and human psychology such as feasibility of completely testing a program and adopting an appropriate frame of mind towards testing a program contribute more towards successful testing than do the purely technological considerations.

The author then discusses incorrect definitions of testing, such as:
  • Testing is the process of demonstrating that errors are not present
  • The purpose of testing is to show that a program performs its intended functions correctly
  • Testing is the process of establishing confidence that a program does what it is supposed to do
The author then provides more appropriate definition of testing as:
  • Testing is the process of executing a program with the intent of finding errors.
The author points out that it is not enough that a program does what it is supposed to do; it should also not do what it is not supposed to do.

The author then discusses different testing methods:
  • Black-box testing: It is also called data-driven or input/output-driven testing. Test cases are derived solely from the specification, and the tester is completely unconcerned with the internal behaviour and structure of the program. The author concludes that exhaustive input testing is impossible.
  • White-box testing: It is also called logic-driven testing. Unlike black-box testing, which focuses on data, white-box testing focuses on control flow and tries to exhaustively test all the paths through a program. The author describes the flaws in this kind of testing:
    • Exhaustive path testing is impractical.
    • Even if all the paths are tested, since the testing does not verify whether it generates correct output or not, the program can still have errors.
    • Exhaustive path testing cannot identify missing paths.
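The impracticality of exhaustive path testing can be made concrete with a quick count (the loop and branch numbers are assumed, in the spirit of the book's own illustration):

```python
# A loop executed anywhere from 1 to 20 times, with 5 alternative
# paths through its body on each iteration, gives 5^1 + 5^2 + ... + 5^20
# distinct paths through the program.
total_paths = sum(5 ** i for i in range(1, 21))
print(total_paths)  # → 119209289550780, roughly 10**14
```

At even a microsecond per test case, enumerating that many paths would take years, which is why exhaustive path testing is dismissed as impractical.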
The author then discusses Testing Principles as follows:
  1. A necessary part of a test case is a definition of the expected output or result.
  2. A programmer should avoid attempting to test his or her own program.
  3. A programming organization should not test its own programs.
  4. Thoroughly inspect the results of each test.
  5. Test cases must be written for invalid and unexpected, as well as valid and expected, input conditions.
  6. There are two parts to the testing process: testing to see whether the program does what it is supposed to do, and testing to see whether the program does what it is not supposed to do.
  7. Avoid throw-away test cases unless the program is truly a throw-away program.
  8. Do not plan a testing effort under the tacit assumption that no errors will be found.
  9. The probability of the existence of more errors in a section of a program is proportional to the number of errors already found in that section.
  10. Testing is an extremely creative and intellectually challenging task.
  11. A good test case is one that has a high probability of detecting an as-yet undiscovered error.
  12. A successful test case is one that detects an as-yet undiscovered error.