Thursday, February 10, 2011

Code Complete by Steve McConnell : Chapter 5

The author discusses a low-quality routine and lists its problems, which include:

  • Bad name
  • No documentation
  • No error check on input data
  • Use of magic numbers instead of constants
  • Unused parameters
  • Modification of global variables
  • Too many parameters
  • Poor parameter ordering
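
As a rough illustration of the opposite of that list, here is a minimal Python sketch of a small routine with a descriptive name, a docstring, input checks, a named constant and no global side effects. The routine, its name and the 50 percent limit are hypothetical and are not taken from the book.

```python
# Hypothetical example: the name and the limit are illustrative only.
MAX_DISCOUNT_PERCENT = 50  # named constant instead of a magic number


def discounted_price(price, discount_percent):
    """Return price reduced by discount_percent (0 to MAX_DISCOUNT_PERCENT)."""
    # Error check on input data.
    if price < 0:
        raise ValueError("price must be non-negative")
    if not 0 <= discount_percent <= MAX_DISCOUNT_PERCENT:
        raise ValueError("discount_percent out of range")
    # No global state is modified; the result is returned to the caller.
    return price * (100 - discount_percent) / 100
```

Both parameters are used, and the routine communicates with its caller only through its arguments and its return value.
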
The author emphasizes the importance of routines in computer science and discusses reasons to create a routine:
  • Reducing complexity: A routine hides information and makes it easier to manage complex programs.
  • Minimize code size: Similar code that appears in several places in the program can be put into one routine to reduce the size of the code.
  • Improve correctness: Code in a routine is likely to be more correct since it is called with different inputs from several places.
  • Improve maintainability: If a defect is discovered, the code needs to be changed at only one place.
  • Limit effects of changes: Isolate areas that are likely to change in the future.
  • Hiding sequences: A routine hides the sequences of events and bundles them as a single operation.
  • Improve performance: If a routine's performance is improved, many parts of the code benefit.
  • Hiding implementation: A routine hides the implementation details of an operation, be it data structures or algorithms.
  • Promoting code reuse: Code in a routine can be reused in another program more easily.
  • Improve readability: Putting a section of a code in a routine and giving it a proper name improves readability of the code where the routine is invoked.
  • Improving portability: Using routines isolates non-portable capabilities.
The author provides guidelines to create effective routine names:
  • For a procedure, a verb followed by an object is a good combination, e.g. PrintReport, CheckOrder, etc. In object-oriented languages, the object is not required in the name because it is already part of the procedure call, e.g. report.Print, order.Check, etc.
  • For a function (which returns a value), a description of the return value should be used as the name, e.g. NextCustomer, CurrentPenColor, etc.
The author suggests that a routine's name should describe everything the routine does. If the name becomes too long to do that, the routine should be broken into several routines. The author suggests that a good routine name has an average length of 15 to 20 characters, compared with 9 to 15 characters for a good variable name.
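
The naming guidelines above can be sketched in Python as follows. This is only an illustration: the names are adapted to Python's snake_case convention, and the `Report` class, the `out` parameter and the `waiting` list are assumptions made for the example.

```python
import sys


def print_report(lines, out=sys.stdout):
    """Procedure: the name is a verb followed by an object."""
    for line in lines:
        out.write(line + "\n")


class Report:
    def __init__(self, lines):
        self.lines = lines

    def print(self, out=sys.stdout):
        """Method: the object is already part of the call (report.print()),
        so the name is only the verb."""
        for line in self.lines:
            out.write(line + "\n")


def next_customer(waiting):
    """Function: the name describes the value that is returned."""
    return waiting[0] if waiting else None
```

A caller would write `print_report(lines)` in procedural code, `report.print()` in object-oriented code, and `customer = next_customer(waiting)` for the function.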

The author lists several types of acceptable and unacceptable cohesion in a routine:
  • Acceptable
    • Functional Cohesion: This occurs when a routine does one and only one operation.
    • Sequential Cohesion: This occurs when a routine performs operations in a specific order, sharing data from step to step, and the operations make up only a part of a bigger operation.
    • Communicational Cohesion: This occurs when operations in a routine make use of the same set of data and aren't related in any other way.
    • Temporal Cohesion: This occurs when operations are combined into a routine because they are all done at the same time.
  • Unacceptable 
    • Procedural Cohesion: This occurs when a routine performs operations in a specific order but the operations do not share any data.
    • Logical Cohesion: This occurs when several operations are combined into a routine and one of the operations is selected by a control flag that is passed in.
    • Coincidental Cohesion: This occurs when operations in a routine are not at all related.
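
To make the difference concrete, here is a small Python sketch contrasting logical cohesion with functional cohesion. The routine names and the `mode` flag are hypothetical and chosen only for illustration.

```python
# Logical cohesion (unacceptable in the list above): a control flag passed in
# by the caller selects which of several operations the routine performs.
def process(values, mode):
    if mode == "total":
        return sum(values)
    if mode == "largest":
        return max(values)
    raise ValueError(f"unknown mode: {mode}")


# Functional cohesion (acceptable): each routine does one and only one operation.
def total(values):
    return sum(values)


def largest(values):
    return max(values)
```

With the second form the call site states directly which operation it wants, and no routine needs to know about operations it does not perform.
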
The author discusses the idea of coupling. The degree of coupling between two routines describes the strength of the connection between them. It is a complement to cohesion, and coupling should be as loose as possible. The author then discusses several criteria for evaluating coupling:
  1. Size: Small routines with few parameters are more loosely coupled than large routines or routines with many parameters.
  2. Intimacy: The routines should try to interact directly through parameters and avoid interaction through indirect ways like global variables or database records.
  3. Visibility: The connection between different routines should be prominent and not hidden.
  4. Flexibility: It refers to how easily you can change the connections between routines.
The author gives examples of good and bad coupling.
  1. Simple-data coupling: Two routines are simple-data coupled if all the data passed between them is non-structured and through a parameter list.
  2. Data-structure coupling: Two routines are data-structure coupled if the data passed between them is structured.
  3. Control coupling: Two routines are control coupled if one routine passes data to the other that tells the second routine what to do.
  4. Content (Pathological) coupling: Two routines are pathologically coupled if one uses the code inside the other or if one alters the local data used inside the other.
The first two kinds of coupling are acceptable while the last two are not.

The author then discusses the ideal length of a routine. Contrary to general belief, the author provides some empirical results to show that:
  1. Routine size is inversely correlated with the number of errors per line of code: small routines have more errors per line of code than larger routines.
  2. Larger routines are cheaper to develop per line of code.
  3. Code needs to be changed the least when routines average 100 to 150 lines of code.
However, the author still warns against writing routines larger than 200 lines of code, since they become difficult to understand.

The author discusses defensive programming, a technique in which the behaviour of a routine remains defined and consistent even if it receives incorrect input. The author lists several defensive programming techniques:
  1. Assertions: The author suggests using preprocessor macros for assertions and warns against putting executable code in assertions, since the assertions may not execute in production code.
  2. Check values of data input and routine parameters: Routines should be coded to verify the range of values for data obtained from a file or user and for all input parameters. A strategy should be in place for how the software behaves when it receives invalid data.
  3. Exception Handling: Unexpected situations/errors should generate exceptions that are obvious in development and recoverable in production code.
  4. Change: Anticipate changes that can happen to a program and code in such a way that least number of routines get affected.
  5. Function Return Values: Code should always check function return values, even when the function is expected to always succeed.
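
The first two techniques can be sketched in Python as below. This is not the author's code: Python has no preprocessor, so the built-in `assert` statement (which is skipped when Python runs with `-O`) stands in for an assertion macro, and the routine names and the age limits are assumptions chosen for illustration.

```python
MIN_AGE = 0
MAX_AGE = 150  # assumed limits, chosen only for illustration


def parse_age(text):
    """Convert user-supplied text to an age, rejecting invalid input."""
    try:
        age = int(text)
    except ValueError:
        raise ValueError(f"age is not a whole number: {text!r}") from None
    if not MIN_AGE <= age <= MAX_AGE:
        raise ValueError(f"age out of range: {age}")
    return age


def mean_age(ages):
    """Return the mean of ages that have already been validated."""
    # The assertion checks for a programmer error. Its expression has no side
    # effects, because assertions may be stripped from production runs.
    assert len(ages) > 0, "mean_age requires at least one age"
    return sum(ages) / len(ages)
```

Here the strategy for invalid external data is to raise `ValueError` and let the caller decide how to recover, while the assertion only guards against misuse by other code.
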
The author then provides guidelines for function parameters:
  1. Parameters should be in input-modify-output order.
  2. If several functions have similar parameters, their orders should match.
  3. Never have unused parameters.
  4. Put status or error variable last.
  5. Never change the routine parameters. Take a copy and operate on the copy.
  6. Document assumptions about parameters
    • Whether the parameter is input, modify or output
    • Units (milliseconds, inches, etc.)
    • Range of expected values
    • Special values (e.g. those used to signal errors) that should never appear
    • Meanings of status codes and error values
  7. Limit the number of parameters to seven
  8. Never assume any parameter passing mechanism (i.e. calling conventions)
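
A few of these guidelines can be shown in a short Python sketch. The routine and its parameter names are hypothetical. Python routines usually return results instead of filling output parameters, so the input-modify-output ordering is not illustrated here.

```python
def scaled_copy(samples_ms, factor):
    """Return a new list with each sample multiplied by factor.

    samples_ms: input, list of durations in milliseconds, each >= 0
    factor:     input, unitless multiplier, > 0
    """
    # Work on a copy so the caller's list is never changed.
    result = list(samples_ms)
    for i, value in enumerate(result):
        result[i] = value * factor
    return result
```

The docstring records whether each parameter is an input, its units and its expected range, and the caller's `samples_ms` list is left untouched after the call.
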
The author ends the chapter with comments on the problems of using macros and how to solve them.
