Version 36 of Tips for writing quality software

Updated 2013-11-07 07:48:30 by suchenwi
  1. Keep procedures short - Why, you may ask? Would you rather see 200 lines of code that do X, or a single command call saying "do X"? If you feel function/procedure calls are too expensive, keep in mind that machine coders used a similar argument when assemblers came around. (Back then they thought that computer time was more expensive than programmer time, which isn't generally true nowadays.)
  2. Use descriptive names for variables and procedures - Try to describe what you're storing rather than how it interacts with something. Be consistent in the names you use. If you want nTimes to mean "number of times" don't later use numTimes or nt. Being lazy just wastes your time. Also keep in mind that if you use variable names that are too short and then need long comments to describe them, you are wasting your time. Descriptive naming can also be taken too far, at which point the names clutter the pattern of the program. A healthy balance between names like j and jumpPointStoresAPointerThatDoesX is needed.
  3. Keep a consistent pattern of object/memory management - Decide ahead of time if the parent of a procedure should manage a certain object, or if the child or called procedure should, and try to be consistent about how things should be managed.
  4. Use comments that describe how and why you are doing something - Some programmers may be confused by your code if the purpose is not known. You may even confuse yourself when you come back to the code months later.
  5. Use comments when the pattern of what happens next may not be expected
  6. Look for patterns in your code, and minimize repeats by using proc or interp alias (see 1) or functions and macros in C.
  7. Don't assume that a command or function won't return an error - Check the results, and use catch only if absolutely needed.
  8. Have a plan for each file before you begin coding - It's all too easy to fall into the trap of staying up late writing massive amounts of code. Often if you really think about the problem you can reduce the amount of code substantially. By having a plan you can also work out potential problems before you invest time. This is what separates a software engineer from a programmer.
  9. Use the interactive tclsh/wish shell if you aren't sure about how something will work - Don't assume
  10. Use a consistent pattern of capitalization or _ or - for classes of keywords - If you decide that you want all classes to be like BoxClass don't later change to ball_class type naming. You will only confuse yourself and make it more difficult for your mind to parse your own code later on. Obviously in the world of packages this may not be possible, but strive to keep your own code consistent.
  11. Don't rely on the interpreter/compiler to find bugs for you. - If you find yourself fixing bugs that the interpreter/compiler tells you about too often then you probably haven't planned it out well. The same applies to excessive use of a debugger. (See 8)
  12. When coding in C be careful with == (equality testing) - Consider if (var = NULL) -- usually the intended usage was if (var == NULL). The compiler accepts var = NULL, and you may wonder what is causing var to become NULL later on during runtime. An easy solution is to write if (NULL == var). This doesn't have the same problem, because the compiler reports an error if you accidentally assign to a constant such as NULL/0, as in if (NULL = var). Tcl/Tk could learn from this.
  13. Design your code to be tested - Use a low degree of coupling between procs and the rest of the software so that procs can be tested in isolation. If using object-oriented extensions, try to follow the Law of Demeter.
  14. Implement unit tests for your code - You designed your code to be tested, so test it already!
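
As a rough sketch of tips 7, 13 and 14 together: the proc below checks for an error with catch instead of assuming success, depends on nothing but its arguments so it can be tested in isolation, and is then exercised with the tcltest package that ships with Tcl. The name safeDivide and the wording of its error message are made up for illustration; they are not from any library.

```tcl
package require tcltest
namespace import ::tcltest::test

# safeDivide is a hypothetical helper, invented for this example.
# It depends only on its arguments, so it can be tested in isolation.
proc safeDivide {numerator denominator} {
    # Don't assume expr succeeds: integer division by zero raises an error.
    if {[catch {expr {$numerator / $denominator}} result]} {
        return -code error "cannot divide $numerator by $denominator: $result"
    }
    return $result
}

# One test for the normal case ...
test safeDivide-1.1 {plain integer division} -body {
    safeDivide 10 2
} -result 5

# ... and one for the error case, matching the message with a glob pattern.
test safeDivide-1.2 {division by zero is reported as an error} -body {
    safeDivide 1 0
} -returnCodes error -match glob -result {cannot divide 1 by 0:*}

# Print the pass/fail summary.
::tcltest::cleanupTests
```

Running this with tclsh prints a summary of passed and failed tests. Keeping the error check inside the proc means callers get one consistent message rather than whatever the underlying command happened to say.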

Please append your own tips.

DKF - Here are a few that are related to the ones above.

  1. Your ultimate goal is to write clear and correct code. Remember: if your code is clear, it is easier to make it correct.
  2. It is better to have a function do a conceptually consistent action than it is to keep the function short.
  3. It is better to keep the number of places that know about a data-structure's implementation layout very small. This is another variation on the Law of Demeter but it really bears repeating. Note that macros and inline functions do not really conceal the knowledge of a data-structure; they just hide it from the programmer and not from the code itself.
  4. If you're allocating structures, initialise all the fields at the same time. Better yet, write a function to do allocation and basic initialisation (either to valid null values or to obvious marker/guard values) and use that function everywhere else.
  5. Do not use the result of assignment (assuming you're using a language that defines it.) Sure it's defined, but it leads to really murky code. Do the assignment on a separate line (yes, you can afford it.)
  6. KPV A generalization of the above tip is to avoid horizontalizing code. I admit I'm often guilty of this in the quest for compactness (see tip #1).
  7. wdb Avoid making an object more powerful than necessary. If you have written a method, and later you see that this method is not really used, then don't hesitate to delete it. An object shouldn't be almighty. Less is more.
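
DKF's tips 3 and 4 are phrased with C structures in mind, but the same idea can be sketched in Tcl (8.5 or later, for dict). In the example below only the procs in one namespace know that a record is a dict with particular keys, and a single constructor gives every field a valid initial value. The account namespace, its field names and its procs are all hypothetical, chosen just for illustration.

```tcl
namespace eval account {
    # The only place where a record is created: every field is
    # initialised here, to a valid "empty" value.
    proc new {owner} {
        return [dict create owner $owner balance 0 history {}]
    }

    # Accessor: callers never need to know the key is called "balance".
    proc balance {acct} {
        return [dict get $acct balance]
    }

    # Returns an updated copy of the record; the assignment and the
    # bookkeeping each get their own line.
    proc deposit {acct amount} {
        set newBalance [expr {[dict get $acct balance] + $amount}]
        dict set acct balance $newBalance
        dict lappend acct history [list deposit $amount]
        return $acct
    }
}

# Usage: the caller works through the procs only.
set acct [account::new "alice"]
set acct [account::deposit $acct 25]
puts [account::balance $acct]   ;# prints 25
```

If the layout changes later (say, the history moves into a separate list), only the procs inside the namespace need to change.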

Duoas I think that the names we give things in our code often express how well we actually understand that code. A few additional thoughts:

It is worth every programmer's time to read and appreciate Ottinger's Rules for Variable and Class Naming. It is an excellent, straightforward, peer-reviewed (that is, by our peers --i.e. programmers) treatise. Some of the following reiterates stuff in the paper (but is not an excuse not to read it ;-)

Really Good Naming Conventions:

  • Use nouns (and noun phrases) for variables. A variable is a thing to be acted upon; it exists and nothing else.
  • Use predicates (verbs and verb phrases) for procedures. A procedure does something. It acts upon something else.
  • PascalCase/camelCase/CamelCaps/etc. Use them. (Tcl core conventions encourage both forms, where UpperCaps is used for function and type names and lowerCaps is used for variables.)
  • Underscore_delimited_names. Also good. (Personally, I prefer them.) They make for natural reading, while the old space and visibility concerns are a thing of the past. Mixture with CamelCase is also fine, so long as it is consistent.
  • Use names that you can actually speak in natural language. Words from your local language are OK, but for code with international scope just stick to English.
  • Use short or one-letter names iff (1) they have universally recognized meaning, and (2) they are local to small, bounded contexts (such as a function or loop fitting in 10-15 lines or fewer). For example, x and y are universally understood to refer to some horizontal and vertical tabulation, typically in graphics, as row and col are oft used for textual tabulations. Likewise, src and dest are instantly understood to be 'source' and 'destination'. The simple n or i are fine for one- or two-line loops where the actual index count is inconsequential or immaterial to the algorithm. An s is fine for a string in the middle of some transformation. Functional programmers will use x and xs for temporary list constructs; they are instantly recognized as car/element and cdr/(remaining) elements. But, though an fft is a fast Fourier transform, a better name would be FastFourierTransform.
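
To make a few of these conventions concrete, here is a small sketch. The proc and variable names are invented for the example: the procedure gets a verb phrase, the variables get noun phrases, and nothing needs a comment to explain what it holds.

```tcl
# Verb phrase for the procedure: it does something.
proc countBlankLines {lines} {
    # Noun phrase for the variable: it is something.
    set blankLineCount 0
    foreach line $lines {
        if {[string trim $line] eq ""} {
            incr blankLineCount
        }
    }
    return $blankLineCount
}

# Usage: two of the four elements are empty or whitespace only.
puts [countBlankLines [list "first" "" "   " "last"]]   ;# prints 2
```

Compare this with something like proc cbl {l} using a counter called n2: it would run identically, but every reader would have to decode it first.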

Horribly Wrong Naming Conventions (or 'Corollary')

  • sTuDlyCapS and other l33t forms are for weenie wannabes; not professionals. Likewise SUIT_friendlyFORMs are to be despised.
  • Hungarian notation is evil. Unfortunately it has seduced countless masses for many years, meaning it is often inescapable when handling existing code. Only use it if you are required to do so. (Yes, this one is still a hot religious issue for many. I found an interesting paper advocating Hungarian notation. Personally, I don't believe a Microsoft employee is really an impartial commentator, and I think it succinctly demonstrates some of the warped thinking that Hungarian notation inculcates... but that's all I will say about that here.)
  • Positional notation, PRIME-MODIFIER-CLASS notation, and other silly 'one size fits all' schemes are likewise dead-weight traps that focus more attention on form and deciphering than meaning.
  • There is no longer any excuse for short, vowel-less, scrunched-up variable names (like ctmrnm and lclwtt --can you guess what they mean? [They do mean something]). Tcl can handle names of any (non-negative) size, and ANSI-C can handle 31 characters at minimum. That is plenty of space for legible names. If you find that your names are exceeding 30 or so characters, then you are killing your keyboard. Here is an example of just how much space that is: total_size_of_all_cached_images is exactly 31 characters long.
  • Scrunchedupnames are nearly as unreadable. English script delineates on lexemes, and people's brains are trained to read it that way.
  • Even highly-specialized, domain-specific applications should avoid abbreviations and symbols that require considerable domain knowledge. Often, there is more than one symbol to express the same thing in mathematics, physics, and other sciences. Don't take shortcuts. Just because you are an expert in the domain doesn't mean that anyone else will recognize the notation.

Well, that's enough from me. :-P

NEM See also the Tcl Style Guide and the Tcl/Tk Engineering Manual (TIP 247).

I'd add that while the above advice is mostly good and useful, real quality software comes from a firm grasp of the fundamentals of computer science and software engineering, rather than just coding style guidelines. Learn about the underlying theories (mathematics, logic, set theory, number theory, data structures, algorithm analysis, etc, etc) and how to apply them. Learn as many programming languages as possible (including Logic Programming, Functional Programming, OOP, etc). Learn about concurrency, networking, databases, and so on. Then worry about whether your variable names are stylishly capitalized.

RS 2013-11-07: My rules for well-readable code include using habitual names rather than being very creative. For instance, my stand-alone scripts always have the pattern (borrowed from C):

 proc main argv {
     # ... the script's actual work goes here ...
 }
 main $argv

Or, the frequent operation of reading lines from a file:

 set f [open $filename]
 while {[gets $f line] >= 0} {
     # ... process $line here ...
 }
 close $f
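
Putting the two habitual patterns together, a complete stand-alone script might look like the sketch below. countLines is a hypothetical helper invented for this example, and the script assumes every command-line argument names a readable file.

```tcl
# Hypothetical helper: count the lines in one file, using the
# habitual open / gets / close pattern.
proc countLines {fileName} {
    set f [open $fileName]
    set lineCount 0
    while {[gets $f line] >= 0} {
        incr lineCount
    }
    close $f
    return $lineCount
}

# The habitual entry point: all top-level work lives in main.
proc main argv {
    foreach fileName $argv {
        puts "${fileName}: [countLines $fileName]"
    }
}
main $argv
```

Saved as, say, linecount.tcl (the file name is arbitrary), it would be run as tclsh linecount.tcl followed by one or more file names. Because the names f, line, main and argv are the habitual ones, a reader can skim the plumbing and go straight to what is specific to the script.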