Updated 2017-05-31 14:17:41 by pooryorick

The four criteria by which the quality of a program is judged are correctness, performance, maintainability, and usability. The extent to which these points can even be judged depends primarily on the readability of the code, where "readability" is mostly about how clear the interactions between the moving parts are. The tips below describe techniques that contribute to the development of one or more of these qualities.

  1. Keep blocks of code short - This tactic kills at least two birds with one stone. Each block of code is a component in the larger system. Short blocks of code are easier to think about during development, easier to audit for errors later, and in general reduce the number of interactions between moving parts, which reduces the "attack surface" for unwarranted assumptions. The key to keeping a block of code short is articulating its essential functionality and moving non-essential functionality out. Separating different concerns into their own components is one of the fundamental activities a computer programmer engages in. Concern about the cost of function/command calls is often a harbinger of premature optimisation. When assemblers came around, machine coders objected to functions on the grounds that function calls ate into performance. At the time, the argument that computer time was more expensive than programmer time carried more weight than it does now.
  2. Use descriptive names for variables and procedures - Try to describe what you're storing rather than how it interacts with something. Be consistent in the names you use. If you want nTimes to mean "number of times" don't later use numTimes or nt. Being lazy just wastes your time. Also keep in mind that if you use variable names that are too short and use long comments to describe them you are wasting your time. This can be taken too far and then the variable names clutter the pattern of the program. A healthy balance between names like j and jumpPointStoresAPointerThatDoesX is needed.
  3. Use each variable to mean exactly one thing. It can be tempting to "hijack" a variable to make decisions based on what its value indirectly implies given the current structure of the code. The problem here is that the structure of code changes as development continues, causing the implications to change. If a variable is used early in a block of code to mean one thing, and several lines later hijacked for its implications in order to make some other decision, the code in between the two sites may eventually change, changing the implied meaning and introducing a bug.
  4. Consider the permutations - In a block of code, each variable represents some characteristic of the thing being modeled. When you introduce a new variable, scan the existing variables and ask how the characteristics they represent might affect each other. Pay particular attention to any variables that are being modified in loops, watching for edge-cases where assumptions don't hold.
  5. Keep a consistent pattern of object/memory management - Decide ahead of time if the parent of a procedure should manage a certain object, or if the child or called procedure should, and try to be consistent about how things should be managed.
  6. Use comments that describe how and why you are doing something - At the moment a line of code is written, the author is holding in their mind a set of principles, insights, and assumptions about the program. Well-placed comments bring the reader up to speed on what those principles, insights, and assumptions were. You may have been the author, and if you read the code months later, you might find that comments help you as much as anyone else to remember those things and understand the code. If a comment answers the question "Why does this code work?", it's probably very apt.
  7. Use comments when the pattern of what happens next may not be expected
  8. Look for patterns in your code, and minimize repeats by using proc or interp alias (see 1) or functions and macros in C.
  9. Don't make assumptions that a command or function won't return an error. - Check the results and catch if absolutely needed.
  10. Have a plan for each file before you begin coding. - It's all too easy to fall into the trap of staying up late coding massive amounts of code. Often if you really think about the problem you can reduce the amount of code substantially. By having a plan you also can work out potential problems before you invest time. This is what separates a software-engineer from a programmer.
  11. Use the interactive tclsh/wish shell if you aren't sure about how something will work - Don't assume
  12. Use a consistent pattern of capitalization or _ or - for classes of keywords - If you decide that you want all classes to be like BoxClass don't later change to ball_class type naming. You will only confuse yourself and make it more difficult for your mind to parse your own code later on. Obviously in the world of packages this may not be possible, but strive to keep your own code consistent.
  13. Foresee and avert future bugs - As code continues to be developed, it changes shape. Some patterns are more robust in the face of constant change, and some more brittle. Sometimes it's worth writing slightly more code at the outset to reduce the possibility that a future change will inadvertently introduce a bug. For example, if something then {return this} else {return that}. Although else could be removed, and return that could be the subsequent command that is only reached if something isn't true, doing that might increase the chances of those two lines of code being inadvertently separated in the future. As another example, if a variable is re-used for multiple purposes within a script, then future restructuring might inadvertently lead to a variable being in the wrong state at some point.
  14. Don't rely on the interpreter/compiler to find bugs for you. - If you find yourself fixing bugs that the interpreter/compiler tells you about too often then you probably haven't planned it out well. The same applies to excessive use of a debugger. (See 8)
  15. When coding in C be careful with == (equality testing) - Consider if (var = NULL) -- usually the intended usage was if (var == NULL). The compiler accepts var = NULL, and you may wonder what is causing var to become NULL later on during runtime. An easy solution is to use if (NULL == var). This doesn't have the same problem, because the compiler will report an error if you mistakenly try to assign to NULL/0, as in if (NULL = var). Tcl/Tk could learn from this.
  16. Design your code to be tested - Use a low degree of coupling between procs and the rest of the software so that procs can be tested in isolation. If using object-oriented extensions, try to follow the Law of Demeter.
  17. Implement unit tests for your code - In addition to confirming that a program operates as expected, the act of writing tests is indispensable to articulating the design and behaviour of the program in the first place. It isn't uncommon to write the first test before writing a single line of the program. The program and the test suite should grow up together. Expect to spend as much time writing tests as writing the program. If at the end of any particular day the program code base is larger than the test suite code base, it might be time to write more tests. As the program grows more complex, it's quite comforting to run a well-developed test suite after making some changes and find that all tests pass. Once this has become a habit, when you don't have a good test suite you'll feel like you do in one of those dreams where you went to school in your underwear. Check out Tcl's own test suite for inspiration. tcltest is the de-facto standard tool for the job.
  18. Review code for readability - If you can easily read and understand the code you wrote a day, a week, a month after writing it, you may have some quality code on your hands.
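
Tips 16 and 17 can be sketched with tcltest. This is only a minimal illustration: countWords is a hypothetical proc invented for the example, and the test names follow a common convention rather than anything this page prescribes. Because the proc depends only on its argument, it can be tested in isolation.

```tcl
package require tcltest

# Hypothetical proc under test: it depends only on its argument,
# so it can be exercised without the rest of the program.
proc countWords {text} {
    return [llength [regexp -all -inline {\S+} $text]]
}

# Each test names the behaviour it pins down.
tcltest::test countWords-1.1 {counts whitespace-separated words} -body {
    countWords "  quality   software tips "
} -result 3

tcltest::test countWords-1.2 {an empty string has no words} -body {
    countWords ""
} -result 0
```

A real test file would normally end with tcltest::cleanupTests, which prints a summary of the passed and failed tests.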

Please append your own tips.

DKF - Here's a few that are related to the ones above.

  1. Your ultimate goal is to write clear and correct code. Remember: if your code is clear, it is easier to make it correct.
  2. It is better to have a function do a conceptually consistent action than it is to keep the function short.
  3. It is better to keep the number of places that know about a data-structure's implementation layout very small. This is another variation on the Law of Demeter but it really bears repeating. Note that macros and inline functions do not really conceal the knowledge of a data-structure; they just hide it from the programmer and not from the code itself.
  4. If you're allocating structures, initialise all the fields at the same time. Better yet, write a function to do allocation and basic initialisation (either to valid null values or to obvious marker/guard values) and use that function everywhere else.
  5. Do not use the result of assignment (assuming you're using a language that defines it.) Sure it's defined, but it leads to really murky code. Do the assignment on a separate line (yes, you can afford it.)
  6. KPV A generalization of the above tip is to avoid horizontalizing code. I admit I'm often guilty of this in the quest for compactness (see tip #1).
  7. wdb Avoid making an object more powerful than necessary. If you have written a method, and later you see that this method is not really used, then don't hesitate to delete it. An object shouldn't be almighty. Less is more.
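
DKF's tip about allocating and initialising in one place is phrased for C structures, but the same idea can be sketched in Tcl with a dict. The newTask proc and its field names are hypothetical, chosen only for illustration. Every field receives a valid null value in exactly one place, which also keeps knowledge of the layout confined to a single proc.

```tcl
# Hypothetical record type: a "task" stored as a dict.
# All fields are initialised here and nowhere else.
proc newTask {title} {
    return [dict create \
        title    $title \
        done     0 \
        tags     {} \
        assignee {}]
}

set task [newTask "write tests"]
dict set task done 1
puts [dict get $task title]   ;# prints: write tests
```

Callers never build the dict by hand, so adding a field later means touching only newTask.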

Duoas I think that the names we give things in our code often express how well we actually understand that code. A few additional thoughts:

It is worth every programmer's time to read and appreciate Ottinger's Rules for Variable and Class Naming. It is an excellent, straightforward, peer-reviewed (that is, by our peers --i.e. programmers) treatise. Some of the following reiterates stuff in the paper (but is not an excuse not to read it ;-)

Really Good Naming Conventions:

  • Use nouns (and noun phrases) for variables. A variable is a thing to be acted upon; it exists and nothing else.
  • Use predicates (verbs and verb phrases) for procedures. A procedure does something. It acts upon something else.
  • PascalCase/camelCase/CamelCaps/etc. Use them. (Tcl core conventions encourage both forms, where UpperCaps is used for function and type names and lowerCaps is used for variables.)
  • Underscore_delimited_names. Also good. (Personally, I prefer them.) They make for natural reading, while the old space and visibility concerns are a thing of the past. Mixture with CamelCase is also fine, so long as it is consistent.
  • Use names that you can actually speak in natural language. Words from your local language are OK, but for code with international scope just stick to English.
  • Short or one-letter names are fine iff (1) they have a universally recognized meaning, and (2) they are local to small, bounded contexts (such as a function or loop of 10-15 lines or fewer). For example, x and y are universally understood to refer to some horizontal and vertical tabulation, typically in graphics, as row and col are oft used for textual tabulations. Likewise, src and dest are instantly understood to be 'source' and 'destination'. The simple n or i are fine for one- or two-line loops where the actual index count is inconsequential or immaterial to the algorithm. An s is fine for a string in the middle of some transformation. Functional programmers will use x and xs for temporary list constructs; they are instantly recognized as car/element and cdr/(remaining) elements. But, though an fft is a fast Fourier transform, a better name would be FastFourierTransform.

Horribly Wrong Naming Conventions (or 'Corollary')

  • sTuDlyCapS and other l33t forms are for weenie wannabes; not professionals. Likewise SUIT_friendlyFORMs are to be despised.
  • Hungarian notation [1] is evil. Unfortunately it has seduced countless masses for many years, meaning it is often inescapable when handling existing code. Only use it if you are required to do so. (Yes, this one is still a hot religious issue for many. I found an interesting paper advocating Hungarian notation [2]. Personally, I don't believe a Microsoft employee is really an impartial commentator, and I think it succinctly demonstrates some of the warped thinking that Hungarian notation inculcates... but that's all I will say about that here.)
  • Positional notation, PRIME-MODIFIER-CLASS notation, and other silly 'one size fits all' schemes are likewise dead-weight traps that focus more attention on form and deciphering than meaning.
  • There is no longer any excuse for short, vowel-less, scrunched-up variable names (like ctmrnm and lclwtt --can you guess what they mean? [They do mean something]). Tcl can handle names of any (non-negative) size, and ANSI-C can handle 31 characters at minimum. That is plenty of space for legible names. If you find that your names are exceeding 30 or so characters, then you are killing your keyboard. Here is an example of just how much space that is:

  • Scrunchedupnames are nearly as unreadable. English script delineates on lexemes, and people's brains are trained to read it that way.
  • Even highly-specialized, domain-specific applications should avoid abbreviations and symbols that require considerable domain knowledge. Often, there is more than one symbol to express the same thing in mathematics, physics, and other sciences. Don't take shortcuts. Just because you are an expert in the domain doesn't mean that anyone else will recognize the notation.
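
As a rough illustration of how much room the 31-character ANSI C minimum mentioned above leaves, here is a fully spelled-out name of exactly that length. The name itself is hypothetical, picked only because it happens to fit the limit.

```tcl
# Hypothetical name, chosen only because it is exactly 31 characters long.
set exampleName customerAccountBalanceInDollars
puts [string length $exampleName]   ;# prints 31
```

Four whole English words fit with no vowels dropped.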

Well, that's enough from me. :-P

NEM See also the Tcl Style Guide and the Tcl/Tk Engineering Manual (TIP 247).

I'd add that while the above advice is mostly good and useful, real quality software comes from a firm grasp of the fundamentals of computer science and software engineering, rather than just coding style guidelines. Learn about the underlying theories (mathematics, logic, set theory, number theory, data structures, algorithm analysis, etc, etc) and how to apply them. Learn as many programming languages as possible (including Logic Programming, Functional Programming, OOP, etc). Learn about concurrency, networking, databases, and so on. Then worry about whether your variable names are stylishly capitalized.

RS 2013-11-07: My rules for readable code include using habitual names instead of being very creative. For instance, my stand-alone scripts always have the pattern (borrowed from C):
proc main argv {
    # ... script body goes here ...
}
main $argv

Or, the frequent operation of reading lines from a file:
set f [open $filename]
while {[gets $f line] >= 0} {
    # ... process $line here ...
}
close $f
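
Combining this habitual file-reading loop with the earlier tip about not assuming a command won't return an error gives something like the sketch below. readLines is a hypothetical helper name, and try ... finally assumes Tcl 8.6 or later.

```tcl
# Hypothetical helper: return the lines of a file as a list, or
# raise an error that names the file if it cannot be opened.
proc readLines {filename} {
    if {[catch {open $filename r} f]} {
        # On failure, $f holds the error message from open.
        error "cannot read \"$filename\": $f"
    }
    try {
        set lines {}
        while {[gets $f line] >= 0} {
            lappend lines $line
        }
        return $lines
    } finally {
        # Runs whether the loop finished or raised an error.
        close $f
    }
}
```

The channel is closed on every path out of the proc, so a failure in the middle of reading does not leak it.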

LES 2013-11-07: Although I really appreciate the content of this page, I don't see much in the way of "tips for writing quality software" in it. It has, instead, plenty of tips for writing quality CODE and strictly from the point of view of maintenance, which is too narrow to say the least or a whole different topic to say the most. You can always write top-notch maintainable code and end up with a crappy application, so unrelated the two things are.

DKF 2016-07-09: The key principles for writing quality software are to design for clear purpose, with low coupling and lots of testing. There are many sorts of testing. There are many notions of clear purpose. There are many levels at which it is often a good idea to reduce coupling. They are all relevant.

Oh, and approach everything with good taste. Any specific technique can be overdone.

Further Reading

The Power of Ten – Rules for Developing Safety Critical Code, Gerard J. Holzmann, NASA/JPL Laboratory for Reliable Software
Ten rules that lead to code that can be analyzed mechanically.