Version 41 of damage by duplication

Updated 2007-03-16 11:59:39

Anyone who's worked on a database knows that it's a Bad Thing to duplicate information in such a way as to require (or permit) updates in different places for the same information. The DB term for designing this kind of duplication away is normalization.

One reason this is a Bad Thing is (as experience has taught us): if there are two places to update something, sometime somewhere someone will update one, and not the other - what began as two identical copies will diverge over time.
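A minimal sketch of that divergence, in Python. The page names, author name, and email addresses are hypothetical and serve only as illustration: the same fact is first stored twice, then stored once and referenced.

```python
# Duplicated: the same (hypothetical) email address is stored on two pages.
duplicated = {
    "page_a": {"author": "alice", "email": "alice@example.org"},
    "page_b": {"author": "alice", "email": "alice@example.org"},
}

# Sometime, somewhere, someone updates one copy and not the other.
duplicated["page_a"]["email"] = "alice@example.net"
diverged = duplicated["page_a"]["email"] != duplicated["page_b"]["email"]

# Normalized: the fact lives in one place; pages only hold a reference to it.
authors = {"alice": {"email": "alice@example.org"}}
pages = {"page_a": {"author": "alice"}, "page_b": {"author": "alice"}}


def email_for(page):
    """Follow the reference from a page to the single stored email."""
    return authors[pages[page]["author"]]["email"]


# A single update is now visible from every page that refers to it.
authors["alice"]["email"] = "alice@example.net"
consistent = email_for("page_a") == email_for("page_b")
```

With the duplicated layout the two copies disagree after one update; with the referenced layout there is only one copy to change, so they cannot.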

Redundancy (defined as duplication of information in two or more distinct places) can be protective: one might back up a database, for example, or one might store a checksum somewhere.
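As an illustration of protective redundancy, here is a small sketch of the checksum case, assuming SHA-256 from Python's standard `hashlib` module as the checksum (the text does not name a particular algorithm). The sample strings are made up.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


original = b"One fact in one place."
stored_digest = checksum(original)  # kept somewhere apart from the data

# Later: recompute and compare. The digest duplicates information about
# the data, but only in order to detect damage, never to be edited by hand.
intact = checksum(original) == stored_digest
damaged_copy = b"One fact in two places."
detected = checksum(damaged_copy) != stored_digest
```

The stored digest is redundant by design, yet nobody is tempted to update it independently of the data, which is what separates it from the harmful kind of duplication.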

The whole point of hypertext is to permit references to be constructed and copied, to permit hyperlinking instead of quoting. It therefore never makes sense to duplicate information for the purpose of referring to it.

One virtue of Wiki is that it enables full-text searching, a.k.a. "keyword in context" searching. So a Wiki partakes of some of the virtues of a document (editing) and some of the virtues of a DB (searching). This is a Good Thing. This is why Wiki is useful as a collaborative space and as a repository.

The recurrent desire to construct encyclopaedic indices of cluelessness, most recently expressed in Call for suggestions for presentation of Tcl Apps, is a form of duplication: duplication of search results. Someone constructs a plausible search, captures it to a page, and makes an index out of it.

Publishing pre-digested searches on this wiki is bad, for several reasons:

  1. they are an out-of-date snapshot of the constructed/captured search.
  2. they occupy space in the title space of pages.
  3. they compete with the primary materials for searches.
  4. they necessarily duplicate information in different places.
  5. they carry with them a hermeneutics, which they tend to reify, and whose privileged expression tends (for reasons 1-4) to drown out other voices.
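Reasons 1 to 4 can be sketched with a toy wiki. Everything here is an assumption made for illustration: the page titles, the page bodies, and the naive substring `search` function, which is not how this wiki's search actually works.

```python
# A toy wiki: page title -> page body (hypothetical content).
wiki = {
    "Tcl Apps A": "a small application written in Tcl",
    "Tk widgets": "notes on widgets",
}


def search(term):
    """Naive full-text search: titles of pages whose body contains the term."""
    term = term.lower()
    return sorted(title for title, body in wiki.items() if term in body.lower())


# Someone runs a search once and keeps the result as an "index".
snapshot = search("application")

# The wiki keeps growing; the captured result does not (reason 1).
wiki["Tcl Apps B"] = "another application, added later"
stale = snapshot != search("application")

# Publishing the snapshot as a page takes up a title (reason 2), repeats
# what the pages already say (reason 4), and turns up in later searches
# next to the primary material (reason 3).
wiki["Index Apps"] = "Index of application pages: " + ", ".join(snapshot)
live = search("application")
```

After these steps `snapshot` still lists only the first page, while `live` contains both application pages plus the index page itself.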

So, in summary: merely because one individual has difficulty using the search facilities, and prefers to view his/her information as a pre-digested index, or even a structured pre-digested index (taxonomic pabulum), and merely because that individual has successfully constructed a search on some occasion is no reason for that individual to construct a so-called index page as a monument to their transitory moment of glory.

escargo 6 Mar 2007 - I understand your point, but I am not sure I agree entirely. Here's why.

  1. The search facility of this wiki is not as powerful as it could be.
  2. An index that is built rather than being captured from a search could add value to the Wiki rather than adding noise.
  3. I don't agree that a taxonomic set of indices is equivalent to a pre-digested index.
  4. As compilations are recognized as legitimate original works, so a taxonomic structure might be an original work of some worth.

So, while it may be that in practice the indices you dislike are of no worth, it would be hasty to say that they are always of no worth.

CMcC point by point:

  1. if the search is weak, fix the search, don't write around it.
  2. if you are arguing that it is possible to build a valuable index, I agree. Such an index would have to have a lot of original content compared to its merely/purely referential content. As such, it would resemble a link-rich page. That is not what we're seeing in these -Index pages.
  3. any taxonomy represents some pre-selection, some hermeneutics, which by analogy is pre-digestion, in that it processes/selects/structures information before the reader can critically review it. Again, it could be done with style and quality - it's just not.
  4. absolutely - deriving a taxonomy is a useful work ... a lot of work. What I'm seeing is people banging what's not much more than screen caps of searches up on the wiki, calling them indices, and leaving it at that.

Human beings have been breaking information down hierarchically for ages. Perhaps it's a consequence of literacy; it certainly helps them deal with complexity. However: repackaging what search gives you does not. I was actually trying to be generous calling these indices 'taxonomy' - they're hierarchy for its own sake.

I think my DB normalization analogy is apt. We're suggesting that possibly there's some information here which someone sometime can't easily find. We're suggesting the search be duplicated, along with some of the page content, because it will be easier for this hypothetical person to find ... that is premature optimization, sacrificing correctness for convenience.

jcw - I agree. One fact in one place. This does not imply "each fact in a separate place", btw. As for fixing the search - yep (like so many things), there is always room for improvement. Or just google with the "site:" tag.

[HW's rant and mixed metaphors moved to the sock puppet drawer: [L1 ]]


wdb If I see in this wiki a page named "Index Apps", I trustfully expect apps. Everybody agrees about what an application is. If I see a page named "Index Apps Geography", my trust dies down. I will not try to explain why. Just a feeling. I am not annoyed about its existence, I just ignore it. Not my way.

The automatically generated lists of related pages in this wiki are fine. Such a list is phone-book-like, but I never had the ambition to read a phone book from top to bottom. I always just looked things up, alphabetically (on paper) or with some smart search function (in HTML).

(Btw, have you ever seen the movie Rain Man with Dustin Hoffman? Dusty, the rain man, sees the name tag of the waitress and says her phone number. His brother asks him: have you read the whole phone book? Dusty says: No. Just until "K".)


(HW) CMC how can I possibly be Abitbol ? CMcC the real question is how can Abitbol possibly be Abitbol?


MSH - I've been using TiddlyWiki [L2 ], which says: one fact in one place. Was adding tags to these wiki pages ever considered? Tags are relatively simple to handle but give powerful index access.
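A rough sketch of what tag-based index access could look like, assuming each page carries a set of tags. The page titles, tag names, and the `build_tag_index` helper are hypothetical and do not describe TiddlyWiki or this wiki.

```python
from collections import defaultdict

# Hypothetical pages, each carrying its own tags: one fact in one place.
page_tags = {
    "Page A": {"application", "geography"},
    "Page B": {"application"},
    "Page C": {"discussion"},
}


def build_tag_index(page_tags):
    """Derive a tag -> sorted list of page titles mapping from the pages."""
    index = defaultdict(list)
    for title in sorted(page_tags):
        for tag in page_tags[title]:
            index[tag].append(title)
    return dict(index)


# The index is computed on demand rather than stored as a page of its own,
# so it cannot drift out of date relative to the tags.
tag_index = build_tag_index(page_tags)
```

Because the index is derived from the tags each time, there is no second copy to maintain: changing a page's tags changes the index.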

(Administrator message to 24.37.108.108/videotron.ca and to 124.189.1.146: please identify yourself, or email me. -jcw ... CMcC here - I was 124.189.1.146)


Category Discussion | Category Concept