A branch doesn't know it is a branch. It thinks it is God.
A tree is an directed acyclic graph (DAG) data structure composed of branches. One branch is the trunk, and every other branch stems from it, either directly or indirectly. Trees can be used, e.g., for representing the result in syntax parsing. An XML document is structured as a tree.
EMJ 2021-01-04
So a node is a branch. Is a branch a node? I think this terminology is just confusing and counter-intuitive.
PYK 2021-01-04
One term is needed for the node that does not stem from any other node, and one term is needed for the other nodes. "trunk" and "branch" seemed suitable to me, as they fit it with the analogy to a tree in the physical world. What are the other options?
EMJ 2021-01-04
I am aware of the need for two terms. In the physical world, a trunk is a path from the root to an end point, usually taken to mean the "thickest" such path. A branch is the same, except it starts at a node other than the root.
I have used root and node for a very long time, and seen it used by others in very many places. I don't think I've seen trunk and branch used with that particular meaning at all.
PYK 2020-01-05
I tried just now to replace "trunk" with "root", but it didn't work out. The problem may be the term, "node". There's no need for it. It's branches all the way down. I've removed the word "node" where possible in this page. If everything is branches, "root" isn't the right term because a root isn't a branch. A "trunk", on the other hand, is a branch, so it still seems like the better term.
I would define a tree (as a summary what I read in internet) as:
A tree exists out of nodes. Nodes are the elements which contains the data.
A tree can't be empty. It contains at least out of one node.
Nodes are connected by edges. Edges are often called connections or branches. The edges between root and its children is sometimes named trunk. Edges define the relation between nodes.
parent and children are relations of nodes.
Every node has exactly one parent beside the root node which has no parent.
Every node has one or more children beside leaves which have no children.
Ancestors are all nodes from one node point of view in direction of the root and including the root.
Descendant are all nodes from one node point of view in direction of the leaves and including the leaves.
No node can have an edge to an ancestor beside its parent.
No node can have an edge to a descendant beside its children.
Another definition could be (summary from what I read in book "Algorithmen in C" by Robert Sedgewick; there is an english version of this book, too):
A tree is a not empty amount of nodes and edges.
A node is a simple object with a name and other associated data.
An edge is the connection between two nodes.
A path in a tree is a list of nodes in which the sequential nodes are connected by edges.
One node is defined as root node. Between root and every other node of the tree exists exactly one path.
A tree is a collection of named elements, called nodes, one of which is distinguished as a root, along with a relation ("parenthood") that places a hierarchical structure on the nodes. (Data Structures and Algorithms; Aho, Hopcroft and Ullman; Addison-Wesley, 1987).
The book is a classic data structures text.
HE 2022-05-03
As far as I remember the book cited by EMJ and the documentation of struct tree package defines the tree formally as:
A single node by itself is a tree. This node is also the root of the tree.
Suppose n is a node and T1, T2, . . . , Tk are trees with roots n1, n2, . . . , nk, respectively. We can construct a new tree by making n be the parent of nodes n1, n2, . . . , nk. In this tree n is the root and T1, T2, . . . , Tk are the subtrees of the root. Nodes n1 , n2, . . . , nk are called the children of node n.
A tree can be represented as a deep dictionary: Each key is the name of a branch and each value is either a dictionary of further branches or a list containing a single value which is the value corresponding to that key.
AMG: A tree is sometimes called a DAG because it is a directed acyclic graph. I find this to be a useful mnemonic.
NEM: Well, all trees are DAGs, but not all DAGs are trees.
AMG: Yes, quite right; see the other tree properties listed in the first line of this page.
LV In the following text, I notice a number of cases where GUI widget sets have some sort of tree widget. In each of these cases, is the data structure intermingled with the data representation? Seems a shame that someone doesn't just build a graphically viewable widget that depends on a data structure that is separate. That way, someone could do work with one piece of code, and then, in other pieces, display that data. Seems like a natural method of developing and maintaining code....
In Tcl, data structures such as lists and dictionaries are values. In order to form a value that represents a tree, a value that represents a branch is needed. A branch typically has three components: A value, attributes, and branches. The purpose of an attribute is to provide additional information about the branch itself rather than the value it contains or the thing it refers to in a model. In other words, attributes describe the branch itself rather than the thing the value of the branch describes. A good format for a branch might be a list where the first item is the value of the branch, additional items are keys and values representing attributes of the branch, and any remaining final item is a list of further branches. In other words:
value ?{key 1}{attribute 1}? ... ?branches?
The advantages of this format are that it is as standard Tcl list, and that both attributes and branches can be independently omitted or provided in a concise way.
For processing efficency and also for verification, it may be useful to provide a count of the items in a list representing a branch or list of branches. Such a count would quickly indicate the number of attributes, and also whether there are any branches. This could be provided using a notation similar to that of Tcl's {*} operator: Lists are preceded by a number enclosed in braces:
This is not a valid Tcl list, but as an extension it makes about as much sense as extending Tcl with the {*} did, and perhaps more, since it isn't as disruptive to the larger system.
The same notation could make the representation more robust by providing the number of characters used to represent each value: