Version 16 of Rust

Updated 2023-01-22 22:54:50 by pooryorick

Rust is a programming language designed to be performant, safe, and productive.

Description

Rust combines concepts from C, C++, Haskell, and other languages. Each value is either a primitive value or a structure, and each value has a type. The compiler uses the type of each value and function to ensure that a program is type-safe, and to compile it to performant machine code. Where the compiler cannot infer the type of a value, the variable, parameter, or result that the value is assigned to must be annotated with a type. The key innovation of Rust is that for each location in memory there is a code block that is its owner. This makes it possible for the compiler to reason about the lifetime of values, which relieves the developer of the burden of deciding when to free resources, and opens up new avenues for optimization where copy-free data management can be worked out at compile time.

A struct may contain either named or unnamed fields. Each field holds a value of some type: a primitive value, another struct or enum, a function pointer, or a reference to any of these.

An enum is a type that enumerates a set of possible variants, each of which may carry its own data, such as a struct or a reference to one. match is used to handle each of the possible variants of an enum.
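As a minimal sketch of structs, enums, and match, the following may help. The names Point, Measure, Shape, and measure are invented for illustration and do not come from any library.

```rust
use std::f64::consts::PI;

// A struct with named fields.
struct Point {
    x: f64,
    y: f64,
}

// A tuple struct: its single field is unnamed and is accessed as `.0`.
struct Measure(f64);

// An enum whose variants carry different kinds of data.
enum Shape {
    Circle(f64),
    Rect { width: f64, height: f64 },
    Segment(Point, Point),
}

// `match` must cover every variant; omitting one is a compile-time error.
// Returns the area of a circle or rectangle, or the length of a segment.
fn measure(shape: &Shape) -> Measure {
    match shape {
        Shape::Circle(radius) => Measure(PI * radius * radius),
        Shape::Rect { width, height } => Measure(width * height),
        Shape::Segment(a, b) => {
            Measure(((a.x - b.x).powi(2) + (a.y - b.y).powi(2)).sqrt())
        }
    }
}

fn main() {
    let rect = Shape::Rect { width: 2.0, height: 3.0 };
    let seg = Shape::Segment(Point { x: 0.0, y: 0.0 }, Point { x: 3.0, y: 4.0 });
    let circle = Shape::Circle(1.0);
    println!("rect: {}", measure(&rect).0);
    println!("segment: {}", measure(&seg).0);
    println!("circle: {}", measure(&circle).0);
}
```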

Each type has a set of functional features, which are organized into traits. Each trait specifies a set of functions that a type must provide in order to implement the trait, acting as an interface to the value or struct. Such a function is called a method when its first argument is the value itself or a reference to it (self), and it typically implements a method for performing some operation on the value.

impl is used to provide a set of function definitions that implement a particular trait for a particular type, such as a struct. Any number of traits may be implemented for a given type.
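A small sketch of a trait and two impl blocks for one struct follows. Describe and Temperature are hypothetical names chosen for the example; std::fmt::Display is the standard library trait.

```rust
use std::fmt;

// A trait: an interface that any type may choose to implement.
trait Describe {
    // A method: its first argument is a reference to the value itself.
    fn describe(&self) -> String;
}

struct Temperature {
    celsius: f64,
}

// Implement the locally defined trait for the struct.
impl Describe for Temperature {
    fn describe(&self) -> String {
        format!("{:.1} degrees Celsius", self.celsius)
    }
}

// The same struct may implement any number of traits,
// here one from the standard library.
impl fmt::Display for Temperature {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:.1}C", self.celsius)
    }
}

fn main() {
    let t = Temperature { celsius: 21.5 };
    println!("{}", t.describe());
    println!("{}", t);
}
```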

If the Copy trait is implemented for a type, then a value of that type may be trivially copied, i.e. copying it does not involve any memory allocation from the heap. When such a variable is assigned to another variable, a copy is made and both variables remain valid. When a variable whose type doesn't implement Copy is assigned to another, the value moves, and the compiler avoids aliasing by making the variable holding the original invalid.

If the Clone trait is implemented for a type, then a value of that type may be cloned: A new independent value is produced by copying the underlying data.
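The difference between copying, moving, and cloning can be sketched as follows. Celsius and copy_and_clone are names made up for this example.

```rust
// Deriving Copy (which also requires Clone) makes assignment a plain copy.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Celsius(f64);

fn copy_and_clone() -> (Celsius, Celsius, String, String) {
    let a = Celsius(21.5);
    let b = a; // Copy: `a` remains valid after the assignment.

    let s = String::from("hello");
    let t = s.clone(); // Clone: an independent copy of the heap data.
    let u = s; // Move: String is not Copy, so `s` is now invalid.
    // println!("{}", s); // would not compile: value used after move

    (a, b, t, u)
}

fn main() {
    println!("{:?}", copy_and_clone());
}
```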

Like Tcl, Rust does not employ a garbage collector, but it also does not normally employ reference counting. Instead, each variable is cleaned up and resources deallocated accordingly as soon as the block of code that owns it is finished with it. For each variable, at any point in the program there is one owner, and either zero to one writer, or zero to many readers. No reader or writer may outlive the owner.
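To make the cleanup order visible, here is a sketch that records when each value is dropped. Noisy and scopes are hypothetical names; Drop and RefCell are from the standard library.

```rust
use std::cell::RefCell;

// A value that records its own cleanup in a shared log.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

// Drop::drop runs automatically when the owning block is finished.
impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

fn scopes() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
            log.borrow_mut().push("end of inner block".to_string());
        } // `_inner` is cleaned up here.
        log.borrow_mut().push("end of outer block".to_string());
    } // `_outer` is cleaned up here.
    log.into_inner()
}

fn main() {
    for line in scopes() {
        println!("{}", line);
    }
}
```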

In both a type annotation and a function definition, & indicates that the variable holds a reference to a value. Rust tracks all references and ensures that no reference has a lifetime greater than the lifetime of the variable to which it refers. Where the compiler cannot infer the lifetime of a reference, the lifetime must be explicitly annotated with a named lifetime, as in &'a. This management of data lifetimes eliminates a whole class of issues that a C program might be vulnerable to.
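A brief sketch of inferred and explicit lifetimes follows. The functions first_word and longest are illustrative only; the lifetime name 'a is an arbitrary choice.

```rust
// One reference in, one reference out: the compiler infers that the
// result lives no longer than `s`, so no annotation is needed.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// Two references in: the compiler cannot tell which one the result
// borrows from, so the lifetime must be named explicitly.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn main() {
    let text = String::from("hello world");
    println!("{}", first_word(&text));
    println!("{}", longest("tcl", "rust"));
}
```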

When a function accepts a value as an argument, the given value moves to the corresponding variable in the function, making the scope of the function call the new owner, and the original variable becomes invalid, which prevents aliasing. When a function returns data, the returned data moves to the caller and the caller becomes its new owner. Data may also be assigned or passed by reference, in which case the original variable continues to own the data.
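The three cases described above (moving into a function, moving out of one, and passing by reference) might look like this. The function names make, sum, and consume are invented for the sketch.

```rust
// Returning a value moves it to the caller, which becomes the owner.
fn make() -> Vec<i32> {
    vec![1, 2, 3]
}

// Taking a reference borrows the data; the caller keeps ownership.
fn sum(v: &[i32]) -> i32 {
    v.iter().sum()
}

// Taking a value moves it into the function, which now owns it
// and cleans it up when the function returns.
fn consume(v: Vec<i32>) -> usize {
    v.len()
}

fn main() {
    let data = make();
    let total = sum(&data); // `data` is only borrowed and stays valid.
    let count = consume(data); // `data` moves and is invalid afterwards.
    // println!("{:?}", data); // would not compile: value used after move
    println!("{} {}", total, count);
}
```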

Both variables and references are immutable by default. mut declares a mutable variable, and &mut declares a mutable reference. To prevent data races, while a mutable reference to a value exists, no other references to that value are allowed.
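A minimal illustration of mut and &mut, where push_twice is a hypothetical helper:

```rust
// A mutable reference lets the function modify the caller's data.
fn push_twice(v: &mut Vec<i32>, x: i32) {
    v.push(x);
    v.push(x);
}

fn main() {
    let fixed = vec![1];
    // fixed.push(2); // would not compile: `fixed` is not declared mut

    let mut v = vec![1];
    push_twice(&mut v, 7); // the mutable borrow ends when the call returns

    let a = &v; // any number of shared references may now coexist
    let b = &v;
    println!("{:?} {:?} {:?}", fixed, a, b);
}
```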

Programs and libraries are organized into modules, which contain any of the various programming artifacts Rust provides, namely structures, variables, functions, traits, and trait implementations.

To maintain coherence, a crate (Rust's unit of compilation, which contains a tree of modules) may only implement a trait for a type if either the trait or the type is local to that crate. The type of some data may be specified by specifying what traits are required, which allows duck typing at compile time. A trait object carries a table of functions that is looked up at runtime, which allows duck typing at runtime.
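The two kinds of duck typing can be contrasted in a short sketch. Speak, Dog, Cat, announce, and chorus are names invented for the example.

```rust
trait Speak {
    fn speak(&self) -> String;
}

struct Dog;
struct Cat;

impl Speak for Dog {
    fn speak(&self) -> String {
        "woof".to_string()
    }
}

impl Speak for Cat {
    fn speak(&self) -> String {
        "meow".to_string()
    }
}

// Compile-time duck typing: any T that implements Speak is accepted,
// and a specialized copy of the function is compiled for each T.
fn announce<T: Speak>(animal: &T) -> String {
    format!("It says {}", animal.speak())
}

// Runtime duck typing: each `dyn Speak` trait object carries a pointer
// to a function table, so one list can mix different concrete types.
fn chorus(animals: &[Box<dyn Speak>]) -> Vec<String> {
    animals.iter().map(|a| a.speak()).collect()
}

fn main() {
    println!("{}", announce(&Dog));
    let zoo: Vec<Box<dyn Speak>> = vec![Box::new(Dog), Box::new(Cat)];
    println!("{:?}", chorus(&zoo));
}
```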

Rust has various features for programming the structure of the program itself, i.e. for metaprogramming: A macro language, closures, syntax for templating generic structs and functions, and pattern matching similar to that of Haskell, on language items for such purposes as destructuring variable assignment and code branching according to data type.

Rust is actually segmented into two different languages: safe Rust and unsafe Rust. In the safe parts, nothing is undefined, nothing is null, only safe type conversions are possible, and only safe memory operations are allowed. There is, however, an escape hatch: A function or block annotated as unsafe takes upon itself the responsibility to guarantee the same things that the Rust compiler guarantees, so the compiler no longer verifies the operations that only unsafe code may perform, such as dereferencing a raw pointer. This also makes it possible to call a function written in another language such as C, if needed.
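The following sketch shows an unsafe function wrapped by a safe one. The names read_raw and read_safely are hypothetical, and the example stays within Rust rather than calling into C.

```rust
// An unsafe function: the caller, not the compiler, must guarantee
// that `ptr` is non-null, aligned, and points to a live i32.
unsafe fn read_raw(ptr: *const i32) -> i32 {
    // Dereferencing a raw pointer is only permitted in unsafe code.
    unsafe { *ptr }
}

// A safe wrapper: a reference is always valid, so the guarantee that
// read_raw requires holds for every possible call of this function.
fn read_safely(value: &i32) -> i32 {
    unsafe { read_raw(value as *const i32) }
}

fn main() {
    let n = 42;
    println!("{}", read_safely(&n));
}
```

Keeping the unsafe part small and exposing only a safe wrapper is a common way to limit how much code has to be audited by hand.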

One of the design goals of Rust is for the compiler to be able, as much as possible, to infer the intent of the programmer. The compiler also takes a more active role than most compilers in making suggestions for fixing problems that it finds. This helps to smooth the learning curve.

Memory Management: Tcl Vs Rust

Both Tcl and Rust provide automatic memory management without using a garbage collector, and both are memory-safe. Where Tcl uses reference counting, Rust uses its borrow rules to constrain the use of references, and tracks all references at compile time to ensure that a value may be cleaned up immediately when the program exits from the block that owns it. Tcl's internal reference counting relies on the developer to properly increment and decrement references, which is error-prone since proper reference counting can not be checked for correctness. Because Rust would automate the task of reference counting, relieving the developers of Tcl from this burden, it may prove to be an ideal implementation language for Tcl.
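Where reference counting is wanted in Rust, the standard library's Rc type does the incrementing and decrementing itself. This is only a sketch of that mechanism, not of how a Tcl implementation would use it; the function counts is invented for the example.

```rust
use std::rc::Rc;

// Returns the reference count observed at three points in time.
fn counts() -> (usize, usize, usize) {
    let a = Rc::new(String::from("shared"));
    let initial = Rc::strong_count(&a);

    let b = Rc::clone(&a); // increments the count; the data is not copied
    let shared = Rc::strong_count(&a);

    drop(b); // decrements the count; nothing is done by hand
    let released = Rc::strong_count(&a);

    (initial, shared, released)
} // `a` goes out of scope: the count reaches zero and the String is freed.

fn main() {
    println!("{:?}", counts());
}
```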

Object Orientation

Rust does not provide traditional object-oriented programming facilities. Inheritance has proven to be a liability, so Rust doesn't have it. Instead, the emphasis is on structures and the interfaces that those structures support. Each structure has a type, and where inheritance might be used in other languages, it is common instead to create a new type of structure and implement customized traits for it. Composition is achieved by having a field in one structure be a reference to another structure. This design allows for zero-cost abstractions, since all the details of function dispatch are worked out at compile time. Where needed, Rust also provides facilities for runtime dispatch.
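A sketch of composition plus traits in place of inheritance follows. Vehicle, Engine, Car, and Motorcycle are made-up names, and for brevity each field holds the other structure directly; a reference would also work but would need a lifetime annotation.

```rust
// The shared interface, where another language might use a base class.
trait Vehicle {
    fn describe(&self) -> String;
}

struct Engine {
    horsepower: u32,
}

// Composition: a Car has an Engine rather than inheriting from one.
struct Car {
    engine: Engine,
    doors: u8,
}

struct Motorcycle {
    engine: Engine,
}

impl Vehicle for Car {
    fn describe(&self) -> String {
        format!("{}-door car, {} hp", self.doors, self.engine.horsepower)
    }
}

impl Vehicle for Motorcycle {
    fn describe(&self) -> String {
        format!("motorcycle, {} hp", self.engine.horsepower)
    }
}

fn main() {
    let car = Car { engine: Engine { horsepower: 150 }, doors: 4 };
    let bike = Motorcycle { engine: Engine { horsepower: 90 } };
    // Both calls are resolved at compile time; no lookup happens at runtime.
    println!("{}", car.describe());
    println!("{}", bike.describe());
}
```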

Rust Vs. Tcl

Both languages are founded on the concept of immutable values, but Tcl values are much more abstract, both semantically and technically. In Tcl, copy-on-write can be implemented underneath the surface of the language whereas in Rust, a programmer deals directly with such issues. This level of abstraction is the most fundamental difference between the languages.

As value-oriented languages, both languages have no concept of null, which is anything but a value.

Both languages are memory-safe, assuming that the underlying implementation/compiler is free of bugs. Rust is type-safe, both in the sense of memory layout and data structure, and also in the sense of the semantics of the data and the operations on it, as only the intended structure is accepted as the argument of a function or the value of a variable. While Tcl is not type-safe, it is also not type-unsafe: Each procedure may choose to implement the necessary runtime type verification. However, even then a procedure has no way to determine which program component declared the data type, and therefore can never achieve the level of semantic type safety that Rust does.

For a program written in Rust, the additional burden of annotating data types and tracking data mutability and ownership is high compared to the practice of taking things at face value in Tcl. All the type annotation, templating, and boilerplate code makes a program written in Rust far more verbose, and development far slower. Tcl is far more concise than Rust, but far less performant. In addition to performance, another big win for Rust is that all the additional type information a programmer provides in the source code allows the compiler to verify some aspects of its correctness, and probably reduces the size of any accompanying test suite by at least the same amount of code and effort. If a program compiles successfully, it's already some indication that the program works as expected. An equivalent program in Tcl would have to ensure the same level of correctness via a series of unit tests.


"When you use Rust, it is sometimes outright preposterous how much knowledge of language, and how much of programming ingenuity and curiosity you need in order to accomplish the most trivial things. When you feel particularly desperate, you go to rust/issues and search for a solution for your problem. Suddenly, you find an issue with an explanation that it is theoretically impossible to design your API in this way, owing to some subtle language bug. The issue is Open and dated Apr 5, 2017." [L1 ]


dkf - 2022-06-15 14:30:02

I suspect that Rust will not prove to be as ideal a language to implement Tcl as some think. There are a number of concerning aspects, but the limitations on traits and lifetimes would seem to be the main ones (along with the general hostility to dynamic linking); it's not at all clear how one might have third party commands implemented (where those need the equivalent of ClientData) and the lifetime management is almost certainly going to end up forcing a lot more copies of things being used, which is known to be expensive.

The final problem, the one that really dissuades me, is the sheer cost in developer effort of moving a codebase the size of Tcl and Tk.