Intro to compiled languages (Rust)#
For the classic approach, please use the jupyter notebook in this directory to follow the lecture notes. This is a complete rewrite using Rust.
Intro to compiled languages#
So far, we’ve used interpreted languages. These work by distributing a compiled “interpreter”, a program that runs some set of instructions that is given to it “at runtime”, that is, well after the program was compiled. These are easy to run, but they tend to have some drawbacks:
They tend to be (much) slower due to taking general input.
They require an interpreter on the target machine.
They generally require a lot of setup (packaging) to ensure all the libraries can be found. (Similar things are true if you use dynamic linking on a compiled language)
Interpreters take time to start up and load std libraries (40 ms or so for Python).
These drawbacks may often be fine - for example, if you are limited by IO speed, using a compiled language won’t be much faster. Use the best tool for the job!
What language?#
There are several good choices for a compiled language. A quick summary:
Fortran: The classic numerics language. Poor support from modern compilers.
C: The classic compiled “systems” language. Simple, but provides very little to make safe designs.
C++: Much more powerful design possibilities compared to C. But all the old unsafe stuff is still there too. Higher level than C.
All three of these are older than modern packaging, so using them is quite tricky. Modern languages take packaging seriously; a few of those are:
Go: Google tool “all the best stuff” from previous languages and made a solid language. Has a garbage collector.
Rust: A “memory-safe” language without a garbage collector. Used in Linux/Windows kernels, but as powerful as (very modern) C++ in many of it’s concepts, without the old ways of doing things.
Of these, the most interesting language is currently Rust - it’s “systems” design means you can use it everywhere (unlike Go, which is only for applications).
Step 1: Getting the compiler#
Just like downloading the interpreter in an interpreted language, compiled
languages require you to download a compiler. Unlike interpreted languages,
however, only the developer needs the download. You can probably get the Rust
compilers directly; however, most developers prefer rustup
, which is a tool
that gets Rust compilers for you, allowing you to control the version, and
request the nightly features. You can go to https://rustup.rs/ for general
instructions on how to get it.
Brew users can brew install rustup
and then follow the instructions to get the
latest version, or just brew install rust
and accept whatever version was
given (usually the latest version). The main tool that you get with rust
besides the compiler (rustc
) is cargo
, its package manager.
It is also possible to get rust via Conda from the conda-forge channel. This
will come with rustc
and cargo
as well, and is usually the latest version.
If you use one of the conda family (conda, mamba, micromamba, pixi, etc), then
feel free to try this method. Rustup is not available via conda.
Step 1.5: Try compiling a little file#
Let’s just make sure we can compile code, and see how a compiler works. Make the
following example.rs
file:
fn main() {
println!("Hello world!");
}
Now, compile it with rustc
:
$ rustc example.rs
And run it:
$ ./example
Hello world!
We skipped one step you often see: linking. You can build library code with making an executable; then you can either statically link (final binary has everything) or dynamically link (the library is still required at runtime) the code. Your build system (CMake for classic languages, Cargo for rust) will generally handle setting this up.
Modern languages like Go and Rust highly prefer static linking by default. Binary size isn’t as important as it used to be, and having everything in a single binary makes it much easier to work with and distribute. They also strongly prioritize building all dependencies from source, which is important for cross-compiling and optimizations. Older languages tended to prioritize making source easy to keep from users.
Step 2: Start a new project#
Now that you’ve seen the compiler, let’s try the package-based approach. Use the following command to make a new project with Cargo:
$ cargo new foo
This will make a new directory /foo
with a binary package named foo
. (You
can pass --lib
if you are making a library.) You can add --vcs git
if you
want it to also set up git for you (due to the way cargo works, you’ll have a
.gitignore
regardless).
This makes just three files: .gitignore
with the /target
directory ignored,
a simple src/main.rs
hello world application, and creates the Cargo.toml
configuration file. That’s the new one we want to look at; it’s contents should
look about like this:
[package]
name = "foo"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
This has a package.name
, package.version
(which is now optional in Rust
1.75+!), and a package.edition
, which tells Rust that you are writing in the
2021 dialect of Rust. Unlike most other languages, you can mix and match
libraries in different editions.
Nothing else is required, and you now have a working package!
Step 3: Build and run your project#
You should interact with your project via cargo
commands. You can try building
and running your project:
$ cargo run
Compiling foo v0.1.0 (~/tmp/foo)
Finished dev [unoptimized + debuginfo] target(s) in 0.56s
Running `target/debug/foo`
Hello, world!
This will build in debug mode if it’s not already built. A couple of common
options are --bin <name>
if you have more than one binary and --release
for
release mode. As always, use --help
or -h
to see what you can run.
The file will end up in ./target/debug/foo
if you want to run the binary
directly.
Step 4: Add tests#
Now let’s start adding all the things we learned how to do in Python. First,
let’s tackle unit tests. Try adding the following to the bottom of
src/main.rs
:
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_42() {
assert_eq!(42, 42);
}
}
There are several new Rust features shown here, so let’s walk through it:
#[...]
is an “inner attribute” - that means it’s an attribute on the next item.cfg(...)
is an attribute that enables or disables an item based on what’s inside the parens. in this case,#[cfg(test)]
will only compilemod tests
if building tests.We put the tests in a module. That’s really mostly so we can apply an attribute to the module to conditionally compile it. We name it “tests” because that’s a good name for tests.
Unlike some languages, you don’t get the outer scope for free in an inner scope, so we explicitly
use super::*
to get all the private goodies in the outer scope. Pretend we had something in the outer scope we wanted to test, of course.Then we add a
test
attribute to next item (in this case, a function). Rust will then be able to find and run this tests for us.Finally, we have an
assert_eq!
macro. It will pass/fail based on if both items are equal, and it will be able to capture the values for nice printout on failure.
Notice that we write the unit tests in the file that contains the code. This will allow the unit tests to test private parts of the code that are not available outside this file.
Now, try running the tests:
$ cargo test
Compiling foo v0.1.0 (~/tmp/foo)
warning: unused import: `super::*`
--> src/main.rs:7:9
|
7 | use super::*;
| ^^^^^^^^
|
= note: `#[warn(unused_imports)]` on by default
warning: `foo` (bin "foo" test) generated 1 warning (run `cargo fix --bin "foo" --tests` to apply 1 suggestion)
Finished test [unoptimized + debuginfo] target(s) in 0.38s
Running unittests src/main.rs (target/debug/deps/foo-f51fd73e2da0c0ec)
running 1 test
test tests::test_42 ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Notice that we got a warning about our super
being unused. We even got a
recipe to fix it automatically! Let’s not worry about that for now. We want to
see this again from the linter in the next step!
Step 5: Format the code#
Let’s go ahead and format this, similar to black
or ruff fmt
from Python:
$ cargo fmt
Unless we’ve been sloppy, it likely won’t do much, but now we know how to format our source files!
Step 6: Static checks#
Now let’s try adding linting checks. We already get type checking whenever we compile, but let’s add helpful optional checks. We can run:
$ cargo clippy --all-targets
Checking foo v0.1.0 (~/tmp/foo)
warning: unused import: `super::*`
--> src/main.rs:7:9
|
7 | use super::*;
| ^^^^^^^^
|
= note: `#[warn(unused_imports)]` on by default
warning: `foo` (bin "foo" test) generated 1 warning (run `cargo clippy --fix --bin "foo" --tests` to apply 1 suggestion)
Finished dev [unoptimized + debuginfo] target(s) in 0.08s
Note: you might not have
cargo-clippy
if you installed Rust minimally. See https://doc.rust-lang.org/stable/clippy/installation.html for info if so.
This will run the checker (clippy) on all targets (including the test target).
However, not much is enabled by default. Add the following to your Cargo.toml
:
[lints.clippy]
all = "warn"
pedantic = "warn"
nursery = "warn"
This will enable “all” lints, “pedantic” lints, and “nursery” lints, which will provide a lot more help for a young (at heart) Rustacean.
Some lints can be fixed automatically with cargo clippy --fix
.
Step 7: Adding dependencies#
One key feature of Rust is how easy it is to add dependencies. This is usually
something that is very challenging for older languages. Because of how easy it
is, Rust doesn’t encourage a massive standard library; things like regex are
relegated to dependencies. Even the Rust AST parser is a dependency. Let’s try
adding a fun dependency: strum
, a
string enum package. We want to use it’s “derive” feature too, so we’ll request
that.
You can add it with a command:
$ cargo add strum --features derive
Updating crates.io index
Adding strum v0.26.1 to dependencies.
Features:
+ std
+ derive
+ strum_macros
- phf
Updating crates.io index
Though, to be fair, all it did is add:
[dependencies]
strum = {version="0.26.1", features=["derive"]}
which you could have done yourself. In fact, I’d recommend loosening the pin;
cargo has a lock file (Cargo.lock
) already. Building or running cargo update
will update the local files from your Cargo.toml
/Cargo.lock
.
Aside: features
Rust allows dependencies to provide optional components, called “features”, with additional dependencies and compiled parts. Unlike Python, features can also be enabled by default. Many packages have a default enabled “std” feature that you can disable to not require Rust’s standard library, for use in embedded systems without memory management and other OS-level things.
Now, let’s use our new dependency:
use std::string::ToString;
#[derive(Debug, PartialEq, strum::EnumString, strum::Display)]
enum Card {
World,
Planet,
}
fn main() {
let world = Card::World.to_string();
println!("Hello, {world}!");
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_world() {
let world: Card = "World".try_into().unwrap();
assert_eq!(Card::World, world);
}
}
Here, we bring the ToString
trait into scope so that we can use it (more on
traits later). Then, we make a new enum. We add derive
attributes, which
inject code for us. We derive:
Debug
, which is like__repr__
in Python and is used in theassert_eq!
to print out the enum if the comparison failsPartialEq
, which allows us to compare (inassert_eq!
)strum::EnumString
, which allows us to convert strings into our enumstrum::Display
, which allows us to convert our enum into strings
Then we make our enum, which has two variants, World
and Planet
. Strum will
use these as the conversions to and from strings, though we could customize this
with the #[strum(...)]
attribute if we wanted to.
Then we use conversion to a string in main
, and from a string in the tests.
Try cargo r
(short for cargo run
) and cargo t
(short for cargo test
) and
make sure it’s working!
Notice that Cargo downloaded and compiled your dependencies for you. Also notice
that you now have a Cargo.lock
with the exact known working versions of
everything.