I was watching the interview
- Check out the issue with async operations and async cancellation.
  - He mentioned something about going to Nicholas Matsakis's blog.
In Vim, ~ toggles the case of the character under the cursor.
Vim has text-object modifiers such as i (inner) and a (around). If the cursor is anywhere inside (), you can use "d+i+b" or "d+i+(" to delete the content inside the parentheses.
1 - Porting FlameGraph to Rust
When running an application on Linux, you can put one of these commands in front of your application's command line to understand where the CPU actually spends its time:
- perf stat
- perf record -g
- After the command finishes, you can run perf report, which reads the recorded data for you. However, it is not very useful on its own!
- perf record -g --call-graph dwarf
What is better for this? Use FlameGraph! :)
On macOS you can use DTrace instead of perf:
First, run your application under DTrace:
sudo dtrace -n 'profile-997 /pid == $target/ { @[ustack()] = count(); }' -c "./target/release/your_app" -o out.stacks
Then you need to run flamegraph:
./Flamegraph/stackcollapse.pl out.stacks > out.folded
./Flamegraph/flamegraph.pl out.folded > flamegraph.svg
Hardware and software :D :
- Look at Firefox Multi-Account Containers.
- Look at the j command.
Question: What is the difference between using as usize and usize::from?
fn new(rows: u16, columns: u16) -> Self {
    Self { cells: vec![b' '; usize::from(rows)] }
}
In the code above, suppose we wrote rows as usize instead of usize::from(rows). If the parameter type later changed to u128, the as version would still compile without any complaint and silently truncate large values. usize::from would refuse to compile, because the conversion is no longer lossless.
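A minimal sketch of the difference. The helper names cell_count and truncate_to_u8 are made up for illustration and are not part of the original code.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: widen a row count losslessly.
/// This compiles only because every u16 value fits in a usize.
fn cell_count(rows: u16) -> usize {
    usize::from(rows)
}

/// Hypothetical helper: `as` always compiles, even when the value does not fit.
fn truncate_to_u8(value: u32) -> u8 {
    value as u8
}

fn main() {
    assert_eq!(cell_count(80), 80);
    // 300 does not fit in a u8; `as` keeps only the low 8 bits (300 mod 256).
    assert_eq!(truncate_to_u8(300), 44);
    // A fallible conversion reports the problem instead of truncating.
    assert!(u8::try_from(300u32).is_err());
}
```

When a conversion can lose information and you want to know about it, TryFrom is usually the safer choice than as.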
Programming Rust
Chapter 1: Programming Rust
The author recommends doing a side project while reading the book!
Look at Chris Biscardi's videos on YouTube.
Traits are:
- Like interfaces in Java or abstract base classes in C++
- The main way Rust supports integrating your types into the language itself
You can find the GitHub page of the book here
What does this code do in C?
int main(int argc, char **argv) {
unsigned long a[1];
a[3] = 0x7ffff;
return 0;
}
This code has undefined behavior in C: it writes to a[3], which is past the end of the one-element array a.
What is undefined behavior?
- Behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.
- Undefined behavior doesn't just have an unpredictable result: the standard explicitly permits the program to do anything at all.
Note: Any software that might handle data from an untrusted source could be the target of an exploit:
- Look at the TrueType font issue in 2010.
Rust's promise: If your program passes the compiler's checks, it is free of undefined behavior!
Concurrency: Rust lets you share data between threads easily, AS LONG AS it is NOT changing! Data that does change can only be accessed using synchronization primitives.
What is Read-Copy-Update
In the realm of concurrent programming, ensuring that multiple threads can access shared data without causing corruption is a fundamental challenge. Traditional locking mechanisms, while effective, can introduce performance bottlenecks. The Read-Copy-Update (RCU) mechanism offers an elegant, lock-free approach to synchronization, particularly optimized for read-heavy workloads. It allows readers to access data without waiting for locks, while writers can modify data without blocking those readers.
Here's a breakdown of how this ingenious mechanism works:
The Core Idea: Out with the Old, In with the New
At its heart, RCU operates on a simple yet powerful principle: instead of modifying data in-place, which would require locking to prevent readers from seeing inconsistent intermediate states, an updater creates a copy of the data, modifies the copy, and then atomically updates a global pointer to point to this new version.
This "copy-on-write" strategy ensures that readers, who are concurrently accessing the data, will always see a consistent snapshot of the data—either the old version or the new one, but never a partially modified state.
The Three Phases: Read, Copy, and Update in Action
The name "Read-Copy-Update" neatly encapsulates the three key phases of its operation. Let's delve into each:
- Read Phase: Unfettered Access
Readers in an RCU-protected environment can access the shared data without acquiring any locks. They simply enter a read-side critical section, which is a lightweight mechanism to inform the system that they are actively reading. This is a key advantage of RCU as it makes read operations incredibly fast and scalable.
Inside this critical section, a reader obtains a pointer to the current version of the data. Because the original data is never modified directly, the reader is guaranteed to have a consistent view of it for the duration of its read operation.
- Copy and Update Phase: A Graceful Transition
When a thread needs to modify the shared data, it performs the following steps:
- Create a Copy: The updater first creates a new copy of the data structure it intends to modify.
- Modify the Copy: All modifications are then made to this new, private copy. Since no other thread is aware of this copy yet, no locks are needed.
- Publish the Changes: Once the modifications are complete, the updater atomically changes the global pointer to point to the new, updated copy of the data. Modern processors guarantee that this pointer update is an atomic operation, meaning it's seen as a single, indivisible step by all other threads.
From this point on, any new readers entering a read-side critical section will see the new version of the data.
The Crucial Fourth Step: The Grace Period and Reclamation
A critical question arises: what happens to the old version of the data? It cannot be deallocated immediately because existing readers might still be using it. This is where the concept of a grace period comes into play.
A grace period is a waiting period that an updater must observe before it can safely reclaim the memory of the old data. This period is defined as the time it takes for all readers that were active at the time of the update to complete their read-side critical sections.
Here’s how it works:
- Waiting for Pre-existing Readers: After publishing the new version, the updater waits for a grace period to elapse.
- Quiescent State: The system keeps track of which threads are in a read-side critical section. When a thread exits its critical section, it is said to be in a quiescent state.
- Reclamation: Once all threads that were active during the update have passed through a quiescent state, the grace period is over. The updater now knows that no reader is holding a reference to the old data, and it can be safely deallocated.
It's important to note that the grace period mechanism doesn't need to wait for readers that started after the update was published, as those readers will only ever see the new version of the data.
In essence, RCU splits the update process into two key stages:
- Removal: The atomic pointer update that makes the old data unreachable for new readers.
- Reclamation: The deallocation of the old data after a grace period.
This separation is what allows RCU to achieve its lock-free nature for readers, providing a highly efficient synchronization mechanism for read-dominant scenarios.
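The copy, modify, publish cycle described above can be approximated in safe Rust. This is only a sketch under stated assumptions: RcuCell is a hypothetical type, readers briefly take a read lock to grab a snapshot (real RCU readers take no lock at all), and Arc reference counting stands in for the grace period, since the old version is freed when the last reader drops its snapshot.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical RCU-style cell (an approximation, not a real RCU implementation).
struct RcuCell<T> {
    current: RwLock<Arc<T>>,
}

impl<T: Clone> RcuCell<T> {
    fn new(value: T) -> Self {
        RcuCell { current: RwLock::new(Arc::new(value)) }
    }

    /// Read phase: take a snapshot of the current version.
    /// The lock is held only long enough to clone the Arc pointer.
    fn read(&self) -> Arc<T> {
        let guard = self.current.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Copy and update phase: copy the data, modify the private copy,
    /// then publish it by replacing the shared pointer.
    fn update<F: FnOnce(&mut T)>(&self, modify: F) {
        let mut slot = self.current.write().unwrap();
        let mut copy: T = (**slot).clone();
        modify(&mut copy);
        // The old Arc is dropped here; its data is reclaimed once the
        // last reader holding a snapshot drops it.
        *slot = Arc::new(copy);
    }
}

fn main() {
    let cell = RcuCell::new(vec![1, 2, 3]);
    let old_snapshot = cell.read();
    cell.update(|v| v.push(4));
    // A reader that started before the update still sees the old version.
    assert_eq!(*old_snapshot, vec![1, 2, 3]);
    // New readers see the new version.
    assert_eq!(*cell.read(), vec![1, 2, 3, 4]);
}
```

The write lock also serializes updaters, which keeps the sketch simple at the cost of briefly blocking readers during the copy.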
You can update Rust using:
rustup update
You can tell Cargo to skip creating a Git repository using:
cargo new test_app --vcs none
You can clean the build folder (called target) using:
cargo clean
Note: Four-space indentation is standard Rust style.
The isize and usize types hold pointer-sized signed and unsigned integers: 32 bits long on 32-bit platforms and 64 bits long on 64-bit platforms.
Why isize and usize both?
- Use usize when indexing collections or sizes (because indexes can't be negative).
- Use isize when you might subtract two pointers or offsets (since the result can be negative).
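assert! macro:
It checks that its argument is true and panics if it is not. Unlike C and C++, where assertions can be compiled out, Rust always checks assert! regardless of how the program is built. If you want a check that is skipped in release builds, use debug_assert! instead.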
if statement
if a > b {
println!("A is greater than b");
} else {
println!("A is NOT greater than b");
}
As you can see, it does not require () over the condition, but requires {}.
Rust infers the types of value only in the function body! It always requires the function return type and function parameters' types, explicitly!
In Rust, you usually return from a function with a final expression that is NOT followed by a semicolon, and use the return statement only for explicit early returns from the midst of a function.
#[test]
fn foo() {
    assert_eq!(add_i32(12, 12), 24);
}
You can use cargo test to run all the tests you wrote, which can be scattered throughout your project.
Reading Command-line:
use std::env;
use std::str::FromStr;

fn main() {
    let mut numbers = Vec::new();

    // Skip the first argument, which is the program name.
    for arg in env::args().skip(1) {
        numbers.push(u64::from_str(&arg).expect("error parsing argument"));
    }

    if numbers.len() == 0 {
        eprintln!("Usage: number number");
        std::process::exit(1);
    }
}
trait: A trait is a collection of methods that types can implement.
It is true that a Vec is a dynamic array. However, you still need to mark the variable mut for Rust to let you push numbers onto the end of it.
Rust iterators are as efficient as if you had written the code as a simple loop.
Rust does not have exceptions. Instead, errors are handled using either Result or panic.
When you use a Vec, which is allocated on the heap and whose size is dynamic, Rust is careful to leave the programmer in control of memory consumption, making it clear how long each value lives, while still ensuring memory is freed promptly when it is no longer needed.
For example in this code:
let my_vector: Vec<i32> = vec![1, 2, 3, 4, 5];
for num in &my_vector[1..] {
    println!("{}", *num);
    println!("{}", num); // println! formats through the reference automatically
}
- In this example we are telling Rust that ownership of the vector should remain with my_vector; we are only borrowing its elements for the loop. The & operator borrows a reference to the vector's elements, letting num borrow each element in succession.
- println! takes a template string, substitutes formatted versions of the remaining arguments for the {...} forms as they appear in the template string, and writes the result to the standard output stream.
Unlike C and C++, which require main to return zero if the program finished successfully or a nonzero exit status if something went wrong, Rust assumes that if main returns at all, the program finished successfully. You can explicitly terminate the program with a failure status by panicking (for example through expect) or by calling std::process::exit().
Run this command to open the rust documentation:
rustup doc --std
A Rust package, whether a library or an executable, is called a crate.
If you specify a particular version of a package in your Cargo.toml file, your code will continue to compile even if newer versions of the package are published!
Also note that crates can have optional features: parts of the interface or implementation that not all users need, but that nonetheless make sense to include in that crate. For example, the serde crate offers ways to handle data from the web, but they are available only if we select the crate's derive feature:
[dependencies]
serde = { version = "1.0", features = ["derive"] }
We need to list only those crates that we use directly! Cargo takes care of bringing in whatever other crates those need in turn!
Rust raw string syntax:
#![allow(unused)] fn main() { let my_str = r###"Ahmad Mansouri "ahmad" \n "### }
It is like the letter r, zero or more hash marks, a double quote, and then the contents of the string, terminated by another double quote followed by the same number of hash marks.
Usually use declarations are added at the beginning of the file. However, this is not strictly necessary.
You can use the format! macro to create your strings:
#![allow(unused)] fn main() { let response = format!("{} + {} = {}", 12 ,13, 25); }
Rust makes sure that a shared data structure cannot be accessed unless you are holding the lock, and it releases the lock automatically when you are done!
If your program compiles in Rust, it is free of data races.
All Rust functions are thread-safe.
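A small sketch of the lock discipline described above, using std::sync::Mutex. The function name count_in_parallel is hypothetical. The counter can only be reached through the guard returned by lock(), and the lock is released when the guard goes out of scope.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: several threads increment one shared counter.
fn count_in_parallel(threads: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();

    for _ in 0..threads {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..increments {
                // The data is only reachable through the guard.
                let mut guard = counter.lock().unwrap();
                *guard += 1;
                // The guard is dropped at the end of each iteration,
                // which releases the lock automatically.
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    assert_eq!(count_in_parallel(4, 1000), 4000);
}
```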
#![allow(unused)] fn main() { enum Option<T> { None, Some(T), } }
Documentation comment: Use /// to mark the comment lines above the function definition. The rustdoc utility knows how to parse them, together with the code they describe, and produce online documentation.
use std::str::FromStr;

/// Parse the string `s` as a pair of values split by `separator`.
fn parse_pair<T: FromStr>(s: &str, separator: char) -> Option<(T, T)> {
    match s.find(separator) {
        None => None,
        Some(index) => {
            // `index + 1` assumes the separator is a single-byte (ASCII) character.
            match (T::from_str(&s[..index]), T::from_str(&s[index + 1..])) {
                (Ok(l), Ok(r)) => Some((l, r)),
                _ => None,
            }
        }
    }
}

#[test]
fn test_parse_pair() {
    assert_eq!(parse_pair::<i32>("", ','), None);
    assert_eq!(parse_pair::<i32>("10,", ','), None);
    assert_eq!(parse_pair::<i32>(",10", ','), None);
    assert_eq!(parse_pair::<i32>("10,20", ','), Some((10, 20)));
    assert_eq!(parse_pair::<i32>("10,20xy", ','), None);
    assert_eq!(parse_pair::<f64>("0.5x", 'x'), None);
    assert_eq!(parse_pair::<f64>("0.5x1.5", 'x'), Some((0.5, 1.5)));
}
A Rust programmer would call T a type parameter of parse_pair.
When we use a generic function, Rust is often able to infer the type parameters for us, so we don't need to write them out as we did in the test code.
If you look at the inner match, it resolves to the (Ok(l), Ok(r)) arm only if both sides evaluate to Ok.
#![allow(unused)] fn main() { _ => None }
_ is a wildcard pattern that matches anything and ignores its value.
If you want to take the value:
let x = 42;
match x {
    0 => println!("zero"),
    val => println!("something else: {val}"), // captures the value
}
It's common to initialize a struct's fields with variables of the same name, so rather than forcing you to write something like
MyClass { re: re, im: im }
Rust lets you simply write
MyClass { re, im }
You can access the first element of tuple variable t by t.0.
Rust, generally does not intend to convert between numeric types implicitly! Therefore, we need to write out the conversion we need:
#![allow(unused)] fn main() { t.0 as f64 }
Important: Fallible functions in Rust should return a Result, which is either Ok(result) on success or Err(e) on failure, where e is an error value.
#![allow(unused)] fn main() { let output = match File::create(filename) { Ok(f) => f, Err(e) => { return Err(e); } }; }
This kind of match statement is such a common pattern in Rust that the language provides the ? operator as shorthand for the whole thing.
#![allow(unused)] fn main() { let output = File::create(filename)?; }
If File::create fails, the ? operator returns from write_image, passing along the error. Otherwise, output holds the successfully opened File.
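A self-contained sketch of the same pattern without any file I/O. The function add_strs is a made-up example. Each ? either unwraps the Ok value or returns the Err to the caller immediately.

```rust
use std::num::ParseIntError;

/// Hypothetical example: parse two strings and add them.
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    // If either parse fails, `?` returns that error from add_strs.
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}

fn main() {
    assert_eq!(add_strs("2", " 40 "), Ok(42));
    assert!(add_strs("2", "forty").is_err());
}
```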
The crossbeam crate provides a number of valuable concurrency facilities, including a scoped thread facility.
Unlike functions declared with fn, we don't need to declare the types of a closure's arguments: Rust will infer them, along with its return type.
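A sketch combining both ideas: closures with inferred types, run on scoped threads. Note the assumption: instead of the crossbeam crate mentioned above, this uses std::thread::scope from the standard library (available since Rust 1.63), so it needs no dependencies. The function parallel_sum is hypothetical.

```rust
use std::thread;

/// Hypothetical helper: sum a slice using up to `parts` scoped threads.
fn parallel_sum(data: &[u64], parts: usize) -> u64 {
    let parts = parts.max(1);
    // Round up so that every element lands in some chunk; never zero.
    let chunk_len = ((data.len() + parts - 1) / parts).max(1);

    thread::scope(|s| {
        // Spawn one thread per chunk. The closures borrow `data` directly,
        // which is allowed because the scope joins all threads before returning.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().sum::<u64>()))
            .collect();

        handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
    })
}

fn main() {
    let numbers: Vec<u64> = (1..=10).collect();
    assert_eq!(parallel_sum(&numbers, 3), 55);
}
```

None of the closures declares argument or return types; Rust infers them from how the closures are used.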
Use bat command line tool which replaces cat.
#[derive(Debug)]
struct Arguments {
    target: String,
    replacement: String,
    filename: String,
    output: String,
}
#[derive(Debug)] attribute tells the compiler to generate some extra code that allows us to format the Arguments struct with {:?} in println!.
Use text-colorizer for creating colourful output in the terminal.
let data = match fs::read_to_string(&args.filename) {
    Ok(v) => v,
    Err(e) => {
        eprintln!("{} failed to read from file '{}': {:?}",
                  "Error:".red().bold(), args.filename, e);
        std::process::exit(1);
    }
};
As you can see in the code, inside a match arm we can print the error to the standard error stream and then exit the application if reading the file fails.
use regex::Regex;

fn replace(target: &str, replacement: &str, text: &str) -> Result<String, regex::Error> {
    let regex = Regex::new(target)?;
    Ok(regex.replace_all(text, replacement).to_string())
}
The function returns a Result. We use ? to short-circuit in case Regex::new fails.
- Rust is, to some extent, designed around its types!
- Rust's memory and thread safety come from the soundness of its type system, and its flexibility stems from its generic types and traits.
- Although Rust doesn't promise it will represent things exactly as you've requested, it takes care to deviate from your request only when it's a reliable improvement.
What does it mean that Rust lets you leave out, or elide, types?
Look at the code below:
fn build_vector() -> Vec<i16> {
    let mut v: Vec<i16> = Vec::<i16>::new();
    v.push(10i16);
    v.push(20i16);
    v
}
The issue is that the code is cluttered and repetitive. Since the function's return type is Vec<i16>, v should be a Vec<i16>. From that, it follows that each element of the vector should be an i16. This is exactly the type of reasoning Rust's type inference applies, allowing the programmer to instead write:
fn build_vector() -> Vec<i16> {
    let mut v = Vec::new();
    v.push(10);
    v.push(20);
    v
}
Note:
- Rust will generate the same machine code either way!
- Type inference gives back much of the legibility of dynamically typed languages, while still catching type errors at compile time!
Note:
- Rust's generic functions give the language a degree of the same flexibility as duck typing in Python, while still catching all type errors at compile time.
Important: Generic functions are just as efficient as their nongeneric counterparts:
- There is no inherent performance advantage to be had from writing a specific sum function for each integer type over writing a generic one that handles all integers.
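A sketch of such a generic sum function. The name sum_all and its trait bounds are illustrative assumptions; the compiler generates a specialized copy for each element type it is used with, which is why there is no run-time cost compared with a hand-written version.

```rust
use std::ops::Add;

/// Hypothetical generic sum: works for any type that can be added,
/// copied, and has a default (zero) value.
fn sum_all<T>(values: &[T]) -> T
where
    T: Add<Output = T> + Default + Copy,
{
    values.iter().fold(T::default(), |acc, &v| acc + v)
}

fn main() {
    assert_eq!(sum_all(&[1u8, 2, 3]), 6);
    assert_eq!(sum_all(&[10i64, -4]), 6);
    assert_eq!(sum_all(&[1.5f64, 2.5]), 4.0);
}
```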
| Type | Description | Example |
|---|---|---|
| i8, i16, i32, i64, i128, u8, u16, u32, u64, u128 | Signed and unsigned integers of the given bit width | -5i8, 0x400u16, 0o100i16, 20_999u64, b'*' (u8 byte literal) |
| isize, usize | Signed and unsigned integers, the same size as an address on the machine (32 or 64 bits) | -0b0101_0010isize |
| char | Unicode character, 32 bits wide | '*', '\x7f', '\u{CA0}' |
| (char, u8, i32) | Tuple: mixed types allowed | ('%', 0x7f, -1) |
| () | "Unit" (empty tuple) | () |
| struct S { x: f32, y: f32 } | Named-field struct | S { x: 120.0, y: 209.0 } |
| struct T (i32, char) | Tuple-like struct | T(120, 'X') |
| struct E; | Unit-like struct; has no fields | E |
| enum Attend { OnTime, Late(u32) } | Enumeration | Attend::Late(5), Attend::OnTime |
| Box<Attend> | Box: owning pointer to a value on the heap | Box::new(Late(15)) |
| &i32, &mut i32 | Shared and mutable references: non-owning pointers that must not outlive their referent | |
| String | UTF-8 string, dynamically sized | "Hello".to_string() |
| &str | Reference to str: non-owning pointer to UTF-8 text | "Ahmad", &s[0..12] |
| &[u8], &mut [u8] | Reference to slice: reference to a portion of an array or vector, comprising pointer and length | &v[10..20] |
| &dyn Any, &mut dyn Read | Trait object: reference to any value that implements a given set of methods | value as &dyn Any, &mut file as &mut dyn Read |
| fn(&str) -> bool | Pointer to function | str::is_empty |
What is the difference between Unicode and UTF-8?
🔤 Unicode
Unicode is a character set. It defines a universal standard for what characters exist and assigns each one a unique code point.
Examples:
- 'A' → U+0041
- '€' → U+20AC
- '😀' → U+1F600
Think of Unicode as a big dictionary of all characters from all languages, emojis, symbols, etc.
🧩 UTF-8
UTF-8 stands for Unicode Transformation Format - 8 bit. It is a character encoding — it defines how to store or transmit Unicode characters using bytes.
Examples:
- 'A' → Unicode: U+0041 → UTF-8: 0x41 (1 byte)
- '€' → Unicode: U+20AC → UTF-8: 0xE2 0x82 0xAC (3 bytes)
⚖️ Analogy
| Concept | Analogy |
|---|---|
| Unicode | A list of all books in a library (each with a unique ID) |
| UTF-8 | How the books are stored on shelves (in binary form) |
✅ Summary Table
| Feature | Unicode | UTF-8 |
|---|---|---|
| What it is | Character set (abstract) | Encoding format (concrete) |
| Defines | Code points | Byte sequences |
| Scope | Universal | One of many encodings for Unicode |
| Byte size | N/A | Variable-length (1–4 bytes) |
| Compatibility | N/A | Backward compatible with ASCII |
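The distinction can be checked directly in Rust: a char holds a code point, and encode_utf8 produces the bytes that UTF-8 uses for it. The helper utf8_bytes is a made-up name for this sketch.

```rust
/// Hypothetical helper: return the UTF-8 encoding of one character.
fn utf8_bytes(c: char) -> Vec<u8> {
    let mut buf = [0u8; 4]; // a UTF-8 encoded char needs at most 4 bytes
    c.encode_utf8(&mut buf).as_bytes().to_vec()
}

fn main() {
    // Code points (Unicode): one number per character.
    assert_eq!('A' as u32, 0x41);
    assert_eq!('€' as u32, 0x20AC);

    // Byte sequences (UTF-8): variable length.
    assert_eq!(utf8_bytes('A'), vec![0x41]);
    assert_eq!(utf8_bytes('€'), vec![0xE2, 0x82, 0xAC]);
    assert_eq!(utf8_bytes('😀').len(), 4);
}
```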
Fixed-Width Numeric Types
Fixed-width numeric types can overflow or lose precision, but they are adequate for most applications and can be thousands of times faster than representations like arbitrary-precision integers and exact rationals.
Rust uses the u8 type for byte values, for example when:
- reading data from a binary file
- reading data from a binary socket
Unlike C and C++, Rust treats characters as distinct from the numeric types: a char is not a u8, nor is it a u32 (although it is 32 bits long).
The usize and isize types are analogous to size_t and ptrdiff_t in C and C++.
Rust requires array indices to be usize values.
What happens if an integer literal lacks a type suffix?
Rust will attempt to infer its type from context, looking for a use of the value that pins it down:
- stored in variable of a particular type
- passed to function that expects a particular type
- compared with another value of a particular type
In the end, if multiple types could work Rust defaults to i32 if that is among the possibilities. Otherwise, Rust reports the ambiguity as an error.
Byte literal
Although numeric types and the char type are distinct, Rust does provide byte literals, character-like literals for u8 values: b'X' represents the ASCII code for the character X, as a u8 value. For example, since the ASCII code for A is 65, the literals b'A' and 65u8 are exactly equivalent.
- Only ASCII characters may appear in byte literals.
Table: Characters requiring a stand-in notation
| Character | Byte literal | Numeric equivalent |
|---|---|---|
| Single quote, ' | b'\'' | 39u8 |
| Backslash, \ | b'\\' | 92u8 |
| Newline | b'\n' | 10u8 |
| Carriage return | b'\r' | 13u8 |
| Tab | b'\t' | 9u8 |
For characters that are hard to write or read, you can write their code in hexadecimal instead. A byte literal of the form b'\xHH', where HH is any two-digit hexadecimal number, represents the byte whose value is HH. For example, you can write a byte literal for the ASCII "escape" control character as b'\x1b', since the ASCII code for "escape" is 27, or 1B in hexadecimal. Since byte literals are just another notation for u8 values, consider whether a simple numeric literal might be more legible.
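A short sketch that checks these equivalences. The function is_ansi_escape_start is a hypothetical example of where the escape byte is useful in practice.

```rust
/// Hypothetical helper: does this byte string begin with the ASCII escape byte?
fn is_ansi_escape_start(bytes: &[u8]) -> bool {
    bytes.first() == Some(&b'\x1b')
}

fn main() {
    // Byte literals are just u8 values.
    assert_eq!(b'A', 65u8);
    assert_eq!(b'\'', 39u8);
    assert_eq!(b'\\', 92u8);
    assert_eq!(b'\n', 10u8);
    assert_eq!(b'\x1b', 27u8);

    assert!(is_ansi_escape_start(b"\x1b[0m"));
    assert!(!is_ansi_escape_start(b"plain text"));
}
```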
Integer Conversion
You can convert from one integer type to another using the as operator.
// Conversions that are out of range for the destination
// produce values that are equivalent to the original modulo 2^N,
// where N is the width of the destination in bits. This is sometimes
// called "truncation".
assert_eq!(1000_i16 as u8, 232_u8);
assert_eq!(65535_u32 as i16, -1_i16);
assert_eq!(-1_i8 as u8, 255_u8);
assert_eq!(255_u8 as i8, -1_i8);
The standard library provides some operations as methods on integers:
#![allow(unused)] fn main() { assert_eq!(2_u16.pow(4), 16); assert_eq!((-4_i32).abs(), 4); assert_eq!(0b101101_u8.count_ones(), 4); }
Question: Does the code below compile?
#![allow(unused)] fn main() { println!("{}", (-4).abs()); }
It would not compile! Rust wants to know exactly which integer type a value has before it will call the type's own methods. The default of i32 applies only if the type is still ambiguous after all method calls have been resolved, so that's too late to help here.
What is the solution then?
#![allow(unused)] fn main() { println!("{}", (-4_i32).abs()); println!("{}", i32::abs(-4)); }
Question: When an integer arithmetic operation overflows, what happens?
- In a debug build, Rust will panic!
- In a release build, the operation wraps around: it produces the value equivalent to the mathematically correct result modulo the range of the value.
Enhanced Integer Operations
When the default behavior isn't what you need, the integer types provide methods that let you spell out exactly what you want.
Checked Integer Operations
They return an Option of the result: Some(v) if the mathematically correct result can be represented as a value of that type, or None if it cannot.
assert_eq!(10_u8.checked_add(20), Some(30));
assert_eq!(100_u8.checked_add(200), None);

// Do the addition; panic if it overflows.
let (x, y) = (10_u8, 20_u8); // example operands
let sum = x.checked_add(y).unwrap();
Wrapping Integer Operations
Return the value equivalent to the mathematically correct result modulo the range of the value:
assert_eq!(500_u16.wrapping_mul(500), 53392);
assert_eq!(500_i16.wrapping_mul(500), -12144);

// In bitwise shift operations, the shift distance
// is wrapped to fall within the size of the value.
// So a shift of 17 bits in a 16-bit type is a shift of 1.
assert_eq!(5_i16.wrapping_shl(17), 10);
NOTE: The advantage of these methods is that they behave the same way in all builds.
Saturating Integer Operations
Return the representable value that is closest to the mathematically correct result (the result will be clamped).
#![allow(unused)] fn main() { assert_eq!(32760_i16.saturating_add(10), 32767); assert_eq!((-32760_i16).saturating_sub(10), -32768); }
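NOTE: There are no saturating remainder or bitwise shift methods. (Newer Rust releases do provide saturating_div, which differs from ordinary division only for the MIN / -1 case of signed types.)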
Overflowing Integer Operations
Return a tuple (result, overflowed), where result is what the wrapping version of the function would return and overflowed is a bool indicating whether an overflow occurred:
#![allow(unused)] fn main() { assert_eq!(255_u8.overflowing_sub(2), (253, false)); assert_eq!(255_u8.overflowing_add(2), (1, true)); }
overflowing_shl and overflowing_shr deviate from the pattern a bit: they return true for overflowed only if the shift distance was as large as or larger than the bit width of the type itself:
#![allow(unused)] fn main() { assert_eq!(5_u16.overflowing_shl(17), (10, true)); }
Floating-Point
Every part of a floating-point number after the integer part is optional, but at least one of the fractional part, exponent, or type suffix must be present, to distinguish it from an integer literal. The fractional part may consist of a lone decimal point, so 5. is a valid floating-point constant.
The types f32 and f64 have associated constants for the IEEE-required special values like INFINITY, NEG_INFINITY, NAN and MIN and MAX.
assert!((-1. / f32::INFINITY).is_sign_negative());
assert_eq!(-f32::MIN, f32::MAX);
assert_eq!(5f32.sqrt() * 5f32.sqrt(), 5.); // Exactly 5.0, per IEEE
Why is this true in Rust?
This is a great example of how floating-point arithmetic and rounding behave in Rust (and IEEE-754 floats in general).
You might expect floating-point rounding errors to make this false (e.g. 4.9999995 ≠ 5.0),
but it actually passes. Let’s see why.
🧠 Step 1: What Happens Numerically
- 5f32.sqrt() computes the square root of 5 as an f32. In decimal, that is approximately √5 ≈ 2.236068.
- Multiplying it by itself: 2.236068 * 2.236068 ≈ 5.0000001
That is slightly off in exact arithmetic, but not by enough to survive rounding to f32.
⚙️ Step 2: Floating-Point Precision in f32
f32 (32-bit float) has about 7 significant decimal digits of precision.
When you multiply two f32 values, the result is rounded to the nearest representable 32-bit float.
That rounding can make the result exactly 5.0, even if the true mathematical value was slightly different.
Example (conceptually):
sqrt(5f32) ≈ 2.23606801033 (stored as a 32-bit float)
2.23606801033 * 2.23606801033 ≈ 5.00000015 (exact product, before rounding)
The two nearest representable f32 values are 5.0 and 5.000000476837158. Because 5.00000015 is closer to 5.0,
it gets rounded to exactly 5.0 in f32 precision.
So:
#![allow(unused)] fn main() { 5f32.sqrt() * 5f32.sqrt() == 5.0f32 }
🧪 Step 3: Verifying in Code
fn main() {
    let x = 5f32.sqrt();
    println!("{:?}", x * x);             // prints "5.0"
    println!("{:?}", (x * x).to_bits()); // prints "1084227584"
    println!("{:?}", 5f32.to_bits());    // prints "1084227584"
}
Both values have the exact same bit pattern (0x40A00000),
so they are bitwise identical — the assert_eq! passes.
⚠️ Step 4: But Not Always True
For other numbers, rounding can go the other way.
For example:
assert_ne!(2f32.sqrt() * 2f32.sqrt(), 2.0);
This assertion passes, because the product rounds to a value slightly below 2.0.
✅ Summary
| Expression | Result | Equal to the input? | Why |
|---|---|---|---|
| 5f32.sqrt() * 5f32.sqrt() | 5.0 | ✅ Yes | rounds exactly to 5.0 |
| 2f32.sqrt() * 2f32.sqrt() | 1.9999999 | ❌ No | rounds to the float just below 2.0 |
💡 In Practice
Always use approximate comparison for floats:
#![allow(unused)] fn main() { assert!((a - b).abs() < 1e-6); }
Only use assert_eq! when you know the result is exactly representable —
as in this rare case with 5f32.
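A minimal sketch of such an approximate comparison. The helper approx_eq and the chosen tolerance are assumptions for illustration; an absolute tolerance like this suits values near 1 and needs scaling for very large or very small magnitudes.

```rust
/// Hypothetical helper: compare two floats within an absolute tolerance.
fn approx_eq(a: f64, b: f64, tolerance: f64) -> bool {
    (a - b).abs() <= tolerance
}

fn main() {
    // The classic case: exact equality fails for f64...
    assert!(0.1 + 0.2 != 0.3);
    // ...but the values agree within a small tolerance.
    assert!(approx_eq(0.1 + 0.2, 0.3, 1e-9));
    assert!(!approx_eq(1.0, 1.1, 1e-9));
}
```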
The std::f32::consts and std::f64::consts modules provide various commonly used mathematical constants like E, PI and the square root of two.
Note: Implicit integer conversions have a well-established record of causing bugs and security holes, especially when the integer in question represents the size of something in memory and an unanticipated overflow occurs.
bool
Rust is very strict: control structures like if and while require their conditions to be bool expressions, as do the short-circuiting logical operators && and ||
Rust's as operator can convert bool values to integer types:
#![allow(unused)] fn main() { assert_eq!(false as i32, 0); assert_eq!(true as i32, 1); }
Note: Rust won't convert a numeric type to bool.
Characters
Rust's character type char represents a single Unicode character, as a 32-bit value.
Important:
- Rust uses the char type for single characters in isolation.
- However, it uses the UTF-8 encoding for strings and streams of text. Therefore, a String represents its text as a sequence of UTF-8 bytes, not as an array of characters.
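One visible consequence is that the length of a string in bytes can differ from its number of characters. The helper byte_and_char_counts is a made-up name for this sketch.

```rust
/// Hypothetical helper: (length in UTF-8 bytes, number of chars).
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // Pure ASCII: one byte per character.
    assert_eq!(byte_and_char_counts("abc"), (3, 3));
    // 'é' takes two bytes in UTF-8.
    assert_eq!(byte_and_char_counts("héllo"), (6, 5));
    // 'ج' takes two bytes as well.
    assert_eq!(byte_and_char_counts("ج"), (2, 1));
}
```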
You can use the '\xHH' form to write any character in the ASCII character set (U+0000 to U+007F). Also, you can write any Unicode character as '\u{HHHHHH}', where HHHHHH is a hexadecimal number up to six digits long, with underscores allowed for grouping as usual.
Rust never implicitly converts between char and any other type.
char to integer type conversion:
You can use the as operator to convert a char to an integer type; for types smaller than 32 bits, the upper bits of the character's value are truncated:
#![allow(unused)] fn main() { assert_eq!('*' as i32, 42); }
conversion to char:
u8 is the only type the as operator will convert to char. Why?
- Rust intends the as operator to perform only cheap, infallible conversions.
- Every integer type other than u8 includes values that are not permitted Unicode code points.
- As a result, those conversions would require run-time checks.
How to convert to char then?
The standard library function std::char::from_u32 takes any u32 value and returns an Option<char>. If the u32 is not a permitted Unicode code point, then from_u32 returns None; Otherwise it returns Some(c), where c is the char result.
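A brief sketch of both routes to a char. The wrapper code_point_to_char is a hypothetical name; it simply forwards to std::char::from_u32.

```rust
/// Hypothetical wrapper around the checked conversion.
fn code_point_to_char(code: u32) -> Option<char> {
    std::char::from_u32(code)
}

fn main() {
    // u8 converts with `as`, because every u8 is a valid code point.
    assert_eq!(42u8 as char, '*');

    // Wider integers go through the checked conversion.
    assert_eq!(code_point_to_char(0x2A), Some('*'));
    assert_eq!(code_point_to_char(0x20AC), Some('€'));
    // Surrogate code points and values above U+10FFFF are rejected.
    assert_eq!(code_point_to_char(0xD800), None);
    assert_eq!(code_point_to_char(0x110000), None);
}
```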
Some more useful methods provided by standard library:
assert_eq!('*'.is_alphabetic(), false);
assert_eq!('β'.is_alphabetic(), true);
assert_eq!('8'.to_digit(10), Some(8));
assert_eq!('ج'.len_utf8(), 2);
assert_eq!(std::char::from_digit(2, 10), Some('2'));
Tuples
Tuples allow only constants as indices, like t.4. You can't write t.i or t[i] to get the ith element.
Rust code often uses tuple types to return multiple values from a function.
#![allow(unused)] fn main() { fn split_at(&self, mid: usize) -> (&str, &str); }
One commonly used tuple type is the zero-tuple (). This is traditionally called the unit type because it has only one value, also written (). Rust uses the unit type where there's no meaningful value to carry, but context requires some sort of type.
| Situation | Why () is used |
|---|---|
| Function returns nothing | Return type must exist |
| Expression statements | An expression followed by ; discards its value, so a block ending in one has type () |
| Closures with no return | They still have a return type |
| Match arms/branches | Must have same type |
| Placeholder in generics | Type needed, no value to carry |
Rust consistently permits an extra trailing comma everywhere commas are used!
There are even tuples that contain a single value. The literal ("lonely hearts",) is a tuple containing a single string; its type is (&str,). Here the comma after the value is necessary to distinguish the singleton tuple from a simple parenthetic expression.
fn main() {
    let a = ("hello");   // &str; the same as: let a = "hello";
    let b = ("hello",);  // (&str,)
    println!("{:?}", a); // prints: "hello"
    println!("{:?}", b); // prints: ("hello",)
}
Pointer Types
At runtime, a reference to an i32 is a single machine word holding the address of the i32 which may be on the stack or in the heap.
The expression &x produces a reference to x; in Rust terminology, we say that it borrows a reference to x.
Given a reference r, the expression *r refers to the value r points to.
A reference does not automatically free any resources when it goes out of scope! Instead, the data is dropped when its owner goes out of scope.
&T
This is an immutable shared reference. You can have many shared references to a given value at a time, but they are read-only. Modifying the value they point to is forbidden, as with const T* in C.
&mut T
A mutable and exclusive reference.
Boxes
The simplest way to allocate a value in the heap is to use Box::new
#![allow(unused)] fn main() { let t = (12, "eggs"); let b = Box::new(t); }
In this code, when b goes out of scope, the memory is freed immediately, unless b has been moved - by returning it, for example.
Raw pointers
Rust has the raw pointer types *mut T and *const T. Using a raw pointer is unsafe, because Rust makes no effort to track what it points to. You may only dereference a raw pointer within an unsafe block. An unsafe block is Rust's opt-in mechanism for advanced language features whose safety is up to you. If your code has no unsafe blocks (or if those it does have are written correctly), then the safety guarantees we emphasize throughout this book still hold.
Arrays, Vectors, and Slices
The types &[T] and &mut [T] are called a shared slice of Ts and a mutable slice of Ts. A mutable slice lets you read and modify elements, but can't be shared; a shared slice lets you share access among several readers, but doesn't let you modify elements.
Given a value v of any of these types, the expression v.len() gives the number of elements in v.
Rust has no notation for an uninitialized array.
Note: The useful methods you'd like to see on arrays are all provided as methods on slices, not arrays. But since Rust implicitly converts a reference to an array to a slice when searching for methods, you can call any slice method on an array directly:
#![allow(unused)] fn main() { let mut chaos = [3,5,4,1,2]; chaos.sort(); assert_eq!(chaos, [1,2,3,4,5]); }
The sort method is actually defined on slices, but since it takes its operand by reference, Rust implicitly produces a &mut [i32] slice referring to the entire array and passes that to the sort method.
Vectors
#![allow(unused)] fn main() { let mut primes = vec![2,3,5,7]; assert_eq!(primes.iter().product::<i32>(), 210); }
Question: Look at the code below. What happens if rows * cols cannot fit into a usize? How can we make it more resilient?
#![allow(unused)] fn main() { fn new_pixel_buffer(rows: usize, cols: usize) -> Vec<u8> { vec![0; rows * cols] } }
- In debug builds, Rust panics when integer arithmetic overflows.
- In release builds, integer arithmetic wraps by default, meaning the value silently wraps around (modulo 2^64 or 2^32, depending on the width of usize). This leads to allocating far fewer bytes than expected, which can cause out-of-bounds accesses (panics in safe code) or logic bugs later.
#![allow(unused)] fn main() { fn new_pixel_buffer(rows: usize, cols: usize) -> Option<Vec<u8>> { rows.checked_mul(cols).map(|size| vec![0; size]) } }
One way of making vectors is from values produced by an iterator:
#![allow(unused)] fn main() { let v: Vec<i32> = (0..5).collect(); assert_eq!(v, [0,1,2,3,4]); }
If you know the number of elements a vector will need in advance, instead of Vec::new() you can call Vec::with_capacity to create a vector with a buffer large enough to hold them all, right from the start.
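A minimal sketch of reserving the buffer up front. The function name squares is only an illustration; Vec::with_capacity and capacity() are the standard library calls. The documented guarantee is that the capacity is at least the requested amount, not exactly equal to it.

```rust
/// Build the first `n` square numbers, reserving the buffer up front
/// so that none of the pushes has to reallocate (hypothetical helper).
fn squares(n: usize) -> Vec<u64> {
    let mut v = Vec::with_capacity(n);
    for i in 0..n as u64 {
        v.push(i * i);
    }
    v
}
```

The length of the result is still n; the capacity only describes how much room the buffer has before it must grow.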
You can insert and remove elements wherever you like in a vector, although these operations shift all the elements after the affected position forward or backward, so they may be slow if the vector is large:
#![allow(unused)] fn main() { let mut v = vec![1,2,3]; v.insert(3,35); v.remove(1); }
You can use the pop method to remove the last element and return it. Note that this method returns an Option<T>.
Despite its fundamental role, Vec is an ordinary type defined in Rust, not built into the language.
Example: Read argument values:
#![allow(unused)] fn main() { let languages: Vec<String> = std::env::args().collect(); for l in languages { println!("{l:?}"); } }
Slices
A slice, written [T] without specifying the length, is a region of an array or vector.
- Since a slice can be any length, slices can't be stored directly in variables or passed as function arguments (you cannot have a variable of type [T], but you can have one of type &[T]).
#![allow(unused)] fn main() { let noodle = "noodle".to_string(); let oodle = noodle[1..]; // this is an error }
- Slices are always passed by reference.
- A reference to a slice is a fat pointer: a two-word value comprising a pointer to the slice's first element and the number of elements in the slice.
While an ordinary reference is a non-owning pointer to a single value, a reference to a slice is a non-owning pointer to a range of consecutive values in memory. This makes slice references a good choice when you want to write a function that operates on either an array or a vector.
#![allow(unused)] fn main() { fn print(n: &[f64]) { for elt in n { println!("{}", elt); } } let a = [12., 13., 14.]; let v = vec![12., 13., 14.]; print(&a); // works on arrays print(&v); // works on vectors }
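To make the "fat pointer" point concrete, here is a sketch that measures reference sizes in machine words. The helper names are invented for these notes; std::mem::size_of is the real standard library function.

```rust
use std::mem::size_of;

/// Size of a slice reference measured in machine words.
/// A fat pointer holds an address plus an element count, so this is 2.
fn slice_ref_words() -> usize {
    size_of::<&[f64]>() / size_of::<usize>()
}

/// Size of an ordinary (thin) reference in machine words: 1.
fn thin_ref_words() -> usize {
    size_of::<&f64>() / size_of::<usize>()
}
```

The extra word is the length, which is why v.len() works on a &[T] without any separate bookkeeping.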
String
In string literals, unlike char literals, single quotes don't need a backslash escape, and double quotes do.
Note: If one line of a string ends with a backslash, then the newline character and the leading whitespace on the next line are dropped!
In a few cases, the need to double every backslash in a string is a nuisance (the classic examples are regular expressions and Windows paths). For those cases, Rust offers raw strings. A raw string is tagged with the lowercase r. All backslashes and whitespace characters inside a raw string are included verbatim in the string. No escape sequences are recognized.
#![allow(unused)] fn main() { let default_win_install_path = r"C:\Program Files\Gorillas"; let pattern = Regex::new(r"\d+(\.\d+)*"); }
Note: You can't include a double-quote character in a raw string simply by putting a backslash in front of it, since no escape sequences are recognized. However, there is a cure for that:
#![allow(unused)] fn main() { println!(r###" This raw string started with 'r###"'. Therefore it does not end until we reach a quote mark ('"') followed immediately by three pound signs ('###'): "###); }
Byte Strings
#![allow(unused)] fn main() { let method = b"GET"; assert_eq!(method, &[b'G', b'E', b'T']); }
Byte strings can use all the other string syntax we've shown: they can span multiple lines, use escape sequences, and use backslashes to join lines. Raw byte strings start with br".
Byte strings can't contain arbitrary Unicode characters. They must make do with ASCII and \xHH escape sequences.
Question: Where should we use Byte Strings?
In Rust, you’d use byte strings (b"...") when you want to work with raw bytes instead of UTF-8 &str. This is useful when you’re dealing with binary data, protocols, or non-UTF-8 text.
#![allow(unused)] fn main() { let s = b"hello"; // type is &[u8; 5] }
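- b"..." creates a byte string literal. Its type is a reference to a fixed-size array, &[u8; N], which coerces to the slice type &[u8] wherever one is expected.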
- Normal string "..." is UTF-8 (&str).
- Byte strings cannot contain Unicode characters outside 0–127.
Suppose you’re parsing a binary protocol where the first 4 bytes are a magic number:
#![allow(unused)] fn main() { let packet: &[u8] = &[0xDE, 0xAD, 0xBE, 0xEF, 1, 2, 3]; if packet.starts_with(b"\xDE\xAD\xBE\xEF") { println!("Valid packet header!"); } else { println!("Invalid packet header!"); } }
Strings in Memory
Rust strings are sequences of Unicode characters, but they are not stored in memory as arrays of chars. Instead, they are stored using UTF-8, a variable-width encoding.
A String or &str's .len() method returns its length. The length is measured in bytes, not characters.
#![allow(unused)] fn main() { let a = "احمد".to_string(); let l = a.len(); let c = a.chars().count(); println!("a's length = {l} and its char length = {c}"); }
It is impossible to modify a &str.
When a String variable goes out of scope, the buffer is automatically freed, unless the String was moved.
There are several ways to create Strings:
- The .to_string() method converts a &str to a String. This copies the string.
- The format!() macro works just like println!(), except that it returns a new String instead of writing text to stdout, and it doesn't automatically add a newline at the end.
- Arrays, slices, and vectors of strings have two methods, .concat() and .join(sep), that form a new String from many strings.
#![allow(unused)] fn main() { let bits = vec!["veni", "vidi", "vici"]; assert_eq!(bits.concat(), "venividivici"); assert_eq!(bits.join(", "), "veni, vidi, vici"); }
Using Strings
Strings support the == and != operators. Two strings are equal if they contain the same characters in the same order (regardless of whether they point to the same location in memory):
#![allow(unused)] fn main() { assert!("ONE".to_lowercase() == "one"); }
Note: Given the nature of Unicode, simple char-by-char comparison does not always give the expected answers. For example, the Rust strings "th\u{e9}" and "the\u{301}" are both valid Unicode representations for thé, the French word for tea. Unicode says they should both be displayed and processed in the same way, but Rust treats them as two completely distinct strings.
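The two spellings of thé can be compared directly. This sketch uses only the two literals from the note; the helper measure and the constant names are made up for illustration.

```rust
/// Return (byte length, char count) for a string slice.
fn measure(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

/// Precomposed form: 'é' is the single code point U+00E9.
const PRECOMPOSED: &str = "th\u{e9}";

/// Decomposed form: 'e' followed by the combining acute accent U+0301.
const DECOMPOSED: &str = "the\u{301}";
```

PRECOMPOSED == DECOMPOSED is false: the first is 4 bytes and 3 chars, the second 5 bytes and 4 chars. Comparing them as "the same text" would need Unicode normalization, which the standard library does not provide.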
IMPORTANT: Sometimes a program really needs to be able to deal with strings that are NOT valid Unicode. This usually happens when a Rust program has to interoperate with some other system that doesn't enforce any such rules. For example, in most operating systems it's easy to create a file with a filename that isn't valid Unicode. What should happen when a Rust program comes across this sort of filename?
Rust's solution is to offer a few string-like types for these situations:
- Stick to String and &str for Unicode text.
- When working with filenames, use std::path::PathBuf and &Path instead.
- When working with binary data that isn't UTF-8 encoded at all, use Vec<u8> and &[u8].
- When working with environment variable names and command-line arguments in the native form presented by the operating system, use OsString and &OsStr.
- When interoperating with C libraries that use null-terminated strings, use std::ffi::CString and &CStr.
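As an illustration of crossing from the OS-native types back to Unicode text, here is a sketch built on std::path::Path. The function names are hypothetical; extension(), to_str(), and to_string_lossy() are the standard library methods.

```rust
use std::path::Path;

/// Return a file's extension as UTF-8 text, if it has one and it is valid Unicode.
fn extension_utf8(path: &Path) -> Option<&str> {
    // extension() yields an &OsStr; to_str() is the fallible step to &str.
    path.extension()?.to_str()
}

/// Produce a printable name, replacing any invalid sequences with U+FFFD.
fn display_name(path: &Path) -> String {
    path.to_string_lossy().into_owned()
}
```

The conversion to &str is where the "not valid Unicode" case has to be handled, either by returning None or by accepting a lossy replacement.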
Chapter 4: Ownership and Moves
Rust sits somewhere between "Safety First", where a garbage collector is used, and "Control First", where the program's memory consumption is entirely in the programmer's control.
Note:
- Relying on garbage collection means relinquishing control over exactly when objects get freed to the collector.
- Understanding why memory wasn't freed when you expected can be a challenge.
A bug in a Rust program cannot cause one thread to corrupt another's data, introducing hard-to-reproduce failures in unrelated parts of the system.
Ownership
In C++, a std::string owns its buffer: when the program destroys the string, the string's destructor frees the buffer.
The owner of a value determines the lifetime of what it owns, and everyone else must respect its decisions.
- Every value has a single owner that determines its lifetime.
- When the owner is freed - dropped, in Rust terminology - the owned value is dropped too.
Just as variables own their values, structs own their fields, and tuples, arrays and vectors own their elements.
The way to drop a value in Rust is to remove it from the ownership tree somehow:
- Leaving the scope of variables
- Deleting an element from a vector
- Something of that sort
Very simple types like integers, floating-point numbers, and characters are excused from the ownership rules. They are called Copy types.
For most other types, these operations move the value instead of copying it:
- assigning a value to a variable
- passing it to a function
- returning it from a function
It means that the source relinquishes ownership of the value to the destination and becomes uninitialized.
A quick comparison between C++ and Rust
Let's say you have a list of strings in a variable and then you assign it into another variable.
using namespace std;
vector<string> s = {"udon", "ramen", "soba"};
vector<string> t = s;
vector<string> u = s;
In C++, each assignment makes a deep copy. On the stack you'll have s, t, and u, each pointing to its own copy of the strings on the heap.
let s = vec!["udon".to_string(), "ramen".to_string(), "soba".to_string()];
let t = s;
let u = s; // Error: s has already been moved and is uninitialized.
In Rust, the data on the heap (representing the strings) is moved to t. Now t points to the data on the heap and s is uninitialized. Therefore the third statement is an error!
If you want to end up in the same state as the C++ program, you need to call clone explicitly:
let s = vec!["udon".to_string(), "ramen".to_string(), "soba".to_string()];
let t = s.clone();
let u = s.clone();
Some examples with move and control flow:
let x = vec![1,2,3];
if c {
f(x); // ... ok to move from x here
} else {
g(x); // ... and ok to also move from x here
}
h(x); // bad: x is uninitialized here if either path uses it
#![allow(unused)] fn main() { let x = vec![1,2,3]; while f() { g(x); // bad: x would be moved in the first iteration, // uninitialized in the second } }
Remedy for the previous code:
#![allow(unused)] fn main() { let mut x = vec![1,2,3]; while f() { g(x); x = h(); } e(x); }
Question: What if we really do want to move an element out of a vector?
#![allow(unused)] fn main() { let mut v = Vec::new(); for i in 101..106 { v.push(i.to_string()); } // 1. Pop a value off the end of the vector let fifth = v.pop().expect("vector empty"); assert_eq!(fifth, "105"); // 2. Move a value out of a given index in the vector, // and move the last element into its spot. let second = v.swap_remove(1); assert_eq!(second, "102"); // 3. Swap in another value for the one we're taking out: let third = std::mem::replace(&mut v[2], "substitute".to_string()); assert_eq!(third, "103"); assert_eq!(v, vec!["101", "104", "substitute"]); }
Now we need to analyze the code below:
#![allow(unused)] fn main() { let v = vec!["ahmad".to_string(), "sholeh".to_string()]; for mut s in v { s.push('!'); println!("{s}"); } }
- We are passing the vector to the loop directly, so this moves the vector out of v, leaving v uninitialized.
- The for loop's internal machinery takes ownership of the vector and dissects it into its elements.
- On each iteration, s owns the string, so we're able to modify it inside the loop.
- The vector itself is no longer visible to the code, and nothing can observe it mid-loop in some partially emptied state.
Now, how would you change a type so that a vector or another collection can track the presence or absence of a value? Probably by using Option, right?
#![allow(unused)] fn main() { struct Person { name: Option<String>, birth: i32 } let mut composers = Vec::new(); composers.push(Person { name: Some("Palestrina".to_string()), birth: 1525 }); }
We can't do:
#![allow(unused)] fn main() { let first_name = composers[0].name; }
Instead we can:
#![allow(unused)] fn main() { let first_name = std::mem::replace(&mut composers[0].name, None); }
The pattern above (replacing an Option with None) is so common that the type provides a take method for this very purpose:
#![allow(unused)] fn main() { let first_name = composers[0].name.take(); }
Copy Type
The standard Copy types include all the machine integer and floating-point numeric types, the char and bool types, and a few others. A tuple or fixed-size array of Copy types is itself a Copy type.
Rc and Arc
#![allow(unused)] fn main() { use std::rc::Rc; let s: Rc<String> = Rc::new("ahmad".to_string()); let t: Rc<String> = s.clone(); let u: Rc<String> = s.clone(); }
- s, t, and u are located on the stack, each pointing to the same heap allocation, which holds the reference count and the String.
- Cloning an Rc<T> value does not copy the T; instead, it simply creates another pointer to it and increments the reference count.
- The usual ownership rules apply to the Rc pointers themselves, and when the last extant Rc is dropped, Rust drops the String as well.
- You can use any of String's usual methods directly on an Rc<String>.
A value owned by an Rc is immutable.
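The reference count can be observed directly. This is a sketch with invented function names; Rc::clone, Rc::ptr_eq, and Rc::strong_count are the standard library functions.

```rust
use std::rc::Rc;

/// Clone an Rc twice and report (strong count, whether all handles share one allocation).
fn share_three_ways(text: &str) -> (usize, bool) {
    let s: Rc<String> = Rc::new(text.to_string());
    let t = Rc::clone(&s);
    let u = Rc::clone(&s);
    let same = Rc::ptr_eq(&s, &t) && Rc::ptr_eq(&s, &u);
    (Rc::strong_count(&s), same)
}

/// Once a clone is dropped, the count goes back down.
fn count_after_drop(text: &str) -> usize {
    let s = Rc::new(text.to_string());
    {
        let _t = Rc::clone(&s);
    } // _t is dropped here, decrementing the count
    Rc::strong_count(&s)
}
```

Rc::clone(&s) is the same operation as s.clone(); the longer spelling just makes it obvious that only the pointer is being cloned.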
Note: One well-known problem with using reference counts to manage memory is that, if there are ever two reference-counted values that point to each other, each will hold the other's reference count above zero, so the values will never be freed.
Chapter 5: References
Shared references are Copy
Mutable references are NOT Copy
As long as there are shared references to a value, not even its owner can modify it; the value is locked down!
If there is a mutable reference to a value, it has exclusive access to the value; you can't use the owner at all, until the mutable reference goes away!
Iterating over a shared reference to a HashMap is defined to produce shared references to each entry's key and value.
IMPORTANT:
- Since references are so widely used in Rust, the . operator implicitly dereferences its left operand, if needed.
- The . operator can also implicitly borrow a reference to its left operand, if needed for a method call.
#![allow(unused)] fn main() { let mut v = vec![1973, 1968]; v.sort(); (&mut v).sort(); // equivalent, but more verbose }
Also, Rust permits references to references, and the . operator follows as many references as it takes to find its target:
#![allow(unused)] fn main() { struct Point { x: i32, y: i32 } let point = Point {x: 1000, y: 1000}; let r : &Point = &point; let rr: &&Point = &r; let rrr: &&&Point = &rr; assert_eq!(rrr.y, 1000); }
If you actually want to know whether two references point to the same memory, you can use std::ptr::eq, which compares their addresses:
#![allow(unused)] fn main() { assert!(rx == ry); // their referent are equivalent assert!(!std::ptr::eq(rx, ry)); // but occupy different addresses }
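The snippet above assumes rx and ry already exist. A self-contained sketch of the same comparison, with a made-up helper name, could look like this:

```rust
/// Compare two references by value and by address.
/// Returns (values equal, addresses equal).
fn compare_refs(rx: &i32, ry: &i32) -> (bool, bool) {
    // == looks through the references and compares the i32 values;
    // std::ptr::eq compares the addresses the references hold.
    (rx == ry, std::ptr::eq(rx, ry))
}
```

Given let x = 10; let y = 10;, compare_refs(&x, &y) reports equal values at different addresses, while compare_refs(&x, &x) reports both as equal.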
Rust references are never null.
You can't use any variable until it's been initialized, regardless of its type.
Rust won't convert integers to references (outside of unsafe code), so you can't convert zero into a reference.
- In Rust, if you need a value that is either a reference to something or not, use the type Option<&T>.
- At the machine level, Rust represents None as a null pointer and Some(r), where r is a &T value, as the nonzero address, so Option<&T> is just as efficient as a nullable pointer in C or C++, even though it's safer:
  - Its type requires you to check whether it's None before you can use it.
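Both claims can be checked in a few lines. The function names below are invented for these notes; std::mem::size_of is the standard library function.

```rust
use std::mem::size_of;

/// True when Option<&i32> occupies exactly as much space as &i32,
/// i.e. None is stored as the otherwise-impossible null address.
fn option_ref_is_free() -> bool {
    size_of::<Option<&i32>>() == size_of::<&i32>()
}

/// Look up the first even number, returning a reference or None.
fn first_even(values: &[i32]) -> Option<&i32> {
    values.iter().find(|n| **n % 2 == 0)
}
```

A caller of first_even cannot reach the i32 without first handling the None case, for example with match or if let.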
Lifetime
- A lifetime is some stretch of your program for which a reference could be safe to use: a statement, an expression, the scope of some variable, or the like.
- Lifetimes are entirely figments of Rust's compile-time imagination. At run time, a reference is nothing but an address; its lifetime is part of its type and has no run-time representation.
Static variables:
- Every static must be initialized
- Mutable statics are inherently not thread-safe (after all, any thread can access a static at any time), and even in single-threaded programs, they can fall prey to other sorts of reentrancy problems.
- For these reasons, you may access a mutable static only within an unsafe block.
#![allow(unused)] fn main() { static mut STASH: &i32 = &128; fn f(p: &i32) { unsafe { STASH = p; } } }
The signature of f as written is actually shorthand for the following:
#![allow(unused)] fn main() { fn f<'a>(p: &'a i32) {...} }
You can read <'a> as "for any lifetime 'a", so when we write fn f<'a>(p: &'a i32), we're defining a function that takes a reference to an i32 with any given lifetime 'a. Since STASH has the 'static lifetime, assigning it a reference that is not 'static would break Rust's lifetime rules, so the compiler rejects f as written. The fix is to require a 'static reference:
#![allow(unused)] fn main() { static mut STASH: &i32 = &10; fn f(p: &'static i32) { unsafe { STASH = p; } } }
Question: What is the reentrancy problem mentioned above? See the section "Understanding static mut and Reentrancy in Rust" below.
Whenever a reference type appears inside another type's definition, you must write out its lifetime.
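A minimal sketch of that rule. The type Excerpt and the function first_sentence are made up for illustration; the point is only that the struct definition has to name the lifetime of the reference it holds.

```rust
/// A struct holding a reference must name the reference's lifetime.
struct Excerpt<'a> {
    text: &'a str,
}

/// Borrow the first sentence of `source`; the result cannot outlive `source`.
fn first_sentence<'a>(source: &'a str) -> Excerpt<'a> {
    // End just after the first '.', or at the end of the string if there is none.
    let end = source.find('.').map_or(source.len(), |i| i + 1);
    Excerpt { text: &source[..end] }
}
```

Writing struct Excerpt { text: &str } without the <'a> is a compile error: Rust wants the definition to say how long the borrowed data must live.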
Understanding static mut and Reentrancy in Rust
This question gets to the heart of why Rust is so strict about global state.
In a single-threaded program, reentrancy means a function's execution is paused in the middle, and then code is run that calls back into that function (or another function that modifies the same data) before the first call has finished.
The static mut variable is like a single whiteboard for the entire program. A reentrancy problem happens when you're in the middle of writing something on the whiteboard, you get interrupted, and the "interruption" also tries to read or write to that same whiteboard — messing up your original, unfinished thought.
Here are the most common non-threading examples:
1. Interrupts (Embedded / Bare-metal)
This is the most classic example. Imagine you're writing code for a microcontroller.
Your main loop is running. It reads a static mut COUNTER (value: 10) and is about to add 1 to it.
static mut COUNTER: u32 = 0; fn main_loop() { unsafe { // 1. Reads COUNTER (value: 10) let x = COUNTER; // <-- 3. INTERRUPT HAPPENS RIGHT HERE! // 6. Resumes. x is still 10. let y = x + 1; COUNTER = y; // 7. Writes 11 } }
A hardware timer interrupt fires. The CPU immediately pauses main_loop and jumps to the interrupt service routine (ISR).
#![allow(unused)] fn main() { fn timer_interrupt_handler() { unsafe { // 4. Interrupt code runs. Reads COUNTER (value: 10) let a = COUNTER; let b = a + 1; COUNTER = b; // 5. Writes 11 } } // Interrupt finishes, returns to main_loop }
Result: The counter was incremented twice, but the final value is 11, not 12.
This is a classic race condition — all within a single thread.
2. Signal Handlers (Unix-like Systems)
This is the operating system equivalent of an interrupt. A user can send a signal (like SIGINT via Ctrl+C) at any time.
Your program is in the middle of modifying a static mut variable.
A signal arrives. The OS pauses your code and runs your registered signal handler.
If that signal handler also tries to read or modify that same static mut variable, you have the exact same race condition as the interrupt example.
The handler is re-entering the logic that accesses the shared data.
3. Nested Callbacks
This is a "pure" software version. It happens when you pass a function (a closure) to another function, and that function calls your closure in a way you didn't expect.
Imagine you have a global state for a simple logger:
#![allow(unused)] fn main() { static mut LOG_PREFIX: &str = "MAIN"; // A function that sets a temporary prefix, does work, and restores it. fn do_work_with_log(new_prefix: &'static str, work: fn()) { unsafe { let old_prefix = LOG_PREFIX; // 1. Backs up the current prefix ("MAIN" in the outer call) LOG_PREFIX = new_prefix; // 2. Sets prefix to "WORK" work(); // 3. Calls the work function // 6. Resumes. While work() ran, the nested call overwrote LOG_PREFIX // with "RECURSIVE" and then put "WORK" back. LOG_PREFIX = old_prefix; // 7. Restores "MAIN" } } // A function that also logs fn recursive_call() { do_work_with_log("RECURSIVE", || { // 5. We are here. LOG_PREFIX is "RECURSIVE" println!("{} - Oh no!", unsafe { LOG_PREFIX }); }); } }
Now, what if you do this?
fn main() { do_work_with_log("WORK", || { // 4. We are here. LOG_PREFIX is "WORK". // But what if this work... calls do_work_with_log again? recursive_call(); }); }
The first call to do_work_with_log("WORK") is paused at step 3.
The second call (recursive_call) runs, and it also calls do_work_with_log("RECURSIVE").
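This second, nested call overwrites LOG_PREFIX while the outer call is still in progress, so anything the outer work logs during that window carries the wrong prefix. In this particular example the nested call restores the prefix before returning, so the save/restore pairs happen to nest correctly, but nothing enforces that. If the nested code set the prefix without restoring it, or panicked before restoring it, the outer do_work_with_log would resume with a LOG_PREFIX it did not expect.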
Summary
This is why static mut requires an unsafe block:
You are promising the compiler,
"I know about all these reentrancy issues (threads, interrupts, signals, callbacks) and I have personally handled them."
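In safe Rust, you typically use abstractions like Mutex, the atomic types, Cell, or RefCell to handle interior mutability and avoid these pitfalls. (Cell and RefCell are not thread-safe, so they cannot be stored in an ordinary static; Mutex and the atomics can.)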
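As one possible safe replacement for the static mut COUNTER from the interrupt example, here is a sketch using an atomic. The function name bump is invented; AtomicU32 and fetch_add are standard library items. Whether atomics are available on a given bare-metal target depends on the hardware, so treat this as an illustration for hosted platforms.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// A global counter that needs no `unsafe`: the atomic type makes every
/// read-modify-write a single indivisible step.
static COUNTER: AtomicU32 = AtomicU32::new(0);

/// Increment the counter and return the new value.
fn bump() -> u32 {
    // fetch_add returns the previous value, so add 1 to get the new one.
    COUNTER.fetch_add(1, Ordering::SeqCst) + 1
}
```

Because the load, add, and store happen as one operation, there is no gap in which another increment can be lost the way it was between "let x = COUNTER" and "COUNTER = y".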
Chapter 6: Expressions
In Rust, if is an expression, so it can be used to produce a value:
let status = if cpu.temperature <= MAX_TEMP { HttpsStatus::Ok } else { HttpsStatus::ServerError };
This explains why Rust does not have C's ternary operator.
Control flow expression examples:
#![allow(unused)] fn main() { let a = if let Some(x) = f() { x } else { 0 }; let a = match x {None => 0, _ => 1}; }
The operators that can usefully be chained are all left-associative, so a - b - c is grouped as (a - b) - c. They are:
#![allow(unused)] fn main() { * / % + - << >> & ^ | && || as }
Note: The comparison operators, the assignment operators, and the range operators can't be chained at all.
Question: What is the issue with this code:
#![allow(unused)] fn main() { if preferences.changed() { page.compute_size() } }
With the semicolon missing, the block's value would be whatever page.compute_size() returns, but an if without an else must always return ().
A let declaration can declare a variable without initializing it. The variable can then be initialized with a later assignment. This is occasionally useful, because sometimes a variable should be initialized from the middle of some sort of control flow construct:
#![allow(unused)] fn main() { let name; if user.has_nickname() { name = user.nickname(); } else { name = generate_unique_name(); user.register(&name); } }
Here there are two different ways the local variable name might be initialized, but either way it will be initialized exactly once, so name does not need to be declared mut.
match statement
In a match block, if you place a _ pattern before other patterns, it takes precedence over them. Those later patterns will never match anything, and the compiler will warn you.
A pattern can:
- match a range of values
- unpack tuples
- match against individual fields of structs
- chase references
- borrow parts of a value
A comma after an arm may be dropped if the expr is a block!
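A short sketch of several of these pattern forms. The Point type and the function names are made up for these notes.

```rust
struct Point {
    x: i32,
    y: i32,
}

/// Match a range of values.
fn age_group(age: u32) -> &'static str {
    match age {
        0..=12 => "child",
        13..=19 => "teen",
        _ => "adult",
    }
}

/// Unpack a tuple, look through a reference, and match individual struct fields.
fn describe(pair: (&Point, bool)) -> String {
    match pair {
        (Point { x: 0, y: 0 }, _) => "origin".to_string(),
        (Point { x, y: 0 }, true) => format!("on the x axis at {x}"),
        (Point { x, y }, _) => format!("({x}, {y})"),
    }
}
```

In describe, the patterns are matched against a &Point, so x and y are bound as references into the point rather than copied out of it.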
if let
#![allow(unused)] fn main() { if let pattern = expr { block1 } else { block2 } }
Sometimes this is a nice way to get data out of an Option or Result:
#![allow(unused)] fn main() { if let Some(cookie) = request.session.cookie { return restore_session(cookie); } }
It's never strictly necessary to use if let, because match can do everything if let can do.
Loops
There are four looping expressions:
#![allow(unused)] fn main() { while condition { block } while let pattern = expr { block } loop { block } for pattern in iterable { block } }
The value of a while or for loop is always ().
A break can have both a label and a value expression:
#![allow(unused)] fn main() { let sqrt = 'outer: loop { let n = next_number(); for i in 1.. { let square = i * i; if square == n { // found a square root break 'outer i; } if square > n { // `n` isn't a perfect square, try the next break; } } }; }
? operator
These two are equivalent:
#![allow(unused)] fn main() { let output = File::create(filename)?; }
#![allow(unused)] fn main() { let output = match File::create(filename) { Ok(f) => f, Err(err) => return Err(err) }; }
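The same early-return behavior in a function that needs no files. parse_sum is a hypothetical helper; str::parse and ParseIntError are from the standard library.

```rust
use std::num::ParseIntError;

/// Parse two decimal strings and add them, propagating the first error with `?`.
fn parse_sum(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x: i32 = a.trim().parse()?; // returns early with Err on failure
    let y: i32 = b.trim().parse()?;
    Ok(x + y)
}
```

Note that ? can only be used inside a function whose return type can carry the error, here Result<i32, ParseIntError>.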
In the method call player.location(), player might be a Player, a reference of type &Player, or a smart pointer of type Box<Player> or Rc<Player>. The .location() method might take the player either by value or by reference. The same .location() syntax works in all cases, because Rust's . operator automatically dereferences player or borrows a reference to it as needed.
Type-associated functions
Like Vec::new()
One quirk of Rust syntax is that in a function call or method call, the usual syntax for generic types, Vec<T>, does not work:
#![allow(unused)] fn main() { return Vec<i32>::with_capacity(1000); let ramp = (0..n).collect<Vec<i32>>(); }
The problem is that in expressions, < is the less-than operator. You should instead use:
#![allow(unused)] fn main() { return Vec::<i32>::with_capacity(1000); let ramp = (0..n).collect::<Vec<i32>>(); }
The symbol ::<...> is affectionately known in Rust community as the turbofish. Alternatively, it is often possible to drop the type parameters and let Rust infer them:
#![allow(unused)] fn main() { return Vec::with_capacity(10); let ramp: Vec<i32> = (0..n).collect(); }
Note: It is considered good style to omit the types whenever they can be inferred!
In a field access expression, if the value to the left of the dot is a reference or smart pointer type, it is automatically dereferenced, just as for method calls.
In an index expression, the value to the left of the brackets is automatically dereferenced.
Extracting a slice from an array or vector is straightforward:
#![allow(unused)] fn main() { let second_half = &game_moves[midpoint..end]; }
Here, game_moves may be either an array, a slice, or a vector; the result, regardless, is a borrowed slice of length end - midpoint. game_moves is considered borrowed for the lifetime of second_half.
Unary - negates a number. It is supported for all the numeric types except unsigned integers. There is no unary + operator.
Shift operators
Bit shifting is always sign-extending on signed integer types and zero-extending on unsigned integer types. Since Rust has unsigned integers, it does not need an unsigned shift operator, like Java's >>> operator.
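A sketch contrasting the two behaviors on the same bit pattern. The function names are invented for illustration.

```rust
/// Right shift on a signed integer copies the sign bit into the vacated bits.
fn shift_signed(n: i32, bits: u32) -> i32 {
    n >> bits
}

/// Right shift on an unsigned integer fills the vacated bits with zeros.
/// Casting to u32 first gives the effect of Java's `>>>`.
fn shift_unsigned(n: i32, bits: u32) -> u32 {
    (n as u32) >> bits
}
```

For n = -8 (bit pattern 0xFFFF_FFF8), shifting by one gives -4 in the signed case and 0x7FFF_FFFC in the unsigned case.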
Unlike C, Rust doesn't support chaining assignment: you can't write a = b = 3 to assign the value 3 to both a and b.
Rust does not have C's increment and decrement operators ++ and --.
Type Cast
- Conversion to a narrower type results in truncation.
- Conversion from a floating-point type to an integer type rounds toward zero. If the value is too large to fit in the integer type, the cast produces the closest value that the integer type can represent: the value of 1e6 as u8 is 255.
- A value of type bool or char, or of a C-like enum type, may be cast to any integer type.
  - Casting in the other direction is not allowed, as bool, char, and enum types all have restrictions on their values that would have to be enforced with run-time checks.
    - As an exception, a u8 may be cast to type char, since all integers from 0 to 255 are valid Unicode code points for char to hold.
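The rules above, written out as tiny functions so that each cast can be tried with different inputs. The function names are hypothetical.

```rust
/// Narrowing keeps only the low 8 bits of the value.
fn narrow(n: i32) -> u8 {
    n as u8
}

/// Float to integer rounds toward zero and saturates at the target's bounds.
fn float_to_u8(x: f64) -> u8 {
    x as u8
}

/// Same rounding rule with a signed target.
fn float_to_i32(x: f64) -> i32 {
    x as i32
}

/// u8 is the one integer type that `as` will turn into a char.
fn byte_to_char(b: u8) -> char {
    b as char
}
```

For example, narrow(300) is 44 (300 modulo 256), float_to_u8(1e6) is 255, float_to_i32(-1.9) is -1, and byte_to_char(65) is 'A'.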
Conversion usually requires a cast. A few conversions involving reference types are so straightforward that the language performs them even without a cast. One trivial example is converting a mut reference to a non-mut reference.
Several more significant automatic conversions can happen:
- Values of type &String auto-convert to type &str without a cast.
- Values of type &Vec<i32> auto-convert to type &[i32].
- Values of type &Box<Chessboard> auto-convert to &Chessboard.
These are called deref coercions, because they apply to types that implement the Deref built-in trait. The purpose of deref coercion is to make smart pointer types, like Box, behave as much like the underlying value as possible.
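A sketch of these coercions at call sites. word_count, total, and coercion_demo are made-up names; the conversions themselves are what the compiler inserts automatically.

```rust
/// Count the words in a string slice.
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Sum the elements of a slice.
fn total(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Call both helpers with owning types; the borrows coerce automatically.
fn coercion_demo() -> (usize, i32, usize) {
    let owned: String = "veni vidi vici".to_string();
    let numbers: Vec<i32> = vec![1, 2, 3];
    let boxed: Box<String> = Box::new("lonely hearts".to_string());
    (
        word_count(&owned), // &String      -> &str
        total(&numbers),    // &Vec<i32>    -> &[i32]
        word_count(&boxed), // &Box<String> -> &String -> &str
    )
}
```

This is also why functions are usually written to take &str and &[T] rather than &String and &Vec<T>: the slice versions accept strictly more callers.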
Chapter 7: Error Handling
Panic:
- Out-of-bounds array access
- Integer division by zero
- Calling .expect() on a Result that happens to be Err
- Assertion failure
panic!() accepts optional println!()-style arguments, for building an error message.
IMPORTANT: When a panic happens, by default:
- The stack is unwound: any temporary values, local variables, or arguments that the current function was using are dropped, in the reverse of the order they were created. Dropping a value simply means cleaning up after it: any Strings or Vecs the program was using are freed, any open Files are closed, and so on. User-defined drop methods are called too. In the particular case of pirate_share, there's nothing to clean up. Once the current function call is cleaned up, we move on to its caller, dropping its variables and arguments the same way. Then we move up to that function's caller, and so on up the stack.
- Finally, the thread exits. If the panicking thread was the main thread, then the whole process exits (with a nonzero exit code).
Panic is safe. It doesn't violate any of Rust's safety rules: even if you manage to panic in the middle of a standard library method, it will never leave a dangling pointer or a half-initialized value in memory!
Panic is per thread. A parent thread can find out when a child thread panics and handle the error gracefully!
There is also a way to catch stack unwinding, allowing the thread to survive and continue running. The standard library function std::panic::catch_unwind() does this.
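To watch unwinding happen, here is a rough sketch that uses catch_unwind together with a made-up Noisy type whose drop method records its name. It is only meant to illustrate the "dropped in reverse order" rule described earlier; AssertUnwindSafe is used because the closure borrows a RefCell.

```rust
use std::cell::RefCell;
use std::panic::{self, AssertUnwindSafe};

// Hypothetical type that logs its name when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn drop_order_during_panic() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());

    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        let _first = Noisy { name: "first", log: &log };
        let _second = Noisy { name: "second", log: &log };
        // panic! takes println!-style arguments for the message.
        panic!("boom, code {}", 42);
    }));

    // The panic was caught, so this thread is still alive here.
    assert!(result.is_err());
    log.into_inner()
}
```

The returned log lists "second" before "first": locals are dropped in the reverse of the order they were created, even when the function exits by panicking.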
You can use threads and catch_unwind() to handle panic, making your program more robust. One important caveat is that these tools only catch panics that unwind the stack. Not every panic proceeds this way!
If a drop() method triggers a second panic while Rust is still trying to clean up after the first, this is considered fatal. Rust stops unwinding and aborts the whole process.
Also, Rust's panic behavior is customizable. If you compile with -C panic=abort, the first panic in your program immediately aborts the process. (With this option, Rust does not need to know how to unwind the stack, so this can reduce the size of your compiled code).
How to handle a Result:
match is a bit verbose. So Result<T, E> offers a variety of methods:
- result.is_ok(), result.is_err(): Return a bool telling whether result is a success result or an error result.
- result.ok(): Returns the success value, if any, as an Option<T>.
- result.err(): Returns the error value, if any, as an Option<E>.
- result.unwrap_or(fallback): Returns the success value, if result is a success result. Otherwise, it returns fallback, discarding the error.
- result.unwrap_or_else(fallback_fn): This is the same, but instead of passing a fallback value directly, you pass a function or closure. Note: This is for cases where it would be wasteful to compute a fallback value if you're not going to use it. The fallback_fn is called only if we have an error result.
- result.unwrap(): Also returns the success value, if result is a success result. However, if result is an error result, this method panics.
- result.expect(message): This is the same as unwrap(), but provides a message that it prints in case of panic.
- result.as_ref(): Converts a Result<T, E> to Result<&T, &E>.
- result.as_mut(): Converts a Result<T, E> to Result<&mut T, &mut E>.
Note: One reason the last two methods (as_ref and as_mut) are useful is that all of the other methods listed here, except .is_ok() and .is_err(), consume the result they operate on.
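A short sketch of a few of these methods in use. The parse_port helper and the 8080 default are assumptions made up for the example.

```rust
use std::num::ParseIntError;

// Hypothetical helper that produces a Result to experiment with.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

// unwrap_or: take the success value, or a fallback if parsing failed.
fn port_or_default(text: &str) -> u16 {
    parse_port(text).unwrap_or(8080)
}

// unwrap_or_else: the closure runs only on the error path.
fn port_or_computed(text: &str) -> u16 {
    parse_port(text).unwrap_or_else(|err| {
        eprintln!("falling back to default port: {}", err);
        8080
    })
}

fn describe(text: &str) -> String {
    let result = parse_port(text);

    // as_ref() borrows the contents, so `result` is still usable afterwards.
    let digits = result
        .as_ref()
        .map(|port| port.to_string().len())
        .unwrap_or(0);

    match result {
        Ok(port) => format!("port {} ({} digits)", port, digits),
        Err(err) => format!("invalid port: {}", err),
    }
}
```

Without the as_ref() call in describe, the map(...).unwrap_or(0) chain would consume result and the match below it would not compile.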
Result Type Aliases
Sometimes you'll see Rust documentation that seems to omit the error type of a Result:
fn remove_file(path: &Path) -> Result<()>
This means that a Result type alias is being used. Modules often define a Result type alias to avoid having to repeat an error type that's used consistently by almost every function in the module. For example, the standard library's std::io module includes this line of code:
pub type Result<T> = result::Result<T, Error>;
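The same pattern can be used in your own modules. A minimal sketch, where the config module, ConfigError, and parse_flag are all invented for illustration:

```rust
mod config {
    use std::fmt;

    // Hypothetical error type shared by every function in this module.
    #[derive(Debug)]
    pub struct ConfigError {
        message: String,
    }

    impl fmt::Display for ConfigError {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            write!(f, "config error: {}", self.message)
        }
    }

    impl std::error::Error for ConfigError {}

    // Module-local alias: inside `config`, Result<T> means "T or ConfigError".
    pub type Result<T> = std::result::Result<T, ConfigError>;

    pub fn parse_flag(text: &str) -> Result<bool> {
        match text {
            "on" => Ok(true),
            "off" => Ok(false),
            other => Err(ConfigError {
                message: format!("unknown flag value: {}", other),
            }),
        }
    }
}
```

Keeping the alias inside a module avoids shadowing the prelude's two-parameter Result everywhere else in the crate.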
Printing an error value does not also print out its source. If you want to be sure to print all the available information:
use std::error::Error;
use std::io::{stderr, Write};

// Print an error, then every underlying cause in its source chain.
fn print_error(mut err: &dyn Error) {
    let _ = writeln!(stderr(), "error: {}", err);
    while let Some(source) = err.source() {
        let _ = writeln!(stderr(), "caused by: {}", source);
        err = source;
    }
}
- Note that the writeln! macro works like println!, except that it writes to a stream of your choice.
- We could use the eprintln! macro to do the same thing, but eprintln! panics if an error occurs.
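To try the chain walking on an error that actually has a source, here is a sketch with a made-up LoadError type that wraps a ParseIntError. Instead of writing to stderr, it collects the lines into a Vec so the result is easy to inspect; the path "settings.txt" is just sample data.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error that remembers its underlying cause.
#[derive(Debug)]
struct LoadError {
    path: String,
    cause: ParseIntError,
}

impl fmt::Display for LoadError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to load {}", self.path)
    }
}

impl Error for LoadError {
    // Expose the wrapped error so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

// Same loop as print_error, but collecting the lines instead of printing.
fn error_chain(mut err: &dyn Error) -> Vec<String> {
    let mut lines = vec![format!("error: {}", err)];
    while let Some(source) = err.source() {
        lines.push(format!("caused by: {}", source));
        err = source;
    }
    lines
}

fn sample_chain() -> Vec<String> {
    let cause = "abc".parse::<i32>().unwrap_err();
    let err = LoadError {
        path: String::from("settings.txt"),
        cause,
    };
    error_chain(&err)
}
```

sample_chain() yields two lines: the LoadError message first, then a "caused by:" line carrying the ParseIntError message.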
Propagating Errors
It is simply too much code to use a 10-line match statement every place where something might go wrong.
let weather = get_weather(hometown)?;
The behavior of the ? operator depends on whether the function returns a success result or an error result:
- On success, it unwraps the Result to get the success value inside.
- On error, it immediately returns from the enclosing function, passing the error result up the call chain. To ensure that this works, ? can only be used on a Result in functions that have a Result return type.
? also works similarly with the Option type. In a function that returns Option, you can use ? to unwrap a value and return early in the case of None.
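A small sketch of ? on both Result and Option. The function names are made up for the example.

```rust
use std::num::ParseIntError;

// Each ? either unwraps the parsed number or returns the error to the caller.
fn add_strings(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}

// In a function returning Option, ? returns None early.
fn first_char_upper(text: &str) -> Option<char> {
    let first = text.chars().next()?;
    Some(first.to_ascii_uppercase())
}
```

Each ? in add_strings stands in for a match that returns Err(e) early on failure and yields the parsed value otherwise.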
Working with Multiple Error Types
use std::io::{self, BufRead};

fn read_numbers(file: &mut dyn BufRead) -> Result<Vec<i64>, io::Error> {
    let mut numbers = vec![];
    for line_result in file.lines() {
        let line = line_result?; // reading a line can fail with io::Error
        numbers.push(line.parse()?); // parsing can fail with ParseIntError: does not compile
    }
    Ok(numbers)
}
Here the compiler will complain that ? cannot convert a std::num::ParseIntError value to the type std::io::Error.
What should we do then?
- Try the thiserror crate, which is designed to help you define good error types with just a few lines of code.
- A simpler approach is to use what's built into Rust. All of the standard library error types can be converted to the type Box<dyn std::error::Error + Send + Sync + 'static>. This might seem like a mouthful. However, dyn std::error::Error represents "any error" and Send + Sync + 'static makes it safe to pass between threads, which you'll often want.
type GenericError = Box<dyn std::error::Error + Send + Sync + 'static>;
type GenericResult<T> = Result<T, GenericError>;
- You could also consider using the anyhow crate.
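With the GenericError and GenericResult aliases, read_numbers compiles once its return type is changed. A sketch (the aliases are repeated so the block stands alone):

```rust
use std::io::BufRead;

type GenericError = Box<dyn std::error::Error + Send + Sync + 'static>;
type GenericResult<T> = Result<T, GenericError>;

fn read_numbers(file: &mut dyn BufRead) -> GenericResult<Vec<i64>> {
    let mut numbers = vec![];
    for line_result in file.lines() {
        // ? converts io::Error into GenericError automatically.
        let line = line_result?;
        // ? converts ParseIntError into GenericError the same way.
        numbers.push(line.parse()?);
    }
    Ok(numbers)
}
```

For a quick check without a real file, std::io::Cursor::new("1\n2\n3\n") can be passed as the reader. The trade-off is that the return type no longer says which kinds of error the caller can expect.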
If you're calling a function that returns a GenericResult and you want to handle one particular kind of error but let all others propagate out, use the generic method error.downcast_ref::<ErrorType>(). It borrows a reference to the error, if it happens to be the particular type of error you're looking for.
loop {
    match compile_project() {
        Ok(()) => return Ok(()),
        Err(err) => {
            // Handle only MissingSemicolonError; everything else propagates.
            if let Some(mse) = err.downcast_ref::<MissingSemicolonError>() {
                insert_semicolon_in_source_code(mse.file(), mse.line())?;
                continue;
            }
            return Err(err);
        }
    }
}
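The fragment above relies on functions that are not shown in these notes. Below is a self-contained toy version to experiment with: compile_project, insert_semicolon, and the shape of MissingSemicolonError are all assumptions invented for the sketch (the "compiler" just checks that every line ends with a semicolon). Only the downcast_ref pattern is the point.

```rust
use std::error::Error;
use std::fmt;

type GenericError = Box<dyn Error + Send + Sync + 'static>;
type GenericResult<T> = Result<T, GenericError>;

// Hypothetical error type carrying the 1-based line that needs fixing.
#[derive(Debug)]
struct MissingSemicolonError {
    line: usize,
}

impl fmt::Display for MissingSemicolonError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "missing semicolon on line {}", self.line)
    }
}

impl Error for MissingSemicolonError {}

// Toy "compiler": every line must end with a semicolon.
fn compile_project(source: &str) -> GenericResult<()> {
    if source.is_empty() {
        return Err("empty source".into());
    }
    match source.lines().position(|l| !l.trim_end().ends_with(';')) {
        Some(index) => Err(MissingSemicolonError { line: index + 1 }.into()),
        None => Ok(()),
    }
}

// Toy fix-up: append a semicolon to the given 1-based line.
fn insert_semicolon(source: &mut String, line: usize) {
    let fixed: Vec<String> = source
        .lines()
        .enumerate()
        .map(|(index, text)| {
            if index + 1 == line {
                format!("{};", text.trim_end())
            } else {
                text.to_string()
            }
        })
        .collect();
    *source = fixed.join("\n");
}

// Retry until the project compiles; returns how many fixes were applied.
fn compile_with_fixes(source: &mut String) -> GenericResult<u32> {
    let mut fixes = 0;
    loop {
        match compile_project(source.as_str()) {
            Ok(()) => return Ok(fixes),
            Err(err) => {
                // Only this one error type is handled here.
                if let Some(mse) = err.downcast_ref::<MissingSemicolonError>() {
                    insert_semicolon(source, mse.line);
                    fixes += 1;
                    continue;
                }
                // Any other error propagates to the caller unchanged.
                return Err(err);
            }
        }
    }
}
```

An empty source produces a plain string error, which downcast_ref does not match, so it is returned to the caller as-is.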