EMIKG
Rust-based EMI web portal
Goals of this presentation
Rust Lexicon
LLVM
LLVM (originally short for Low Level Virtual Machine, although the name is no longer treated as an acronym) is a collection of modular and reusable compiler and toolchain technologies.
It provides a set of intermediate representations and optimization passes, making it a versatile foundation for building compilers for various programming languages.
LLVM is often used for optimizing and generating machine code from high-level programming languages
Rust compiles with LLVM
Rust compilation
Syntax Analysis: Rust source code is parsed to ensure it follows the language's syntax rules.
Intermediate Representation (IR): The code is translated into LLVM intermediate representation, a platform-independent format.
Optimization: LLVM performs various optimization passes on the IR to enhance code efficiency.
Code Generation: Optimized IR is translated into machine code specific to the target architecture.
Linking: The compiled code is linked with other necessary components to create the final executable.
Compile time and Run time
Compile Time: The phase when the source code is translated and checked by the compiler, resulting in the generation of executable files.
Run Time: The phase during which the compiled program is executed, performing its tasks and responding to dynamic conditions.
Unless we use just-in-time (JIT) compilation, there is no code generation at run time.
Crates
Lifetimes
Refers to a concept used in the ownership system to manage memory safety without the need for garbage collection
Lifetimes specify the scope over which references are valid and help the compiler ensure that references do not outlive the data they point to
E.g. a marriage is valid only as long as both spouses are alive
'a '_
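A minimal sketch of the two notations, assuming a hypothetical longest helper and a hypothetical Wrapper struct that are not part of the project: 'a names a lifetime explicitly, while '_ asks the compiler to infer it.

```rust
/// Hypothetical helper: returns the longer of two string slices.
/// The named lifetime 'a states that the result is valid only
/// while both inputs are still valid.
fn longest<'a>(left: &'a str, right: &'a str) -> &'a str {
    if left.len() >= right.len() {
        left
    } else {
        right
    }
}

/// Hypothetical struct holding a reference: it cannot outlive `text`.
struct Wrapper<'a> {
    text: &'a str,
}

// '_ is the anonymous lifetime: the compiler fills it in for us.
impl Wrapper<'_> {
    fn first_word(&self) -> &str {
        self.text.split_whitespace().next().unwrap_or("")
    }
}

fn main() {
    let a = String::from("mass spectrum");
    let b = String::from("MGF");
    println!("Longest: {}", longest(&a, &b));

    let wrapper = Wrapper { text: "hello world" };
    println!("First word: {}", wrapper.first_word());
}
```

If either input of longest were dropped while the result is still in use, the program would be rejected at compile time.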
Mutable reference
Strict separation between mutable and immutable references to prevent data races and ensure thread safety.
Mutable references allow modification of the data they point to, while immutable references only allow reading.
You can only have one mutable reference at once*
*Unless you really ask nicely.
&'a object
&'b mut object
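The borrowing rule can be sketched as follows. The function names are made up for illustration; the commented-out lines show what the compiler would reject.

```rust
/// Needs exclusive (mutable) access because it modifies the vector.
fn add_peak(peaks: &mut Vec<f64>, mz: f64) {
    peaks.push(mz);
}

/// Only needs shared (immutable) access because it just reads.
fn count_peaks(peaks: &[f64]) -> usize {
    peaks.len()
}

fn main() {
    let mut peaks = vec![100.5, 200.25];

    // Any number of shared references may coexist.
    let first = &peaks;
    let second = &peaks;
    println!("{} {}", count_peaks(first), count_peaks(second));

    // The shared borrows are no longer used, so one mutable borrow is allowed.
    add_peak(&mut peaks, 300.75);
    println!("{}", count_peaks(&peaks));

    // The following would NOT compile: two mutable borrows alive at once.
    // let a = &mut peaks;
    // let b = &mut peaks;
    // a.push(1.0);
    // b.push(2.0);
}
```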
Enum
Short for enumeration, an enum is a custom data type that allows you to define a type by enumerating its possible values
Enums are commonly used to represent a finite set of related possibilities
The 7 dwarves are an enumeration of 7 elements.
Every value of an enum type occupies as much memory as its largest variant needs, plus (in most cases) room for the tag that records which variant is stored.
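A small sketch of the memory remark, using a hypothetical Measurement enum. The exact size depends on the compiler's layout decisions, so the example only prints the sizes rather than assuming specific numbers.

```rust
use std::mem::size_of;

/// Hypothetical enum whose variants carry payloads of different sizes.
#[allow(dead_code)]
enum Measurement {
    Missing,
    Mass(f64),
    Window([f64; 4]),
}

/// `match` must cover every variant, otherwise the code does not compile.
fn describe(measurement: &Measurement) -> &'static str {
    match measurement {
        Measurement::Missing => "no value",
        Measurement::Mass(_) => "a single mass",
        Measurement::Window(_) => "a window of four values",
    }
}

fn main() {
    println!("{}", describe(&Measurement::Mass(445.12)));
    // The enum can never be smaller than its largest payload.
    println!("largest payload: {} bytes", size_of::<[f64; 4]>());
    println!("whole enum: {} bytes", size_of::<Measurement>());
}
```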
Visibility
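Visibility is a modifier for enums, structs, their fields, functions, traits and methods.
It specifies whether code outside the defining module can access such an element: pub makes it public, pub(crate) restricts it to the current crate, and no modifier keeps it private to the module.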
pub enum Option<T> {
None,
Some(T)
}
pub(crate) enum Option<T> {
None,
Some(T)
}
enum Option<T> {
None,
Some(T)
}
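The three levels can be seen side by side in a sketch like the one below. The spectra module and its contents are hypothetical names chosen for illustration.

```rust
mod spectra {
    pub struct Spectrum {
        /// Public field: readable by any code that can see `Spectrum`.
        pub name: String,
        /// Private field: only code inside `spectra` can touch it.
        peaks: Vec<f64>,
    }

    impl Spectrum {
        pub fn new(name: &str) -> Self {
            Spectrum {
                name: name.to_string(),
                peaks: Vec::new(),
            }
        }

        /// Public method: the only way for outside code to add a peak.
        pub fn add_peak(&mut self, mz: f64) -> bool {
            if is_valid_mz(mz) {
                self.peaks.push(mz);
                true
            } else {
                false
            }
        }

        /// Visible anywhere in this crate, but not to other crates.
        pub(crate) fn peak_count(&self) -> usize {
            self.peaks.len()
        }
    }

    /// Private function: usable only inside the `spectra` module.
    fn is_valid_mz(mz: f64) -> bool {
        mz > 0.0
    }
}

fn main() {
    let mut spectrum = spectra::Spectrum::new("example");
    spectrum.add_peak(100.5);
    spectrum.add_peak(-1.0); // rejected by the private check
    println!("{} has {} peak(s)", spectrum.name, spectrum.peak_count());
    // `spectrum.peaks` would not compile here: the field is private.
}
```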
Option
Enum type that is used to represent the presence or absence of a value
E.g. in an MGF, we may or may not have the name of the instrument used for the experiment
pub enum Option<T> {
None,
Some(T)
}
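A sketch of the MGF example, assuming a hypothetical, heavily simplified MgfHeader struct. It uses the Option type from the standard library, which does not need to be redefined.

```rust
/// Hypothetical, simplified MGF header.
struct MgfHeader {
    title: String,
    /// The instrument name may be missing from the file.
    instrument: Option<String>,
}

/// Both cases must be handled explicitly: there is no null to forget about.
fn instrument_label(header: &MgfHeader) -> String {
    match &header.instrument {
        Some(name) => format!("{}: {}", header.title, name),
        None => format!("{}: unknown instrument", header.title),
    }
}

fn main() {
    let with_instrument = MgfHeader {
        title: String::from("scan 1"),
        instrument: Some(String::from("example instrument")),
    };
    let without_instrument = MgfHeader {
        title: String::from("scan 2"),
        instrument: None,
    };
    println!("{}", instrument_label(&with_instrument));
    println!("{}", instrument_label(&without_instrument));
}
```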
Result
Enum type that is used to represent the outcome of uncertain operations
E.g. a file may be presented as an MGF, but it is actually a CSV. Reading such a file as an MGF will cause an error, but that outcome is unknown at compile time, so we need to be prepared for both outcomes.
pub enum Result<T, E> {
Ok(T),
Err(E)
}
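A sketch of how a caller deals with both outcomes, assuming a hypothetical parse_pepmass function that reads a single PEPMASS line. It is not a real MGF parser, only an illustration of Result and the ? operator.

```rust
/// Hypothetical helper: extracts the mass from a line such as "PEPMASS=445.12".
fn parse_pepmass(line: &str) -> Result<f64, String> {
    // `?` returns early with the error if the prefix is missing.
    let value = line
        .strip_prefix("PEPMASS=")
        .ok_or_else(|| format!("not a PEPMASS line: {}", line))?;
    value
        .trim()
        .parse::<f64>()
        .map_err(|error| format!("invalid mass {:?}: {}", value, error))
}

fn main() {
    for line in ["PEPMASS=445.12", "445.12,1000", "PEPMASS=heavy"] {
        match parse_pepmass(line) {
            Ok(mass) => println!("mass = {}", mass),
            Err(message) => println!("error: {}", message),
        }
    }
}
```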
Panic
When an unrecoverable error or, generally speaking, an illegal state is reached, a panic is raised.
This is similar to an uncaught Python Exception: by default, the panic unwinds the stack and terminates the current thread.
panic!("This should have worked!");
todo!("This has to be done!");
unreachable!("This cannot happen!");
Generics
The T parameter in the Option enum is a generic, i.e. a placeholder for an arbitrary type.
You can use generics to compose very complex structs from simple elements.
pub enum Option<T> {
None,
Some(T)
}
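The same idea applies to user-defined types. Below is a sketch with a hypothetical Pair struct: the trait bound T: PartialOrd restricts the generic to types that can be compared.

```rust
/// Hypothetical generic struct: both fields share the same arbitrary type T.
struct Pair<T> {
    left: T,
    right: T,
}

// The method is available only for types that can be compared.
impl<T: PartialOrd> Pair<T> {
    fn largest(self) -> T {
        if self.left >= self.right {
            self.left
        } else {
            self.right
        }
    }
}

fn main() {
    // The same code works for integers, string slices and nested generics.
    println!("{}", Pair { left: 3, right: 7 }.largest());
    println!("{}", Pair { left: "MGF", right: "CSV" }.largest());
    let nested = Pair { left: Some(2.5), right: None };
    println!("{:?}", nested.largest());
}
```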
Strongly typed language
The Rust compiler ensures that operations and assignments are performed only on values of compatible types, and it rejects any attempts to mix incompatible types.
This approach helps catch potential errors early in the development process, promoting code safety and reliability.
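A short sketch of what this means in practice, with hypothetical function names: mixing u8 and u32 requires an explicit conversion, and a conversion that may lose information has to handle the failure case.

```rust
use std::convert::TryFrom;

/// Widening is always safe, but it still has to be written explicitly.
fn widen_and_add(small: u8, large: u32) -> u32 {
    // `small + large` would not compile: u8 and u32 are different types.
    u32::from(small) + large
}

/// Narrowing can fail, so the conversion returns None when the value does not fit.
fn narrow(value: u32) -> Option<u8> {
    u8::try_from(value).ok()
}

fn main() {
    println!("{}", widen_and_add(200, 1000));
    println!("{:?}", narrow(255));
    println!("{:?}", narrow(256));
}
```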
The base Rust type: Structs
Short for "structure", a struct is a custom data type that allows you to group different variables of various types under a single name
It's a way to create a compound type that represents an entity with multiple attributes or fields
Structs are used to model and organize data in a more structured manner
struct Person {
first_name: String,
last_name: String,
age: u32,
is_adult: bool,
}
Numbers
A number in a strongly typed system needs to have a specific representation.
A u8 will be an integer from 0 to 255 that requires 8 bits of memory.
An i32 will be a signed integer that represents values from -2^31 to 2^31 - 1 using 32 bits of memory.
An f32 is a signed float (there is no such thing as an unsigned float in Rust) following the IEEE 754 format which uses 32 bits
An f64 is a signed float that uses 64 bits
struct Person {
seconds_since_birth: u64,
height: f32, // half the size of an f64
age: u8,
is_adult: bool,
}
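The ranges and sizes can be checked directly with the constants from the standard library. The next_age helper is a hypothetical example of what happens at the edge of a type's range.

```rust
use std::mem::size_of;

/// Hypothetical helper: adding 1 to a u8 fails once 255 is reached.
fn next_age(age: u8) -> Option<u8> {
    age.checked_add(1)
}

fn main() {
    println!("u8 range: {}..={}", u8::MIN, u8::MAX);
    println!("i32 range: {}..={}", i32::MIN, i32::MAX);
    println!(
        "f32 uses {} bytes, f64 uses {} bytes",
        size_of::<f32>(),
        size_of::<f64>()
    );
    println!("{:?}", next_age(254));
    println!("{:?}", next_age(255));
}
```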
Traits
Similar to an interface, a trait is a language construct that defines a set of methods that can be implemented by types.
Traits enable code reuse and polymorphism by allowing different types to share common behavior without relying on inheritance.
Types can implement traits to provide a standardized interface, promoting modular and flexible code design.
trait PersonInfo {
// Method to get the full name of a person
fn full_name(&self) -> String;
// Method to get the age category
fn age_category(&self) -> &'static str;
}
Implementing a trait for a struct
An example implementation of the PersonInfo trait for the Person struct
impl PersonInfo for Person {
fn full_name(&self) -> String {
format!("{} {}", self.first_name, self.last_name)
}
fn age_category(&self) -> &'static str {
if self.age < 18 {
"Young"
} else if self.age < 65 {
"Adult"
} else {
"Senior"
}
}
}
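To illustrate the polymorphism without inheritance, here is a separate, self-contained sketch. The Describe trait and the two structs are hypothetical. One generic function accepts any type that implements the trait, and a default method provides shared behavior.

```rust
trait Describe {
    fn name(&self) -> String;

    /// Default method: shared by every implementor unless it is overridden.
    fn describe(&self) -> String {
        format!("<{}>", self.name())
    }
}

struct Instrument {
    model: String,
}

struct Sample {
    id: u32,
}

impl Describe for Instrument {
    fn name(&self) -> String {
        self.model.clone()
    }
}

impl Describe for Sample {
    fn name(&self) -> String {
        format!("sample {}", self.id)
    }
}

/// Works for any type implementing Describe; the two structs are unrelated.
fn announce<T: Describe>(item: &T) -> String {
    format!("Found {}", item.describe())
}

fn main() {
    let instrument = Instrument {
        model: String::from("example-1"),
    };
    println!("{}", announce(&instrument));
    println!("{}", announce(&Sample { id: 7 }));
}
```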
Macros
Define reusable and flexible code snippets that can be invoked in various contexts. Metaprogramming: generally, you will not want to write macros.
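For reference, a declarative macro looks like the sketch below. The sum_all name is hypothetical. The pattern accepts any number of comma-separated expressions, which a regular function cannot do.

```rust
/// Hypothetical macro: sums any number of comma-separated expressions.
macro_rules! sum_all {
    // `$(...),*` matches zero or more expressions, `$(,)?` allows a trailing comma.
    ($($value:expr),* $(,)?) => {
        0 $(+ $value)*
    };
}

fn main() {
    println!("{}", sum_all!(1, 2, 3));
    println!("{}", sum_all!());
}
```

The expansion happens at compile time: sum_all!(1, 2, 3) becomes the expression 0 + 1 + 2 + 3.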
Safety
In some cases, you may want to run operations on memory that safe Rust does not allow, because the compiler can no longer guarantee the absence of data races and null pointers.
In these cases, you can use the 'unsafe' keyword to mark a method, or a part of a function, that executes memory-unsafe code.
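impl Person {
// WARNING: turning a shared reference into a mutable one is undefined behaviour.
// It is shown here only to illustrate the unsafe syntax.
unsafe fn to_mutable(&self) -> &mut Self {
core::mem::transmute_copy(&self)
}
// A safe method that wraps the unsafe call in an unsafe block
fn to_mutable2(&self) -> &mut Self {
unsafe {self.to_mutable()}
}
fn age_category(&self) -> &'static str {
if self.age < 18 {
"Young"
} else if self.age < 65 {
"Adult"
} else {
"Senior"
}
}
}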
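A common convention, sketched here with hypothetical function names, is to keep the unsafe part small and to expose it through a safe wrapper that checks the required invariant first.

```rust
/// Reads an element without bounds checking.
///
/// # Safety
/// The caller must guarantee that `index < values.len()`.
unsafe fn peak_unchecked(values: &[f64], index: usize) -> f64 {
    // SAFETY: guaranteed by the caller, see the Safety section above.
    unsafe { *values.get_unchecked(index) }
}

/// Safe wrapper: it checks the invariant before entering unsafe code.
fn first_peak(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        None
    } else {
        // SAFETY: the slice is not empty, so index 0 is in bounds.
        Some(unsafe { peak_unchecked(values, 0) })
    }
}

fn main() {
    println!("{:?}", first_peak(&[100.5, 200.25]));
    println!("{:?}", first_peak(&[]));
}
```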
ASAN
ASAN stands for AddressSanitizer, and it is a memory error detector tool commonly used in software development to catch memory-related errors such as out-of-bounds accesses, use-after-free, and memory leaks.
ASAN instruments the code at compile time and reports memory safety issues at run time, when the instrumented program is executed. The -Z flags in the command below require a nightly toolchain.
Pray you don’t ever need this.
RUSTFLAGS=-Zsanitizer=address cargo test -Zbuild-std --target x86_64-unknown-linux-gnu
Docstrings
Rust methods and structs can and should be documented.
Docstrings allow for inclusion of Safety sections regarding memory safety, Arguments and testable Examples
To generate the project documentation, use:
cargo doc
To run the project's doctests, use:
cargo test --doc
impl PersonInfo for Person {
/// Returns the age category.
///
/// # Examples
///
/// ```
/// // We import the crate.
/// use person::prelude::*;
/// // We instantiate the person.
/// let person = Person::new(32);
/// assert_eq!(person.age_category(), "Adult");
/// ```
fn age_category(&self) -> &'static str {
if self.age < 18 {
"Young"
} else if self.age < 65 {
"Adult"
} else {
"Senior"
}
}
}
Release
We mentioned earlier that the Rust compiler, and more specifically LLVM, optimizes the code. Most of these optimizations are not applied by default: they are enabled only in release mode.
To build or run your code in release mode, add the --release flag. For instance, to run the tests in release mode you can use:
cargo test --release
NoCapture
There are instances where, while a test is passing, you still want to see the output.
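In these cases, you can pass the --nocapture flag. The presence of -- twice is NOT an error: the first, standalone -- separates the arguments meant for cargo from those passed to the test binary.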
cargo test -- --nocapture
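As an illustration, the println! in the hypothetical test below is hidden when the test passes, and becomes visible only when the --nocapture flag is given.

```rust
/// Hypothetical function under test.
fn double(value: i32) -> i32 {
    value * 2
}

#[cfg(test)]
mod capture_demo {
    use super::*;

    #[test]
    fn doubles_the_input() {
        let result = double(21);
        // Captured (hidden) for passing tests unless --nocapture is used.
        println!("double(21) = {}", result);
        assert_eq!(result, 42);
    }
}

fn main() {
    println!("{}", double(21));
}
```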
Iterator
The Iterator trait is a fundamental trait that provides a sequence of elements and supports iteration over them
pub trait Iterator {
type Item;
fn next(&mut self) -> Option<Self::Item>;
// Other iterator methods...
}
vector.iter();
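Implementing the trait for your own type only requires the next method. Below is a sketch with a hypothetical Countdown struct. All the other iterator methods, such as map, zip or collect, then come for free.

```rust
/// Hypothetical iterator that counts down to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    /// Returns Some(value) while there are elements left, then None forever.
    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

fn main() {
    // A for loop calls next() until it returns None.
    for value in (Countdown { remaining: 3 }) {
        println!("{}", value);
    }
    // Provided methods work on the custom iterator as well.
    let total: u32 = Countdown { remaining: 4 }.sum();
    println!("total = {}", total);
}
```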
Zip
The zip method in Iterator trait is used to combine two iterators into a single iterator of pairs. It stops when either of the original iterators is exhausted.
let numbers = vec![1, 2, 3, 4];
let words = vec!["one", "two", "three", "four"];
// Using zip to combine two iterators into pairs
let zipped: Vec<_> = numbers.iter().zip(words.iter()).collect();
// Printing the zipped pairs
for (num, word) in zipped {
println!("Number: {}, Word: {}", num, word);
}
Chain
The chain method in Iterator trait is used to concatenate two iterators together, producing a new iterator that yields elements from the first iterator until it is exhausted and then continues with the second iterator.
let numbers = vec![1, 2, 3];
let more_numbers = vec![4, 5, 6];
// Using chain to concatenate two iterators
let combined: Vec<_> = numbers.iter().chain(more_numbers.iter()).collect();
// Printing the combined elements
for &num in combined.iter() {
println!("Number: {}", num);
}
Map
The map method in Iterator trait is used to transform each element of an iterator into a new value by applying a specified function
let numbers = vec![1, 2, 3, 4, 5];
// Using map to square each element in the iterator
let squared_numbers: Vec<_> = numbers.iter().map(|&x| x * x).collect();
// Printing the squared numbers
for &num in squared_numbers.iter() {
println!("Squared Number: {}", num);
}
FlatMap
The flat_map method in Iterator trait is used to both map and flatten a nested structure. It applies a transformation function to each element, and the result is flattened into a single iterator.
let nested_numbers: Vec<Vec<i32>> = vec![vec![1, 2, 3], vec![4, 5, 6], vec![7, 8, 9]];
// Using flat_map to flatten the nested structure and double each element
let flattened_and_doubled: Vec<_> = nested_numbers
.iter()
.flat_map(|inner_vec| inner_vec.iter().map(|&x| x * 2))
.collect();
// Printing the flattened and doubled numbers
for &num in flattened_and_doubled.iter() {
println!("Doubled Number: {}", num);
}
For Each
The for_each method in the Iterator trait is used to apply a function to each element of the iterator. Unlike map or other iterator methods, for_each is intended for side effects and doesn't produce a new iterator.
let numbers = vec![1, 2, 3, 4, 5];
// Using for_each to print each element
numbers.iter().for_each(|&num| {
println!("Number: {}", num);
});
// Using for_each to double each element in-place
let mut doubled_numbers = numbers.clone(); // cloning for demonstration purposes
doubled_numbers.iter_mut().for_each(|num| {
*num *= 2;
});
// Printing the doubled numbers
for &num in doubled_numbers.iter() {
println!("Doubled Number: {}", num);
}
Clippy
Clippy is a code linter for the Rust language.
You can run it by using:
cargo clippy
To only check that the code compiles, without the additional lints, use:
cargo check
Inlining directives
In general, #[inline] is a suggestion to the compiler, while #[inline(always)] is a stronger directive. Developers should use them based on performance requirements and conduct profiling to understand the impact on code size and execution speed. The decision to use these directives should be made with careful consideration of the trade-offs involved.
// Example function without inline directive
fn multiply_numbers(a: i32, b: i32) -> i32 {
a * b
}
// Example function with #[inline] directive
#[inline]
fn add_numbers(a: i32, b: i32) -> i32 {
a + b
}
// Example function with #[inline(always)] directive
#[inline(always)]
fn subtract_numbers(a: i32, b: i32) -> i32 {
a - b
}
Box
The Box type is a heap-allocated, owned smart pointer. It is used to allocate memory on the heap and store data there, providing ownership semantics.
While Box is useful for scenarios where ownership and heap allocation are required, it can introduce performance overhead compared to stack-allocated types.
#[derive(Debug)]
enum LinkedList {
Node { value: i32, next: Option<Box<LinkedList>> },
Empty,
}
fn add_node(value: i32, next: LinkedList) -> LinkedList {
LinkedList::Node {
value,
next: Some(Box::new(next)),
}
}
fn main() {
let linked_list = add_node(1, add_node(2, add_node(3, LinkedList::Empty)));
println!("{:?}", linked_list);
}
Phantom Data
PhantomData is a marker type provided by the standard library that allows you to indicate ownership of a particular type parameter in a generic context without actually storing any values of that type.
It is often used to convey information to the Rust compiler for enforcing certain invariants or ownership relationships at compile time.
The PhantomData<T> type is a zero-sized type, meaning it doesn't consume any memory at runtime. It is primarily used as a tool to express lifetimes, ownership, or other relationships between generic types.
use std::marker::PhantomData;
// A resource type representing some external resource (e.g., a file, network connection, etc.)
struct Resource {
// Details of the resource...
id: usize,
}
// A generic Handle type associated with a specific resource type
struct Handle<T> {
resource_id: usize,
phantom: PhantomData<T>,
}
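One possible use, sketched with hypothetical marker types: the type parameter makes two otherwise identical identifiers incompatible at compile time, at no memory cost.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical marker types: they are never instantiated.
#[allow(dead_code)]
struct Spectrum;
#[allow(dead_code)]
struct Sample;

/// An identifier tagged with the kind of thing it identifies.
struct Id<T> {
    raw: usize,
    marker: PhantomData<T>,
}

impl<T> Id<T> {
    fn new(raw: usize) -> Self {
        Id {
            raw,
            marker: PhantomData,
        }
    }
}

/// Accepts spectrum identifiers only.
fn load_spectrum(id: Id<Spectrum>) -> String {
    format!("spectrum #{}", id.raw)
}

fn main() {
    let spectrum_id: Id<Spectrum> = Id::new(7);
    println!("{}", load_spectrum(spectrum_id));

    // The following would NOT compile: expected Id<Spectrum>, found Id<Sample>.
    // let sample_id: Id<Sample> = Id::new(7);
    // load_spectrum(sample_id);

    // The marker adds no bytes to the struct.
    println!("{} vs {}", size_of::<Id<Sample>>(), size_of::<usize>());
}
```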
Builder pattern
The objective of the Builder pattern is to construct complex objects with optional parameters incrementally, improving readability and maintainability and ensuring the internal consistency of the resulting data structure.
impl CarBuilder {
fn year(mut self, year: u16) -> Result<Self, String> {
if year < 1886 {
return Err("Cars didn't exist before 1886".to_string());
}
self.year = Some(year);
Ok(self)
}
fn brand(mut self, brand: Brands) -> Self {
self.brand = Some(brand);
self
}
fn build(self) -> Result<Car, String> {
let year = self.year.ok_or("Year is required")?;
let brand = self.brand.ok_or("Brand is required")?;
Ok(Car { year, brand })
}
}
fn main() {
let car = CarBuilder::default()
.year(2021).unwrap()
.brand(Brands::Tesla)
.build()
.unwrap();
}
struct Car {
year: u16,
brand: Brands,
}
enum Brands {
Toyota,
Tesla,
}
#[derive(Default)]
struct CarBuilder {
year: Option<u16>,
brand: Option<Brands>,
}
Tools Lexicon
Web Framework
A web framework is a software framework designed to aid the development of web applications, providing a structured and standardized way to build, deploy, and manage web-based software.
Web frameworks offer a set of pre-built components, tools, and patterns that simplify common tasks, such as handling HTTP requests, managing databases, and structuring the overall application architecture.
In Python, Django is a web framework.
Frontend framework
A frontend framework, such as Yew, is a set of tools, libraries, and conventions designed to simplify and accelerate the development of the user interface (UI) or frontend portion of web applications.
Frontend frameworks provide a structured and organized approach to building interactive and visually appealing user interfaces.
Templating framework
A templating framework, such as Jinja, is a tool used in web development to separate the presentation layer from the application logic. Templating frameworks provide a way to dynamically generate HTML, XML, or other markup languages by embedding variables and control structures within templates.
Web Assembly (WASM)
WebAssembly, often abbreviated as wasm, is a binary instruction format designed as a portable compilation target for high-level programming languages.
It is a low-level, efficient, and platform-independent technology that enables the execution of code written in languages other than JavaScript in web browsers.
With Rust, we can easily write code that runs safely in the browser.
Web Socket
A WebSocket is a communication protocol that provides full-duplex communication channels over a single, long-lived connection.
It enables bidirectional communication between a client (typically a web browser) and a server, allowing both to send messages independently at any time.
With WASM, it will be rather easy to use this technology.
Object Relational Mapping (ORM)
Object-Relational Mapping (ORM) is a programming technique and a software design pattern that facilitates the interaction between a relational database and an object-oriented programming language.
The primary goal of ORM is to bridge the gap between the object-oriented model used in programming languages and the relational model used in databases, making it easier to work with databases using programming language constructs.
Library bindings
Process of creating a connection or interface between a programming language and an external library.
External libraries are pre-compiled sets of code that provide specific functionalities or resources that can be utilized by applications written in various programming languages.
If at all possible, let’s avoid bindings.
OAuth2
The OAuth 2.0 authorization framework enables a third-party application to obtain limited access to an HTTP service, either on behalf of a resource owner by orchestrating an approval interaction between the resource owner and the HTTP service, or by allowing the third-party application to obtain access on its own behalf.
We use OAuth to avoid storing sensitive user information, in line with the GDPR.
SMTP mail client
SMTP is a communication protocol used to transmit electronic mail (email) over the Internet. It is a set of rules governing the exchange of messages between email servers, allowing for the sending, relaying, and receiving of emails.
We need a mail client to notify users.
Auto Reloading
When you develop a website, you want to be able to see the effect of your edits without having to restart the web server every time.
For this goal, we will use cargo-watch and systemfd, which we can install with:
cargo install systemfd cargo-watch
The command to watch for code edits is:
systemfd --no-pid -s http::5000 -- cargo watch -x run
Tools
What we need
De/Serialization - Serde
Crate: https://crates.io/crates/serde_json
GitHub: Serde JSON, Serde, Serde YAML
Website: https://serde.rs/
For generic data structures we can use epserde (developed by Tommaso Fontana and Sebastiano Vigna)
De/Serialization - Quick XML
Web Frameworks - Rocket
GitHub: https://github.com/rwf2/Rocket
Website: https://rocket.rs/
Tutorial: https://github.com/SergioBenitez/RustLab2023
Issue: not maintained as much as other libraries
Web Framework - Axum
GitHub: https://github.com/tokio-rs/axum
Crate: https://crates.io/crates/axum
Website: https://tokio.rs/
Based on Tokio and Hyper
Web Framework - Actix
GitHub: https://github.com/actix/actix-web
Website: https://actix.rs/
Crate: https://crates.io/crates/actix-web
Tutorial: https://gill.net.in/posts/auth-microservice-rust-actix-web1.0-diesel-complete-tutorial/
Comes equipped with HTTP test framework and Websockets
I believe this is the way to go
Frontend framework - Yew
HTML Validation - Ammonia
ORM - Diesel
GitHub: https://github.com/diesel-rs/diesel
Website: https://diesel.rs/
Tutorial: https://cloudmaker.dev/how-to-create-a-rest-api-in-rust/
Multi language - Gettext
Bindings to Gettext (a GNU project)
Gettext: https://www.gnu.org/software/gettext/
GitHub: https://github.com/Koka/gettext-rs
Crate: https://crates.io/crates/gettext-rs
Avoid mostly because VERY old
Multi Language - Fluent
GitHub: https://github.com/projectfluent/fluent-rs
Crate: https://crates.io/crates/fluent
Website: https://projectfluent.org/
A localization framework designed to unleash the entire expressive power of natural language translations. Available for multiple programming languages.
HTTP Requests - Reqwest
GitHub: https://github.com/seanmonstar/reqwest
Crate: https://crates.io/crates/reqwest
SQL toolkit - SQLx
Authentication - Oauth2-rs
GitHub: https://github.com/ramosbugs/oauth2-rs
Mailing system - Lettre
GitHub: https://github.com/lettre/lettre
Website: https://lettre.rs/
Environment variables - Dotenv
GitHub: https://github.com/dotenv-rs/dotenv
Crate: https://crates.io/crates/dotenv
AVOID: development discontinued
Environment variables - Dotenvy
GitHub: https://github.com/allan2/dotenvy
Crate: https://crates.io/crates/dotenvy
A maintained fork of Dotenv
Logging - log
Crate: https://crates.io/crates/log
GitHub: https://github.com/rust-lang/log
To be used with pretty_env_logger or, alternatively, the more complete fern
Bindings - Rusqlite
GitHub: https://github.com/rusqlite/rusqlite
Rayon
Tool for handling simple thread parallelism
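// The prelude brings into_par_iter into scope
use rayon::prelude::*;
pub fn main() {
let vector1 = vec![0, 4, 5];
let vector2 = vec![0, 4, 5];
// Sequential iterator from the standard library
vector1.into_iter();
// Parallel iterator provided by Rayon
vector2.into_par_iter();
}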
Next up: Project structure & Timeline