Federated Learning for Adaptive Road Efficiency Project Log

March 3rd Shenanigans

To Do list:

(1) Think about network architectures to integrate, what type of protocols to use, etc.
(2) Figure out how to simulate my chosen architecture

(1)

Some thoughts on network architectures for traffic intersections

Thinking about a Content Addressable Network (CAN) for intersection sensors

Given the fact that traffic lights are in a fixed position, we can probably leverage the spatial locality of CANs to manage updates and state information efficiently.

Some research on CAN locality property: Each node is only aware of its neighbours

This simplifies routing, reduces overhead in a controlled environment
Note: traffic control systems require low-latency communication
CANs are scalable and decentralised but routing performance needs to be evaluated under high update frequencies of the learning model given the fact there will be real-time requirements and expectations
Since the nodes (intersection nodes) are static, this gives some stability which plays to the strength of CANs
A foreseeable issue here is if this project is ever expanded to include actual smart vehicles, we might be cooked. Too many things happening at once, too many nodes trying to familiarise their neighbours, etc.
We gotta be able to handle churn btw - this might be a game-ending L

Pivot to a more layered architecture?

Local CAN is okay for just closely spaced intersections (immediate, short-range updates)
Kademlia or Chord have better lookup features and can provide improved performance and fault-tolerance
PIVOT TO KADEMLIA
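Quick sketch of why Kademlia lookups are cheap: distance between two node IDs is just their XOR, and each contact lands in a bucket determined by the highest differing bit. The 8-bit IDs and the function names below are made up for illustration (real Kademlia IDs are 160-bit), so treat this as a toy, not the OverSim implementation.

```python
# Toy sketch of Kademlia's XOR metric; IDs here are 8-bit illustration values.

def xor_distance(a: int, b: int) -> int:
    """Distance between two node IDs is their bitwise XOR, read as an integer."""
    return a ^ b


def bucket_index(own_id: int, other_id: int) -> int:
    """Index of the k-bucket that other_id falls into, seen from own_id.

    Bucket i holds contacts whose distance d satisfies 2**i <= d < 2**(i + 1).
    """
    d = xor_distance(own_id, other_id)
    if d == 0:
        raise ValueError("a node does not store itself in a bucket")
    return d.bit_length() - 1


def closest_nodes(target: int, known_ids: list[int], k: int = 3) -> list[int]:
    """The k known IDs closest to target under the XOR metric."""
    return sorted(known_ids, key=lambda n: xor_distance(n, target))[:k]
```

For example, `closest_nodes(20, [22, 23, 144, 97], k=2)` picks 22 and 23, because their XOR distances to 20 (2 and 3) are far smaller than those of 144 and 97.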

Read Kademlia paper [1]
Also read Emily Martin’s RS2 Kademlia paper (Thank you Emily)

Also want to look into gossip-based protocols

Read the following papers:

Gossip-based Peer Sampling [2]
Gossip-Based Computation of Aggregate Information *Key paper right here* [3]

One more note: with the federated learning integration, the network architecture needs to support efficient aggregation of model updates

Low latency and reliable message delivery
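For my own reference, a minimal sketch of the push-sum averaging idea from [3]: every node keeps a (sum, weight) pair, keeps half of it each round, pushes the other half to one random neighbour, and uses sum/weight as its estimate of the network-wide average. The function name, the dict-based inputs, and the synchronous-rounds loop are my assumptions; a real deployment would be asynchronous and message-based.

```python
import random


def push_sum(values, neighbours, rounds=50, seed=0):
    """Gossip-average `values` with the push-sum scheme.

    values:     dict node -> local number (e.g. one model weight)
    neighbours: dict node -> list of nodes it may gossip with
    Returns a dict node -> estimate of the network-wide average.
    """
    rng = random.Random(seed)
    s = dict(values)                 # running sums
    w = {n: 1.0 for n in values}     # running weights

    for _ in range(rounds):
        inbox_s = {n: 0.0 for n in values}
        inbox_w = {n: 0.0 for n in values}
        for node in values:
            target = rng.choice(neighbours[node])
            # Keep half of (s, w) and push the other half to one random neighbour.
            for dest in (node, target):
                inbox_s[dest] += s[node] / 2
                inbox_w[dest] += w[node] / 2
        s, w = inbox_s, inbox_w

    return {n: s[n] / w[n] for n in values}
```

The total of all sums and the total of all weights never change from round to round, which is why every node's ratio drifts toward the true average without any central aggregator.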

(2)

Looked into OPNET simulation and modelling - too difficult to set up, couldn’t register for Riverbed
Did research on OMNeT++

Looks good potentially
Downloaded/installed - all is well - green across the board
Built sample project - simple node to node packet connection

Now that I have OMNeT++ set up and running on my computer, need a new TO DO

TO DO:

Define objectives / requirements of my project

Objectives/Requirements

Familiarise myself more with OMNeT++ and its frameworks

Find a framework for integrating Kademlia simulations easily (can also use Java extensions) - found OverSim - looks good
Run through the rest of the technical documentation - done, very good stuff here

Figure out how to build traffic/city layout and simulation

OMNeT++ is insanity. Figured out how to import OSM files from openstreetmap.org and have it outputted in the GUI.
Figured out how to dynamically generate “cars” which are rectangles that traverse across found roads
Need to figure out how to have rectangles ‘stop’ at intersections. I briefly read in the documentation of a project that someone built regarding traffic lights and traffic light controllers. Could be the move - check back in later.

Design architecture for actual network simulation

Design the network topology
Communication protocols (want Kademlia/gossip/anything else I think would be good)
Federated learning integration

Plan how learning updates will be aggregated and shared across the network. Probably will use info from [3]

Develop Simulation Model

Create the NED files to define the network nodes and connections
Integrate traffic light control logic
Data exchange based on the P2P protocols
Integration of the actual inflow-outflow equations

Specific Scenarios Simulations (lol SSS 🐍)

Peak (rush hour) vs. normal conditions (initial thought: maybe gather data from City of Victoria?)
Parameter tuning (message frequency, node latency, learning update intervals, etc.)

Testing / Iterative Development and Design

Just keep hammering it until it’s put together I guess

Data Collection and Logging

OMNeT++ has a really good data collection and logging system so I’m not really stressed here

Finally, document everything along with proof of concept for final presentation

March 4th Shenanigans

What I would like to do today:

Figure out traffic light control logic
Figure out how on earth to integrate this into what I have thus far

What we need:

create a simulation model where a dedicated module acts as the traffic light controller, using a state machine to cycle through red, yellow, and green light phases.

Later on, we need to figure out how to modify the duration of phases dynamically with the learning algorithm lol, this project is bananas, what am I doing
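A rough sketch of what that controller could look like, written in Python just to pin down the logic before porting it to an OMNeT++ module. The class name, the phase durations, and the set_duration() hook are all placeholders I made up; the hook is where the learning algorithm would eventually plug in.

```python
from dataclasses import dataclass, field

# Phase order for one signal head.
NEXT_PHASE = {"green": "yellow", "yellow": "red", "red": "green"}


@dataclass
class TrafficLightController:
    # Durations in seconds; placeholder values, not tuned to anything.
    durations: dict = field(
        default_factory=lambda: {"green": 30.0, "yellow": 4.0, "red": 26.0}
    )
    phase: str = "red"
    elapsed: float = 0.0

    def set_duration(self, phase: str, seconds: float) -> None:
        """Hook for the learning algorithm to retune a phase length."""
        if seconds <= 0:
            raise ValueError("phase duration must be positive")
        self.durations[phase] = seconds

    def step(self, dt: float) -> str:
        """Advance the clock by dt seconds and return the current phase."""
        self.elapsed += dt
        # A large dt may cross more than one phase boundary.
        while self.elapsed >= self.durations[self.phase]:
            self.elapsed -= self.durations[self.phase]
            self.phase = NEXT_PHASE[self.phase]
        return self.phase
```

Keeping the durations in a plain dict means a new duration takes effect at the next boundary check without touching the state machine itself.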

Traffic simulators: Simulation of Urban Mobility (SUMO) through Traffic Control Interface (TraCI)

Need to download and understand Veins (open source vehicular network simulation framework)

Trying to get SUMO, TraCI and Veins into the project. Veins builds with the project so that works for now (using this doc)

Having some troubles getting SUMO to cooperate, trying to download additional prerequisite dependencies like xquartz and xerces-c
Okay, not working, so am cloning and building from github repo
Might as well try TraCI as a Service (TraaS), might be worth my time after doing all this

While waiting for SUMO to build, I renamed my project:

Federated Learning for Adaptive Road Efficiency (FLARE)

March 5-8th Shenanigans

No huge updates. Having troubles integrating SUMO into the project.

Reading paper for RS4: Federated Learning through model gossiping in wireless sensor networks

Objective Function [4]

March 11th Shenanigans

Thinking about the mathematics behind determining optimisation through matrices of intersections

1. Can We Find a Standard Matrix for Traffic Flow?

Yes! If we model the inflow and outflow of vehicles at an intersection (or multiple intersections) as a linear system, then we can absolutely find its standard matrix.

2. Why Would This Work?

Traffic flow (in its simplest form) is based on conservation of vehicles—cars don’t just vanish or appear out of nowhere. This is very similar to conservation of mass in physics, which is often modeled linearly.

Each intersection can be treated as a node, and the streets connected to it define the edges (like a directed graph). The number of cars entering and leaving must balance according to:

Total Inflow=Total Outflow

If the relationship between in-flow and out-flow is linear, we can describe the entire system as:

Ax=b where:

A is the traffic flow matrix (the "standard matrix" of this system).
x is the vector of unknowns (e.g., number of cars moving through each road).
b is the known inflow/outflow at different points (like traffic entering from a highway).

3. What Would the Standard Matrix Represent?

The columns of the standard matrix would represent how each individual inflow affects the system.

For a single intersection, we could construct a standard matrix that transforms incoming traffic flows into outgoing traffic flows.

For multiple intersections, we'd get a larger system of equations, and the matrix would describe how traffic from one intersection influences others—kind of like a networked transformation!
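To make this concrete, here is a small worked example. The network and every number in it are invented for illustration (this is not Victoria data): four intersections A, B, C, D on a one-way loop, with x1 to x4 the unknown flows A→B, B→C, C→D, D→A, and known external flows entering and leaving at each corner.

```latex
% Conservation (inflow = outflow) at each intersection, in vehicles per hour.
% External inflows:  A 300, B 100, C 400, D 100 (illustrative values).
% External outflows: A 200, B 250, C 150, D 300 (illustrative values).
\begin{aligned}
A:&\quad x_4 + 300 = x_1 + 200 \\
B:&\quad x_1 + 100 = x_2 + 250 \\
C:&\quad x_2 + 400 = x_3 + 150 \\
D:&\quad x_3 + 100 = x_4 + 300
\end{aligned}
\qquad\Longrightarrow\qquad
\underbrace{\begin{pmatrix}
 1 &  0 &  0 & -1 \\
 1 & -1 &  0 &  0 \\
 0 &  1 & -1 &  0 \\
 0 &  0 &  1 & -1
\end{pmatrix}}_{A}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
=
\underbrace{\begin{pmatrix} 100 \\ 150 \\ -250 \\ 200 \end{pmatrix}}_{b}
% General solution with free parameter t = x_4:
% (x_1, x_2, x_3, x_4) = (t + 100,\; t - 50,\; t + 200,\; t), \quad t \ge 50.
```

The four rows are not independent (A has rank 3), so conservation alone leaves one free parameter. In other words, at least one flow on the loop has to be measured directly before the rest can be pinned down.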

March 18th Shenanigans

What we’re working on today:

Installing omnet++ on my windows machine
Installing SUMO on my windows machine
Researching traffic data for the city of victoria

Found the following here

Issues with the data found for the traffic of city of victoria

What I currently have:

Traffic Volume (Label field) – I have 24-hour vehicle counts, e.g., "450(92)" means 450 vehicles were counted in 1992.
Direction field – It tells you if the traffic was counted in both directions (bi-directional).
SHAPE_Length – Likely the length of the line segment where the count was taken (could be street segments).
Year – The year of the data point.
No explicit intersection information – The data appears linear (road segments), not node-based (no junction/intersection node IDs or coordinates provided).

Why inflow-outflow models are tricky here:

For inflow-outflow systems (aka traffic balance at intersections), you need:

A network structure: intersections (nodes) and roads (edges).
Directional volumes: inflows to and outflows from intersections should be distinct (e.g., traffic moving northbound, southbound, eastbound, westbound at an intersection).
Connectivity data: which segments connect to which intersections (to model conservation of vehicles).

Problems with the data:

No explicit intersection/node data: Without knowing which lines meet at which intersections, you can’t fully close the system.
No split of direction-specific volumes: the "Direction" = "both" means total two-way counts, but you don’t know how much is inbound vs outbound on each segment.
24-hour aggregate: These are daily totals, not time-of-day (morning peak, afternoon peak, etc.), so inflows/outflows may vary heavily depending on time.

What I could potentially do:

Estimate average flows: I could approximate directional flows by assuming a 50/50 split (common heuristic in the absence of directional data).
Partial inflow/outflow balances: If I supplement with GIS data (spatial coordinates of line endpoints) or if I manually map lines to a road network, I might be able to infer where segments connect.
Use assumptions: Some models use assumptions like uniform split at intersections (e.g., equal probability of turning left, right, or going straight) to build rough systems of equations.

What’s missing to fully model:

Intersection geometry and turning movements.
Time-of-day variation (since flows may reverse directions at peak hours).
Directional breakdown of counts (e.g., northbound = X, southbound = Y).

HOLD ON TEAM, THEY HAVE A GEOjson.

Now that we have this we can:

1. Map it all in QGIS:

Need to download QGIS
Import the GeoJSON straight into QGIS and you'll see all your traffic counts spatially.
The Label field can be styled to show traffic volumes, or parsed further (we could split "450(92)" into volume = 450 and year = 1992 if you want).
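A minimal sketch of that label split. The regex assumes every label looks like digits followed by a two-digit year in parentheses, and the century pivot is my own assumption for turning "92" into 1992; both would need checking against the real Label values.

```python
import re

# Matches labels like "450(92)": a count, then a two-digit year in parentheses.
LABEL_RE = re.compile(r"^\s*(\d+)\s*\(\s*(\d{2})\s*\)\s*$")


def parse_label(label: str, pivot: int = 30) -> tuple[int, int]:
    """Split a label such as "450(92)" into (volume, four_digit_year).

    pivot is an assumption: two-digit years above it are read as 19xx,
    the rest as 20xx.
    """
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError(f"unexpected label format: {label!r}")
    volume, yy = int(match.group(1)), int(match.group(2))
    year = 1900 + yy if yy > pivot else 2000 + yy
    return volume, year
```

Raising on anything unexpected (rather than returning None) makes odd labels show up immediately instead of silently becoming missing values.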

2. Create symbology based on counts or year:

We can symbolize traffic volume (thicker lines for higher counts).
Or use color ramps based on the year.

3. Create an interactive map (e.g., Leaflet or Mapbox):

plug GeoJSON into a simple web map.

Here

4. Create inflow-outflow system of equations

For each node (intersection), apply:

Inflow−Outflow=Δ

If we assume steady-state (traffic isn't "piling up" at intersections), set Δ=0 (in = out).

Inflow = sum of all traffic volumes from edges pointing into the node.
Outflow = sum of all traffic volumes leaving the node.
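The balance check above can be sketched as follows. The edge-list format (src, dst, volume) and both function names are assumptions of mine, not something the City of Victoria data provides directly.

```python
from collections import defaultdict


def node_balances(edges):
    """Return {node: inflow - outflow} for directed edges (src, dst, volume)."""
    delta = defaultdict(float)
    for src, dst, volume in edges:
        delta[src] -= volume   # volume leaves src
        delta[dst] += volume   # volume enters dst
    return dict(delta)


def split_two_way(a, b, total):
    """50/50 heuristic: turn one two-way count into two directed edges."""
    return [(a, b, total / 2), (b, a, total / 2)]
```

One thing this makes obvious: if every segment is split 50/50, each node's inflow equals its outflow automatically, so Δ=0 holds trivially and the balance equations carry no information. Directional counts (or turning assumptions) are needed for the system to say anything.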

March 20th Shenanigans

Transferring project over to windows computer

Transferring 5gb of data

Split the zip folder (2.5gb) of project into 4 parts (0.625gb)

$ split -d -a 3 -b 625m openstreetmap.zip openstreetmap.part

Used WeTransfer to temp store zipped partitions and download them on windows machine

TODO:

Figure out traffic control logic in SUMO

How the heck to use SUMO at all tbh

Figure out how to integrate into omnetpp project

Omg massive breakthrough here:

With sumo installation, there’s a python script that you can run in the tools folder (..\Eclipse\Sumo\tools)

$ python osmWebWizard.py

Find location you want to generate a simulation of

here , I wanted Victoria so I found the coordinates of Victoria
Generate
It’ll open the simulation in SUMO for you

You can edit the simulation: Edit > Open sumo config in netedit (or Ctrl + T)

Here are the videos of the traffic simulation of downtown Victoria that I’ve managed to get running.

Now that I have this up and running, I want to pick a specific set of intersections, and draw out a diagram of how the traffic control sensors communicate with others.

I’m also thinking with how much I’ve figured out with SUMO, it might be difficult to create an architecture of learners, but we’ll see. I need to figure out how to import into omnetpp and go from there.

March 21st Shenanigans:

As promised, I drew out a diagram of my thought process for the architecture, which oddly turned out to represent more of a hierarchical but partially meshed network topology, with local cliques and inter-clique connections.

Fig 1

Here's a breakdown of the diagram:

Inside each "circle" (A, B, C, D):

Each "circle" is a fully connected mini-network (like a complete graph of 4 nodes, where each intersection’s node connects directly to the 3 others in its group).
For example, in Circle A:

a-A connects to b-A, c-A, and d-A.

Between the "circles":

Only the a-nodes from each circle connect across circles, forming a kind of ring or square topology.
So a-A connects to a-B and a-D,
a-B connects to a-A and a-C, etc., creating that square linkage between the four quadrants.

Visually:

Think of each "circle" (A, B, C, D) as a local intersection cluster.
The a-nodes act as "gateway nodes" or "parent nodes" connecting to other clusters.

TL;DR:

Each cluster is fully meshed locally (4 nodes tightly connected).
The a-nodes form a square/loop between the clusters (inter-cluster links).
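The same topology as an adjacency map, as a sanity check on the description above. Node names follow the "a-A" convention from the diagram; the function name and the ring order A, B, C, D are my assumptions.

```python
from itertools import combinations

CLUSTERS = ["A", "B", "C", "D"]
MEMBERS = ["a", "b", "c", "d"]


def build_topology():
    """Return an undirected adjacency map {node: set(neighbours)} for Fig 1."""
    adj = {f"{m}-{c}": set() for c in CLUSTERS for m in MEMBERS}

    def link(u, v):
        adj[u].add(v)
        adj[v].add(u)

    # Local clique: every pair inside a cluster is connected.
    for c in CLUSTERS:
        for m1, m2 in combinations(MEMBERS, 2):
            link(f"{m1}-{c}", f"{m2}-{c}")

    # Gateway ring: a-A, a-B, a-C, a-D, then back to a-A.
    for i, c in enumerate(CLUSTERS):
        nxt = CLUSTERS[(i + 1) % len(CLUSTERS)]
        link(f"a-{c}", f"a-{nxt}")

    return adj
```

That gives 16 nodes and 28 links: 6 per clique times 4 cliques, plus 4 ring links. Each a-node has 5 neighbours and every other node has 3.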

Aligning it to Gossip-Based FL:

In a gossip-based FL setting, we want:

Decentralized model sharing: models don’t only travel "up" to a central aggregator but also horizontally and diagonally across peers.
Multi-hop flexibility: if direct neighbors are busy or down, nodes can still communicate via alternative paths, which you’re engineering with this mesh.

Thoughts for further evolution:

Multi-role nodes: Instead of rigid "a = parent/gateway" and "b/c/d = leaf" nodes, make all nodes in the mesh capable of temporarily stepping up as aggregation points. This fits well with randomized pairwise gossiping [5], where any node can influence its neighbors dynamically.
Dynamic α-values tied to node roles:
Maybe introduce a system where nodes dynamically adjust their gossip weight (α) based on:

Traffic volume they observe (urban vs. rural load)
Redundancy level (nodes part of multiple inter-cluster links might have moderated α-values to prevent dominance)
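A sketch of what an α-weighted merge could look like. The blend rule is a plain convex combination and the traffic-based α rule is something I invented to illustrate the idea; neither is taken from [5].

```python
def gossip_merge(own, received, alpha):
    """Blend a neighbour's weights into ours: (1 - alpha) * own + alpha * received."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(own) != len(received):
        raise ValueError("weight vectors must have the same length")
    return [(1.0 - alpha) * o + alpha * r for o, r in zip(own, received)]


def alpha_from_load(own_volume, peer_volume):
    """Hypothetical rule: trust the peer in proportion to the traffic it observed."""
    total = own_volume + peer_volume
    return 0.5 if total == 0 else peer_volume / total
```

With alpha = 0.5 this is ordinary pairwise averaging; pushing alpha toward 0 makes a node stubborn, and pushing it toward 1 makes it copy its neighbour.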

Local vs. Global Gossiping:

Local Gossip happens within the mini-circle (A, B, C, D).
Global Gossip happens over the a-node ring (the inter-cluster links), allowing models to propagate across the wider network.
The key challenge is to balance these two types of communication during gossip rounds.

This is what I’m gonna do: Build an FL network from scratch in OMNeT++ as a standalone proof of concept before integrating SUMO via TraCI. By first focusing on the FL network, I'll be able to concentrate on understanding the key components, algorithms, and communication dynamics of gossip-based federated learning in isolation. Once I have the core FL functionality working in OMNeT++, I can then integrate the mobility model from SUMO to enhance the simulation with realistic movement patterns and flow of traffic.

Example of what gossip protocol would look like:

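// Simple example of gossip message handling (OMNeT++ sketch)
#include <cstring>
#include <omnetpp.h>
using namespace omnetpp;

class GossipNode : public cSimpleModule {
  protected:
    virtual void initialize() override;
    virtual void handleMessage(cMessage *msg) override;
  private:
    void startGossipRound();
};

Define_Module(GossipNode);

void GossipNode::initialize() {
    // Initialize node with model parameters, then schedule the first round
    scheduleAt(simTime() + 1.0, new cMessage("startGossip"));
}

void GossipNode::handleMessage(cMessage *msg) {
    if (msg->isSelfMessage() && strcmp(msg->getName(), "startGossip") == 0) {
        // Start a gossip round, then reuse the timer message for the next round
        startGossipRound();
        scheduleAt(simTime() + 1.0, msg);
    } else {
        // Gossip received from a neighbour: merge it into the local model here
        delete msg;
    }
}

void GossipNode::startGossipRound() {
    // Send one gossip message per neighbour; assumes the NED file declares
    // an output gate vector "out[]" with one gate per neighbour
    for (int i = 0; i < gateSize("out"); i++) {
        send(new cMessage("gossip"), "out", i);
    }
}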

March 25th Shenanigans

GeoJSON is a data format for representing geographic objects and their attributes. It is used to store and exchange geodata such as points, lines and polygons, as well as their associated attributes such as name, description, address, etc. If you want more technical and official explanations, visit the GeoJSON format page. [6]

Began building a linear regression ML model to train on the GeoJSON file. Had to convert to CSV such that I can build with pandas dataframe. Once converted to csv and turned into data frame, did the following:

print(df.isnull().sum())

This printed the following:

OBJECTID              0
Traffic_Volume        1
Direction             1
Segment_Length        0
Year                 15
Start_Coordinates     0
End_Coordinates       0
dtype: int64

Oddly, 15 rows were missing a year, so I simply removed any row that didn’t contain one. Not worth the trouble. For Traffic_Volume, I replaced the one missing value with the median of all traffic volumes, which is probably okay. And for Direction, I filled the missing value with the most frequent one (which is ‘both’, i.e. a two-way street).
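For the record, the three cleaning steps written as a dependency-free sketch (the real work was done on the pandas dataframe; this just spells out the logic). The row format, the function name, and the choice to compute the median after dropping the year-less rows are my assumptions.

```python
from statistics import median, mode


def clean_rows(rows):
    """rows: list of dicts with keys Year, Traffic_Volume, Direction (None = missing)."""
    # 1. Drop rows that have no year (copy the rest so the input stays untouched).
    kept = [dict(r) for r in rows if r["Year"] is not None]

    # 2. Median of the known volumes, used to fill missing volumes.
    volume_fill = median(
        r["Traffic_Volume"] for r in kept if r["Traffic_Volume"] is not None
    )
    # 3. Most frequent known direction, used to fill missing directions.
    direction_fill = mode(r["Direction"] for r in kept if r["Direction"] is not None)

    for r in kept:
        if r["Traffic_Volume"] is None:
            r["Traffic_Volume"] = volume_fill
        if r["Direction"] is None:
            r["Direction"] = direction_fill
    return kept
```

With only one volume and one direction missing, the fill values barely matter here; the step that actually changes the dataset is dropping the 15 rows without a year.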

I tested and trained the model. The results are the following:

Mean Squared Error: 4.233521005310553

R-squared: 0.0028718122151949466

Rule-of-thumb reading of R-squared:

0.7 or higher: Generally considered a good fit, meaning that the model explains a significant portion of the variability in the target variable.

0.5 to 0.7: Indicates moderate explanatory power.

Below 0.5: Poor fit. The model doesn’t explain much of the variance in the data.

And we have 0.003. So we’re cooked chat.

The ideal MSE is 0, meaning no error at all. In practice, good MSE values depend on the scale of the target variable. We have a FOUR. my lord. We might as well have a monkey sitting in the intersection deciding on when to change the light.
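To keep myself honest about what these two numbers mean, here are both metrics written out by hand (standard definitions; the function names are mine). R-squared compares the model against the dumbest possible baseline, which is always predicting the mean.

```python
def mse(y_true, y_pred):
    """Mean of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot: 1 is perfect, 0 matches 'always predict the mean'."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Two things follow from the definitions. A model that does worse than the mean baseline gets a negative R-squared, so negative values are possible and not a bug. And since R-squared equals 1 minus MSE divided by the variance of the target on the same data, an R-squared near 0 means the MSE of about 4.23 is essentially just the variance of the target itself, so the "four" says more about the scale of the target than about the model.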

If linear regression doesn't perform well, we can experiment with more complex models like Decision Trees or Random Forest.

Am attempting a decision tree model tomorrow.

March 26th Shenanigans

Attempted decision tree and got the following:
Decision Tree Mean Squared Error: 4.207823254050241

Decision Tree R-squared: 0.008924445971398631

This sucks. Though I had max_depth=5 so now I’m gonna try max_depth=None

Decision Tree Mean Squared Error: 4.466740230712524

Decision Tree R-squared: -0.05205869676065089

Lol dude.

The tree might be overfitting with max_depth=None, so I’m gonna fix some things. The features Segment_Length, Distance, and Direction might actually have no real relationship with Traffic_Volume.

It would be optimal to have time of day, or even the day, in this training set, because I would have that data in a real-life scenario. It’s a difficult set to use. I’m gonna drop a couple of features that aren’t really useful in predicting traffic volume: direction and distance between intersections aren’t huge game changers, as opposed to year, which is really helpful because obviously the amount of traffic scales each year. If there is a clear trend with the year feature, it might be good to focus on predicting yearly trends instead of short-term fluctuations (even though it is these short-term fluctuations I would prefer to prioritise later in the simulation).

If you wanna see something interesting, this is the diagram of the traffic trends from 1980 to just over 2020. The average traffic volume has significantly DECREASED. Which is insanity.

Okay, this idea obviously isn’t working and this model is learning how to predict traffic volume on some year, which is not really in the scope of this project. I’m going to attempt to find more specific data sets for traffic here.

Since traffic has decreased over time in the data, what external factors could explain this? This is truly staggering information. It makes me wonder if I plotted this correctly. I’m gonna do the following to test this:
Line plot of traffic volume over time.

Rolling average plot to smooth fluctuations.

Box plots to analyze traffic volume distributions by decade.

Correlation heatmap to check relationships with other features

This distribution is truly disgusting. No wonder my models don’t know what to do.

I might have an idea here. Since I already have SUMO and TraCI set up, I could simply collect simulated traffic data from intersections using TraCI and then use that for modeling. Since I’ve already utilised the OSM Web Wizard to generate a detailed traffic simulation scenario based on OpenStreetMap data, I can install the TraCI Python library to establish a communication link between our simulation and external control scripts. I’m going to write a Python script employing TraCI to connect to the running SUMO simulation, collect vehicle count data at specified intersections, and manage simulation steps programmatically.
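A first sketch of that collection script. The traci calls used here (trafficlight.getIDList, trafficlight.getControlledLanes, lane.getLastStepVehicleNumber, simulationStep, start, close) are part of SUMO's Python TraCI client; the function names, the config filename osm.sumocfg, and the choice to count vehicles on the lanes controlled by each traffic light are my assumptions.

```python
def controlled_lanes(conn):
    """Map each traffic-light ID to the distinct lanes it controls."""
    return {
        tls: sorted(set(conn.trafficlight.getControlledLanes(tls)))
        for tls in conn.trafficlight.getIDList()
    }


def collect_counts(conn, steps):
    """Step the simulation and record (step, intersection, vehicle count) rows."""
    lanes = controlled_lanes(conn)
    rows = []
    for step in range(steps):
        conn.simulationStep()
        for tls, tls_lanes in lanes.items():
            count = sum(conn.lane.getLastStepVehicleNumber(l) for l in tls_lanes)
            rows.append((step, tls, count))
    return rows


def run(sumocfg="osm.sumocfg", steps=3600):
    """Start SUMO headless, collect counts, and always close the connection."""
    import traci  # ships with SUMO's tools folder; needs SUMO installed

    traci.start(["sumo", "-c", sumocfg])
    try:
        return collect_counts(traci, steps)
    finally:
        traci.close()
```

Passing the connection in as an argument means the counting logic can be exercised with a stub object, without SUMO running. The set() is there because the controlled-lanes list can name the same lane more than once.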

April 6th Shenanigans

(Also including some undocumented, sporadic shenanigans from after March 26th)

What happened today? I’ve finally managed to successfully build out the diagram from March 21st Shenanigans (Fig 1).

Here is simulation: https://youtu.be/tFbmnrT2hEk

What needs to happen now:

a) Since ‘traffic light’ nodes are sending packets to each other, need to have these nodes computing the learning algorithm, with some type of weight (size of packets to be sent to neighbours)
b) Have nodes receiving packets from neighbour node process the information, update personal model
c) Simulate some ‘traffic’ for nodes to send to each other.
d) Analyse network load, performance of nodes (computing local models/updating models upon retrieval of new model from neighbour)

Each node (GossipNode) will:

Maintain a local model (e.g., a vector of weights: std::vector<double> modelWeights)
Be capable of computing/updating this model
Send a "model update" (packet) to its neighbors

TODO:

Add std::vector<double> modelWeights to GossipNode
Add a timer/self-message in initialize() to trigger local updates
Create a trainLocalModel() method that simulates training (e.g., add small noise to weights)
Create a sendModelUpdate() method to broadcast model weights to neighbors
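Before writing the C++, the TODO items above can be prototyped in a few lines of Python. This is only a logic sketch: the class name, the inbox list standing in for OMNeT++ messages, and the noise level are all made up.

```python
import random


class GossipNodeSketch:
    def __init__(self, node_id, num_weights=4, seed=None):
        self.node_id = node_id
        self.neighbours = []                    # other GossipNodeSketch objects
        self.model_weights = [0.0] * num_weights
        self.inbox = []                         # (sender_id, weights) tuples
        self._rng = random.Random(seed)

    def train_local_model(self, noise=0.01):
        """Stand-in for training: nudge each weight by small Gaussian noise."""
        self.model_weights = [
            w + self._rng.gauss(0.0, noise) for w in self.model_weights
        ]

    def send_model_update(self):
        """Broadcast a copy of the current weights to every neighbour."""
        for peer in self.neighbours:
            peer.inbox.append((self.node_id, list(self.model_weights)))
```

Sending a copy of the weights matters: if the receiver held a reference to the sender's list, later local training would silently change an update that was already "in flight".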

Integrated nodes to host ‘modelWeights’

For analysing the data, I need to think about what attributes I specifically want to look at:

Maybe performance of local training of nodes?
Performance of sending new models

What I’m doing now:

Integrating Weight Gossiping

Added an array of weights to more appropriately simulate a learning model on the nodes (different params represent diff attributes of a model)

Simulation runs well.

Now that the simulation is running smoothly and we’ve got model updates and weight sharing working, here are some directions on what I could explore next:

1. Aggregation Mechanism (Federated Averaging)

Current State: Right now, we're performing local training and sharing model weights.
Next Step: Implement federated averaging (FedAvg). This is a common technique where each node averages the received weights from other nodes and updates its own model accordingly.
Why: This is how most federated learning algorithms converge, and it will allow us to simulate more realistic federated learning behavior.
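A minimal sketch of the averaging step, assuming each update arrives as a (weights, num_samples) pair so that nodes with more local data count for more. FedAvg is usually described with a central server doing this; here each node would apply it to its own weights plus whatever it received, which is my adaptation for the gossip setting. The function name is made up.

```python
def federated_average(updates):
    """updates: list of (weights, num_samples); returns the sample-weighted mean."""
    if not updates:
        raise ValueError("need at least one update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    size = len(updates[0][0])
    if any(len(w) != size for w, _ in updates):
        raise ValueError("all weight vectors must have the same length")
    return [sum(w[i] * n for w, n in updates) / total for i in range(size)]
```

For instance, averaging [1.0, 2.0] with 1 sample and [3.0, 4.0] with 3 samples gives [2.5, 3.5]. If every node reports the same sample count, this reduces to a plain mean.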

Time to add performance metrics to the project (the whole reason we’re here)
How well is the model performing and how efficiently is the federated learning network operating?

Implementing Convergence Rate (track this by comparing the change in the model's weights across time)

After each round of weight updates, calculate the difference between the current weights and the previous weights.
Track the maximum change in weights over time.
Define a threshold below which the model is considered to have converged.
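The three steps above, as a small tracker. The class name and default threshold are placeholders, and the patience parameter (requiring several stable rounds in a row) is an extra I added so that one quiet round does not count as convergence.

```python
class ConvergenceTracker:
    def __init__(self, threshold=1e-3, patience=3):
        self.threshold = threshold   # largest per-weight change still counted as stable
        self.patience = patience     # consecutive stable rounds required
        self.previous = None
        self.history = []            # max |change| per round
        self._stable_rounds = 0

    def update(self, weights):
        """Record one round; return True once the model counts as converged."""
        if self.previous is not None:
            change = max(abs(c - p) for c, p in zip(weights, self.previous))
            self.history.append(change)
            if change < self.threshold:
                self._stable_rounds += 1
            else:
                self._stable_rounds = 0
        self.previous = list(weights)
        return self._stable_rounds >= self.patience
```

The history list doubles as the data series for a convergence plot later on.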

Okay perfect. Here’s a summary of what happened today:
1. Simulated Gossip-based Federated Learning (FL) Setup

Built upon the GossipNode class to introduce a model update mechanism using a custom ModelUpdate message.
Implemented local training and model weight update functionality for each node. Each node now performs local training, adjusts its model weights, and sends out updated weights in a message.

2. Message Handling and Forwarding

Implemented the forwarding of ModelUpdate messages between nodes to simulate gossiping.
Added a forwardModelUpdate() function to ensure the message gets passed on to neighboring nodes.
Introduced custom setWeights() and setSenderId() methods in the ModelUpdate message to handle model weights and sender information.

3. Model Weight Evaluation

Introduced a mechanism for each node to evaluate the model's performance by adding an evaluateModel() function.
The function uses the model weights to evaluate their effectiveness based on the local training process.
Nodes now evaluate their weights during each cycle, depending on incoming updates.

4. Error Handling and Debugging

Addressed and fixed multiple errors related to the compilation and linkage process, including incorrect parameter usage in functions like setWeights().
Ensured proper linkage of .msg and .cc files, eliminating errors related to missing or undefined symbols.
Fixed issues related to setting the correct size for the weight array and correctly referencing and passing model weights in function calls.

5. Tracking Operations for Simulation Complexity

Introduced a method to track the number of operations occurring during the simulation, such as local training, message forwarding, and model evaluations.
Added simple counters to measure the scale and complexity of the simulation based on the number of nodes and events.

6. Makefile and Compilation Issues

Fixed issues in the makefile related to compilation and linking of the necessary files (ModelUpdate_m.h and ModelUpdate_m.cc). This makefile is something nasty btw.
Successfully cleaned and rebuilt the project, ensuring the correct inclusion and generation of necessary files.
Resolved linkage errors by fixing missing symbols for ModelUpdate.

Cried a little cuz everything works as expected.

Okay, what we gotta do now:

// Dummy accuracy evaluator function (replace with actual logic)
double GossipNode::evaluateModel(const std::vector<double>& modelWeights) {
    // Simulated accuracy value (can be replaced with actual evaluation logic)
    return 0.85; // Dummy value for now
}

Is unfortunately just acting as a placeholder. Time to genuinely incorporate a model and its performance. Now’s the time to incorporate the information I obtained via xls from the City of Victoria.

Key Considerations:

Traffic Data Representation: will use traffic data from intersections to model the flow of vehicles and pedestrians. This can be translated into the weights of the model for nodes (vehicles/pedestrians as the features).
Time-Based Data: Traffic flows are time-dependent (e.g., peaks at 8 AM or 4 PM). need to account for time windows (AM Peak, Midday Peak, PM Peak) when evaluating models.
Intersections as Nodes: Each intersection (e.g., "Douglas St & Pandora Ave") can be considered a node in the Federated Learning model, where each node trains locally based on its own traffic data.

can preprocess this traffic data to simulate how the nodes (intersections) will "learn" from the local traffic flow.

After this, can start Integrating with Federated Learning

Now that we have traffic data available, we can use it to evaluate the model at each node. For each time slice (e.g., 8:15 AM - 9:15 AM), we can take the traffic flow data and use it to adjust the model’s weights based on real-world patterns.

Map traffic data to model weights: For each intersection, we could take the sum of all vehicles (or specific movements) as a feature. This could be used as input to the model's weight calculation or evaluation function.

Train models: Use the traffic data for each node (intersection) as the basis for local model updates.

April 7th Shenanigans

Parsed and cleaned real-world traffic data into CSVs per intersection, per direction, per time period
- linear regression model predicting vehicle volume, delay, or congestion score based on:

Time of day
Approach direction
Vehicle classes (cars, trucks, buses)
Pedestrians/bikes
Historical volumes

we have actual locations, so what if we assign each unique location, like Douglas St & Pandora Ave, to one of the nodes! and then 4 of those make up a cluster. We can randomise which node gets what location I guess.

Extracting locations to randomly assign to nodes in the network. Then creating clusters. (randomly)
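A sketch of the random assignment, reusing the "a-A" node naming from Fig 1. The function name, the seed parameter, and the decision to drop leftover locations that do not fill a whole cluster are my assumptions; this is not the actual locationAssignment.py.

```python
import random


def assign_locations(locations, cluster_size=4, seed=None):
    """Shuffle locations and group them into clusters of cluster_size.

    Returns {node_name: location} with node names like "a-A", "b-A", ...
    Leftover locations that do not fill a whole cluster are dropped.
    """
    rng = random.Random(seed)
    shuffled = list(locations)
    rng.shuffle(shuffled)

    members = "abcdefghijklmnopqrstuvwxyz"[:cluster_size]
    assignment = {}
    for start in range(0, len(shuffled) - cluster_size + 1, cluster_size):
        cluster = chr(ord("A") + start // cluster_size)
        chunk = shuffled[start:start + cluster_size]
        for member, location in zip(members, chunk):
            assignment[f"{member}-{cluster}"] = location
    return assignment
```

Fixing the seed makes the assignment reproducible between simulation runs, which helps when comparing results across parameter changes.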

** finger guns **

In the makefile, I want the preprocessSheets.py and locationAssignment.py scripts to run first, and then gossipnode.cc will take that output someway, somehow, to assign locations to the nodes.

I have scripts outputting csv’s for .cc to use :)

Preprocessing Issues:

I ran into issues with file naming and file pathing for preprocessSheets.py, which caused errors when attempting to preprocess the traffic data.
I updated the preprocessSheets.py script to ensure that the processed files are correctly named without the ID prefix (e.g., DouglasStTolmieAve_total_volume_class_breakdown.csv).
I made sure the script cleared the output directory (processed_csvs) before generating new files.

Fixing Makefile Path:

I adjusted the path to preprocessSheets.py in the Makefile to ensure that the correct directory was being used.
Updated the SCRIPTS_DIR variable in the Makefile to point to victoriaTrafficData.

Errors and Debugging:

I fixed the issue of calling preprocessSheets.py incorrectly in the Makefile.
Encountered and fixed errors related to missing traffic data for specific locations, like Fairfield Rd & Vancouver St. I tracked this down to naming mismatches in the CSV file sanitization and adjusted the code accordingly.

Linear Regression Script:

I worked on the linear regression setup and adjusted the logic to load traffic data, train a model using provided weights, and evaluate the model's performance.
Updated the path and file handling to ensure the traffic data files matched the sanitized location names correctly.

General Challenges:

Faced several issues with file paths, directory structures, and data mismatches during preprocessing.
Ultimately, decided to mention these challenges, particularly with Python, in the final presentation.

References

[1] Maymounkov, P., & Mazières, D. (2002). Kademlia: A Peer-to-peer Information System Based on the XOR Metric. New York University. https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf

[2] Jelasity, M., Voulgaris, S., Guerraoui, R., Kermarrec, A.-M., & Van Steen, M. (2007). Gossip-based peer sampling. ACM Transactions on Computer Systems, 25(3). https://www.distributed-systems.net/my-data/papers/2007.tocs.pdf

[3] Kempe, D., Dobra, A., & Gehrke, J. (2003). Gossip-Based Computation of Aggregate Information. Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science (FOCS). IEEE Xplore. https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=1238221&tag=1

[4] Abhishek Jain, Feb 2, 2024. “All about loss functions like MSE, MAE, RMSE etc.” Medium https://medium.com/@abhishekjainindore24/all-about-loss-functions-like-mse-mae-rmse-etc-36596e3802f5

[5] J. S. Mertens, L. Galluccio and G. Morabito, "Federated learning through model gossiping in wireless sensor networks," 2021 IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom), Bucharest, Romania, 2021, pp. 1-6, doi: 10.1109/BlackSeaCom52164.2021.9527886. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9527886&isnumber=9527733

[6] D. Sobolevsky, March 15, 2023. “GeoJSON tutorial for beginners” Medium. https://medium.com/@dmitry.sobolevsky/geojson-tutorial-for-beginners-ce810d3ff169