2 of 30

Motivation

Most carbon-aware computing targets data centers
Battery-powered Internet of Things devices are underexplored
Always-on charging can increase carbon footprint unnecessarily
Edge devices face strict energy, latency, and uptime constraints

Key insight: Charging decisions matter as much as model selection

3 of 30

Problem Statement

Goal: Jointly optimize inference and charging decisions

Choose perception model each task interval
Decide charge vs no-charge each interval
Balance accuracy, latency, uptime, and carbon cost

Constraint: Real-time decisions under limited battery capacity

4 of 30

State of the Art

Most carbon-aware computing research focuses on large-scale data centers
These systems shift flexible workloads to cleaner grid periods
Optimization assumes access to substantial infrastructure such as large batteries, cooling systems, and energy storage

Limitations for edge devices:

Assumes large energy buffers that edge devices do not have
Ignores per-timestep battery feasibility constraints
Does not address strict real-time latency and uptime requirements

5 of 30

Novelty and Core Idea

Oracle-driven framework

Oracle has full knowledge of the future carbon intensity trajectory over a fixed horizon
Oracle solves a finite-horizon Markov Decision Process via dynamic programming
Oracle produces optimal state–action trajectories under fixed system parameters

Key contribution

Train a real-time controller via imitation learning from oracle trajectories
Learned controller operates under partial observability using only real-time signals
No access to future carbon values or explicit carbon forecasting

6 of 30

System Overview

High-level pipeline

System and model parameters
Oracle planning (finite-horizon Markov Decision Process)
Data generation over parameter sweep
Imitation learning training
Deployable controller for real-time decisions

Training & Data Pipeline Diagram

7 of 30

Model Profiler

Model profiling outputs

Baseline power (mW)
Inference power (mW)
Inference latency (s)
Energy per inference (mWh)
Stability across repeated runs

Models profiled

YOLOv10 N, S, M, B, L, X

Stored as JSON model profiles

NOTE: We use only the ‘inference latency’ and ‘energy per inference’ metrics.

Model Accuracy

Normalized_Accuracy = COCO_mAP_50_95_Score / 57.0

Values lie in [0, 1] for all models.
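The normalization above can be sketched as follows. The per-variant mAP values and the names `COCO_MAP_50_95` and `normalized_accuracy` are illustrative assumptions; the real values would come from the stored JSON model profiles.

```python
# Illustrative COCO mAP 50-95 scores per YOLOv10 variant (assumed values;
# substitute the numbers stored in the JSON model profiles).
COCO_MAP_50_95 = {"N": 39.5, "S": 46.7, "M": 51.3, "B": 52.7, "L": 53.3, "X": 54.4}

NORMALIZER = 57.0  # divisor given on the slide


def normalized_accuracy(map_score: float, normalizer: float = NORMALIZER) -> float:
    """Scale a COCO mAP 50-95 score into [0, 1]."""
    if not 0.0 <= map_score <= normalizer:
        raise ValueError("mAP score must lie in [0, normalizer]")
    return map_score / normalizer


# Normalized accuracy for every profiled variant.
normalized = {name: normalized_accuracy(score) for name, score in COCO_MAP_50_95.items()}
```

Because every score is divided by the same constant, the ordering of the variants is preserved and user accuracy requirements can be stated on the same [0, 1] scale.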

8 of 30

System Parameters

Controller parameters
Battery capacity B_max
Task interval Δ
Horizon 𝑇
2024 carbon time series {𝑑_𝑡}, 𝑡 = 0, …, 𝑇 − 1
Model set 𝑀
Energy per model 𝐸(𝑚)
User requirements (𝑢_𝑎𝑐𝑐, 𝑢_𝑙𝑎𝑡)
Reward weights 𝑊, 𝑋, 𝑌, 𝑍

Restriction

Parameters fixed per controller instance

9 of 30

Oracle Controller

Markov Decision Process (MDP) Formulation

State: (𝑡, 𝐵_𝑡)

𝑡: timestep index
𝐵_𝑡: battery level

Action: (𝑚_𝑡, 𝑐_𝑡)

𝑚_𝑡 ∈ 𝑀 ∪ { ∅ }: model selection or no selection
𝑐_𝑡 ∈ { 0, 1 }: charge decision

Transitions:

Deterministic battery dynamics
Battery updated by charging minus inference energy

Feasibility:

Actions are invalid if battery constraints are violated
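A minimal sketch of the deterministic battery transition and feasibility check described above. It assumes that the energy added by one charging interval (`charge_energy`, a hypothetical parameter) is available within the same interval, and that overflow is clipped at B_max rather than treated as invalid; the slides do not specify either choice.

```python
from typing import Optional


def step_battery(
    battery: float,                 # B_t in mWh
    model_energy: Optional[float],  # E(m_t) in mWh, or None when no model runs
    charge: int,                    # c_t in {0, 1}
    charge_energy: float,           # mWh added by one charging interval (assumed)
    capacity: float,                # B_max in mWh
) -> Optional[float]:
    """Return B_{t+1}, or None if the action violates the battery constraint."""
    drawn = model_energy if model_energy is not None else 0.0
    nxt = battery + charge * charge_energy - drawn
    if nxt < 0.0:
        return None  # infeasible: not enough energy for this inference
    return min(nxt, capacity)  # assumed: excess charge is clipped at capacity
```

For example, `step_battery(10.0, 4.0, 0, 2.0, 100.0)` gives `6.0`, while the same action from a battery level of `1.0` returns `None` and would be excluded from the action set.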

10 of 30

Oracle Reward Function

Outcome indicators (per timestep)

Success: model executed and met accuracy and latency requirements
Small miss: model executed but violated requirements
Large miss: no model executed due to infeasibility

Carbon cost

Dirty energy consumed:

Δ𝐷_𝑡 = 𝑐_𝑡 ⋅ 𝑟_𝑐ℎ𝑔 ⋅ Δ ⋅ 𝑑_𝑡

Reward Function:

𝑅_𝑡 = 𝑊 ⋅ 𝑠𝑢𝑐𝑐𝑒𝑠𝑠_𝑡 − 𝑋 ⋅ 𝑠𝑚𝑎𝑙𝑙_𝑚𝑖𝑠𝑠_𝑡 − 𝑌 ⋅ 𝑙𝑎𝑟𝑔𝑒_𝑚𝑖𝑠𝑠_𝑡 − 𝑍 ⋅ Δ𝐷_𝑡

Objective

Maximize cumulative reward over the planning horizon
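The per-timestep reward can be written out as a short sketch. The names `RewardWeights`, `dirty_energy`, and `step_reward` are hypothetical, and `charge_rate` is assumed to be the charging rate 𝑟_𝑐ℎ𝑔 in energy per unit time, which the slides use without defining.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    W: float  # bonus for a success
    X: float  # penalty for a small miss
    Y: float  # penalty for a large miss
    Z: float  # penalty per unit of dirty energy


def dirty_energy(charge: int, charge_rate: float, interval: float, carbon: float) -> float:
    """Delta D_t = c_t * r_chg * Delta * d_t."""
    return charge * charge_rate * interval * carbon


def step_reward(
    weights: RewardWeights,
    ran_model: bool,
    met_requirements: bool,
    charge: int,
    charge_rate: float,
    interval: float,
    carbon: float,
) -> float:
    """Reward for one timestep; exactly one outcome indicator is 1."""
    success = int(ran_model and met_requirements)
    small_miss = int(ran_model and not met_requirements)
    large_miss = int(not ran_model)
    return (
        weights.W * success
        - weights.X * small_miss
        - weights.Y * large_miss
        - weights.Z * dirty_energy(charge, charge_rate, interval, carbon)
    )
```

The three outcome indicators are mutually exclusive, so the carbon term is the only part of the reward that depends on the charge decision.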

11 of 30

Oracle Solution Method

Solution

Finite-horizon dynamic programming (backward induction)

Implementation Details

Discretize battery state
Nearest-neighbor (KNN) lookup maps continuous battery levels onto the discretized grid during rollout
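The solution method can be illustrated with a small backward-induction sketch over a discretized battery grid. This is not the project's implementation: the function names are hypothetical, the lookup is simplified to a single nearest grid point, and the toy transition and reward at the end are invented for illustration. It also assumes at least one action (for example, idle without charging) is feasible in every state.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Action = Tuple[Optional[int], int]  # (model index or None, charge flag)


def nearest_index(grid: Sequence[float], value: float) -> int:
    """Index of the grid point closest to value (a 1-nearest-neighbor lookup)."""
    return min(range(len(grid)), key=lambda i: abs(grid[i] - value))


def backward_induction(
    grid: Sequence[float],
    horizon: int,
    actions: Sequence[Action],
    transition: Callable[[float, Action], Optional[float]],
    reward: Callable[[int, float, Action], float],
) -> Tuple[List[List[float]], List[List[Optional[Action]]]]:
    """Return value[t][i] and policy[t][i] for every timestep and grid level."""
    n = len(grid)
    value = [[0.0] * n for _ in range(horizon + 1)]  # terminal value is 0
    policy: List[List[Optional[Action]]] = [[None] * n for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):  # sweep backward from the last timestep
        for i, level in enumerate(grid):
            best_q, best_a = float("-inf"), None
            for action in actions:
                nxt = transition(level, action)
                if nxt is None:  # infeasible action, skip it
                    continue
                q = reward(t, level, action) + value[t + 1][nearest_index(grid, nxt)]
                if q > best_q:
                    best_q, best_a = q, action
            value[t][i], policy[t][i] = best_q, best_a
    return value, policy


# Toy problem (invented): one model costing 1 unit, charging adds 1 unit,
# capacity 2 units, reward 1 per inference and -0.1 per charge.
def toy_transition(level: float, action: Action) -> Optional[float]:
    model, charge = action
    nxt = level + charge * 1.0 - (1.0 if model is not None else 0.0)
    return None if nxt < 0.0 else min(nxt, 2.0)


def toy_reward(t: int, level: float, action: Action) -> float:
    model, charge = action
    return (1.0 if model is not None else 0.0) - 0.1 * charge


toy_value, toy_policy = backward_induction(
    [0.0, 1.0, 2.0], 2, [(None, 0), (None, 1), (0, 0)], toy_transition, toy_reward
)
```

In the toy problem, an empty battery at the first timestep leads the planner to charge so that an inference becomes affordable at the next step, which is the kind of horizon-aware behavior the oracle is meant to demonstrate.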

12 of 30

Custom Controller

Why a Partially Observable Markov Decision Process (POMDP)

No access to future carbon trajectory

Observation

Battery level 𝐵_𝑡
Current carbon intensity 𝑑_𝑡
Carbon change Δ𝑑_𝑡 = 𝑑_𝑡 − 𝑑_{𝑡 − 1}

Interpretation

Trend feature supports short-term carbon inference
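A sketch of how the three observation features could be assembled from logged battery levels and carbon intensities. The function name is hypothetical, and setting the trend to zero at the first timestep (where no previous value exists) is an assumption.

```python
from typing import List, Sequence, Tuple

Observation = Tuple[float, float, float]  # (B_t, d_t, delta d_t)


def build_observations(
    battery: Sequence[float], carbon: Sequence[float]
) -> List[Observation]:
    """Pair each battery level with the current carbon value and its one-step change."""
    if len(battery) != len(carbon):
        raise ValueError("battery and carbon series must have the same length")
    observations: List[Observation] = []
    for t, (level, intensity) in enumerate(zip(battery, carbon)):
        # Assumed: no trend information is available at t = 0.
        delta = intensity - carbon[t - 1] if t > 0 else 0.0
        observations.append((level, intensity, delta))
    return observations
```

Only values available at or before timestep 𝑡 enter each observation, so the controller never sees the future carbon trajectory.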

13 of 30

Imitation Learning

Training data

Oracle demonstrations: ( 𝑜_𝑡 → 𝑎_𝑡^∗ )

Learning objective

Supervised classification over discrete action space
Cross-entropy loss between predicted action and oracle action

Result

Real-time controller that approximates oracle behavior from partial observations
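To make the classification view concrete, the sketch below flattens the joint action (𝑚_𝑡, 𝑐_𝑡) into a single class index and computes the cross-entropy loss for one example. The encoding scheme and function names are assumptions, not the project's actual code; a PyTorch training loop would typically use `torch.nn.CrossEntropyLoss` on the same class indices.

```python
import math
from typing import Optional, Sequence, Tuple


def encode_action(model: Optional[int], charge: int, num_models: int) -> int:
    """Map (m_t, c_t) to a class id in [0, 2 * (num_models + 1))."""
    model_slot = num_models if model is None else model  # last slot means "no model"
    return 2 * model_slot + charge


def decode_action(class_id: int, num_models: int) -> Tuple[Optional[int], int]:
    """Inverse of encode_action."""
    model_slot, charge = divmod(class_id, 2)
    return (None if model_slot == num_models else model_slot), charge


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Negative log-probability of the oracle action under softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target]
```

With six YOLOv10 variants plus the no-model option and a binary charge flag, this encoding yields 14 classes.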

14 of 30

Naive Baseline

Baseline Policy

Choose lowest-energy model that meets ( 𝑢_𝑎𝑐𝑐, 𝑢_𝑙𝑎𝑡 )
If infeasible, charge and choose lowest-energy model regardless of requirements
If no model feasible, run no model and charge
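One possible reading of the three baseline rules, as a sketch. It assumes "infeasible" means the battery cannot cover any compliant model without charging, and that the energy from one charging interval (`charge_energy`, hypothetical) can be spent in the same interval; the slides leave both points open.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class ModelProfile:
    name: str
    energy: float    # mWh per inference
    latency: float   # seconds per inference
    accuracy: float  # normalized accuracy in [0, 1]


def naive_action(
    models: Sequence[ModelProfile],
    battery: float,
    charge_energy: float,
    u_acc: float,
    u_lat: float,
) -> Tuple[Optional[str], int]:
    """Return (model name or None, charge flag) for the current interval."""
    by_energy = sorted(models, key=lambda m: m.energy)
    # Rule 1: cheapest model that meets both requirements and fits the battery.
    for m in by_energy:
        if m.accuracy >= u_acc and m.latency <= u_lat and m.energy <= battery:
            return m.name, 0
    # Rule 2: charge and run the cheapest affordable model, ignoring requirements.
    for m in by_energy:
        if m.energy <= battery + charge_energy:
            return m.name, 1
    # Rule 3: nothing is affordable, so idle and charge.
    return None, 1
```

The policy looks only at the current battery level and never at carbon intensity, which is what makes it a myopic point of comparison.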

Purpose

Myopic baseline for comparison

15 of 30

Evaluation Metrics

Primary metrics

Accuracy compliance rate
Success, small miss, large miss counts
Failure rate
Utility gap vs oracle

Feasibility-Normalized Effective Uptime

Score each timestep by 𝑎(𝑚_𝑡) / 𝑎_𝑡^∗ if a model runs, else 0
Report average over the horizon
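The metric can be sketched as below. The slides do not define 𝑎_𝑡^∗; this sketch assumes it is a per-timestep reference accuracy (for instance, the accuracy of the best feasible model at that timestep), and the function name is hypothetical.

```python
from typing import Optional, Sequence


def effective_uptime(
    chosen_accuracy: Sequence[Optional[float]],  # a(m_t), or None when no model ran
    reference_accuracy: Sequence[float],         # a_t^* (assumed reference accuracy)
) -> float:
    """Average of a(m_t) / a_t^* over the horizon, scoring idle timesteps as 0."""
    if len(chosen_accuracy) != len(reference_accuracy):
        raise ValueError("series must have the same length")
    if not chosen_accuracy:
        raise ValueError("horizon must contain at least one timestep")
    total = 0.0
    for chosen, reference in zip(chosen_accuracy, reference_accuracy):
        if chosen is not None and reference > 0.0:
            total += chosen / reference
    return total / len(chosen_accuracy)
```

A controller that runs the reference model at every timestep scores 1.0, and each idle timestep pulls the average down by 1/𝑇.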

16 of 30

Evaluation Setup

All controllers are evaluated under identical simulation parameters for fair comparison
Results are aggregated across configurations before controllers are compared

Simulation Pipeline Diagram

17 of 30

Model Accuracy-Latency Trade-off (YOLOv10)

Clear accuracy–latency trade-off across YOLOv10 model sizes
Defines the feasible model choices available to the controller

18 of 30

Results 1: Overall Controller Performance

Learned controller closely matches oracle accuracy and utility.
Outperforms naive baseline without sacrificing uptime.

Figure 4.1: Average performance across oracle, learned, and naive controllers.

19 of 30

Results 2: Accuracy-Latency Trade-off

Accuracy and latency thresholds strongly affect feasibility as battery capacity decreases.
Naive controller’s aggressive policy boosts accuracy and uptime by favoring higher-powered models under tight constraints.

Figure 4.2 | High Accuracy & Latency (C1, C8): acc=0.95, lat=0.015 s | Low Accuracy & Latency (C6, C7): acc=0.819, lat=0.006 s | Battery Capacity, Charging Rate (C1, C6, C7, C8) = (105 mWh, 0.001598)

20 of 30

Results 3: Reward Design Sensitivity

Controller performance remains stable across reward weight configurations.
Reward tuning does not materially change accuracy or uptime given fixed simulation parameters.

Figure 4.3 | Success-weighted (C2, C6): Performance-focused weights | Carbon-weighted (C3, C7): Carbon-focused weights | Battery Capacity, Charging Rate (C2, C3) = (610 mWh, 0.000269) | Battery Capacity, Charging Rate (C6, C7) = (105 mWh, 0.001598)

21 of 30

Results 4: Battery Capacity Effects

Smaller battery capacities introduce instability and reduce adaptability.
Carbon-focused policies degrade more sharply under tight battery constraints.

Figure 4.4 | Small Battery (C6, C8): Battery Capacity, Charging Rate = (105 mWh, 0.001598) | Large Battery (C4, C5): Battery Capacity, Charging Rate = (610 mWh, 0.000269) | Success-weighted (C4, C6): Performance-focused weights | Carbon-weighted (C5, C8): Carbon-focused weights

22 of 30

Results 5: Seasonal Robustness

Controller performance remains stable across seasonal carbon patterns.
No significant dependence on seasonality observed in accuracy, uptime, or utility.

Figure 4.5 | Winter: Feb. 20th 2024 | Spring: May 20th 2024 | Summer: Aug. 20th 2024 | Autumn: Nov. 20th 2024

23 of 30

Discussion

What worked

Oracle planning produces strong horizon-aware policies
Imitation learning transfers carbon-aware behavior
Simple observations can be sufficient

What did not

Oracle data generation is expensive
Performance depends on discretization choices
Generalization across regions and seasons is limited by data coverage

24 of 30

Conclusions

Takeaways

Carbon-aware control is feasible for battery-powered edge devices
Charging policy is a first-class decision variable
Oracle imitation enables real-time deployment without future knowledge

Broader impact

Extends to other battery-powered decision-making devices

25 of 30

Future Work

Incorporate algorithms and methodology from the Data-driven Planning via Imitation Learning paper [4]

Incorporate a dual-algorithm controller

Use one algorithm or model at some intervals and a different one at others.

Online adaptation to user preference changes

For example, a continually learning LSTM-style controller that updates itself at every task interval at runtime

26 of 30

References

[2] Lechowicz, A., Shenoy, R., Bashir, N., Hajiesmaili, M., Wierman, A., & Delimitrou, C. Carbon- and Precedence-Aware Scheduling for Data Processing Clusters. arXiv:2502.09717, 2025.

[3] Nkwawir, B.W., Kayalica, M.O., Guven, D., Duman, A.C., & Erden, H.S. Carbon-Aware Workload Management in Data Centers: A Multi-Energy Integration Approach. In Proceedings of the 16th ACM International Conference on Future and Sustainable Energy Systems (E-Energy '25), Association for Computing Machinery, New York, NY, USA, pp. 907-914.

[4] Choudhury, S., Bhardwaj, M., Arora, S., Kapoor, A., Ranade, G., Scherer, S., & Dey, D. Data-driven Planning via Imitation Learning. The Robotics Institute, Carnegie Mellon University & Microsoft Research.

[5] Silver, D., & Veness, J. Monte-Carlo Planning in Large POMDPs. MIT & UNSW, Sydney, Australia.

[6] Åström, K.J. Optimal Control of Markov Processes with Incomplete State Information I. Journal of Mathematical Analysis and Applications, 10, pp. 174-205, 1965.

27 of 30

Supplementary Material

Electricity Maps grid carbon traces (external). 2024 time-series CSVs at 5-minute granularity for 4 U.S. regions from Electricity Maps, stored in energy-data/. We replay these traces in simulation as the time-varying carbon signal. Preprocessing: load CSV, sort by timestamp, select column 7 (Carbon-free energy percentage, CFE%), and align to the simulator timestep (hold the most recent 5-minute value when the task interval is finer). No labeling.
YOLOv10 model metadata (external). Per-variant accuracy and specs from Ultralytics YOLOv10 documentation used to parameterize model trade-offs in the simulator. No labeling.
External libraries and tools: PyTorch, NumPy, uv (Astral).
External models: Ultralytics YOLOv10 variants (N/S/M/B/L/X).
AI Coding Tools: Windsurf, Claude Code, ChatGPT
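The trace alignment described in the Electricity Maps entry above (sort by timestamp, then hold the most recent 5-minute value when the task interval is finer) can be sketched with a zero-order hold. The function name and the use of plain second offsets as timestamps are assumptions for illustration; CSV loading and column selection are omitted.

```python
from bisect import bisect_right
from typing import List, Sequence


def hold_last_value(
    sample_times: Sequence[float],   # trace timestamps, e.g. seconds since start
    sample_values: Sequence[float],  # trace values, e.g. CFE%
    query_times: Sequence[float],    # simulator timesteps
) -> List[float]:
    """For each query time, return the most recent sample at or before it."""
    if len(sample_times) != len(sample_values):
        raise ValueError("times and values must have the same length")
    # Sort samples by timestamp, mirroring the preprocessing step.
    order = sorted(range(len(sample_times)), key=lambda i: sample_times[i])
    times = [sample_times[i] for i in order]
    values = [sample_values[i] for i in order]
    aligned: List[float] = []
    for query in query_times:
        idx = bisect_right(times, query) - 1  # last sample with time <= query
        if idx < 0:
            raise ValueError("query time precedes the first sample")
        aligned.append(values[idx])
    return aligned
```

For instance, with samples at 0 s and 300 s, every simulator timestep from 0 s up to 299 s reuses the first value, and the second value takes over at 300 s.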

28 of 30

Contributions by Team Members

Samyak Kakatur

Custom beam-search oracle implementation
Controller training pipeline
Result visualization and evaluation

Jasper Lin

Battery simulation framework
Energy model framework
Carbon dataset preprocessing and simulation harness
Slide organization and presentation structure
