1 of 30

ECE M202A, Fall 2025

Authors: Jasper Lin, Samyak Kakatur

Optimal Charge Security Camera: Carbon-Aware Control for Battery-Powered Edge Devices

2 of 30

Motivation

  • Most carbon-aware computing targets data centers
  • Battery-powered Internet of Things devices are underexplored
  • Always-on charging can increase carbon footprint unnecessarily
  • Edge devices face strict energy, latency, and uptime constraints

Key insight: Charging decisions matter as much as model selection


3 of 30

Problem Statement

Goal: Jointly optimize inference and charging decisions

  • Choose perception model each task interval
  • Decide charge vs no-charge each interval
  • Balance accuracy, latency, uptime, and carbon cost

Constraint: Real-time decisions under limited battery capacity


4 of 30

State of the Art

  • Most carbon-aware computing research focuses on large-scale data centers
  • These systems shift flexible workloads to cleaner grid periods
  • Optimization assumes access to substantial infrastructure such as large batteries, cooling systems, and energy storage

Limitations for edge devices:

  • Assumes large energy buffers that edge devices do not have
  • Ignores per-timestep battery feasibility constraints
  • Does not address strict real-time latency and uptime requirements


5 of 30

Novelty and Core Idea

Oracle-driven framework

  • Oracle has full knowledge of the future carbon intensity trajectory over a fixed horizon
  • Oracle solves a finite-horizon Markov Decision Process via dynamic programming
  • Oracle produces optimal state–action trajectories under fixed system parameters

Key contribution

  • Train a real-time controller via imitation learning from oracle trajectories
  • Learned controller operates under partial observability using only real-time signals
  • No access to future carbon values or explicit carbon forecasting


6 of 30

System Overview

High-level pipeline

  • System and model parameters
  • Oracle planning (finite-horizon Markov Decision Process)
  • Data generation over parameter sweep
  • Imitation learning training
  • Deployable controller for real-time decisions

Training & Data Pipeline Diagram


7 of 30

Model Profiler

Model profiling outputs

  • Baseline power (mW)
  • Inference power (mW)
  • Inference latency (s)
  • Energy per inference (mWh)
  • Stability across repeated runs

Models profiled

  • YOLOv10 N, S, M, B, L, X
    • stored as JSON model profiles

NOTE: We use only the ‘inference latency’ and ‘energy per inference’ metrics.

Model Accuracy

  • Normalized_Accuracy = COCO_mAP_50_95_Score / 57.0
    • Value lies in [0, 1] for all models.
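The profile quantities above are related by simple arithmetic. A minimal sketch, assuming energy per inference is inference power multiplied by latency and converted from mW·s to mWh (the slides do not say whether baseline power is subtracted). The field names and numbers are hypothetical placeholders, not measured values:

```python
def energy_per_inference_mwh(inference_power_mw: float, latency_s: float) -> float:
    """Power (mW) times latency (s), converted to mWh (1 h = 3600 s)."""
    return inference_power_mw * latency_s / 3600.0


def normalized_accuracy(coco_map_50_95: float, reference_map: float = 57.0) -> float:
    """Scale a COCO mAP50-95 score by the reference value used on this slide."""
    return coco_map_50_95 / reference_map


# Hypothetical profile entry; the numbers are placeholders, not measurements.
profile = {
    "name": "example-model",
    "inference_power_mw": 3600.0,
    "latency_s": 0.01,
    "coco_map_50_95": 45.6,
}
profile["energy_mwh"] = energy_per_inference_mwh(
    profile["inference_power_mw"], profile["latency_s"]
)
profile["accuracy"] = normalized_accuracy(profile["coco_map_50_95"])
```

Storing the derived fields next to the raw ones keeps each JSON profile self-describing for the simulator.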


8 of 30

System Parameters

Controller parameters

  • Battery capacity $B_{\max}$
  • Task interval $\Delta$
  • Horizon $T$
  • 2024 carbon time series $\{d_t\}_{t=0}^{T-1}$
  • Model set $M$
  • Energy per model $E(m)$
  • User requirements $(u_{acc}, u_{lat})$
  • Reward weights $W, X, Y, Z$

Restriction

  • Parameters fixed per controller instance


9 of 30

Oracle Controller

Markov Decision Process (MDP) Formulation

  • State: $(t, B_t)$
    • $t$: timestep index
    • $B_t$: battery level
  • Action: $(m_t, c_t)$
    • $m_t \in M \cup \{\varnothing\}$: model selection or no selection
    • $c_t \in \{0, 1\}$: charge decision
  • Transitions:
    • Deterministic battery dynamics
    • Battery updated by charging minus inference energy
  • Feasibility:
    • Actions are invalid if battery constraints are violated
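The transition and feasibility rules can be sketched as one function. This is an illustration under stated assumptions, not the authors' simulator: the charging rate is taken to be in mWh per second, energy charged during an interval is usable within that interval, and the battery is clamped at capacity rather than treating overcharge as invalid. The function name and signature are hypothetical.

```python
from typing import Optional


def step_battery(battery_mwh: float,
                 model_energy_mwh: Optional[float],
                 charge: int,
                 b_max_mwh: float,
                 r_chg: float,
                 delta_s: float) -> Optional[float]:
    """Return the next battery level, or None if the action is infeasible.

    model_energy_mwh is None when no model is run in this interval.
    """
    gained = charge * r_chg * delta_s            # energy added by charging
    spent = 0.0 if model_energy_mwh is None else model_energy_mwh
    nxt = battery_mwh + gained - spent           # charging minus inference energy
    if nxt < 0.0:
        return None                              # battery constraint violated
    return min(nxt, b_max_mwh)                   # assumption: clamp at capacity
```

Returning None for an invalid action lets a planner skip it without raising an exception.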


10 of 30

Oracle Reward Function

Outcome indicators (per timestep)

  • Success: model executed and met accuracy and latency requirements
  • Small miss: model executed but violated requirements
  • Large miss: no model executed due to infeasibility

Carbon cost

  • Dirty energy consumed:
    • $\Delta D_t = c_t \, r_{chg} \, \Delta \cdot d_t$
  • Reward Function:
    • $R_t = W \cdot \text{success}_t - X \cdot \text{small\_miss}_t - Y \cdot \text{large\_miss}_t - Z \cdot \Delta D_t$

Objective

  • Maximize cumulative reward over the planning horizon
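The per-timestep reward can be written out directly from the outcome indicators and the carbon cost. A minimal sketch: the function and argument names are hypothetical, and any step where no model runs is counted as a large miss, which is an assumption since the slide ties large misses to infeasibility.

```python
def step_reward(ran_model: bool, met_requirements: bool, charge: int,
                d_t: float, r_chg: float, delta: float,
                W: float, X: float, Y: float, Z: float) -> float:
    """Reward for one timestep: outcome terms minus weighted dirty energy."""
    success = 1 if ran_model and met_requirements else 0
    small_miss = 1 if ran_model and not met_requirements else 0
    large_miss = 0 if ran_model else 1           # assumption: any idle step
    dirty_energy = charge * r_chg * delta * d_t  # zero when not charging
    return W * success - X * small_miss - Y * large_miss - Z * dirty_energy


def cumulative_reward(step_rewards):
    """Objective: sum of per-timestep rewards over the planning horizon."""
    return sum(step_rewards)
```

Exactly one of the three indicators is 1 at each step, so the weights W, X, and Y never combine within a single timestep.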


11 of 30

Oracle Solution Method

Solution

  • Finite-horizon dynamic programming (backward induction)

Implementation Details

  • Discretize battery state
  • KNN lookup for continuous battery rollout
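Backward induction over a discretized battery grid can be sketched as below. This illustrates the technique and is not the authors' implementation. It assumes a terminal value of zero, clamping at capacity, and a charging rate in mWh per second, and it uses a 1-nearest-neighbor snap to the grid in place of whatever KNN scheme the project uses. All names are hypothetical.

```python
def nearest_index(grid, x):
    """Index of the grid point closest to x (a 1-nearest-neighbor lookup)."""
    return min(range(len(grid)), key=lambda k: abs(grid[k] - x))


def solve_oracle(carbon, grid, models, b_max, r_chg, delta, weights):
    """Finite-horizon dynamic programming over (timestep, battery grid index).

    carbon:  list of d_t values, one per timestep (length T)
    grid:    sorted list of discretized battery levels
    models:  list of (energy_mwh, meets_requirements) tuples
    weights: (W, X, Y, Z) reward weights
    Returns (value, policy), where policy[t][i] = (model_index_or_None, charge).
    """
    W, X, Y, Z = weights
    T, n = len(carbon), len(grid)
    value = [[0.0] * n for _ in range(T + 1)]    # value[T] = 0 (terminal)
    policy = [[None] * n for _ in range(T)]
    actions = [(m, c) for m in [None] + list(range(len(models))) for c in (0, 1)]

    for t in range(T - 1, -1, -1):               # sweep backward in time
        for i, b in enumerate(grid):
            best = None
            for m, c in actions:
                spent = 0.0 if m is None else models[m][0]
                nxt = b + c * r_chg * delta - spent
                if nxt < 0.0:
                    continue                     # infeasible action
                j = nearest_index(grid, min(nxt, b_max))
                if m is None:
                    r = -Y                       # large miss
                elif models[m][1]:
                    r = W                        # success
                else:
                    r = -X                       # small miss
                r -= Z * c * r_chg * delta * carbon[t]
                q = r + value[t + 1][j]
                if best is None or q > best[0]:
                    best = (q, (m, c))
            value[t][i], policy[t][i] = best
    return value, policy
```

Idling without charging is always feasible, so every state gets an action. Snapping to the nearest grid point introduces rounding error in the battery level, which is one way discretization choices can influence the resulting policy.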


12 of 30

Custom Controller

Why a Partially Observable Markov Decision Process (POMDP)

  • No access to future carbon trajectory

Observation

  • Battery level $B_t$
  • Current carbon intensity $d_t$
  • Carbon change $\Delta d_t = d_t - d_{t-1}$

Interpretation

  • Trend feature supports short-term carbon inference
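Building the three-element observation from a battery trace and a carbon trace might look like the sketch below. The function name is hypothetical, and setting the trend to zero at the first timestep (where no previous carbon value exists) is an assumption the slides do not address.

```python
def build_observations(battery_levels, carbon):
    """Return one (battery, carbon, carbon_change) tuple per timestep."""
    observations = []
    for t, (b, d) in enumerate(zip(battery_levels, carbon)):
        # Assumption: no previous sample at t = 0, so the trend starts at zero.
        trend = d - carbon[t - 1] if t > 0 else 0.0
        observations.append((b, d, trend))
    return observations
```

Only the current and previous carbon samples are needed, so the controller can compute this feature online with a single stored value.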


13 of 30

Imitation Learning

Training data

  • Oracle demonstrations: $(o_t \to a_t)$

Learning objective

  • Supervised classification over discrete action space
  • Cross-entropy loss between predicted action and oracle action

Result

  • Real-time controller that approximates oracle behavior from partial observations
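The supervised setup can be illustrated with a small softmax classifier trained by cross-entropy. This sketch uses NumPy and a linear model as a simple stand-in; the project lists PyTorch among its libraries and its actual network is not described here. The action encoding and all function names are assumptions for illustration.

```python
import numpy as np


def encode_action(model_index, charge):
    """Map (model or None, charge bit) to one discrete class id."""
    m = 0 if model_index is None else model_index + 1
    return 2 * m + charge


def _with_bias(obs):
    """Append a constant 1 column so the linear model has a bias term."""
    obs = np.asarray(obs, dtype=float)
    return np.hstack([obs, np.ones((len(obs), 1))])


def train_softmax_policy(obs, actions, n_classes, lr=0.5, epochs=200):
    """Fit a linear softmax classifier to (observation, oracle action) pairs."""
    X = _with_bias(obs)
    actions = np.asarray(actions)
    weights = np.zeros((X.shape[1], n_classes))
    onehot = np.eye(n_classes)[actions]
    losses = []
    for _ in range(epochs):
        logits = X @ weights
        logits -= logits.max(axis=1, keepdims=True)      # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        picked = probs[np.arange(len(X)), actions]
        losses.append(float(-np.mean(np.log(picked + 1e-12))))  # cross-entropy
        weights -= lr * X.T @ (probs - onehot) / len(X)  # gradient step
    return weights, losses


def predict(weights, obs):
    """Greedy action class for each observation."""
    return np.argmax(_with_bias(obs) @ weights, axis=1)
```

With M models the encoding yields 2(M + 1) classes, one per (model or none, charge) pair, which matches the discrete action space of the oracle.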


14 of 30

Naive Baseline

Baseline Policy

  • Choose lowest-energy model that meets $(u_{acc}, u_{lat})$
  • If infeasible, charge and choose lowest-energy model regardless of requirements
  • If no model feasible, run no model and charge
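One reading of the three baseline rules is sketched below. The slide leaves some details open, so this sketch assumes that "infeasible" means no requirement-meeting model is affordable from the current battery, and that energy charged in an interval can be spent in the same interval. The dictionary keys and function name are hypothetical.

```python
def naive_action(battery, models, u_acc, u_lat, r_chg, delta):
    """Return (model_index_or_None, charge) for the myopic baseline.

    models: non-empty list of dicts with keys "energy_mwh", "accuracy", "latency_s".
    """
    by_energy = sorted(range(len(models)), key=lambda i: models[i]["energy_mwh"])

    # Rule 1: lowest-energy model that meets both user requirements.
    for i in by_energy:
        m = models[i]
        meets = m["accuracy"] >= u_acc and m["latency_s"] <= u_lat
        if meets and m["energy_mwh"] <= battery:
            return i, 0

    # Rule 2: charge and run the lowest-energy model regardless of requirements.
    cheapest = by_energy[0]
    if models[cheapest]["energy_mwh"] <= battery + r_chg * delta:
        return cheapest, 1

    # Rule 3: nothing is affordable, so idle and charge.
    return None, 1
```

The policy looks only at the current battery level and never at carbon intensity, which is what makes it a myopic point of comparison.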

Purpose

  • Myopic baseline for comparison


15 of 30

Evaluation Metrics

Primary metrics

  • Accuracy compliance rate
  • Success, small miss, large miss counts
  • Failure rate
  • Utility gap vs oracle

Feasibility-Normalized Effective Uptime

  • Score each timestep by $a(m_t) / a_t$ if a model runs, else 0
  • Report average over the horizon
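The uptime score can be computed as below. The slide does not define the denominator, so this sketch takes it as a per-timestep reference accuracy supplied by the caller. That interpretation and the function name are assumptions.

```python
def effective_uptime(chosen_accuracy, reference_accuracy):
    """Average per-timestep score over the horizon.

    chosen_accuracy[t]:    accuracy of the model run at t, or None if none ran
    reference_accuracy[t]: reference accuracy for timestep t (assumed meaning)
    """
    scores = [
        0.0 if a is None else a / ref
        for a, ref in zip(chosen_accuracy, reference_accuracy)
    ]
    return sum(scores) / len(scores)
```

A timestep with no model contributes zero, so the metric penalizes downtime and under-performing model choices on the same scale.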


16 of 30

Evaluation Setup

  • All controllers are evaluated under identical simulation parameters for a fair comparison
  • Results are aggregated per controller before being compared

Simulation Pipeline Diagram


17 of 30

Model Accuracy-Latency Trade Off (YOLOv10)

  • Clear accuracy–latency tradeoff across YOLOv10 model sizes
  • Defines the feasible model choices available to the controller


18 of 30

Results 1: Overall Controller Performance

  • Learned controller closely matches oracle accuracy and utility.
  • Outperforms naive baseline without sacrificing uptime.

Figure 4.1: Average performance across oracle, learned, and naive controllers.


19 of 30

Results 2: Accuracy-Latency Trade Off

  • Accuracy and latency thresholds strongly affect feasibility as battery capacity decreases.
  • Naive controller’s aggressive policy boosts accuracy and uptime by favoring higher-powered models under tight constraints.

Figure 4.2 | High Accuracy & Latency (C1, C8): acc=0.95, lat=0.015 s | Low Accuracy & Latency (C6, C7): acc=0.819, lat=0.006 s | Battery Capacity, Charging Rate (C1, C6, C7, C8) = (105 mWh, 0.001598)


20 of 30

Results 3: Reward Design Sensitivity

  • Controller performance remains stable across reward weight configurations.
  • Reward tuning does not materially change accuracy or uptime given fixed simulation parameters.

Figure 4.3 | Success-weighted (C2, C6): Performance-focused weights | Carbon-weighted (C3, C7): Carbon-focused weights | Battery Capacity, Charging Rate (C2, C3) = (610 mWh, 0.000269) | Battery Capacity, Charging Rate (C6, C7) = (105 mWh, 0.001598)


21 of 30

Results 4: Battery Capacity Effects

  • Smaller battery capacities introduce instability and reduce adaptability.
  • Carbon-focused policies degrade more sharply under tight battery constraints.

Figure 4.4 | Small Battery (C6, C8): Battery Capacity, Charging Rate = (105 mWh, 0.001598) | Large Battery (C4, C5): Battery Capacity, Charging Rate = (610 mWh, 0.000269) | Success-weighted (C4, C6): Performance-focused weights | Carbon-weighted (C5, C8): Carbon-focused weights


22 of 30

Results 5: Seasonal Robustness

  • Controller performance remains stable across seasonal carbon patterns.
  • No significant dependence on seasonality observed in accuracy, uptime, or utility.

Figure 4.5 | Winter: Feb. 20th 2024 | Spring: May 20th 2024 | Summer: Aug. 20th 2024 | Autumn: Nov. 20th 2024


23 of 30

Discussion

What worked

  • Oracle planning produces strong horizon-aware policies
  • Imitation learning transfers carbon-aware behavior
  • Simple observations can be sufficient

What did not

  • Oracle data generation is expensive
  • Performance depends on discretization choices
  • Generalization across regions and seasons is limited by data coverage


24 of 30

Conclusions

Takeaways

  • Carbon-aware control is feasible for battery-powered edge devices
  • Charging policy is a first-class decision variable
  • Oracle imitation enables real-time deployment without future knowledge

Broader impact

  • Extends to other battery-powered decision-making devices


25 of 30

Future Work

  • Incorporate algorithms and methodology from the Data-driven Planning via Imitation Learning paper [4]

  • Incorporate a dual-algorithm controller
    • Run one algorithm or model at some intervals and a different one at others

  • Online adaptation to user preference changes
    • For example, a continually learning LSTM-based controller that updates itself every task interval at runtime


26 of 30

References

  • Carbon- and Precedence-Aware Scheduling for Data Processing Clusters[2]
    • Lechowicz, A., Shenoy, R., Bashir, N., Hajiesmaili, M., Wierman, A., & Delimitrou, C. Carbon- and Precedence-Aware Scheduling for Data Processing Clusters. arXiv:2502.09717, 2025.
  • Carbon-Aware Workload Management in Data Centers[3]
    • Nkwawir, B.W., Kayalica, M.O., Guven, D., Duman, A.C., & Erden, H.S. Carbon-Aware Workload Management in Data Centers: A Multi-Energy Integration Approach. In Proceedings of the 16th ACM International Conference on Future and Sustainable Energy Systems (E-Energy '25), Association for Computing Machinery, New York, NY, USA, pp. 907-914.
  • Data-driven Planning via Imitation Learning[4]
    • Choudhury, S., Bhardwaj, M., Arora, S., Kapoor, A., Ranade, G., Scherer, S., & Dey, D. Data-driven Planning via Imitation Learning. The Robotics Institute, Carnegie Mellon University & Microsoft Research.
  • Monte-Carlo Planning in Large POMDPs[5]
    • Silver, D., & Veness, J. Monte-Carlo Planning in Large POMDPs. MIT & UNSW, Sydney, Australia.
  • Optimal Control of Markov Processes with Incomplete State Information I[6]
    • Åström, K.J. Optimal Control of Markov Processes with Incomplete State Information I. Journal of Mathematical Analysis and Applications 10, pp. 174-205, 1965.


27 of 30

Supplementary Material

  • Electricity Maps grid carbon traces (external). 2024 time-series CSVs at 5-minute granularity for 4 U.S. regions from Electricity Maps, stored in energy-data/. We replay these traces in simulation as the time-varying carbon signal. Preprocessing: load CSV, sort by timestamp, select column 7 (Carbon-free energy percentage, CFE%), and align to the simulator timestep (hold the most recent 5-minute value when the task interval is finer). No labeling.
  • YOLOv10 model metadata (external). Per-variant accuracy and specs from Ultralytics YOLOv10 documentation used to parameterize model trade-offs in the simulator. No labeling.
  • External libraries: PyTorch, NumPy, Astral (uv).
  • External models: Ultralytics YOLOv10 variants (N/S/M/B/L/X).
  • AI Coding Tools: Windsurf, Claude Code, ChatGPT
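The trace-alignment step described in the first item (hold the most recent 5-minute value when the task interval is finer) is a zero-order hold. A minimal sketch, assuming timestamps are already sorted and expressed in seconds. The function name is hypothetical, and CSV parsing is omitted because the file layout is not shown here.

```python
import bisect


def align_trace(sample_times_s, values, sim_times_s):
    """For each simulator time, return the most recent trace sample.

    sample_times_s must be sorted ascending. Simulator times before the
    first sample fall back to the first value (an assumption).
    """
    aligned = []
    for t in sim_times_s:
        k = bisect.bisect_right(sample_times_s, t) - 1  # last sample at or before t
        aligned.append(values[max(k, 0)])
    return aligned
```

Binary search keeps each lookup logarithmic in the trace length, which matters when a full year of 5-minute samples is replayed at a fine task interval.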


28 of 30

Contributions by Team Members

Samyak Kakatur

  • Custom beam-search oracle implementation
  • Controller training pipeline
  • Result visualization and evaluation

Jasper Lin

  • Battery simulation framework
  • Energy Model framework
  • Carbon dataset preprocessing and simulation harness
  • Slide organization and presentation structure


29 of 30


Q&A
