1 of 16

Lab 7: The World I See

Lab 7: Seeing is Believing

JC Hu

CS 123

2 of 16

Let’s make Pupper See!

Credit: DALL-E

3 of 16

Dependencies

  1. Download Foxglove
    1. Foxglove is essentially RViz with more features
    2. It lets you visualize ROS topics, including the camera images
  2. Downgrade NumPy to a version below 2.0.0 (good practice in general)

4 of 16

Fisheye lens

  • Very wide view, as mentioned by Gabrael
  • AprilTags are very hard to detect because of the distortion (painful experience from June)

5 of 16

Image Unwarping (implemented for you!)

  • Implemented in fisheye_converter.py (optional to look at, code from Nathan)
  • Should make vision-based final projects more feasible 😉

6 of 16

Lab Overview

We have implemented for you:

  1. A YOLO object detector that runs onboard Pupper
  2. Backbone for integrating vision into our Lab 6 OpenAI Realtime API calls

Your Job:

  1. Tune a P controller to have Pupper track you as you move!
    1. For this lab, Pupper moves forward at a constant speed when tracking people (the chase goes on forever!)
  2. Implement a state machine for tracking and searching
  3. Wrap the tracking functionality into a Karel API call that tracks not only humans but a variety of other objects
  4. Embed the camera feed into your LLM feed, and tune prompts and command parsing for a fully multimodal embodied AI system!
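
A minimal sketch of the P controller from step 1, assuming the detector reports the bounding-box center in pixels and that positive yaw turns Pupper left. The function name, the gain `kp`, and the clamp `max_yaw_rate` are illustrative assumptions, not names from the starter code.

```python
def yaw_rate_command(bbox_center_x, image_width, kp=1.0, max_yaw_rate=1.5):
    """Proportional yaw command that turns toward a detected target.

    All names and default values here are hypothetical; tune kp on the robot.
    """
    half_width = image_width / 2
    # Normalized horizontal error: 0 when centered, +1 at the right edge
    error = (bbox_center_x - half_width) / half_width
    # Assumed sign convention: positive yaw turns left, so a target on the
    # right (positive error) needs a negative command
    command = -kp * error
    # Clamp so a large gain cannot request an unreasonable turn rate
    return max(-max_yaw_rate, min(max_yaw_rate, command))


if __name__ == "__main__":
    # Target slightly right of center in a 640-pixel-wide image
    print(yaw_rate_command(400, 640, kp=2.0))
```

A larger `kp` makes Pupper turn faster but can cause oscillation around the target, which is the trade-off to tune.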

7 of 16

State Machines

  • State machines are a way to define states and transitions, and are very common in robotics
    • especially in decision making

Credit: DALL-E

States:

  • walking
  • stopped

Transitions:

  • object detected
  • free path
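
The walking/stopped example above can be sketched as a small transition function. This is only an illustration of the pattern; the enum and function names are assumptions, and the lab's tracking/searching machine will have its own states.

```python
from enum import Enum


class State(Enum):
    WALKING = "walking"
    STOPPED = "stopped"


def next_state(state, object_detected):
    """Return the next state given the current state and one observation."""
    # Transition "object detected": stop walking
    if state is State.WALKING and object_detected:
        return State.STOPPED
    # Transition "free path": resume walking
    if state is State.STOPPED and not object_detected:
        return State.WALKING
    # No transition fired, so stay in the current state
    return state


if __name__ == "__main__":
    state = State.WALKING
    for detected in [False, True, True, False]:
        state = next_state(state, detected)
        print(detected, state.value)
```

Keeping the transition logic in one pure function makes it easy to test without a robot.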

8 of 16

Breakpoint Tutorial

  • Call breakpoint() in your code to pause there and open pdb

Common pdb commands: p (print), c (continue), s (step into), n (next line)

Press Ctrl+D (or type exit) to quit

Reference: https://kapeli.com/cheat_sheets/Python_Debugger.docset/Contents/Resources/Documents/index
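
A small sketch of where `breakpoint()` might go. The function and its `debug` flag are hypothetical; the flag is only there so the pause can be switched off.

```python
def normalized_error(bbox_center_x, image_width, debug=False):
    """Horizontal offset of a detection from the image center, in [-1, 1]."""
    half_width = image_width / 2
    error = (bbox_center_x - half_width) / half_width
    if debug:
        # Execution pauses here and pdb opens.
        # Try: p error, p half_width, n, then c to continue.
        breakpoint()
    return error


if __name__ == "__main__":
    # Set debug=True to drop into pdb before the return statement
    print(normalized_error(480, 640, debug=False))
```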

9 of 16

Stepping through ROS Messages

  1. ROS messages typically have a BUNCH of fields
    1. Reference each field using dot syntax
  2. View ROS message documentation

[Diagram: chain of message fields, in slide order: msg, detections, data, [i], detection, bbox, Pose2D, center, x. A dot indicates a field of the object on its left.]
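
A sketch of dot-syntax access on a nested message. The dataclasses below are hypothetical stand-ins that only mimic a nested detection message; the real field names and nesting should be checked against the message documentation.

```python
from dataclasses import dataclass, field
from typing import List


# Hypothetical stand-ins for a nested ROS message (not the real types)
@dataclass
class Center:
    x: float = 0.0
    y: float = 0.0


@dataclass
class BBox:
    center: Center = field(default_factory=Center)


@dataclass
class Detection:
    bbox: BBox = field(default_factory=BBox)


@dataclass
class DetectionArray:
    detections: List[Detection] = field(default_factory=list)


def center_xs(msg):
    """Collect the x coordinate of every detection's bounding-box center."""
    # Each dot steps one level deeper into the message
    return [detection.bbox.center.x for detection in msg.detections]


if __name__ == "__main__":
    msg = DetectionArray([Detection(BBox(Center(100.0, 50.0))),
                          Detection(BBox(Center(300.0, 60.0)))])
    print(center_xs(msg))
```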

10 of 16

When you have two (or more) People in the Box…

How do you make sure you’re tracking the same person?

11 of 16

Small Motion Assumption in Optical Flow

Solution: Because a person moves only a little between consecutive frames, keep the last tracked position and return the detection closest to it
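
A minimal sketch of this nearest-to-last-position rule, assuming each detection is summarized by the x coordinate of its bounding-box center. The function name and the fallback to the first detection are assumptions for illustration.

```python
def closest_detection(detection_xs, last_x):
    """Pick the detection nearest to the last tracked x position.

    detection_xs: x coordinates of all candidate detections in this frame.
    last_x: x position tracked in the previous frame, or None if not tracking.
    """
    if not detection_xs:
        # Nothing detected, so there is nothing to track this frame
        return None
    if last_x is None:
        # No history yet: assumed fallback is to take the first detection
        return detection_xs[0]
    # Small-motion assumption: the same person should be the nearest candidate
    return min(detection_xs, key=lambda x: abs(x - last_x))


if __name__ == "__main__":
    last_x = 300.0
    for frame in [[120.0, 310.0], [130.0, 325.0], []]:
        target = closest_detection(frame, last_x)
        if target is not None:
            last_x = target
        print(target)
```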

12 of 16

Adding Vision to the LLM pipeline

  • The engineering code for the image feed (part of which you’ll have to implement) isn’t very straightforward
    • But the starter code and instructions should provide enough detail to integrate images into the LLM
  • The interesting parts:
    • Integrate visual servoing into the Karel API, with an argument that selects which object to follow
    • Update your prompt and command parsing from Lab 6, so that your Pupper:
      • Receives an image along with your voice-to-text input every time your voice is detected
      • Can reason based on the image given
      • Activates/deactivates tracking mode for different objects in response to your voice commands!
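
One possible shape for the tracking part of command parsing, as a sketch only. The command strings ("track <object>", "stop tracking"), the function name, and the object list are all hypothetical; the actual format depends on your Lab 6 prompt and the starter code.

```python
def parse_tracking_command(line, known_objects=("person", "dog", "ball")):
    """Map one line of LLM output to a (action, target) pair.

    Returns ("track", <object>), ("stop", None), or (None, None) when the
    line is not a tracking command. The command format is an assumption.
    """
    text = line.strip().lower()
    if text == "stop tracking":
        return ("stop", None)
    if text.startswith("track "):
        target = text[len("track "):].strip()
        # Only accept objects the detector is assumed to know about
        if target in known_objects:
            return ("track", target)
    return (None, None)


if __name__ == "__main__":
    for line in ["Track person", "stop tracking", "track unicorn"]:
        print(parse_tracking_command(line))
```

Rejecting unknown targets keeps a misheard or hallucinated object name from switching the robot into tracking mode.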

13 of 16

Quick Tips

Please ask the TAs if you have any questions, and come to OH if you need help/more time

Thank you, and have fun!

General

    • Don’t test the robot on the table; test it on the ground
    • Replace your battery often and keep a spare charging (the vision and API calls drain batteries quickly)

Lab 6 & 7 specifics

    • Lab 7 builds on Lab 6. Make sure you finish Lab 6 before starting Lab 7!
    • Make sure to allocate enough time on this lab! Hope you enjoy :)

14 of 16

Final Project Logistics

  • General logistics will be updated shortly; you will need to submit:
    • Project proposal & Discussion with (at least) one member of the teaching team (2%)
    • Progress report (8%)
    • Final Demo Video & Presentation (14%)
  • Informal Final Project Workshop Wednesday (11/5) at 5pm
  • Location: CoDA, room TBD
  • In-person attendance is strongly encouraged
  • Brainstorm final project ideas and get a lot of exclusive info/resources 😎

15 of 16

Optional Lab 2 is Out

  • Train Pupper to traverse 5 mysterious obstacles!

  • Each group gets 5 attempts to test its policies (until 5/30)

  • Your policy’s performance will be evaluated on its traversal time and strategies

  • The difficulty of each obstacle is reflected in score multipliers (you won’t see the next obstacle until you’ve passed the previous one):
    • First obstacle: 1x multiplier
    • Second obstacle: 2x multiplier
    • Third obstacle: 3x multiplier
    • Fourth obstacle: 5x multiplier
    • Fifth obstacle: 8x multiplier

Winner gets a mysterious prize. Check out the website for more details!

16 of 16

Fill Out Our Feedback Form!
