1 of 40

CSE 373 SP25 Section AI Week 4

Isomorphism

2 of 40

Check-in

What study tips or rituals do you have?

3 of 40

Exam Details

  • Two-Stage Exam 1 Tomorrow in CSE2 G20 at 12:30 PM!
    • Any content from 03/31 to 04/24 is relevant for this quiz.
    • Check your email for assigned seating by today!
    • 10 A4 pages of notes are allowed, bring a pencil/pen and Student ID too.
    • Exam Prep? Practice Exam! Lecture Qs! Section Qs! Prework!
    • 25 minutes individual, 25 minutes group

4 of 40

Micro-Teach: 2-3 Trees, LLRBs, and Tries/TSTs

5 of 40

2-3 Tree

  • Node Invariant:
    • 2-node: a node of exactly 1 key and 2 non-null children
    • 3-node: a node of exactly 2 keys and 3 non-null children
  • 2-3 Tree Invariant:
    • A 2-3 Tree contains only 2-node and 3-node
    • A 2-3 Tree is a Search Tree
      • Search Tree Property: possible (but tricky) to read the order

6 of 40

2-3 Tree Construction, Insertion, and Search

General Flow

  • Search
    • At each node, compare values:
      • Equal => Done!
      • Less than (the left key) => go left
      • Larger than (the right key) => go right
      • Between the 2 keys => go down
    • Repeat until find the value or know it does not exists in the tree
  • Construction/Insertion: Demo
    • Same rule of finding + must go down to the very bottom level
    • Insert into the leaf node + fix 2-3 node invariant
    • If overstuffing, fix 2-3 tree invariant by moving the middle item up and splitting

7 of 40

Isomorphism between 2-3 Tree and LLRB Tree

  • What is isomorphism??
    • A ~fancy~ way to express an one-to-one correspondence or mapping!
  • Isomorphism/Mapping between 2-3 Tree and LLRB:
    • We ALWAYS can convert a 2-3 Tree to its LLRB and vice versa!
    • 2-3 Tree is a Search Tree, NOT guaranteed BST, but we can convert any 2-3 Tree to a BST by finding its corresponding Red-Black Tree!

8 of 40

Micro-Teach: one 3-node to two binary-tree nodes

1

Steps (intuitively):

  1. Split the parent node (with two keys) into two nodes, left and right
  2. Attach the left two children with the left parent and the rightmost child with the right parent
  3. Raise right side up and connect parent nodes together by a glue edge

3

5

2 4

Original VALID 2-3 Tree:

1

3

5

Step 1:

2

4

1

3

5

Step 2:

2

4

1

3

5

Step 3:

2

4

Always a valid LLRB!

Red

9 of 40

LLRB

LLRB invariants:

  • Left descendants are less than the current node
  • Right descendants are greater than the current node
  • No node has two red links
  • All red links are left-leaning
  • Every path from root to a null node (leaf’s child) has same number of black links

  • Java TreeSet implementation:
    • Red-Black Tree!
    • A Red-Black Tree is

a self-balancing BST

LLRB invariants:

must be BST

10 of 40

Trie

  • Specialized tree that stores individual characters in each node
  • A boolean flag marks the end of a valid String
  • A path from the root to a node represents either a full word or a prefix
  • Commonly used for:
    • Autocomplete :)
    • Spell checking
    • Dictionary ADT (supports insertion, removal, search)

What operations does a Trie optimize for?

Contains:

[a, awls, sad, sam, same, sap]

11 of 40

TST

  • Each node stores a single character, similar to a Trie, but is restricted to no more than three non-null children

Invariants:

  • Left child
    • Not using the current character
    • Comes before the current letter alphabetically
  • Right child
    • Not using the current character
    • Comes after the current letter alphabetically
  • Middle child
    • Using the current character (part of the word)

12 of 40

Trie vs TST

Trie

TST

Faster lookup, no comparison logic

Support BST-like ordering, requires character comparison at each node

Memory-heavy, large branching factor

More space efficient with 3-child constraint

Works best with small, overlapping data

More flexible for large/ sparse datasets

Worst Case Height: L

Worst Case Height: N + L

13 of 40

Question 1: Invalid LLRBs

The following LLRB trees are invalid, meaning they break an invariant. Why?

14 of 40

Any Questions?

15 of 40

Question 2: Inserting into a Balanced Tree

Draw the left-leaning red-black tree that results after calling insert(25) using both of the following approaches:

  1. LLRB tree approach: Insert 25 directly into the LLRB, connecting its invariants back to the 2-3 tree. Draw out the state of the LLRB tree after each flip or rotation.
  2. 2-3 tree approach: Draw out the process of inserting 25 into the equivalent 2-3 tree. Then, convert the resulting 2-3 tree back into an LLRB tree.

Use scratch paper on back sheet as needed.

red

red

red

20

10

40

17

5

30

12

16 of 40

Inserting into a Balanced Tree (Solution #1 - Rotations)

Draw the left-leaning red-black tree after inserting 25.

20

10

40

17

5

30

12

Invariants:

  • Red edges lean left.
  • No node has two red edges connected to it.
  • Every root-to-null path has the same number of black edges.
  1. Add 25 on a left red edge.

25

17 of 40

Inserting into a Balanced Tree (Solution #1 - Rotations)

Draw the left-leaning red-black tree after inserting 25.

20

10

40

17

5

30

12

Invariants:

  • Red edges lean left.
  • No node has two red edges connected to it.
  • Every root-to-null path has the same number of black edges.
  1. Add 25 on a left red edge.
  2. Rotate right on 40.

25

18 of 40

Inserting into a Balanced Tree (Solution #1 - Rotations)

Draw the left-leaning red-black tree after inserting 25.

Invariants:

  • Red edges lean left.
  • No node has two red edges connected to it.
  • Every root-to-null path has the same number of black edges.
  1. Add 25 on a left red edge.
  2. Rotate right on 40.
  3. Flip red edges up to 30's parent.

20

10

40

17

5

30

12

25

19 of 40

Inserting into a Balanced Tree (Solution #1 - Rotations)

Draw the left-leaning red-black tree after inserting 25.

Invariants:

  • Red edges lean left.
  • No node has two red edges connected to it.
  • Every root-to-null path has the same number of black edges.
  1. Add 25 on a left red edge.
  2. Rotate right on 40.
  3. Flip red edges up to 30's parent.
  4. Flip red edges up.

All invariants met!

20

10

40

17

5

30

12

25

20 of 40

Inserting into a Balanced Tree (Solution #2 - 2-3 tree)

21 of 40

Inserting into a Balanced Tree (Solution #2 - Expanded)

22 of 40

Design Decisions: Malicious Website Tracker

Kevin is working at an internet security company and wants to keep track of malicious websites that users have reported. Websites are represented by a String URL.

Kevin wants to find common domain patterns (such as “.com” and “.org”) in the urls of malicious websites. Additionally, Kevin wants to support international domain names, which may contain unicode characters (such as “.भारत” or “.한국”).

NOTE: Suppose Kevin values space efficiency the most and is willing to sacrifice speed.

23 of 40

Malicious Website Tracker: Task

Kevin designs an ADT with the following functionality:

public void addURL(String url)

  • Adds a malicious website url into the collection of reported urls

public int searchForSuffix(String suffix)

  • Returns the total count of URLs ending with the specified suffix.

For example, suppose the following urls have been added:

addURL(“washington.edu”)

addURL(“evilsite.com”)

addURL(“notNetflix.com”)

addURL(“korea.한국”)

addURL(“government.org”)

Then searchForSuffix(“.com”) would return 2

Hint: What data should a node store as fields?

24 of 40

Malicious Website Tracker: Possible Solution

Using a TST that allows for any Unicode Characters:

public void addURL(String url)

  • Insert URLs in reverse order to allow for suffix searching
    • Follow left < /middle = /right > pattern for TST insertion (look at next char in input String url in middle case)
      • For each node in url - if it does not exist in the TST, create a new one and add a value for the count (= 1)
      • If the node does exist in the TST and matches with the character in our string (going down middle case), add to the existing count field (count++)

public int searchForSuffix(String suffix)

  • Reverse the input URL following the same traversal pattern described above
    • Once you reach the end of your matching term/suffix (if found in TST), return count at that final char node.
      • No match = return -1

25 of 40

Any Questions?

26 of 40

Code Challenges: Tries

27 of 40

Why LeetCode Style Code Challenges?

  • Improvement on Algorithmics:
    • Be aware that such problems exist
    • Practice thinking through problems
    • Coming up with various approaches and solutions
    • Get comfortable with various data structures and algorithms
    • Better sense of what tools, approaches, strategies, refinements you can make
  • Improvement on Communication & Comprehension:
    • Be able to communicate approaches, solutions, thought processes outloud
    • Be able to refine thoughts and convey them effectively
    • Real-life interview-style practice
    • More about the process than the results
  • Get to interact with other classmates and HAVE FUN!

28 of 40

Problems + Approaches

Each group (3-4) will be assigned one of the following problems:

  • LeetCode 14. Longest Common Prefix
  • LeetCode 3042. Count Prefix and Suffix Pairs I

Different Approaches to Use:

  • Brute Force
  • Trie Approach
  • Binary Search/BST
  • Sorted Array/HashMap�

For instructions, search online for the problem by ID number, e.g. “LeetCode 219”.

29 of 40

Different Approaches Available - Leetcode 3042

Different Approaches Available - Leetcode 14

  • Brute Force
  • Trie approach
  • Sorted Array
  • Binary Search
  • Brute Force Approach
  • Trie approach
  • Balanced Binary Search Tree
  • HashMap

30 of 40

Step-by-Step Process

NO NEED to write up code for this activity.

Try to come up with solution approaches and how you would tackle the program on the high-level using specific data structures.

You’ll have about 20 minutes to complete:

  • Think individually for 2 min
  • Group discussion for 4 min (no computer)
  • Group discussion for 4 min with computer
  • Presentation prep for 4 min
  • Presenting to another group for 6 min

31 of 40

Let's get started!

32 of 40

Think individually for 2 min

33 of 40

Group discussion for 4 min (no PC)

34 of 40

Group discussion for 4 min (with PC)

35 of 40

Presentation prep for 4 min

36 of 40

Presenting to another group for 6 min

37 of 40

Presentation Guidelines

Questions to Ask Yourself:

  • What is the problem really asking?
  • What operations do I need to perform often?
  • In the examples, are there patterns that can help getting started?
  • When should I use specific data structures?

Presentation Guidelines

  • Explain how you approached the problem
  • Run through how you eliminated other options
  • Describe the data structure(s) and algorithm(s) you chose and why
  • Go through your solution step by step, explaining the logic

Complete the reflection activity in Gradescope and add your groupmates!

38 of 40

Trie Solution - Longest Common Prefix (14)

39 of 40

Trie Solution - Count Prefix and Suffix Pairs (3042)

40 of 40

Closing Announcements

  • Thanks for coming to section! Be sure to attend class tomorrow for the Two-Stage Exam! Stay safe and stay hydrated :)

Please don't hesitate to reach out if you have any

questions or concerns about the course, or if there's

anything else we can help you with! We're here to support

you however we can!