For this design challenge I decided to tackle a problem I run into a lot: there is no “command+f” for physical documents. I am often reading a physical book or document and want to search through it for a certain word. My design does exactly this! It uses a camera to photograph a handwritten word and the Google Cloud Vision API to recognize the letters. Next, the speaker reads back the word it is going to search for. Then the camera takes another picture, this time of a printed document you show it, and uses Google’s “Detect text in image” API to pull all of the words off the document. Finally, it searches the document text for the handwritten word.
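
The final search step can be tried on its own, without the camera or the Vision API. Here is a minimal sketch, assuming both OCR results are already plain strings. The names `find_substring` and `find_whole_word` are hypothetical and only for illustration. The first function mirrors the plain substring check in the script further down; the second is one possible refinement that only matches whole words.

```python
import re

def find_substring(query, document):
    """Case-insensitive substring check (also matches inside longer words)."""
    return query.lower() in document.lower()

def find_whole_word(query, document):
    """Case-insensitive check that only matches the query as a whole word."""
    pattern = r'\b' + re.escape(query.strip()) + r'\b'
    return re.search(pattern, document, flags=re.IGNORECASE) is not None

doc = "The cat sat on the mat.\nConcatenate the strings."
print(find_substring("cat", "Concatenate the strings."))   # True
print(find_whole_word("cat", "Concatenate the strings."))  # False
print(find_whole_word("Cat", doc))                         # True
```

The substring version is more forgiving when the OCR splits or merges words, while the whole-word version avoids false hits such as "cat" inside "Concatenate".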

A servo motor then waves a green flag if the word was found, or a red flag if it was not. The device also says out loud “Found it” or “Sorry, I didn’t find it.”
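
The found / not found behavior boils down to a small mapping from the search result to a servo angle and a spoken phrase. Here is a minimal sketch of that mapping with the hardware calls left out. The function name `flag_response` and the two constants are hypothetical; the angles and phrases are the ones used in the script further down.

```python
FOUND_ANGLE = 180      # servo angle used when the word is found (green flag)
NOT_FOUND_ANGLE = 0    # servo angle used when the word is not found (red flag)

def flag_response(found):
    """Return (servo_angle, spoken_phrase) for a search result."""
    if found:
        return FOUND_ANGLE, "Found it"
    return NOT_FOUND_ANGLE, "Sorry, I didn't find it."

angle, phrase = flag_response(True)
print(angle, phrase)
```

Keeping this decision in a pure function makes it easy to check without a CRICKIT board attached; the caller would then set the servo angle and speak the phrase.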

Watch the video of it in action here:

Laser cutting the paper stand.

The paper stand put together.

Top of the stand showing the slit that the paper goes through.

Back of the paper stand.

Close up of camera contraption.

Entire gadget in view.

Entire set up.

Close up of flags.


import picamera     #camera library

import pygame as pg #audio library

import os           #communicate with os/command line

import io

from google.cloud import vision  #gcp vision library

from time import sleep

from adafruit_crickit import crickit

#set up your GCP credentials - replace the " " in the following line with your .json file and path

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = " "

# this line connects to Google Cloud Vision!

client = vision.ImageAnnotatorClient()

# global variable for our image file - to be captured soon!

image = 'image.jpg'

def takephoto(camera):

   # this triggers an on-screen preview, so you know what you're photographing!

   camera.start_preview()

   # sleep(5)                   #give it a pause so you can adjust if needed







   camera.capture('image.jpg') #save the image

   camera.stop_preview()       #stop the preview

def ocr_handwriting(path):

   with io.open(path, 'rb') as image_file:

       content = image_file.read()

   image = vision.types.Image(content=content)

   response = client.text_detection(image=image)

   text = response.full_text_annotation

   word_text = ""

   for page in text.pages:

       for block in page.blocks:

           for paragraph in block.paragraphs:

               for word in paragraph.words:

                   word_text += " "

                   word_text += ''.join([

                       symbol.text for symbol in word.symbols

                   ])

   return word_text

def detect_text(path):

   """Detects text in the file."""

   client = vision.ImageAnnotatorClient()

   with io.open(path, 'rb') as image_file:

       content = image_file.read()

   image = vision.types.Image(content=content)

   response = client.text_detection(image=image)

   texts = response.text_annotations

   return texts[0].description

def turnOnCamera(camera):

   camera.start_preview()  # assumed: the body of this function is missing from the source

def imageRec():

   camera = picamera.PiCamera()

   pathToImg = 'image.jpg'



   # turnOnCamera(camera)

   # while True:

   # First recognize the written word,

   # then take the written word and use it to search through the document.

   takephoto(camera)  # photograph the handwritten word

   writtenText = ocr_handwriting(pathToImg).strip()


   sayStuff('Searching for the word: "{0}"'.format(writtenText))

   # recognize words in document

   takephoto(camera)  # photograph the printed document

   docText = detect_text(pathToImg)

   # print(docText)

   #search through words to find initial word

   if writtenText.lower() in docText.lower():

       crickit.servo_1.angle = 180

       sayStuff("Found it")

   else:

       crickit.servo_1.angle = 0

       sayStuff("Sorry, I didn't find it.")

def sayStuff(stuff):

   cmd_string = 'espeak -ven+f4 "{0}" 2>/dev/null'.format(stuff)

   os.system(cmd_string)  # run espeak so the text is actually spoken


def resetServoPos():

   crickit.servo_1.angle = 90

def main():

   resetServoPos()  # assumed: the body of main() is missing from the source

   imageRec()

if __name__ == '__main__':

   main()
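
One thing to watch in sayStuff: the spoken phrase includes whatever text the OCR step returned, so a stray double quote in the recognized word would break the shell string. Here is a minimal sketch of one alternative, assuming espeak is installed and on the PATH: pass the arguments as a list through subprocess, so no shell quoting is involved. The names `build_espeak_command` and `say` are hypothetical and not part of the script above.

```python
import subprocess

def build_espeak_command(text):
    """Build the espeak argument list; the text is passed as a single argument."""
    return ['espeak', '-ven+f4', text]

def say(text):
    """Speak the text with espeak, discarding its error output."""
    # Assumes espeak is installed; raises FileNotFoundError otherwise.
    subprocess.run(build_espeak_command(text), stderr=subprocess.DEVNULL, check=False)

# Only build and show the command here, so this runs without espeak or a speaker.
print(build_espeak_command('Searching for the word: "flag"'))
```

Calling `say("Found it")` on the Pi should behave like the sayStuff call above, with the difference that quotes inside the text are passed through to espeak untouched.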