
AR CMD+F

For this design challenge I decided to tackle a problem I've encountered a lot: the "command+F" action for physical documents. Many times I am reading a physical book or document and want to "command+F," or search, to find a certain word. My design solves exactly this! It uses a camera to photograph a handwritten word, and the Google Cloud Vision API to recognize the letters. Next, a speaker reads back the word it is going to search for. Then the camera takes another picture of a printed document you show it, and Google Cloud Vision's text detection pulls all of the words off the document. Finally, the program searches the document's text for the handwritten word.
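The final matching step boils down to a case-insensitive substring search. Here is a minimal sketch of just that step in isolation (the helper name `search_document` is mine, not from the project code):

```python
def search_document(query: str, document_text: str) -> bool:
    """Return True if the handwritten query appears anywhere in the
    document text, ignoring case and surrounding whitespace."""
    return query.strip().lower() in document_text.lower()
```

In the full program, `query` comes from OCR of the handwritten word and `document_text` from OCR of the printed page.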

A servo motor then waves a green flag if the word was found, or a red flag if it was not. The device also says out loud "I found it" or "Sorry, I didn't find it."
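The flag-waving decision can be sketched separately from the hardware calls. A hypothetical helper (the name and return shape are mine) mapping the search result to the servo angle and spoken phrase used in the project code, where 180 degrees raises the green flag and 0 degrees the red one:

```python
def flag_response(found: bool):
    """Map a search result to (servo angle, spoken message).
    180 degrees raises the green flag, 0 degrees the red flag."""
    if found:
        return 180, "I found it"
    return 0, "Sorry, I didn't find it."
```

In the full program the returned angle is written to `crickit.servo_1.angle` and the message is passed to the speech function.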

Watch the video of it in action here: https://youtu.be/Qhu8wC34G-g.

Laser cutting the paper stand.

The paper stand put together.

Top of the stand, showing the slit that the paper goes through.

Back of the paper stand.

Close up of camera contraption.

Entire gadget in view.

Entire set up.

Close up of flags.


Code

import picamera     #camera library

import pygame as pg #audio library

import os           #communicate with os/command line

import io

from google.cloud import vision  #gcp vision library

from time import sleep

from adafruit_crickit import crickit

#set up your GCP credentials - point this at your service-account .json key file

os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="DET-2019-aad44b497877.json"

# this line connects to Google Cloud Vision!

client = vision.ImageAnnotatorClient()

# global variable for our image file - to be captured soon!

image = 'image.jpg'

def takephoto(camera):

   # this triggers an on-screen preview, so you know what you're photographing!

   camera.start_preview()

   # sleep(5)                   #give it a pause so you can adjust if needed

   sayStuff("5")

   sayStuff("4")

   sayStuff("3")

   sayStuff("2")

   sayStuff("1")

   sayStuff("SNAP")

   camera.capture('image.jpg') #save the image

   camera.stop_preview()       #stop the preview

def ocr_handwriting(path):

   """Reads the handwritten word(s) from the image file at path."""

   with io.open(path, 'rb') as image_file:

       content = image_file.read()

   image = vision.types.Image(content=content)

   response = client.text_detection(image=image)

   text = response.full_text_annotation

   word_text = ""

   for page in text.pages:

       for block in page.blocks:

           for paragraph in block.paragraphs:

               for word in paragraph.words:

                   word_text += " "

                   word_text += ''.join([

                       symbol.text for symbol in word.symbols

                       ])

   return word_text

def detect_text(path):

   """Detects text in the file."""

   client = vision.ImageAnnotatorClient()

   with io.open(path, 'rb') as image_file:

       content = image_file.read()

   image = vision.types.Image(content=content)

   response = client.text_detection(image=image)

   texts = response.text_annotations

   return texts[0].description

def turnOnCamera(camera):

   camera.start_preview()

def imageRec():

   camera = picamera.PiCamera()

   pathToImg = 'image.jpg'

   pg.init()

   pg.mixer.init()

   # turnOnCamera(camera)

   # while True:

   # First Recognize written word

   #then take written word and use it to search through document.

   takephoto(camera)

   writtenText = ocr_handwriting(pathToImg).strip()

   print(writtenText)

   # no inner double quotes here - they would break the espeak shell command
   sayStuff('Searching for the word: {0}'.format(writtenText))

   #recognize words in document

   takephoto(camera)

   docText = detect_text(pathToImg)

   # print(docText)

   #search through words to find initial word

   if (writtenText.lower() in docText.lower()):

       crickit.servo_1.angle = 180

       sayStuff("Found it")

   else:

       crickit.servo_1.angle = 0

       sayStuff("Sorry, I didn't find it.")

def sayStuff(stuff):

   # build the espeak command; 2>/dev/null suppresses espeak's warnings
   cmd_string = 'espeak -ven+f4 "{0}" 2>/dev/null'.format(stuff)

   os.system(cmd_string)

def resetServoPos():

   crickit.servo_1.angle = 90

def main():

   pg.init()

   pg.mixer.init()

   resetServoPos()

   imageRec()

if __name__ == '__main__':

   main()