Entrepreneurial Computer Vision Challenges
Opportunity: Prove that you or your team are computer vision experts! You will have 22 days to work on these entrepreneurial computer vision challenges.
Serge Belongie and Jan Erik Solem have worked together to create the following computer vision challenges for the LDV Vision Summit. Show off your wisdom to peers, investors, technology and media executives, and startups.
World-renowned entrepreneurs, computer vision experts, and investors will be the judges.
APPLY HERE: http://bit.ly/1muxGQZ
Judges: Listed in the Agenda http://www.visionsummit.net
Process and Guidelines:
Datasets: Provided by the University of California San Diego and the University of Texas
Challenge Starts: April 23, 2014
Challenge End: May 14, 2014 at 23:59 PDT
Challenge duration: 22 days
Pre-selection Phase 1: May 15-18, 2014.
Pre-selection Phase 2: Before May 20 we will announce the shortlisted teams and invite 5-10 teams to NYC on either June 2 or 3 for presentation coaching by Evan Nisselson and other investors and entrepreneurs. After meeting the teams in person, we will select the finalists who will present at the Summit. Teams that are not selected to present will receive 1 free ticket per team to the Summit.
Finalists: 5 of the best teams selected by the pre-jury will be invited to give a short presentation of their solution and wisdom in front of the LDV Vision Summit audience in NYC on June 4th.
Winners: Winners will be announced at the end of the summit.
Challenges:
1. YouTube Video Text
The YouTube Video Text (YVT) dataset contains 30 videos collected from YouTube. Each video is 15 seconds long at 30 frames per second in HD 720p quality. The text content in the dataset is divided into two categories: overlay text (e.g., captions, song titles, logos) and scene text (e.g., street signs, business signs, words on shirts).
Challenge: The challenge consists of designing an end-to-end system for detecting and recognizing words in the video frames.
Participants are encouraged to submit their own quantitative measures for each challenge, preferably numbers that reflect performance in a real-world context, where both accuracy and compute cost matter; an illustrative scoring sketch is given below.
Download dataset:
http://vision.ucsd.edu/content/youtube-video-text
Dataset: University of California San Diego
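To make the evaluation concrete, below is a minimal Python sketch of one possible way to score an end-to-end submission on a single frame: a detected word counts as correct when its box overlaps a ground-truth box with IoU of at least 0.5 and the transcription matches case-insensitively. The box format, the threshold, and the matching rule are assumptions made for illustration, not the official YVT protocol.

# Illustrative sketch (not the official YVT metric): word-level end-to-end
# scoring for one frame. A predicted word is correct if its box overlaps a
# ground-truth box with IoU >= 0.5 and the transcription matches,
# case-insensitively. Boxes are (x1, y1, x2, y2) tuples (an assumption).

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / float(area_a + area_b - inter) if inter else 0.0

def frame_f_score(predictions, ground_truth, iou_thresh=0.5):
    """predictions / ground_truth: lists of (box, text) tuples for one frame."""
    matched_gt = set()
    true_positives = 0
    for pred_box, pred_text in predictions:
        for gt_index, (gt_box, gt_text) in enumerate(ground_truth):
            if gt_index in matched_gt:
                continue
            if iou(pred_box, gt_box) >= iou_thresh and pred_text.lower() == gt_text.lower():
                matched_gt.add(gt_index)
                true_positives += 1
                break
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: one frame with two ground-truth words and two detections.
gt = [((10, 10, 120, 40), "STOP"), ((200, 50, 330, 90), "Main St")]
pred = [((12, 8, 118, 42), "stop"), ((205, 48, 340, 95), "Main St")]
print(frame_f_score(pred, gt))  # 1.0 for this toy example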
2. Egocentric (First Person) Video
The UT Egocentric (UTE) dataset contains 4 videos from head-mounted cameras, each about 3-5 hours long, captured in a very uncontrolled setting. The human faces in the videos are artificially blurred for privacy reasons.
Challenge: The challenge consists of producing the best summarization of a given video to a specified length or number of frames, judged qualitatively by a panel; a minimal baseline sketch is given below.
Download dataset: http://vision.cs.utexas.edu/projects/egocentric_data/UT_Egocentric_Dataset.html
Dataset: University of Texas
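As a point of reference, below is a minimal Python sketch of the simplest possible baseline for this task: uniformly sampling a fixed number of keyframes from the full recording. It is offered only to illustrate the input and output of a summarizer; the frame counts in the example are assumptions, and any serious entry would replace the uniform spacing with learned importance scores.

# Minimal baseline sketch (an assumption, not a required approach): reduce a
# long egocentric video to a fixed-length storyboard by sampling keyframe
# indices at even intervals.

def uniform_summary(num_frames, summary_length):
    """Return `summary_length` frame indices spread evenly across the video."""
    if summary_length >= num_frames:
        return list(range(num_frames))
    step = num_frames / float(summary_length)
    return [int(i * step + step / 2) for i in range(summary_length)]

# Example (illustrative numbers): a 3-hour video at 15 fps, i.e. 162,000
# frames, summarized to a 60-frame storyboard.
indices = uniform_summary(num_frames=162000, summary_length=60)
print(indices[:5])  # first few selected frame indices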
3. Shoe Attributes
UT Zappos50K (UT-Zap50K) is a large shoe dataset consisting of 50,025 catalog images collected from Zappos.com. The images are divided into 4 major categories — shoes, sandals, slippers, and boots — followed by functional types and individual brands.
Challenge: The challenge consists of predicting the relative attributes (open, pointy, sporty, and comfort) for pairs of men's and women's shoes.
Participants are encouraged to submit their own quantitative measures for each challenge, preferably numbers that reflect performance in a real-world context, where both accuracy and compute cost matter; an illustrative scoring sketch is given below.
Download dataset:
http://vision.cs.utexas.edu/projects/finegrained/utzap50k/
Dataset: University of Texas
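For illustration, below is a minimal Python sketch of one way to report a quantitative number for this task: treat each attribute predictor as a scalar scoring function and measure how often it orders labeled pairs correctly. The scoring-function formulation, the pairwise-accuracy measure, and the image names are assumptions made for this example, not the official evaluation.

# Illustrative sketch (an assumption, not the official protocol): scoring
# relative-attribute predictions as pairwise ordering accuracy. For one
# attribute, the model assigns a scalar score to every image; a pair
# (id_a, id_b) labeled "a is stronger than b" is correct if score(a) > score(b).

def pairwise_accuracy(scores, pairs):
    """
    scores: dict mapping image id -> predicted attribute strength (float)
    pairs:  list of (id_a, id_b) tuples where id_a is labeled stronger
    """
    if not pairs:
        return 0.0
    correct = sum(1 for a, b in pairs if scores[a] > scores[b])
    return correct / len(pairs)

# Toy example for the "sporty" attribute on three hypothetical shoe images.
sporty_scores = {"shoe_001": 0.92, "shoe_002": 0.35, "shoe_003": 0.61}
labeled_pairs = [("shoe_001", "shoe_002"), ("shoe_003", "shoe_002")]
print(pairwise_accuracy(sporty_scores, labeled_pairs))  # 1.0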
APPLY HERE: http://bit.ly/1muxGQZ
Good luck!
Thanks,
Serge, Jan Erik and Evan
Questions: Please contact Evan Nisselson: e AT ldvlabs.com