Jeopardy! Trivia Site
Group 7 – Somang Han, Sally Hu, Dat Nguyen, David Weissman
1
Have some coffee and play a Jeopardy! game with us!
About Us
Motivation
Welcome to Group 7 Presentation!
We are Somang, Sally, Dat and David.
2
Preview of Our Web Application
3
Preview of Our Web Application
4
Datasets
In our application, we will mainly leverage two datasets.
Kaggle Dataset
J! Archive Dataset
5
Entity-Relationship (ER) Diagram
6
Relational and Schema
7
Relation Name and Schema | Foreign Keys and References (if applicable) | # of Instances |
contestants(cid, name, occupation, state) | N/A | 12,748 |
jeopardy_qa(qid, category, question, answer) | N/A | 216,780 |
jeopardy_round(rid, round, value) | N/A | 118 |
jeopardy_show(showNum, season, airDate) | N/A | 6,820 |
jeopardy_episode(eid, showNum, rid, qid) | showNum: jeopardy_show(showNum) rid: jeopardy_round(rid) qid: jeopardy_qa(qid) | 216,910 |
contestants_play(cid, showNum, isWinner) | cid: contestants(cid) showNum: jeopardy_show(showNum) | 20,450 |
After normalizing, all of our relations are in 3NF.�
Application/Technologies
We utilized the following technologies.
8
�
Client
Server
Request
Network
Application Software (Client Portion)
Application Software (Server Portion)
Response
Communication Protocols
Complex Queries
See below some examples of our queries.
9
�
WITH top_answers AS (
SELECT *
FROM ( SELECT ANSWER
, COUNT(*) TOTAL_QUESTIONS
FROM JEOPARDY_QA qa
INNER JOIN JEOPARDY_EPISODE ep ON qa.qid = ep.qid
GROUP BY ANSWER
ORDER BY TOTAL_QUESTIONS DESC)
WHERE ROWNUM <= 20
),
top_categories as (
SELECT *
FROM (
SELECT DISTINCT qa1.category
,COUNT(qa1.answer) AS numAnswers
FROM jeopardy_qa qa1
INNER JOIN top_answers qa2 on qa1.answer = qa2.answer
GROUP BY qa1.category
ORDER BY numAnswers DESC
)
WHERE ROWNUM <= 10
)
SELECT tc.category ,qa.question ,ta.answer
FROM top_categories tc
INNER JOIN jeopardy_qa qa ON tc.category = qa.category
INNER JOIN jeopardy_episode ep ON qa.qid = ep.qid
INNER JOIN top_answers ta ON qa.answer = ta.answer
ORDER BY tc.category, ta.answer`;
WITH winners AS (
SELECT *
FROM contestants_view
WHERE isWinner=1
),
top_occupations AS (
SELECT *
FROM ( SELECT occupation ,COUNT(DISTINCT cid) as numOcc
FROM winners
GROUP BY occupation
ORDER BY numOcc DESC)
WHERE ROWNUM <= 20
),
top_players AS (
SELECT c.occupation,name ,COUNT(DISTINCT shownum) AS numWon
FROM winners c
INNER JOIN top_occupations o ON c.occupation = o.occupation
HAVING COUNT(DISTINCT shownum) > 1
GROUP BY c.occupation,name
ORDER BY c.occupation,numWon DESC
),
final_top AS (
select occupation ,name, numWon,
(ROW_NUMBER() OVER (PARTITION BY OCCUPATION ORDER BY NUMWON DESC)) AS grp
FROM top_players
)
SELECT occupation, name, numwon
FROM final_top
WHERE grp <= 3
ORDER BY occupation, numwon desc;
Performance
10
We employed several optimization techniques to speed up our performance.
Query | Timings (in seconds) | Improvements | ||||
Before | After | Delta | Creation & Indexing of View | Index on Jeopardy Q&A table | Restructure Query | |
Questions From Top Categories Of Top Answers | 1.125 | 0.682 | 0.443 | � | X | � |
Top Winners From Top Occupations | 0.227 | 0.173 | 0.054 | X | � | X |
Get Questions | 0.194 | 0.176 | 0.018 | � | X | � |
Top Winners With Most Consecutive Wins | 0.116 | 0.099 | 0.017 | X | � | X |
Top Questions by Category | 0.133 | 0.123 | 0.01 | � | X | � |
�
Technical Challenges
11
This is not an easy feat!
Play Function
While implementing the play page, we spent some time debugging (ex. overriding answers) to better user experience.
Fuzzy Matching
Matching a user’s answer to the correct answer was challenging since there may be discrepancies between actual answers and the user’s submission.
Front-End Design
Designing the website in a nice format was difficult. We began by browsing Color Hunt and used Jeopardy! show pictures.
Missing Data on the
J! Archive website
Missing data on the web made html syntaxes different and led to getting information that we did not want.
Thank You!
12