1 of 17

Human Vs. AI

Question: Can humans beat AI? How does the level of AI training affect AI’s ability to solve human-level reasoning tasks?

Name: Yeji Choeh

School: Meadow Park Elementary School

Teacher: Ms. Jackson

2 of 17

Abstract

This project asked the question, “How does the level of AI training affect AI’s ability to solve human-level reasoning tasks?”

My hypothesis was that if AI is trained with less data, its ability will be lower.

I predicted that AI would perform better on logic and math problems, which have large amounts of training data, and worse on emotion judgment tasks, which have less and harder-to-define data.

To test this, I prepared two types of questions: math quiz questions and emotion-recognition questions.

Each category had 10 questions, for a total of 20 questions.

One AI model and three human participants answered the same questions.

To keep the test fair, emotion questions were recorded and shown the same way each time.

All results were recorded, and averages and error percentages were calculated.

The results showed that, on the emotion-recognition questions, humans scored an average of 86%, while the AI scored 70%.

The AI performed best on the math problems and had the highest error rate on emotion recognition.

This shows a clear pattern that AI struggles more when tasks require emotional understanding.

Based on the data, my hypothesis was supported.

AI is less accurate at emotion judgment because it has much less emotional training data, and emotions are harder to clearly define than logic or math.

This explains why AI is strong at pattern-based tasks but still limited in human-style understanding.

3 of 17

Research

When artificial intelligence (AI) first appeared to the public, it was like a young child.

It needed a lot of human guidance to learn and make decisions.

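However, AI developed very quickly. In 2016, the AI program AlphaGo beat Korean Go master Lee Sedol 4 games to 1. Lee Sedol's single win in that match is often described as the last human victory over AlphaGo, and since then top human players have not been able to keep up with the strongest Go AI programs.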

Many people once believed that creative areas, such as drawing pictures or writing stories, would always be too difficult for AI. Surprisingly, today many artists and writers successfully use AI to create new and popular works. This shows how powerful AI has become in a short time.

This project was started to understand what AI looks like today, what areas still need more development, and why. It also explores what students like me can do in the future AI era.

Key research points:

  • AI learns through deep learning, which is inspired by the human brain.
  • AI improves when it is trained with more data, but it does not truly understand like humans do.
  • Humans use life experience, emotion, and reasoning. AI does not naturally have these, and there is less data available to train AI on them.

Through this project, I tested and explored in which areas AI is still behind humans and what this means for the future.

4 of 17

Hypothesis

If AI is trained with less data, then its ability will be lower.

AI performs better on logic, writing, and problem-solving tasks because there is a large amount of high-quality data available for training.

However, AI will perform worse on emotion recognition because emotional data is hard to collect, and there is much less real training data.

Even though AI uses deep learning*, the difference in training data causes AI to have lower performance in tasks that involve understanding human emotions.

* Deep learning is a way that AI learns from data.

It uses many tiny parts, called “neurons,” that work like the human brain.

The AI learns by finding patterns in lots of examples.

The more good data it practices with, the better it gets.
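
As a rough illustration of learning from examples, here is a minimal Python sketch of one artificial "neuron" with a single weight. It is far simpler than real deep learning, which connects many neurons together, and it is not the AI model tested in this project. The function name `train_neuron`, the example pairs, and the learning rate are all made up for illustration.

```python
def train_neuron(examples, steps=200, learning_rate=0.01):
    """Learn one weight so that weight * x gets close to the target.

    Illustrative only: a single neuron with one weight, trained by
    repeatedly correcting its own mistakes on the examples.
    """
    weight = 0.0  # the neuron starts out knowing nothing
    for _ in range(steps):
        for x, target in examples:
            prediction = weight * x
            error = prediction - target
            # Nudge the weight in the direction that shrinks the error.
            weight -= learning_rate * error * x
    return weight


# Hypothetical practice examples that follow the pattern: output = 2 * input.
examples = [(1, 2), (2, 4), (3, 6), (4, 8)]
learned_weight = train_neuron(examples)
print(round(learned_weight, 3))  # prints 2.0
```

Each pass over the examples moves the weight a little closer to the hidden pattern (output = 2 × input), which is the idea behind "finding patterns in lots of examples."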

5 of 17

Materials

  • Laptop computer (LG Gram, Intel Core i5 7th Generation)

  • 10 math questions

  • 10 emotion guessing questions

  • 3 human participants (ages 30-40)

  • One free online AI model

  • One recording camera (Mobile phone)

  • Data sheet (Google Sheets)

6 of 17

Procedures

  1. Prepare 2 sets of questions:

A. Math quiz (10)

B. Emotion recognition (10)

(To improve the accuracy of the experiment, I controlled the variables. When asking the emotion questions, acting differently each time could affect the results. So, to keep the conditions the same, I recorded the questions in advance and used the same video for every test.)

  2. Ask the 3 human participants and the AI model to answer the 2 sets of questions.

  3. Compare the human scores and the AI’s score for each category. Record all results in a table.

  4. Calculate averages and % error.

  5. Observe patterns and explain which tasks are hardest for AI.
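
The averages and % error step could be calculated as in the minimal Python sketch below. The procedure does not give a formula for % error, so this sketch assumes it means the share of wrong answers out of the total number of questions. The function names and the three example scores are hypothetical and are not the project's recorded data.

```python
def average(scores):
    """Return the mean of a list of scores."""
    return sum(scores) / len(scores)


def percent_correct(correct, total=10):
    """Return the percentage of questions answered correctly."""
    return correct * 100 / total


def percent_error(correct, total=10):
    """Return the percentage of questions answered incorrectly.

    Assumption: % error = wrong answers / total questions * 100.
    """
    return (total - correct) * 100 / total


# Hypothetical scores (out of 10) for three participants on one quiz.
example_scores = [6, 9, 9]
example_average = average(example_scores)

print(example_average)                   # 8.0
print(percent_correct(example_average))  # 80.0
print(percent_error(7))                  # 30.0
```

Under this assumption, a score of 7 out of 10 is the same as 70% correct and 30% error.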

7 of 17

Results - Data/Observations

Correct Answers (Out of 10)

Quantitative Observations:

  • AI performed best on the math quiz.
  • AI struggled the most with emotion recognition.

One more interesting finding was seen in the human test group.

The person who did well on math problems had a lower score on emotion recognition.

The person who did well on emotion recognition had a lower score on math problems.

Because there were only three human participants, this result is not certain.

However, I found this pattern interesting and worth studying more in the future.

Task

Human Average

AI

Human 1

Human 2

Human 3

Math Quiz

8

7

9

8

10

Emotion Quiz

8.6

10

7

8

7

8 of 17

Results - Data/Observations (cont’d)

As expected, the AI showed perfect performance on math problems.

But I was surprised that the AI did better than I expected on the emotion-recognition questions.

In some cases, the AI seemed to use the meaning of words in the sentence to guess the emotion.

However, the AI had difficulty understanding sarcastic or ironic tones, when the real feeling was different from the words.

[Emotion Recognition Questions] [Human Answer] [AI Answer]

9 of 17

Results - Data/Observations (cont’d)

Why, then, does AI ability differ by task when the same deep learning mechanism is used?

So far, the amount of available training data has been very different depending on the task.

  • General (logic / math / language) area: trained on very large text datasets, such as books, websites, articles, and math explanations and tutorials.

For example: Common Crawl contains hundreds of trillions of words before filtering.

This makes math, logic, and writing tasks easier for AI to learn, because the data is easy to collect and easy to label as right or wrong.

  • Emotion recognition: the data is much smaller and harder to make. Examples of emotion datasets:

- GoEmotions: about 58,000 labeled text samples
- FER-2013: about 36,000 face images
- AffectNet: about 1 million images

Even the largest of these emotion datasets has about a million examples, while language data has trillions of words.
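
To make the size gap concrete, here is a minimal Python sketch that divides a language dataset size by each emotion dataset size listed above. It assumes exactly one trillion words as a low-end stand-in for "trillions," and it compares words with labeled samples and images, so the result is only a rough sense of scale. The names `language_words` and `times_larger` are made up for illustration.

```python
# Approximate emotion dataset sizes, taken from the list above.
emotion_datasets = {
    "GoEmotions": 58_000,      # labeled text samples
    "FER-2013": 36_000,        # face images
    "AffectNet": 1_000_000,    # images
}

# Assumption: one trillion words, a low-end stand-in for "trillions".
language_words = 1_000_000_000_000


def times_larger(big, small):
    """Return how many times larger `big` is than `small`."""
    return big / small


for name, size in emotion_datasets.items():
    ratio = times_larger(language_words, size)
    print(f"{name}: language data is about {ratio:,.0f} times larger")
```

Even compared with AffectNet, the largest dataset in the list, one trillion words is about a million times more examples.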

10 of 17

Conclusion/Results Discussion

My background research helped me understand how AI learns from data.

I learned that AI performs better when it has more training data. This research helped me form my hypothesis and design my experiment to compare logic and emotion tasks.

The answer to my question is: AI performs worse on emotion judgment than on math tasks. My hypothesis was supported by the data.

The data supported my hypothesis because the AI scored higher on the math problems than on the emotion-recognition questions.

This project mostly turned out as I expected, but I was surprised that AI did better than expected on some emotion questions by using word meanings.

From my data, I found a clear pattern: AI performed best on the math tasks and worst on the emotion-recognition tasks.

(On the emotion-recognition questions, humans scored an average of 86%, while AI scored 70%. The AI's error rate was highest in emotion recognition.)

One reason for this result is the difference in training data size.

AI is trained on trillions of words for language and math, but on only millions of emotion examples or fewer.

Emotion data is also harder to label because people may disagree, and sarcasm or hidden feelings are difficult to define.

Some variables that could affect the results include the small number of human participants and the limited number of test questions.

Scientifically, this explains why AI is strong at patterns but weaker at understanding emotions.

This finding is similar to other studies that show AI struggles with emotions and real understanding.

In conclusion, AI is less accurate at emotion judgment because it has much less training data, and emotional data is harder to clearly define than math or logic data.

11 of 17

Application/Future Research

If I could do this project again, I would test more AI models and use a larger number of questions to get more accurate results.

My results can be used in everyday life.

Understanding AI’s weaknesses helps people design better robots, learning tools, and safety systems.

For example, knowing that AI struggles with emotions can help humans double-check important decisions.

People who may be interested in this research include AI scientists, teachers, psychologists, and engineers, because emotions and understanding are important when AI works with humans.

This project can also be applied to similar studies, such as how AI understands body movement, voice tone, or behavior.

For future research, I would like to test what happens when AI is trained with emotional data, videos, or real-world experiences.

I also want to explore how AI can better understand human motion, not just words or pictures.

Personally, I want to keep exploring these underdeveloped areas.

I hope to become an AI scientist who helps develop AI that understands humans better and helps make the world a safer and better place.

12 of 17

References Cited

  • IBM. (2023). What is artificial intelligence (AI)? https://www.ibm.com/topics/artificial-intelligence

  • Google Research. (2021). GoEmotions: A dataset for emotion recognition in text. https://github.com/google-research/google-research/tree/master/goemotions

  • Common Crawl Foundation. (2023). Common Crawl overview. https://commoncrawl.org

  • BrainPOP. (2021). Artificial intelligence. https://www.brainpop.com/technology/artificialintelligence
