HW3 Discussion
Content
Model track - Jeong Jun Lee
Analysis of AGI - [K-KP-E]
Knowledge: JSON file
Knowledge Program: JavaScript program
Environment: Othello Env
Analysis of AGI - [LLM]
Limitations of LLM
- Information distribution imbalance
- Language limitation
- Lack of compression
Reasoning in continuous latent space
Sketch of AGI core capability - [Generalization]
Hopfield Networks
with other nodes implicitly
Dynamic Node Architecture
Node A
Node B
Node C
Model track - Byeong Chang Kim
Analysis of AGI - [DIKW]
“Human-Like” & “Intelligent”
Data: Observations in markets A and B.
Information: In A, milk is 300 won and cereal is 1000 won. In B, milk is 500 won and cereal is 1000 won.
Knowledge: Milk is cheaper in A than B.
Wisdom: Coffee would be cheaper in
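The first three levels above can be illustrated with a small sketch. It only uses the prices given on the slide. The function names `to_information` and `to_knowledge` are hypothetical, and the wisdom step is left out because it requires generalizing to items that were never observed.

```python
# Hypothetical sketch of the Data -> Information -> Knowledge steps.
# Data: raw observations as (market, item, price in won), taken from the slide.
observations = [
    ("A", "milk", 300), ("A", "cereal", 1000),
    ("B", "milk", 500), ("B", "cereal", 1000),
]

def to_information(obs):
    """Organize raw observations into a price table per market."""
    table = {}
    for market, item, price in obs:
        table.setdefault(market, {})[item] = price
    return table

def to_knowledge(table, item):
    """Compare markets A and B for one item; return the cheaper market, or None on a tie."""
    a, b = table["A"][item], table["B"][item]
    if a == b:
        return None
    return "A" if a < b else "B"

info = to_information(observations)
print(to_knowledge(info, "milk"))    # A
print(to_knowledge(info, "cereal"))  # None
```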
Analysis of AGI - [Dreamer]
Phase 1: Collects experience and learns the current world dynamics
Phase 2: Uses the learned time-series-based world model to sequentially predict value, reward, and action
- Learns policy and critic networks without actual world interaction
Mirrors human dreaming and imagination
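The two phases can be sketched on a toy problem. This is not Dreamer itself: the sketch swaps the learned latent world model for a lookup table and the policy and critic networks for tabular Q-learning, and the 5-state chain world and all names are assumptions made for illustration. What it keeps is the structure: Phase 1 touches the real environment, Phase 2 learns only from imagined rollouts.

```python
import random

N_STATES, GOAL = 5, 4          # toy chain world (assumed): states 0..4, reward at state 4
ACTIONS = (-1, +1)             # move left / move right

def real_step(state, action):
    """The real environment; only Phase 1 is allowed to call this."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == GOAL else 0.0)

def collect_and_learn_model(episodes=200, horizon=10, seed=0):
    """Phase 1: act randomly in the real world and record its dynamics."""
    rng = random.Random(seed)
    model = {}                 # (state, action) -> (next_state, reward)
    for _ in range(episodes):
        s = rng.randrange(N_STATES)
        for _ in range(horizon):
            a = rng.choice(ACTIONS)
            model[(s, a)] = real_step(s, a)
            s = model[(s, a)][0]
    return model

def learn_in_imagination(model, sweeps=500, horizon=10, gamma=0.9, lr=0.5, seed=1):
    """Phase 2: improve Q-values from imagined rollouts, using the model only."""
    rng = random.Random(seed)
    q = {key: 0.0 for key in model}
    starts = sorted({s for s, _ in model})
    for _ in range(sweeps):
        s = rng.choice(starts)
        for _ in range(horizon):
            a = rng.choice(ACTIONS)
            if (s, a) not in model:
                break                        # the model has never seen this transition
            nxt, r = model[(s, a)]           # predicted by the model, not observed
            best_next = max(q.get((nxt, b), 0.0) for b in ACTIONS)
            q[(s, a)] += lr * (r + gamma * best_next - q[(s, a)])
            s = nxt
    return q

model = collect_and_learn_model()
q = learn_in_imagination(model)
policy = {s: max(ACTIONS, key=lambda a: q.get((s, a), 0.0)) for s in range(N_STATES)}
print(policy)
```

In this toy world the greedy policy should prefer moving right toward the rewarded state, even though `real_step` is never called during Phase 2.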
Limitation
1. Manipulated Phenomenal World
2. Absence of Dialectical Thinking
3. Processing Time Limitations
Generative Method – GAN
Contrastive Method – Positive/Negative
Sketch of AGI core capability - [Dreamer + AMAGO2]
Meta-cognitive: “Know what I know and what I don’t know”
Maintain Dreamer’s human-like characteristics
Utilize AMAGO2’s meta-cognitive capabilities (Attention)
Contrastive encoder for processing counter-problems
- Capability to handle dialectical thesis, antithesis, and synthesis
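One common way to train a contrastive encoder on positive and negative pairs is the InfoNCE loss. The slides do not name a specific loss, so this choice is an assumption, and the vectors standing in for a thesis, an agreeing view, and an opposing view are hypothetical. The sketch uses plain Python and assumes non-zero embedding vectors.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: low when the anchor is closer to the positive than to the negatives."""
    a = normalize(anchor)
    candidates = [positive, *negatives]
    logits = [dot(a, normalize(c)) / temperature for c in candidates]
    m = max(logits)                                   # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]                        # negative log-softmax of the positive

# Hypothetical 2-D embeddings, for illustration only.
thesis = [1.0, 0.0]
agreeing = [0.9, 0.1]
opposing = [-1.0, 0.0]

loss_good = info_nce(thesis, agreeing, [opposing])   # positive is the agreeing view
loss_bad = info_nce(thesis, opposing, [agreeing])    # positive is the opposing view
print(loss_good < loss_bad)
```

An encoder trained to minimize this loss pulls matching pairs together and pushes opposing pairs apart, which is one possible reading of the positive/negative contrast on the slide.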
Benchmark track - Byunghwa Yoo & Jung Min Kim
Key Characteristics for AGI Evaluation
1. Generalization Ability
2. Multi-Modality
3. Cognitive Competence Implementation
4. Continual Learning
5. Usage and Prior Knowledge Integration
1. Generalization Ability (Multi-task)
2. Human-like
3. Reasoning capability
4. Application
5. Creativity
Current Benchmark’s Limitation
- Lack of modality
- No continual learning
- If the model is trained on a new benchmark, we can’t check whether it keeps the abilities it already had.
- Each benchmark measures only the properties it was designed to measure
- Models become task-specific when we try to solve ARC with AI (they learn every pattern used in ARC).
- Gather multiple multi-modal benchmarks.
To evaluate AGI, every benchmark’s environment should be different.
- Do continual learning on a training set drawn from multiple benchmarks.
- Include IQ-like tests and reasoning tasks.
- Use tests designed for humans.
By combining them, we can avoid task-specific benchmarks.
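The continual-learning concern above (checking whether a model keeps abilities it already had) can be made measurable with a simple forgetting score. This is a minimal sketch under assumptions: the function name `forgetting` and the score layout are hypothetical, and the numbers are made up for illustration, not measured results.

```python
def forgetting(acc_history):
    """acc_history[i][j]: score on benchmark j measured after training stage i.

    Returns, per benchmark, the drop from its best earlier score to its final score
    (0.0 when the final score is at least as good as every earlier one).
    """
    final = acc_history[-1]
    if len(acc_history) < 2:
        return [0.0] * len(final)        # nothing earlier to compare against
    drops = []
    for j in range(len(final)):
        best_before = max(stage[j] for stage in acc_history[:-1])
        drops.append(max(0.0, best_before - final[j]))
    return drops

# Made-up scores for illustration only (not measured results).
history = [
    [0.90, 0.10],   # after training on benchmark 0
    [0.60, 0.85],   # after also training on benchmark 1
]
print(forgetting(history))
```

Re-running every earlier benchmark after each training stage is what makes the score possible, which is why the proposal combines multiple benchmarks rather than evaluating only the newest one.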
Othello’s Good Points
- Prototype of AGI Benchmark
A specific environment in which the early stages of AGI can be tested
- Explainability
It is a discrete grid environment, so we can see exactly what changed.
- Tries to evaluate “human-like” actions.
We can make our own strategy and play.
- With other environments, a model can upgrade its own strategy to win.
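The explainability point can be sketched directly: on a discrete grid, comparing the board before and after a move lists every change. The board encoding below ('.' empty, 'B' black, 'W' white, 8x8 nested lists) and the function name `board_diff` are assumptions for illustration, not the benchmark's actual interface.

```python
def board_diff(before, after):
    """Return (row, col, old, new) for every cell that differs between two boards."""
    return [
        (r, c, old, new)
        for r, (row_before, row_after) in enumerate(zip(before, after))
        for c, (old, new) in enumerate(zip(row_before, row_after))
        if old != new
    ]

# Standard Othello opening position: '.' empty, 'B' black, 'W' white.
start = [list("........") for _ in range(8)]
start[3][3], start[3][4] = "W", "B"
start[4][3], start[4][4] = "B", "W"

# Black plays at row 2, col 3 and flips the white disc at (3, 3).
after = [row[:] for row in start]
after[2][3] = "B"
after[3][3] = "B"

print(board_diff(start, after))
# [(2, 3, '.', 'B'), (3, 3, 'W', 'B')]
```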
Improvement Plan
1. Highly restricted environment
- DL-based and RL-based approaches are impossible for this benchmark (due to the time constraint). Proposal: make an environment that can support DL.
2. No prior-knowledge
- Models can’t use their prior knowledge when they encounter a new task
- Building a DB can help models utilize prior knowledge.
3. Rules are simple
- Mono-color game
- Call only coordinates
- Combine with other games (YINSH)
- Extend from 1v1 to multiple players
Peer review
- You can take the HW3 with your name on the post-it and read it.
- Please read the answer sheet carefully and return it to the owner when you have finished reading it.
- Discuss your ideas with the author.
~3:20 pm.
Group discussion
Key Discussions
- Evaluate Snakebench or other benchmarks as an AGI benchmark. (B)
- Discuss question 2 from this HW3. Evaluate whether the features are well aligned with your question 1. (B)
- Consider whether current methods of processing and representing knowledge are sufficient. (M)
- Examine whether scaling up is truly the answer (M)
- Explore how we might measure “AGI-tic” using methods other than accuracy. (M)