AI Journal Club
Accessible AI, Easily locally available State-of-the-Art Methods for VLMs, LLMs, Image, Voice and Music Synthesis
Ammar Qammaz a.k.a. AmmarkoV → ammar.gr
Heraklion - 1/11/2024
Motivation for the talk
AI 1950
Subsymbolic AI
Digital “neurons” inspired by brains
Data in / Transformed data out
Symbolic AI
Computer Programs
Manually created features
Logic rules to process them
Motivation for the talk
AI 1950
Subsymbolic AI
Digital “neurons” inspired by brains
Data in / Transformed data out
Symbolic AI
Computer Programs
Manually created features
Logic rules to process them
Won
Nobel History
Slide taken from Christos Papadimitriou, Archimedes presentation 2022
Slide taken from Christos Papadimitriou, Archimedes presentation 2022
Overview..
Sze, Vivienne, et al. "Efficient processing of deep neural networks: A tutorial and survey." Proceedings of the IEEE105.12 (2017): 2295-2329.
Neural Network Perfomance Trend
Neural Network Size Trend
Neural Network Size Trend
Neural Network Size Trend
https://engineering.fb.com/2024/10/15/data-infrastructure/metas-open-ai-hardware-vision/
Neural Network Size Trend
https://edition.cnn.com/2024/09/20/energy/three-mile-island-microsoft-ai/index.html
Motivation, revisited..
- Foundation models, VLM/LLM agents will have profound influence in society, economy, privacy, ETHICS!!. We are among the Greek vanguard for their proper understanding/use.
Similar to Microsoft OS / Intel / Apple / Android it is not hard to imagine the lock-in for ChatGPT / Tesla / etc. AI/Robots..
- In terms of development even with a 10x efficient method, it is impossible to compete with foundation models, due to their size/computing/power/training corpus/resource availability
- Small architectural improvements yield huge savings due to scale for large companies → incentive for research
- Even huge models are surprisingly “small” on evaluation
- These models will be considered essential building blocks for future applications
- Practical to use models produce superb output,� so in this talk we are going to see some of them
Survey
OS | Windows | Mac | Linux | Other |
| | | | |
Prog.L. | Matlab/R | Python | C/C++ | Other |
| | | | |
LLM | ChatGPT | Claude | LLama | Other |
| | | | |
Everything we will see runs local
I am very happy to help
with linux setups :) !
AI in practice
AI in practice
My personal preference
Introducing LLama VLLM / LLM!
https://www.llama.com/
Step 1 : Get code!
Step 2 : Create venv
Step 3 : Activate venv
Step 4 : Install dependencies
Typical way to install requirements : python3 -m pip install -r requirements.txt
Step 4 : Install dependencies
Step 4 : Install dependencies
Step 5 : Run!
e.g. Anthropic Computer Use VLLM
VLLMs + Actuators → Robots!
Regular LLMs
https://github.com/ggerganov/llama.cpp
Regular LLM
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
./server -m models/Meta-Llama-3-8B-Instruct.Q5_1.gguf --port 8081
Regular LLM
How you “script” an LLM
User input that comes after is appeneded here
Regular LLM example
Regular LLM example
LLMs as a service
Generative AI with Diffusion
https://yang-song.net/blog/2021/score/
Generative AI with Diffusion
Generative AI with Diffusion
https://github.com/AUTOMATIC1111/stable-diffusion-webui
Generative AI with Diffusion
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui
cd stable-diffusion-webui
python3 -m venv venv
source venv/bin/activate
python3 -m pip install -r requirements.txt
./webui.sh
Generative AI example
Generative AI example
Generative AI example
Custom Synthetic Dataset generation
Generative AI example
Generative AI example
Is this art ?
Generative AI example
Is it not art ?
LORA Model Repository
https://civitai.com
Caption / Concept query
https://haveibeentrained.com/
Caption / Concept query
https://promptomania.com/stable-diffusion-prompt-builder/
Fast Voice synthesis!
https://github.com/rhasspy/piper
Fast Voice synthesis!
https://huggingface.co/rhasspy/piper-voices/blob/main/el/el_GR/rapunzelina/low/el_GR-rapunzelina-low.onnx
Learning Voice synthesis!
https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/
This Demo is going to be tricky!
This Demo is going to be tricky!
Generative Music Synthesis
https://github.com/facebookresearch/audiocraft
Generative Music Synthesis
https://github.com/facebookresearch/audiocraft
Generative Music Synthesis
https://github.com/facebookresearch/audiocraft
Generative Music Synthesis
Generative Music Synthesis
Papers..
http://ammar.gr/news
Thank you!