1 of 58

AI Journal Club

Accessible AI: Easily and Locally Available State-of-the-Art Methods for VLMs, LLMs, Image, Voice and Music Synthesis

Ammar Qammaz a.k.a. AmmarkoV → ammar.gr

Heraklion - 1/11/2024

2 of 58

Motivation for the talk

AI 1950

Subsymbolic AI

Digital “neurons” inspired by brains

Data in / Transformed data out

Symbolic AI

Computer Programs

Manually created features

Logic rules to process them

3 of 58

Motivation for the talk

AI 1950

Subsymbolic AI

Digital “neurons” inspired by brains

Data in / Transformed data out

Symbolic AI

Computer Programs

Manually created features

Logic rules to process them

Won

4 of 58

Nobel History

  • 1906: Ramon y Cajal and Golgi: the neuron
  • 1932: C. Sherrington: the synapse
  • 1936: H. Dale and O. Loewi: neurotransmitters
  • 1963: Eccles, Hodgkin, Huxley: the spike
  • 1967: Granit, Hartline, and Wald: the eye
  • 1970: Katz, von Euler, Axelrod: synaptic vesicles
  • 1981: Sperry, Hubel, and Wiesel: vision
  • 1991: Neher and Sakmann: ion channels
  • 2000: Carlsson, Greengard, Kandel: plasticity
  • 2004: R. Axel and L. Buck: olfaction
  • 2014: O’Keefe, Mosers: place cells and grid cells
  • 2021: D. Julius and A. Patapoutian: heat and pressure sensors

Slide taken from Christos Papadimitriou, Archimedes presentation 2022

5 of 58

  • 1906: Ramon y Cajal and Golgi: the neuron
  • 1932: C. Sherrington: the synapse
  • 1936: H. Dale and O. Loewi: neurotransmitters
  • 1963: Eccles, Hodgkin, Huxley: the spike
  • 1967: Granit, Hartline, and Wald: the eye
  • 1970: Katz, von Euler, Axelrod: synaptic vesicles
  • 1981: Sperry, Hubel, and Wiesel: vision
  • 1991: Neher and Sakmann: ion channels
  • 2000: Carlsson, Greengard, Kandel: plasticity
  • 2004: R. Axel and L. Buck: olfaction
  • 2014: O’Keefe, Mosers: place cells and grid cells
  • 2021: D. Julius and A. Patapoutian: heat and pressure sensors
  • 2024: John J. Hopfield and Geoffrey E. Hinton: artificial neural networks
  • 2024: David Baker, Demis Hassabis, and John Jumper: AI protein folding

Slide taken from Christos Papadimitriou, Archimedes presentation 2022

6 of 58

Overview..

Sze, Vivienne, et al. "Efficient processing of deep neural networks: A tutorial and survey." Proceedings of the IEEE 105.12 (2017): 2295-2329.

7 of 58

Neural Network Performance Trend

8 of 58

Neural Network Size Trend

9 of 58

Neural Network Size Trend

10 of 58

Neural Network Size Trend

https://engineering.fb.com/2024/10/15/data-infrastructure/metas-open-ai-hardware-vision/

11 of 58

Neural Network Size Trend

https://edition.cnn.com/2024/09/20/energy/three-mile-island-microsoft-ai/index.html

12 of 58

Motivation, revisited..

- Foundation models and VLM/LLM agents will have a profound influence on society, the economy, privacy, and ETHICS! We are among the Greek vanguard for their proper understanding and use.

As with Microsoft OS / Intel / Apple / Android, it is not hard to imagine a similar lock-in for AI and robots from ChatGPT / Tesla / etc.

- In terms of development, even with a 10x more efficient method, it is impossible to compete with foundation models, due to their size, compute, power, training corpus and resource availability

- Small architectural improvements yield huge savings due to scale for large companies → incentive for research

- Even huge models are surprisingly “small” at evaluation (inference) time

- These models will be considered essential building blocks for future applications

- Practical-to-use models produce superb output, so in this talk we are going to see some of them

13 of 58

Survey

OS

Windows

Mac

Linux

Other

Prog.L.

Matlab/R

Python

C/C++

Other

LLM

ChatGPT

Claude

LLama

Other

14 of 58

Everything we will see runs locally

I am very happy to help with Linux setups :)!

15 of 58

AI in practice

16 of 58

AI in practice

17 of 58

My personal preference

18 of 58

Introducing LLama VLLM / LLM!

https://www.llama.com/

19 of 58

Step 1 : Get code!

20 of 58

Step 2 : Create venv

21 of 58

Step 3 : Activate venv

22 of 58

Step 4 : Install dependencies

Typical way to install requirements : python3 -m pip install -r requirements.txt

23 of 58

Step 4 : Install dependencies

24 of 58

Step 4 : Install dependencies

25 of 58

Step 5 : Run!
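
Steps 2 to 4 above can also be scripted from Python itself. The following is only a minimal sketch using the standard-library `venv` module; the helper names `create_venv` and `pip_install_command` are hypothetical and not part of any project shown in the talk, and it assumes the repository ships a `requirements.txt`.

```python
import os
import venv
from pathlib import Path


def create_venv(env_dir, with_pip=True):
    """Create a virtual environment and return the path to its interpreter.

    Equivalent in spirit to `python3 -m venv <env_dir>`.
    """
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    # The interpreter lives in bin/ on Linux/macOS and Scripts/ on Windows.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    return Path(env_dir) / bin_dir / exe


def pip_install_command(python_path, requirements="requirements.txt"):
    """Build the command that installs requirements with the venv's own pip.

    Calling the venv interpreter directly makes "activation" unnecessary.
    """
    return [str(python_path), "-m", "pip", "install", "-r", requirements]
```

The returned command can be passed to `subprocess.run(..., check=True)` from inside the cloned repository; using the environment's own interpreter avoids having to activate it first.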

26 of 58

e.g. Anthropic Computer Use VLLM

27 of 58

VLLMs + Actuators → Robots!

28 of 58

Regular LLMs

https://github.com/ggerganov/llama.cpp

29 of 58

Regular LLM

git clone https://github.com/ggerganov/llama.cpp

cd llama.cpp

make

./server -m models/Meta-Llama-3-8B-Instruct.Q5_1.gguf --port 8081
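
Once the server is up, it can be queried over HTTP. This is a minimal sketch, assuming the server listens on port 8081 as in the command above and exposes the llama.cpp `/completion` endpoint (JSON in, JSON out with a `content` field); the function names are hypothetical and only the Python standard library is used.

```python
import json
import urllib.request


def build_completion_request(prompt, host="127.0.0.1", port=8081, n_predict=128):
    """Build a POST request for the llama.cpp server /completion endpoint."""
    payload = {"prompt": prompt, "n_predict": n_predict}
    return urllib.request.Request(
        f"http://{host}:{port}/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the prompt to a running server and return the generated text."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["content"]


# Example (requires the server to be running):
# print(complete("Heraklion is a city in"))
```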

30 of 58

Regular LLM

How you “script” an LLM

User input that comes after is appended here
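
The idea can be sketched in a few lines: a fixed "script" sets the behaviour, and whatever the user types is appended to it before the text is sent to the model. The script text and the `User:` / `Assistant:` markers below are illustrative assumptions, not the exact prompt from the slide; instruction-tuned models usually expect their own chat template.

```python
# Fixed preamble that "scripts" the model's behaviour (illustrative text).
SCRIPT = (
    "You are a helpful assistant that answers in one short sentence.\n"
    "User: "
)


def build_prompt(user_input, script=SCRIPT):
    """Append the user's input to the fixed script and cue the model to answer."""
    return f"{script}{user_input.strip()}\nAssistant:"
```

The model then simply continues the text after `Assistant:`, which is why changing the script changes the behaviour without any retraining.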

31 of 58

Regular LLM example

32 of 58

Regular LLM example

33 of 58

LLMs as a service

34 of 58

Generative AI with Diffusion

https://yang-song.net/blog/2021/score/

35 of 58

Generative AI with Diffusion

36 of 58

Generative AI with Diffusion

https://github.com/AUTOMATIC1111/stable-diffusion-webui

37 of 58

Generative AI with Diffusion

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui

cd stable-diffusion-webui

python3 -m venv venv

source venv/bin/activate

python3 -m pip install -r requirements.txt

./webui.sh
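
For batch work, the web UI can also be driven from a script. This is a minimal sketch, assuming the UI was started with the `--api` flag (`./webui.sh --api`) and listens on its default address `http://127.0.0.1:7860`; the helper names are hypothetical and the payload uses only a few of the available fields.

```python
import base64
import json
import urllib.request


def txt2img_payload(prompt, steps=20, width=512, height=512):
    """Build a small JSON payload for the /sdapi/v1/txt2img endpoint."""
    return {"prompt": prompt, "steps": steps, "width": width, "height": height}


def decode_images(response_json):
    """The API returns images as base64 strings; decode them to raw bytes."""
    return [base64.b64decode(image) for image in response_json.get("images", [])]


def txt2img(prompt, base_url="http://127.0.0.1:7860", **kwargs):
    """Generate images for a prompt and return them as a list of PNG bytes."""
    request = urllib.request.Request(
        base_url + "/sdapi/v1/txt2img",
        data=json.dumps(txt2img_payload(prompt, **kwargs)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=600) as response:
        return decode_images(json.loads(response.read()))


# Example (requires the web UI to be running with --api):
# for i, png in enumerate(txt2img("a lighthouse at sunset")):
#     open(f"sample_{i}.png", "wb").write(png)
```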

38 of 58

Generative AI example

39 of 58

Generative AI example

40 of 58

Generative AI example

Custom Synthetic Dataset generation

41 of 58

Generative AI example

42 of 58

Generative AI example

Is this art?

43 of 58

Generative AI example

Is it not art?

44 of 58

LoRA Model Repository

https://civitai.com

45 of 58

Caption / Concept query

https://haveibeentrained.com/

46 of 58

Caption / Concept query

https://promptomania.com/stable-diffusion-prompt-builder/

47 of 58

Fast Voice synthesis!

https://github.com/rhasspy/piper

48 of 58

Fast Voice synthesis!

https://huggingface.co/rhasspy/piper-voices/blob/main/el/el_GR/rapunzelina/low/el_GR-rapunzelina-low.onnx
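
A voice such as the one above can be used from a script by piping text into the `piper` command-line tool. This is a minimal sketch, assuming `piper` is installed and on the `PATH`, and that the downloaded `.onnx` file sits next to its matching `.onnx.json` config; the helper names are hypothetical.

```python
import subprocess


def piper_command(model_path, output_path):
    """Build the piper command line for one voice model and one output WAV."""
    return ["piper", "--model", model_path, "--output_file", output_path]


def synthesize(text, model_path="el_GR-rapunzelina-low.onnx", output_path="out.wav"):
    """Feed the text to piper on standard input and write a WAV file."""
    subprocess.run(
        piper_command(model_path, output_path),
        input=text.encode("utf-8"),
        check=True,
    )
    return output_path


# Example (requires piper and the voice files to be available locally):
# synthesize("Καλημέρα από το Ηράκλειο!")
```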

49 of 58

Learning Voice synthesis!

https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/

50 of 58

This Demo is going to be tricky!

51 of 58

This Demo is going to be tricky!

52 of 58

Generative Music Synthesis

https://github.com/facebookresearch/audiocraft

53 of 58

Generative Music Synthesis

https://github.com/facebookresearch/audiocraft

54 of 58

Generative Music Synthesis

https://github.com/facebookresearch/audiocraft

55 of 58

Generative Music Synthesis

56 of 58

Generative Music Synthesis

57 of 58

Papers..

http://ammar.gr/news

58 of 58

Thank you!