MUG Generative AI Presentation
For the meeting that took place on Tuesday, October 10, 2023.
We covered a lot of ground, so this is meant to be a collection of references to what I demonstrated and, of course, the code to make your own ChatGPT. So to speak.
Generative AI is a system (or systems) capable of generating text, images, or other media in response to prompts. These systems use generative models, such as large language models (LLMs), to statistically sample new data based on the training data set that was used to create them. You can use them to create art, music, stories, poetry, blog posts, and even write code for you.
It's so good at this stuff that people are lining up on either side of an imaginary fence, trying to take one side or the other. On the benefit side, we talk about enhancing human creativity, saving time and resources, exploring strange new worlds of fun and experimentation, improving customer experience, and more. On the less optimistic side, people are raising ethical and legal issues, such as plagiarism, bias, quality control, etc. There are, of course, people who worry that we are unleashing SkyNet, complete with images of the Terminator walking around shooting humans. We'll ignore that for now.
There's no question that we need to discuss this stuff, and we need to do that now. I've written quite a few posts recently that touch on AI and what it means for us as humans. I'm rather obsessed with this topic, so subscribe at https://freethinkeratlarge.com if you want to follow along with these (and other) discussions from yours truly. Here are a few links to get you started.
https://www.freethinkeratlarge.com/guardrails-on-chatgpt-and-other-ai/
https://www.freethinkeratlarge.com/angry-at-ai-for-stealing-your-job/
https://www.freethinkeratlarge.com/does-artificial-intelligence-think/
One more, and one I think is increasingly and vitally important in a world of automation and smart machines.
Where you wind up in this discussion is up to you. Play with the tools available. Find out what they can do. Explore. Or, as Miss Frizzle would say, on the Magic School Bus, “Take chances, make mistakes, and get messy!” Oh, right. I wrote something riffing on that as well.
Allow me one more attempt to capture your eyeballs before moving on. Consider joining my Discord server at https://discord.gg/STbxVA3 which covers a lot more than just AI.
Okay, let's dive into those links.
This is the biggie and the one behind the insanely popular ChatGPT.
This is a free chatbot that can generate an answer to almost any question it’s asked, developed by OpenAI and largely funded by, yes, Microsoft. Microsoft uses GPT-4 as the conversational brains behind the new Bing. Even if you aren't a big Microsoft fan, it's worth trying out Bing. It's free, it's built into Microsoft's Edge browser (which is itself based on the open-source Chromium project), and it's nicely tied into GPT-4 through a handy sidebar you can access at any time. If you don't want to use the Edge browser, no worries. You can just visit https://bing.com/chat and get started.
OpenAI's ChatGPT, Microsoft's Bing, Google's Bard, Anthropic's Claude, and a growing list of others, all do some of the same things. With them, you can produce computer code, write (or rewrite) college-level essays and poems, respond to emails, and tell jokes. They also sometimes generate nonsensical or inappropriate responses (often referred to as hallucinations).
If you've spent any time with ChatGPT, you've probably run across the message that tells you it's too busy and that you should try again later. Also, you may have noticed that some of the cool toys they promise, including Code Interpreter and DALL-E3, just don't show up on your free account. One way around this is to sign up for ChatGPT Plus. This service gives you premium access at $20 US per month. In addition to not having to wait in line, you also get access to the latest GPT-4, which isn't available to free-tier ChatGPT users.
To subscribe to ChatGPT Plus, look for the Upgrade button on the screen when you are logged in.
Using OpenAI's API, you can create your own ChatGPT, either driven entirely from the command line, or graphically, in a browser. I demonstrated this at the meeting on Tuesday. One big additional plus of going this route is that you can use ChatGPT on a "pay as you go" basis and it's really, really, inexpensive, especially if you work with the GPT 3.5 model. If you don't want to pay the $20 US per month, this is a great way to go.
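To get a feel for what "pay as you go" means, here's a rough back-of-the-envelope sketch. API usage is billed per token (roughly, chunks of words), with separate rates for what you send and what comes back. The rates below are placeholders made up for illustration, not OpenAI's actual prices, so check the pricing page for current numbers. The estimate_cost function is hypothetical too.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_rate_per_1k, output_rate_per_1k):
    """Return the estimated cost in US dollars for one API call."""
    input_cost = prompt_tokens / 1000 * input_rate_per_1k
    output_cost = completion_tokens / 1000 * output_rate_per_1k
    return input_cost + output_cost


# Placeholder rates in US dollars per 1,000 tokens (NOT real prices).
EXAMPLE_INPUT_RATE = 0.002
EXAMPLE_OUTPUT_RATE = 0.002

# One exchange: about 500 tokens sent and 500 tokens received.
per_chat = estimate_cost(500, 500, EXAMPLE_INPUT_RATE, EXAMPLE_OUTPUT_RATE)

# How many exchanges like that would the $20 subscription price buy?
chats_for_20_dollars = round(20 / per_chat)

print(f"One exchange: ${per_chat:.4f}")
print(f"Exchanges for $20: {chats_for_20_dollars}")
```

Plug in the real rates for the model you choose and your own typical prompt sizes to see whether pay as you go or the flat monthly fee makes more sense for you.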
The first thing you need is a developer account on OpenAI. Sign up at https://platform.openai.com/examples. You'll need to add a payment method and you should, as I demonstrated, make sure you set an upper limit on how much you want to spend per month.
Next, you'll want to generate an API key. Make sure you write it down, because OpenAI will only display it to you the first time. You can create as many of these as you want and delete them when you're done.
Now, it's on to coding. Using a recent distribution, one that supports Python3, we're going to add some additional components. Yes, I'm making this really simple to keep this short.
Make sure pip is installed and make sure it's up to date.
python -m pip install -U pip
Cool. Now, using pip, install openai (to interface with ChatGPT via its API) and gradio (to create the Web interface).
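The script below uses the older interfaces of both libraries, so pin them to releases before openai 1.0 and Gradio 4.0:
pip install "openai<1.0"
pip install "gradio<4.0"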
The code you need for a simple ChatGPT implementation is as follows. Keep in mind that this is one of many ways to do this. In fact, you could, if you want, ask ChatGPT to write you a Python script that will generate a Web interface to access ChatGPT via an OpenAI API key. I've had it create a Web app using gradio as well as Flask. For this example, I'm going to stick with gradio. Here's a simple script you can edit to suit your needs.
import openai
import gradio as gr

# Written for the pre-1.0 openai library and Gradio 3.x; newer releases
# changed or removed the calls used below.
openai.api_key = "Your API key"

# The conversation so far, starting with a system prompt that sets the tone.
messages = [
    {"role": "system", "content": "You are a helpful and kind AI Assistant."},
]

def chatbot(input):
    if input:
        # Add the user's prompt, send the whole conversation, save the reply.
        messages.append({"role": "user", "content": input})
        chat = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=messages,
        )
        reply = chat.choices[0].message['content']
        messages.append({"role": "assistant", "content": reply})
        return reply

inputs = gr.inputs.Textbox(lines=7, label="Enter ChatGPT prompt here")
outputs = gr.outputs.Textbox(label="Reply")

# share=True also creates a temporary public link to the app.
gr.Interface(fn=chatbot, inputs=inputs, outputs=outputs, title="GTALUG AI Overlord Chatbot",
             description="Go ahead. Ask away.",
             theme="compact").launch(share=True)
Save it as "chat.py" or whatever you like.
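One thing to keep in mind with this script: the messages list grows with every exchange, and the whole list is sent on each request, so long sessions cost more and can eventually run past the model's context limit. Here's a minimal sketch of one way to cap it, assuming you're happy to let the bot forget the oldest turns. The trim_history helper and the max_turns value are illustrative, not part of the openai library.

```python
def trim_history(messages, max_turns=10):
    """Keep the system prompt(s) plus the most recent max_turns messages."""
    system = [m for m in messages if m["role"] == "system"]
    others = [m for m in messages if m["role"] != "system"]
    # A slice of [-0:] would keep everything, so handle zero explicitly.
    recent = others[-max_turns:] if max_turns > 0 else []
    return system + recent


# Build a fake conversation: one system prompt plus eight question/answer pairs.
history = [{"role": "system", "content": "You are a helpful and kind AI Assistant."}]
for i in range(8):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, max_turns=4)
print(len(history), "messages before,", len(trimmed), "after trimming")
```

In chat.py, you could call messages[:] = trim_history(messages) inside the chatbot function, just before the API request, so the list is trimmed in place.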
In the script, there's a line that says:
openai.api_key = "Your API key"
Well, that's where you put your API key. And where do you get this key? From here:
https://platform.openai.com/account/api-keys
Click on the button that says, "+ Create new secret key" and paste that into your script. Save the script and run it like this:
python3 ./chat.py
You now have your very own ChatGPT app! The screenshot below shows it running in Kubuntu, release 22.10.
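Pasting the key straight into chat.py works, but it's easy to leak the key if you ever share the script. A common alternative is to read it from an environment variable. This is a minimal sketch: the variable name OPENAI_API_KEY is a widely used convention, and load_api_key is a hypothetical helper, not something that ships with the openai library.

```python
import os


def load_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable first.")
    return key


# Demo with a stand-in dictionary so nothing real is needed to try it out.
demo_env = {"OPENAI_API_KEY": "  not-a-real-key  "}
print(load_api_key(demo_env))
```

To use it for real, set the variable in your shell before starting the script, then replace the hard-coded line with openai.api_key = load_api_key().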
Other Bots
As you noticed, I'm a huge fan of POE at https://poe.com , from the company behind Quora. POE is a kind of all-you-can-eat smorgasbord of various AI bots, including GPT-4, GPT4-16K, Claude, Claude 2 (with a 100K context window), various Llama models (from Meta), and even a Stable Diffusion art generation system using SDXL (newest and shiniest from Stability AI). You can try it for free, but for unlimited access to everything they offer, and they offer a lot, you'll need to subscribe. I did, and I think it's well worth it. And no, I don't work for them.
Speaking of art generation, although there are more all the time, the big players here are DALL-E2, from OpenAI, Stable Diffusion, and Midjourney. AI art generation has come under particularly heavy scrutiny by artists who claim that the systems have 'stolen' their art and that what the systems create is essentially plagiarism and even outright theft. I personally believe they are misguided, and I explain why in this post.
https://www.freethinkeratlarge.com/angry-at-ai-for-stealing-your-job/
Remember that 2 GB file? You can't fit all of that artwork into 2 GB and that's just the start of it. But I digress…
DALL-E3's biggest issue is the tight restrictions on its use and the sort of content you can create. It's fast and easy but those limits are real. For instance, I wanted to create an RPG battle scene but it balked at creating anything that involved violence. Well, there goes the video game industry. POE, with SDXL, on the other hand, just went ahead and gave me the following.
Midjourney has been (or had been) in the lead when it came to creating realistic-looking AI artwork. Consequently, a great deal of the more impressive stuff you saw early on came from Midjourney. Their latest implementation, V5, is nothing short of amazing when creating humans.
Stable Diffusion is an open-source project and it has spawned a host of open-source projects that make it possible to run SD on your home computer, assuming it has the processing chops. I'm a huge fan of Stable Diffusion. It's crazy but I run 3 different SD generators on my notebook using a number of different models. Models, as I mentioned, can be fine-tuned to the style and feel of what you're looking to generate. You can browse and download Stable Diffusion model files from Civitai.
https://civitai.com/?types=Checkpoint
I use, and have used, a number of programs to create art with Stable Diffusion. I still have a fondness for NMKD, available for download from https://nmkd.itch.io/t2i-gui . This was the first SD GUI I used and while it's still quite nice, I am mostly using EasyDiffusion and Automatic1111. The former is nice to use and easy on the eyes, but the latter has more options. At the meeting, I showed you EasyDiffusion, so here's the link to that one.
EasyDiffusion (pictured below) can be downloaded from https://github.com/cmdr2/stable-diffusion-ui.
Automatic1111 (pictured below) is available for download at https://github.com/AUTOMATIC1111/stable-diffusion-webui.
I briefly said something about your computer having "the chops". My notebook, the one I demonstrated SD on, is a Lenovo Legion 5, with an AMD Ryzen 7 4800H processor (with integrated graphics) and 32 GB of RAM. It also has an NVIDIA GTX 1660Ti card, which Stable Diffusion, and most things you'll run on your own PC, likes far more than "Team Red" AMD. It doesn't mean you can't do it with AMD, but there is more pain if you need to go down that road, and you may even want to just use an online service.
The image below is from Artsmart.ai. I bought a "lifetime sub" at a price I couldn't resist when it was available on Appsumo, so I do use it as well. If you want to try this one, you can use my link here: https://artsmart.ai/?via=marcel
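If you aren't sure whether your own machine has the NVIDIA side of things set up, here's a very rough first check, sketched with nothing but the Python standard library. It assumes the NVIDIA driver package put its nvidia-smi utility on your PATH, which is typical but not guaranteed. The has_nvidia_tools name is just for illustration.

```python
import shutil


def has_nvidia_tools():
    """Return True if the nvidia-smi command is found on the PATH."""
    return shutil.which("nvidia-smi") is not None


if has_nvidia_tools():
    print("Found nvidia-smi. Run it to see your card and its memory.")
else:
    print("No nvidia-smi on the PATH.")
```

This only tells you the utility is installed. It says nothing about whether the card has enough memory for the models you want to run.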
I confess that I'm a huge Stable Diffusion fan, partly because of the ridiculous number of specially trained models that are available to download. I do use Artsmart as well, but part of that comes from the ridiculously cheap lifetime deal I paid for.
I won't go into huge detail here because I did not do so in the presentation, but here are a couple of AI music generation sites to explore (trust me when I say that there are plenty more). These can produce original and harmonious music based on a given genre or mood. Play. Experiment.
AIVA: https://creators.aiva.ai/
Soundful: https://my.soundful.com/
Voice generation is another subject I barely touched on but, as you all know from discussions of deep fakes, it's possible to generate quite convincing voices these days. If you wanted to do voiceovers for your presentations, articles, stories, or whatever, using a variety of voices and accents, there are services that do this rather well. One such service is Revoicer. Like Artsmart, the service I mentioned earlier, I bought an inexpensive lifetime deal (again, through Appsumo) so I have access to a variety of these tools.
Revoicer is available at https://revoicer.app/speak .
There's also ElevenLabs which you can use to clone your own voice. Then, on your next project, you can narrate your own presentation or let your AI persona do it for you. ElevenLabs is actually pretty awesome, but the subscription price might convince you otherwise.
https://elevenlabs.io/speech-synthesis
It seemed fitting that I tie all this up with a tool that can write a script using generative AI (based on a prompt, of course), speak the text of that script using an AI voice, generate a video from the voice and story, then use AI (again) to find stock images, video footage, GIPHY animations, music, etc., to create a complete video without you having to lift much of a finger. The tool I used to do that was Augie (from AugX Labs). It's kind of amazing and scary all at the same time. To see the video, check out yon unlisted YouTube link below.
You can check out Augie at the following address. https://beta.meetaugie.com/
I'm sure I forgot several things or links I talked about during the presentation, but there is so much in this field, and it changes so quickly, that it's frankly hard to keep up. Still, it's amazing fun. Also, it is the future.
Thanks to everyone who attended. Please consider visiting my new blog at https://freethinkeratlarge.com, powered by the open-source Ghost CMS.
Make sure you subscribe to my mailing list so that you know when I post new stuff.
https://www.freethinkeratlarge.com/#/portal/
I also run a Discord server that covers all sorts of topics, including AI, Linux and open-source software, movies, TV, music, religion, you name it. Click on the link below to become part of the discussion.
Thanks again!