The State Of Generative AI [2023]

This presentation will give you a detailed overview of the current state of Generative AI.

👀 Click here to watch the full presentation on Youtube

👉 Full list of services mentioned in this piece and the full video presentation available on Notion

We have a lot to cover today, from a brief introduction to the origins and the inner-workings of Generative AI to a series of compelling demos of the technology in 5 different fields: text, image, audio, video and coding.

We will finish with some advanced workflows.

Small detail: we will be eating our own dog food.
All the images used in this presentation were created with MidJourney.
As Joe Rogan says “Buckle up folks, it’s going to be a wild ride.”

Let’s start with a definition.

What is Generative AI?

Generative AI, according to ChatGPT itself, refers to a subset of artificial intelligence that focuses on creating new, original content by learning patterns, structures, and relationships in existing data.
These AI models can generate text, images, music, code, and other forms of content.

Generative AI is leveraging what we call LLMs or Large Language Models.

Large Language Models, are advanced machine learning models designed to process and generate human-like text, based on large datasets.

These models are built using deep learning techniques, specifically neural networks inspired by the human brain, with many layers and a large number of parameters.

They are trained on vast amounts of text data, which enables them to understand and generate language with outstanding proficiency.

LLM Training

Training LLMs is a massive endeavour which requires tons of labeled data and computing power. In the initial stage of the first phase, it taps into some good old school NLP (Natural Language Processing models) and a corpus of human-annotated text data, then it moves into masking, fine tuning and evaluation.

The learning process starts with Pre-training

The first stage of the initial phase is “unsupervised learning”, via embedded text pairs (similar in meaning)

The model is fed with tons of text items, which are vectorized, it means converted into numbers, which enables the model to spot statistical patterns in the relationships between tokens and surrounding tokens (the context). For your information, a token represents on average ¾ of a word. 1000 tokens equals roughly 750 words.
We tell the model explicitly which pairs are similar in meaning, so that it can further compare vectors and conclude they’re also similar in meaning by deduction.
This requires a lot of data preparation (via some old school NLP models + human annotations)! There are now some open source datasets of those pairs: Paraphrase Adversaries from Word Scrambling (PAWS), Microsoft Research Paraphrase Corpus, Quora Question Pairs, etc.

The second stage of the first phase is “masking”

The model is fed with content including blanks, (masked) sections, and it is asked to predict the missing parts.

The second phase of the learning process is “Fine tuning”

The model is trained for specific tasks on labeled data. For instance for sentiment analysis, you can feed it with text input labeled negative – neutral – positive.

Finally, we move to “Evaluation & adjustment”

Now you feed the model with unlabeled data that it hasn’t seen before and ask the model to label the data, then compare the output to the correct answer.

You rinse & repeat until you’re satisfied with the results, then deploy the model in production.

GPT Meaning

The models developed by OpenAI and other AI players are called GPT, for Generative Pre-Trained Transformer.

They’re trained on a vast amount of data, to generate text in autocomplete mode, after instantly understanding the relationships between the elements of the input, in order to deliver a coherent output.

The main difference in the evolution of GPTs is the amount of parameters.
We have here the evolution for the OpenAI GPT models. 600x from GPT-2 to GPT-4.

A parameter refers to the adjustable elements of the model that are set during the training process.

Parameters are the internal variables of the model that are fine-tuned to minimize the difference between the model’s predictions and the actual target outputs in the training data, in other words what you expect from the model.

These parameters are also known as weights and biases.

On the input side of a neural connection, weights determine the weighted influence of the info received by the neurons in the neural network.
Whereas biases are a threshold applied to the consolidated input before applying a mathematical function which determines the output of the neuron.

So for instance here, we have 2 inputs, valued 1 and 1, weighted 2 and 1, which gives us 2 and 1, consolidated as 3.
Then we add the bias of 0.5. And get a value of 3.5.
We apply the mathematical function of the neuron, which gives us in this case 4.5, which is the value propagated forward in the network.

Depending on the outcome and the rules of the model, the info will be propagated or not to the next layer of neurons.

To train a model, you feed it with inputs and corresponding outputs.
The model is initialized with a set of weights and biases. They can be randomly generated or pre-trained on a different task.

You run the input through the model and look at the difference between the prediction AND the actual expected output.

Then, based on the error between the predicted output and the actual output, you perform what is known as backpropagation, which programmatically updates the weights and biases to reduce the errors.

You turn the buttons (weights and biases) of this giant factory to improve the results of the production line.

Gradually the model will make a prediction closer to the expected output.

Basically a reliable model offers predictions which closely match the target outputs.

The more refined your parameters, the more reliable your model will be. But also the more it will take time to process the input and generate the output.

I’ve drawn a schema which explains how weights and biases work at a very high level.
Those inner-workings will remain cryptic for 99.9% of the users of Generative AI applications but it’s worth taking a quick peek under the hood.
The neurons in the first layer all receive the same vectorized info (text represented as numbers), then the info is processed based on the weights and biases of the activation functions and moves forward through the network.

Generative AI is an augmenting tool

You’ve probably heard about AR, or Augmented Reality, which can add a digital layer on top of reality as we know it.
It doesn’t replace reality but overlays insights and instructions onto the existing world.
AR amplifies reality.

Generative AI is also an augmenting tool, which can dramatically enhance your skills and capabilities. As a kind of creative exoskeleton, it gives you superpowers to 10x your productivity in a wide range of personal and business activities. Today we will explore 5 types of content generation enabled by artificial intelligence, as well as some advanced automation workflows. This presentation will give you actionable tips & tricks to start using those technologies in your daily practice.

Who is using Generative AI?

Before diving into the presentation, let’s pause for a moment to have a look at this graph recently shared by Inside.com, based on a survey of 4,500 US professionals.

The #1 industry in terms of Generative AI adoption is currently Marketing & Advertising which is always at the cutting of technology. Accounting, which everyone expects to be heavily disrupted by Generative AI, comes in 4th place in the Top 5.

AI Text Generation

Text Generation and text processing are probably the most used forms of Generative AI, the most natural implementations of large language models.

ChatGPT, the #1 text generation application, is the fastest growing consumer application of all times. It went from 0 to 1 million users in 5 days. It took ten months to Facebook to reach the same level. ChatGPT went from 0 to 100 million users in 2 months. It took TikTok 9 months to reach the same level.

ChatGPT is currently the #1 interface to interact with a large language model but it’s not the only one.
Google Translate and DeepL are LLMs focused on machine translation.
DeepL is a German service created in 2017 by the founders of Linguee, launched in 2007.
The original concept was to crawl the web to scrape and present bilingual texts divided in parallel sentences.
Those pairs were the building blocks of DeepL’s training data.

In the field of creative and commercial writing, Jasper, Copy.ai, Writesonic, Rytr.me, among many others, are very popular AI Writing assistants powered behind-the-scenes by OpenAI’s GPT APIs. Other competitors include Writer.com and Peppertype.ai.

It’s a very crowded space, with new entrants on a weekly basis.

In Search, besides Bing, now powered by GPT-4 and Google’s Bard, now accessible on an invitation-only basis, let’s point out Perplexity, a very neat service citing its sources for every result, created by a team of A-listers from Quora, Databricks, Meta, Palantir, Google & other leading companies. I strongly recommend you to try Perplexity, which very often performs better than ChatGPT for pure factual queries. In a not-so-distant future, we might see search engines replaced by answer engines, removing the need (and often the pain) to skim through pages of results.

ChatGPT (Plus & API) vs Jasper.ai

I wanted to compare the consumer price of two of the most visible players in the space: ChatGPT, the consumer UI of GPT-3.5 and GPT-4, on one hand, and Jasper, one of many AI writing assistants, on the other hand. I also wanted to compare the wholesale price of the GPT-3.5 API vs Jasper’s consumer price for 100,000 words per month.

ChatGPT Plus is offered at $20 per month. Bear in mind that GPT-4 currently has a cap of 25 messages every 3 hours. There are no advertised usage limits for ChatGPT Plus. Jasper is priced in what they call “Boss Mode” at $82 per month for 100,000 generated words vs. 26 cents via the OpenAI API (incl. prompt tokens, which also invoiced by OpenAI).

What’s the takeaway?
Well, if you have a regular need for content and are capable of crafting your own prompts and calling the API via a no-code platform like Zapier or Make or, even better, via a custom Python script, you will save A LOT OF money by skipping the user-friendly abstraction layer offered by Jasper and its direct competitors. We’re talking about a 300x difference. That’s pretty significant. 1 million words via the GPT-3.5 API will cost $2.6. Jasper doesn’t communicate pricing above 700,000 words, currently sold at $500 per month.

For your information, without a very strong moat, Jasper nevertheless managed to raise $125M at a $1.5B valuation in late 2022 (while Openai has so far raised a total of $11B in 6 rounds, mainly from Microsoft, for a total valuation estimated at $29B).

Generate a Python list of 50 titles with ChatGPT

I will spare you the usual ChatGPT Q&A screenshots shared on Twitter and focus on a few more advanced examples of the power of this technology.

For my first demo, I asked ChatGPT to generate 50 unique article topics and then iterate through the list to write each article one by one and output the results in HTML format.

I love the fact that you can get a clean Python list as a reply, not the usual ordered or unordered list. I’ve highlighted the square brackets around the Python list.

Generate a Python list of 50 titles with ChatGPT

If you’re a bit familiar with programming, you’ve noticed that I’ve basically created a FOR LOOP in a TEXT PROMPT.

Titles = [“title 1”, “title 2”, “title 3”,…]
for title in titles:
write article

As you see, ChatGPT generated a clean HTML output for the first article, ready for integration in your favourite CMS.

Generate a Python list of 50 titles with ChatGPT - HTML articles

You could even ask it to wrap the article in a DIV, add a CSS class, etc.
That’s the beauty of a tool which can output both text and code.

And if I hit NEXT after the first article, ChatGPT will generate the second piece, etc. until the end of the Python list.

This way, I can avoid truncated outputs. I just have to hit NEXT 49 times to get 50 articles. If you’re able to code a basic Python script, you can also just grab the Python list generated in the first step and iterate through it with an actual For Loop to get your 50 articles automatically. You can even publish them via an API call at the end of each iteration, on a WordPress site or on any other CMS. And remember that if you’re using the native OpenAI API, you will only pay roughly $0.002 per article, that’s 10 cents for 50 articles.

Using ChatGPT for brainstorming purposes

Here’s another super effective use case of ChatGPT: ideation in the brainstorming phase.

Simply set the scene and ask the AI to come up with original ideas.

Most of the time, you’ll be impressed by the relevance of its suggestions. You can also fine tune the prompt to get more specific ideas, around SEO, paid advertising, event marketing, etc.

ChatGPT Role Play

In the previous example I showed you how you can assign a role to the AI agent, which determines the way it will answer your question.

You can push this concept a little bit further and use ChatGPT in a role playing game,

For instance when you need to come up with relevant questions to ask to a candidate in a job interview.

The AI can even suggest examples of good answers, which is pretty useful for candidates preparing for an interview.

ChatGPT Summarization Python Script

If you’re too lazy to read a blog article, you can quickly create a web scraper with BeautifulSoup in Python,
type in the URL, extract the content and call GPT-3.5, via the OpenAI API, to generate a 10-point summary and a super short 3-point takeaway digest.

This could be your first Python script.

FYI, for a typical blog article, this quick 2-part summary will cost less than a cent of a US dollar.

from bs4  import BeautifulSoup
import requests
from openai_chatgpt import chat

URL = input("Enter the URL: ")
#initialize usage
usage = 0
#set header for scraper
headers = {"User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36'}
#scrape content
page = requests.get(URL, headers=headers)
soup = BeautifulSoup(page.content, 'html.parser')
title = soup.find('title').get_text()
content = soup.find_all('p')
print(title)
text = ""
for p in content:
    text += p.get_text()

cleaning_prompt = f"Remove from the text the content which isn't related to the topic, then output the result summarized in 10 bullet points. \n Topic: {title} \n Text: {text} \n Clean OUTPUT:"
cleaning = chat(cleaning_prompt)
cleaned_text = cleaning[0]
cleaned_text_usage = cleaning[1]
usage += cleaned_text_usage
takeaway_prompt = f"Summarize the text in 3 short bullet points of 3 lines each. \n Topic: {title} \n Text: {text} \n Takeaway OUTPUT in 3 bullet points:"
takeaway = chat(takeaway_prompt)
takeaway_text = takeaway[0]
takeaway_usage = takeaway[1]
usage += takeaway_usage

print("In 10 points:")
print(cleaned_text)
print("---------------------------")
print("In 3 points:")
print(takeaway_text)
print("---------------------------")
print("Done! ")
print(f"Token usage = {usage}")
print(f"Total cost for GPT text generations = ${float(usage) / 1000 * 0.002}")

OpenAI Function (openai_chatgpt)

import openai

def chat(prompt):

   openai.api_key = "YOUR API KEY"

   response = openai.ChatCompletion.create(
   model = "gpt-3.5-turbo",
   messages = [{"role": 'system', "content": f'{prompt}'}],
)

   answer = response["choices"][0]["message"]["content"]
   usage = response["usage"]["total_tokens"]

   return answer, usage

From GenAI to SynthAI?

Speaking of summarization, I wanted to share with you a pretty compelling argument developed by A16Z on their blog.

The writer argues that while we’re now impressed by the volume of content the Generative AI can produce from a short prompt,
We will be even more impressed by the concision and precision of the AI’s output in a second wave of B2B services, presented under the SynthAI label, for Synthesis AI (as opposed to Generative AI).

The importance of prompts

By now, you have already understood that the quality of the output depends significantly on the quality of your prompts.
Prompt Engineering has quickly become a sought-after skill that some companies are ready to pay up to $300,000+ dollars per year.

Multiple marketplaces are now selling readymade prompts for various Generative AI services.

You can check out promptbase.com and prompthero.com.

I tested Bard by Google…

I’ve just started testing Bard by Google, presented as a competitor to ChatGPT.
Honestly, I’m not impressed.
I asked it for some testimonials about my marketing services (you can actually find real ones on my home page).
Bard made up entirely fake testimonials (signed John Smith, Jane Doe & co),
then explained that they looked fishy, before reluctantly admitting that it made them up.
An epic fail, I would say.

AI Image Generation

Image Generation is one of the hottest fields in Generative AI, for a good reason: the images generated by tools like Stable Diffusion, Dall-E and especially MidJourney are simply mind blowing, beautifully illustrating almost any concept you can think of.

But they’re not limited to the usual fantasy or anime characters you’ll come across in most AI art galleries.

Let me show you how to start using MidJourney for very concrete professional use cases.

MidJourney Business Examples

Let’s explore a few examples of images generated with MidJourney for some of my clients.

The first image was meant to illustrate an article about a specific role in the organization.

The second one was a brainstorming around future store improvements.

The third one was part of my research around e-commerce opportunities.

The first image on this slide was generated to illustrate the back of the brochure of an upcoming live production.

The second one was meant to illustrate an article about yoga for kids

and the third one was part of a brainstorming session around a packaging project.

Using a Reference Image with MidJourney

Let’s see how to use a reference image in MidJourney.

Notice how the jacket in the first rendering was inspired by my reference photo, even if transposed to another context (drag & drop at the beginning of the /imagine prompt)

On the last example you see how the output closely matches the image reference.
Look at how well the AI understood the concept of a selfie, incl. for the dogs on the pictures.

Comparing Stable Diffusion to MidJourney

Another major player in the image generation space is Stability AI, with their Stable Diffusion model.
If you have a Windows PC, you can run it locally, provided you have a decent GPU.
There are also online demos, for instance on Huggingface, a platform for AI enthusiasts, sharing hundreds of models and datasets.
You also have cloud-based services such as rundiffusion.com where you can run an instance of Stable Diffusion.
Replicate.com offers API access to multiple versions of Stable Diffusion as well as to other image generation models.

Personally, I still prefer the rendering of MidJourney.
Stable Diffusion and Dall-e deliver rather inconsistent outputs.
They perform better with strong styling guidelines (such as “in the style of Johannes Vermeer”).

You will usually feel much more satisfied with the results of MidJourney.

One last thing re: image generation. Do not expect the AI models to write any readable text on the images.
Import the images into Canva to add your text, or use a Python script (or a no-code tool like Bannerbear) to programmatically add your text to a series of images.

Comparing DALL-E to MidJourney

Here you can see a few examples of images generated by DALL-E from the same prompt,
just changing the style of the output: Rubens, Vermeer or Warhol.

Using AI Image Generation in Canva

Canva, which has now more than 100 million monthly active users,
recently launched some AI features, both for text and image generation.

Here’s a quick demo of the image tool, which I must say delivers pretty decent results. It’s based on a fined tuned version of Stable Diffusion.

I like the way you can pick a clear style via the user interface.

There’s still in my opinion a more refined magical vibe in the creations of MidJourney but you could argue that it’s a matter of taste.

AI Audio Generation

We’ve seen how Generative AI can help you pen any form of copy and design stunning visuals, now let’s speak about audio.

We will cover Text To Speech, Speech To Text and we will also shortly discuss the impact of generative AI on the world of music creation.

Let’s dive in.

Text-to-speech has come a long way since the robotic voice developed for Stephen Hawking in 1986; a peculiar voice which he kept until his death in March 2018, aged 76. Despite the advances in text-to-speech synthesis, Stephen Hawking refused to upgrade his voice. The original 1980s sound had become part of his public persona.

Two years ago I wrote a long article about the Big Boys in the TTS game: Google Cloud Text-To-Speech, Amazon Polly and Microsoft Azure Cognitive Services TTS.

At that time, my preference went to Microsoft Azure.

With the rapid development of generative AI, new challengers are emerging on the scene. The most visible are murf.ai, play.ht and Elevenlabs, founded by 2 Polish engineers.

ElevenLabs now boasts more than 1 million users. Let’s listen to the evolution of the technology. The last example is a video clone I created using d-id.com with a clone of my own voice generated by elevenlabs. The text was written by ChatGPT, in the witty style of Joe Rogan.

Multiple startups are now offering voice cloning features but in most cases this is still rather expensive, except on elevenlabs.io where you can clone your own voice, currently only in US English, with a handful of short samples, for just $5 per month.

Imagine the impact of such a feature on the creative industries: voiceover artists, radio presenters and also singers.

Deep Fakes in Music

In April 2023, an anonymous guy on TikTok posted what he presented as a duet between TheWeeknd and Drake, called “Heart On My Sleeve”.

Some people thought it might have been a promo stunt by the artists themselves but most platforms, including Spotify, quickly removed the track after receiving a cease & desist from the publishers.

So we can assume that it was a real – pretty convincing – deep fake. The question is: are we moving to a time when artists and fans will use AI to produce infinite covers and original songs? It raises a lot of questions not only in terms of productivity but also on the legal front.

Who’s the copyright holder of those AI-generated original songs? What about the rights of the artists to protect the sound of their voice (beyond the obvious trademark on their name).

That’s an interesting time, to say the least.

Generation Rap Beats with AI

And of course you can now generate music with AI, beats, melodies and even vocal parts. Let’s see how you can use two of those tools, beatoven.ai and beabot.fm.

Using AI for podcasts

Text to Speech generation isn’t limited to solo performances. You can also the technology for fully automated conversations, based on AI-generated dialogues.

Let’s listen to the beginning of this episode of the Joe Rogan AI Experience, a parody featuring a clone of Joe Rogan and a clone of Sam Altman, the CEO of OpenAI.

Speech to Text

We’ve heard what you can achieve today with Text to Speech.

The opposite is Voice Recognition, also called Speech To Text.

That’s the technology used by voice assistants like Alexa, Siri and Google Assistant.

They convert your voice prompts into text which can then be used for search queries.

The big players in the industry all have their own Text To Speech products but we’ve also seen the emergence of new players like Whisper by OpenAI, Otter or Descript.

descript demo — Descript transcription demo: https://www.descript.com/transcription

Using those tools, you can easily convert a podcast (or the soundtrack of a video) into a transcript, which you can then use for all sorts of text publications, for instance blog posts.

Descript also offers the possibility to edit the original audio by editing the text, which is quite impressive and much easier to edit the soundwave. This product will also automatically remove long blanks and filler words. You can edit a video like a word document and instantly publish your changes. This will also remove the image frames corresponding to the text edits.

AI Video Generation

Generative AI can be used for various purposes in video creation and advanced editing.

Text to video is still a nascent space and most models you can test on Huggingface and other platforms don’t produce very convincing results, very often limited to 2 or 3-second clips.

But I recently came across a very intuitive app enabling you to generate short videos from a single prompt, delivering pretty good results. It’s called Decoherence and it was part of the 2023 Winter Cohort of Y-Combinator.

He’re a quick demo of the simple user interface (…) and here’s the final output, edited in Canva, after adding a jingle ordered on Fiverr.

It’s meant to be a promo for my weekly radio show on Riverside Radio.

Runway Gen 1 – Video to Video

Runway is probably the most visible company in the AI-assisted video editing space. They’re released the Beta of Gen-1, a video to video application.

You start with a normal video, then apply a style reference, either from an image, or a style preset or a text prompt and hit Generate. My first test was restyled from a cyberpunk reference image, whereas my second test was generated from a text prompt: DJ playing in the jungle.

I also asked the AI to generate a video for a DJ playing on a beach in Ibiza. Look at the t-shirt of the stage assistant in the background, who wore a winter jacket in the original video. Impressive detail I must say.

This tool is still a very arty experiment, limited in the demo at 3 seconds per generation. But you can easily imagine how restyling some basic footage can open up pretty wild creative opportunities.

And Runway didn’t stop at video to video, they decided to bring the video generative AI game to the next level: text to video.

Runway Gen 2 – Text To Video

Runway has just announced Gen-2, their latest model, which will enable you to create a video from a single text prompt, as you would create a static image in Stable Diffusion or MidJourney, without any driving video or reference image. Full Generative AI. Coming Soon. We could even say: coming soon to a theatre near you.

Gen-2 and other AI video tools will empower a new generation of filmmakers. I’m also sharing some examples of a promising open source text-to-video project, which you can fork on GitHub.

No need for expensive motion capture. Creativity at your fingertips. The same techniques will also be applied in the video game industry, to generate not only 3D characters but entire photorealistic virtual worlds.

You can imagine that in the very near future you’ll be able to generate your own fan version of any iconic movie with any plot at any time.

AI-Assisted Code Generation

AI-Assisted code generation is one of the most exciting fields in Generative AI. Talented web developers have long been considered as superhumans, boasting some cryptic skills enhanced by the scalability of computing.

Generative AI is now propelling their superpowers into the stratosphere.

Let’s have a look at GitHub Copilot, a coding assistant powered by the OpenAI Codex model.

Copilot can suggest code and entire functions in real-time, right from your editor.

Just as text-oriented services output predicted words, GitHub Copilot will suggest relevant code snippets.

Let’s watch a quick demo.

Simply hit TAB to confirm the suggestion. It’s also a great tool to document your code. The more you use it for a project, the more relevant it will be, not only inspired by the code contained in its training dataset but also by the logic of your own script.

It’s one of the best examples of AI amplifying your natural skills.
According to a report by Sequoia Capital, GitHub Copilot is now generating nearly 40% of code in the projects where it is installed. The service costs $100 per year on an annual plan. It’s definitely a great investment if you’re coding on a regular basis.

Coding Full Applications with AI

Using ChatGPT, you can go beyond simple code suggestions and create a full application, a plugin or a widget.

A developer recently re-created the iconic game Flappy Bird in 3 hours using ChatGPT (and MidJourney to generate the art). To be fair, this guy has a fairly good understanding of the software he’s using (Unity and Photoshop) and of all the game-specific terminology. A random person with zero experience in design, gaming and coding would not be able (yet) to reproduce this experiment.

But you can already ask ChatGPT to assist you with easier tasks such as creating a simple HTML form styled in material design, sending submissions to an endpoint, like Getform.io Let’s watch a quick demo.

Coding a specific Python function with ChatGPT

I’m using ChatGPT on a daily basis to code specific functions in Python. Let’s say for instance that you want to extract 10 screenshots at regular intervals from a video.

Just ask ChatGPT and you’ll get a detailed tutorial including the code to insert into your favourite IDE.

It took less than 30 seconds to implement this function into a longer script. As in traditional coding projects, it is advised to adopt a modular approach in your conversations with ChatGPT. Don’t expect the chatbot to output 500 lines of code in a single reply.

Coding a specific Python function with ChatGPT

It’s not possible because the engine is limited by the amount of tokens it can generate in a single reply.

Divide your brief in multiple sections. Add print commands in the code to help you with debugging.

And if something is wrong, copy-paste your current code and ask ChatGPT to fix its mistakes.
Be aware that sometimes you’ll end up in a never-ending loop of successive bugs.
But generally speaking you’ll get pretty good results if you’re well organized and crystal clear in your brief.

Advanced AI Workflows: AutoGPT, BabyAGI

At time of putting together this presentation, the latest advancement in the GPT world is the launch of autonomous AI agents, also called AutoGPTs, which are able to create a task list from a single prompt, carry out those tasks and deliver results when the tasks have been completed. AutoGPT is to GPT what self-driving cars are to automobiles.

In the near future, we can expect the verticalization of those AI agents to improve their effectiveness in a set of specific industries.

You could for instance develop AutoGPT for Social Media Management or AutoGPT for accounting.

This approach is best suited for recurring tasks.

Some people consider AutoGPTs, also called BabyAGIs, as the early version of AGI, for Artificial General Intelligence, the expected final outcome of the AI evolution, the ultimate singularity, when we will have an AI capable of solving any issue and performing any task, a superintelligent AI, more capable than all humans combined.

We’re not there yet but this perspective has many prominent thinkers worrying about the existential risk posed by an AI which would not necessarily align its goals with our best interests.

Python plays a central role in the AI ecosystem

Coming back to more practical considerations, I believe that Python plays more than ever a central role in the development of Generative AI services.

It sits at the crossroads of various input options (chat, voice, input from third-party services), content sources (proprietary data, open web resources, other LLMs), processing (by your favourite LLM, for instance GPT-3.5 or 4) and output channels (a website, an email, an instant message, a voice snippet, etc.).

I strongly advise anyone interested in AI to learn the basics of Python, with some help from ChatGPT, in order to be able to build advanced workflows and reap the benefits of this technology.

All the projects you see popping up on a weekly basis are powered by Python under the hood.

From Jobs to Vocations

What does the fast development of AI – and in particular Generative AI – mean for the future of work?

Before giving you my personal conclusion, I’d like to share a 3-min segment of a long conversation between Lex Fridman and Manolis Kellis, a Professor in Computational Biology at the MIT.

AI: A Force For Good

Generative AI is a vibrant ecosystem, with new entrants releasing products almost on a daily basis.
It’s another iphone moment for the digital industry.
We’re still in the early days of this paradigm shift.

The scene is moving fast, very fast, too fast according to some people who recently called for a 6-month break to focus on human-AI alignment and other security measures.
In my opinion, AI can be a Force For Good.

In a world subject to countless disruptions, it’s understandable to have legitimate apprehensions about the potential misuse or bad evolution of AI.
Most experts who express concerns about the ultimate outcome of AGI (Artificial General Intelligence) argue for a better alignment of AI goals with those of humanity as we know it. But is this really the best we can hope for? Has humanity always made the optimal choices over the past 300,000 years? What if the advent of AGI was instead an opportunity to reinvent our trajectory ? What if its infinitely effective capabilities could help us fight our worst instincts and develop our best intuitions?
What if AGI turns out to be our ultimate companion, not our worst enemy?
As you can see on the screen, MidJourney didn’t depict a nuclear mushroom on the horizon but a giant tree of wisdom.

I’m betting on a Bright Future for AI & Humanity.