Simple AWS

re:Invent 2023 Recap and the Future of AI

Guille Ojeda
December 09, 2023

Imagine this scenario: Your favorite newsletter went on pause because the author was at re:Invent, and you miss it 🥹. There were a ton of announcements, everyone and their dogs 🐶 is writing a recap, the 2023 re:Invent YouTube playlist is over 25 days long, and you're not sure how to interpret so much information.

Don't worry, I'm back! I've got a short recap from re:Invent, and then an analysis of what this means for the future of us cloud people.

re:Invent 2023 Announcements Recap

Amazon Q

This is the big one. Basically it's ChatGPT (different AI model though) trained with all of AWS's docs and kept up to date. Moreover, you can have it perform Retrieval-Augmented Generation (RAG) by connecting different sources such as information from your own AWS account. That means you can ask it a general question like "how can I reduce costs on EC2?", or ask it to use your own information to answer "using data from my AWS account, how can I reduce my costs on EC2?".

The promise is that it'll serve as an assistant to engineers and architects. ChatGPT already does something like this for you, but using Amazon Q has these advantages:

Fewer hallucinations: LLMs are probabilistic and will always have a chance of making things up. Amazon Q improves on this by providing references for what it says, so you can go to the relevant doc with just one click, and easily verify an answer.
Much easier to send your data: You can feed your data to ChatGPT if you want, and it can provide tailored recommendations. However, that means figuring out all the data it needs, writing the AWS CLI commands, parsing the output, and copy-pasting it to the ChatGPT UI. With Amazon Q you just grant it access, and it will pull what data it needs.
A lot more private: ChatGPT uses your questions and its answers to train the model. Amazon Q doesn't. Your data isn't stored beyond your questions, and it isn't used for training. Besides, AWS already had this data in the first place.
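To make the manual path concrete, here's a minimal sketch of the "parse the output and copy-paste it" step. It assumes you already have a response shaped like the output of `aws ec2 describe-instances`. The sample data and the `summarize_instances` and `build_prompt` helpers are hypothetical, made up for illustration.

```python
import json

# Hypothetical sample data, shaped like the output of `aws ec2 describe-instances`.
SAMPLE_RESPONSE = json.loads("""
{"Reservations": [
  {"Instances": [
    {"InstanceId": "i-0aaa", "InstanceType": "m5.large", "State": {"Name": "running"}},
    {"InstanceId": "i-0bbb", "InstanceType": "t3.micro", "State": {"Name": "stopped"}}
  ]}
]}
""")


def summarize_instances(response):
    """Flatten a DescribeInstances-style response into one short line per instance."""
    lines = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            lines.append(
                f'{instance["InstanceId"]}: {instance["InstanceType"]}, '
                f'{instance["State"]["Name"]}'
            )
    return lines


def build_prompt(response):
    """Build the text you would paste into a chat UI (hypothetical wording)."""
    summary = "\n".join(summarize_instances(response))
    return (
        "Here are my EC2 instances:\n"
        + summary
        + "\nUsing this data, how can I reduce my costs on EC2?"
    )


if __name__ == "__main__":
    print(build_prompt(SAMPLE_RESPONSE))
```

And this is just for one API call. A real cost question would also need billing data, utilization metrics, and so on, which is exactly the legwork Amazon Q promises to take off your hands.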

You can learn more here.

AWS Graviton4 and Trainium2 chips

This sounds minor, but the ARM-based Graviton chips designed by AWS have been providing a good increase in performance for ARM-ready applications ever since the first Graviton chip and its associated instances. Graviton4 is expected to provide up to 30% better compute performance than Graviton3. Basically, wait for the next gen of g instances (for example r8g is already available in preview) and migrate your ARM-based instances as soon as possible. If you're still running Intel-based instances, consider switching to Graviton-based instances. Some applications might experience issues, or not see performance gains, so run some experiments first.

Amazon Bedrock

Bedrock is the AWS service you use to power your AI-based applications, selecting different models and parameters. It's comparable to the GPT API, not to ChatGPT (the web application). In the past few days the service got several very nice improvements and features:

Easily customize models: Instead of having to get the training data, download a model, set up some EC2 or SageMaker instances, run the training job and set up the trained model somewhere, now you can just get the training data (no way around that yet), run the training job, and use the model from Bedrock. Essentially, managed LLM customization. Announcement.
Multistep AI apps: Using Agents for Amazon Bedrock, you can create generative AI applications to execute multistep tasks across company systems and data sources. Announcement.
Guardrails: A way to promote safety rules in your users' interactions with Bedrock-based AI applications. Announcement.
Fully managed RAG: Retrieval-Augmented Generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources. Essentially you set up a layer on top of the model which retrieves data from external sources and augments AI prompts and responses. Now Bedrock offers a managed capability called Knowledge Bases to do this. Announcement.
More AI models, model comparison, and more!
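If the RAG idea from the list above sounds abstract, here's a toy sketch of the "retrieve, then augment the prompt" layer. It is not how Knowledge Bases is implemented: the documents, the function names, and the keyword-overlap scoring are all assumptions for illustration (real systems typically use vector embeddings for retrieval), and there's no model call at all.

```python
# Toy knowledge base; in a real setup these would be your own documents.
DOCUMENTS = [
    "Reserved Instances and Savings Plans lower EC2 costs for steady workloads.",
    "S3 Express One Zone stores data in a single Availability Zone.",
    "Graviton instances use ARM-based processors designed by AWS.",
]


def tokenize(text):
    """Lowercase the text and split it into a set of words without punctuation."""
    return {word.strip(".,?!").lower() for word in text.split()}


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def augment_prompt(question, documents):
    """Put the retrieved text in front of the question before sending it to a model."""
    context = "\n".join(retrieve(question, documents))
    return f"Use only this context to answer.\nContext:\n{context}\nQuestion: {question}"


if __name__ == "__main__":
    print(augment_prompt("How do I lower my EC2 costs?", DOCUMENTS))
```

The point is the shape: the model never sees your whole knowledge base, only the few pieces the retrieval step picked for this particular question.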

Amazon SageMaker

SageMaker has been AWS's managed ML platform since its announcement at re:Invent 2017 (the same year as Google's "Attention Is All You Need" paper, which introduced the Transformer architecture that GPT is built on, and before OpenAI's very first GPT model). It has evolved a lot over the years, but it was most likely off your radar because it was built for data scientists and data and ML engineers.

Nowadays, with OpenAI making AI cool, everyone and their dogs is looking to learn AI and ML. That pushed AWS to build some cool features for SageMaker, like using ML models to perform business analysis at scale, using natural language to prep data, managed infrastructure for distributed model training at scale, a new code editor, and more.

However, SageMaker still mainly caters to data scientists who know what they're doing and struggle a bit with the tech side of ML, and to data and ML engineers who're great at the tech side of ML and mostly know what they're doing data-wise. It's not built for web developers who just want to consume AI models; that's what Bedrock is for. If you just want to consume AI as an API, use Bedrock, or OpenAI's API. If you want to do more than just use an API, check out SageMaker's features.

Amazon S3 Express One Zone

S3 offers a ridiculous 11 9s of durability, and it's fast and cheap. But what if you care more about fast and cheap, and less about durability? That's what the new storage class S3 Express One Zone offers: up to 10x the performance, half the request costs (though storage itself is priced higher per GB than S3 Standard), and still 11 9s of durability unless an entire AZ is destroyed.

Here's a picture of me putting an object in an S3 Express One Zone bucket:

Fun fact: This was 2 days before the announcement. Nobody knew why there was an S3 bucket walking around re:Invent. Then we found out about the new storage tier. And after that, we found out this picture was needed to get some secret swag.

AI Everywhere

Here's the full list of re:Invent announcements.

Nobody should be too surprised if somebody starts calling this re:AInvent. 40% of the sessions were about AI, so us attendees were already prepared. Besides, some of us had preview access to Amazon Q, PartyRock, and a few other things (sorry I couldn't tell you, there was an NDA!). But even if it's not surprising, it's still a good tell of where this leads us.

Where Will AI Take Us?

This is my personal opinion, so take this with a grain of salt. Well, this entire newsletter has always been my personal opinion, but I'm an AWS expert, not an AI and future of tech expert, so take this with a bigger grain of salt than usual.

The Current Generation of AI

The current generation of AI has been marked by the following characteristics:

Probabilistic generation of text (includes code) and images
No real understanding or "thinking" from AI
Good enough results for most stuff
Some hallucinations

With the right prompts, you can generate results that are worse than what a professional can generate but better than what a layman can, and for a very small price. That's one big use case: better stuff than what you can generate in domains where you're not an expert.

The other big use case is good enough stuff that you can fix and improve, in domains where you are an expert. Notably for us, this includes code. I've often used ChatGPT to generate code snippets, CLI commands, CloudFormation snippets, and even some text ideas. I have a particular writing style that I haven't found a way to even remotely emulate in ChatGPT, so I always end up rewriting everything (this isn't a flex, it's actually a downside of being me).

AI Everywhere

The old (first half of 2023) way of using AI for work is opening ChatGPT in a separate tab and copy-pasting only the information it absolutely needs (while being careful with your privacy). Not because there's a cost to sending it information (other than privacy), but because sometimes it's not easy to get everything into text, fit it into a prompt, etc.

The new way, which was probably pioneered (in IT at least) by GitHub Copilot and Amazon CodeWhisperer, is to have the AI in your context, so you don't have to go somewhere else. This reduces the Alt-Tabbing and context switching, and it also lets the AI tool get information from the context, such as the surrounding code, without you having to spend time, effort and focus gathering that information and passing it along.

A lot of AWS's minor announcements go in that direction: There are a lot of services that will now let you use AI in place to write things in natural language, and will generate code from that. The most useful for my typical use cases are queries in CloudWatch Logs and Athena, and CodeWhisperer in the CLI. But the general idea is the same for everything: instead of switching to a different window to describe in natural language what you need and copy-paste the result, you can do that in place. Results aren't better, but it's much faster, and it further reduces your need to memorize syntax. You still need to be really good at figuring out what you need and at describing it.

Everyone is Doing AI

Ever since the ChatGPT API opened we've seen a ton of startups creating wrappers around it. They can be divided into three categories:

Plain wrapper: A product that "does X with AI", where AI is at least 80% of doing that X, and the only AI it has is a call to an API. For example, those apps that will write text for you.
Augmentation: A product that already does something, and now it uses AI to add or improve some features that are just one part of the entire product. For example, a CRM that lets you contact leads via email, and it writes the first version of the email with AI.
Complex steps: A product that does something specific, and it uses AI to achieve that, but using AI requires multiple steps, possibly with manual steps and feedback in between, and with some expert knowledge used in the AI prompts. For example, Courses.ai (built by a friend of mine) will guide you through creating a course by asking a ton of questions and using AI to help you answer them. Honestly, the questions alone are worth the price, AI is just the cherry on top.
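To show what separates the third category from a plain wrapper, here's a minimal sketch of a multistep flow where each prompt builds on the previous answer and a human can review in between. Everything here is hypothetical: `fake_model` is a stand-in for a real LLM call, the prompts are made up, and this is not how Courses.ai or any real product works.

```python
def fake_model(prompt):
    """Stand-in for a real LLM call; it just echoes the prompt it received."""
    return f"[model answer to: {prompt}]"


# Hypothetical prompt templates. The expert knowledge of a real product
# would live in how these questions are written and ordered.
STEPS = [
    "Who is the audience for a course about {topic}?",
    "Given this audience: {previous}\nList the modules for a course about {topic}.",
    "Given these modules: {previous}\nWrite a description for the course about {topic}.",
]


def run_steps(topic, model=fake_model, review=lambda text: text):
    """Run the steps in order, feeding each reviewed answer into the next prompt."""
    previous = ""
    outputs = []
    for template in STEPS:
        prompt = template.format(topic=topic, previous=previous)
        # `review` is the hook for manual feedback between steps;
        # by default it accepts the model's answer unchanged.
        previous = review(model(prompt))
        outputs.append(previous)
    return outputs


if __name__ == "__main__":
    for answer in run_steps("AWS networking"):
        print(answer)
```

A plain wrapper is the same thing with a single entry in `STEPS` and no `review` hook, which is why it's so easy to replace.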

With OpenAI's GPTs (a new feature of ChatGPT) and Amazon PartyRock you can build your own plain wrapper for free, so those apps will not-so-slowly disappear. And we won't miss them.

Augmentation will still be a thing, mostly because there was already value in those apps before AI came around to augment them. The interesting thing here is that new players that get more creative with their AI augmentations might turn out to have better products than the old and established solutions. The key factors will be creativity, ability to move fast, and an understanding of what users need. This also leads to a good potential for niche startups, such as a CRM for solopreneurs (right, as if that niche wasn't already super crowded).

Complex steps are where new apps have seen the most success. There's the fear that they can be replaced by a single prompt, but current AI models can't do those things with a single prompt yet, no matter how good the prompt (trust me, a LOT of people have tried). This also includes creating an entire set of Infrastructure as Code templates, or an entire application.

There are two main reasons why complex steps can't be replaced with just one prompt. On one hand there's just too much information to hold in the current context windows (how much an LLM can “remember” of the current conversation). On the other hand just a single hallucination can break everything, and the probability of at least one hallucination increases dramatically as the size of the output increases.
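Here's a back-of-the-envelope model for that second reason, under a strong simplifying assumption: each chunk of output (say, each line) independently contains a hallucination with the same probability p. Then the chance of at least one hallucination in n chunks is 1 - (1 - p)^n. The 1% per-chunk rate below is made up for illustration, not a measured figure.

```python
def p_at_least_one(p_per_chunk, n_chunks):
    """Probability of one or more hallucinations, assuming independent chunks."""
    return 1 - (1 - p_per_chunk) ** n_chunks


if __name__ == "__main__":
    # Assumed 1% chance of a hallucination per chunk (illustrative only).
    for n in (10, 100, 1000):
        print(n, round(p_at_least_one(0.01, n), 3))
```

With that assumed 1% rate you get roughly a 10% chance of at least one hallucination in 10 chunks, about 63% in 100 chunks, and it's practically guaranteed in 1,000. Real hallucinations aren't independent or evenly distributed, but the trend is the point.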

When Does AI Break Everything?

GPT-6.

It won't be the current generation that breaks the world. In fact, I think we're mostly at the limit of what the current generation can do. We will continue to see some improvements and price reductions (for example the Jurassic 2 model in Bedrock performs pretty similarly to GPT-3.5 at half the price), but nothing truly game changing regarding AI technology. The advances that we can still see will come from creative use of augmentation and complex steps in specialized sectors, but those aren't things that can't be done right now, just things that haven't been done yet (if I could give you a specific example, I'd be launching a startup around that). Price reductions will also expand on what is profitable to do with AI, but it won't be a breaking change, since those price reductions will be well below an order of magnitude without a technological breakthrough in AI or in AI hardware.

It won't be the next generation that breaks the world either. The next generation will make complex steps possible with one or very few prompts, and will allow richer integrations with other systems. "Write infrastructure as code for this app" will be possible, with a manageable amount of details about the app (AWS, containers, 100k users, include monitoring, etc), and some expert input. Complex steps will depend on humans still, but they'll take only a handful of relatively simple prompts, so you won't need an extra app for that.

The generation that comes after that, that's the world breaker. There won't be an intrinsic understanding or actual thought, but for most tasks it will be able to emulate it really well, just like the current generation emulates writing sentences and paragraphs and articles. It won't be a complete AGI in the strictest sense, and it won't be self-aware, so we're still not at the technological singularity proposed (predicted?) by SciFi authors. But it will be so good at so many information jobs that we'll face a societal singularity. Most of us in software will be redundant, the cost of building software (outside of deep tech) will become ridiculously low, and that will, in turn, break most other jobs that are entirely about processing information. The only thing left for us will be human trust, but only because we'll be slow to adapt. Simple AWS will be entirely written by AI, and you'll only keep reading it because you'll still think there's a person behind it. Not because I want to stop writing it, but because AI will write in 30 seconds something much better than what I can write in 6 hours, taking into account my continued improvement and my most sincere desire to keep writing this newsletter. I don't know where we go from there.

(this section is pure speculation, obviously!)

One bit of good news: The pace of improvement from GPT-3 to GPT-4 isn't sustainable. It came from a huge increase in model parameters, and an increase of the same magnitude from GPT-4 to GPT-5 is simply impossible due to the current hardware. There has been success in improving a model's output without increasing parameters, but nothing so far has shown even the promise of an order of magnitude improvement. So, we're still at least one technology breakthrough away from GPT-5, and possibly a few more after that until we reach the ugly part of the future.

I say GPT because it is the current benchmark, but obviously it could be any company. Or government, but that's an even scarier thought.

To end this article on a happier note, here's a picture of the legos they gave us at the AWS Certification Lounge at re:Invent:

And one of me being Certified Awesome:
