Gavin Li's Blog.

Author of AirLLM, ex-Airbnb, ex-Alibaba AI leader, unicorn AI advisor, entrepreneur.

Crazy Challenge: Run Llama 405B on a 8GB VRAM GPU

I’m taking on the challenge of running the Llama 3.1 405B model on a GPU with only 8GB of VRAM. The Llama 405B model is 820GB! That’s about 103 times the capacity of 8GB of VRAM, so it clearly doesn’t fit. So how do we make it work?

Gavin Li

More Stories

I Trained a 2D Game Animation Generation Model to Create Complex, Cool Game Actions (Fully Open-Source)

Six months ago, one afternoon, a friend came to me with an AI question. The problem was that he tried to use OpenAI’s text-to-image model to generate sprites for 2D game animations, but couldn’t achieve it due to character misalignment and consistency issues. I found it interesting and, on a whim, started training a 2D game animation generation model.

Gavin Li

AI Scribes:10+ $10m ARR Companies, Too Late to Join?

The AI scribes field has been booming lately, with no fewer than ten companies earning tens of millions of dollars annually. I've conducted in-depth research on over 70 companies in this space, analyzing their products, founding dates, available funding information, revenue disclosures, each company's barriers to entry, and their unique positioning.

Gavin Li

Why It’s Extremely Hard to Start an AI Application Business with Large Language Models

The media hype surrounding AI might have you believe it’s soaring high above us, on the brink of rendering all human jobs obsolete and poised to rule the world. However, this is merely hype.

Gavin Li

Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU!

The strongest open-source LLM, Llama3, has been released, and some followers have asked whether AirLLM can support running Llama3 70B locally with 4GB of VRAM. The answer is YES.

Gavin Li

Open-Source SORA Has Arrived! Training Your Own SORA Model!

To date, the open-source model that comes closest to SORA is Latte, which employs the same Vision Transformer architecture as SORA. What exactly makes the Vision Transformer outstanding, and how does it differ from previous methods?

Gavin Li

How Your Ordinary 8GB MacBook’s Untapped AI Power Can Run 70B LLM Models That Will Blow Your Mind!

Do you think your Apple MacBook is only good for making PPTs, browsing the web, and streaming shows? If so, you really don’t understand the MacBook.

Gavin Li

Unbelievable! Run 70B LLM Inference on a Single 4GB GPU with This NEW Technique

Large language models require huge amounts of GPU memory. Is it possible to run inference on a single GPU? If so, what is the minimum GPU memory required?

Gavin Li

$100m AI Product Leader’s Checklist: 5 Key Criteria to Validate Your Next Big AI Idea

Over the years working on various AI products and technologies at top American tech giants, I have unconsciously formed a mental checklist for evaluating whether an AI product direction is promising. Many of its principles are about what “not to do”.

Gavin Li

Exciting update!! Announcing new Anima LLM model, 100k context window!! Fully open source!

We believe the future of AI will be fully open and democratized. AI should be a tool that’s accessible to everyone, not only to the big monopolies (some of which have the term “open” in their names 😆). QLoRA might be an important step towards that future. To make a small contribution to the democratization of AI, we are open-sourcing the 33B QLoRA model we trained: all the model parameters, code, datasets and evaluations are open!

Gavin Li

Exciting update!! Announcing new Anima LLM model, 100k context window!! Fully open source!

We released the new Anima open-source 7B model, which supports an input window length of 100K! It’s based on Llama 2, so it is available for commercial use!

Gavin Li

How AI Text Generation Models Are Reshaping Customer Support at Airbnb

Leveraging text generation models to build more effective, scalable customer support products.

Gavin Li

Task-Oriented Conversational AI in Airbnb Customer Support

How Airbnb is powering automated support to enhance the host and guest experience.

Gavin Li