Gavin Li's Blog.
Author of airllm, ex-Airbnb, ex-Alibaba AI Leader, unicorn AI advisor, entrepreneur.
Crazy Challenge: Run Llama 405B on an 8GB VRAM GPU
I’m taking on the challenge of running the Llama 3.1 405B model on a GPU with only 8GB of VRAM. The Llama 405B model is 820GB! That’s roughly 103 times the capacity of 8GB of VRAM. It clearly doesn’t fit. So how do we make it work?
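The key idea behind AirLLM is layer-by-layer inference: load one transformer layer from disk, run it, free it, then load the next, so peak VRAM stays near the size of a single layer rather than the whole model. Here is a minimal sketch of what that looks like with AirLLM; the Hugging Face model ID and generation settings are illustrative assumptions, not tested values:

```python
# Minimal sketch: layer-by-layer inference with AirLLM.
# The model ID and generation parameters below are illustrative assumptions.
from airllm import AutoModel

model = AutoModel.from_pretrained("meta-llama/Meta-Llama-3.1-405B-Instruct")

input_text = ["What is the capital of the United States?"]
input_tokens = model.tokenizer(
    input_text,
    return_tensors="pt",
    truncation=True,
    max_length=128,
)

# Each transformer layer is loaded from disk, executed, and released,
# so only one layer (plus activations) is resident in VRAM at a time.
output = model.generate(
    input_tokens["input_ids"].cuda(),
    max_new_tokens=20,
    use_cache=True,
    return_dict_in_generate=True,
)
print(model.tokenizer.decode(output.sequences[0]))
```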
More Stories
I Trained a 2D Game Animation Generation Model to Create Complex, Cool Game Actions (Fully Open-Source)
One afternoon six months ago, a friend came to me with an AI question: he had tried to use OpenAI’s text-to-image model to generate sprites for 2D game animations, but couldn’t get it to work because of character misalignment and consistency issues. I found it interesting and, on a whim, started training a 2D game animation generation model.
AI Scribes: 10+ $10m ARR Companies, Too Late to Join?
The AI scribes field has been booming lately, with no fewer than ten companies earning tens of millions of dollars annually. I've conducted in-depth research on over 70 companies in this space, analyzing their products, founding dates, available funding information, revenue disclosures, each company's barriers to entry, and their unique positioning.
Why It’s Extremely Hard to Start an AI Application Business with Large Language Models
The media hype surrounding AI might have you believe it’s soaring high above us, on the brink of rendering all human jobs obsolete and poised to rule the world. However, this is merely hype.
Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU!
The strongest open-source LLM, Llama3, has been released. Some followers have asked whether AirLLM can support running Llama3 70B locally with 4GB of VRAM. The answer is YES.
Open-Source SORA Has Arrived! Training Your Own SORA Model!
To date, the open-source model that comes closest to SORA is Latte, which employs the same Vision Transformer architecture as SORA. What exactly makes the Vision Transformer outstanding, and how does it differ from previous methods?
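At its core, a Vision Transformer replaces sliding convolutions with a simple step: cut each frame into patches, project every patch to a vector, and hand the resulting token sequence to a standard Transformer. Below is a minimal sketch of that patch-embedding step; the patch size and embedding dimension are illustrative, not Latte’s actual configuration:

```python
# Minimal sketch of ViT-style patch embedding: an image becomes a token sequence.
# Patch size and embedding dimension are illustrative, not Latte's real settings.
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    def __init__(self, patch_size=16, in_channels=3, embed_dim=768):
        super().__init__()
        # A strided convolution splits the image into non-overlapping patches
        # and projects each patch to an embedding vector in one step.
        self.proj = nn.Conv2d(in_channels, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                    # x: (batch, 3, H, W)
        x = self.proj(x)                     # (batch, embed_dim, H/16, W/16)
        return x.flatten(2).transpose(1, 2)  # (batch, num_patches, embed_dim)

frames = torch.randn(1, 3, 224, 224)
tokens = PatchEmbed()(frames)                # (1, 196, 768) -> fed to a Transformer
```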
How Your Ordinary 8GB MacBook’s Untapped AI Power Can Run 70B LLM Models That Will Blow Your Mind!
Do you think your Apple MacBook is only good for making PPTs, browsing the web, and streaming shows? If so, you really don’t understand the MacBook.
Unbelievable! Run 70B LLM Inference on a Single 4GB GPU with This NEW Technique
Large language models require huge amounts of GPU memory. Is it possible to run inference on a single GPU? If so, what is the minimum GPU memory required?
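A back-of-the-envelope estimate shows why layered execution changes the answer; the parameter count, layer count, and fp16 precision below are assumptions for illustration, not measured numbers:

```python
# Rough VRAM estimate for layer-by-layer 70B inference.
# Parameter count, layer count, and fp16 precision are illustrative assumptions.
params = 70e9                 # ~70B parameters
bytes_per_param = 2           # fp16
num_layers = 80               # Llama-2-70B has 80 transformer layers

full_model_gb = params * bytes_per_param / 1e9
per_layer_gb = full_model_gb / num_layers   # memory if only one layer is resident

print(f"whole model:  {full_model_gb:.0f} GB")   # ~140 GB, far beyond a single GPU
print(f"single layer: {per_layer_gb:.2f} GB")    # ~1.75 GB, fits in 4 GB of VRAM
```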
$100m AI Product Leader’s Checklist: 5 Key Criteria to Validate Your Next Big AI Idea
Over the years working on various AI products and technologies at top American tech giants, I have gradually formed a mental checklist for evaluating whether an AI product direction is promising. Many of its principles are about what “not to do”.
Exciting update!! Announcing new Anima LLM model, 100k context window!! Fully open source!
We believe the future of AI will be fully open and democratized. AI should be a tool that’s accessible to everyone, instead of only the big monopolies (some of which have the word “open” in their names 😆). QLoRA might be an important step towards that future. We want to make a small contribution to the historical process of democratizing AI, so we are open-sourcing the 33B QLoRA model we trained: all the model parameters, code, datasets, and evaluations are open!
Exciting update!! Announcing new Anima LLM model, 100k context window!! Fully open source!
We released the new Anima open-source 7B model, supporting an input window length of 100K! It’s based on Llama2, so it’s available for commercial use!
How AI Text Generation Models Are Reshaping Customer Support at Airbnb
Leveraging text generation models to build more effective, scalable customer support products.
Task-Oriented Conversational AI in Airbnb Customer Support
How Airbnb is powering automated support to enhance the host and guest experience.