30 Days of Llama 3: Newest Member of the Herd is Living up to the Hype

Blog Author - Brittany Carambio
Blog Author - Ben Hamm

May 17, 2024


It’s been 30 days since Meta AI dropped Llama 3, and among AI startups it’s safe to say that the newest member of the herd is living up to the hype.

At launch, Meta released impressive-looking quality benchmarks suggesting that GPT-4 may be in for serious open source competition. Exciting as that was, anyone building with LLMs will tell you that the true test comes only when the model is evaluated against a real-life use case.

We onboarded the 70B and 8B variants to the OctoAI platform on April 19, and customers began evaluating them immediately. Just 30 days into its reign as “next top model,” Llama 3 will soon account for 25% of all OctoAI customer traffic.

Llama 3 customer usage from launch to Llama 3 Hackathon May 11, 2024

In particular, users have been impressed with:

Knowledge and capability: customers perceive Llama 3 as superior to all open source alternatives (except potentially Mixtral-8x22B) in world knowledge, recursive planning, dialogue, and reasoning.
Fine-tuning malleability: fine-tuned variants have allowed our customers to consistently beat GPT-4 in both cost and quality for their use cases.
Prompt adherence: this has allowed for fine-grained control even without tuning. Customers report that Llama 3 exceeds GPT-4 on their use cases and matches it on JSON schema following.
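
If you want to check JSON schema following on your own prompts, one rough way is to measure how often a model's raw output parses into exactly the fields you asked for. The sketch below is only an illustration: the field names in `EXPECTED_FIELDS` and the helper names `follows_schema` and `adherence_rate` are our own assumptions, not part of any OctoAI or Llama 3 API, and the outputs you pass in would come from whichever model you are evaluating.

```python
import json

# Hypothetical schema for illustration only: field name -> expected Python type.
EXPECTED_FIELDS = {"title": str, "priority": int, "tags": list}


def follows_schema(raw_output: str, expected_fields: dict) -> bool:
    """Return True if raw_output is a JSON object with exactly the expected fields and types."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return False  # not valid JSON at all
    if not isinstance(parsed, dict):
        return False  # valid JSON, but not an object
    if set(parsed) != set(expected_fields):
        return False  # missing or extra fields
    # Note: Python treats bool as a subclass of int; tighten this check if that matters to you.
    return all(isinstance(parsed[name], kind) for name, kind in expected_fields.items())


def adherence_rate(outputs: list, expected_fields: dict) -> float:
    """Fraction of raw model outputs that follow the schema."""
    if not outputs:
        return 0.0
    return sum(follows_schema(o, expected_fields) for o in outputs) / len(outputs)
```

Running the same prompt set through two models and comparing their `adherence_rate` values gives a simple, use-case-specific number to set next to the published benchmarks. A full JSON Schema validator would be the stricter choice if your schema has nested objects or constraints.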

Many customers seeking a model to handle complex tasks have found it in Llama 3-70B. Several are transitioning from GPT-4 and discovering comparable quality and lower costs. Customers opt for Llama 3-8B when the use case calls for speed, efficiency, and cost effectiveness. In turn, they get a highly performant 8B model that is roughly on par with the previous generation’s Llama 2-70B.

OctoAI customers tend to be among the earliest adopters of new LLMs, giving them a front-row seat to the evolution of model/prompt interactions. One shared that “Llama-3-Instruct has the best prompt compliance/adherence of any model we have ever used, by far. It truly feels like a new paradigm in prompting. Every small change in wording results in changes in the model output that we can notice, which has given us an incredible level of fine-detail control through our prompting.”

Others are transitioning away from GPT-4 to fine-tuned Llama 3 variants in pursuit of dramatically lower costs. In one analysis, a customer who switched from GPT-4 saw 30% higher revenue conversion with a fine-tuned Llama-3-70B. Another determined that the switch from GPT-4 to a fine-tuned Llama-3-8B would result in an 80X cost reduction without a significant change in model quality. OctoAI is already hosting several fine-tuned variants for customers, including Hermes-2-Pro-Llama-3-8b and Hermes-2-Theta-Llama-3-8B, with more to come.
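
For a sense of how a cost-reduction multiple like this is worked out, here is a minimal sketch of the arithmetic. The prices and token volume below are placeholders we made up so the numbers are easy to follow; they are not actual GPT-4 or OctoAI prices, and they are not the customer's figures. The function names are ours as well.

```python
def monthly_cost(tokens_per_month: int, price_per_million_tokens: float) -> float:
    """Cost of serving a monthly token volume at a given per-million-token price."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


def cost_reduction_factor(baseline_price: float, candidate_price: float) -> float:
    """How many times cheaper the candidate is than the baseline, at equal token volume."""
    if candidate_price <= 0:
        raise ValueError("candidate_price must be positive")
    return baseline_price / candidate_price


# Placeholder prices (per million tokens), chosen only to illustrate the calculation.
BASELINE_PRICE = 40.0
CANDIDATE_PRICE = 0.5

factor = cost_reduction_factor(BASELINE_PRICE, CANDIDATE_PRICE)
print(f"{factor:.0f}X cheaper at the same token volume")
```

Because both models are billed per token, the multiple depends only on the price ratio as long as the two models use roughly the same number of tokens per request. If a fine-tuned model needs shorter prompts, the real saving can be larger than the price ratio alone suggests.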

The economics, quality, and ease of use are hard to argue with. But Llama 3 won’t be the last impressive model to come along, and it won’t be perfect for every use case. Some customers who’ve evaluated Llama 3 have opted to stick with other open source models. Benchmarks can only tell you so much; ultimately, what matters is how well a model serves your use case. As far as we can tell, Llama 3 looks pretty darn good. Ask us again in 30 days.

Sign up and get $10 in free credit to try Llama 3 yourself. Need hands-on support to evaluate an upgrade to Llama 3? Our team of experts can help.