OctoAI and Google Cloud Unite to Accelerate Generative AI Innovation

Blog Author - Luis Ceze

Apr 9, 2024

3 minutes


OctoAI is excited to announce a strategic partnership with Google Cloud to deliver efficient, reliable, customizable generative AI systems on Google Cloud's AI-optimized infrastructure.

“Through this partnership, we’re bringing together our AI-optimized infrastructure with OctoAI’s platform for serving generative AI models, giving developers more ways to use Google Cloud for inference and AI applications,” said Matt Renner, President, Global Field Organization at Google Cloud. “This partnership is a testament to Google Cloud’s commitment to supporting a vibrant ecosystem of startups and an open stack of AI tooling and applications.”

The partnership adds significant AI compute capacity to power OctoAI’s hosted inference APIs and model customization solutions for AI app developers. Customers can run their choice of model, including open source models (e.g. Mixtral 8x7B, SDXL), custom models, or fine-tuned models, with efficiency and reliability. Google Cloud’s robust infrastructure gives OctoAI the capacity to support workloads at scale and to accommodate the demands of businesses across various industries. Enabled by the added Google Cloud infrastructure, customers like Socialgist have been able to accelerate their evaluation of new LLM capabilities on OctoAI.

It’s been exciting to actively evaluate OctoAI’s diverse set of LLMs and features, including a dedicated JSON-mode access via the new Google Cloud endpoint. In our initial phase of evaluation we can see that it provides us with new opportunities to refine our programmatic integration with LLMs. We recognize the impact and potential of the partnership between OctoAI and Google Cloud, as it makes it easy and more performant for us to consume LLMs while we evolve our web footprint.

Nate Kerr, Software Engineer @ Socialgist

The partnership also expands access to OctoStack, a turnkey serving stack for generative AI models that runs in a customer’s Google Cloud environment. With OctoStack, Google Cloud customers get a self-contained solution that sits alongside their enterprise data, unlocking Retrieval-Augmented Generation (RAG) capabilities for private data stores like BigQuery.

OctoAI customers such as Capitol AI report significant speedups and cost savings using OctoAI to run open source and custom generative AI models. OctoStack brings these same efficiencies to GenAI deployments in a customer’s environment, with 4x better GPU utilization and an estimated 50% reduction in operational costs compared to best-in-class DIY deployments.

“OctoAI is excited about the opportunity to expand access to market-leading generative AI systems and infrastructure in partnership with Google Cloud,” said Luis Ceze, OctoAI Co-Founder and CEO. “Google Cloud delivers powerful, reliable AI compute for OctoAI customers to run generative AI applications in production at massive scale.”

Customers can sign up to try OctoAI for free today. To learn more about OctoStack and to sign up for the promotional launch offer, reach out to OctoAI.

About OctoAI

OctoAI (formerly OctoML) is on a mission to make AI more accessible and sustainable so it can be used to improve lives. The OctoAI platform delivers a complete stack for app builders to run, tune, and scale their AI applications in the cloud or on-prem. With blazing-fast inference APIs for popular models like SDXL, Mixtral, and Llama 2, end-to-end developer solutions, and world-class ML systems under the hood, businesses can focus on building apps that wow their customers without becoming AI infrastructure experts. OctoAI is based in Seattle, Washington, and is backed by Madrona Venture Partners, Amplify Partners, Tiger Global, and Addition Capital.