In the rapidly evolving landscape of AI, Large Language Models (LLMs) are pushing the boundaries of what's possible in natural language processing. But what if we could extend their capabilities beyond chat interfaces? What if we could give LLMs the power to make decisions and take actions? At OctoAI, we believe this is not only possible but essential for the next generation of AI applications.
Enter Function Calling – a game-changing capability that's transforming how we interact with and leverage LLMs. In this post, we'll dive into what Function Calling is, why it matters, and how you can harness its power using OctoAI's models like Llama 3.1 8b and 70b. We'll demonstrate its practical applications by extending our previous customer support example, showing you how to create more intelligent, autonomous AI systems that can streamline your workflows and save valuable time for your team.
Whether you're a developer looking to push the boundaries of what's possible with LLMs, or a business leader seeking to optimize your operations, this post will provide you with the insights and tools you need to get started with Function Calling on OctoAI.
Function Calling: Empowering LLMs with Decision-Making Capabilities
When we talk about LLM agents, we're referring to AI systems that can perceive their environment and act upon it to achieve specific goals. In the context of customer support, for example, an LLM agent's goal might be to satisfy customer requests efficiently while minimizing the need for human intervention.
To create effective LLM agents, we need to consider several key components:
Environment: The context in which the agent operates (e.g., user interactions, integrated systems).
Sensors: How the agent perceives its environment (e.g., input text, API responses).
Actuators: How the agent acts upon its environment (e.g., generated text, API calls).
Goals: Defined objectives that guide the agent's actions.
Planning: For complex tasks, the ability to break down goals into subtasks and execute them.
Function calling is a crucial capability that enables LLM agents to interact with their environment more effectively. It allows the model to decide when it needs additional information or when to use external functionalities, bridging the gap between language understanding and real-world actions.
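To make this concrete, here is a minimal sketch of a single function exposed to a model. It assumes the widely used OpenAI-style "tools" format, in which each function is described by a JSON schema and the model replies with a function name plus a JSON string of arguments. The function get_order_status, its stubbed data, and the simulated model message are all hypothetical and stand in for a real API response.

```python
import json

# Hypothetical local function the model may ask us to run.
def get_order_status(order_id: str) -> dict:
    # Stubbed lookup; a real system would query an order database or API.
    return {"order_id": order_id, "status": "shipped"}

# JSON-schema description of the function (OpenAI-style "tools" format, assumed).
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the shipping status of an order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

# Maps function names the model may emit to the Python callables that run them.
REGISTRY = {"get_order_status": get_order_status}

def handle_model_message(message: dict) -> str:
    """Run the requested function if the model asked for one; otherwise return its text."""
    tool_calls = message.get("tool_calls") or []
    if not tool_calls:
        return message.get("content") or ""
    call = tool_calls[0]["function"]
    args = json.loads(call["arguments"])  # arguments arrive as a JSON string
    result = REGISTRY[call["name"]](**args)
    return json.dumps(result)

# Simulated model output standing in for a real API response.
simulated = {
    "content": None,
    "tool_calls": [
        {"function": {"name": "get_order_status",
                      "arguments": '{"order_id": "A123"}'}}
    ],
}
print(handle_model_message(simulated))
```

In a real loop, the returned JSON would be sent back to the model as a tool result so it can compose the final answer.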
Levels of Function Calling Capabilities
Simple Function Calling: The LLM decides whether to call a single available function.
Multiple Function Calling: The LLM chooses from multiple available functions.
For example, consider a travel recommendation system with functions like get_rain_prob_by_location, get_temperature_by_location, and is_nice_weather. With parallel function calling, where the model requests several function calls in a single response, the LLM can gather weather data for multiple locations in one step, which can substantially reduce response time.
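The travel example could be sketched roughly as follows. The three function names come from the text above, but their signatures, the weather values, and the "nice weather" thresholds are assumptions made up for illustration. The list of calls is a simulated model response, not output from a real API.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Stubbed weather data; the values are invented for illustration only.
_FAKE_WEATHER = {
    "Lisbon": {"rain_prob": 0.1, "temp_c": 24.0},
    "Oslo": {"rain_prob": 0.6, "temp_c": 9.0},
}

def get_rain_prob_by_location(location: str) -> float:
    return _FAKE_WEATHER[location]["rain_prob"]

def get_temperature_by_location(location: str) -> float:
    return _FAKE_WEATHER[location]["temp_c"]

def is_nice_weather(rain_prob: float, temp_c: float) -> bool:
    # Hypothetical thresholds; tune for a real application.
    return rain_prob < 0.3 and 15.0 <= temp_c <= 30.0

REGISTRY = {
    "get_rain_prob_by_location": get_rain_prob_by_location,
    "get_temperature_by_location": get_temperature_by_location,
    "is_nice_weather": is_nice_weather,
}

def run_tool_calls(tool_calls: list) -> list:
    """Execute all calls from one model turn concurrently, preserving their order."""
    def run_one(call: dict):
        fn = REGISTRY[call["name"]]
        return fn(**json.loads(call["arguments"]))
    with ThreadPoolExecutor() as pool:
        return list(pool.map(run_one, tool_calls))

# Simulated single model turn that requests four calls at once.
simulated_calls = [
    {"name": "get_rain_prob_by_location", "arguments": '{"location": "Lisbon"}'},
    {"name": "get_temperature_by_location", "arguments": '{"location": "Lisbon"}'},
    {"name": "get_rain_prob_by_location", "arguments": '{"location": "Oslo"}'},
    {"name": "get_temperature_by_location", "arguments": '{"location": "Oslo"}'},
]
results = run_tool_calls(simulated_calls)
print(results)
```

Because the four lookups are independent, they can run concurrently instead of costing one model round trip each. A call such as is_nice_weather depends on their results, so it would run in a later step.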
Some models, including the Llama 3.1 herd, also ship with native (built-in) tools, so these functions do not need custom definitions. The built-in tools include Brave Search, Wolfram Alpha, and a Code Interpreter for Python.
Navigating the Challenges of Function Calling
While Function Calling offers powerful capabilities, it's important to be aware of potential challenges when implementing it in production environments:
Decision Accuracy: Ensuring the LLM correctly decides when to call functions and which ones to use.
Payload Generation: Generating valid, well-structured payloads consistently.
Latency Management: Optimizing response times, particularly with complex, nested function calls.
Error Handling: Gracefully managing cases where function calls fail or return unexpected results.
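Two of these challenges, payload generation and error handling, can be mitigated with a thin validation layer around every call. The sketch below is one possible approach using only the standard library; validate_payload and safe_call are hypothetical helper names, and a production system might use a full JSON Schema validator instead.

```python
import json

def validate_payload(raw_arguments: str, required: dict) -> dict:
    """Parse model-generated arguments and check required keys and their types."""
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc.msg}") from exc
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    for key, expected_type in required.items():
        if key not in args:
            raise ValueError(f"missing required argument: {key}")
        if not isinstance(args[key], expected_type):
            raise ValueError(f"argument {key} must be {expected_type.__name__}")
    return args

def safe_call(fn, raw_arguments: str, required: dict) -> dict:
    """Return a result dict that can be fed back to the model on success or failure."""
    try:
        args = validate_payload(raw_arguments, required)
        return {"ok": True, "result": fn(**args)}
    except Exception as exc:  # covers invalid payloads and failures inside fn
        return {"ok": False, "error": str(exc)}

# Hypothetical function used only to exercise the wrapper.
def get_rain_prob(location: str) -> float:
    return 0.2

good = safe_call(get_rain_prob, '{"location": "Paris"}', {"location": str})
bad_type = safe_call(get_rain_prob, '{"location": 5}', {"location": str})
bad_json = safe_call(get_rain_prob, '{not json', {"location": str})
print(good, bad_type, bad_json)
```

Returning the error message rather than raising lets the application pass it back to the model, which can then retry with a corrected payload.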
Despite these challenges, JSON-based Function Calling offers several advantages for production environments:
Predictability: Easier to ensure valid, parsable responses.
Security: Reduced risk compared to executing arbitrary generated code. Linting the generated JSON can further improve predictability and tighten the security feedback loop.
Ease of Integration: Simpler to integrate with existing systems and APIs.
Function Calling in Action: Enhancing Customer Support Automation
As a refresher, in a previous blog post we implemented an LLM-based tool that helps our Product and CX Managers quickly group user complaints and write informative Jira tickets for them.
The high-level demo workflow:
Convert the reviews into JSON, labeling each with its specific category and its sentiment (positive or negative).
Select the negative reviews.
Group the issues that refer to the same problem.
Summarize these issue groups into a single ticket and “submit” it to Jira.
We can go further and let the LLM decide whether it has enough information about an issue, and let it respond to customer reviews. By providing additional functions as options to the model and increasing the “agency” offered to it, we approach the agentic workflows described at the beginning of this post.
So here’s the updated workflow, also high-level:
Iterate over all the reviews and feed them to our function-calling-enabled models.
For each review, we do one of the following steps:
(a) If it’s a positive review, we send a thank-you note.
(b) If it’s negative feedback, we send an apology message, and we extract the type of issue and the summary for later ticket creation.
(c) If it’s an unclear review or it’s negative but not very detailed, we ask some follow-up questions. Note that we can repeat (c) multiple times until we can finally run either (a) or (b).
Group together the issues that refer to the same problem.
Summarize these issue groups into a single ticket and “submit” it to Jira.
Notice that we delegate more decision-making to the model, and we let it interact with the users, too.
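The per-review loop in steps (a) to (c) might be sketched as follows. This is not the demo application's code: the function names send_thank_you, send_apology, and ask_follow_up, the max_turns limit, and the scripted decisions that stand in for the model are all assumptions for illustration, and the review contents are invented.

```python
import json

issues = []  # issue records collected for later grouping and ticket creation
outbox = []  # stands in for a real API that posts replies to customers

def send_thank_you(review_id: str) -> None:
    outbox.append((review_id, "thank_you"))

def send_apology(review_id: str, issue_type: str, summary: str) -> None:
    outbox.append((review_id, "apology"))
    issues.append({"review_id": review_id, "issue_type": issue_type, "summary": summary})

def ask_follow_up(review_id: str, question: str) -> None:
    outbox.append((review_id, "follow_up"))

REGISTRY = {
    "send_thank_you": send_thank_you,
    "send_apology": send_apology,
    "ask_follow_up": ask_follow_up,
}
TERMINAL = {"send_thank_you", "send_apology"}  # steps (a) and (b) end the loop

def process_review(review_id: str, decide, max_turns: int = 3) -> str:
    """decide(review_id, turn) returns one function call; it stands in for the model."""
    for turn in range(max_turns):
        call = decide(review_id, turn)
        args = json.loads(call["arguments"])
        REGISTRY[call["name"]](review_id=review_id, **args)
        if call["name"] in TERMINAL:
            return call["name"]
    return "unresolved"  # follow-ups exhausted without reaching (a) or (b)

# Scripted decisions replacing real model output (invented example data).
SCRIPT = {
    "r1": [{"name": "send_thank_you", "arguments": "{}"}],
    "r2": [
        {"name": "ask_follow_up", "arguments": '{"question": "What went wrong?"}'},
        {"name": "send_apology",
         "arguments": '{"issue_type": "delivery", "summary": "Magazine arrived damaged"}'},
    ],
}

def scripted_decide(review_id: str, turn: int) -> dict:
    return SCRIPT[review_id][turn]

outcomes = {rid: process_review(rid, scripted_decide) for rid in SCRIPT}
print(outcomes, issues)
```

Capping the number of follow-up turns keeps step (c) from looping indefinitely when a review never becomes clear enough to classify.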
We used magazine reviews from Amazon as our source of customer feedback. We also mocked our API requests to Jira. No customer replies were sent, but in a real-world implementation, the functions should be able to send API requests to post reply messages. Finally, to simulate a user’s reply to our follow-up questions, we use a Llama 3 model.
Potential Extensions:
Documentation Search: Add a function to search product documentation and provide relevant information to users.
Intelligent Routing: Use Function Calling to determine when to escalate issues to human support.
Personalized Responses: Incorporate user history and preferences into the AI's decision-making process.
Implementation Considerations:
When implementing such a system, it's crucial to define clear boundaries for the AI's decision-making power. For instance, while it's generally safe to let the AI send apologies or thank-you notes, decisions with financial implications (like offering discounts) should remain under human control.
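One simple way to enforce such boundaries is an explicit allow-list checked before any model-requested function runs. The sketch below is an assumption about how this could look; the function names, including offer_discount, are hypothetical.

```python
# Low-risk actions the model may trigger on its own (hypothetical names).
AUTO_APPROVED = {"send_thank_you", "send_apology", "ask_follow_up"}
# Actions with financial implications that require a human decision.
NEEDS_HUMAN = {"offer_discount"}

def route_action(function_name: str) -> str:
    """Decide what to do with a function call requested by the model."""
    if function_name in AUTO_APPROVED:
        return "execute"
    if function_name in NEEDS_HUMAN:
        return "queue_for_human"
    return "reject"  # unknown functions are never executed

for name in ("send_apology", "offer_discount", "delete_account"):
    print(name, "->", route_action(name))
```

Rejecting anything outside both sets means a hallucinated function name fails closed rather than reaching a live system.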
Unlocking the Full Potential of LLMs with Function Calling
Function Calling represents a significant leap forward in LLM capabilities, opening up new possibilities for AI-driven applications. By enabling LLMs to interact with external systems and make decisions, we can create more powerful, flexible, and autonomous AI agents.
Key takeaways:
Enhanced Autonomy: Function Calling allows LLMs to decide when they need additional information or actions, reducing the need for complex, hard-coded logic.
Improved Efficiency: Parallel function calling can significantly reduce latency in complex workflows.
Versatility: From customer support to data analysis, Function Calling can enhance a wide range of business applications.
Seamless Integration: OctoAI's JSON-based approach ensures safe and easy integration with existing systems.
Moreover, function calling can revolutionize how we approach Retrieval-Augmented Generation (RAG) applications. By defining document searches as functions, we can create more intelligent and efficient information retrieval systems, capable of handling diverse information sources without the need for separate semantic routing systems.
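As a rough illustration of this idea, each information source can be exposed as its own search function, so that the model's choice of function acts as the router. Everything in the sketch below is hypothetical: the source names, the documents, and the naive keyword match that stands in for a real vector or full-text search.

```python
import json

# Invented example corpora, one per information source.
DOCS = {
    "product_docs": ["How to reset your password", "Exporting reports to CSV"],
    "billing_faq": ["How refunds are processed", "Updating your payment method"],
}

def _search(source: str, query: str) -> list:
    """Naive keyword match standing in for a real retrieval backend."""
    terms = query.lower().split()
    return [doc for doc in DOCS[source] if any(t in doc.lower() for t in terms)]

def search_product_docs(query: str) -> list:
    return _search("product_docs", query)

def search_billing_faq(query: str) -> list:
    return _search("billing_faq", query)

SEARCH_TOOLS = {
    "search_product_docs": search_product_docs,
    "search_billing_faq": search_billing_faq,
}

# Simulated model decision: picking a function is the routing step.
call = {"name": "search_billing_faq", "arguments": '{"query": "refunds"}'}
hits = SEARCH_TOOLS[call["name"]](**json.loads(call["arguments"]))
print(hits)
```

The retrieved passages would then be returned to the model as the function result and used to ground its answer.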
Whether you're looking to automate customer support, enhance decision-making processes, or create more intelligent data analysis tools, OctoAI provides the robust, secure, and scalable foundation you need to bring your AI innovations to life.
Try Function Calling for your use cases on OctoAI Text Gen Solution today
You can get started with the OctoAI Text Gen Solution today at no cost and try the newest Llama 3.1 models. The application used for the demo in this blog is available here:
Also, check out OctoStack to bring the power of OctoAI to your private cloud.