Conversational AI adoption is accelerating in marketing, sales, and customer service, with over 40% of organizations already implementing this technology. However, many business leaders are unsure how to begin implementation, particularly when it comes to choosing between open-source and closed-source large language models (LLMs).
Key considerations for building conversational AI: The choice between popular LLMs like GPT-4o (OpenAI) and Llama 3 (Meta) depends on factors such as setup costs, processing costs, and specific business needs.
- Setup costs include development and operational expenses to get the LLM running, while processing costs cover the actual expense of each conversation once the tool is live.
- The cost-to-value ratio depends on the intended use of the LLM and the expected usage volume.
- GPT-4o offers quicker deployment with minimal setup, while Llama 3 requires more initial investment but may provide long-term cost benefits for high-volume users.
Understanding LLM pricing models: LLMs are typically priced per “token,” the unit of text a model reads as input and generates as output, though how text is split into tokens varies between models.
- GPT-4o, a closed-source model, is priced at $0.005 per 1,000 input tokens and $0.015 per 1,000 output tokens.
- Llama 3, an open-source model, can be hosted on private servers or cloud infrastructure, with providers like Amazon Bedrock charging $0.00265 per 1,000 input tokens and $0.00350 per 1,000 output tokens.
Cost comparison for a benchmark conversation: For a hypothetical conversation of 16 messages totaling 30,390 tokens, the cost per conversation for each LLM is as follows.
- GPT-4o: Approximately $0.16 per conversation
- Llama 3 (on Amazon Bedrock): Approximately $0.08 per conversation, not including server costs
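The per-conversation figures above follow from the per-token rates, and a short Python sketch can make the arithmetic explicit. The rates and the 30,390-token total come from the article; the split between input and output tokens is not stated there, so the 800-output-token figure below is an assumption chosen only because it lands near the rounded totals. The class and constant names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Per-1,000-token prices in USD."""
    input_per_1k: float
    output_per_1k: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the USD token cost of one conversation."""
        return (input_tokens * self.input_per_1k
                + output_tokens * self.output_per_1k) / 1000


# Rates quoted in the article.
GPT_4O = TokenPricing(input_per_1k=0.005, output_per_1k=0.015)
LLAMA_3_BEDROCK = TokenPricing(input_per_1k=0.00265, output_per_1k=0.00350)

TOTAL_TOKENS = 30_390  # benchmark conversation size from the article

# ASSUMPTION: the article does not give the input/output split.
# 800 output tokens is a hypothetical value, not a reported one.
ASSUMED_OUTPUT_TOKENS = 800
ASSUMED_INPUT_TOKENS = TOTAL_TOKENS - ASSUMED_OUTPUT_TOKENS

for name, pricing in [("GPT-4o", GPT_4O), ("Llama 3 (Bedrock)", LLAMA_3_BEDROCK)]:
    per_conversation = pricing.cost(ASSUMED_INPUT_TOKENS, ASSUMED_OUTPUT_TOKENS)
    print(f"{name}: ${per_conversation:.2f} per conversation")
```

Because output tokens cost more than input tokens on both models, the result is sensitive to the split: pricing every token at the input rate gives a floor of about $0.15 for GPT-4o and $0.08 for Llama 3, so a conversation with longer model replies would cost more than the benchmark suggests.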
Additional factors to consider: The decision between LLMs should take into account various aspects beyond just token costs.
- Time to deployment: GPT-4o offers faster implementation, while Llama 3 may require weeks of setup.
- Usage volume: High-volume users may benefit more from Llama 3’s lower per-conversation costs in the long run.
- Control and customization: Open-source models like Llama 3 offer more control over the product and data.
- Operational requirements: Llama 3 demands more time and resources for setup, maintenance, and infrastructure management.
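The trade-off between higher setup cost and lower per-conversation cost can be framed as a break-even volume. The sketch below is a simplification: it treats the extra setup cost as a single one-time amount and ignores ongoing maintenance and infrastructure. The article gives no setup-cost figure, so the $20,000 used here is purely hypothetical, as is the function name.

```python
import math


def break_even_conversations(extra_setup_cost: float,
                             cost_per_conv_quick: float,
                             cost_per_conv_cheap: float) -> int:
    """Conversations needed before the cheaper-per-conversation option
    recovers its extra one-time setup cost."""
    saving = cost_per_conv_quick - cost_per_conv_cheap
    if saving <= 0:
        raise ValueError("the second option must be cheaper per conversation")
    return math.ceil(extra_setup_cost / saving)


# Per-conversation costs from the article's benchmark.
GPT_4O_PER_CONVERSATION = 0.16
LLAMA_3_PER_CONVERSATION = 0.08

# ASSUMPTION: hypothetical extra setup cost for self-managed Llama 3;
# the article does not provide this number.
HYPOTHETICAL_EXTRA_SETUP_COST = 20_000.0

volume = break_even_conversations(HYPOTHETICAL_EXTRA_SETUP_COST,
                                  GPT_4O_PER_CONVERSATION,
                                  LLAMA_3_PER_CONVERSATION)
print(f"Break-even after roughly {volume:,} conversations")
```

Substituting a real setup estimate, and adding recurring operational costs to the per-conversation figure, would give a more realistic threshold for a specific business.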
Weighing the options: The choice between building in-house and using an off-the-shelf solution depends on the company’s specific needs and resources.
- Companies planning to use conversational AI as a core service may find it worthwhile to invest in building their own solution.
- For businesses where conversational AI is not a fundamental element of their brand, off-the-shelf products may offer a more cost-effective and efficient solution.
Looking ahead: As conversational AI continues to evolve, businesses must carefully evaluate their options based on their unique context and customer needs.
- The rapid adoption of generative AI in various sectors indicates its growing importance in bridging communication gaps between businesses and customers.
- Continuous assessment of LLM options and their associated costs will be crucial as the technology advances and market demands change.
What does it cost to build a conversational AI?