LlamaIndex Integration Guide

This guide walks you through integrating LlamaIndex with APIpie, enabling you to build Retrieval-Augmented Generation (RAG) applications that connect your custom data sources to various LLMs through a unified interface.
What is LlamaIndex?
LlamaIndex is a comprehensive data framework for connecting custom data to LLMs. It provides tools for:
- Data Ingestion: Connect to various data sources through built-in connectors
- Data Indexing: Structure your data for efficient retrieval
- Retrieval: Extract relevant context for queries
- Synthesis: Generate accurate responses augmented with retrieved context
- Evaluation: Assess and improve RAG system performance
- Agent Orchestration: Build complex LLM agents with data access
By connecting LlamaIndex with APIpie, you gain access to a wide range of powerful language models while leveraging LlamaIndex's sophisticated data management capabilities.
Integration Steps
1. Create an APIpie Account
- Register here: APIpie Registration
- Complete the sign-up process.
2. Add Credit
- Add Credit: APIpie Subscription
- Add credits to your account to enable API access.
3. Generate an API Key
- API Key Management: APIpie API Keys
- Create a new API key for use with LlamaIndex.
4. Install LlamaIndex
Install LlamaIndex core and required packages:
pip install llama-index-core
pip install llama-index-llms-openai # For OpenAI-compatible endpoints like APIpie
For advanced use cases, you may need additional packages:
pip install llama-index-embeddings-openai # For embeddings
pip install llama-index-vector-stores-qdrant # For Qdrant vector store
# or other integrations as needed
5. Configure LlamaIndex for APIpie
Create a custom LLM configuration that points to APIpie:
import os
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings

# Configure APIpie as the LLM provider
api_key = "your-apipie-api-key"
apipie_llm = OpenAI(
    api_key=api_key,
    api_base="https://apipie.ai/v1",
    model="gpt-4o-mini",  # You can use any model available on APIpie
    temperature=0.1,
)

# Set as the default LLM
Settings.llm = apipie_llm
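Before building any indexes, you can verify the connection with a one-off completion. This is a minimal sanity check and assumes your APIpie key is valid and your account has credit:
# Quick sanity check: send a single completion request through APIpie
test_response = apipie_llm.complete("Say hello in one short sentence.")
print(test_response.text)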
Key Features
- Multiple Data Connector Options: Connect to APIs, PDFs, CSVs, SQL databases, websites, and more
- Flexible Data Indexing: Create vector stores, summaries, keyword indices, and knowledge graphs
- Advanced Retrieval: Implement BM25, hybrid search, or re-ranking for improved results
- Query Planning: Break down complex queries into sub-questions for comprehensive answers
- Caching: Optimize performance and reduce API costs
- Evaluation Framework: Assess and fine-tune your RAG systems
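Many of these behaviors are driven by the global Settings object in llama_index.core. As a small, hedged illustration (the numbers are arbitrary examples, not tuned recommendations), here is how chunking can be adjusted before documents are indexed:
from llama_index.core import Settings

# Control how documents are split into chunks before embedding and indexing
Settings.chunk_size = 512     # target tokens per chunk
Settings.chunk_overlap = 50   # tokens shared between consecutive chunks
Smaller chunks generally improve retrieval precision at the cost of more embedding calls; larger chunks preserve more context per node.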
Example Workflows
| Application Type | What LlamaIndex Helps You Build |
| --- | --- |
| Document Q&A | Systems that answer questions about specific documents |
| Knowledge Bases | Comprehensive knowledge systems from multiple data sources |
| Research Assistants | Tools that analyze and synthesize information |
| Data Analysis | Systems that query and analyze structured data |
| Multi-Agent Applications | Complex agent systems with coordinated data access |
Using LlamaIndex with APIpie
Basic Document Q&A
import os
from llama_index.llms.openai import OpenAI
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    Settings,
)

# Configure APIpie
api_key = "your-apipie-api-key"
apipie_llm = OpenAI(
    api_key=api_key,
    api_base="https://apipie.ai/v1",
    model="gpt-4o-mini",  # You can use any model available on APIpie
    temperature=0.1,
)
# Set as the default LLM
Settings.llm = apipie_llm
# Load your documents
documents = SimpleDirectoryReader("./data").load_data()
# Create an index from the documents
index = VectorStoreIndex.from_documents(documents)
# Create a query engine
query_engine = index.as_query_engine()
# Query your data
response = query_engine.query("What is the main topic discussed in these documents?")
print(response)
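If you want to see which chunks the answer was grounded in (helpful when debugging retrieval), the response object exposes the retrieved nodes. A short sketch:
# Inspect the retrieved chunks used to synthesize the answer
for node_with_score in response.source_nodes:
    print("score:", node_with_score.score)
    print(node_with_score.node.get_text()[:200])  # first 200 characters of the chunk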
Advanced RAG with Custom Embeddings
import os
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    Settings,
)

# Configure APIpie for the LLM
api_key = "your-apipie-api-key"
apipie_llm = OpenAI(
    api_key=api_key,
    api_base="https://apipie.ai/v1",
    model="gpt-4o",
    temperature=0.1,
)

# Configure APIpie for embeddings
apipie_embed_model = OpenAIEmbedding(
    api_key=api_key,
    api_base="https://apipie.ai/v1",
    model="text-embedding-3-large",
    embed_batch_size=100,
)

# Set the default models
Settings.llm = apipie_llm
Settings.embed_model = apipie_embed_model
# Load your documents
documents = SimpleDirectoryReader("./data").load_data()
# Create an index with the custom settings
index = VectorStoreIndex.from_documents(documents)
# Create a query engine with more advanced settings
query_engine = index.as_query_engine(
    similarity_top_k=5,  # Retrieve the top 5 most similar chunks
    streaming=True,      # Enable streaming responses
)

# Query your data and print the response as it streams in
response = query_engine.query(
    "Provide a detailed summary of these documents and their key insights."
)
response.print_response_stream()
Using a Persistent Vector Store
import os
from llama_index.llms.openai import OpenAI
from llama_index.vector_stores.qdrant import QdrantVectorStore
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    Settings,
    StorageContext,
)
import qdrant_client

# Configure APIpie
api_key = "your-apipie-api-key"
apipie_llm = OpenAI(
    api_key=api_key,
    api_base="https://apipie.ai/v1",
    model="gpt-4o-mini",
)

# Set as the default LLM
Settings.llm = apipie_llm

# Create a Qdrant client (local or cloud)
client = qdrant_client.QdrantClient(
    location=":memory:",  # Use a real URL for production
)

# Create a QdrantVectorStore
vector_store = QdrantVectorStore(
    client=client,
    collection_name="documents",
)

# Create a storage context that uses the Qdrant vector store
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Load your documents
documents = SimpleDirectoryReader("./data").load_data()

# Create an index backed by the custom vector store
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context,
)
# Create a query engine
query_engine = index.as_query_engine()
# Query your data
response = query_engine.query("What are the main points in these documents?")
print(response)
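Because the embeddings now persist in Qdrant, you do not have to re-ingest documents on every run. A hedged sketch of reconnecting to the same collection later (assuming the "documents" collection already exists and the same embedding model is configured):
# Reconnect to the existing collection instead of re-indexing
existing_vector_store = QdrantVectorStore(
    client=client,
    collection_name="documents",
)
index = VectorStoreIndex.from_vector_store(vector_store=existing_vector_store)
query_engine = index.as_query_engine()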
Building an Agent with Data Access
import os
from llama_index.llms.openai import OpenAI
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    Settings,
)
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler

# Configure callbacks for logging (LlamaDebugHandler prints event traces to stdout)
Settings.callback_manager = CallbackManager([LlamaDebugHandler(print_trace_on_end=True)])

# Configure APIpie
api_key = "your-apipie-api-key"
apipie_llm = OpenAI(
    api_key=api_key,
    api_base="https://apipie.ai/v1",
    model="gpt-4o",  # Use a more capable model for the agent
    temperature=0.1,
)

# Set as the default LLM
Settings.llm = apipie_llm
# Load different document sets
financial_docs = SimpleDirectoryReader("./financial_data").load_data()
product_docs = SimpleDirectoryReader("./product_data").load_data()
# Create indices for each document set
financial_index = VectorStoreIndex.from_documents(financial_docs)
product_index = VectorStoreIndex.from_documents(product_docs)
# Create query engines for each index
financial_engine = financial_index.as_query_engine()
product_engine = product_index.as_query_engine()
# Create tools from the query engines
tools = [
    QueryEngineTool(
        query_engine=financial_engine,
        metadata=ToolMetadata(
            name="financial_data",
            description="Provides information about financial statements, revenue, and business performance",
        ),
    ),
    QueryEngineTool(
        query_engine=product_engine,
        metadata=ToolMetadata(
            name="product_data",
            description="Provides information about products, features, and specifications",
        ),
    ),
]

# Create a sub-question query engine that can route to the appropriate tool
query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=tools,
    verbose=True,
)

# Query across both datasets
response = query_engine.query(
    "Compare the financial performance of our top-selling product to the overall company results last quarter."
)
print(response)
Troubleshooting & FAQ
- Which models are supported?
  Any OpenAI-compatible model available through APIpie's endpoint.
- How do I handle environment variables securely?
  Store your API keys in environment variables or a secure secrets manager, and never commit them to repositories. A loading pattern is sketched below.
- How can I optimize token usage?
  LlamaIndex provides several mechanisms to reduce token usage, including chunking strategies, text-splitting parameters, and caching. You can adjust the chunk_size and chunk_overlap parameters when documents are ingested.
- What if I need to handle very large datasets?
  For large datasets, consider a persistent vector database such as Pinecone, Weaviate, or Qdrant. LlamaIndex provides integrations with many vector stores.
- How do I debug retrieval issues?
  Use the verbose=True parameter when creating query engines and enable a callback handler (such as the debug handler in the agent example above) to see detailed logs of the retrieval process.
- Can I use custom embedding models?
  Yes. LlamaIndex supports many embedding models; configure them through the appropriate integration packages and the Settings.embed_model setting.
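For the environment-variable question above, here is a minimal pattern for loading the key at runtime; the variable name APIPIE_API_KEY is just an example, not a name APIpie requires:
import os
from llama_index.llms.openai import OpenAI

# Read the key from the environment instead of hard-coding it in source
api_key = os.environ["APIPIE_API_KEY"]  # example variable name
apipie_llm = OpenAI(
    api_key=api_key,
    api_base="https://apipie.ai/v1",
    model="gpt-4o-mini",
)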
For more information, see the LlamaIndex documentation or the GitHub repository.
Support
If you encounter any issues during the integration process, please reach out on APIpie Discord for assistance.