# Pixeltable Integration Guide

This guide walks you through integrating Pixeltable with APIpie, enabling you to build powerful multimodal AI applications with declarative data infrastructure and access to a wide range of language models.
## What is Pixeltable?
Pixeltable is a declarative data infrastructure framework for multimodal AI apps that provides:
- Data Ingestion & Storage: Work with images, videos, audio, documents, and structured data
- Transformation & Processing: Apply Python functions or built-in operations automatically
- AI Model Integration: Run inference (embeddings, object detection, LLMs) as part of data pipelines
- Indexing & Retrieval: Create and manage vector indexes for semantic search
- Incremental Computation: Only recompute what's necessary when data or code changes
- Versioning & Lineage: Track data and schema changes for reproducibility
By connecting Pixeltable with APIpie, you gain access to a wide range of powerful language models for your multimodal applications while leveraging Pixeltable's sophisticated data management capabilities.
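The following minimal sketch illustrates the declarative model described above; the table and column names are hypothetical and unrelated to APIpie. A computed column is defined once, and Pixeltable evaluates it for every row inserted afterwards, recomputing only what changes:

```python
import pixeltable as pxt

# Hypothetical demo table with a single numeric column
t = pxt.create_table('quickstart_demo', {'x': pxt.Int}, if_exists='replace')

# Declare the transformation once; Pixeltable applies it to every row
t.add_computed_column(x_squared=t.x * t.x)

# Newly inserted rows are computed automatically; nothing else is recomputed
t.insert([{'x': 2}, {'x': 3}])
print(t.collect())
```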
## Integration Steps
### 1. Create an APIpie Account
- Register here: APIpie Registration
- Complete the sign-up process.
### 2. Add Credit
- Add Credit: APIpie Subscription
- Add credits to your account to enable API access.
### 3. Generate an API Key
- API Key Management: APIpie API Keys
- Create a new API key for use with Pixeltable.
### 4. Install Pixeltable
Install Pixeltable using pip:
```bash
pip install pixeltable
```
For additional functionality, you may need extra packages:

```bash
# For working with embedding models
pip install sentence-transformers

# For object detection
pip install pixeltable-yolox

# For document processing
pip install spacy
python -m spacy download en_core_web_sm
```
### 5. Configure Pixeltable for APIpie
Pixeltable integrates with APIpie through built-in functions for OpenAI-compatible endpoints:
```python
import pixeltable as pxt
from pixeltable.functions import openai

# Configure the OpenAI functions with APIpie credentials
# This configuration can be reused across your tables
openai.configure(
    api_key="your-apipie-api-key",
    base_url="https://apipie.ai/v1"
)
```
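Rather than hardcoding the key, you can load it from an environment variable at configure time. This is a minimal sketch that assumes a variable named `APIPIE_API_KEY` (an arbitrary name, not required by APIpie or Pixeltable) has been exported in your shell, and it reuses the same `configure()` call shown above:

```python
import os

from pixeltable.functions import openai

# Read the APIpie key from the environment instead of hardcoding it
# (assumes APIPIE_API_KEY has been exported in your shell)
openai.configure(
    api_key=os.environ["APIPIE_API_KEY"],
    base_url="https://apipie.ai/v1"
)
```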
## Key Features
- Unified Multimodal Interface: Work with images, videos, audio, documents, and structured data through a consistent interface
- Declarative Computed Columns: Define transformations that run automatically on new or updated data
- Built-in Vector Search: Add embedding indexes for similarity search directly on tables/views
- On-the-Fly Data Views: Create virtual tables using iterators for efficient processing
- Seamless AI Integration: Built-in functions for various AI providers including APIpie
- Custom Python Functions: Extend with User-Defined Functions (UDFs)
- Incremental Computation: Save time and costs by only recomputing what's necessary
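As a quick illustration of the UDF support listed above, the sketch below (table and function names are hypothetical) registers a plain Python function and applies it as a computed column:

```python
import pixeltable as pxt

# Hypothetical table of free-form notes
notes = pxt.create_table('udf_demo', {'note': pxt.String}, if_exists='replace')

# A user-defined function becomes a reusable transformation
@pxt.udf
def word_count(text: str) -> int:
    return len(text.split())

# Applied as a computed column, it runs automatically on each inserted row
notes.add_computed_column(n_words=word_count(notes.note))
notes.insert([{'note': 'Pixeltable plus APIpie for multimodal apps'}])
print(notes.select(notes.note, notes.n_words).collect())
```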
## Example Workflows

| Application Type | What Pixeltable Helps You Build |
|---|---|
| Multimodal RAG Systems | Apps that search and generate content from diverse media |
| Computer Vision Apps | Applications performing detection and classification |
| Content Generation | Systems that create and modify text and images |
| Data Curation & Labeling | Tools for organizing and annotating multimodal datasets |
| AI Evaluation Frameworks | Systems to test and benchmark model performance |
## Using Pixeltable with APIpie
### Basic Text Generation with APIpie
```python
import pixeltable as pxt
from pixeltable.functions import openai

# Configure APIpie credentials
openai.configure(
    api_key="your-apipie-api-key",
    base_url="https://apipie.ai/v1"
)

# Create a table for prompts and responses
qa = pxt.create_table(
    'qa_system',
    {'prompt': pxt.String},
    if_exists='replace'
)

# Add a computed column for LLM responses
qa.add_computed_column(
    response=openai.chat_completions(
        model='gpt-4o-mini',  # You can use any model available on APIpie
        messages=[{
            'role': 'user',
            'content': qa.prompt
        }]
    ).choices[0].message.content
)

# Insert a prompt and get a response
qa.insert([
    {'prompt': 'Explain quantum computing in simple terms.'}
])

# View the results
print(qa.select(qa.prompt, qa.response).collect())
```
### Multimodal RAG with Pixeltable and APIpie
```python
import pixeltable as pxt
from pixeltable.functions import openai, huggingface
from pixeltable.iterators import DocumentSplitter

# Configure APIpie credentials
openai.configure(
    api_key="your-apipie-api-key",
    base_url="https://apipie.ai/v1"
)

# Create a directory for organization
pxt.create_dir("rag_system", if_exists="replace")

# Create a document table and add some PDFs
docs = pxt.create_table(
    'rag_system.documents',
    {'doc': pxt.Document},
    if_exists='replace'
)
docs.insert([
    {'doc': 'path/to/your/document.pdf'}
])

# Create a chunks view with sentence-based splitting
chunks = pxt.create_view(
    'rag_system.chunks',
    docs,
    iterator=DocumentSplitter.create(
        document=docs.doc,
        separators='sentence'
    )
)

# Add an embedding index for similarity search
chunks.add_embedding_index(
    'text',
    string_embed=huggingface.sentence_transformer.using(
        model_id='all-MiniLM-L6-v2'
    )
)

# Define a query function for retrieval
@pxt.query
def get_relevant_context(query_text: str, limit: int = 3):
    sim = chunks.text.similarity(query_text)
    return chunks.order_by(sim, asc=False).limit(limit).select(chunks.text)

# Create a QA table that integrates context retrieval with APIpie
qa = pxt.create_table(
    'rag_system.qa',
    {'question': pxt.String},
    if_exists='replace'
)

# Add context retrieval as a computed column
qa.add_computed_column(
    context=get_relevant_context(qa.question)
)
# Construct the prompt with the retrieved context.
# A plain f-string would be evaluated once at definition time (against the
# column expressions themselves), so build the prompt per row with a UDF.
@pxt.udf
def build_prompt(context: list[dict], question: str) -> str:
    passages = '\n\n'.join(item['text'] for item in context)
    return (
        'Based on the following information, please answer the question.\n\n'
        f'CONTEXT:\n{passages}\n\n'
        f'QUESTION: {question}\n\n'
        'ANSWER:'
    )

qa.add_computed_column(
    formatted_prompt=build_prompt(qa.context, qa.question)
)
# Generate the response with APIpie
qa.add_computed_column(
    answer=openai.chat_completions(
        model='gpt-4o',
        messages=[{
            'role': 'user',
            'content': qa.formatted_prompt
        }]
    ).choices[0].message.content
)

# Ask a question
qa.insert([
    {'question': 'What are the key concepts discussed in the document?'}
])

# View the result
print(qa.select(qa.question, qa.answer).collect())
```
### Image Classification with Pixeltable and APIpie Vision
```python
import pixeltable as pxt
from pixeltable.functions import openai

# Configure APIpie credentials
openai.configure(
    api_key="your-apipie-api-key",
    base_url="https://apipie.ai/v1"
)

# Create an image table
images = pxt.create_table(
    'image_analysis',
    {'image': pxt.Image},
    if_exists='replace'
)

# Insert some sample images
images.insert([
    {'image': 'https://upload.wikimedia.org/wikipedia/commons/thumb/6/68/Orange_tabby_cat_sitting_on_fallen_leaves-Hisashi-01A.jpg/1920px-Orange_tabby_cat_sitting_on_fallen_leaves-Hisashi-01A.jpg'},
    {'image': 'https://upload.wikimedia.org/wikipedia/commons/d/d5/Retriever_in_water.jpg'}
])

# Add a computed column for vision analysis using a vision-capable model on APIpie
images.add_computed_column(
    description=openai.chat_completions(
        model='gpt-4o',  # any vision-capable model available on APIpie
        messages=[
            {
                'role': 'user',
                'content': [
                    {'type': 'text', 'text': 'Describe this image in detail.'},
                    {
                        'type': 'image_url',
                        'image_url': {'url': images.image}
                    }
                ]
            }
        ]
    ).choices[0].message.content
)

# View the results
print(images.select(images.image, images.description).collect())
```
### Agentic Workflows with Tool Calling
```python
import pixeltable as pxt
from pixeltable.functions import openai
import json

# Configure APIpie credentials
openai.configure(
    api_key="your-apipie-api-key",
    base_url="https://apipie.ai/v1"
)

# Define a UDF for weather information (mock function)
@pxt.udf
def get_weather(location: str) -> str:
    """Get current weather for a location."""
    # In a real app, you would call a weather API
    return f"The weather in {location} is sunny and 72°F."

# Define a UDF for search (mock function)
@pxt.udf
def search_web(query: str) -> str:
    """Search the web for information."""
    # In a real app, you would call a search API
    return f"Search results for: {query}..."

# Register the UDFs as tools
tools = pxt.tools(get_weather, search_web)

# Create the agent table
agent = pxt.create_table(
    'agent_system',
    {'user_request': pxt.String},
    if_exists='replace'
)

# Add a column for tool selection by the LLM
agent.add_computed_column(
    tool_choice=openai.chat_completions(
        model='gpt-4o',
        messages=[
            {'role': 'user', 'content': agent.user_request}
        ],
        tools=[t.to_openai_tool() for t in tools],
        tool_choice='auto'
    )
)
# Add a computed column to extract tool call information.
# The chat_completions result is stored as JSON, so index it as a dict here.
@pxt.udf
def extract_tool_call(response: dict) -> dict:
    """Extract tool call information from the LLM response."""
    tool_calls = response['choices'][0]['message'].get('tool_calls')
    if not tool_calls:
        return {"tool": None, "args": None}
    call = tool_calls[0]
    return {
        "tool": call['function']['name'],
        "args": json.loads(call['function']['arguments'])
    }

agent.add_computed_column(
    tool_info=extract_tool_call(agent.tool_choice)
)
# Add a column to invoke the selected tool
@pxt.udf
def invoke_tool(tool_info: dict) -> str:
    """Invoke the selected tool with the provided arguments."""
    if not tool_info or not tool_info["tool"]:
        return "No tool was selected."
    tool_name = tool_info["tool"]
    args = tool_info["args"]
    if tool_name == "get_weather":
        return get_weather(args["location"])
    elif tool_name == "search_web":
        return search_web(args["query"])
    return f"Unknown tool: {tool_name}"

agent.add_computed_column(
    tool_result=invoke_tool(agent.tool_info)
)
# Add a column for the final response.
# Build the assistant message per row with a UDF; an f-string here would be
# evaluated once at definition time rather than against each row's value.
@pxt.udf
def tool_result_message(tool_result: str) -> str:
    return f"I found this information: {tool_result}"

agent.add_computed_column(
    final_response=openai.chat_completions(
        model='gpt-4o-mini',
        messages=[
            {'role': 'user', 'content': agent.user_request},
            {'role': 'assistant', 'content': tool_result_message(agent.tool_result)}
        ]
    ).choices[0].message.content
)

# Test the agent
agent.insert([
    {'user_request': "What's the weather like in Paris?"},
    {'user_request': 'Tell me about quantum computing.'}
])

# View the results
print(agent.select(agent.user_request, agent.tool_info, agent.final_response).collect())
```
## Troubleshooting & FAQ

- **How do I update computed columns when data changes?**
  Pixeltable automatically recomputes values when data or UDFs change. To force a recomputation, use `table.invalidate_columns(['column_name'])`.
- **How do I handle environment variables securely?**
  Store your API keys in environment variables and load them in your code, as shown in the configuration example above. Pixeltable also supports configuration files for managing credentials.
- **Can I use APIpie's routing capabilities with Pixeltable?**
  Yes, APIpie's routing features work seamlessly with Pixeltable. Simply specify the desired model when calling functions (see the sketch after this list).
- **How do I monitor costs and usage?**
  Pixeltable's incremental computation helps minimize API calls, but you should still monitor your APIpie dashboard for usage metrics.
- **Can I cache API calls to reduce costs?**
  Yes, Pixeltable's computed columns implicitly cache results, only recomputing when inputs change.
- **How can I scale to large datasets?**
  Pixeltable is designed to handle large datasets efficiently. For very large datasets, consider using views with iterators for on-demand processing.
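To illustrate the routing point above: switching models is just a matter of changing the model string passed to `chat_completions`; the table and column definitions stay the same. The sketch below reuses the `qa` table from the basic text generation example, and the model name is only a placeholder for whatever model identifier is enabled on your APIpie account:

```python
# Add an alternative response column routed to a different model via APIpie.
# 'claude-3-5-sonnet' is a placeholder; use any model identifier your
# APIpie account has access to.
qa.add_computed_column(
    alt_response=openai.chat_completions(
        model='claude-3-5-sonnet',
        messages=[{'role': 'user', 'content': qa.prompt}]
    ).choices[0].message.content
)
```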
For more information, see the Pixeltable documentation or the GitHub repository.
## Support
If you encounter any issues during the integration process, please reach out on APIpie Discord or Pixeltable Discord for assistance.