Perplexity API Overview: Key Features & Use Cases
For detailed information about using models with APIpie, check out our Models Overview and Completions Guide.
Description
Perplexity AI develops language models focused on real-time information processing and extended context understanding. Its Sonar models build on Meta's Llama architecture, adding specialized training and optimizations for online inference to Llama's language understanding capabilities.
This approach keeps Llama's foundation while adding features aimed at real-world applications:
- Online Inference Optimization: Adapts the base Llama architecture for real-time processing
- Extended Context Understanding: Handles context windows of up to 128K tokens (listed as 127,072 tokens in the model table below)
- Knowledge Integration: Adds capabilities for real-time information access and processing
These models are available through APIpie's routing system, offering state-of-the-art performance for various applications.
Key Features
- Extended Context Processing: Supports a context length of up to 128K tokens (127,072 as listed in the model table below) for comprehensive document analysis and long-form conversations
- Online Inference: Designed for real-time information processing and up-to-date knowledge
- Scalable Architecture: Available in multiple sizes (small, large, huge) to suit different computational needs
- High Performance: Optimized for both speed and accuracy in natural language understanding tasks
- Knowledge Integration: Enhanced with real-time information access capabilities
- Adaptive Learning: Continuously updated to maintain current knowledge and capabilities
Available Models
The model list updates dynamically. See the Models Route for the up-to-date list of models.
Perplexity offers various models optimized for different use cases. The following models represent their Sonar series, which excels at real-time information processing with extended context windows. For detailed information about model capabilities and use cases, visit the Perplexity Blog.
| Model Name | Max Tokens | Response Tokens | Providers | Subtype |
|---|---|---|---|---|
| llama-3.1-sonar-small-128k-online | 127,072 | 127,072 | OpenRouter | Chat |
| llama-3.1-sonar-large-128k-online | 127,072 | 127,072 | OpenRouter | Chat |
| llama-3.1-sonar-huge-128k-online | 127,072 | 127,072 | OpenRouter | Chat |
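As a rough illustration of working with the table above, the sketch below keeps a local snapshot of the three Sonar model IDs and the listed 127,072-token limit. The names `SONAR_MODELS`, `MAX_TOKENS`, `pick_model`, and `clamp_max_tokens` are hypothetical helpers, not part of any APIpie or Perplexity SDK, and because the model list updates dynamically, a hardcoded snapshot like this can go stale.

```python
# Snapshot of the model table above; hypothetical helper names, not an official SDK.
SONAR_MODELS = {
    "small": "llama-3.1-sonar-small-128k-online",
    "large": "llama-3.1-sonar-large-128k-online",
    "huge": "llama-3.1-sonar-huge-128k-online",
}

# Limit listed in the "Max Tokens" and "Response Tokens" columns.
MAX_TOKENS = 127_072


def pick_model(size: str) -> str:
    """Map a size label (small, large, huge) to the model ID from the table."""
    try:
        return SONAR_MODELS[size.lower()]
    except KeyError:
        raise ValueError(
            f"unknown size {size!r}; expected one of {sorted(SONAR_MODELS)}"
        ) from None


def clamp_max_tokens(requested: int) -> int:
    """Keep a requested max_tokens value between 1 and the listed limit."""
    return max(1, min(requested, MAX_TOKENS))
```

A caller might use `pick_model("small")` for latency-sensitive requests and `pick_model("huge")` for more complex tasks.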
Example API Call
Below is an example of how to use the Chat Completions API to interact with a Perplexity model:
curl -L -X POST 'https://apipie.ai/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -H 'Authorization: Bearer <YOUR_API_KEY>' \
  --data-raw '{
    "provider": "openrouter",
    "model": "llama-3.1-sonar-huge-128k-online",
    "max_tokens": 150,
    "messages": [
      {
        "role": "user",
        "content": "What are the latest developments in quantum computing?"
      }
    ]
  }'
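The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: `build_request` and `send` are hypothetical helper names, while the URL, headers, and body fields are copied from the curl example above.

```python
import json
import urllib.request

# Endpoint taken from the curl example.
API_URL = "https://apipie.ai/v1/chat/completions"


def build_request(
    api_key: str,
    prompt: str,
    model: str = "llama-3.1-sonar-huge-128k-online",
    max_tokens: int = 150,
) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "provider": "openrouter",
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body.

    urllib raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_request(api_key, "What are the latest developments in quantum computing?"))` would issue the request; keeping construction and sending separate makes the payload easy to inspect before any network traffic occurs.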
Response Example
The expected response structure for a Perplexity model (values shown are illustrative):
{
  "id": "chatcmpl-12345example12345",
  "object": "chat.completion",
  "created": 1729535643,
  "provider": "openrouter",
  "model": "llama-3.1-sonar-huge-128k-online",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Recent developments in quantum computing include advances in error correction, new qubit architectures, and breakthrough experiments in quantum supremacy. Companies like IBM, Google, and others continue to make progress in both hardware and software aspects of quantum computing technology."
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 125,
    "total_tokens": 140,
    "prompt_characters": 45,
    "response_characters": 520,
    "cost": 0.002250,
    "latency_ms": 3100
  },
  "system_fingerprint": "fp_123abc456def"
}
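A response shaped like the one above could be unpacked roughly as follows. `summarize_response` is a hypothetical helper, the sample dictionary is an abridged copy of the example response, and treating a `finish_reason` of `"length"` as a truncation signal is an assumption borrowed from the OpenAI-style convention (the example here only shows `"stop"`).

```python
def summarize_response(response: dict) -> dict:
    """Pull the reply text and usage figures out of a chat completion response."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    finish_reason = choice.get("finish_reason")
    return {
        "text": choice["message"]["content"],
        "finish_reason": finish_reason,
        # Assumption: "length" means the reply was cut off by max_tokens.
        "truncated": finish_reason == "length",
        "total_tokens": usage.get("total_tokens"),
        "cost": usage.get("cost"),
        "latency_ms": usage.get("latency_ms"),
    }


# Abridged copy of the example response above.
sample = {
    "id": "chatcmpl-12345example12345",
    "model": "llama-3.1-sonar-huge-128k-online",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Recent developments in quantum computing include advances in error correction.",
            },
            "logprobs": None,
            "finish_reason": "stop",
        }
    ],
    "usage": {
        "prompt_tokens": 15,
        "completion_tokens": 125,
        "total_tokens": 140,
        "cost": 0.002250,
        "latency_ms": 3100,
    },
}

summary = summarize_response(sample)
```

Reading the `usage` fields with `.get` keeps the helper tolerant of responses that omit some of them.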
API Highlights
- Provider: Specify a provider or leave blank for automatic selection
- Model: Choose from available Perplexity models based on your needs. See Models Guide
- Max Tokens: Caps the length of the generated response (150 in the example above); the listed models accept up to 127,072 tokens
- Messages: Format your request with user inputs and system instructions. See message formatting
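The parameters above can be assembled along these lines. This is a sketch under two assumptions that the text does not confirm: that leaving the provider "blank" means omitting the `provider` field entirely, and that system instructions are sent as a message with the OpenAI-style `"system"` role. `build_payload` is a hypothetical helper name.

```python
from typing import Optional


def build_payload(
    user_prompt: str,
    system_prompt: Optional[str] = None,
    model: str = "llama-3.1-sonar-huge-128k-online",
    max_tokens: int = 150,
    provider: Optional[str] = None,
) -> dict:
    """Assemble a chat completions request body from the highlighted parameters."""
    messages = []
    if system_prompt:
        # Assumption: system instructions use the "system" role.
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_prompt})

    payload = {"model": model, "max_tokens": max_tokens, "messages": messages}
    if provider:
        # Assumption: omitting the field lets the router pick a provider.
        payload["provider"] = provider
    return payload
```

For example, `build_payload("Summarize today's AI news.", system_prompt="Answer in three bullet points.")` yields a two-message body with no `provider` key.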
Applications and Use Cases
- Real-time Information Processing: Ideal for applications requiring current information and knowledge
- Long-form Content Analysis: Perfect for processing and analyzing lengthy documents, research papers, or conversations
- Knowledge-intensive Tasks: Excellent for research, analysis, and complex problem-solving requiring deep understanding
- Enterprise Solutions: Suitable for business applications requiring processing of large documents and real-time information
- Educational Applications: Valuable for in-depth learning and research assistance
- Research and Development: Powerful tools for scientific research and technological innovation
- Content Generation: Advanced capabilities for creating high-quality, well-researched content
Try these models with LibreChat or OpenWebUI for an interactive experience.
Best Practices
- Utilize the extended context window effectively by providing comprehensive context when needed
- Choose the model size based on your use case: smaller models for faster responses, larger models for more complex tasks
- Implement error handling and retry mechanisms so that transient failures do not interrupt your application
- Follow Perplexity's usage guidelines for best results
- Leverage the real-time information processing capabilities for up-to-date responses
- Structure prompts to take advantage of the models' knowledge integration features
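The error handling and retry recommendation above could look something like the following exponential backoff wrapper. It is a generic sketch rather than anything APIpie prescribes: `call_with_retries` is a hypothetical helper, the delays and attempt count are arbitrary defaults, and the exceptions worth retrying depend on the HTTP client in use, so `retry_on` should be adjusted accordingly.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying with exponential backoff on the given exceptions."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (1s, 2s, 4s, ...) plus up to 10% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

Errors that a retry cannot fix, such as an invalid API key or a malformed request, are better left out of `retry_on` so they fail immediately.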
Resources and Documentation
- Perplexity AI Documentation
- APIpie Models Guide
- Perplexity Research Blog
- Model Performance Benchmarks
Experience the power of Perplexity models in APIpie's various supported integrations.