Microsoft AI Models Guide: Phi Series & WizardLM
For detailed information about using models with APIpie, check out our Models Overview and Completions Guide.
Description
The Microsoft Phi Series is a family of compact language models developed by Microsoft Research, designed to deliver strong performance at small parameter counts through improved scaling techniques and architecture. This page also covers WizardLM-2 and the other models listed in the table below. The models are available through various providers integrated with APIpie's routing system, offering flexibility in deployment options.
Key Features
- Advanced Architecture: Models leverage Microsoft's scaling techniques for optimal performance with smaller parameter counts
- Extended Context Windows: Support for context lengths from 2K to 128K tokens, with the latest Phi-3 series offering enhanced long-context understanding
- Multi-Provider Availability: Accessible through platforms like OpenRouter and Monster API
- Diverse Applications: Optimized for chat, instruction-following, and complex reasoning tasks with state-of-the-art performance
- Resource Efficiency: Industry-leading performance-to-size ratio, making them ideal for cost-effective deployments
- Research-Backed Development: Built on Microsoft's extensive research in efficient ML and model scaling
Model List in the Microsoft Series
The model list updates dynamically; see the Models Route for the up-to-date list of models.
For detailed performance analysis and benchmarks, see the Microsoft Phi-2 Technical Report and Phi-2 Research Updates.
Model Name | Max Tokens | Response Tokens | Providers | Subtype |
resnet-50 | - | - | deepinfra | image-classification |
beit-base-patch16-224-pt22k-ft22k | - | - | deepinfra | image-classification |
wizardlm-2-8x22b | 65536 | 4096 | openrouter | chatx |
WizardLM-2-7B | 32000 | 32000 | deepinfra | chatx |
WizardLM-2-8x22B | 65536 | 65536 | deepinfra | chatx |
wizardlm-2-7b | 32000 | 4096 | openrouter | text-generation |
phi-3-mini-128k-instruct | 128000 | 128000 | openrouter | chatx |
phi-3-medium-128k-instruct | 128000 | 128000 | openrouter | chatx |
phi-3.5-mini-128k-instruct | 128000 | 128000 | openrouter | chatx |
nova-micro-v1 | 128000 | 5120 | bedrock | |
nova-micro-v1 | 128000 | 5120 | openrouter | chat |
WizardLM-2-8x22B | 65536 | 65536 | together | chat |
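The limits in the table can also be treated as data when choosing a model for a given prompt size. The sketch below hard-codes a few rows copied from the table (the live list comes from the Models Route) and filters them. It assumes that "Max Tokens" is the total context window shared by prompt and reply, which the table does not state explicitly. `ModelEntry` and `models_fitting` are hypothetical names used for illustration and are not part of APIpie's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    name: str
    max_tokens: int       # "Max Tokens" column (assumed: prompt + reply)
    response_tokens: int  # "Response Tokens" column
    provider: str


# A few rows copied from the table above; the live list may differ.
MODELS = [
    ModelEntry("wizardlm-2-8x22b", 65536, 4096, "openrouter"),
    ModelEntry("WizardLM-2-7B", 32000, 32000, "deepinfra"),
    ModelEntry("phi-3-mini-128k-instruct", 128000, 128000, "openrouter"),
    ModelEntry("phi-3-medium-128k-instruct", 128000, 128000, "openrouter"),
]


def models_fitting(prompt_tokens, reply_tokens, models=MODELS):
    """Return entries whose limits cover the prompt plus the requested reply."""
    return [
        m for m in models
        if reply_tokens <= m.response_tokens
        and prompt_tokens + reply_tokens <= m.max_tokens
    ]


if __name__ == "__main__":
    for entry in models_fitting(prompt_tokens=40000, reply_tokens=2000):
        print(entry.provider, entry.name)
```

Token counts here are inputs you supply; counting tokens for a specific model requires that model's tokenizer.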
Example API Call
Below is an example of how to use the Chat Completions API to interact with a model from the Microsoft Series, such as phi-3-medium-128k-instruct.
curl -L -X POST 'https://apipie.ai/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json' \
  -H 'Authorization: Bearer <YOUR_API_KEY>' \
  --data-raw '{
    "provider": "openrouter",
    "model": "phi-3-medium-128k-instruct",
    "max_tokens": 150,
    "messages": [
      {
        "role": "user",
        "content": "What are the key principles of machine learning?"
      }
    ]
  }'
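For readers who prefer Python to curl, the same request could be assembled with the standard library alone. This is a minimal sketch, not an official client: `build_request` is a hypothetical helper, and `APIPIE_API_KEY` is an assumed environment variable name. The endpoint, headers, and body fields are taken from the curl call above.

```python
import json
import os
import urllib.request

API_URL = "https://apipie.ai/v1/chat/completions"  # same endpoint as the curl call


def build_request(api_key, prompt, model="phi-3-medium-128k-instruct",
                  provider="openrouter", max_tokens=150):
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "provider": provider,
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # APIPIE_API_KEY is a hypothetical environment variable name.
    key = os.environ.get("APIPIE_API_KEY")
    req = build_request(key or "<YOUR_API_KEY>",
                        "What are the key principles of machine learning?")
    if key:
        # Only contact the API when a real key is configured.
        with urllib.request.urlopen(req, timeout=30) as resp:
            print(json.load(resp)["choices"][0]["message"]["content"])
    else:
        # Without a key, just show the JSON body that would be sent.
        print(req.data.decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect and test without network access.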
Response Example
The expected response structure for the Microsoft model might look like this:
{
  "id": "chatcmpl-12345example12345",
  "object": "chat.completion",
  "created": 1729535643,
  "provider": "openrouter",
  "model": "phi-3-medium-128k-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The key principles of machine learning include:\n\n1. **Data Quality**: High-quality, diverse training data is essential\n\n2. **Feature Selection**: Identifying relevant input variables\n\n3. **Model Selection**: Choosing appropriate algorithms for the task\n\n4. **Training & Validation**: Using separate datasets to ensure generalization\n\n5. **Evaluation**: Measuring performance with appropriate metrics\n\nThese principles form the foundation of effective machine learning systems."
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 98,
    "total_tokens": 110,
    "prompt_characters": 45,
    "response_characters": 420,
    "cost": 0.001850,
    "latency_ms": 2800
  },
  "system_fingerprint": "fp_123abc456def"
}
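A response of this shape can be reduced to the fields most callers need. The sketch below parses a trimmed copy of the sample (message content shortened) and pulls out the reply text, token usage, and cost. `summarize` is a hypothetical helper. Treating a `finish_reason` of `length` as "cut off at max_tokens" is an assumption borrowed from the OpenAI-style convention; the sample above only shows `stop`.

```python
import json

# Trimmed copy of the sample response above (message content shortened).
SAMPLE = """
{
  "model": "phi-3-medium-128k-instruct",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The key principles..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 98,
    "total_tokens": 110,
    "cost": 0.001850,
    "latency_ms": 2800
  }
}
"""


def summarize(response):
    """Extract reply text and usage figures from a chat completion dict."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        # Assumption: "length" signals truncation at max_tokens.
        "truncated": choice.get("finish_reason") == "length",
        "total_tokens": usage.get("total_tokens"),
        "cost": usage.get("cost"),
    }


if __name__ == "__main__":
    print(summarize(json.loads(SAMPLE)))
```

Using `.get()` for the usage fields keeps the helper tolerant of providers that omit some of them.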
API Highlights
- Provider: Specify the provider or leave blank for automatic selection.
- Model: Use any model from the Microsoft Series, such as phi-3-medium-128k-instruct or others suited to your task. See Models Guide.
- Max Tokens: Set the maximum response token count (e.g., 150 in this example).
- Messages: Format your request with a sequence of messages, including user input and system instructions. See message formatting.
This example demonstrates how to seamlessly query models from the Microsoft Series for conversational or instructional tasks.
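The highlights above can be folded into one small payload builder. This is a sketch under two assumptions that the page does not confirm: that "leaving the provider blank" can be done by omitting the `provider` field, and that system instructions are sent as a message with the role `system`. `build_payload` is a hypothetical helper name.

```python
def build_payload(user_text, model, system_text=None, provider=None, max_tokens=150):
    """Assemble a chat completions body from the fields described above."""
    messages = []
    if system_text:
        # Assumption: system instructions use the "system" role.
        messages.append({"role": "system", "content": system_text})
    messages.append({"role": "user", "content": user_text})

    payload = {"model": model, "max_tokens": max_tokens, "messages": messages}
    if provider:
        # Assumption: omitting the field lets the router pick a provider.
        payload["provider"] = provider
    return payload


if __name__ == "__main__":
    print(build_payload(
        "Summarize the attached report in three bullet points.",
        model="phi-3-medium-128k-instruct",
        system_text="You are a concise technical assistant.",
    ))
```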
Applications and Integrations
- Efficient AI Solutions: Ideal for applications requiring high performance with resource constraints, leveraging Microsoft's optimized architecture
- Research and Education: Well suited to academic and research applications, with comprehensive documentation and research papers
- Extended Context Processing: Latest models support up to 128K tokens for advanced document analysis and long-form content understanding
- Enterprise Applications: Production-ready for business applications with Microsoft's enterprise-grade reliability
- Development and Testing: Excellent for rapid prototyping and development with fast inference times and consistent outputs
- Natural Language Processing: State-of-the-art performance in text understanding, generation, and analysis tasks
Ethical Considerations
Microsoft's AI models should be used responsibly with appropriate safeguards and consideration of potential biases. For guidance on responsible AI usage, see Microsoft's Responsible AI Standards.
Licensing
The Microsoft Phi Series models are available under specific terms and conditions. For detailed licensing information, consult the Microsoft Research Open Source documentation.
Try out the Microsoft models in APIpie's various supported integrations.