
Internet Search Grounding: Web Data in AI Responses


Enable any model to answer with real-time, verifiable information using Inline Search, our built-in web search and grounding system that works with OpenAI-compatible APIs. A single request injects live context from the internet into your prompt, with no browser tools, agents, or plugins required.

What is Search Grounding?

Inline Search enhances any model by integrating real-time search results and scraped content directly into the prompt, so the model itself does not need built-in browsing capabilities.

When enabled, the system:

  1. Performs one or more live searches (the number is configurable) using top-ranked search engines
  2. Fetches and ranks up to 100 results per search
  3. Scrapes the top results, cleans the text, and removes noise
  4. Appends that content to your prompt before model invocation

This allows even legacy models to answer with fresh knowledge, and it works automatically with any model or app built on the OpenAI chat format.
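The last two steps (select the top results, then append them to the prompt) can be sketched locally. This is only an illustration of the idea, not the service's actual implementation: the `Page` shape, the `ground_prompt` helper, and the "Web context" layout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """One scraped search result (hypothetical shape, for illustration only)."""
    url: str
    rank: int   # 1 = top-ranked result
    text: str   # cleaned page text


def ground_prompt(prompt: str, pages: list[Page], use: int = 3,
                  scrape_length: int = 10_000) -> str:
    """Append the top `use` pages, each cut to `scrape_length` characters."""
    top = sorted(pages, key=lambda p: p.rank)[:use]
    if not top:
        return prompt
    blocks = [
        f"[{i}] {p.url}\n{p.text[:scrape_length]}"
        for i, p in enumerate(top, start=1)
    ]
    return prompt + "\n\nWeb context:\n" + "\n\n".join(blocks)


pages = [
    Page("https://example.com/b", rank=2, text="Second result text."),
    Page("https://example.com/a", rank=1, text="First result text."),
    Page("https://example.com/c", rank=3, text="Third result text."),
]
grounded = ground_prompt("What changed this week?", pages, use=2, scrape_length=12)
print(grounded)
```

The model then receives the grounded prompt as ordinary text, which is why no browsing capability is needed on the model side.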


  • Fresh data in any model – including GPT-3.5, Claude, LLaMA, Mixtral, and more
  • No special setup – works out of the box in standard OpenAI API format
  • Customize search volume and scope
  • Simple to configure via request body

⚠️ Note: Because this uses live search and scraping, expect 1–6 seconds of added latency per request, depending on the complexity of the pages being scraped. JavaScript has to be rendered before page content can be extracted.
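To see how much latency grounding adds in your own setup, you can time the same call with and without search enabled. The `timed` helper below is a hypothetical, generic wrapper; the lambda is a stand-in for whatever client call you actually make.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def timed(call: Callable[[], T]) -> tuple[T, float]:
    """Run `call` and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = call()
    return result, time.perf_counter() - start


# Stand-in for a real request; replace the lambda with your own client call.
result, elapsed = timed(lambda: "ok")
print(f"request took {elapsed:.3f}s")
```

If your HTTP client enforces a timeout, leave enough headroom for the extra seconds that search and scraping can add.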


Quick Start (OpenAI Format Example)

Use web_search_options in your request to instantly enable Inline Search.

curl -X POST 'https://apipie.ai/v1/chat/completions' \
  -H 'Authorization: Bearer <API_KEY>' \
  -H 'Content-Type: application/json' \
  -d '{
    "user": "qa_test",
    "stream": true,
    "model": "openai/gpt-4o",
    "web_search_options": {
      "search_context_size": "medium"
    },
    "max_tokens": 500,
    "messages": [
      {
        "role": "user",
        "content": "Who was the best and worst president of the united states, quantifiable and verifiable?"
      }
    ]
  }'
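The same request can be made from Python with only the standard library. This is a minimal sketch: `build_request` and `send` are hypothetical helper names, and `stream` is set to false here (unlike the curl example) on the assumption that a non-streaming reply comes back as a single JSON body.

```python
import json
import urllib.request

API_URL = "https://apipie.ai/v1/chat/completions"  # endpoint from the curl example


def build_request(api_key: str, question: str,
                  context_size: str = "medium") -> urllib.request.Request:
    """Build (but do not send) a chat request with Inline Search enabled."""
    payload = {
        "model": "openai/gpt-4o",
        "stream": False,  # assumption: non-streaming replies are one JSON body
        "web_search_options": {"search_context_size": context_size},
        "max_tokens": 500,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> dict:
    """Send the request; needs a real API key and network access."""
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


req = build_request("<API_KEY>", "What were today's top technology headlines?")
print(req.get_method(), req.full_url)
```

Call `send(req)` once a real key is in place; the generous timeout leaves room for the search and scrape step.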

Options for web_search_options.search_context_size

Value    Description
low      Small content injection (~1 result, up to 10K characters)
medium   Moderate (~3 results, ~15K characters)
high     Full web grounding (~5 results, ~35K characters)
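If your application chooses the context size programmatically, a small lookup keeps the choice explicit. The figures below are the approximate ones from the table; `pick_size` and `web_search_options` are hypothetical helpers, and the "largest size that fits the budget" rule is just one possible policy.

```python
# Approximate figures from the table; they are not exact limits.
CONTEXT_SIZES = {
    "low": {"results": 1, "max_chars": 10_000},
    "medium": {"results": 3, "max_chars": 15_000},
    "high": {"results": 5, "max_chars": 35_000},
}


def web_search_options(size: str = "medium") -> dict:
    """Return the web_search_options object for a valid size."""
    if size not in CONTEXT_SIZES:
        raise ValueError(f"unknown search_context_size: {size!r}")
    return {"search_context_size": size}


def pick_size(char_budget: int) -> str:
    """Pick the largest size whose approximate injection fits the budget.

    Falls back to "low" when even the smallest size exceeds the budget.
    """
    chosen = "low"
    for name in ("low", "medium", "high"):
        if CONTEXT_SIZES[name]["max_chars"] <= char_budget:
            chosen = name
    return chosen


print(pick_size(20_000), web_search_options(pick_size(20_000)))
```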

Alternate Inline Search via Payload (Advanced)

You can also enable search with deeper control using the online flag and extended parameters:

Field          Description
online         Enable inline search grounding (true or false)
searches       Number of search queries to perform (default: 1)
pull           Max results to pull from each search (default: 20)
use            How many of those results to append to the prompt (default: 3)
scrape_length  Max number of characters to include from each scrape
search_lang    Language code for results (e.g. "en")
search_geo     Geolocation code (e.g. "US")

{
  "user": "user123",
  "model": "claude-3-5-sonnet",
  "online": true,
  "searches": 2,
  "pull": 10,
  "use": 4,
  "scrape_length": 12000,
  "search_lang": "en",
  "search_geo": "US",
  "messages": [
    { "role": "user", "content": "Summarize the most recent news about generative AI in healthcare." }
  ]
}
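A small builder can keep these fields consistent before the request is sent. This is a sketch only: `online_payload` is a hypothetical helper, the defaults mirror the ones listed for `searches`, `pull`, and `use`, and the `use <= pull` check is an assumption about sensible values rather than documented API behaviour.

```python
from typing import Optional


def online_payload(model: str, prompt: str, *, searches: int = 1,
                   pull: int = 20, use: int = 3,
                   scrape_length: Optional[int] = None,
                   search_lang: Optional[str] = None,
                   search_geo: Optional[str] = None) -> dict:
    """Build a chat request body with the `online` search fields set."""
    if searches < 1 or pull < 1 or use < 1:
        raise ValueError("searches, pull and use must be positive")
    if use > pull:
        # Assumption: you cannot append more results than were pulled.
        raise ValueError("use cannot exceed pull")
    payload = {
        "model": model,
        "online": True,
        "searches": searches,
        "pull": pull,
        "use": use,
        "messages": [{"role": "user", "content": prompt}],
    }
    optional = {
        "scrape_length": scrape_length,
        "search_lang": search_lang,
        "search_geo": search_geo,
    }
    # Leave unset options out of the body so server-side defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


body = online_payload(
    "claude-3-5-sonnet",
    "Summarize the most recent news about generative AI in healthcare.",
    searches=2, pull=10, use=4, scrape_length=12000,
)
print(body)
```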

Using Inline CLI (Prompt Commands)

You can also enable search directly in the prompt with these Inline CLI commands:

Command            Behavior
:search            Fast search (1 query, 1 result)
:searchmore        More results (2 queries, 3 results)
:deepsearch        Broad + deep search (3 queries, 5 results)
:setsearchlang:en  Set language to English
:setsearchgeo:US   Set geo location to United States

Example prompt:

{ "content": "Summarize today’s major AI headlines :deepsearch :setsearchlang:en :setsearchgeo:US" }
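When prompts are assembled in code, the commands can be appended by a helper instead of by hand. In this sketch `with_search` is a hypothetical function. It assumes the commands go at the end of the prompt, separated by spaces as in the example, and that other language or geo codes follow the same `:setsearchlang:<code>` and `:setsearchgeo:<code>` pattern.

```python
from typing import Optional

# Depth names ("fast", "more", "deep") are local labels, not part of the API.
SEARCH_COMMANDS = {
    "fast": ":search",
    "more": ":searchmore",
    "deep": ":deepsearch",
}


def with_search(prompt: str, depth: str = "fast",
                lang: Optional[str] = None, geo: Optional[str] = None) -> str:
    """Append Inline CLI search commands to a prompt."""
    if depth not in SEARCH_COMMANDS:
        raise ValueError(f"unknown depth: {depth!r}")
    parts = [prompt.rstrip(), SEARCH_COMMANDS[depth]]
    if lang:
        parts.append(f":setsearchlang:{lang}")
    if geo:
        parts.append(f":setsearchgeo:{geo}")
    return " ".join(parts)


message = {
    "role": "user",
    "content": with_search("Summarize today's major AI headlines",
                           depth="deep", lang="en", geo="US"),
}
print(message["content"])
```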

Direct Use of the Search & Scrape APIs

For users needing lower-level control or standalone search capabilities, we also offer public API endpoints.

POST /v1/search

{
  "query": "latest AI developments",
  "search_provider": "google",
  "search": "google",
  "geo": "us",
  "lang": "en",
  "results": 10,
  "safeSearch": -1,
  "user": "user123"
}

Returns a list of ranked search results with URLs, titles, and descriptions.
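A request to this endpoint might be built from Python as follows. Two things here are assumptions not confirmed by this page: that `/v1/search` lives on the same host as the chat completions examples, and that it accepts the same Bearer token. `build_search_request` is a hypothetical helper and only sends a subset of the fields shown in the example body.

```python
import json
import urllib.request

BASE_URL = "https://apipie.ai"  # assumption: same host as chat completions


def build_search_request(api_key: str, query: str, results: int = 10,
                         geo: str = "us", lang: str = "en") -> urllib.request.Request:
    """Build (but do not send) a POST /v1/search request."""
    body = {
        "query": query,
        "search_provider": "google",
        "geo": geo,
        "lang": lang,
        "results": results,
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: same auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_search_request("<API_KEY>", "latest AI developments")
print(req.full_url)
print(json.loads(req.data))
```

Pass the request to `urllib.request.urlopen` to send it once a real key is in place.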


POST /v1/scrape

{
  "url": "https://example.com/article",
  "format": "parsed"
}

Returns parsed content:

  • title
  • textContent
  • excerpt
  • hrefs[]
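Once a scrape comes back, the parsed fields can be reduced to whatever your application needs. The sketch below assumes the response is a JSON object with `title`, `textContent`, `excerpt`, and `hrefs` as top-level keys; the exact envelope is not shown on this page, and `summarize_scrape` and the sample data are hypothetical.

```python
def summarize_scrape(parsed: dict, max_chars: int = 500) -> dict:
    """Reduce a parsed scrape result to a title, preview, size, and link list."""
    text = parsed.get("textContent") or ""
    return {
        "title": parsed.get("title") or "(untitled)",
        # Prefer the excerpt; otherwise fall back to the start of the text.
        "preview": parsed.get("excerpt") or text[:max_chars],
        "chars": len(text),
        "links": sorted(set(parsed.get("hrefs") or [])),
    }


# Made-up sample data standing in for a real /v1/scrape response.
sample = {
    "title": "Example article",
    "textContent": "Body of the article goes here.",
    "excerpt": "",
    "hrefs": ["https://example.com/b", "https://example.com/a", "https://example.com/b"],
}
summary = summarize_scrape(sample, max_chars=4)
print(summary)
```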

Developer Tips

  • Works with all models, including those without built-in browsing, such as GPT-3.5 or Mistral
  • Use web_search_options for OpenAI-style apps
  • Use online, pull, use, scrape_length for full control
  • Combine with memory, integrity, and shaping for robust agents
  • Monitor latency: search grounding can add 1–6s per request

curl -X POST 'https://apipie.ai/v1/chat/completions' \
  -H 'Authorization: Bearer <API_KEY>' \
  -H 'Content-Type: application/json' \
  --data-raw '{
    "user": "research_bot",
    "model": "mixtral",
    "online": true,
    "searches": 3,
    "pull": 15,
    "use": 5,
    "scrape_length": 15000,
    "search_lang": "en",
    "search_geo": "US",
    "messages": [
      { "role": "user", "content": "What are the leading companies developing open-source LLMs in 2025?" }
    ]
  }'

Conclusion

Inline Search brings live, trusted information to any model or application using the standard OpenAI request format or advanced options. Whether you're building intelligent agents, research assistants, or chatbots, this feature helps keep your responses grounded in the real world.

Try it out with web_search_options, online: true, or :deepsearch in your next request.

