Internet Search Grounding: Web Data in AI Responses

Enable any model to answer with real-time, verifiable information using Inline Search — our built-in web search and grounding system that works seamlessly with OpenAI-compatible APIs. With a single request, you can inject live context from the internet into your prompt, no browser tools, agents, or plugins required.
What is Search Grounding?
Inline Search enhances any model by integrating real-time search results and scraped content directly into the prompt — without requiring the model to have built-in browsing capabilities.
When enabled, the system:
- Performs a configurable number of live searches using top-ranked search engines
- Fetches and ranks up to 100 results per search
- Scrapes the top results, cleans the text, removes noise
- Appends that content to your prompt before model invocation
This allows even legacy models to answer with fresh knowledge, and it works automatically with any model or app built on the OpenAI chat format.
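Conceptually, you could approximate the same flow yourself with the public /v1/search and /v1/scrape endpoints described later on this page. The Python sketch below is purely illustrative of that pipeline; the response field names (url, textContent) and the 5,000-character cap are assumptions for the example, not the actual internals of Inline Search.

```python
# Illustrative only: a hand-rolled version of the grounding pipeline using
# the public /v1/search and /v1/scrape endpoints documented further below.
# Response field names here are assumptions for the sake of the example.
import requests

API = "https://apipie.ai/v1"
HEADERS = {"Authorization": "Bearer <API_KEY>", "Content-Type": "application/json"}

def ground_prompt(question: str, use: int = 3) -> str:
    # 1. Run a live search for the user's question
    results = requests.post(f"{API}/search", headers=HEADERS,
                            json={"query": question, "results": 10}).json()

    # 2. Scrape and clean the top-ranked pages
    snippets = []
    for result in results[:use]:  # assumed shape: [{"url": ..., "title": ...}, ...]
        page = requests.post(f"{API}/scrape", headers=HEADERS,
                             json={"url": result["url"], "format": "parsed"}).json()
        snippets.append(page.get("textContent", "")[:5000])

    # 3. Append the scraped content to the prompt before calling the model
    return question + "\n\nWeb context:\n" + "\n\n".join(snippets)
```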
Why Use Inline Search?
- Fresh data in any model – including GPT-3.5, Claude, LLaMA, Mixtral, and more
- No special setup – works out of the box in standard OpenAI API format
- Customize search volume and scope
- Simple to configure via request body
⚠️ Note: Because this uses live search and scraping, expect an added delay of roughly 1–6 seconds per request, depending on the complexity of the pages being pulled; JavaScript must be rendered before content can be extracted.
Quick Start (OpenAI Format Example)
Use web_search_options in your request to instantly enable Inline Search.
curl -X POST 'https://apipie.ai/v1/chat/completions' \
-H 'Authorization: Bearer <API_KEY>' \
-H 'Content-Type: application/json' \
-d '{
  "user": "qa_test",
  "stream": true,
  "model": "openai/gpt-4o",
  "web_search_options": {
    "search_context_size": "medium"
  },
  "max_tokens": 500,
  "messages": [
    {
      "role": "user",
      "content": "Who was the best and worst president of the united states, quantifiable and verifiable?"
    }
  ]
}'
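The same quick-start request in Python, as a minimal sketch using the requests library (the APIPIE_API_KEY environment variable is just a placeholder, streaming is turned off to keep the example short, and the choices[0].message.content path assumes the standard OpenAI-compatible response shape):

```python
import os
import requests

resp = requests.post(
    "https://apipie.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['APIPIE_API_KEY']}"},
    json={
        "user": "qa_test",
        "model": "openai/gpt-4o",
        "web_search_options": {"search_context_size": "medium"},
        "max_tokens": 500,
        "messages": [
            {"role": "user",
             "content": "Who was the best and worst president of the united states, quantifiable and verifiable?"},
        ],
    },
)
# Standard OpenAI-compatible response shape
print(resp.json()["choices"][0]["message"]["content"])
```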
Options for web_search_options.search_context_size
| Value | Description |
|---|---|
| low | Small content injection (~1 result, up to 10K characters) |
| medium | Moderate (~3 results, ~15K characters) |
| high | Full web grounding (~5 results, ~35K characters) |
Alternate Inline Search via Payload (Advanced)
You can also enable search with deeper control using the online flag and extended parameters:
| Field | Description |
|---|---|
| online | Enable inline search grounding (true or false) |
| searches | Number of search queries to perform (default: 1) |
| pull | Max results to pull from each search (default: 20) |
| use | How many of those results to append to the prompt (default: 3) |
| scrape_length | Max number of characters to include from each scrape |
| search_lang | Language code for results (e.g. "en") |
| search_geo | Geolocation code (e.g. "US") |
{
  "user": "user123",
  "model": "claude-3-5-sonnet",
  "online": true,
  "searches": 2,
  "pull": 10,
  "use": 4,
  "scrape_length": 12000,
  "search_lang": "en",
  "search_geo": "US",
  "messages": [
    { "role": "user", "content": "Summarize the most recent news about generative AI in healthcare." }
  ]
}
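If you call the OpenAI-compatible endpoint through the official OpenAI Python SDK, the Inline Search fields are not part of the SDK's typed parameters, so you can pass them via extra_body. This is a sketch assuming the endpoint accepts the payload exactly as shown above:

```python
from openai import OpenAI

client = OpenAI(base_url="https://apipie.ai/v1", api_key="<API_KEY>")

completion = client.chat.completions.create(
    model="claude-3-5-sonnet",
    messages=[{"role": "user",
               "content": "Summarize the most recent news about generative AI in healthcare."}],
    # Non-standard Inline Search fields ride along in the request body.
    extra_body={
        "online": True,
        "searches": 2,
        "pull": 10,
        "use": 4,
        "scrape_length": 12000,
        "search_lang": "en",
        "search_geo": "US",
    },
)
print(completion.choices[0].message.content)
```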
Using Inline CLI (Prompt Commands)
You can also enable search directly in the prompt with these Inline CLI commands:
| Command | Behavior |
|---|---|
| :search | Fast search (1 query, 1 result) |
| :searchmore | More results (2 queries, 3 results) |
| :deepsearch | Broad + deep search (3 queries, 5 results) |
| :setsearchlang:en | Set language to English |
| :setsearchgeo:US | Set geolocation to United States |
Example prompt:
{ "content": "Summarize today’s major AI headlines :deepsearch :setsearchlang:en :setsearchgeo:US" }
Direct Use of the Search & Scrape APIs
For users needing lower-level control or standalone search capabilities, we also offer public API endpoints.
POST /v1/search
{
  "query": "latest AI developments",
  "search_provider": "google",
  "search": "google",
  "geo": "us",
  "lang": "en",
  "results": 10,
  "safeSearch": -1,
  "user": "user123"
}
Returns a list of ranked search results with URLs, titles, and descriptions.
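Calling the endpoint from Python is a plain POST; the loop below assumes the response is a JSON list whose items carry url, title, and description fields, per the description above (a sketch, not a guaranteed schema):

```python
import requests

resp = requests.post(
    "https://apipie.ai/v1/search",
    headers={"Authorization": "Bearer <API_KEY>"},
    json={"query": "latest AI developments", "results": 10, "lang": "en", "geo": "us"},
)
for item in resp.json():  # assumed: [{"url": ..., "title": ..., "description": ...}, ...]
    print(item["title"], "->", item["url"])
```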
POST /v1/scrape
{
  "url": "https://example.com/article",
  "format": "parsed"
}
Returns parsed content:
- title
- textContent
- excerpt
- hrefs[]
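A corresponding Python call (a sketch assuming the parsed fields listed above come back as top-level keys in the JSON response):

```python
import requests

resp = requests.post(
    "https://apipie.ai/v1/scrape",
    headers={"Authorization": "Bearer <API_KEY>"},
    json={"url": "https://example.com/article", "format": "parsed"},
)
page = resp.json()
print(page["title"])
print(page["excerpt"])
print(len(page["textContent"]), "characters,", len(page["hrefs"]), "links")
```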
Developer Tips
- Works with all models, including models without native browsing like GPT-3.5 or Mistral
- Use web_search_options for OpenAI-style apps
- Use online, pull, use, and scrape_length for full control
- Combine with memory, integrity, and shaping for robust agents
- Monitor latency: search grounding can add 1–6s per request (see the timing sketch below)
Example: High-Precision Search
curl -X POST 'https://apipie.ai/v1/chat/completions' \
-H 'Authorization: Bearer <API_KEY>' \
-H 'Content-Type: application/json' \
--data-raw '{
  "user": "research_bot",
  "model": "mixtral",
  "online": true,
  "searches": 3,
  "pull": 15,
  "use": 5,
  "scrape_length": 15000,
  "search_lang": "en",
  "search_geo": "US",
  "messages": [
    { "role": "user", "content": "What are the leading companies developing open-source LLMs in 2025?" }
  ]
}'
Conclusion
Inline Search brings live, trusted information to any model or application using the standard OpenAI request format or advanced options. Whether you're building intelligent agents, research assistants, or chatbots — this feature ensures your responses are grounded in the real world.
Try it out with web_search_options, online: true, or :deepsearch in your next request.