Ollama

Chat with Ollama AI models

POST /v2/ollama/api/chat

Available Models: gpt-oss:20b, gpt-oss:120b, llama3.2:3b, deepseek-r1:8b

Required Parameters

| Parameter | Type   | Description                                    |
| --------- | ------ | ---------------------------------------------- |
| model     | string | Model ID (see available models above)          |
| messages  | array  | Array of message objects with role and content |
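A minimal request needs only these two fields. The sketch below assumes a JSON POST with the x-api-key header documented further down; the base URL is a placeholder.

```python
# Minimal sketch of a chat request, assuming a JSON POST to /v2/ollama/api/chat.
# The base URL and API key are placeholders.
import requests

resp = requests.post(
    "https://api.example.com/v2/ollama/api/chat",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={
        "model": "gpt-oss:120b",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
print(resp.json())
```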

Optional Parameters

| Parameter | Type        | Default | Description                                                                                                              |
| --------- | ----------- | ------- | ------------------------------------------------------------------------------------------------------------------------ |
| think     | string/bool | -       | Reasoning mode: low/medium/high (gpt-oss), true/false (deepseek-r1), not supported (llama3.2). Omit to use model default. |
| mode      | string      | auto    | Routing: auto (intelligent), opengpu (blockchain), direct (low-latency)                                                   |
| stream    | boolean     | false   | Enable streaming (not yet supported)                                                                                      |
| options   | object      | -       | Generation options (see below)                                                                                            |
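For instance, a request body exercising these parameters might look like the sketch below; the values are illustrative, and the accepted enums come from the table above.

```python
# Sketch of a request body using the optional parameters above.
body = {
    "model": "gpt-oss:120b",
    "messages": [{"role": "user", "content": "Outline a 3-step experiment."}],
    "think": "high",    # gpt-oss models accept "low" / "medium" / "high"
    "mode": "direct",   # low-latency routing; "auto" is the default
    "stream": False,    # streaming is not yet supported
}
```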

Options Object (optional)

| Parameter      | Type    | Description                            |
| -------------- | ------- | -------------------------------------- |
| temperature    | float   | Sampling temperature (0.0-2.0)         |
| top_k          | integer | Top-k sampling (>= 1)                  |
| top_p          | float   | Nucleus sampling (0.0-1.0)             |
| num_ctx        | integer | Context window size (128-131072)       |
| num_predict    | integer | Max tokens to generate (-1 = infinite) |
| seed           | integer | Random seed for reproducibility        |
| stop           | array   | Stop sequences                         |
| repeat_penalty | float   | Repetition penalty (>= 0)              |
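The options object nests inside the request body. A sketch, with arbitrary values kept within the documented ranges:

```python
# Sketch of a request body with generation options; values are illustrative
# and stay within the ranges documented in the table above.
body = {
    "model": "llama3.2:3b",
    "messages": [{"role": "user", "content": "Summarize this in one line."}],
    "options": {
        "temperature": 0.7,     # 0.0-2.0
        "top_k": 40,            # >= 1
        "top_p": 0.9,           # 0.0-1.0
        "num_ctx": 4096,        # 128-131072
        "num_predict": 256,     # -1 = infinite
        "seed": 42,             # fixed seed for reproducibility
        "stop": ["\n\n"],
        "repeat_penalty": 1.1,  # >= 0
    },
}
```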

Async Mode

Set async: true to get a task_address immediately, then poll /v2/tasks/{task_address} for the result.
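A polling loop might look like the sketch below. The task_address field and the /v2/tasks/{task_address} path are documented on this page; the shape of the poll response (the "status" key and "pending" value) is an assumption, as are the base URL and API key.

```python
# Sketch of async mode: submit a task, then poll until it completes.
# The "status"/"pending" fields in the poll response are assumptions.
import time
import requests

BASE = "https://api.example.com"  # placeholder base URL
HEADERS = {"x-api-key": "YOUR_API_KEY"}

task = requests.post(
    f"{BASE}/v2/ollama/api/chat",
    headers=HEADERS,
    json={
        "model": "deepseek-r1:8b",
        "messages": [{"role": "user", "content": "Why is the sky blue?"}],
        "think": True,   # deepseek-r1 takes a boolean reasoning flag
        "async": True,   # return a task_address instead of blocking
    },
).json()

result = requests.get(f"{BASE}/v2/tasks/{task['task_address']}", headers=HEADERS).json()
while result.get("status") == "pending":
    time.sleep(1)
    result = requests.get(f"{BASE}/v2/tasks/{task['task_address']}", headers=HEADERS).json()
print(result)
```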

Header parameters

x-api-key (string or null, optional)
Body

Ollama-compatible chat completion request. Supports all standard Ollama parameters with explicit validation.

model (string, required)
Model identifier. Available: gpt-oss:20b, gpt-oss:120b, llama3.2:3b, deepseek-r1:8b. Example: gpt-oss:120b

messages (array, required)
Array of message objects with role and content.

stream (boolean or null, optional, default: false)
Enable streaming responses (not yet supported).

options (object or null, optional)
Model generation options (temperature, top_k, etc.).

think (boolean, string enum, or null, optional)
Reasoning mode. gpt-oss: 'low'/'medium'/'high'; deepseek-r1: true/false; llama3.2: not supported. Omit to use the model default.

mode (string enum or null, optional, default: auto)
Routing mode: 'auto' (intelligent), 'opengpu' (blockchain), 'direct' (low-latency).

async (boolean or null, optional, default: false)
Async mode: returns a task_address immediately; poll /v2/tasks/{task_address} for the result. Default is sync mode.
Responses

200: Successful Response (application/json)

List available AI models

GET /v2/ollama/api/models

List all available AI models.

Returns all active Ollama models. Access control is enforced at request time via tier model_restrictions; this endpoint shows what's available.

Responses

200: Successful Response (application/json)
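A quick way to see the active model list is a plain GET; the sketch assumes the same x-api-key header as the chat endpoint, with a placeholder base URL. Note that this lists all active models regardless of tier, since access control is enforced at request time.

```python
# Sketch of listing models; assumes the same x-api-key header as the chat
# endpoint, with a placeholder base URL.
import requests

models = requests.get(
    "https://api.example.com/v2/ollama/api/models",
    headers={"x-api-key": "YOUR_API_KEY"},
).json()
print(models)
```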