OpenAI-compatible chat completions endpoint. Accepts the standard OpenAI request schema and returns the standard OpenAI response schema. Supports both streaming (SSE) and non-streaming modes. Works as a drop-in replacement with the OpenAI Python and TypeScript SDKs.
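A minimal non-streaming request body can be sketched as follows. The base URL and model ID are placeholders (assumptions, not values from this document); the field names are the standard OpenAI chat completions schema the endpoint accepts.

```python
import json

# Hypothetical base URL -- substitute your deployment's actual endpoint.
BASE_URL = "https://api.example.com/v1"

# Standard OpenAI chat completions request body.
payload = {
    "model": "example-model",  # placeholder model ID
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.7,
    "stream": False,  # set True to receive SSE chunks instead
}

# With the OpenAI Python SDK, this body maps directly onto
# client.chat.completions.create(**payload) after pointing the client
# at the endpoint: OpenAI(base_url=BASE_URL, api_key="...").
print(json.dumps(payload, indent=2))
```

Because the endpoint mirrors the OpenAI schemas, no field renaming is needed when switching an existing SDK integration over; only the client's `base_url` changes.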
ID of the model to use for the completion.
A list of messages comprising the conversation so far.
Sampling temperature, between 0 and 2; higher values make output more random, lower values more deterministic.
Nucleus sampling parameter; only tokens within the top `top_p` probability mass are considered. Alter this or `temperature`, but generally not both.
Number of completion choices to generate for each input.
Whether to stream partial results as server-sent events (SSE).
Maximum number of tokens to generate.
Maximum number of completion tokens; takes precedence over `max_tokens` when both are set.
Presence penalty, between -2.0 and 2.0; positive values penalize tokens that have already appeared, encouraging new topics.
Frequency penalty, between -2.0 and 2.0; positive values penalize tokens in proportion to how often they have appeared, reducing repetition.
A unique identifier for the end user, useful for request tracking and abuse monitoring.
Maximum number of agent reasoning loops (Swarms extension); defaults to 1 (a single pass). Because this field is not part of the standard OpenAI schema, pass it via `extra_body` in the OpenAI SDKs.
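Since the SDKs do not expose non-standard fields as named arguments, `extra_body` is the pass-through: the SDK merges its entries into the top-level JSON request body. A sketch of how `max_loops` reaches the wire (model ID is a placeholder):

```python
import json

# Standard arguments, as accepted by client.chat.completions.create().
standard_args = {
    "model": "example-model",  # placeholder model ID
    "messages": [{"role": "user", "content": "Plan a three-step task."}],
}

# Swarms extension field; defaults to 1 (single pass) when omitted.
extra_body = {"max_loops": 3}

# The SDK merges extra_body into the top-level request JSON, so the
# wire-level body is equivalent to:
request_body = {**standard_args, **extra_body}
print(json.dumps(request_body, indent=2))

# With the OpenAI Python SDK:
# client.chat.completions.create(**standard_args, extra_body=extra_body)
```

The TypeScript SDK offers the same escape hatch: unrecognized top-level properties on the request object are forwarded in the JSON body.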