Developer Documentation

Curl and Python Examples

Curl

curl -X POST http://127.0.0.1:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Say hello in one short sentence."}
    ],
    "stream": false,
    "think": false,
    "max_tokens": 64,
    "temperature": 0.2
  }'

Python

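from openai import OpenAI

# Point the OpenAI client at the local server's /v1 endpoint.
client = OpenAI(
    base_url="http://127.0.0.1:9000/v1",
    api_key="omniinfer-local",
)

# Send a single non-streaming chat completion request.
resp = client.chat.completions.create(
    model="local-model",
    messages=[
        {"role": "user", "content": "Say hello in one short sentence."}
    ],
    stream=False,
)

# Print the text of the first returned choice.
print(resp.choices[0].message.content)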
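
The Python example above omits the `think`, `max_tokens`, and `temperature` fields that the curl example sends. The following is a minimal sketch, using only the standard library, of sending the same JSON body as the curl request. The helper names `build_payload` and `chat` are hypothetical, and the response parsing assumes an OpenAI-style reply shape, as the `resp.choices[0].message.content` access above suggests.

```python
import json
import urllib.request

# Same local endpoint as the examples above.
BASE_URL = "http://127.0.0.1:9000/v1"


def build_payload(prompt, *, think=False, max_tokens=64, temperature=0.2):
    """Build the same JSON body that the curl example sends (hypothetical helper)."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "think": think,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def chat(prompt, **options):
    """POST the payload and return the reply text (hypothetical helper)."""
    body = json.dumps(build_payload(prompt, **options)).encode("utf-8")
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        data = json.load(resp)
    # Assumption: the server returns an OpenAI-style response object.
    return data["choices"][0]["message"]["content"]
```

With the local server running, `print(chat("Say hello in one short sentence."))` should behave like the curl call. Keeping the payload construction in its own function makes the request body easy to inspect without a live server.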

If your application manages local model loading itself, it can call the control-plane API first and then use the inference API shown here.


© 2025 万象智维科技有限公司. All rights reserved.

京ICP备2025136340号-1