Ollama API

These examples call Ollama’s native API, served under the /api path.

Use LangChain

from langchain_ollama.llms import OllamaLLM

print(OllamaLLM(model='llama3.1').invoke('Why is the sky blue?'))

Use Ollama package

See ollama-python on GitHub.

import ollama

response = ollama.chat(model='llama3.1', messages=[
  {
    'role': 'user',
    'content': 'Why is the sky blue?',
  },
])
print(response['message']['content'])

Use cURL

Based on the Ollama docs:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt":"Why is the sky blue?"
}'
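The same request can be sketched in Python using only the standard library. The helper names `build_generate_request` and `generate` are hypothetical, not part of any Ollama package. The sketch assumes an Ollama server is listening on localhost:11434 with the llama3.1 model available, and it sets `"stream": false` so the server replies with one JSON object instead of the default stream of JSON lines.

```python
import json
import urllib.request

# Same endpoint as the cURL call above.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt, model="llama3.1"):
    """Build (but do not send) a POST request for /api/generate."""
    # "stream": False asks for a single JSON reply rather than a stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3.1"):
    """Send the request and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as reply:
        return json.loads(reply.read())["response"]
```

With a server running, `generate('Why is the sky blue?')` should return the model's answer as a string.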

OpenAI style

These examples use the OpenAI-compatible endpoints served under /v1.

Connect to Ollama’s OpenAI-compatible API at this base URL:

http://localhost:11434/v1

Source: the OpenAI compatibility page in the Ollama docs.

This means you can use LangChain or the OpenAI package as usual; just set the base URL to that address.

Use OpenAI package

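from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1',
    api_key='dummy',  # required by the client, but ignored by Ollama
)

response = client.chat.completions.create(
    model='llama3.2',
    messages=[{'role': 'user', 'content': 'Hello!'}],
)
print(response.choices[0].message.content)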

Use cURL

curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama3.2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
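The cURL call prints the raw JSON reply. As a rough sketch of how to pull the assistant text out of it, the following uses only the Python standard library. The helper names `extract_reply` and `chat` are hypothetical, and the code assumes the reply follows the OpenAI chat completion layout (`choices[0].message.content`) and that a server is running on localhost:11434 with llama3.2 available.

```python
import json
import urllib.request

# Base URL of the OpenAI-compatible API, as given above.
BASE_URL = "http://localhost:11434/v1"


def extract_reply(completion):
    """Return the assistant text from an OpenAI-style chat completion dict."""
    return completion["choices"][0]["message"]["content"]


def chat(messages, model="llama3.2"):
    """POST the messages to /chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Needs a running Ollama server; raises URLError otherwise.
    with urllib.request.urlopen(request) as reply:
        return extract_reply(json.loads(reply.read()))
```

For example, `chat([{'role': 'user', 'content': 'Hello!'}])` sends the same kind of request as the cURL command and returns only the message text.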