Claude API Tutorial: Complete Beginner’s Guide (Python, 2026)

The Claude API gives you direct access to Anthropic’s language models from any Python application. Whether you want to generate text, analyze images, stream responses token by token, or wire up tools so Claude can call your own functions, all of it goes through the same Messages API: client.messages.create(), plus client.messages.stream() for streaming. This tutorial walks through every major feature with working code, from your first “Hello!” to a complete tool-use round trip.

Prerequisites

You need Python 3.9 or later (Python 3.8 has reached end of life, and recent SDK releases no longer support it) and an Anthropic API key. If you do not have a key yet, follow the step-by-step guide at How to Get a Claude API Key; it takes about two minutes. Store the key in an environment variable so it never ends up in your source code:

export ANTHROPIC_API_KEY="sk-ant-..."   # add to ~/.bashrc or ~/.zshrc
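
Before creating a client, it can help to fail fast with a readable message when the variable is missing. The sketch below is one way to do that; require_api_key is a hypothetical helper written for this tutorial, not part of the Anthropic SDK.

```python
import os


def require_api_key(env=None) -> str:
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the Anthropic SDK.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it in your shell first."
        )
    return key
```

Calling require_api_key() at startup surfaces a configuration problem before the first request is made.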

Installation and First Call

Install the official Anthropic Python SDK with pip:

pip install anthropic

Now make your first API call. The SDK reads ANTHROPIC_API_KEY from the environment automatically:

from anthropic import Anthropic

client = Anthropic()

msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)

print(msg.content[0].text)

Run it and you will see Claude’s greeting in your terminal. That single messages.create() call is the foundation of everything else in this tutorial.

The Messages API

Every call to client.messages.create() accepts the same core parameters:

model — the Claude model to use (e.g. "claude-sonnet-4-6")
max_tokens — the upper limit on output tokens; Claude stops when it reaches this or finishes naturally
messages — a list of turn objects, each with a role ("user" or "assistant") and content
system — an optional system prompt string that sets Claude’s persona and constraints (covered below)

The content field in each message can be a plain string or a list of content blocks. A content block is an object with a type field — "text" for text, "image" for images, "tool_use" for tool calls, and "tool_result" for tool outputs. This block-based design is what makes vision and tool use possible without a separate API surface.
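
To make the two content shapes concrete, the sketch below normalizes either form into a list of blocks. The API accepts both forms directly, so the as_blocks helper is hypothetical and exists only to show that a plain string is shorthand for a single text block.

```python
def as_blocks(content):
    """Normalize message content to the list-of-blocks form.

    A plain string is shorthand for a single text block.
    """
    if isinstance(content, str):
        return [{"type": "text", "text": content}]
    return list(content)


short = {"role": "user", "content": "Hello!"}
explicit = {"role": "user", "content": [{"type": "text", "text": "Hello!"}]}

# Both messages carry the same content once normalized.
assert as_blocks(short["content"]) == as_blocks(explicit["content"])
```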

The response object has:

msg.content — list of output content blocks
msg.content[0].text — the text of the first (usually only) block
msg.stop_reason — "end_turn" when Claude finished, "max_tokens" if truncated, "tool_use" when a tool was called
msg.usage.input_tokens / msg.usage.output_tokens — token counts for cost tracking
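
Because content is a list, reading content[0].text assumes the first block is text. The sketch below shows a more defensive way to read a response. The helper names extract_text and was_truncated are hypothetical, and the fake object only mimics the attribute names listed above so the example runs without an API call.

```python
from types import SimpleNamespace


def extract_text(message) -> str:
    """Join the text of every text block, skipping tool_use and other block types."""
    return "".join(
        block.text for block in message.content if block.type == "text"
    )


def was_truncated(message) -> bool:
    """True when the reply hit max_tokens before finishing."""
    return message.stop_reason == "max_tokens"


# Stand-in with the same attribute names as a real response object.
fake = SimpleNamespace(
    content=[
        SimpleNamespace(type="text", text="Hello"),
        SimpleNamespace(type="tool_use", id="toolu_x", name="demo", input={}),
        SimpleNamespace(type="text", text=" there"),
    ],
    stop_reason="end_turn",
    usage=SimpleNamespace(input_tokens=12, output_tokens=5),
)

print(extract_text(fake))
print("total tokens:", fake.usage.input_tokens + fake.usage.output_tokens)
```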

Your First Chat App

The Messages API is stateless — Claude does not remember previous turns automatically. To build a multi-turn conversation, append each exchange to the messages list before the next call. Here is a complete REPL chat loop:

from anthropic import Anthropic

client = Anthropic()
conversation = []

print("Chat with Claude (type 'quit' to exit)\n")

while True:
    user_input = input("You: ").strip()
    if user_input.lower() in ("quit", "exit", "q"):
        break
    if not user_input:
        continue

    conversation.append({"role": "user", "content": user_input})

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        messages=conversation,
    )

    assistant_text = response.content[0].text
    conversation.append({"role": "assistant", "content": assistant_text})

    print(f"Claude: {assistant_text}\n")

Each iteration appends both the user turn and Claude’s reply so that the full context is available on the next call. Because every call resends the whole history, in production you will want to summarize or truncate the conversation list once it grows large, both to stay within the model’s context window and to control costs.
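
One simple way to cap history is to keep only the most recent messages. The sketch below assumes plain-text turns like the ones in the chat loop above; trim_history and its max_messages parameter are hypothetical names, and dropping old turns does lose context that a summary would preserve.

```python
def trim_history(conversation, max_messages=20):
    """Return the most recent messages, starting on a user turn."""
    if max_messages <= 0:
        raise ValueError("max_messages must be positive")
    trimmed = conversation[-max_messages:]
    # The API expects the list to begin with a user turn,
    # so drop a leading assistant reply left over from the cut.
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed
```

In the chat loop you would pass messages=trim_history(conversation) instead of the full list, leaving conversation itself untouched.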

Streaming Responses

For chatbots and interactive tools, streaming lets you display tokens as they arrive instead of waiting for the full response. Use the client.messages.stream() context manager:

from anthropic import Anthropic

client = Anthropic()

with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain quantum entanglement simply."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

print()  # newline after streaming finishes

The stream.text_stream iterator yields text deltas as plain strings. When the with block exits, the stream is closed automatically. If you need the final Message object (for the stop reason, usage, and so on), call stream.get_final_message() after the loop, while still inside the with block.
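
The pieces above can be combined into a small function that prints text as it arrives and also returns the final message. This is a sketch: stream_reply is a hypothetical helper, and the client is passed in as an argument so the function makes no assumption about how it was created.

```python
def stream_reply(client, prompt, model="claude-sonnet-4-6", max_tokens=1024):
    """Print text deltas as they arrive and return (full_text, final_message)."""
    chunks = []
    with client.messages.stream(
        model=model,
        max_tokens=max_tokens,
        messages=[{"role": "user", "content": prompt}],
    ) as stream:
        for text in stream.text_stream:
            print(text, end="", flush=True)
            chunks.append(text)
        # Fetch the accumulated Message while the stream is still open.
        final = stream.get_final_message()
    print()  # newline after streaming finishes
    return "".join(chunks), final
```

With a real client, text, final = stream_reply(Anthropic(), "Explain quantum entanglement simply.") gives you the full text plus final.stop_reason and final.usage for logging.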

Analyzing Images with Vision

Claude’s vision capability accepts images as base64-encoded content blocks alongside text. Load the image, encode it, then pass it as part of the content list:

import base64
from anthropic import Anthropic

client = Anthropic()

with open("screenshot.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Describe what you see in this image.",
                },
            ],
        }
    ],
)

print(response.content[0].text)

Supported media types are image/jpeg, image/png, image/gif, and image/webp. Images count toward input tokens: Anthropic’s vision documentation gives width × height / 750 as an approximation, so an 800×600 screenshot is roughly 640 tokens. You can also pass an image by URL by replacing the base64 source with {"type": "url", "url": "..."}.
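
Hard-coding the media type is easy to get wrong when file formats vary. The sketch below builds either kind of image block. The helper names and the MEDIA_TYPES table are hypothetical; only the block shapes come from the API.

```python
import base64
from pathlib import Path

# Extension-to-media-type table for the four supported formats.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block_from_path(path):
    """Build a base64 image block, inferring media_type from the file extension."""
    path = Path(path)
    media_type = MEDIA_TYPES.get(path.suffix.lower())
    if media_type is None:
        raise ValueError(f"Unsupported image type: {path.suffix!r}")
    data = base64.standard_b64encode(path.read_bytes()).decode("utf-8")
    return {
        "type": "image",
        "source": {"type": "base64", "media_type": media_type, "data": data},
    }


def image_block_from_url(url):
    """Build an image block that references a hosted image instead of inline data."""
    return {"type": "image", "source": {"type": "url", "url": url}}
```

Either block can be placed in the content list next to a text block, exactly as in the example above.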

Tool Use (Function Calling)

Tool use lets Claude call your Python functions. You define a list of tools with JSON Schema descriptions, Claude returns a tool_use content block when it wants to call one, and you send the result back as a tool_result block. Here is a complete single-turn example with a weather lookup tool:

from anthropic import Anthropic

client = Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. 'Kyiv'"},
            },
            "required": ["city"],
        },
    }
]


def get_weather(city: str) -> str:
    # Replace with a real weather API call
    return f"The weather in {city} is 22°C and sunny."


messages = [{"role": "user", "content": "What's the weather in Kyiv?"}]

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=messages,
)

# If Claude called a tool, handle it and send the result back
if response.stop_reason == "tool_use":
    tool_block = next(b for b in response.content if b.type == "tool_use")
    tool_result = get_weather(**tool_block.input)

    messages.append({"role": "assistant", "content": response.content})
    messages.append({
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_block.id,
                "content": tool_result,
            }
        ],
    })

    final = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )
    print(final.content[0].text)
else:
    print(response.content[0].text)

In production you will typically wrap the tool-call loop so Claude can chain multiple tool calls before giving a final answer. The pattern stays the same: check stop_reason == "tool_use", dispatch all tool_use blocks, append tool_result blocks, and call messages.create() again.
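
A minimal sketch of that loop follows. The names run_tool_loop, handlers (a dict mapping tool names to Python callables), and max_rounds are hypothetical; the client is passed in, and the messages list is extended in place. Tool failures are reported back to the model through the is_error field of the tool_result block rather than raised.

```python
def run_tool_loop(client, messages, tools, handlers,
                  model="claude-sonnet-4-6", max_rounds=5):
    """Call the model until it stops requesting tools; return the final response."""
    for _ in range(max_rounds):
        response = client.messages.create(
            model=model, max_tokens=1024, tools=tools, messages=messages
        )
        if response.stop_reason != "tool_use":
            return response

        # Echo the assistant turn back, then answer every tool_use block
        # in a single user turn.
        messages.append({"role": "assistant", "content": response.content})
        results = []
        for block in response.content:
            if block.type != "tool_use":
                continue
            try:
                output, is_error = str(handlers[block.name](**block.input)), False
            except Exception as exc:  # tell the model about the failure
                output, is_error = f"{type(exc).__name__}: {exc}", True
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": output,
                "is_error": is_error,
            })
        messages.append({"role": "user", "content": results})
    raise RuntimeError(f"No final answer after {max_rounds} tool rounds")
```

With the weather example above, the call would be run_tool_loop(client, messages, tools, {"get_weather": get_weather}). The max_rounds cap is a design choice that keeps a misbehaving tool chain from looping indefinitely.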

System Prompts

The system parameter sets a persistent instruction that applies to the entire conversation. Unlike a user message, the system prompt is not part of the messages list — it sits outside the turn structure:

from anthropic import Anthropic

client = Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system=(
        "You are a senior Python engineer. "
        "Always respond with concise, production-quality code. "
        "Prefer explicit error handling and type hints."
    ),
    messages=[{"role": "user", "content": "Write a function to retry a failed HTTP request."}],
)

print(response.content[0].text)

Tips for reliable system prompts:

Be specific about format — “respond in JSON” beats “be structured”
State constraints positively — “only answer questions about Python” rather than a long list of what not to do
For multi-turn chat apps, keep the system prompt stable across turns; changing it mid-conversation can confuse the model
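Prompt caching reduces the cost of a long, repeated system prompt: pass system as a list of text blocks and set cache_control: {"type": "ephemeral"} on the block, and cached reads of that prefix are billed at a fraction of the normal input price (up to about 90% less). The first request that writes the cache costs slightly more, and prompts below a model-dependent minimum length are not cached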
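
For the caching tip, the system parameter has to be given as a list of blocks rather than a plain string. The sketch below shows that shape; cached_system is a hypothetical helper name.

```python
def cached_system(text):
    """Wrap a system prompt in the block form used for prompt caching."""
    return [
        {
            "type": "text",
            "text": text,
            # Marks the prompt prefix ending at this block as cacheable.
            "cache_control": {"type": "ephemeral"},
        }
    ]


system_blocks = cached_system("You are a senior Python engineer.")
print(system_blocks[0]["cache_control"])
```

You would then pass system=cached_system(LONG_PROMPT) to messages.create(), where LONG_PROMPT stands for your own prompt text. The usage object on the response reports cache_creation_input_tokens and cache_read_input_tokens, which show whether a given call wrote or read the cache.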

Choosing a Model

Anthropic offers three model tiers in 2026. Pick based on the complexity of your task and your latency and cost requirements:

claude-opus-4-7 — most capable; best for complex reasoning, nuanced writing, and multi-step agentic tasks where quality matters most
claude-sonnet-4-6 — balanced; excellent quality at moderate cost and speed; the right default for most production applications
claude-haiku-4-5 — fastest and cheapest; ideal for high-volume tasks, real-time streaming, classification, and extraction where latency is critical

All three models support the same API surface: text, vision, streaming, tool use, and system prompts. Start with claude-sonnet-4-6 and move up to Opus if quality falls short, or down to Haiku if cost or latency is the bottleneck.
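
Keeping the model choice in one place makes it easy to move between tiers later. A small sketch, assuming the three model IDs named above; pick_model and the priority labels are hypothetical.

```python
# Model IDs as named in this tutorial; check the current model list before relying on them.
MODEL_BY_PRIORITY = {
    "quality": "claude-opus-4-7",
    "balanced": "claude-sonnet-4-6",
    "speed": "claude-haiku-4-5",
}


def pick_model(priority="balanced"):
    """Map a priority label to a model ID, defaulting to the balanced tier."""
    try:
        return MODEL_BY_PRIORITY[priority]
    except KeyError:
        raise ValueError(
            f"Unknown priority {priority!r}; expected one of {sorted(MODEL_BY_PRIORITY)}"
        ) from None
```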

Summary

Install with pip install anthropic; the SDK reads ANTHROPIC_API_KEY automatically
Every feature (text, vision, tool use, system prompts) goes through client.messages.create(); streaming takes the same parameters via client.messages.stream()
Multi-turn chat works by appending each exchange to the messages list before the next call
Streaming uses client.messages.stream() and iterates stream.text_stream
Vision accepts base64-encoded images as content blocks alongside text
Tool use: define tools with JSON Schema, handle stop_reason == "tool_use", return tool_result blocks
System prompts set persistent instructions outside the turn list via the system parameter
Default to claude-sonnet-4-6; upgrade to Opus for quality, downgrade to Haiku for speed and cost

Further reading: How to Build an AI Agent in Python — extend these patterns into a full agentic loop; Claude API Cost Guide — understand pricing and how to use prompt caching to cut costs.
