Tutorials · 12 min read · 25 May 2026

Your First Claude API Call

Get from zero to a working Claude API integration in Python or Node.js. Covers installation, your first call, streaming, error handling, and the parameters that matter.

This tutorial takes you from zero to a working Claude API integration. By the end you will have made your first API call in Python and Node.js, understood the response object, added streaming, and put proper error handling in place.

What You Need

An Anthropic API key — get one at console.anthropic.com
Python 3.8+ or Node.js 18+ installed
Basic familiarity with either language
5–10 minutes

Never hardcode your API key in source code. Use environment variables. This tutorial shows the correct pattern — follow it from the start.
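As a minimal sketch of that pattern, the helper below reads the key from the environment and fails early with a readable message when it is missing. The name load_api_key is hypothetical and not part of the SDK. The SDK client already reads ANTHROPIC_API_KEY on its own, so this is only useful when you want an explicit check at startup.

```python
import os


def load_api_key(env=os.environ, name="ANTHROPIC_API_KEY"):
    """Return the API key from the environment, or fail with a clear message."""
    # load_api_key is an illustrative helper, not an SDK function.
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell before running this script."
        )
    return key
```

Passing a plain dict as env makes the check easy to test without touching your real environment.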

Installation

bash (Python)

pip install anthropic

bash (Node.js)

npm install @anthropic-ai/sdk
# or
yarn add @anthropic-ai/sdk

Your First Call in Python

Create a file called first_call.py. Set your API key as an environment variable first.

bash

export ANTHROPIC_API_KEY="your-api-key-here"

python

import anthropic

# The client automatically reads ANTHROPIC_API_KEY from the environment
client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain what a REST API is in two sentences."}
    ]
)

print(message.content[0].text)

Line-by-Line Explanation

import anthropic: imports the SDK.
anthropic.Anthropic(): creates a client. If ANTHROPIC_API_KEY is set, the client picks it up automatically. You can also pass api_key="...", but never hardcode a literal key in source code.
model="claude-opus-4-5": the model to use. claude-opus-4-5 is the most capable; claude-haiku-4-5 is the fastest and cheapest for simple tasks.
max_tokens=1024: the maximum number of tokens in the response. The Messages API requires this parameter; keep it modest to cap the cost of each call.
messages=[...]: the conversation history. For a single-turn call this is just one user message.
message.content[0].text: the text of the first content block in the response.

max_tokens is a required parameter: the Messages API has no default for it, and a request without it is rejected. For most tasks 512–2048 is the right range. For long documents or code generation, go up to 4096 or 8192.

Your First Call in Node.js / TypeScript

typescript

import Anthropic from "@anthropic-ai/sdk";

// Client reads ANTHROPIC_API_KEY from process.env automatically
const client = new Anthropic();

async function main() {
  const message = await client.messages.create({
    model: "claude-opus-4-5",
    max_tokens: 1024,
    messages: [
      {
        role: "user",
        content: "Explain what a REST API is in two sentences.",
      },
    ],
  });

  // content is an array of blocks — text blocks have a .text property
  const textBlock = message.content[0];
  if (textBlock.type === "text") {
    console.log(textBlock.text);
  }
}

main();

The SDK is fully typed. If you are using TypeScript, content blocks are a discriminated union — check textBlock.type === "text" before accessing .text. This makes your code safe and gives you autocomplete.

Understanding the Response Object

The response from client.messages.create() contains more than just the text. Here is the full structure with comments:

json (response structure)

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",  // unique message ID
  "type": "message",
  "role": "assistant",
  "model": "claude-opus-4-5",              // model that was used
  "content": [
    {
      "type": "text",                       // content block type
      "text": "A REST API is..."            // the actual response text
    }
  ],
  "stop_reason": "end_turn",               // why generation stopped
  // "end_turn" = natural end
  // "max_tokens" = hit your max_tokens limit — increase it if you see this
  // "stop_sequence" = hit a stop sequence you defined
  "stop_sequence": null,
  "usage": {
    "input_tokens": 18,                    // tokens in your messages + system prompt
    "output_tokens": 47                    // tokens in the response
  }
}

Check stop_reason in production. If you see "max_tokens" frequently, your responses are being cut off mid-sentence. Increase max_tokens, or ask for a shorter answer in your prompt.
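The check described above can be sketched with two small helpers. The names response_text and was_truncated are hypothetical, and the SimpleNamespace object is only a stand-in with the same fields as the structure shown earlier, not a real API response.

```python
from types import SimpleNamespace


def response_text(message):
    """Join the text of every text block, skipping any other block types."""
    return "".join(b.text for b in message.content if b.type == "text")


def was_truncated(message):
    """True when generation stopped because it hit max_tokens."""
    return message.stop_reason == "max_tokens"


# Stand-in object mirroring the response fields (assumption: not a real response).
fake = SimpleNamespace(
    content=[SimpleNamespace(type="text", text="A REST API is...")],
    stop_reason="max_tokens",
    usage=SimpleNamespace(input_tokens=18, output_tokens=47),
)

if was_truncated(fake):
    print("Warning: response was cut off at max_tokens")
print(response_text(fake))
```

In a real integration you would pass the object returned by client.messages.create() instead of the stand-in.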

Adding Streaming

Streaming sends tokens as they are generated rather than waiting for the full response. This dramatically improves perceived latency for users. It takes only a small change in both languages.

python (streaming)

import anthropic

client = anthropic.Anthropic()

# Use messages.stream() instead of messages.create()
with client.messages.stream(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Write a short poem about APIs."}
    ],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)  # print each text chunk as it arrives

print()  # newline after streaming completes

typescript (streaming)

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function main() {
  const stream = await client.messages.stream({
    model: "claude-opus-4-5",
    max_tokens: 1024,
    messages: [
      { role: "user", content: "Write a short poem about APIs." },
    ],
  });

  // Stream text chunks as they arrive
  for await (const chunk of stream) {
    if (
      chunk.type === "content_block_delta" &&
      chunk.delta.type === "text_delta"
    ) {
      process.stdout.write(chunk.delta.text);
    }
  }

  // Get the final complete message after streaming
  const finalMessage = await stream.finalMessage();
  console.log("\nOutput tokens used:", finalMessage.usage.output_tokens);
}

main();
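When you stream, you often want the complete text as well, for logging or for storing the reply. The sketch below prints chunks as they arrive and returns the joined result. The name print_and_collect is hypothetical; it accepts any iterable of strings, so in the Python streaming example you could pass it stream.text_stream.

```python
import sys


def print_and_collect(text_chunks, out=sys.stdout):
    """Write each chunk as it arrives and return the full text at the end."""
    parts = []
    for chunk in text_chunks:
        out.write(chunk)
        out.flush()  # make the chunk visible immediately
        parts.append(chunk)
    out.write("\n")  # newline after streaming completes
    return "".join(parts)


# Demo with a plain list standing in for a live stream.
full_text = print_and_collect(["Roses are red, ", "APIs return JSON."])
```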

Handling Errors Properly

API calls fail. Rate limits, network issues, invalid parameters — handle them explicitly. The SDK provides typed error classes for each failure mode.

python (error handling)

import anthropic

client = anthropic.Anthropic()

try:
    message = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(message.content[0].text)

except anthropic.APIConnectionError as e:
    print("Network error — check your internet connection")
    print(e)

except anthropic.RateLimitError as e:
    # Implement exponential backoff here in production
    print("Rate limit hit — slow down your requests")
    print(e)

except anthropic.AuthenticationError as e:
    print("Invalid API key — check ANTHROPIC_API_KEY")
    print(e)

except anthropic.APIStatusError as e:
    # Catch all other API errors (400, 500, etc.)
    print(f"API error: {e.status_code}")
    print(e.message)

typescript (error handling)

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function callClaude(prompt: string): Promise<string> {
  try {
    const message = await client.messages.create({
      model: "claude-opus-4-5",
      max_tokens: 1024,
      messages: [{ role: "user", content: prompt }],
    });

    const textBlock = message.content[0];
    return textBlock.type === "text" ? textBlock.text : "";

  } catch (error) {
    if (error instanceof Anthropic.APIConnectionError) {
      throw new Error("Network error — check your connection");
    }
    if (error instanceof Anthropic.RateLimitError) {
      throw new Error("Rate limit exceeded — implement backoff");
    }
    if (error instanceof Anthropic.AuthenticationError) {
      throw new Error("Invalid API key");
    }
    throw error; // re-throw unexpected errors
  }
}
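Both examples mention backoff for rate limits without showing it. Here is a minimal sketch of exponential backoff with jitter. The helper name call_with_backoff and its parameters are hypothetical; it takes the exception type to retry on as an argument, so in real use you would pass anthropic.RateLimitError. The SDK client also retries some failures on its own (configurable with max_retries), so treat this as an illustration of the technique and not a required layer.

```python
import random
import time


def call_with_backoff(fn, retry_on, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a retry_on error wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            sleep(delay)


# Hypothetical usage with the client from the examples above:
# call_with_backoff(
#     lambda: client.messages.create(model="claude-opus-4-5", max_tokens=1024,
#                                    messages=[{"role": "user", "content": "Hello"}]),
#     retry_on=anthropic.RateLimitError,
# )
```

The sleep parameter is injectable so the delay schedule can be tested without actually waiting.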

Setting max_tokens and temperature

python (key parameters)

message = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=2048,     # max tokens to generate — always set this
    temperature=0.7,     # 0.0 = most focused (not fully deterministic), 1.0 = most varied (the default)
    system="You are a helpful assistant.",  # optional system prompt
    messages=[
        {"role": "user", "content": "Write a product description for noise-cancelling headphones."}
    ],
    stop_sequences=["---"],  # stop generation when this string appears (optional)
)

# temperature guidance:
# 0.0  — best for: classification, data extraction, factual Q&A
# 0.3  — best for: code generation, technical explanations
# 0.7  — best for: general writing, summaries
# 1.0  — best for: creative writing, brainstorming

For code generation, keep temperature at 0.2–0.3. Higher temperatures produce more creative code — which usually means more bugs. For creative writing tasks, 0.7–0.9 gives better variety.
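One way to keep these choices consistent across a codebase is a small lookup table. The values below are starting points taken from the guidance above; the task labels and the function name temperature_for are hypothetical.

```python
# Starting points taken from the guidance above; tune them for your own workload.
TEMPERATURE_BY_TASK = {
    "classification": 0.0,
    "extraction": 0.0,
    "code": 0.3,
    "summary": 0.7,
    "creative": 1.0,
}


def temperature_for(task, default=0.7):
    """Look up a starting temperature for a task label, falling back to default."""
    return TEMPERATURE_BY_TASK.get(task, default)


print(temperature_for("code"))
```

You would then pass temperature=temperature_for("code") to client.messages.create().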

Multi-Turn Conversations

The Claude API is stateless — there are no sessions. To have a multi-turn conversation, you pass the full conversation history in every request.

python (multi-turn)

import anthropic

client = anthropic.Anthropic()

# Build conversation history manually
conversation = []

def chat(user_message: str) -> str:
    # Add user message to history
    conversation.append({"role": "user", "content": user_message})

    # Send the full history every time
    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        system="You are a helpful programming tutor.",
        messages=conversation
    )

    assistant_message = response.content[0].text

    # Add assistant response to history for next turn
    conversation.append({"role": "assistant", "content": assistant_message})

    return assistant_message

# Example multi-turn session
print(chat("What is a closure in JavaScript?"))
print(chat("Can you give me a practical example?"))
print(chat("How is this different from a regular function?"))

Conversation history grows with every turn and costs tokens each time. For long sessions, consider summarising old turns to keep the context window manageable. A common pattern: keep the last 10 messages in full, summarise everything before that.
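The trimming half of that pattern can be sketched as below. The function name trim_history is hypothetical, and the sketch assumes the message list should start with a user turn, so it drops a leading assistant message after cutting. Summarising the dropped turns would be a separate API call and is not shown.

```python
def trim_history(conversation, keep_last=10):
    """Return the last keep_last messages, starting on a user turn."""
    recent = conversation[-keep_last:] if keep_last > 0 else []
    # Assumption: the list should begin with a user message,
    # so drop any leading assistant turn left over after the cut.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent


# Demo: seven alternating turns, keep roughly the last four.
history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": str(i)}
    for i in range(7)
]
print([m["content"] for m in trim_history(history, keep_last=4)])
```

In the chat() function above you would send trim_history(conversation) as messages while still appending every turn to the full list.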

What to Build Next

A command-line tool that takes a filename and reviews the code in it
A batch processor that reads a CSV of questions and writes Claude's answers to a new column
A Slack bot that routes questions to Claude and posts the reply back to the channel
A document summariser that takes a URL, fetches the content, and returns a summary
A structured data extractor — give it freeform text, get back clean JSON
A multi-model router that picks a model and temperature per request: higher temperature for creative tasks, lower for factual ones
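As a starting point for the batch processor idea, here is a minimal sketch using only the standard library. The function answer_csv, the "question" column name, and the ask callable are all assumptions for illustration; in a real tool ask would wrap a client.messages.create() call and return the reply text.

```python
import csv
import io


def answer_csv(in_file, out_file, ask, question_column="question"):
    """Copy rows from in_file to out_file, adding an "answer" column filled by ask()."""
    reader = csv.DictReader(in_file)
    fieldnames = list(reader.fieldnames or []) + ["answer"]
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        row["answer"] = ask(row[question_column])
        writer.writerow(row)


# Demo with in-memory files and a stand-in for the real API call.
source = io.StringIO("question\nWhat is REST?\n")
result = io.StringIO()
answer_csv(source, result, ask=lambda q: f"(answer to: {q})")
print(result.getvalue())
```

Keeping the API call behind a plain callable lets you test the CSV handling without spending tokens.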

The official Anthropic documentation at https://docs.anthropic.com covers all models, parameters, and advanced features including tool use, vision, and prompt caching. It is the most up-to-date reference.

Tags: API, Claude, Python, Node.js, developer, tutorials
