Running a Managed Deep Agent involves creating a thread, starting a run, and streaming output. Deploy the agent first, then invoke it with the SDKs or REST API. For a shorter end-to-end path, see the quickstart.
Managed Deep Agents is in private preview, available on LangSmith Cloud in the US region only. Join the waitlist to request access.

Find the agent ID

Find the agent ID with the CLI or the REST API.
List agents:
deepagents agents list
Inspect one agent:
deepagents agents get <agent_id>

Create threads and stream runs

You can create threads and stream runs with the SDK or REST API. Currently, there is no CLI command for running Managed Deep Agents. Install the SDK for your runtime:
pip install managed-deepagents
Set request defaults:
export LANGSMITH_API_KEY="<LANGSMITH_API_KEY>"
export LANGSMITH_API_URL="https://api.smith.langchain.com"
export DEEPAGENTS_BASE_URL="$LANGSMITH_API_URL/v1/deepagents"
The SDKs read LANGSMITH_API_KEY by default. REST requests require the X-Api-Key header:
X-Api-Key: <LANGSMITH_API_KEY>
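For REST calls made from Python, the exported variables and the header above can be gathered in one place. The following is a minimal sketch that uses only the standard library. The helper name rest_settings is hypothetical and not part of the SDK, and the fallback URL only mirrors the export lines above.

```python
import os
from typing import Mapping


def rest_settings(env: Mapping[str, str] = os.environ) -> tuple[str, dict[str, str]]:
    """Return (base_url, headers) for REST requests.

    Hypothetical helper: it reads the same variables exported above and
    fails early if the API key is missing.
    """
    api_key = env.get("LANGSMITH_API_KEY")
    if not api_key:
        raise RuntimeError("LANGSMITH_API_KEY is not set")

    # Prefer an explicit DEEPAGENTS_BASE_URL; otherwise derive it the same
    # way the export lines do.
    api_url = env.get("LANGSMITH_API_URL", "https://api.smith.langchain.com")
    base_url = env.get("DEEPAGENTS_BASE_URL") or f"{api_url.rstrip('/')}/v1/deepagents"

    # REST requests require the X-Api-Key header.
    headers = {"X-Api-Key": api_key}
    return base_url, headers
```

Passing a plain dict instead of os.environ keeps the helper easy to test without touching the real environment.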
Export the agent ID you retrieved in the previous section:
export AGENT_ID="<agent_id>"
The examples below reuse these variables:
import os

from managed_deepagents import Client

agent_id = os.environ["AGENT_ID"]
client = Client()

Create a thread

Create a thread before running the agent. Threads preserve conversation and execution state for long-running work. The options object is optional, and both fields default to false. Set test_run to true to mark the thread as a test run, which is filtered out of usage and analytics. When skip_memory_write_protection is false (the default), the runtime raises a human-in-the-loop interrupt before the agent writes to long-term memory, so you can approve or reject the write. Set it to true to let memory writes proceed immediately, which is useful for headless runs where no one is available to approve them. For the full field reference, see the API reference.
thread = client.threads.create(
    agent_id=agent_id,
    options={
        "test_run": False,
        "skip_memory_write_protection": False,
    },
)
thread_id = thread["id"]
print(f"Thread ID: {thread_id}")
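If the same code creates threads for both interactive and headless runs, the two options can be derived from how the run is launched. This is a minimal sketch under that assumption. The thread_options helper and its headless parameter are hypothetical names, and only the two dictionary keys come from the text above.

```python
def thread_options(*, headless: bool = False, test_run: bool = False) -> dict[str, bool]:
    """Build the optional thread options described above.

    Hypothetical helper: a headless run has nobody to approve memory
    writes, so it skips the human-in-the-loop protection.
    """
    return {
        "test_run": test_run,
        "skip_memory_write_protection": headless,
    }


# Interactive default: both fields stay false, matching the documented defaults.
interactive = thread_options()

# Headless test run: excluded from analytics, memory writes proceed immediately.
batch = thread_options(headless=True, test_run=True)
```

The resulting dictionary can be passed as the options argument shown in the thread creation example.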

Stream a run from a thread

Start work on the thread and stream the result:
for event in client.threads.stream(
    thread_id,
    agent_id=agent_id,
    messages=[
        {
            "role": "user",
            "content": "Research recent approaches to agent memory and summarize the main tradeoffs.",
        }
    ],
    stream_mode=["values", "updates", "messages-tuple"],
    stream_subgraphs=True,
    user_timezone="America/Los_Angeles",
):
    print(event.event, event.data)
The endpoint streams Server-Sent Events (text/event-stream). With the stream modes (stream_mode, streamMode) shown, you receive incremental updates and messages-tuple events as the agent works, followed by a final values event with the run’s full state, including the agent’s response.
Set stream_subgraphs (streamSubgraphs) to true to also stream events from subgraphs, such as subagents. The optional user_timezone sets the caller’s IANA timezone so the agent reasons about dates in local time. If you omit it, the agent’s configured timezone or UTC is used.
The cURL example prints raw SSE lines. Parse its data: payloads as JSON to drive a UI. The Python and TypeScript SDK examples yield decoded events with event.event and event.data. Choose a stream mode:
Stream mode | Use for
values | Full state snapshots after steps.
updates | Incremental state updates as the agent works.
messages-tuple | Token-level message output for chat UIs. Emits a messages event whose payload is a [chunk, metadata] tuple.
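When you consume the raw stream yourself rather than through an SDK, each event arrives as event: and data: lines terminated by a blank line. The following is a minimal sketch of decoding those lines and joining token chunks from messages events. The function names are hypothetical, the sample lines are invented for illustration, and it assumes each chunk is a JSON object with a string content field, which this page does not specify.

```python
import json
from typing import Any, Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[tuple[str, Any]]:
    """Yield (event_name, decoded_json) pairs from raw SSE lines."""
    event_name = "message"  # SSE default when no event: field is present
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data_lines:
                yield event_name, json.loads("\n".join(data_lines))
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops one leading space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)


def collect_text(events: Iterable[tuple[str, Any]]) -> str:
    """Join token text from messages events (payload: [chunk, metadata])."""
    parts: list[str] = []
    for name, data in events:
        if name != "messages":
            continue
        chunk, _metadata = data
        content = chunk.get("content", "")  # assumed field name
        if isinstance(content, str):
            parts.append(content)
    return "".join(parts)


# Illustrative input only; real payloads carry more fields.
sample = [
    "event: messages\n",
    'data: [{"content": "Hel"}, {"node": "agent"}]\n',
    "\n",
    ": keep-alive\n",
    "event: messages\n",
    'data: [{"content": "lo"}, {"node": "agent"}]\n',
    "\n",
    "event: values\n",
    'data: {"messages": []}\n',
    "\n",
]
events = list(parse_sse(sample))
text = collect_text(events)
```

A chat UI would typically append each chunk as it arrives instead of collecting the whole list first. The batch form here just keeps the sketch short.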

Stream with React useStream

Do not ship your LangSmith API key to the browser. In production React apps, route requests through your backend with a custom fetch instead of passing apiKey directly.
The previous Python SDK and TypeScript SDK examples stream route-level events. The following React useStream example exposes LangGraph projections such as stream.messages, stream.values, and output state for chat UIs. For React applications, use the TypeScript SDK’s LangGraph client adapter with @langchain/react:
import { Client } from "@langchain/managed-deepagents";
import { useStream } from "@langchain/react";

const agentId = "<agent_id>";

const managedDeepAgents = new Client({
  // In browser apps, prefer passing a custom fetch that calls your backend.
  apiKey: process.env.LANGSMITH_API_KEY,
});

const client = managedDeepAgents.getLangGraphClient({ agentId });

export function ManagedDeepAgentStream() {
  const stream = useStream({
    client,
    assistantId: agentId,
    fetchStateHistory: false,
  });

  return (
    <section>
      <button
        type="button"
        disabled={stream.isLoading}
        onClick={() => {
          void stream.submit({
            messages: [
              { role: "user", content: "Write a short status update." },
            ],
          });
        }}
      >
        Run agent
      </button>

      {stream.messages.map((message, index) => (
        <p key={message.id ?? index}>{String(message.content)}</p>
      ))}
    </section>
  );
}

Next steps

SDK reference

SDK configuration details for the Python and TypeScript clients.

API reference

Route-level request and response details.
If a request fails, confirm that your API key is valid, that the workspace has private preview access, and that you are calling the supported region. For response status codes and error shapes, see the API reference.