Create threads and stream runs for a Managed Deep Agent.
Running a Managed Deep Agent involves creating a thread, starting a run, and streaming output. Deploy the agent first, then invoke it with the SDKs or REST API.
For a shorter end-to-end path, see the quickstart.
Managed Deep Agents is in private preview, available on LangSmith Cloud in the US region only. Join the waitlist to request access.
You can create threads and stream runs with the SDK or REST API. Currently, there is no CLI command for running Managed Deep Agents.
Install the SDK for your runtime:
Create a thread before running the agent. Threads preserve conversation and execution state for long-running work.
The options object is optional, and both of its fields default to false. Set test_run to true to mark the thread as a test run that is filtered out of usage and analytics. When skip_memory_write_protection is false (the default), the runtime raises a human-in-the-loop interrupt before the agent writes to long-term memory, so you can approve or reject the write. Set it to true to let memory writes proceed immediately, which is useful for headless runs where no human is available to approve the write. For the full field reference, see the API reference.
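The defaults described above can be made explicit in a small helper. This is a minimal sketch, not the SDK's API: the field names test_run and skip_memory_write_protection come from the text, while the ThreadOptions type, resolveThreadOptions, and memoryWriteBehavior are hypothetical names introduced only for illustration.

```typescript
// Hypothetical shape for the thread options described above. The field names
// come from the text; the type itself is not taken from the SDK.
interface ThreadOptions {
  test_run?: boolean;
  skip_memory_write_protection?: boolean;
}

// Fill in the documented defaults (both false) so the effective settings are
// visible before the request is sent.
function resolveThreadOptions(options: ThreadOptions = {}): Required<ThreadOptions> {
  return {
    test_run: options.test_run ?? false,
    skip_memory_write_protection: options.skip_memory_write_protection ?? false,
  };
}

// Describe what the text says happens when the agent wants to write to
// long-term memory under the given options.
function memoryWriteBehavior(
  options: ThreadOptions = {},
): "interrupt_for_approval" | "write_immediately" {
  return resolveThreadOptions(options).skip_memory_write_protection
    ? "write_immediately"
    : "interrupt_for_approval";
}

// A headless run with no human available to approve memory writes.
const headlessOptions = resolveThreadOptions({ skip_memory_write_protection: true });
```

Resolving the defaults in one place keeps the interactive case (interrupt and wait for approval) and the headless case (write immediately) easy to tell apart when reviewing calling code.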
The endpoint streams Server-Sent Events (text/event-stream). With the stream mode (stream_mode, streamMode) shown, you receive incremental updates and messages-tuple events as the agent works, followed by a final values event with the run’s full state, including the agent’s response. Set stream_subgraphs (streamSubgraphs) to true to also stream events from subgraphs, such as subagents. The optional user_timezone sets the caller’s IANA timezone so the agent reasons about dates in local time, defaulting to the agent’s configured timezone or UTC.
The cURL example prints raw SSE lines. Parse its data: payloads as JSON to drive a UI. The Python and TypeScript SDK examples yield decoded events with event.event and event.data.
Choose a stream mode:
| Stream mode | Use for |
| --- | --- |
| values | Full state snapshots after steps. |
| updates | Incremental state updates as the agent works. |
| messages-tuple | Token-level message output for chat UIs. Emits a messages event whose payload is a [chunk, metadata] tuple. |
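To make the raw SSE output and the stream modes concrete, the following sketch parses SSE text into events and folds them into a small UI state. It assumes each data: payload is JSON, as described above, and that a message chunk carries its text in a content field. That field, the sample payloads, and the names parseSse, applyEvent, and UiState are illustrative assumptions, not taken from the API reference.

```typescript
interface StreamEvent {
  event: string;
  data: unknown;
}

// Parse complete SSE text into events. A real client would buffer partial
// network chunks until a blank line arrives; that is omitted here.
function parseSse(raw: string): StreamEvent[] {
  const events: StreamEvent[] = [];
  for (const block of raw.split(/\r?\n\r?\n/)) {
    let event = "message"; // SSE default when no event: line is present
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // comment or heartbeat line
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).replace(/^ /, ""));
    }
    if (dataLines.length === 0) continue;
    events.push({ event, data: JSON.parse(dataLines.join("\n")) });
  }
  return events;
}

interface UiState {
  text: string; // tokens accumulated from messages events
  lastUpdate: unknown; // most recent updates payload
  finalState: unknown; // most recent values snapshot
}

function applyEvent(state: UiState, ev: StreamEvent): UiState {
  switch (ev.event) {
    case "messages": {
      // Payload is a [chunk, metadata] tuple; chunk.content is an assumption.
      if (!Array.isArray(ev.data)) return state;
      const chunk = ev.data[0] as { content?: unknown } | undefined;
      const content: unknown = chunk?.content;
      return { ...state, text: state.text + (typeof content === "string" ? content : "") };
    }
    case "updates":
      return { ...state, lastUpdate: ev.data };
    case "values":
      return { ...state, finalState: ev.data };
    default:
      return state; // ignore event types this sketch does not know about
  }
}

// Made-up payloads for illustration only.
const sample = [
  "event: updates",
  'data: {"agent":{"step":1}}',
  "",
  "event: messages",
  'data: [{"content":"Hel"},{"node":"agent"}]',
  "",
  "event: messages",
  'data: [{"content":"lo"},{"node":"agent"}]',
  "",
  "event: values",
  'data: {"messages":["Hello"]}',
  "",
  "",
].join("\n");

const initialUi: UiState = { text: "", lastUpdate: null, finalState: null };
const finalUi = parseSse(sample).reduce(applyEvent, initialUi);
```

Keeping parsing separate from state folding means the same applyEvent reducer could also consume the decoded event.event and event.data pairs that the SDK examples yield, without the SSE step.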
The previous Python SDK and TypeScript SDK examples stream route-level events. The following React useStream example exposes LangGraph projections such as stream.messages, stream.values, and output state for chat UIs.
For React applications, use the TypeScript SDK’s LangGraph client adapter with @langchain/react:
Do not ship your LangSmith API key to the browser. In production React apps, route requests through your backend with a custom fetch instead of passing apiKey directly.
```tsx
import { Client } from "@langchain/managed-deepagents";
import { useStream } from "@langchain/react";

const agentId = "<agent_id>";

const managedDeepAgents = new Client({
  // In browser apps, prefer passing a custom fetch that calls your backend.
  apiKey: process.env.LANGSMITH_API_KEY,
});

const client = managedDeepAgents.getLangGraphClient({ agentId });

export function ManagedDeepAgentStream() {
  const stream = useStream({
    client,
    assistantId: agentId,
    fetchStateHistory: false,
  });

  return (
    <section>
      <button
        type="button"
        disabled={stream.isLoading}
        onClick={() => {
          void stream.submit({
            messages: [
              { role: "user", content: "Write a short status update." },
            ],
          });
        }}
      >
        Run agent
      </button>
      {stream.messages.map((message, index) => (
        <p key={message.id ?? index}>{String(message.content)}</p>
      ))}
    </section>
  );
}
```
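The warning above recommends a custom fetch that routes requests through your backend. One piece of such a wrapper could be a pure function that rewrites each outgoing request to a same-origin proxy path and drops any credential headers, so the backend attaches the key itself. This is only a sketch: the proxy path /api/agent-proxy, the header names it strips, and the toProxyRequest helper are assumptions, and how the client accepts a custom fetch is not shown here.

```typescript
// Hypothetical path on your own backend that forwards requests and adds the
// API key on the server side.
const PROXY_BASE = "/api/agent-proxy";

interface OutgoingRequest {
  url: string;
  headers: Record<string, string>;
}

// Header names assumed to carry credentials; adjust to what your client sends.
const CREDENTIAL_HEADERS = ["x-api-key", "authorization"];

function toProxyRequest(req: OutgoingRequest): OutgoingRequest {
  // Keep the path and query string, drop the scheme and host.
  const pathAndQuery = req.url.replace(/^[a-z][a-z0-9+.-]*:\/\/[^/]+/i, "") || "/";

  // Copy every header except the ones that could carry a secret.
  const headers: Record<string, string> = {};
  for (const name of Object.keys(req.headers)) {
    if (CREDENTIAL_HEADERS.includes(name.toLowerCase())) continue;
    headers[name] = req.headers[name];
  }

  return { url: PROXY_BASE + pathAndQuery, headers };
}

// Example with a placeholder host; no real endpoint is implied.
const proxied = toProxyRequest({
  url: "https://example.invalid/v1/threads?limit=1",
  headers: { "X-Api-Key": "secret", "Content-Type": "application/json" },
});
```

A custom fetch would apply this rewrite and then delegate to the browser's fetch, so the API key never appears in client-side code or network traffic visible to the user.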
SDK configuration details for the Python and TypeScript clients.
API reference
Route-level request and response details.
If a request fails, confirm that your API key is valid, that the workspace has private preview access, and that you are calling the supported region. For response status codes and error shapes, see the API reference.