API Reference

DO.New

DO.New creates a new SuperDoer instance for task execution, configured with a model, a name, and a purpose. It comes in two versions: a synchronous one (New) and an asynchronous one (A_new).

New

Signature:

New(model: Model, **kwargs): SuperDoer

# Async version
A_new(model: Model, **kwargs): SuperDoer

Creates a new SuperDoer instance for task execution. This synchronous method wraps the asynchronous A_new method for ease of use in synchronous contexts.

Arguments:

model (Model): The model instance to use for task execution.
name (string): A required name for the doer.
purpose (string): A required description of the doer’s purpose.
Additional optional configuration can be provided via **kwargs.

Usage Example:

doer = DO.New(model, name="task_executor", purpose="execute various tasks")

# Async version
doer = await DO.A_new(model, name="task_executor", purpose="execute various tasks")
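
The relationship between New and A_new can be pictured with a small stand-in. This is only a sketch of the general sync-over-async wrapper pattern, assuming the synchronous entry point drives the coroutine with asyncio.run; FakeDoer, a_new_sketch, and new_sketch are hypothetical names and not part of the library.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeDoer:
    """Hypothetical stand-in for SuperDoer; not the library's class."""
    model: str
    name: str
    purpose: str
    options: dict = field(default_factory=dict)


async def a_new_sketch(model: str, *, name: str, purpose: str, **kwargs) -> FakeDoer:
    """Async constructor stand-in: name and purpose are required keywords."""
    await asyncio.sleep(0)  # placeholder for real asynchronous setup work
    return FakeDoer(model=model, name=name, purpose=purpose, options=kwargs)


def new_sketch(model: str, **kwargs) -> FakeDoer:
    """Sync wrapper: runs the coroutine to completion on a fresh event loop."""
    return asyncio.run(a_new_sketch(model, **kwargs))


doer = new_sketch("some-model", name="task_executor", purpose="execute various tasks")
```

With a wrapper built this way, asyncio.run cannot be called from inside an already running event loop, which is one reason to prefer the awaitable version in asynchronous code.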

SuperDoer

SuperDoer is the task executor returned by DO.New. Its methods add provisions (realm), set output constraints (envision), and execute tasks (enact), so that multi-step automation and decision-making workflows can be assembled from a single instance.

realm

Signature:

realm(provisions: List[Provision]) -> SuperDoer

The realm method adds resources or capabilities (provisions) to the SuperDoer instance. It returns a new instance with the specified provisions, for example a browser instance or mcp.run tasks, which extend the functionality available during task execution.

Usage Example:

doer_with_provisions = doer.realm([browser_instance])

envision

Signature:

envision(constraints: dict or PydanticModel, verify?: Callable[[Any], Any]) -> SuperDoer

The envision method allows you to set output constraints that the final answer must meet. Constraints can be specified either as a simple Python dictionary or as a Pydantic model for schema-based validation. In addition, an optional verify function can be provided to perform further custom checks on the final result. If the output fails to meet the constraints or the verification function’s criteria, a ValidationError will be raised.

This mechanism ensures that the results produced by the enact method adhere strictly to the specified format or criteria, providing robust validation for task execution.

Usage Example:

from typing import Optional

from pydantic import BaseModel, Field


class TeamMember(BaseModel):
    """A team member"""

    name: str = Field(description="The name of the person")
    bio: Optional[str] = Field(
        ..., description="A short bio of the person, if known"
    )
    is_founder: bool = Field(
        description="Whether the person is a founder of the company"
    )


class Team(BaseModel):
    """The team"""

    members: list[TeamMember] = Field(description="The team members")


# verify returns True on success, or an error message on failure
doer_with_constraints = doer.envision(
    constraints=Team,
    verify=lambda team: True if len(team.members) > 0 else "Empty team",
)
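
The verify callback in the example returns True on success and an error message otherwise. How the library consumes that return value is not documented here; the following is a minimal sketch of one plausible reading, in which anything other than True is treated as a failure. SketchValidationError and apply_verify are hypothetical names standing in for the library's own ValidationError and internal check.

```python
from typing import Any, Callable, Optional


class SketchValidationError(Exception):
    """Hypothetical stand-in for the ValidationError mentioned above."""


def apply_verify(result: Any, verify: Optional[Callable[[Any], Any]] = None) -> Any:
    """Return result if verify accepts it, otherwise raise.

    Assumed convention: verify returns True on success and anything else
    (typically an error message string) on failure.
    """
    if verify is None:
        return result
    outcome = verify(result)
    if outcome is True:
        return result
    raise SketchValidationError(str(outcome))


def non_empty(members: list) -> Any:
    # Same shape as the lambda in the usage example above.
    return True if len(members) > 0 else "Empty team"


accepted = apply_verify(["Ada"], non_empty)

try:
    apply_verify([], non_empty)
    rejection = None
except SketchValidationError as exc:
    rejection = str(exc)
```

Comparing with `is True` rather than truthiness matters under this reading, because a non-empty error string such as "Empty team" is itself truthy.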

enact

Signature:

enact(task: str, params?: dict[str, Any]) -> Any

The enact method is an asynchronous operation that executes the specified task, integrating the configured model, output constraints from envision, and added provisions from realm. It validates and processes the result according to the defined constraints and returns the final output of the task.

Usage Example:

result = await doer.enact("execute task description")

The SuperDoer supports method chaining, allowing you to successively call realm and envision before executing a task with enact. This design enables flexible and modular workflows for complex automation scenarios.
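
Chaining of this kind usually relies on each configuration method returning an instance rather than mutating in place. The sketch below illustrates that copy-on-configure pattern with a frozen dataclass. ChainSketch is a hypothetical stand-in: the source states that realm returns a new instance, while the assumption that envision does the same is made here for illustration only.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional, Tuple


@dataclass(frozen=True)
class ChainSketch:
    """Hypothetical stand-in showing copy-on-configure chaining."""
    provisions: Tuple[Any, ...] = ()
    constraints: Optional[Any] = None

    def realm(self, provisions: list) -> "ChainSketch":
        # Return a new instance with the extra provisions appended.
        return replace(self, provisions=self.provisions + tuple(provisions))

    def envision(self, constraints: Any) -> "ChainSketch":
        # Return a new instance carrying the output constraints.
        return replace(self, constraints=constraints)


base = ChainSketch()
configured = base.realm(["browser"]).envision({"answer": "str"})
```

Because every call yields a fresh object, base stays unconfigured and can be reused as the starting point for differently configured executors.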

DO.Browse

The WebBrowser interface manages browser sessions and page history, enabling autonomous navigation and interaction with web pages. Use these methods to perform a range of automated browser tasks including navigation, form interaction, cookie and storage management, image and text extraction, and knowledge graph extraction.

Initialization

Signature:

DO.Browse(**kwargs) -> WebBrowser

# Async version
DO.A_browse(**kwargs) -> WebBrowser

Creates a new browser instance for web automation. The synchronous Browse method wraps the asynchronous A_browse method for ease of use in synchronous contexts.

Arguments:

headless (bool, optional): Controls browser visibility. Defaults to True.
chrome_path (str, optional): Path to Chrome executable.
user_data_dir (str, optional): Chrome user profile directory.
channel (str, optional): Browser channel to use. Defaults to "chromium".
screen (dict, optional): Screen dimensions. Example: {"width": 1920, "height": 1080}
bypass_csp (bool, optional): Bypass Content Security Policy. Defaults to False.
Additional configuration can be provided via **kwargs.

Usage Example:

# Basic initialization
browser = DO.Browse()
 
# Initialization with custom configuration
browser = DO.Browse(
    headless=False,
    channel="chrome",
    screen={"width": 1440, "height": 900}
)
 
# Async initialization
browser = await DO.A_browse(headless=False)
 
# Using with Chrome profile
browser = DO.Browse(
    chrome_path="/path/to/chrome",
    user_data_dir="/path/to/profile",
    channel="chrome",
    headless=False,
    args=["--profile-directory=Profile 2"]
)

goto

Signature:

goto(url: string): Promise<void>

Navigates to the specified URL in a new page.

Arguments:

url (string): The URL to navigate to.

Usage Example:

await browser.goto("https://example.com");

annotation

Signature:

annotation(enabled?: boolean): Promise<void>

Toggles visual annotation of elements on the page. When enabled, elements may be highlighted to assist with debugging and analysis.

Arguments:

enabled (boolean, default: true): Determines whether to enable or disable annotation.

Usage Example:

await browser.annotation(); // Enable annotations (enabled defaults to true)
await browser.annotation(false); // Disable annotations

cookies

Signature:

cookies(cookies?: Record<string, string>): Promise<Record<string, string>>

Gets or sets cookies for the current browser context using Playwright’s cookie mechanisms.

Arguments:

cookies (optional, Record<string, string>): If provided, sets the cookies and returns the updated state. If omitted, returns the current cookies.

Usage Example:

// Retrieve current cookies
const currentCookies = await browser.cookies();
 
// Set new cookies
const updatedCookies = await browser.cookies({
  session: "abc123",
  user_id: "12345"
});
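
The cookies method works with a flat name-to-value mapping, whereas Playwright's own cookie calls use a list of cookie objects. How the library bridges the two is not described here; the sketch below shows one way such a conversion could look. The helper names and the url parameter are assumptions; the only Playwright facts relied on are that add_cookies accepts entries with name, value, and url, and that cookies() returns entries containing name and value.

```python
from typing import Dict, Iterable, List, Mapping


def to_playwright_cookies(cookies: Mapping[str, str], url: str) -> List[Dict[str, str]]:
    """Expand a name->value mapping into Playwright-style cookie entries."""
    return [{"name": name, "value": value, "url": url} for name, value in cookies.items()]


def from_playwright_cookies(entries: Iterable[Mapping[str, str]]) -> Dict[str, str]:
    """Collapse Playwright-style cookie entries back into a name->value mapping."""
    return {entry["name"]: entry["value"] for entry in entries}


cookie_map = {"session": "abc123", "user_id": "12345"}
entries = to_playwright_cookies(cookie_map, "https://example.com")
restored = from_playwright_cookies(entries)
```

Collapsing to a flat mapping drops attributes such as domain, path, and expiry, so two cookies with the same name on different domains would overwrite each other in this simplified form.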

storage

Signature:

storage(storageState?: {
  localStorage: Record<string, string>;
  sessionStorage: Record<string, string>;
}): Promise<{
  localStorage: Record<string, string>;
  sessionStorage: Record<string, string>;
}>

Gets or sets the storage state (both localStorage and sessionStorage) for the current page.

Arguments:

storageState (optional): An object with the following structure:
```
{
  localStorage: { key: "value", ... },
  sessionStorage: { key: "value", ... }
}
```
If omitted, the current storage state is returned.

Usage Example:

// Getting current storage state
const state = await browser.storage();
 
// Setting new storage state
await browser.storage({
  localStorage: { user: "jane" },
  sessionStorage: { token: "xyz789" }
});

click

Signature:

click(elementId: number): Promise<void>

Clicks on an element by its identifier. The click action moves the pointer heuristically and performs a mouse click.

Arguments:

elementId (number): The ID of the element to click.

Usage Example:

await browser.click(42);

type

Signature:

type(elementId: number, text: string): Promise<void>

Types the provided text into the specified element.

Arguments:

elementId (number): The ID of an input element.
text (string): The text string to type.

Usage Example:

await browser.type(33, "Hello, world!");

image

Signature:

image(elementId?: number, bbox?: [number, number, number, number], viewport?: boolean): Promise<Buffer>

Captures a screenshot in PNG format. If elementId is provided, captures that element; otherwise captures the entire page, or only the viewport when viewport is true.

Arguments:

elementId (optional, number): The ID of the element to capture. If omitted, the entire page is captured.
bbox (optional, tuple): A tuple [x1, y1, x2, y2] specifying a crop area.
viewport (optional, boolean): If true and elementId is omitted, captures only the viewport.

Returns: PNG image content as a Buffer.

Usage Example:

// Capture the entire page
const pageImage = await browser.image();
 
// Capture a specific element
const elementImage = await browser.image(15);
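
The bbox argument is given as corner coordinates [x1, y1, x2, y2], while Playwright's screenshot clip option expects x, y, width, and height. Whether the library performs this conversion internally is not stated; the helper below, with the hypothetical name bbox_to_clip, is just a sketch of the arithmetic involved.

```python
from typing import Dict, Sequence


def bbox_to_clip(bbox: Sequence[float]) -> Dict[str, float]:
    """Convert [x1, y1, x2, y2] corners into an x/y/width/height rectangle."""
    x1, y1, x2, y2 = bbox
    if x2 <= x1 or y2 <= y1:
        raise ValueError("bbox must satisfy x1 < x2 and y1 < y2")
    return {"x": x1, "y": y1, "width": x2 - x1, "height": y2 - y1}


clip = bbox_to_clip([10, 20, 110, 70])
```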

text

Signature:

text(elementId?: number): Promise<string>

Retrieves text content with interactive elements marked in the format [id@type#subtype].

Arguments:

elementId (optional, number): The ID of the element. If omitted, returns the full page text.

Returns: A string representing the text content.

Usage Example:

const pageText = await browser.text();
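
Since interactive elements are marked as [id@type#subtype], the returned text can be scanned for element IDs to pass to click or type. The sketch below assumes that the id is numeric, that type and subtype consist of word characters or hyphens, and that the #subtype part may be absent; the sample string is invented for illustration and is not real library output.

```python
import re
from typing import List, NamedTuple, Optional


class Marker(NamedTuple):
    element_id: int
    kind: str
    subtype: Optional[str]


# Assumed marker grammar: [<digits>@<type>] or [<digits>@<type>#<subtype>]
MARKER_RE = re.compile(r"\[(\d+)@([\w-]+)(?:#([\w-]+))?\]")


def find_markers(page_text: str) -> List[Marker]:
    """Extract every element marker found in the page text, in order."""
    return [
        Marker(int(m.group(1)), m.group(2), m.group(3))
        for m in MARKER_RE.finditer(page_text)
    ]


sample = "Search: [12@input#text] then press [13@button#submit] or [14@a]"
markers = find_markers(sample)
```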

elements

Signature:

elements(bbox?: [number, number, number, number]): Promise<Record<number, ElementMetadata>>

Retrieves metadata for all elements on the page, optionally filtered by a bounding box.

Arguments:

bbox (optional, tuple): A crop filter specified as [x1, y1, x2, y2].

Returns: An object mapping element IDs to their metadata.

Usage Example:

const elements = await browser.elements();

evaluate

Signature:

evaluate(script: string): Promise<any>

Evaluates the provided JavaScript expression in the context of the current page.

Arguments:

script (string): The JavaScript code to run.

Returns: The result of the evaluated expression.

Usage Example:

const title = await browser.evaluate("document.title");

close

Signature:

close(): Promise<void>

Closes the current page, ending the browser session for that page.

Usage Example:

await browser.close();

state

Signature:

state(): Promise<string>

Retrieves the current state of the page, including interaction history, an overview of page elements, and top entities extracted from the page.

Returns: A string representing the current state of the page.

Usage Example:

const currentState = await browser.state();
console.log(currentState);

analyze

Signature:

analyze(): Promise<string>

Analyzes the current page by performing knowledge graph extraction and entity recognition. The resulting analysis can later be used by the state() method to report richer information about the page.

Usage Example:

const analysis = await browser.analyze();
console.log(analysis);
