Deployments

Why use deployments?

At heart, deployments are simply a way to manage some completion parameters that would traditionally be committed and deployed with the codebase. With the speed at which LLMs are evolving, re-deploying code any time a prompt needs to be adjusted or a model needs to be changed is not sustainable.

The flow is simple:

1. Create completions via the API or the playground MCP tool. AnotherAI separates the completion parameters into static components (the Version) and dynamic components (the Input).

2. Once you are happy with a set of parameters, create a deployment via the MCP tool.

3. Update your code to point to the deployed version and remove the hardcoded parameters.

In other words, change:

const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
        {
            role: "system",
            // Using a template here instead of a string format to allow separating a static system message template and
            // input variables.
            content: `You are an expert on {{ country }}. You are helping a customer traveling to {{ country }}. Answer questions in {{ language }}.`
        },
        { role: "user", content: "Any customs I should be mindful about at the dinner table ?" },
    ],
    temperature: 0.5,
    // Input variables
    input: {
        country: "France",
        language: "English",
    },
    agent_id: "travel-assistant"
});

to:

const completion = await openai.chat.completions.create({
    // model and temperature are stored in the deployment
    model: "anotherai/deployment/travel-assistant:production#1",
    messages: [
        // System message template is stored in the deployment
        // User message is dynamic here so it is not stored in the deployment
        { role: "user", content: "Any customs I should be mindful about at the dinner table ?" },
    ],
    input: {
        country: "France",
        language: "English",
    }
    // agent_id is stored in the deployment
});

Understanding versions and input

To understand deployments, it is important to understand how AnotherAI separates the static (Version) and dynamic (Input) portions of the completion call, and how AnotherAI re-creates the completion call from a Version and Input.

Separating version and input

The rules for separating the Version and Input are simple:

All completion parameters (model, temperature, etc.) besides messages are part of the Version
All messages up to and including the last message that contains templated content are part of the Version (version.prompt)
If no message contains templated content and the first message is a system message, that system message is part of the Version

The input contains the rest:

the input variables
the messages that are not part of the Version

For example, in the following code:

The version contains the first system message but not the user message, since the user message contains no templated content.
The input contains the user message and the input variables.

const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
        {
            role: "system",
            // Using a template here instead of a string format to allow separating a static system message template and
            // input variables.
            content: `You are an expert on {{ country }}. You are helping a customer traveling to {{ country }}. Answer questions in {{ language }}.`
        },
        { role: "user", content: "Any customs I should be mindful about at the dinner table ?" },
    ],
    temperature: 0.5,
    // Input variables
    input: {
        country: "France",
        language: "English",
    },
    agent_id: "travel-assistant"
});
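
The separation rules above can be sketched in TypeScript as follows. This is a minimal illustration, not AnotherAI's actual implementation: the Message, CompletionParams and Split types and the splitCompletion function are hypothetical names, and a message is assumed to be templated when it contains a simple {{ variable }} placeholder.

```typescript
// Hypothetical types for illustration only; not AnotherAI's actual API.
type Message = { role: string; content: string };

interface CompletionParams {
  model: string;
  messages: Message[];
  input?: Record<string, unknown>;
  [param: string]: unknown; // temperature, agent_id, response_format, ...
}

interface Split {
  version: { prompt: Message[]; [param: string]: unknown };
  input: { variables: Record<string, unknown>; messages: Message[] };
}

// Assumption: "templated" means the content holds a {{ variable }} placeholder.
function isTemplated(message: Message): boolean {
  return /\{\{\s*\w+\s*\}\}/.test(message.content);
}

function splitCompletion(params: CompletionParams): Split {
  // Every parameter besides messages (and the input variables) goes to the Version.
  const { messages, input, ...rest } = params;

  // Find the last message that contains templated content.
  let lastTemplated = -1;
  messages.forEach((message, index) => {
    if (isTemplated(message)) lastTemplated = index;
  });

  let promptLength = 0;
  if (lastTemplated >= 0) {
    // Messages up to and including the last templated one form version.prompt.
    promptLength = lastTemplated + 1;
  } else if (messages.length > 0 && messages[0].role === "system") {
    // No template anywhere: only a leading system message joins the Version.
    promptLength = 1;
  }

  return {
    version: { ...rest, prompt: messages.slice(0, promptLength) },
    input: { variables: input ?? {}, messages: messages.slice(promptLength) },
  };
}
```

Applied to the example above, this sketch would place the model, temperature, agent_id and the templated system message in the Version, and the user message and input variables in the Input.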

Compiling a completion call from a version and input

Most provider APIs accept only a list of messages as input, so AnotherAI needs to compile the completion call from the Version and Input.

Building the message list is done in two steps:

1. The message templates that belong to the Version are rendered using the input variables.
2. The messages that belong to the Input are added to the message list.
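
The two steps above can be sketched as follows. This is a simplified illustration under stated assumptions: renderTemplate and compileMessages are hypothetical helper names, templates are assumed to use only simple {{ name }} placeholders, and raising an error on a missing variable is a choice made for this sketch rather than documented behavior.

```typescript
type Message = { role: string; content: string };

// Assumption: only simple {{ name }} placeholders are supported in this sketch.
function renderTemplate(template: string, variables: Record<string, unknown>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, name: string) => {
    if (!Object.prototype.hasOwnProperty.call(variables, name)) {
      // Sketch-only behavior: fail loudly when a variable is missing.
      throw new Error(`Missing input variable: ${name}`);
    }
    return String(variables[name]);
  });
}

function compileMessages(
  prompt: Message[],
  variables: Record<string, unknown>,
  inputMessages: Message[],
): Message[] {
  // Step 1: render the Version's message templates with the input variables.
  const rendered = prompt.map((message) => ({
    role: message.role,
    content: renderTemplate(message.content, variables),
  }));
  // Step 2: append the messages that belong to the Input.
  return rendered.concat(inputMessages);
}
```

The resulting list is what would be sent to the provider, with the remaining Version parameters (model, temperature, and so on) alongside it.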

Response format and Variables schema

A Version also refers to a specific response format and set of variables. That's because both are usually tightly linked to the prompt itself:

the system message often refers to specific fields in the response format
if the prompt is templated, it directly refers to a set of input variables
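
To illustrate how a templated prompt ties a Version to a set of variables, the sketch below lists the variable names a prompt refers to. The templateVariables function is a hypothetical helper, and it assumes simple {{ name }} placeholders only; how AnotherAI actually derives the variables schema is not described here.

```typescript
type Message = { role: string; content: string };

// Collects the distinct variable names referenced by {{ name }} placeholders,
// in order of first appearance.
function templateVariables(prompt: Message[]): string[] {
  const names: string[] = [];
  const pattern = /\{\{\s*(\w+)\s*\}\}/g;
  for (const message of prompt) {
    pattern.lastIndex = 0; // restart the global regex for each message
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(message.content)) !== null) {
      if (names.indexOf(match[1]) === -1) names.push(match[1]);
    }
  }
  return names;
}
```

For the travel assistant prompt used earlier, this would yield the names country and language, which is why changing the template can change the set of variables the calling code must provide.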

Deployments

A deployment is a way to refer to a Version via an alias so that the underlying prompt and parameters can be edited without having to change the codebase.

Continuing the example above, if we deploy the associated version with the alias travel-assistant:production#1, we can use it in the codebase like this:

const completion = await openai.chat.completions.create({
    model: "anotherai/deployment/travel-assistant:production#1",
    messages: [
        { role: "user", content: "Any customs I should be mindful about at the dinner table ?" },
    ],
    input: {
        country: "France",
        language: "English",
    }
});

When AnotherAI receives the request, it retrieves the Version using the alias (travel-assistant:production#1) and follows the steps described above before sending the request to the provider.
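
The lookup step can be pictured roughly as below. This is only a sketch: the prefix is taken from the model strings used in the examples on this page, while deploymentAlias, resolveModel, the Version shape and the in-memory registry are hypothetical and stand in for whatever storage AnotherAI actually uses.

```typescript
// Prefix taken from the model strings used in this page's examples.
const DEPLOYMENT_PREFIX = "anotherai/deployment/";

// Hypothetical, heavily simplified Version shape.
type Version = { model: string; temperature?: number };

// Returns the deployment alias when the model string targets a deployment,
// or null when it is a plain model name.
function deploymentAlias(model: string): string | null {
  if (model.indexOf(DEPLOYMENT_PREFIX) !== 0) return null;
  const alias = model.slice(DEPLOYMENT_PREFIX.length);
  return alias.length > 0 ? alias : null;
}

// Hypothetical registry lookup: alias -> deployed Version.
function resolveModel(model: string, deployments: Record<string, Version>): Version {
  const alias = deploymentAlias(model);
  if (alias === null) return { model }; // not a deployment reference
  const version = deployments[alias];
  if (version === undefined) throw new Error(`Unknown deployment: ${alias}`);
  return version;
}
```

Once the Version is resolved, the request is compiled from the Version and the Input as described in the previous section.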

Deployment (and Version) compatibility

As explained above, it is possible to change the Version that a deployment targets. Without restrictions on the replacement Version, such a change could break production code. For example, consider a deployment whose Version A requires two variables, name and age. If we replaced it with a Version B that requires three variables, name, age and city, existing code that only provides name and age could no longer render the template correctly. The same problem can occur with a response format (aka output schema).

To prevent this, AnotherAI refuses to replace a deployment's Version with one that has an incompatible set of variables or response format. Incompatible means that the two object definitions (JSON schemas) have different properties or property types.

// Consider an agent that extracts the capital of a country given a city
const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
        { role: "user", content: "Give me the capital of the country of {{ city }} ?" },
    ],
    input: {
        city: "Toulouse"
    },
    response_format: {
        type: "json_schema",
        json_schema: {
            name: "capital_response",
            schema: {
                type: "object",
                properties: {
                    capital: { type: "string" }
                }
            }
        }
    }
});

// the following is compatible since it uses the same variable schema and response format
const completion = await openai.chat.completions.create({
    model: "gpt-4.1", // different model
    messages: [
        // A new system message
        { role: "system", content: "You are a geography expert." },
        { role: "user", content: "What is the capital closest to {{ city }} ?" },
    ],
    temperature: 0, // a different temperature
    input: {
        city: "Toulouse"
    },
    response_format: {
        type: "json_schema",
        json_schema: {
            name: "capital_response",
            schema: {
                type: "object",
                properties: {
                    capital: { 
                        type: "string",
                        // Different JSON Schema metadata are not considered as incompatible
                        description: "<city>, <country>",
                        examples: ["Paris, France", "London, United Kingdom"]
                    }
                }
            }
        }
    }
});

// the following is not compatible since it uses a different variable schema
const completion = await openai.chat.completions.create({
    model: "gpt-4.1",
    messages: [
        // city -> country would be a breaking change in code
        { role: "user", content: "Give me the capital of {{ country }} ?" },
    ],
    input: {
        country: "France" // breaking change
    },
    ...
});

That does not mean that the new code cannot be deployed. You should simply create a new deployment instead of replacing the existing one. You can create as many deployments as you want. We suggest a naming convention like <agent-id>:<environment>#<number>.
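
The compatibility rule described in this section (same properties and property types, with metadata ignored) can be sketched as below. The ObjectSchema type and isCompatible function are hypothetical, and the sketch only compares top-level properties of flat object schemas; nested schemas would need a recursive comparison.

```typescript
// Minimal JSON-schema shape for this sketch; real schemas can be nested and richer.
interface ObjectSchema {
  type: "object";
  properties: Record<string, { type: string; [metadata: string]: unknown }>;
}

// Assumption: two schemas are compatible when they declare the same property
// names with the same types. Metadata such as description or examples is ignored.
function isCompatible(a: ObjectSchema, b: ObjectSchema): boolean {
  const aNames = Object.keys(a.properties).sort();
  const bNames = Object.keys(b.properties).sort();
  if (aNames.length !== bNames.length) return false;
  return aNames.every(
    (name, index) =>
      name === bNames[index] && a.properties[name].type === b.properties[name].type,
  );
}
```

With this check, the variable schemas { city } and { country } from the examples above are reported as incompatible, while adding a description or examples to the capital property leaves the response format compatible.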

Reconciling code and deployments

Code can target a deployment and still provide completion parameters. For example, one could write:

const completion = await openai.chat.completions.create({
    model: "anotherai/deployment/travel-assistant:production#1",
    input: {
        country: "France"
    },
    temperature: 0.5, // temperature might be different from the deployment
});

In the above example, we need to decide which temperature should be used, the one from the deployment or the one from the code.

We believe that code should be the source of truth, which means that in the above case the temperature from the code is used. The reconciliation between code and deployments follows these rules:

any completion parameter provided in the code overrides the corresponding deployment parameter
if the override creates a version that is incompatible with the deployment, an error is raised

Consider a deployment travel-assistant:production#1 created with:

model: "gpt-4o"
temperature: 0.5
variables: country: string

// Accepted since the version is compatible with the deployment
const completion = await openai.chat.completions.create({
    model: "anotherai/deployment/travel-assistant:production#1",
    input: {
        country: "France"
    },
    temperature: 1, // temperature 1 is used
    tools: [...] // tools are used
});

// Rejected since the version is incompatible with the deployment
const completion = await openai.chat.completions.create({
    model: "anotherai/deployment/travel-assistant:production#1",
    input: {
        country: "France"
    },
    response_format: {
        type: "json_schema",
        json_schema: ... // response format is incompatible with the deployment
    }
});
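
The two reconciliation rules can be sketched as follows. Everything here is illustrative: Version, Overrides, sameShape and reconcile are hypothetical names, schemas are reduced to a flat map from property name to type, and the error message is invented for the sketch.

```typescript
// Simplified schema: property name -> declared type (illustrative only).
type Properties = Record<string, { type: string }>;

// Hypothetical shape of a deployed Version.
interface Version {
  model: string;
  temperature?: number;
  tools?: unknown[];
  variables: Properties;
  responseFormat?: Properties;
}

// Parameters the calling code may pass next to the deployment reference.
type Overrides = Partial<Omit<Version, "model">>;

// True when both maps declare the same property names with the same types.
function sameShape(a: Properties = {}, b: Properties = {}): boolean {
  const names = Object.keys(a);
  return (
    names.length === Object.keys(b).length &&
    names.every((name) => b[name] !== undefined && a[name].type === b[name].type)
  );
}

function reconcile(deployed: Version, overrides: Overrides): Version {
  // Rule 1: code is the source of truth, so provided parameters win.
  const merged: Version = { ...deployed, ...overrides };
  // Rule 2: reject overrides that change the variables or response format shape.
  if (
    !sameShape(deployed.variables, merged.variables) ||
    !sameShape(deployed.responseFormat, merged.responseFormat)
  ) {
    throw new Error("Override is incompatible with the deployment");
  }
  return merged;
}
```

In this sketch, overriding temperature or adding tools is accepted, while supplying a response format that differs from the deployed one raises an error, mirroring the accepted and rejected calls above.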

Example: Updating code for a new deployment

// Before - using old deployment
const completion = await openai.chat.completions.create({
    model: "anotherai/deployment/travel-assistant:production#1",
    input: {
        country: "France"
    }
});

// After - using new deployment with breaking changes
const completion = await openai.chat.completions.create({
    model: "anotherai/deployment/travel-assistant:production#2", // new deployment
    input: {
        destination: "France", // variable renamed: country -> destination
        traveler_type: "business" // new required variable added
    }
});
