
A thread is a back-and-forth exchange between your generative AI pipeline and an external actor (typically an end user). A prime example is the conversation that occurs between ChatGPT and a user. Typically, a thread is composed of multiple AI generations.

Within our data model, we represent a thread as a singly-linked list of pipeline runs. You attach a run to a thread by referencing the prior pipeline run's ID when generating the next run.
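
This linked-run model can be sketched in TypeScript. This is a conceptual illustration only, not Gentrace's actual schema; the field names here are assumptions:

```typescript
// Sketch of the conceptual model: each pipeline run optionally points back
// at the run that preceded it in the thread.
interface PipelineRun {
  id: string;
  pipelineSlug: string;
  previousRunId?: string; // absent for the first run in a thread
}

// A three-turn thread is a chain of runs linked by previousRunId.
const first: PipelineRun = { id: "run-1", pipelineSlug: "introduction" };
const second: PipelineRun = {
  id: "run-2",
  pipelineSlug: "introduction",
  previousRunId: "run-1",
};
const third: PipelineRun = {
  id: "run-3",
  pipelineSlug: "introduction",
  previousRunId: "run-2",
};
```

Because each run references at most one predecessor, the whole thread can be recovered from the newest run by following the links backwards.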

Example

Let's say you have a chat feature for your end user. The feature initially performs a chat completion request with our SDK.

TypeScript

// Omit initialization

const openai = new OpenAI({
  apiKey: process.env.OPENAI_KEY,
});

// This would normally be populated from an HTTP request
const endUserQuestion = "Hello OpenAI! How are you doing today?";

const chatCompletionResponse = await openai.chat.completions.create({
  messages: [
    {
      role: "user",
      content: endUserQuestion
    },
  ],
  model: "gpt-3.5-turbo",
  pipelineSlug: "introduction",
});

const runId = chatCompletionResponse.pipelineRunId;

// Omit logic to store this run in your desired persistent storage
// Omit logic to return the chat response to the user

Python

# Omit SDK initialization

gentrace.init(
    api_key=os.getenv("GENTRACE_API_KEY"),
)

gentrace.configure_openai()

openai.api_key = os.getenv("OPENAI_KEY")

# This would normally be populated from an HTTP request
end_user_question = "Hello OpenAI! How are you doing today?"

result = openai.ChatCompletion.create(
    pipeline_slug="introduction",
    messages=[
        {
            "role": "user",
            "content": end_user_question
        }
    ],
    model="gpt-3.5-turbo"
)

run_id = result["pipelineRunId"]

# Omit logic to store this run in your desired persistent storage
# Omit logic to return the chat response to the user

Once the request completes and you receive a Gentrace run ID, store that initial pipeline run ID in persistent storage (e.g., a database).
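
One minimal way to keep that mapping is a table keyed by your own conversation ID. The sketch below uses an in-memory `Map` as a stand-in for a real database, and the `saveLatestRunId`/`loadLatestRunId` helpers are hypothetical names, not part of any SDK:

```typescript
// In-memory stand-in for a database table mapping conversation ID -> latest run ID.
const runIdStore = new Map<string, string>();

// Persist the most recent Gentrace run ID for a conversation,
// overwriting any earlier value for that conversation.
function saveLatestRunId(conversationId: string, runId: string): void {
  runIdStore.set(conversationId, runId);
}

// Retrieve the run ID to pass as previousRunId on the next turn,
// or undefined if this conversation has no runs yet.
function loadLatestRunId(conversationId: string): string | undefined {
  return runIdStore.get(conversationId);
}
```

On each turn you would overwrite the stored value with the newest run ID, so the next request always links to the tail of the thread.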

At a later point, your end user decides to respond to the supplied AI generation with a follow-up question.

TypeScript

// Omit initialization

const openai = new OpenAI({
  apiKey: process.env.OPENAI_KEY,
});

// This would normally be populated from an HTTP request
const endUserQuestion = "Great to hear! What's the capital of Maine?";

const previousRunId = ...; // TODO: pull prior run ID from DB
const priorMessages = ...; // TODO: pull prior messages from DB

const chatCompletionResponse = await openai.chat.completions.create({
  messages: [
    ...priorMessages,
    {
      role: "user",
      content: endUserQuestion
    },
  ],
  model: "gpt-3.5-turbo",
  pipelineSlug: "introduction",
  gentrace: {
    // Specifying the previous run ID associates this generation with a thread in Gentrace
    previousRunId,
  }
});

const runId = chatCompletionResponse.pipelineRunId;

// Omit logic to store this next run in your desired persistent storage
// Omit logic to return the chat response to the user

Python

# Omit SDK initialization

gentrace.init(
    api_key=os.getenv("GENTRACE_API_KEY"),
)

gentrace.configure_openai()

openai.api_key = os.getenv("OPENAI_KEY")

previous_run_id = ...  # TODO: pull prior run ID from DB
prior_messages = ...  # TODO: pull prior messages array from DB

# This would normally be populated from an HTTP request
end_user_question = "Great to hear! What's the capital of Maine?"

result = openai.ChatCompletion.create(
    pipeline_slug="introduction",
    messages=[
        *prior_messages,
        {
            "role": "user",
            "content": end_user_question
        }
    ],
    model="gpt-3.5-turbo",
    gentrace={
        # Specifying the previous run ID associates this generation with a thread in Gentrace
        "previousRunId": previous_run_id
    }
)

run_id = result["pipelineRunId"]

# Omit logic to store this run in your desired persistent storage
# Omit logic to return the chat response to the user

Once the generation completes, Gentrace associates it with the previously specified run, linking both runs into a single thread.
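
If you also keep the run records in your own storage, you can reconstruct the full thread locally by walking the `previousRunId` links from the newest run back to the first. This is a sketch under assumed names (`StoredRun`, `reconstructThread` are not SDK functions):

```typescript
// Minimal shape for a run record kept in your own storage.
interface StoredRun {
  id: string;
  previousRunId?: string;
}

// Follow previousRunId links from the newest run back to the thread's start,
// then reverse so the result reads oldest-first.
function reconstructThread(
  runs: Map<string, StoredRun>,
  newestRunId: string
): StoredRun[] {
  const chain: StoredRun[] = [];
  let current: StoredRun | undefined = runs.get(newestRunId);
  while (current) {
    chain.push(current);
    current = current.previousRunId
      ? runs.get(current.previousRunId)
      : undefined;
  }
  return chain.reverse();
}
```

Traversal is linear in the thread length, which mirrors the singly-linked structure Gentrace uses for its own rollup.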

UI

You can view threads in the Observe → Runs view. Runs that belong to the same thread are rolled up into a single row.

By tagging the generation with `previousRunId`, the whole thread is rolled up into a single row in the UI

While in the detailed view for the thread, navigate through the individual runs with the left/right arrow keys.

Navigate through the runs with the arrow keys

You can also navigate through individual runs in the timeline view.

Limitations

  • This feature is available only for our observability features.
  • Threading supports only singly-linked runs. We do not yet support branching, where multiple generations reference the same prior run. If two runs are submitted with the same `previousRunId`, one of the submissions is rejected with a 400 status code.

Future work

  • Thread evaluation
  • Visual thread comparison
  • Aggregate statistics (e.g. average run latency)
  • Thread metadata

If you're interested in shaping this future work, please reach out over email.