
Composable AI Pipelines: Write Intent, Not Infrastructure

Tech Note • Artificial Intelligence

Introduction

AI product features routinely embed scheduling, model selection, and retry logic directly into business code. This couples what a workflow does to how it runs, creating fragility as runtime capabilities evolve.

Composable Pipelines separates those concerns by design. Developers express intent in a SwiftUI-inspired Swift DSL — property wrappers, result builders, composable value types — making the authoring model immediately familiar to any Swift developer. Our proprietary runtime handles dependency analysis, parallel batching, model ranking, and re-execution mechanics. The developer-facing surface stays stable while the execution layer improves.

This note describes that boundary: the DSL contract, the compilation and execution model, and the open questions that remain.

Motivation

Most AI feature code today is a mix of business logic and infrastructure: loops that retry, branching that routes between models, glue that parallelizes independent calls. That works at small scale, but it accumulates fast. Every new feature re-solves the same scheduling problems, and every runtime improvement requires touching product code.

Our design target is a clean separation:

  • developers write intent in a DSL,
  • the compiler and executor decide scheduling and model routing,
  • local-first execution is a first-class mode, not an afterthought.

This note describes the current architectural direction. Expect breaking changes as we iterate.

System Model

The pipeline lifecycle follows four stages:

flowchart LR
    DSL["DSL"] --> AST["AST"]
    AST --> Compiler["Compiler"]
    Compiler --> Executor["Executor"]

DSL

Swift-native declarative authoring via @PipelineBuilder, @State, and composable primitives: Model, Guardrail, While, Group, ForEach, ClientTask. The construction is intentionally SwiftUI-inspired, so the authoring model should feel familiar to any Swift developer who has built a view hierarchy. Developers express intent; the DSL makes that intent inspectable by the compiler.
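
To make "inspectable intent" concrete, here is a minimal sketch of how a Swift result builder can turn a declarative body into plain data that a later stage can analyze. StepSketch, StepListBuilder, and describe are hypothetical names for illustration only; they are not the PipelineDSL types, and the real builder is likely richer.

```swift
// Hypothetical sketch: these types are NOT the PipelineDSL API.
// A step is plain data, so a later stage can inspect it without running it.
struct StepSketch: Equatable {
    let kind: String
    let instructions: String
}

@resultBuilder
enum StepListBuilder {
    // Each expression statement becomes a one-element list.
    static func buildExpression(_ step: StepSketch) -> [StepSketch] { [step] }
    // Statements in a block are concatenated in source order.
    static func buildBlock(_ parts: [StepSketch]...) -> [StepSketch] { parts.flatMap { $0 } }
    // if/else keeps whichever branch was taken.
    static func buildEither(first: [StepSketch]) -> [StepSketch] { first }
    static func buildEither(second: [StepSketch]) -> [StepSketch] { second }
}

// Evaluates a declarative body and returns the collected steps.
func describe(@StepListBuilder _ body: () -> [StepSketch]) -> [StepSketch] { body() }

let blocked = false
let steps = describe {
    StepSketch(kind: "model", instructions: "Summarize user intent")
    if blocked {
        StepSketch(kind: "model", instructions: "Explain refusal")
    } else {
        StepSketch(kind: "model", instructions: "Write a short helpful reply")
    }
}
// `steps` now holds two values in source order; nothing has executed.
```

The point of the sketch is the design choice: because the body produces values rather than side effects, a compiler can read dependencies off the result before any model is called.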

AST

Pipelines lower to PipelineGraph—a codable graph representation that serves as the stable transport boundary between authoring and compilation.
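
As an illustration of what a codable transport boundary can look like, the sketch below round-trips a small graph through JSON. NodeSketch, GraphSketch, and roundTrip are assumptions made for this example; the actual PipelineGraph schema and wire encoding are not shown in this note.

```swift
import Foundation

// Hypothetical node and graph shapes; the real PipelineGraph schema may differ.
struct NodeSketch: Codable, Equatable {
    let id: Int
    let kind: String        // e.g. "model" or "clientTask"
    let reads: [String]     // state slots the node reads
    let writes: [String]    // state slots the node writes
}

struct GraphSketch: Codable, Equatable {
    let nodes: [NodeSketch]
}

// Encodes on the authoring side and decodes on the compilation side.
func roundTrip(_ graph: GraphSketch) throws -> GraphSketch {
    let payload = try JSONEncoder().encode(graph)
    return try JSONDecoder().decode(GraphSketch.self, from: payload)
}

let graph = GraphSketch(nodes: [
    NodeSketch(id: 0, kind: "model", reads: [], writes: ["summary"]),
    NodeSketch(id: 1, kind: "model", reads: ["summary"], writes: []),
])
let received = try? roundTrip(graph)
```

Because both sides agree only on the encoded shape, the authoring package and the compiler can evolve independently as long as that shape stays stable.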

Compiler

PipelineCompiler transforms the AST into a PipelineExecutionGraph by performing dependency analysis, slot hazard ordering (read-after-write, write-after-read, and write-after-write, abbreviated RAW / WAR / WAW), gate and barrier edge construction, and topological grouping for safe parallel batches.
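
The hazard ordering and topological grouping described above can be sketched in a few lines. This is not the PipelineCompiler implementation: StepAccess, conflicts, and parallelBatches are hypothetical, gate and barrier edges are left out, and the grouping is a simple longest-path levelization over steps listed in program order.

```swift
// Hypothetical step shape: each step declares the state slots it reads and writes.
struct StepAccess {
    let name: String
    let reads: Set<String>
    let writes: Set<String>
}

/// True when `later` must wait for `earlier` (earlier comes first in program order).
func conflicts(_ earlier: StepAccess, _ later: StepAccess) -> Bool {
    let raw = !earlier.writes.isDisjoint(with: later.reads)   // read-after-write
    let war = !earlier.reads.isDisjoint(with: later.writes)   // write-after-read
    let waw = !earlier.writes.isDisjoint(with: later.writes)  // write-after-write
    return raw || war || waw
}

/// Places each step in the earliest batch that follows all conflicting predecessors.
func parallelBatches(_ steps: [StepAccess]) -> [[String]] {
    var level = [Int](repeating: 0, count: steps.count)
    for j in steps.indices {
        for i in 0..<j where conflicts(steps[i], steps[j]) {
            level[j] = max(level[j], level[i] + 1)
        }
    }
    let depth = (level.max() ?? -1) + 1
    var batches = [[String]](repeating: [], count: depth)
    for (index, step) in steps.enumerated() {
        batches[level[index]].append(step.name)
    }
    return batches
}

// Three independent writers followed by one reader of all three slots.
let batches = parallelBatches([
    StepAccess(name: "intent", reads: [], writes: ["intent"]),
    StepAccess(name: "sentiment", reads: [], writes: ["sentiment"]),
    StepAccess(name: "risk", reads: [], writes: ["risk"]),
    StepAccess(name: "synthesize", reads: ["intent", "sentiment", "risk"], writes: []),
])
// batches == [["intent", "sentiment", "risk"], ["synthesize"]]
```

Steps within one batch have no hazards between them, so they are safe to run concurrently; each batch boundary acts as an implicit barrier.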

Executor

ExecutionEngine evaluates compiled graphs with deterministic walk semantics, state update commits, parallel branch batching, and graph replacement when committed state triggers re-execution.

Re-execution and Epoch Resolution

When a model step completes and writes to state, the pipeline may need to continue along a different path. The naive solution—re-run everything from the top—is correct but wasteful. The obvious alternative—assign stable identities to tasks and skip the ones already done—sounds reasonable until you try to implement it. Task identity in a dynamic graph turns out to be fragile: two executions of the same pipeline body can produce structurally similar but semantically different graphs, and there is no reliable anchor for "this task in pass two is the same task as in pass one."

The underlying tension comes from how pipelines lower to graphs. Developers read the DSL as conventional code: steps flow top to bottom, state accumulates, branches resolve. But the executor never sees the full structure at once. Because the graph re-lowers only when committed state changes, the executor receives one DAG slice per pass — a partial view of a pipeline that may branch or extend itself based on values it has not yet produced.

This asymmetry pointed us toward a different invariant: instead of tracking task identity, guarantee that the prefix of each re-lowered graph is identical to the prefix of the previous one. If that holds, the executor can skip the prefix entirely without knowing anything about individual tasks — positional order is enough.

Enforcing that guarantee is where epoch-based time travel comes in. Each committed state write advances a monotonic epoch counter. When the DSL body re-evaluates for a new pass, @State reads are answered using the slot history as it stood at the epoch recorded by the previous pass — not the current live values. This means the re-lowered graph replays the same decisions the first pass made for every step that was already executed, producing a structurally identical prefix. Only the suffix, where new state values change the outcome, diverges.
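
A minimal sketch of epoch-indexed reads, assuming each slot keeps an append-only history of committed writes. SlotHistory and its methods are hypothetical; the runtime's actual storage and epoch bookkeeping are not described in this note.

```swift
// Hypothetical slot history: every committed write is stored with its epoch.
struct SlotHistory<Value> {
    private var entries: [(epoch: Int, value: Value)]

    init(initial: Value) {
        entries = [(epoch: 0, value: initial)]
    }

    /// Records a committed write under a strictly increasing epoch.
    mutating func commit(_ value: Value, at epoch: Int) {
        precondition(epoch > entries[entries.count - 1].epoch, "epochs must increase")
        entries.append((epoch: epoch, value: value))
    }

    /// Returns the value as it stood at `epoch`: the latest write at or before it.
    func value(at epoch: Int) -> Value {
        // entries[0] has epoch 0, so a match always exists for epoch >= 0.
        entries.last(where: { $0.epoch <= epoch })!.value
    }
}

var summaryHistory = SlotHistory(initial: "")
summaryHistory.commit("draft", at: 1)
summaryHistory.commit("final", at: 3)

// A replayed step that originally ran at epoch 2 still sees "draft",
// so it makes the same decision it made in the earlier pass.
let replayed = summaryHistory.value(at: 2)
let live = summaryHistory.value(at: 3)
```

Answering reads from history rather than from live values is what keeps the replayed prefix structurally identical across passes.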

At the boundary between passes, the engine produces an ExecutionCursor carrying the latest epoch and a walk offset (the count of surface tasks allocated so far). The client trims the re-lowered graph by that offset and hands only the suffix back to the executor. The prefix is never re-run, slot values from prior passes remain available in place, and execution continues from exactly where the graph changed.
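
The trimming step can be sketched as follows, modeling a lowered graph as an ordered list of task names. CursorSketch and both helper functions are hypothetical: the note names ExecutionCursor but does not show its fields, and a real graph carries edges as well as order.

```swift
// Hypothetical cursor shape standing in for ExecutionCursor.
struct CursorSketch {
    let epoch: Int        // latest committed epoch
    let walkOffset: Int   // count of tasks already walked in earlier passes
}

/// Returns only the not-yet-executed suffix of a re-lowered task list.
/// Relies on the prefix guarantee: the first `walkOffset` tasks match the previous pass.
func pendingSuffix(of relowered: [String], after cursor: CursorSketch) -> [String] {
    Array(relowered.dropFirst(cursor.walkOffset))
}

/// Debug-style check that the prefix guarantee actually holds between two passes.
func prefixHolds(previous: [String], relowered: [String], cursor: CursorSketch) -> Bool {
    previous.prefix(cursor.walkOffset)
        .elementsEqual(relowered.prefix(cursor.walkOffset))
}

let passOne = ["summarize", "plan"]
let cursor = CursorSketch(epoch: 2, walkOffset: 2)

// After state changes, the body re-lowers to a longer graph with the same prefix.
let passTwo = ["summarize", "plan", "callTool", "reply"]
let pending = pendingSuffix(of: passTwo, after: cursor)
// pending == ["callTool", "reply"]
```

Note that the trim compares nothing task by task; it depends only on position, which is why the prefix guarantee has to be enforced upstream.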

The result is incremental re-execution with a provable prefix guarantee: a pipeline can iterate, branch differently, and extend itself in response to model output, while the engine avoids redundant work without needing to track or compare individual tasks across passes.

Developer Contract

The contract separates what product teams own from what the runtime owns.

Developers control: pipeline business logic, control flow and state transitions, task intent via model hints, and tool usage inside client tasks.

Our runtime controls: scheduling and batching strategy, parallel safety and hazard ordering, model candidate ranking and selection, and execution bookkeeping.

This separation is deliberate. The DSL stays stable and readable; the runtime keeps improving without forcing product teams to rewrite logic.

Product boundary: PipelineDSL syntax and semantics are the developer-facing contract. The compiler and executor are proprietary.

Examples

The snippets below are intentionally compact and illustrate contract shape, not full production implementations.

Intent-First Composition

The simplest form: a two-step pipeline where the reply step depends on a prior summarization.

import PipelineDSL

struct SupportReply: Pipeline {
    typealias Output = String

    let userMessage: String
    @State var summary: String = ""

    var body: some Pipeline {
        $summary.set {
            Model(
                instructions: "Summarize user intent",
                hints: [.fastResponse, .instruct],
                input: userMessage
            )
        }

        Model(
            instructions: "Write a short helpful reply",
            hints: [.reasoning, .streaming],
            input: summary
        )
    }
}

Hints describe intent, not a hardcoded model ID. The runtime selects a candidate at execution time.

Parallel Prep + Guarded Exit

Independent branches can batch in parallel when optimization is enabled. A guardrail check allows an early return before the expensive synthesis step.

import PipelineDSL

struct AnalyzeMessage: Pipeline {
    typealias Output = String

    let message: String
    @State var intent: String = ""
    @State var sentiment: String = ""
    @State var risk: Bool = false

    var body: some Pipeline {
        $intent.set {
            Model(
                instructions: "Classify intent",
                hints: [.fastResponse, .instruct],
                input: message
            )
        }
        $sentiment.set {
            Model(
                instructions: "Classify sentiment",
                hints: [.fastResponse, .instruct],
                input: message
            )
        }
        $risk.set {
            Model(
                instructions: "Reply true if the message contains PII, false otherwise",
                hints: [.fastResponse, .instruct],
                input: message
            )
        }

        if risk {
            Self.return("Request blocked by guardrails.")
        } else {
            Model(
                instructions: "Synthesize final response",
                hints: [.reasoning, .streaming],
                input: """
                message=\(message)
                intent=\(intent)
                sentiment=\(sentiment)
                """
            )
        }
    }
}

The diagram below illustrates how this pipeline actually executes at runtime — the compiler detects that the three state-writes have no data dependencies between them and groups them into a parallel batch.

flowchart TD
    Start([Pipeline Start]) --> Compiler["Compiler: detects independent branches"]
    Compiler --> I & S & R
    subgraph Parallel ["Parallel Batch"]
        I["Model: Classify intent → $intent"]
        S["Model: Classify sentiment → $sentiment"]
        R["Model: PII check → $risk"]
    end
    I & S & R --> Gate["Barrier: all branches complete"]
    Gate --> Cond{risk?}
    Cond -- yes --> Blocked([Return: request blocked])
    Cond -- no --> Synth["Model: Synthesize final response"]
    Synth --> Output([Output])

Agent Loop Shape

Iterative tool-call loops remain declarative. The While block drives refinement without exposing scheduling machinery to product code.

import PipelineDSL

struct AgentTurnLite: Pipeline {
    typealias Output = String

    @State var conversation: String
    @State var reply: String = ""

    var body: some Pipeline {
        While { reply.isEmpty } body: {
            let modelOutput = Model(
                instructions: "Return either final reply or tool plan",
                hints: [.reasoning, .streaming],
                input: conversation
            )
            // If tool plan → run ClientTask → append tool result to conversation
            // Else set reply
        }
        reply
    }
}

Pipeline Composition

Pipelines compose. A small reusable sub-pipeline can be embedded into any larger product flow without duplicating logic.

import PipelineDSL

struct TranslateAndSummarize: Pipeline {
    typealias Output = String

    let text: String
    @State var translated: String = ""

    var body: some Pipeline {
        $translated.set {
            Model(
                instructions: "Translate to English",
                hints: [.fastResponse, .instruct],
                input: text
            )
        }

        Model(
            instructions: "Summarize in two sentences",
            hints: [.reasoning],
            input: translated
        )
    }
}

struct ResearchBrief: Pipeline {
    typealias Output = String

    let articles: [String]
    @State var summary: String = ""

    var body: some Pipeline {
        ForEach(in: { articles }) { article in
            $summary.set { TranslateAndSummarize(text: article) }
        }

        Model(
            instructions: "Synthesize a research brief from the collected summaries",
            hints: [.reasoning, .streaming],
            input: summary
        )
    }
}

Model Hints and Selection

Model hints are soft runtime signals, not hardcoded model references. The current vocabulary includes fastResponse, instruct, reasoning, and streaming.

Developers express workload intent. The runtime ranks available model candidates and selects one at execution time. Pipeline logic is therefore not coupled to a single vendor or model ID by default—this is a deliberate core design decision, not an incidental property.

Local-First Thesis

We believe local execution will become the default for many production AI workflows.

The reasoning is practical: the privacy boundary stays near user data; the security posture is easier to reason about; the cost profile is more predictable; and product teams retain more freedom from external provider economics.

Platform Architecture

The same pipeline architecture runs on both macOS and iOS. The developer-facing API is identical across platforms — what changes is only how the runtime is deployed.

macOS: Out-of-Process Agent

On macOS, we ship a system service — the Elix Agent — that runs as a launch agent and owns execution. Client apps link against ElixClient, which lowers the DSL to a PipelineGraph and sends it to the agent over XPC. The agent compiles the graph, runs it through ExecutionEngine, and calls back to the client whenever pipeline state changes require re-emitting the body or executing a ClientTask closure.

Client app
    │  ElixClient.execute(pipeline:)
    │  lower DSL → PipelineGraph
    ▼
ElixAgentRuntime  (XPC service)
    │  compile PipelineGraph → PipelineExecutionGraph
    │  ExecutionEngine walks graph
    │  → calls back for ClientTask and state re-emission
    ▼
Result returned to client

This split has real benefits beyond privacy. The agent can serve multiple client processes, enforce load-balancing per process key, and validate client code signatures and entitlements before accepting any work. The agent binary is distributed and updated independently from client apps — ElixAgentBootstrap handles resolving, downloading, signature-verifying, and installing the correct agent version before a client connects.

The XPC contract is intentionally small: RunPipeline from client to agent, and HandlePipelineStateUpdated / ProvideUpdateGraph / ExecuteClientTask from agent back to client. No broader execution surface is exposed.

iOS: Embedded Runtime

On iOS, a long-running out-of-process agent is not a viable pattern for third-party apps. Rather than adopting a different architecture, we embed the same runtime directly inside the client process. ElixClient switches to a local implementation that runs ExecutionEngine in-process, with the same compilation path and the same re-execution semantics.

The pipeline code is identical. No conditional compilation, no platform-specific branching in product code. The difference is entirely in how ElixClient is backed at the deployment layer.

Cloud Compatibility

Cloud execution remains important for burst capacity and large shared infrastructure. Batching economics at cloud scale can be significant, and our pipeline model targets both environments with the same developer contract: write DSL intent once, let the runtime choose execution strategy.

Clients plug in remote models by implementing the ModelProvidingHeap protocol and advertising a [RemoteModelDescriptor] catalog. Each descriptor declares a model ID and the ModelSelectionHints it advertises — the same hint vocabulary used in the DSL (fastResponse, reasoning, streaming, etc.). When the executor encounters a Model step, it ranks descriptors by hint-intersection score and invokes the winning entry. Clients separately provide a closure that maps descriptor IDs to ModelConfiguration values holding the endpoint URL and credentials — keeping connection details out of the authoring layer entirely.

let catalog: [RemoteModelDescriptor] = [
    RemoteModelDescriptor(id: "fast-cloud", advertisedHints: [.fastResponse, .instruct]),
    RemoteModelDescriptor(id: "reasoning-cloud", advertisedHints: [.reasoning, .streaming]),
]

let config = ElixClientConfiguration(
    modelCatalog: catalog,
    modelConfigurationForDescriptor: { id in
        switch id {
        case "fast-cloud":      return ModelConfiguration(modelID: id, baseURL: "...", apiKey: "...")
        case "reasoning-cloud": return ModelConfiguration(modelID: id, baseURL: "...", apiKey: "...")
        default: return nil
        }
    }
)

For teams already running OpenAI-compatible inference (Ollama, vLLM, Groq, and similar), the built-in .openAICompatibleHTTP endpoint type covers the transport layer without requiring a custom ModelProvidingHeap implementation. Both approaches coexist in the same catalog, so a pipeline can route fast classification steps to a local model and reasoning steps to a remote one — driven entirely by hints, with no routing logic in pipeline code.
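
Hint-driven routing can be illustrated with a small ranking function. This is a sketch under assumptions, not the runtime's selection logic: HintSketch, DescriptorSketch, and selectModel are stand-ins, the score is simply the size of the hint intersection, and ties are broken by catalog order. The proprietary ranking may weigh other signals.

```swift
// Hypothetical stand-ins for the hint vocabulary and RemoteModelDescriptor.
enum HintSketch: Hashable { case fastResponse, instruct, reasoning, streaming }

struct DescriptorSketch {
    let id: String
    let advertisedHints: Set<HintSketch>
}

/// Picks the descriptor sharing the most hints with the request.
/// Ties keep catalog order; returns nil when nothing overlaps.
func selectModel(for requested: Set<HintSketch>,
                 from catalog: [DescriptorSketch]) -> DescriptorSketch? {
    var best: (descriptor: DescriptorSketch, score: Int)?
    for descriptor in catalog {
        let score = descriptor.advertisedHints.intersection(requested).count
        if score > (best?.score ?? 0) {
            best = (descriptor: descriptor, score: score)
        }
    }
    return best?.descriptor
}

let sketchCatalog = [
    DescriptorSketch(id: "fast-cloud", advertisedHints: [.fastResponse, .instruct]),
    DescriptorSketch(id: "reasoning-cloud", advertisedHints: [.reasoning, .streaming]),
]

let classifier = selectModel(for: [.fastResponse, .instruct], from: sketchCatalog)
let synthesizer = selectModel(for: [.reasoning, .streaming], from: sketchCatalog)
// classifier?.id == "fast-cloud", synthesizer?.id == "reasoning-cloud"
```

Pipeline code never names either model; changing the catalog changes the routing without touching the DSL.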

Client Authority and Permissions

Every action that touches the outside world — file access, network calls, keychain reads, user prompts, API calls — runs inside a ClientTask closure on the client side. The executor never crosses that boundary directly. It schedules a ClientTask step, hands control back to the client process, waits for the result, and continues.

This means the client is the sole gatekeeper between the execution engine and the environment. Permissions, entitlements, secrets, and user-facing authorization dialogs all live in client code, not in pipeline graph definitions. A pipeline that calls a calendar API, reads a file, or requests a user confirmation does so through a ClientTask that the client registers and controls. The DSL describes that a task should happen; the client decides whether and how.

ClientTask(
    input: query,
    action: { encodedQuery in
        // Running in client process: full access to keychain, entitlements, UI
        let token = try Keychain.read("calendar-api-token")
        let result = try await CalendarAPI(token: token).search(encodedQuery)
        return try JSONEncoder().encode(result)
    }
)

The practical consequence is a clean capability model: the executor is stateless with respect to user secrets and system access. It cannot exfiltrate credentials, trigger permission dialogs, or access protected resources on its own. All of that capability is owned and brokered by the client, which can enforce its own audit, rate-limiting, or user-consent policies around each task registration.

On macOS, this boundary is also a process boundary — ClientTask callbacks cross XPC back to the originating client, so the agent never holds secrets in its process space at all. On iOS, where execution is in-process, the same logical boundary holds: task closures are registered by client code and remain under client control.
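
The capability model above can be sketched as a client-side registry: the executor side holds only a task identifier and opaque bytes, while the closure and anything it captures stay with the client. ClientTaskRegistry, TaskError, and the task ID are hypothetical names for this example, and the XPC hop is replaced by a direct call.

```swift
import Foundation

enum TaskError: Error { case notRegistered, unauthorized }

// Hypothetical sketch of the boundary: the executor sees task IDs and Data only.
typealias TaskHandler = (Data) throws -> Data

final class ClientTaskRegistry {
    private var handlers: [String: TaskHandler] = [:]

    /// Client code registers the closure; captured secrets stay on the client side.
    func register(_ id: String, handler: @escaping TaskHandler) {
        handlers[id] = handler
    }

    /// Called when the executor schedules a ClientTask step. Unknown IDs are refused.
    func execute(_ id: String, input: Data) throws -> Data {
        guard let handler = handlers[id] else { throw TaskError.notRegistered }
        return try handler(input)
    }
}

let registry = ClientTaskRegistry()
let apiToken = "example-token"   // placeholder; a real client would read the keychain

registry.register("calendar.search") { query in
    // The token is used here and never handed to the executor side.
    guard !apiToken.isEmpty else { throw TaskError.unauthorized }
    let text = String(decoding: query, as: UTF8.self)
    return Data("results for \(text)".utf8)
}

let reply = try? registry.execute("calendar.search", input: Data("standup".utf8))
let refused = try? registry.execute("files.read", input: Data())
// reply decodes to "results for standup"; refused is nil because nothing was registered.
```

Because the only way into the environment is through a registered handler, the client can wrap `execute` with auditing, rate limiting, or a consent prompt without changing any pipeline.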

Open-Source Plan

We intend to open-source the PipelineDSL package — the authoring layer developers write against — while keeping the compiler and executor proprietary.

What becomes public: PipelineDSL syntax, the Pipeline protocol, all composable primitives (Model, Guardrail, While, Group, ClientTask, ForEach), @State and @PipelineBuilder, and the PipelineGraph AST representation. This gives developers a stable, inspectable surface they can build product code against, share pipelines, and contribute to tooling.

What stays closed: PipelineCompiler, ExecutionEngine, model ranking and selection logic, scheduling heuristics, and the re-execution mechanics. These are where our runtime differentiation lives.

The rationale is straightforward: an open authoring surface makes pipelines portable and testable in isolation, and lowers the barrier for early adoption. A closed execution layer lets us continue optimizing scheduling, batching, and model selection without exposing proprietary logic or locking the DSL contract to implementation details.

This is the same pattern as many successful platform designs — open interface, differentiated engine. Developers never need to know which execution path their pipeline takes; they just write intent.

@online{kotliar-2026-composable-ai-pipelines,
  author = {Maksym Kotliar},
  title = {Composable AI Pipelines: Write Intent, Not Infrastructure},
  note = {\emph{Online.} \url{https://research.macpaw.com/publications/composable-ai-pipelines}},
  month = {May},
  year = {2026},
}
