Zero QA Architecture

What Users Need to Know

From a user's point of view, Zero QA is simple:

you provide suites in business language
you define suite dependencies in zero-qa.yaml
you can provide an ordered initialization script list in zero-qa.yaml
Zero QA runs executors through an Agent runtime such as Codex or Claude Code
zero-qa ui can expose the shared workspace through a local read-only dashboard
mobile and browser connections are already available out of the box

How Zero QA Works

mermaid

flowchart TD
    subgraph O[Only Once]
        A[Business Project<br/>zero-qa.yaml] --> B

        subgraph B["Scenario + Target Suites"]
            T1["Suite A (Usage doc)<br/>how to use Feature A<br/>e.g. log in"]
            T2["Suite B (Usage doc)<br/>how to use Feature B<br/>e.g. browse products"]
            T3["Suite Z (Usage doc)<br/>how to use Feature Z<br/>e.g. place an order"]
            T1 -->|"prerequisite of"| T2
            T2 -->|"prerequisite of"| T3
        end

        C[DAG Planning<br/>validate + topological order]
        D[Executor Preflight<br/>executor.yaml + host tools]
        E[Project Init Scripts<br/>pre_test_scripts]
        R[Runtime Inputs Resolution<br/>scenario vars + resolve commands]

        B --> C
        C --> D
        D --> E
        E --> R
    end

    subgraph L[Loop Suites]
        F[One Planned Suite]
        G[Per-Suite Workspace]
        H[Per-Suite Execution<br/>Agent runtime + executor skill]
        I[Android or iPhone / Browser]
        J[Suite Result<br/>PASS or FAILED]
        W[Shared Run Workspace<br/>run_meta.json + suites/* + results/summary.json]

        F --> G
        G --> H
        H --> I
        I --> H
        H --> J
        G --> W
        J --> W
    end

    subgraph N[Run Finalization]
        M[Project Finalization Scripts<br/>post_test_scripts]
        K[Run Summary<br/>selected suites + durations]
        P[Post-run Hooks<br/>best-effort cleanup]
        M --> K
        K --> P
        K --> W
    end

    subgraph U[Read-only UI]
        Q[Workspace Reader<br/>filesystem scanner + status inference]
        S[JSON API + HTML Pages<br/>FastAPI + Jinja2]
        T[SSE Refresh Stream<br/>watchfiles]
        V[Browser Dashboard<br/>projects, runs, DAG, suite detail]

        Q --> S
        Q --> T
        S --> V
        T --> V
    end

    R --> F
    J --> M
    W --> Q

This is the intended meaning:

start from Business Project / zero-qa.yaml
Target Suites means the suite set that will actually run: user-selected suites plus automatically expanded upstream dependencies
Scenario + Target Suites, DAG Planning, Executor Preflight, and Project Init Scripts are run setup stages and happen once before the suite loop
Runtime Inputs Resolution runs once before the first suite workspace is created
Project Finalization Scripts, Run Summary, and Post-run Hooks are run finalization stages and happen once after the suite loop finishes
Per-Suite Workspace, Per-Suite Execution, and Suite Result happen once for each suite in the planned order
Shared Run Workspace is the contract boundary between zero-qa run and zero-qa ui
Zero QA parses the scenario, expands upstream dependencies when users select specific suites, and derives a topological execution order
a suite focuses on business steps, such as how to use Feature A through Feature Z
in practice, a suite is often the usage document for that feature
a suite does not include phone commands or browser commands
Executor Preflight loads executor metadata from executor.yaml, checks host tools before any suite execution starts, installs missing tools only when allowed, and re-checks afterward
Project Init Scripts runs once before suite execution starts and fails fast on the first script error
Runtime Inputs Resolution collects executor-scoped scenario variables, runs executor-declared resolve commands, and builds the concrete values rendered into copied skill files
Project Finalization Scripts runs post_test_scripts best-effort before the run summary is written, even when suite execution, pre-test scripts, host-tool checks, or runtime input resolution already failed
Post-run Hooks runs executor-declared best-effort cleanup commands after the run summary is written, preserving the main run result even when cleanup fails
each suite then runs in an isolated workspace with only the current suite and copied executor skill directories
Per-Suite Execution runs the Agent runtime and selected executor skill inside that suite workspace
the shared workspace persists run_meta.json, per-suite result.json, evidence files, and the final results/summary.json
run_meta.json now carries dag_nodes and dag_edges, so the UI can render the execution graph without importing planner code
Workspace Reader scans the shared workspace, infers in-progress suite states from directory and file presence, and fails fast when workspace data is inconsistent
JSON API + HTML Pages expose project history, run detail, and suite detail through one local FastAPI app
SSE Refresh Stream watches one run directory and tells the browser to refresh while a run is still in progress
suite results stay minimal, while run-level summary data tracks selected suites and stage durations
executors already know how to interact with Android or iPhone and Browser

In many cases, a suite is effectively the user document for one feature:

for Mobile App flows, it describes how to use the app
for Web flows, it describes how to use the website
once that usage document exists, Zero QA can use it as the testing input
an Agent can also generate the usage document or suites automatically by reading code

Minimal Demo

You can look at the minimal demo directly in examples/minimal_ecommerce_demo/zero-qa.yaml.

It shows the three things users usually need to prepare:

the minimal zero-qa.yaml
the minimal suite documents
the dependency relationship between suites

Run-level executor selection:

use zero-qa run --executor-type web-executor for Browser runs
use zero-qa run --executor-type mobile-executor for Android or iPhone runs
if --executor-type is omitted, Zero QA first checks scenario defaults.executor_type
if the scenario does not define it, Zero QA uses global defaults.executor_type
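
The lookup order above can be sketched as a small function. This is only an illustration of the precedence, not Zero QA's actual code: the function name, the dictionary shapes, and the error raised when nothing is configured are all assumptions.

```python
from typing import Optional


def resolve_executor_type(
    cli_value: Optional[str],
    scenario_defaults: dict,
    global_defaults: dict,
) -> str:
    """Pick the effective executor type: CLI flag, then scenario, then global."""
    for candidate in (
        cli_value,
        scenario_defaults.get("executor_type"),
        global_defaults.get("executor_type"),
    ):
        if candidate:
            return candidate
    # Behavior when nothing is configured is an assumption of this sketch.
    raise ValueError("no executor_type configured")
```

With the demo scenario, omitting --executor-type would resolve to web-executor because the scenario sets defaults.executor_type.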

Minimal zero-qa.yaml example:

yaml

scenario_name: minimal-ecommerce-demo
defaults:
  executor_type: web-executor
pre_test_scripts:
  - path: scripts/start-backend.sh
  - path: scripts/start-frontend.sh
    executor_type: web-executor
  - path: scripts/build-install-apk.sh
    executor_type: mobile-executor
post_test_scripts:
  - path: scripts/stop-backend.sh
  - path: scripts/stop-frontend.sh
    executor_type: web-executor
suites:
  - name: login
    path: suites/login.md
  - name: browse-products
    path: suites/browse-products.md
    needs:
      - login
  - name: place-order
    path: suites/place-order.md
    needs:
      - browse-products
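
For the demo scenario above, target-suite expansion and topological ordering could look roughly like the sketch below. It uses Python's standard graphlib module; the NEEDS mapping and the function names are hypothetical and only mirror the needs entries of the demo, they are not the planner's real interface.

```python
from graphlib import CycleError, TopologicalSorter

# Suite name -> upstream suites, mirroring the `needs` entries in the demo.
NEEDS = {
    "login": [],
    "browse-products": ["login"],
    "place-order": ["browse-products"],
}


def expand_targets(selected, needs):
    """Return the selected suites plus every upstream dependency."""
    targets, stack = set(), list(selected)
    while stack:
        name = stack.pop()
        if name not in needs:
            raise ValueError(f"unknown suite: {name}")
        if name not in targets:
            targets.add(name)
            stack.extend(needs[name])
    return targets


def plan(selected, needs):
    """Expand targets, then order them so prerequisites run first."""
    targets = expand_targets(selected, needs)
    graph = {name: needs[name] for name in targets}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # exc.args[1] holds the nodes that form the detected cycle.
        raise ValueError(f"cyclic suite dependencies: {exc.args[1]}") from exc
```

Selecting only place-order yields the order login, browse-products, place-order, because both upstream suites are pulled in automatically.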

Roles

Business Project / zero-qa.yaml
- Defines the test entry for the project.
- Declares suites and their dependencies.
- Can set shared run defaults under defaults, including executor_type.
- Can define an ordered initialization script list for the project.
suite
- Is the usage document for one feature or one flow.
- Describes how to use one feature.
- Describes what counts as success or failure for that feature.
- Stays in business language.
- May be written manually or generated by an Agent.
Scenario + Target Suites
- Parses zero-qa.yaml for the current run.
- Starts from the suites chosen by the user for the run.
- Expands upstream dependencies automatically before execution.
DAG Planner
- Validates suite dependency declarations.
- Detects invalid or cyclic graphs.
- Produces the final execution order.
Executor Preflight
- Loads the selected executor definition from executor.yaml.
- Checks whether required host tools already exist on the host.
- Installs missing tools only when auto-install is enabled.
- Re-checks the tools after installation and fails fast on mismatch.
- Runs once before any suite execution starts.
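
A minimal sketch of the check, install, re-check sequence described above. It assumes host tools can be detected by name on PATH via shutil.which; the preflight function, its parameters, and the install callback are hypothetical and stand in for whatever executor.yaml actually declares.

```python
import shutil
from typing import Callable, Iterable, List


def preflight(
    tools: Iterable[str],
    auto_install: bool,
    install: Callable[[str], None],
    exists: Callable[[str], bool] = lambda tool: shutil.which(tool) is not None,
) -> None:
    """Check host tools, optionally install missing ones, then re-check."""
    missing: List[str] = [tool for tool in tools if not exists(tool)]
    if not missing:
        return
    if not auto_install:
        # Fail before any suite runs when installation is not allowed.
        raise RuntimeError(f"missing host tools: {missing}")
    for tool in missing:
        install(tool)
    # Re-check after installation and fail fast on mismatch.
    still_missing = [tool for tool in missing if not exists(tool)]
    if still_missing:
        raise RuntimeError(f"install did not provide: {still_missing}")
```

Passing the existence check as a parameter keeps the sketch testable without touching the real host.
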
Project Init Scripts
- Runs the ordered pre_test_scripts list from zero-qa.yaml.
- Filters scripts by the effective run executor type while preserving order.
Project Finalization Scripts
- Runs the ordered post_test_scripts list from zero-qa.yaml.
- Reuses the same relative-path, working-directory, and executor-type filtering rules as pre_test_scripts.
- Treats script failures as warnings so cleanup does not replace the main run result.
Runtime Inputs Resolution
- Reads executor-scoped scenario variables from the <executor-name>: top-level mapping in zero-qa.yaml.
- Executes executor-declared resolve_command entries from executor.yaml.
- Merges scenario variables and resolved values into one runtime input map before any suite workspace is created.
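
The merge step might be sketched as follows. Several details here are assumptions rather than documented behavior: resolve commands are modeled as a mapping from input name to an argv list, each command's trimmed stdout becomes the value, and resolved values override scenario variables of the same name.

```python
import subprocess
import sys


def resolve_runtime_inputs(scenario_vars, resolve_commands):
    """Merge scenario variables with values produced by resolve commands."""
    inputs = dict(scenario_vars)
    for name, argv in resolve_commands.items():
        # check=True makes a failing resolve command abort run setup.
        done = subprocess.run(argv, capture_output=True, text=True, check=True)
        inputs[name] = done.stdout.strip()
    return inputs


# Hypothetical usage: one static variable plus one resolved value.
example = resolve_runtime_inputs(
    {"base_url": "http://localhost:3000"},
    {"device_id": [sys.executable, "-c", "print('emulator-5554')"]},
)
```
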
Run Summary
- Aggregates the final result after post_test_scripts finish.
- Records the selected suite set and run-level timing information.
- Keeps total_duration scoped to the main run path, excluding post_test_scripts and executor post_run_hooks.
Shared Run Workspace
- Stores the filesystem contract shared by run and ui.
- Persists run_meta.json, per-suite directories under suites/, debug evidence, and results/summary.json.
- Keeps dag_nodes and dag_edges in run_meta.json so the read path can render the run graph directly.
Post-run Hooks
- Executes executor-declared post_run_hooks from executor.yaml after the run summary is written.
- Receives the final runtime input map again under the original declared names, so cleanup scripts can reuse resolved values directly.
- Treats cleanup as best-effort: hook failures are logged, but they do not replace the main run result.
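
The best-effort rule shared by post_test_scripts and post_run_hooks can be illustrated with a small runner that never raises. This is a sketch only: modeling commands as argv lists and returning warnings as strings are assumptions, and how runtime inputs are passed to hooks is left out.

```python
import logging
import subprocess
import sys

log = logging.getLogger("zero_qa_sketch")


def run_best_effort(commands):
    """Run every command; collect failures as warnings instead of raising."""
    warnings = []
    for argv in commands:
        try:
            subprocess.run(argv, check=True)
        except (subprocess.CalledProcessError, OSError) as exc:
            # A failed cleanup step is logged, and later steps still run.
            log.warning("cleanup command failed: %s", exc)
            warnings.append(str(exc))
    return warnings


main_result = "PASS"
cleanup_warnings = run_best_effort([
    [sys.executable, "-c", "import sys; sys.exit(3)"],  # fails on purpose
    [sys.executable, "-c", "pass"],                     # still runs
])
```

The main result is decided before cleanup starts, so the failing command only adds a warning.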

Per-Suite Workspace

- Creates an isolated workspace for one suite at a time.
- Copies the current suite file and executor skill directories into the workspace.
- Renders runtime inputs into copied markdown skill files before the Agent starts.

Workspace layout (all directories pre-created by the framework — the Agent must not create or check them):

<suite>/
├── suite.md      — suite instructions (read-only); copied and renamed from the user-declared suite path in zero-qa.yaml
├── evidence/     — artifacts written by the Agent: screenshots, logs, and captured outputs; the framework pre-creates this directory, executor agents write flat files by default and must not create subdirectories, and the UI can still read nested paths recursively for compatibility
└── result.json   — structured result written by the Agent at the end
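
Pre-creating that layout could look like the sketch below. The function name and the suites/<name> location are assumptions based on the layout above; copying and rendering executor skill directories is omitted.

```python
import shutil
from pathlib import Path


def build_suite_workspace(run_dir: Path, name: str, suite_src: Path) -> Path:
    """Create suites/<name>/ with suite.md and an empty evidence/ directory."""
    workspace = run_dir / "suites" / name
    # The framework creates evidence/ so the Agent never has to.
    (workspace / "evidence").mkdir(parents=True)
    # The user-declared suite file is copied and renamed to suite.md.
    shutil.copyfile(suite_src, workspace / "suite.md")
    # result.json is deliberately absent: the Agent writes it at the end.
    return workspace
```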

Per-Suite Execution
- Runs once for each suite in the planned order.
- Uses the Agent runtime to execute the selected executor skill.
- Turns suite steps into actions on the phone or browser surface.
Workspace Reader
- Lives under zero_qa/ui/ and reads the shared workspace without importing scheduler or runner code.
- Lists projects and runs, loads DAG metadata, exposes suite evidence, and infers pending / running / passed / failed states.
- Treats malformed workspace data as an error instead of silently skipping it.
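
One way to picture status inference from file presence is the sketch below. The exact rules and the "status" key inside result.json are assumptions made for illustration; the real reader may use different signals.

```python
import json
from pathlib import Path


def infer_suite_status(suite_dir: Path) -> str:
    """Infer pending / running / passed / failed from files on disk."""
    if not suite_dir.is_dir():
        return "pending"   # workspace not created yet
    result_file = suite_dir / "result.json"
    if not result_file.exists():
        return "running"   # workspace exists, no result written yet
    # json.loads raises on malformed JSON, so bad data is never skipped.
    data = json.loads(result_file.read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError(f"unexpected content in {result_file}")
    status = data.get("status")  # hypothetical key name
    if status == "PASS":
        return "passed"
    if status == "FAILED":
        return "failed"
    raise ValueError(f"unexpected status in {result_file}: {status!r}")
```
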
UI Server
- Runs behind zero-qa ui.
- Serves read-only HTML pages, JSON API endpoints, and one SSE refresh endpoint per run.
- Uses the shared workspace as its only data source.
Agent
- Is the runtime that actually executes the executor.
- Zero QA currently supports Codex and Claude Code.
- The same suite and executor model works with either Agent runtime.
executor
- Is the domain-specific agent that executes the suite, running on top of an Agent runtime.
- Turns suite steps into actions on the target.
- Observes the target and acts on the target during the test.
- Is a concrete domain specialist, not a generic assistant.
- Stays reusable across many suites.
- Carries its expertise through agent-facing documentation, not just through a tool declaration.
Codex
- Is one supported Agent runtime for Zero QA.
Claude Code
- Is another supported Agent runtime for Zero QA.
mobile-executor
- Is a mobile testing expert for Android or iPhone flows.
- Reads suite steps and knows how to operate the phone to complete them.
- Focuses on mobile observation, interaction, screenshots, and debugging.
- Should include concrete guidance for agent-device and the boundary for adb fallback.
web-executor
- Is a web testing expert for browser flows.
- Reads suite steps and knows how to use Playwright to complete them.
- Focuses on web observation, interaction, screenshots, and debugging.
- Should include concrete guidance for Playwright observation, action, waiting, and evidence capture.
Zero QA
- Owns parsing, DAG planning, host-tool checks, workspace assembly, dispatch, and result collection.
- Already connects executors to mobile and browser targets.

Kernel-Executor Decoupling

The kernel (zero_qa/) must never contain code written for a specific executor. It defines generic contracts; executors fulfill those contracts through declarative configuration in executor.yaml.

The kernel does not reference any executor by name, does not reference executor-specific tools, and does not branch on executor identity.
Executor-specific behavior is declared in executor.yaml (host tools, runtime input resolve commands, required scenario variables) and executed generically by the kernel.
Adding a new executor only requires adding a directory under executors/. No kernel changes are needed.

This is enforced by scripts/lints/check_kernel_executor_decoupling.py, which dynamically discovers executor names and tool names from executors/ and rejects any occurrence in kernel code.
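
The idea behind such a lint can be sketched in a few lines. This is not the actual script: it only discovers executor directory names (tool names from executor.yaml are left out), and the function name and return shape are hypothetical.

```python
from pathlib import Path


def find_violations(kernel_dir: Path, executors_dir: Path):
    """Report kernel source lines that mention any discovered executor name."""
    # Executor names are discovered dynamically, never hard-coded in the lint.
    names = sorted(p.name for p in executors_dir.iterdir() if p.is_dir())
    violations = []
    for source in sorted(kernel_dir.rglob("*.py")):
        lines = source.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            for name in names:
                if name in line:
                    violations.append((str(source), lineno, name))
    return violations
```

An empty result means the kernel stays free of executor-specific references under this simplified check.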

For the executor-side view of this contract, see executors/README.md.

Run and UI Decoupling

zero-qa run and zero-qa ui are intentionally separate processes:

zero-qa run writes the shared workspace and does not import zero_qa.ui.
zero-qa ui reads the shared workspace and does not import planner, runner, or workspace builder write-path modules.
zero-qa ui acquires one machine-wide lock at startup, so only one local dashboard instance can run at a time.
the two processes synchronize only through stable files and directories, not through in-process callbacks or RPC
UI dependencies remain optional, so the core run path can stay installable without FastAPI, uvicorn, watchfiles, or Jinja2

This keeps the write path minimal and lets one long-running UI aggregate many historical or in-progress runs at once.
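
A single-instance lock of the kind described above could be built on an advisory file lock. The sketch below assumes a Unix host (it uses fcntl.flock) and a hypothetical lock file path; how zero-qa ui actually implements its lock is not specified here.

```python
import fcntl
import os


def acquire_ui_lock(lock_path: str) -> int:
    """Take an exclusive, non-blocking lock; raise if another holder exists."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise RuntimeError("another dashboard instance already holds the lock")
    # Keep the descriptor open for the lifetime of the UI process;
    # the lock is released when it is closed or the process exits.
    return fd
```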

What Users Own

your suites or usage documents
your suite dependencies in zero-qa.yaml
your initialization scripts in zero-qa.yaml, such as service startup or building and deploying an APK to an emulator or a phone
your scenario-level defaults.executor_type, if needed
optionally, the agent_type choice when you want to choose between Codex and Claude Code

The minimal demo in examples/minimal_ecommerce_demo/ is the reference shape for these inputs.

What Zero QA Already Owns

selected-suite expansion and DAG planning
executor metadata loading and host-tool checks
ordered pre-test script execution
per-suite isolated workspace assembly
shared workspace metadata writing, including DAG nodes and edges
executor dispatch
read-only workspace scanning for UI consumers
local API, HTML, and SSE serving for the dashboard
out-of-the-box mobile connection through mobile-executor
out-of-the-box browser connection through web-executor
final result collection

This means users do not need to design how to connect to phones or browsers. They only need to describe:

what each suite verifies
how suites depend on each other

If the product already has clear usage documentation for Mobile App or Web flows, that documentation is often enough to become the suites for Zero QA.

For project setup, users can also provide initialization scripts in zero-qa.yaml, for example:

start backend services
start frontend services
build an APK and deploy it to an emulator or install it on a phone

The scripts needed may differ between web-executor and mobile-executor, but they all stay in one ordered list. Each script may optionally declare executor_type:

omit executor_type when the script should always run
use executor_type: web-executor when the script should run only for web
use executor_type: mobile-executor when the script should run only for mobile
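
The filtering rule above amounts to a single order-preserving pass over the list. In this sketch the entries mirror the pre_test_scripts of the minimal demo, and the select_scripts name is hypothetical.

```python
PRE_TEST_SCRIPTS = [
    {"path": "scripts/start-backend.sh"},
    {"path": "scripts/start-frontend.sh", "executor_type": "web-executor"},
    {"path": "scripts/build-install-apk.sh", "executor_type": "mobile-executor"},
]


def select_scripts(scripts, executor_type):
    """Keep scripts with no executor_type or a matching one, in declared order."""
    return [
        entry["path"]
        for entry in scripts
        if entry.get("executor_type") in (None, executor_type)
    ]
```

For a web run this keeps start-backend.sh and start-frontend.sh; for a mobile run it keeps start-backend.sh and build-install-apk.sh.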

Agent Runtime

Zero QA separates two concerns:

executor: what kind of target to operate
agent: which runtime executes that executor

For executor knowledge, Zero QA also separates two layers:

executor.yaml: machine-readable runtime dependencies and host-tool setup
skill/: agent-facing execution knowledge, including the main workflow and optional focused references

Today, Zero QA supports:

Codex
Claude Code

In most cases:

choose --executor-type mobile-executor when the target is Android or iPhone
choose --executor-type web-executor when the target is Browser
choose Codex or Claude Code through agent_type when you want to pick the Agent runtime

Important:

Codex can run all supported executor types, including mobile-executor and web-executor
Claude Code can also run all supported executor types, including mobile-executor and web-executor
agent_type only selects the Agent runtime
--executor-type selects the target type for the run
scenario defaults.executor_type sets the executor directory name used when the CLI does not pass --executor-type
zero-qa ui is separate from executor selection and only needs a readable run workspace

Result

Suite success returns PASS
Suite failure returns FAILED with:
- a short reason
- the business step or expectation that failed

At run level, Zero QA also writes a summary with the selected suites and stage durations.
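
As an illustration of how small a suite result can stay, the sketch below models it as a dataclass. The field names reason and failed_step are hypothetical; only the PASS / FAILED values and the two pieces of failure information come from the description above.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class SuiteResult:
    status: str                        # "PASS" or "FAILED"
    reason: Optional[str] = None       # short reason, required on failure
    failed_step: Optional[str] = None  # business step or expectation that failed

    def __post_init__(self):
        if self.status not in ("PASS", "FAILED"):
            raise ValueError(f"unknown status: {self.status!r}")
        if self.status == "FAILED" and not (self.reason and self.failed_step):
            raise ValueError("FAILED results need a reason and a failed step")


passed = asdict(SuiteResult("PASS"))
```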

The same workspace also powers the UI dashboard:

project and historical run listing
per-run DAG visualization
suite detail and evidence listing
live refresh while result.json or summary.json files are still changing

Debug evidence such as logs and screenshots may be kept by the system, but it is not the normal business-facing result.
