Architecture

A layered architecture with clear boundaries: the rules engine is deterministic and pure, and the surface layer adapts to each mode.

System overview

OpenTax is built in three layers. The bottom two are shared; only the top layer differs between standalone and plugin mode.

System layers
Layer 3 — Surface (mode-specific)
┌──────────────────────┐  ┌──────────────────────┐
│ Standalone UI        │  │ OpenClaw Plugin      │
│                      │  │                      │
│ React + Router       │  │ Express server       │
│ Zustand store        │  │ REST API + SSE       │
│ OCR (Tesseract.js)   │  │ 16 agent tool defs   │
│ PDF gen (pdf-lib)    │  │ SQLite persistence   │
│ IndexedDB storage    │  │ PDF gen (pdf-lib)    │
└───────────┬──────────┘  └───────────┬──────────┘
            │                         │
            ▼                         ▼
Layer 2 — Service API (shared)
┌─────────────────────────────────────────────────┐
│                   TaxService                    │
│                                                 │
│  createReturn()  addW2()       compute()        │
│  add1099()       setCredits()  traceLine()      │
│  getSummary()    generatePdf()                  │
└────────────────────────┬────────────────────────┘
                         │
                         ▼
Layer 1 — Rules Engine (shared, deterministic)
┌─────────────────────────────────────────────────┐
│                  Rules Engine                   │
│                                                 │
│  IRS rules (Rev. Proc. 2024-40, Pub 501, ...)   │
│  Tax brackets · Standard deduction              │
│  Credits · Deductions · AMT · State (CA)        │
│  Trace tree generation · IRS citations          │
│  Deterministic: same input → same output        │
└─────────────────────────────────────────────────┘

Rules engine

The engine is a pure-function computation graph. Given identical inputs, it always produces identical outputs. No side effects, no network calls, no randomness.
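As a rough illustration of what "pure" means here, the sketch below computes a progressive tax from a bracket table using only its arguments. It assumes amounts are integer cents, as in the trace node example later on this page. The names `Bracket`, `EXAMPLE_BRACKETS`, and `taxOnIncome` are hypothetical, the bracket figures are invented (they are not the Rev. Proc. 2024-40 values), and the per-bracket rounding is a simplification rather than the engine's actual rounding rule.

```typescript
// Hypothetical bracket shape. Rates are basis points so the arithmetic stays in integers.
interface Bracket {
  upToCents: number | null; // upper bound of the bracket; null means no upper bound
  rateBps: number;          // 1000 basis points = 10%
}

// Invented brackets for illustration only; NOT real IRS figures.
const EXAMPLE_BRACKETS: readonly Bracket[] = [
  { upToCents: 1_000_000, rateBps: 1000 }, // 10% on the first $10,000
  { upToCents: 4_000_000, rateBps: 2000 }, // 20% from $10,000 to $40,000
  { upToCents: null, rateBps: 3000 },      // 30% above $40,000
];

// Pure function: no I/O, no clock, no randomness. The result depends only on the arguments.
function taxOnIncome(taxableCents: number, brackets: readonly Bracket[]): number {
  let tax = 0;
  let lower = 0;
  for (const bracket of brackets) {
    if (taxableCents <= lower) break; // nothing left to tax in this or higher brackets
    const upper = bracket.upToCents ?? Infinity;
    const slice = Math.min(taxableCents, upper) - lower; // income that falls in this bracket
    tax += Math.round((slice * bracket.rateBps) / 10_000);
    lower = upper;
  }
  return tax;
}

// $50,000 of taxable income under the invented table:
// 10% of $10,000 + 20% of $30,000 + 30% of $10,000 = $10,000
const example = taxOnIncome(5_000_000, EXAMPLE_BRACKETS); // 1_000_000 cents
```

Because the function reads nothing but its inputs, calling it twice with the same arguments cannot produce different results, which is the property the engine relies on.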

Computation model

  • Each IRS form is a TypeScript module that exports a compute() function
  • Forms declare dependencies on other forms (e.g., Schedule D feeds into Form 1040 Line 7)
  • The engine resolves the dependency graph topologically and executes forms in order
  • Every intermediate value is annotated with its IRS citation and source type
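
The dependency resolution described above can be sketched with Kahn's algorithm. This is one possible approach, not necessarily the one the engine uses. The `FormModule` shape, the `dependsOn` field, and the `resolveOrder` and `runAll` helpers are hypothetical, and the amounts are made up.

```typescript
// Hypothetical module shape: each form lists the forms whose results it needs.
interface FormModule {
  id: string;
  dependsOn: string[];
  compute: (results: Record<string, number>) => number; // amounts in cents
}

// Kahn's algorithm: returns ids ordered so every form follows its dependencies.
function resolveOrder(forms: FormModule[]): string[] {
  const unmet = new Map<string, number>();        // unmet dependency count per form
  const dependents = new Map<string, string[]>(); // reverse edges: dependency -> dependents
  for (const form of forms) {
    unmet.set(form.id, form.dependsOn.length);
    for (const dep of form.dependsOn) {
      dependents.set(dep, [...(dependents.get(dep) ?? []), form.id]);
    }
  }
  const ready = forms.filter((f) => f.dependsOn.length === 0).map((f) => f.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift()!;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const left = unmet.get(next)! - 1;
      unmet.set(next, left);
      if (left === 0) ready.push(next);
    }
  }
  if (order.length !== forms.length) {
    throw new Error("Cycle or missing dependency in form graph");
  }
  return order;
}

// Executes forms in resolved order, handing each one the results computed so far.
function runAll(forms: FormModule[]): Record<string, number> {
  const byId = new Map<string, FormModule>(
    forms.map((f): [string, FormModule] => [f.id, f]),
  );
  const results: Record<string, number> = {};
  for (const id of resolveOrder(forms)) {
    results[id] = byId.get(id)!.compute(results);
  }
  return results;
}

// Toy graph with invented amounts: 8949 feeds Schedule D, which feeds the 1040.
const forms: FormModule[] = [
  { id: "form1040", dependsOn: ["scheduleD"], compute: (r) => r["scheduleD"] + 5_000_000 },
  { id: "scheduleD", dependsOn: ["form8949"], compute: (r) => r["form8949"] },
  { id: "form8949", dependsOn: [], compute: () => 120_000 },
];
```

With this toy graph, `resolveOrder(forms)` yields `form8949`, `scheduleD`, `form1040` regardless of the order in which the modules are listed, and a cycle or an unknown dependency fails loudly instead of producing a partial result.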

Supported forms

  • 1040 Individual income tax (main return)
  • Sch A Itemized deductions
  • Sch B Interest and dividends
  • Sch D Capital gains and losses
  • Sch E Rental and royalty income
  • 8949 Sales of capital assets (wash sales)
  • 8889 Health Savings Accounts
  • 6251 Alternative Minimum Tax
  • CA 540 California state return

File mapping

src/rules/2025/
├── constants.ts          # Brackets, thresholds (Rev. Proc. 2024-40)
├── form1040.ts           # Main 1040 computation
├── taxComputation.ts     # Tax-on-income calculation
├── scheduleA.ts          # Itemized deductions
├── scheduleD.ts          # Capital gains
├── form8949.ts           # Capital asset sales
├── form8889.ts           # HSA deductions
├── amt.ts                # Alternative Minimum Tax
├── childTaxCredit.ts     # CTC / ACTC
├── earnedIncomeCredit.ts # EITC
├── educationCredit.ts    # AOTC + LLC
├── saversCredit.ts       # Retirement savings credit
├── hsaDeduction.ts       # HSA above-the-line deduction
└── ca/
    ├── constants.ts      # California thresholds
    └── form540.ts        # CA 540 computation

Trace system

Every value in the engine produces a trace node, and the trace tree is the foundation of OpenTax's explainability. Each node carries one of three source types:

Document source

Value extracted from an uploaded document (W-2 Box 1, 1099-INT amount).

Computed value

Derived by the rules engine from other values (total tax, taxable income).

User entry

Manually entered by the user during the interview (filing status, dependents).

Trace node structure

TraceNode
{
  "nodeId":      "form1040.line24",
  "label":       "Total tax",
  "amount":      1173400,              // cents (integer)
  "irsCitation": "Form 1040, Line 24",
  "sourceType":  "computed",            // "document" | "computed" | "user-entry"
  "confidence":  1.0,                  // 0.0–1.0 (OCR = lower)
  "inputs": [
    { "nodeId": "taxComputation.taxOnIncome", ... },
    { "nodeId": "form1040.line23", ... }
  ]
}
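
A possible TypeScript rendering of this structure, plus two small walkers, is sketched below. The field names come from the example above; the assumption that `inputs` holds full child nodes, the helper names `explain` and `minConfidence`, and the sample amounts and confidence values are illustrative only.

```typescript
type SourceType = "document" | "computed" | "user-entry";

interface TraceNode {
  nodeId: string;
  label: string;
  amount: number;      // cents (integer)
  irsCitation: string;
  sourceType: SourceType;
  confidence: number;  // 0.0 to 1.0; OCR-derived values are lower
  inputs: TraceNode[]; // assumption: children are embedded, not referenced by id
}

// Depth-first rendering: one indented line per node, parent before its inputs.
function explain(node: TraceNode, depth = 0): string[] {
  const dollars = (node.amount / 100).toFixed(2);
  const indent = "  ".repeat(depth);
  const lines = [`${indent}${node.label}: $${dollars} [${node.irsCitation}] (${node.sourceType})`];
  for (const child of node.inputs) {
    lines.push(...explain(child, depth + 1));
  }
  return lines;
}

// Lowest confidence anywhere in the subtree: a computed value is only as
// trustworthy as its least certain input.
function minConfidence(node: TraceNode): number {
  return node.inputs.reduce(
    (min, child) => Math.min(min, minConfidence(child)),
    node.confidence,
  );
}

// Invented sample: one OCR-extracted W-2 amount feeding a computed 1040 line.
const wages: TraceNode = {
  nodeId: "form1040.line1a",
  label: "Wages",
  amount: 8_500_000,
  irsCitation: "Form 1040, Line 1a",
  sourceType: "computed",
  confidence: 1.0,
  inputs: [
    {
      nodeId: "w2.box1",
      label: "Wages (W-2 Box 1)",
      amount: 8_500_000,
      irsCitation: "Form W-2, Box 1",
      sourceType: "document",
      confidence: 0.92,
      inputs: [],
    },
  ],
};
```

Here `explain(wages)` produces two lines, the second indented under the first, and `minConfidence(wages)` returns 0.92, flagging that the total ultimately rests on an OCR reading.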

Data flow by mode

The data flow differs by mode, but the engine itself is identical; only the I/O boundary changes.

Standalone mode

  1. User uploads documents in the browser. OCR extracts data locally via Tesseract.js.
  2. Extracted data is stored in a Zustand store backed by IndexedDB.
  3. UI calls TaxService.compute(), which runs the rules engine on the main thread.
  4. Results and trace tree render in the React UI. PDF generation happens client-side via pdf-lib.

Network calls: Zero. All data stays in the browser.

Plugin mode

  1. Agent discovers tools from the OpenClaw manifest at /.well-known/openclaw.json.
  2. Agent sends structured JSON via tool calls (REST). Data is stored in SQLite on the server.
  3. Server calls the same TaxService.compute() in Node.js. Progress streams via SSE.
  4. Results return as JSON. Agent presents them conversationally. PDFs are generated server-side.

Network calls: Agent ↔ plugin server (localhost or private network).
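
For the progress stream in step 3, the wire format is the standard Server-Sent Events framing: an optional `event:` line, a `data:` line, and a blank line to end the message. The sketch below shows only that framing. The `progress` event name and the `ProgressEvent` payload fields are assumptions, not the plugin's documented schema.

```typescript
// Hypothetical payload shape for a compute-progress update.
interface ProgressEvent {
  returnId: string;
  step: string;      // e.g. the form currently being computed
  completed: number; // forms finished so far
  total: number;     // forms in the dependency graph
}

// Frames one SSE message. JSON.stringify without indentation never emits raw
// newlines, so the payload always fits on a single data: line.
function formatSseEvent(eventName: string, payload: unknown): string {
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}

const update: ProgressEvent = { returnId: "return-1", step: "scheduleD", completed: 2, total: 9 };
const frame = formatSseEvent("progress", update);

// In a Node.js handler the response would be sent with the header
// Content-Type: text/event-stream, and each frame written with res.write(frame).
```

On the agent side, any SSE client can split the stream on blank lines and parse each `data:` line as JSON.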

Architectural boundaries

Clear separation of concerns keeps the codebase maintainable and each layer independently testable.

Rules engine → TaxService

The rules engine is pure computation. It takes a canonical tax model as input and produces computed forms + trace tree as output. It never touches I/O, storage, or network.

TaxService wraps the engine with return lifecycle management (create, update, compute, trace). Both modes consume TaxService; neither calls the engine directly.
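
A minimal sketch of this wrapping, assuming the engine is injected as a plain function. The method names `createReturn`, `addW2`, and `compute` appear in the system overview, but their signatures here, the `TaxModel` and `W2` shapes, and the toy engine are all hypothetical.

```typescript
interface W2 { employer: string; wagesCents: number; withheldCents: number }
interface TaxModel { filingStatus: string; w2s: W2[] }
interface EngineResult { lines: Record<string, number> }

// The engine as the service sees it: model in, result out, nothing else.
type Engine = (model: TaxModel) => EngineResult;

class TaxServiceSketch {
  private readonly engine: Engine;
  private readonly returns = new Map<string, TaxModel>();
  private nextId = 1;

  constructor(engine: Engine) {
    this.engine = engine;
  }

  createReturn(filingStatus: string): string {
    const id = `return-${this.nextId++}`;
    this.returns.set(id, { filingStatus, w2s: [] });
    return id;
  }

  addW2(returnId: string, w2: W2): void {
    const model = this.getModel(returnId);
    // Store a new object instead of mutating, so a model handed to the engine never changes.
    this.returns.set(returnId, { ...model, w2s: [...model.w2s, w2] });
  }

  compute(returnId: string): EngineResult {
    return this.engine(this.getModel(returnId));
  }

  private getModel(returnId: string): TaxModel {
    const model = this.returns.get(returnId);
    if (!model) throw new Error(`Unknown return: ${returnId}`);
    return model;
  }
}

// Stand-in engine for the example: sums W-2 wages and does nothing else.
const toyEngine: Engine = (model) => ({
  lines: { totalWages: model.w2s.reduce((sum, w2) => sum + w2.wagesCents, 0) },
});

const service = new TaxServiceSketch(toyEngine);
const returnId = service.createReturn("single");
service.addW2(returnId, { employer: "Acme", wagesCents: 6_000_000, withheldCents: 700_000 });
service.addW2(returnId, { employer: "Globex", wagesCents: 2_500_000, withheldCents: 300_000 });
const result = service.compute(returnId); // result.lines.totalWages is 8_500_000
```

Keeping lifecycle state in the service and passing the engine a complete model on every call is what lets the engine stay free of storage and I/O.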

TaxService → Surface layer

In standalone mode, the React UI imports TaxService directly as an ES module. In plugin mode, the Express server wraps TaxService behind REST endpoints and SSE streams.

The surface layer handles I/O concerns: storage (IndexedDB vs. SQLite), document handling (client-side OCR vs. agent-provided data), and output (React rendering vs. JSON responses).
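
One way to keep the storage difference out of shared code is a small adapter interface that each surface implements. The sketch below is an assumption about how that could look, not the project's actual abstraction: `ReturnStore`, `InMemoryStore`, and `saveAndReload` are hypothetical names, and the in-memory class stands in for an IndexedDB or SQLite adapter.

```typescript
// Hypothetical storage contract. It is async because IndexedDB is async;
// a SQLite adapter can satisfy the same interface.
interface ReturnStore {
  save(returnId: string, data: unknown): Promise<void>;
  load(returnId: string): Promise<unknown>; // resolves to undefined when absent
  remove(returnId: string): Promise<void>;
}

// In-memory stand-in used here so the example runs anywhere.
class InMemoryStore implements ReturnStore {
  private readonly rows = new Map<string, string>();

  async save(returnId: string, data: unknown): Promise<void> {
    // Serialize on write, as a real store would, so callers cannot share references.
    this.rows.set(returnId, JSON.stringify(data));
  }

  async load(returnId: string): Promise<unknown> {
    const raw = this.rows.get(returnId);
    return raw === undefined ? undefined : JSON.parse(raw);
  }

  async remove(returnId: string): Promise<void> {
    this.rows.delete(returnId);
  }
}

// Surface code depends only on the interface, so either adapter can be passed in.
async function saveAndReload(store: ReturnStore, returnId: string, data: unknown): Promise<unknown> {
  await store.save(returnId, data);
  return store.load(returnId);
}
```

The same pattern would apply to document handling and output: the mode-specific code sits behind an interface, and everything below it stays identical.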

Plugin → Agent

The plugin boundary is the OpenClaw protocol. The plugin exposes tools and accepts structured JSON input. It makes no assumptions about the agent's implementation, LLM provider, or conversation history. Any OpenClaw-compatible agent can integrate.

Security model

Tax data is sensitive. The architecture is designed to minimize exposure.

Standalone

  • All computation in browser (no server)
  • Data stored in IndexedDB (user's device only)
  • No cookies, no tracking, no analytics by default
  • No SSN/PII ever leaves the client
  • User clears data by clearing browser storage

Plugin

  • Plugin server runs on localhost or private network
  • SQLite database is local to the server process
  • No outbound network calls from the engine
  • Agent ↔ plugin traffic stays on the local machine or private network
  • Data lifecycle controlled by the agent operator

Testing strategy

The layered architecture enables focused testing at each boundary.

Layer          Test type                What's verified
Rules engine   Unit tests (1,700+)      Every form computation against IRS-published expected values
TaxService     Integration tests        Return lifecycle: create → populate → compute → trace
Plugin API     HTTP tests               REST endpoints return correct status codes, shapes, and data
UI             Component tests (129+)   Interview flow, responsiveness, accessibility, touch targets
