OpenAI just chose Elixir to orchestrate AI agents. So I ran their code through mine

Alex66 · March 6, 2026, 1:44am

I ran Symphony through Giulia, here’s the AST analysis.

Yesterday, OpenAI open-sourced Symphony — an autonomous agent orchestration framework. It monitors Linear issues in real time, dispatches Codex agents, handles CI testing, code review, and merges PRs. Without humans supervising every step.

They built it in Elixir/OTP.

Not Python. Not Go. Not Node. Elixir.

DaAnalyst · March 6, 2026, 7:37am

A link to Giulia so I can bookmark it?

al2o3cr · March 6, 2026, 5:53pm

FWIW, a bunch of code in the HttpServer module disappeared when it was replaced with Phoenix in a followup commit.

Alex66 · March 6, 2026, 10:49pm

# OpenAI Symphony (Elixir) — Giulia Code Intelligence Report

**Date**: 2026-03-06

**Analyzer**: Giulia v0.1.0.127 (Build 127)

**Project**: openai/symphony — Elixir implementation

**Previous Report**: 2026-03-05 (v0.1.0.100, Build 91)

---

1. Executive Summary

Symphony is an Elixir-based orchestrator that polls Linear for issues and dispatches them to OpenAI Codex-backed workers. The codebase is **compact but concentrated**: 34 files, 37 modules, 592 functions. Three modules (Config, Orchestrator, StatusDashboard) carry disproportionate complexity and risk, forming the project’s structural bottleneck. Three modules have test coverage (SpecsCheck, CLI, LogFile); the remaining 33 modules scored in the heatmap are untested.

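| Metric | Value | Delta vs 2026-03-05 |
|---|---|---|
| Files | 34 | no change |
| Modules | 37 | no change |
| Functions | 592 | no change |
| Specs | 137 | no change |
| Types | 11 | no change |
| Structs | 4 | no change |
| Callbacks | 5 | no change |
| Knowledge Graph Vertices | 628 | no change |
| Knowledge Graph Edges | 668 | no change |
| Connected Components | 153 | no change |
| Dead Code | 0 | no change |
| Tested Modules | 3 | +3 (bug fix in test detection) |

---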

2. Health Overview

Zone Distribution

| Zone | Count | Percentage | Delta vs 2026-03-05 |
|---|---|---|---|
| RED | 3 | 8% | no change |
| YELLOW | 15 | 42% | -3 (was 18 / 50%) |
| GREEN | 18 | 50% | +3 (was 15 / 42%) |

> **Note**: The previous report (2026-03-05) reported YELLOW 18 (50%) / GREEN 15 (42%). A bug in Giulia’s test detection caused SpecsCheck, CLI, and LogFile to appear untested. With the fix, these three modules correctly show has_test: true, dropping their heatmap scores and shifting them from YELLOW to GREEN.

Red Zone Modules (Critical)

| Module | Score | Complexity | Centrality | Max Coupling | Tests |
|---|---|---|---|---|---|
| SymphonyElixir.Config | 83 | 169 | 10 | 45 | No |
| SymphonyElixir.Orchestrator | 75 | 198 | 3 | 69 | No |
| SymphonyElixir.StatusDashboard | 73 | 331 | 2 | 75 | No |

Unprotected Hubs

| Module | Severity | In-Degree | Public Functions | Spec Ratio | Doc Ratio | Tests |
|---|---|---|---|---|---|---|
| SymphonyElixir.Orchestrator | YELLOW | 3 | 12 | 75% | 0% | No |

---

3. Topology & Dependency Analysis

Hub Modules (Most Depended-On)

| Module | Total Degree | In-Degree (Fan-In) | Out-Degree (Fan-Out) |
|---|---|---|---|
| SymphonyElixir.Config | 11 | 10 | 1 |
| SymphonyElixir.Orchestrator | 9 | 3 | 6 |
| SymphonyElixir.AgentRunner | 7 | 1 | 6 |
| SymphonyElixir.StatusDashboard | 6 | 2 | 4 |
| SymphonyElixir.Tracker | 5 | 4 | 1 |

Dependency Cycles (2 detected)

**Cycle 1** — Triangular (3 modules):

```
SymphonyElixir.HttpServer → SymphonyElixir.Orchestrator → SymphonyElixir.StatusDashboard → SymphonyElixir.HttpServer
```

**Risk**: Process startup ordering issues, potential deadlocks during initialization.

**Cycle 2** — Bilateral (2 modules):

```
SymphonyElixir.Workflow ↔ SymphonyElixir.WorkflowStore
```

**Risk**: Lower severity; typical for GenServer + accessor module pairs.
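For readers who want to reproduce this kind of check, here is a minimal sketch using Erlang's built-in `:digraph` and `:digraph_utils` modules. It is not Giulia's implementation (which has not been published); the edge list is just the two cycles above plus one acyclic edge (Orchestrator depends on Config, per the blast-radius section), with module names shortened.

```elixir
# Each tuple is {dependent, dependency}, i.e. "from depends on to".
edges = [
  {"HttpServer", "Orchestrator"},
  {"Orchestrator", "StatusDashboard"},
  {"StatusDashboard", "HttpServer"},
  {"Workflow", "WorkflowStore"},
  {"WorkflowStore", "Workflow"},
  # Acyclic edge, included to show it is not reported.
  {"Orchestrator", "Config"}
]

graph = :digraph.new()

Enum.each(edges, fn {from, to} ->
  :digraph.add_vertex(graph, from)
  :digraph.add_vertex(graph, to)
  :digraph.add_edge(graph, from, to)
end)

# A strongly connected component that contains a cycle is a dependency cycle.
cycles =
  graph
  |> :digraph_utils.cyclic_strong_components()
  |> Enum.map(&Enum.sort/1)
  |> Enum.sort()

# :digraph is backed by ETS tables, so free it explicitly.
:digraph.delete(graph)

IO.inspect(cycles)
```

The sorting is only there to make the output stable; `cyclic_strong_components/1` returns components in no particular order.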

God Modules

| Module | Functions | Complexity | Centrality | God Score |
|---|---|---|---|---|
| StatusDashboard | 117 | 331 | 2 | 785 |
| Orchestrator | 95 | 198 | 3 | 500 |
| Config | 84 | 169 | 10 | 452 |
| Codex.AppServer | 45 | 107 | 1 | 262 |
| Linear.Client | 35 | 80 | 2 | 201 |

---

4. Change Risk Analysis

Ranked by composite refactoring priority score:

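| Rank | Module | Score | Complexity | Public | Private | API Ratio | Centrality |
|---|---|---|---|---|---|---|---|
| 1 | Config | 3270 | 169 | 31 | 53 | 37% | 10 |
| 2 | StatusDashboard | 1904 | 331 | 15 | 102 | 13% | 2 |
| 3 | Orchestrator | 1632 | 198 | 12 | 83 | 13% | 3 |
| 4 | Codex.AppServer | 475 | 107 | 4 | 41 | 9% | 1 |
| 5 | Linear.Client | 466 | 80 | 8 | 27 | 23% | 2 |
| 6 | Workspace | 278 | 43 | 5 | 15 | 25% | 2 |
| 7 | Presenter | 194 | 26 | 3 | 13 | 19% | 2 |
| 8 | SpecsCheck | 160 | 33 | 2 | 11 | 15% | 1 |
| 9 | Workflow | 160 | 18 | 6 | 4 | 60% | 3 |
| 10 | Tracker | 156 | 7 | 6 | 0 | 100% | 4 |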

**Key Insight**: Config has the highest change risk (3270) due to 10 dependents (fan-in). Any change to Config’s API ripples through 10 direct consumers and 5 additional modules at depth 2, totaling **15 affected modules** (40% of the codebase).
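The report does not state how the API Ratio column is computed, but every row in the change-risk table is consistent with public functions as a rounded percentage of all functions. A small sketch under that assumption (`ApiRatio` is a hypothetical helper, not part of Giulia or Symphony):

```elixir
defmodule ApiRatio do
  # Assumed formula: public / (public + private), as a rounded percentage.
  def percent(public, private) when public + private > 0 do
    round(100 * public / (public + private))
  end
end

# {module, public, private} copied from the change-risk table.
rows = [
  {"Config", 31, 53},
  {"StatusDashboard", 15, 102},
  {"Orchestrator", 12, 83},
  {"Tracker", 6, 0}
]

ratios = for {name, public, private} <- rows, do: {name, ApiRatio.percent(public, private)}

IO.inspect(ratios)
```

A low ratio such as StatusDashboard's means most of the module is private helpers, which matches the decomposition advice later in the report.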

---

5. Impact Analysis (Blast Radius)

SymphonyElixir.Config (Change Risk: 3270)

- **Direct dependents (10)**: AgentRunner, Codex.AppServer, HttpServer, Linear.Client, Orchestrator, PromptBuilder, StatusDashboard, Tracker, Workspace, Presenter

- **Depth-2 dependents (5)**: Codex.DynamicTool, Linear.Adapter, Tracker.Memory, DashboardLive, ObservabilityApiController

- **Total blast radius**: 15 modules (40% of codebase)

- **Upstream**: Workflow → WorkflowStore

SymphonyElixir.Orchestrator (Change Risk: 1632)

- **Direct dependents (3)**: HttpServer, StatusDashboard, Presenter

- **Depth-2 dependents (2)**: DashboardLive, ObservabilityApiController

- **Total blast radius downstream**: 5 modules

- **Upstream (6)**: AgentRunner, Config, Linear.Issue, StatusDashboard, Tracker, Workspace

- **Total blast radius upstream**: 11 modules

- **Private function count**: 83 of 95 (87% of the module’s functions are private)
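The downstream figure for Orchestrator can be reproduced with a depth-limited breadth-first walk over "who depends on me" edges. This is a sketch, not Giulia's algorithm: `BlastRadius` is a hypothetical module, and the two Presenter edges are inferred from the dependent lists in this section rather than read from the Symphony source.

```elixir
defmodule BlastRadius do
  # Walks reverse-dependency edges breadth-first, up to `max_depth` hops from `root`.
  # Returns the affected modules in level order, excluding `root` itself.
  def dependents(graph, root, max_depth) do
    walk(graph, [root], MapSet.new([root]), max_depth, [])
  end

  defp walk(_graph, _frontier, _seen, 0, acc), do: Enum.reverse(acc)
  defp walk(_graph, [], _seen, _depth, acc), do: Enum.reverse(acc)

  defp walk(graph, frontier, seen, depth, acc) do
    next =
      frontier
      |> Enum.flat_map(&Map.get(graph, &1, []))
      |> Enum.uniq()
      |> Enum.reject(&MapSet.member?(seen, &1))

    seen = MapSet.union(seen, MapSet.new(next))
    walk(graph, next, seen, depth - 1, Enum.reverse(next) ++ acc)
  end
end

# module => modules that depend on it
graph = %{
  "Orchestrator" => ["HttpServer", "StatusDashboard", "Presenter"],
  "StatusDashboard" => ["Orchestrator", "Presenter"],
  "HttpServer" => ["StatusDashboard"],
  # Inferred edges (assumption): both appear only at depth 2.
  "Presenter" => ["DashboardLive", "ObservabilityApiController"]
}

affected = BlastRadius.dependents(graph, "Orchestrator", 2)
IO.inspect(affected)
```

The `seen` set is what keeps the HttpServer/Orchestrator/StatusDashboard cycle from being counted twice.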

SymphonyElixir.StatusDashboard (Change Risk: 1904)

- **Direct dependents (2)**: Orchestrator, Presenter

- **Depth-2 dependents (3)**: HttpServer, DashboardLive, ObservabilityApiController

- **Total blast radius downstream**: 5 modules

- **Upstream (4)**: Config, HttpServer, Orchestrator, ObservabilityPubSub

- **117 functions** — largest module in the project by function count

---

6. Structural Audit

Struct Lifecycle Analysis

| Struct | Defining Module | Users | Logic Leaks |
|---|---|---|---|
| SymphonyElixir.Linear.Issue | Linear.Issue | 4 | 4 (all leak) |
| State (Orchestrator) | State | 2 | 2 (all leak) |
| State (WorkflowStore) | State | 2 | 2 (all leak) |
| SymphonyElixir.StatusDashboard | StatusDashboard | 0 | 0 |

**Finding**: All structs that are shared across modules have **100% logic leak rate**. Every consumer directly pattern-matches or manipulates struct fields instead of using accessor functions. This creates tight coupling — any field rename or restructuring breaks all consumers.

**Notable**: Two different modules define a State struct (Orchestrator and WorkflowStore). This naming collision could cause confusion.
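To make the two access styles concrete, here is a minimal sketch with a hypothetical `Demo.Issue` struct (its fields are assumptions, not Symphony's actual `Linear.Issue`). One consumer matches on the field directly, the other goes through a function on the owning module.

```elixir
defmodule Demo.Issue do
  @enforce_keys [:id, :title]
  defstruct [:id, :title, state: "Todo"]

  # @opaque tells Dialyzer that callers should not rely on the internal shape.
  @opaque t :: %__MODULE__{id: String.t(), title: String.t(), state: String.t()}

  @spec title(t) :: String.t()
  def title(%__MODULE__{title: title}), do: title
end

defmodule Demo.Consumer do
  alias Demo.Issue

  # Direct field match: breaks at compile time if :title is renamed.
  def label_by_match(%Issue{title: title}), do: "Issue: #{title}"

  # Module API: only Demo.Issue needs to change if the field is renamed.
  def label_by_api(%Issue{} = issue), do: "Issue: #{Issue.title(issue)}"
end

issue = %Demo.Issue{id: "SYM-1", title: "Fix flaky test"}
by_match = Demo.Consumer.label_by_match(issue)
by_api = Demo.Consumer.label_by_api(issue)
IO.inspect({by_match, by_api})
```

Both styles produce the same result; the difference is only where a rename has to be handled.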

Semantic Duplicates

**Cluster 1** (High similarity: 92.9%):

- Mix.Tasks.PrBody.Check.run/1

- Mix.Tasks.Specs.Check.run/1

- Mix.Tasks.Workspace.BeforeRemove.run/1

These three Mix tasks share nearly identical structure. Candidate for extraction of shared boilerplate.

**Cluster 2** (Low similarity: 37.7%, 144 members):

Large cluster of accessor/delegate functions across Config and Tracker modules — expected pattern for configuration getters and behaviour delegates.

Behaviour Integrity

**Status**: Consistent. No behaviour fractures detected.

---

7. Spec Coverage Analysis

| Module | Public Functions | Specs | Coverage |
|---|---|---|---|
| Config | 31 | 31 | 100% |
| Orchestrator | 12 | 9 | 75% |
| StatusDashboard | 15 | ~15 | ~100% |
| Codex.AppServer | 4 | 4 | 100% |
| Linear.Client | 8 | 8 | 100% |
| Tracker | 6 | 6 | 100% |

**Overall**: 137 specs across 592 functions (23% total, but public function coverage is significantly higher).

---

8. Architectural Observations

What They Did Well

1. **Clean behaviour abstraction** for Tracker (Linear.Adapter + Tracker.Memory) — allows swapping issue trackers

2. **Config is fully spec’d** (31/31 public functions) despite being the highest-risk module

3. **Zero dead code** — clean codebase with no orphaned functions

4. **Behaviour integrity is consistent** — no broken contracts

5. **Workflow-driven configuration** via WORKFLOW.md — flexible runtime config without recompilation

6. **Test coverage exists** for SpecsCheck, CLI, and LogFile — the three most utility-oriented modules

Structural Concerns

1. **StatusDashboard is a god module** (117 functions, complexity 331, score 785) — it handles rendering, TPS calculation, Codex message humanization, terminal formatting, and snapshot management. This is at least 4 distinct responsibilities.

2. **Orchestrator carries too much** (95 functions, 83 private) — polling, dispatching, retry logic, token accounting, issue reconciliation, rate limiting all in one GenServer.

3. **Config’s fan-in of 10** means any API change is a project-wide event. The 53 private helper functions suggest this module is doing too much parsing/validation internally.

4. **Limited test coverage** — only 3 of 37 modules have tests. All three red-zone modules (Config, Orchestrator, StatusDashboard) remain untested.

5. **Triangular cycle** (HttpServer → Orchestrator → StatusDashboard) creates initialization coupling that could cause startup race conditions.

6. **100% struct logic leak** — no encapsulation on shared data structures.

---

9. Refactoring Recommendations

Priority 1: StatusDashboard Decomposition

- **Current**: 117 functions, complexity 331, god score 785

- **Suggested split**:
  - `StatusDashboard.Renderer`: terminal output, colorize, border, table formatting
  - `StatusDashboard.TPS`: rolling_tps, throttled_tps, token sample management
  - `StatusDashboard.CodexHumanizer`: humanize_codex_message, event parsing, command normalization
  - `StatusDashboard`: GenServer shell (init, handle_info, notify_update)

- **Expected**: Complexity 331 → ~80 max per module
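The shape of such a split can be sketched as a pure module that owns the math and a thin GenServer that only holds state and delegates. Everything here is hypothetical (`Demo.TPS`, `Demo.Dashboard`, the 5-second window, the sample format); it illustrates the pattern and is not Symphony's code.

```elixir
defmodule Demo.TPS do
  # Pure token-throughput math, no process state. Samples are {time_ms, total_tokens}.
  @window_ms 5_000

  # Prepends a sample and drops samples older than the window.
  def add_sample(samples, now_ms, total_tokens) do
    Enum.filter([{now_ms, total_tokens} | samples], fn {t, _} -> now_ms - t <= @window_ms end)
  end

  # Tokens per second between the oldest and newest sample in the window.
  def rolling_tps(samples) when length(samples) < 2, do: 0.0

  def rolling_tps([{t_new, n_new} | _] = samples) do
    {t_old, n_old} = List.last(samples)
    if t_new == t_old, do: 0.0, else: (n_new - n_old) * 1000 / (t_new - t_old)
  end
end

defmodule Demo.Dashboard do
  # GenServer shell: keeps the sample list and delegates all logic to Demo.TPS.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, [], opts)
  def record(pid, now_ms, total), do: GenServer.cast(pid, {:sample, now_ms, total})
  def tps(pid), do: GenServer.call(pid, :tps)

  @impl true
  def init(samples), do: {:ok, samples}

  @impl true
  def handle_cast({:sample, now_ms, total}, samples) do
    {:noreply, Demo.TPS.add_sample(samples, now_ms, total)}
  end

  @impl true
  def handle_call(:tps, _from, samples) do
    {:reply, Demo.TPS.rolling_tps(samples), samples}
  end
end

{:ok, pid} = Demo.Dashboard.start_link()
Demo.Dashboard.record(pid, 0, 0)
Demo.Dashboard.record(pid, 2_000, 300)
tps = Demo.Dashboard.tps(pid)
IO.inspect(tps)
```

The practical gain is that `Demo.TPS` can be unit-tested without starting a process, which matters for a module the heatmap currently lists as untested.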

Priority 2: Orchestrator Decomposition

- **Current**: 95 functions, complexity 198

- **Suggested split**:
  - `Orchestrator.Dispatcher`: issue dispatch, slot management, retry scheduling
  - `Orchestrator.Reconciler`: state reconciliation, stall detection, terminal cleanup
  - `Orchestrator.TokenAccounting`: usage extraction, delta computation, rate limits
  - `Orchestrator`: GenServer shell (init, handle_info, handle_call, snapshot)

- **Expected**: Complexity 198 → ~50 max per module

Priority 3: Break the Triangular Cycle

- Introduce a PubSub or event bus between HttpServer, Orchestrator, and StatusDashboard

- StatusDashboard should subscribe to events rather than being directly called by Orchestrator
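A minimal sketch of that subscribe-instead-of-call shape, using only the standard library's `Registry` in `:duplicate` mode as a local pub/sub. `Demo.Events`, the topic name, and the message format are all assumptions; the codebase already has an `ObservabilityPubSub` module, and a Phoenix app would more likely build on `Phoenix.PubSub`.

```elixir
defmodule Demo.Events do
  # Local pub/sub on top of Registry; subscribers register under a topic key.
  def start, do: Registry.start_link(keys: :duplicate, name: __MODULE__)

  # Registers the calling process as a subscriber of `topic`.
  def subscribe(topic), do: Registry.register(__MODULE__, topic, [])

  # Sends `message` to every process subscribed to `topic`.
  def publish(topic, message) do
    Registry.dispatch(__MODULE__, topic, fn entries ->
      for {pid, _value} <- entries, do: send(pid, message)
    end)
  end
end

{:ok, _registry} = Demo.Events.start()

# Dashboard side: subscribe once, then react to messages.
{:ok, _owner} = Demo.Events.subscribe(:orchestrator)

# Orchestrator side: publish without knowing who is listening.
:ok = Demo.Events.publish(:orchestrator, {:snapshot, %{running: 2}})

received =
  receive do
    {:snapshot, _} = event -> event
  after
    500 -> :timeout
  end

IO.inspect(received)
```

With this shape the publisher depends only on the event module, so the direct Orchestrator to StatusDashboard edge in the cycle disappears.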

Priority 4: Struct Encapsulation

- Add accessor functions to Linear.Issue and enforce access through module API

- Rename duplicate State structs to avoid confusion (Orchestrator.State, WorkflowStore.State)

---

10. Full Heatmap

| Module | Score | Zone | Complexity | Centrality | Max Coupling | Tests |
|---|---|---|---|---|---|---|
| Config | 83 | RED | 169 | 10 | 45 | No |
| Orchestrator | 75 | RED | 198 | 3 | 69 | No |
| StatusDashboard | 73 | RED | 331 | 2 | 75 | No |
| Codex.AppServer | 50 | YELLOW | 107 | 1 | 25 | No |
| Linear.Client | 44 | YELLOW | 80 | 2 | 13 | No |
| Workspace | 39 | YELLOW | 43 | 2 | 13 | No |
| Tracker | 38 | YELLOW | 7 | 4 | 12 | No |
| Presenter | 36 | YELLOW | 26 | 2 | 10 | No |
| Linear.Issue | 36 | YELLOW | 1 | 4 | 10 | No |
| Mix.Tasks.PrBody.Check | 36 | YELLOW | 38 | 0 | 16 | No |
| Workflow | 35 | YELLOW | 18 | 3 | 5 | No |
| Codex.DynamicTool | 32 | YELLOW | 30 | 1 | 4 | No |
| AgentRunner | 32 | YELLOW | 28 | 1 | 6 | No |
| WorkflowStore | 32 | YELLOW | 28 | 1 | 6 | No |
| Endpoint | 31 | YELLOW | 0 | 3 | 0 | No |
| DashboardLive | 30 | YELLOW | 24 | 0 | 7 | No |
| HttpServer | 30 | YELLOW | 16 | 1 | 5 | No |
| ObservabilityApiController | 30 | YELLOW | 10 | 0 | 10 | No |
| PromptBuilder | 29 | GREEN | 15 | 1 | 3 | No |
| Mix.Tasks.Workspace.BeforeRemove | 29 | GREEN | 20 | 0 | 4 | No |
| Tracker.Memory | 29 | GREEN | 11 | 0 | 8 | No |
| ObservabilityPubSub | 29 | GREEN | 3 | 2 | 2 | No |
| Linear.Adapter | 28 | GREEN | 13 | 0 | 6 | No |
| StaticAssetController | 28 | GREEN | 6 | 0 | 8 | No |
| State (WorkflowStore) | 28 | GREEN | 28 | 0 | 0 | No |
| StaticAssets | 28 | GREEN | 2 | 1 | 4 | No |
| Mix.Tasks.Specs.Check | 26 | GREEN | 6 | 0 | 3 | No |
| Layouts | 26 | GREEN | 2 | 0 | 2 | No |
| ErrorJSON | 25 | GREEN | 1 | 0 | 1 | No |
| Router | 25 | GREEN | 0 | 0 | 0 | No |
| SymphonyElixir | 25 | GREEN | 3 | 0 | 1 | No |
| ErrorHTML | 25 | GREEN | 1 | 0 | 2 | No |
| Application | 25 | GREEN | 3 | 0 | 0 | No |
| SpecsCheck | 11 | GREEN | 33 | 1 | 13 | Yes |
| CLI | 9 | GREEN | 29 | 0 | 15 | Yes |
| LogFile | 5 | GREEN | 10 | 1 | 6 | Yes |

---

11. Corrections vs Previous Report (2026-03-05)

The previous report was generated with Giulia v0.1.0.100, which had a bug in test detection. The following data points were incorrect:

| Data Point | Old (Incorrect) | New (Correct) |
|---|---|---|
| Zone distribution | RED 3 (8%), YELLOW 18 (50%), GREEN 15 (42%) | RED 3 (8%), YELLOW 15 (42%), GREEN 18 (50%) |
| Tested modules | 0 ("Zero tests detected") | 3 (SpecsCheck, CLI, LogFile) |
| SpecsCheck zone | YELLOW (score 36) | GREEN (score 11) |
| CLI zone | YELLOW (score 34) | GREEN (score 9) |
| LogFile zone | YELLOW (score 30) | GREEN (score 5) |

All other data (module counts, function counts, specs, change risk scores, god module rankings, dependency cycles, blast radius, structural audit findings) remains identical and was correctly reported.

*Report generated by Giulia Code Intelligence v0.1.0.127 (Build 127)*

*Analyzed via AST parsing + Knowledge Graph (628 vertices, 668 edges)*

*Previous report: 2026-03-05, Giulia v0.1.0.100 (Build 91)*

FlyingNoodle · March 7, 2026, 7:58am

> Every consumer directly pattern-matches or manipulates struct fields instead of using accessor functions.

Since when do we write accessor functions in Elixir?

I think your LLM is leaking Java.

Alex66 · March 8, 2026, 3:44pm

Good catch @FlyingNoodle — you’re absolutely right.

That paragraph was poorly framed. Pattern matching on struct fields is idiomatic Elixir, not a “leak.”

The underlying metric the analysis was trying to surface is coupling: if a struct’s shape changes, how many modules break? For app-internal structs the compiler has your back, so it’s a non-issue.

The concern only applies at library/context boundaries, where @opaque or a module API can help — but that’s a design choice, not a rule. I’ve updated the report.

The old text:

All structs that are shared across modules have 100% logic leak rate. Every consumer directly pattern-matches or manipulates struct fields instead of using accessor functions. This creates tight coupling — any field rename or restructuring breaks all consumers.

Needs to be replaced with:

All structs shared across modules are accessed by direct field matching in every consumer. This is idiomatic Elixir: structs are transparent data, and pattern matching on them is expected. However, it does mean that any field rename or restructuring will break all consumers simultaneously. For application-internal structs this is acceptable (the compiler catches it). For structs that cross library/context boundaries, consider defining a public API in the owning module (e.g. `Issue.title(issue)`) or using `@opaque` typespecs to signal that callers should not depend on the internal shape.

Thanks for the feedback

egeersoz · March 8, 2026, 7:27pm

Even the original AST appears to be written by an LLM. The “Not Python. Not Go. Not Node. Elixir.” is a dead giveaway.

Do we need some policies regarding posting LLM content?

Dmk · March 10, 2026, 12:43am

Is Giulia open source, or do you have a link to it? I’ve searched for it but can’t find it, so I presume it’s not?

Alex66 · March 10, 2026, 4:48am

Not open source yet, still in active development.
