OpenAI just chose Elixir to orchestrate AI agents. So I ran their code through mine

I ran Symphony through Giulia; here's the AST analysis.

Yesterday, OpenAI open-sourced Symphony, an autonomous agent orchestration framework. It monitors Linear issues in real time, dispatches Codex agents, handles CI testing and code review, and merges PRs, all without humans supervising every step.

They built it in Elixir/OTP.

Not Python. Not Go. Not Node. Elixir.


A link to Giulia so I can bookmark it?


FWIW, a bunch of code in the HttpServer module disappeared when it was replaced with Phoenix in a followup commit.


# OpenAI Symphony (Elixir) — Giulia Code Intelligence Report

**Date**: 2026-03-06

**Analyzer**: Giulia v0.1.0.127 (Build 127)

**Project**: openai/symphony — Elixir implementation

**Previous Report**: 2026-03-05 (v0.1.0.100, Build 91)

---

## 1. Executive Summary

Symphony is an Elixir-based orchestrator that polls Linear for issues and dispatches them to OpenAI Codex-backed workers. The codebase is **compact but concentrated**: 34 files, 37 modules, 592 functions. Three modules carry disproportionate complexity and risk, forming the project’s structural bottleneck. Three modules have test coverage (SpecsCheck, CLI, LogFile); the remaining 33 are untested.

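| Metric | Value | Delta vs 2026-03-05 |
|---|---|---|
| Files | 34 | |
| Modules | 37 | |
| Functions | 592 | |
| Specs | 137 | |
| Types | 11 | |
| Structs | 4 | |
| Callbacks | 5 | |
| Knowledge Graph Vertices | 628 | |
| Knowledge Graph Edges | 668 | |
| Connected Components | 153 | |
| Dead Code | 0 | |
| Tested Modules | 3 | +3 (bug fix in test detection) |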

---

## 2. Health Overview

### Zone Distribution

| Zone | Count | Percentage | Delta vs 2026-03-05 |
|---|---|---|---|
| RED | 3 | 8% | |
| YELLOW | 15 | 42% | -3 (was 18 / 50%) |
| GREEN | 18 | 50% | +3 (was 15 / 42%) |

> **Note**: The previous report (2026-03-05) reported YELLOW 18 (50%) / GREEN 15 (42%). A bug in Giulia’s test detection caused SpecsCheck, CLI, and LogFile to appear untested. With the fix, these three modules correctly show has_test: true, dropping their heatmap scores and shifting them from YELLOW to GREEN.

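### Red Zone Modules (Critical)

| Module | Score | Complexity | Centrality | Max Coupling | Tests |
|---|---|---|---|---|---|
| SymphonyElixir.Config | 83 | 169 | 10 | 45 | No |
| SymphonyElixir.Orchestrator | 75 | 198 | 3 | 69 | No |
| SymphonyElixir.StatusDashboard | 73 | 331 | 2 | 75 | No |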

### Unprotected Hubs

| Module | Severity | In-Degree | Public Functions | Spec Ratio | Doc Ratio | Tests |
|---|---|---|---|---|---|---|
| SymphonyElixir.Orchestrator | YELLOW | 3 | 12 | 75% | 0% | No |

---

## 3. Topology & Dependency Analysis

### Hub Modules (Most Depended-On)

| Module | Total Degree | In-Degree (Fan-In) | Out-Degree (Fan-Out) |
|---|---|---|---|
| SymphonyElixir.Config | 11 | 10 | 1 |
| SymphonyElixir.Orchestrator | 9 | 3 | 6 |
| SymphonyElixir.AgentRunner | 7 | 1 | 6 |
| SymphonyElixir.StatusDashboard | 6 | 2 | 4 |
| SymphonyElixir.Tracker | 5 | 4 | 1 |

### Dependency Cycles (2 detected)

**Cycle 1** — Triangular (3 modules):

```
SymphonyElixir.HttpServer → SymphonyElixir.Orchestrator → SymphonyElixir.StatusDashboard → SymphonyElixir.HttpServer
```

**Risk**: Process startup ordering issues, potential deadlocks during initialization.

**Cycle 2** — Bilateral (2 modules):

```
SymphonyElixir.Workflow ↔ SymphonyElixir.WorkflowStore
```

**Risk**: Lower severity; typical for GenServer + accessor module pairs.
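For anyone who wants to reproduce this kind of check by hand, here is a minimal sketch using Erlang's built-in `:digraph` and `:digraph_utils`. It is not how Giulia detects cycles (that is not described here); the module name `Sketch.Cycles` and the `find/1` function are hypothetical, and the edge list only encodes the two cycles reported above plus one non-cyclic edge for contrast.

```elixir
defmodule Sketch.Cycles do
  # Hypothetical helper: takes a list of {from, to} dependency edges and
  # returns every strongly connected component that contains a cycle.
  def find(edges) do
    graph = :digraph.new()

    try do
      for {from, to} <- edges do
        :digraph.add_vertex(graph, from)
        :digraph.add_vertex(graph, to)
        :digraph.add_edge(graph, from, to)
      end

      graph
      |> :digraph_utils.cyclic_strong_components()
      # Sort for a stable result; :digraph_utils does not promise an order.
      |> Enum.map(&Enum.sort/1)
      |> Enum.sort()
    after
      # :digraph is backed by ETS tables, so release them explicitly.
      :digraph.delete(graph)
    end
  end
end

edges = [
  {SymphonyElixir.HttpServer, SymphonyElixir.Orchestrator},
  {SymphonyElixir.Orchestrator, SymphonyElixir.StatusDashboard},
  {SymphonyElixir.StatusDashboard, SymphonyElixir.HttpServer},
  {SymphonyElixir.Workflow, SymphonyElixir.WorkflowStore},
  {SymphonyElixir.WorkflowStore, SymphonyElixir.Workflow},
  # Non-cyclic edge, included only to show it is ignored.
  {SymphonyElixir.Orchestrator, SymphonyElixir.Config}
]

cycles = Sketch.Cycles.find(edges)
```

The module names are plain atoms here, so the snippet runs without the Symphony code being loaded.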

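### God Modules

| Module | Functions | Complexity | Centrality | God Score |
|---|---|---|---|---|
| StatusDashboard | 117 | 331 | 2 | 785 |
| Orchestrator | 95 | 198 | 3 | 500 |
| Config | 84 | 169 | 10 | 452 |
| Codex.AppServer | 45 | 107 | 1 | 262 |
| Linear.Client | 35 | 80 | 2 | 201 |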

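---

## 4. Change Risk Analysis

Ranked by composite refactoring priority score:

| Rank | Module | Score | Complexity | Public | Private | API Ratio | Centrality |
|---|---|---|---|---|---|---|---|
| 1 | Config | 3270 | 169 | 31 | 53 | 37% | 10 |
| 2 | StatusDashboard | 1904 | 331 | 15 | 102 | 13% | 2 |
| 3 | Orchestrator | 1632 | 198 | 12 | 83 | 13% | 3 |
| 4 | Codex.AppServer | 475 | 107 | 4 | 41 | 9% | 1 |
| 5 | Linear.Client | 466 | 80 | 8 | 27 | 23% | 2 |
| 6 | Workspace | 278 | 43 | 5 | 15 | 25% | 2 |
| 7 | Presenter | 194 | 26 | 3 | 13 | 19% | 2 |
| 8 | SpecsCheck | 160 | 33 | 2 | 11 | 15% | 1 |
| 9 | Workflow | 160 | 18 | 6 | 4 | 60% | 3 |
| 10 | Tracker | 156 | 7 | 6 | 0 | 100% | 4 |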

**Key Insight**: Config has the highest change risk (3270) due to 10 dependents (fan-in). Any change to Config’s API ripples through 10 direct consumers and 5 additional modules at depth 2, totaling **15 affected modules** (40% of the codebase).

---

## 5. Impact Analysis (Blast Radius)

### SymphonyElixir.Config (Change Risk: 3270)

- **Direct dependents (10)**: AgentRunner, Codex.AppServer, HttpServer, Linear.Client, Orchestrator, PromptBuilder, StatusDashboard, Tracker, Workspace, Presenter

- **Depth-2 dependents (5)**: Codex.DynamicTool, Linear.Adapter, Tracker.Memory, DashboardLive, ObservabilityApiController

- **Total blast radius**: 15 modules (40% of codebase)

- **Upstream**: Workflow → WorkflowStore
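The depth-1 / depth-2 split above is a breadth-first walk over reversed dependency edges. Below is a minimal sketch of that idea, assuming a plain map of `module => [modules it depends on]`. The module `Sketch.BlastRadius`, its function name, and the toy graph are all hypothetical and are not taken from Giulia or from Symphony's real dependency data.

```elixir
defmodule Sketch.BlastRadius do
  # deps: %{module => [modules it depends on]}
  # Returns %{depth => sorted list of modules first reached at that depth}.
  def dependents_by_depth(deps, target, max_depth) do
    # Reverse the edges: for each module, who depends on it?
    reverse =
      for {mod, targets} <- deps, t <- targets, reduce: %{} do
        acc -> Map.update(acc, t, [mod], &[mod | &1])
      end

    walk(reverse, [target], MapSet.new([target]), 1, max_depth, %{})
  end

  defp walk(_reverse, _frontier, _seen, depth, max_depth, acc) when depth > max_depth, do: acc

  defp walk(reverse, frontier, seen, depth, max_depth, acc) do
    next =
      frontier
      |> Enum.flat_map(&Map.get(reverse, &1, []))
      |> Enum.uniq()
      |> Enum.reject(&MapSet.member?(seen, &1))

    case next do
      [] ->
        acc

      _ ->
        seen = MapSet.union(seen, MapSet.new(next))
        walk(reverse, next, seen, depth + 1, max_depth, Map.put(acc, depth, Enum.sort(next)))
    end
  end
end

# Toy graph (assumed, for illustration only).
deps = %{
  config: [],
  tracker: [:config],
  memory: [:tracker],
  orchestrator: [:config, :tracker],
  presenter: [:orchestrator]
}

radius = Sketch.BlastRadius.dependents_by_depth(deps, :config, 2)
```

In the toy graph, `:orchestrator` depends on both `:config` and `:tracker` but is only counted once, at depth 1, which is the behaviour you want when totalling a blast radius.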

### SymphonyElixir.Orchestrator (Change Risk: 1632)

- **Direct dependents (3)**: HttpServer, StatusDashboard, Presenter

- **Depth-2 dependents (2)**: DashboardLive, ObservabilityApiController

- **Total blast radius downstream**: 5 modules

- **Upstream (6)**: AgentRunner, Config, Linear.Issue, StatusDashboard, Tracker, Workspace

- **Total blast radius upstream**: 11 modules

- **Private function count**: 83 (87% of the module's 95 functions are private)

### SymphonyElixir.StatusDashboard (Change Risk: 1904)

- **Direct dependents (2)**: Orchestrator, Presenter

- **Depth-2 dependents (3)**: HttpServer, DashboardLive, ObservabilityApiController

- **Total blast radius downstream**: 5 modules

- **Upstream (4)**: Config, HttpServer, Orchestrator, ObservabilityPubSub

- **117 functions** — largest module in the project by function count

---

## 6. Structural Audit

### Struct Lifecycle Analysis

| Struct | Defining Module | Users | Logic Leaks |
|---|---|---|---|
| SymphonyElixir.Linear.Issue | Linear.Issue | 4 | 4 (all leak) |
| State (Orchestrator) | State | 2 | 2 (all leak) |
| State (WorkflowStore) | State | 2 | 2 (all leak) |
| SymphonyElixir.StatusDashboard | StatusDashboard | 0 | 0 |

**Finding**: All structs that are shared across modules have **100% logic leak rate**. Every consumer directly pattern-matches or manipulates struct fields instead of using accessor functions. This creates tight coupling — any field rename or restructuring breaks all consumers.

**Notable**: Two different modules define a State struct (Orchestrator and WorkflowStore). This naming collision could cause confusion.

### Semantic Duplicates

**Cluster 1** (High similarity: 92.9%):

- Mix.Tasks.PrBody.Check.run/1

- Mix.Tasks.Specs.Check.run/1

- Mix.Tasks.Workspace.BeforeRemove.run/1

These three Mix tasks share nearly identical structure. Candidate for extraction of shared boilerplate.

**Cluster 2** (Low similarity: 37.7%, 144 members):

Large cluster of accessor/delegate functions across Config and Tracker modules — expected pattern for configuration getters and behaviour delegates.

### Behaviour Integrity

**Status**: Consistent. No behaviour fractures detected.

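---

## 7. Spec Coverage Analysis

| Module | Public Functions | Specs | Coverage |
|---|---|---|---|
| Config | 31 | 31 | 100% |
| Orchestrator | 12 | 9 | 75% |
| StatusDashboard | 15 | ~15 | ~100% |
| Codex.AppServer | 4 | 4 | 100% |
| Linear.Client | 8 | 8 | 100% |
| Tracker | 6 | 6 | 100% |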

**Overall**: 137 specs across 592 functions (23% total, but public function coverage is significantly higher).

---

## 8. Architectural Observations

### What They Did Well

1. **Clean behaviour abstraction** for Tracker (Linear.Adapter + Tracker.Memory) — allows swapping issue trackers

2. **Config is fully spec’d** (31/31 public functions) despite being the highest-risk module

3. **Zero dead code** — clean codebase with no orphaned functions

4. **Behaviour integrity is consistent** — no broken contracts

5. **Workflow-driven configuration** via WORKFLOW.md — flexible runtime config without recompilation

6. **Test coverage exists** for SpecsCheck, CLI, and LogFile — the three most utility-oriented modules
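The behaviour abstraction in point 1 follows a standard Elixir shape: one module declares the callbacks, and each backend implements them. The sketch below shows that shape only. The callback `fetch_issues/0`, the `Sketch.*` module names, and the returned data are hypothetical and do not reflect Symphony's actual Tracker contract.

```elixir
defmodule Sketch.Tracker do
  # Hypothetical contract; the real Tracker callbacks are not shown in this report.
  @callback fetch_issues() :: {:ok, [map()]} | {:error, term()}

  # The adapter is passed in so callers (or tests) can swap backends.
  def fetch_issues(adapter \\ Sketch.Tracker.Memory), do: adapter.fetch_issues()
end

defmodule Sketch.Tracker.Memory do
  @behaviour Sketch.Tracker

  # In-memory backend, handy for tests; returns canned data.
  @impl true
  def fetch_issues, do: {:ok, [%{id: "MEM-1", title: "Example issue"}]}
end

result = Sketch.Tracker.fetch_issues()
```

Because `@impl true` is checked at compile time, a backend that drifts from the behaviour produces a compiler warning, which is presumably the kind of "behaviour fracture" the report checks for.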

### Structural Concerns

1. **StatusDashboard is a god module** (117 functions, complexity 331, score 785) — it handles rendering, TPS calculation, Codex message humanization, terminal formatting, and snapshot management. This is at least 4 distinct responsibilities.

2. **Orchestrator carries too much** (95 functions, 83 private) — polling, dispatching, retry logic, token accounting, issue reconciliation, rate limiting all in one GenServer.

3. **Config’s fan-in of 10** means any API change is a project-wide event. The 53 private helper functions suggest this module is doing too much parsing/validation internally.

4. **Limited test coverage** — only 3 of 37 modules have tests. All three red-zone modules (Config, Orchestrator, StatusDashboard) remain untested.

5. **Triangular cycle** (HttpServer → Orchestrator → StatusDashboard) creates initialization coupling that could cause startup race conditions.

6. **100% struct logic leak** — no encapsulation on shared data structures.

---

## 9. Refactoring Recommendations

### Priority 1: StatusDashboard Decomposition

- **Current**: 117 functions, complexity 331, god score 785

- **Suggested split**:

  - StatusDashboard.Renderer: terminal output, colorize, border, table formatting

  - StatusDashboard.TPS: rolling_tps, throttled_tps, token sample management

  - StatusDashboard.CodexHumanizer: humanize_codex_message, event parsing, command normalization

  - StatusDashboard: GenServer shell (init, handle_info, notify_update)

- **Expected**: Complexity 331 → ~80 max per module
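The general shape of this split is a thin GenServer that owns state and delegates the arithmetic to a pure module. Here is a minimal sketch of that pattern for the TPS piece. Only the name `rolling_tps` comes from the report; the sample format, the 5-second window, the `Sketch.*` module names, and every function body are assumptions, not Symphony's implementation.

```elixir
defmodule Sketch.TPS do
  # Pure functions only: easy to unit-test without starting a process.
  # samples: list of {timestamp_ms, cumulative_tokens}, newest first (assumed format).
  def add_sample(samples, now_ms, total_tokens, window_ms) do
    kept = Enum.filter(samples, fn {ts, _tokens} -> now_ms - ts <= window_ms end)
    [{now_ms, total_tokens} | kept]
  end

  def rolling_tps([]), do: 0.0
  def rolling_tps([_single]), do: 0.0

  def rolling_tps(samples) do
    {newest_ts, newest_tokens} = hd(samples)
    {oldest_ts, oldest_tokens} = List.last(samples)
    elapsed_ms = newest_ts - oldest_ts

    if elapsed_ms > 0 do
      (newest_tokens - oldest_tokens) * 1000 / elapsed_ms
    else
      0.0
    end
  end
end

defmodule Sketch.Dashboard do
  # GenServer shell: holds the samples, delegates all logic to Sketch.TPS.
  use GenServer

  @window_ms 5_000

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, [], opts)
  def record(pid, now_ms, total_tokens), do: GenServer.cast(pid, {:sample, now_ms, total_tokens})
  def tps(pid), do: GenServer.call(pid, :tps)

  @impl true
  def init(samples), do: {:ok, samples}

  @impl true
  def handle_cast({:sample, now_ms, total_tokens}, samples) do
    {:noreply, Sketch.TPS.add_sample(samples, now_ms, total_tokens, @window_ms)}
  end

  @impl true
  def handle_call(:tps, _from, samples), do: {:reply, Sketch.TPS.rolling_tps(samples), samples}
end
```

Timestamps are passed in rather than read from the clock, so the pure module can be tested deterministically.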

### Priority 2: Orchestrator Decomposition

- **Current**: 95 functions, complexity 198

- **Suggested split**:

  - Orchestrator.Dispatcher: issue dispatch, slot management, retry scheduling

  - Orchestrator.Reconciler: state reconciliation, stall detection, terminal cleanup

  - Orchestrator.TokenAccounting: usage extraction, delta computation, rate limits

  - Orchestrator: GenServer shell (init, handle_info, handle_call, snapshot)

- **Expected**: Complexity 198 → ~50 max per module

### Priority 3: Break the Triangular Cycle

- Introduce a PubSub or event bus between HttpServer, Orchestrator, and StatusDashboard

- StatusDashboard should subscribe to events rather than being directly called by Orchestrator
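One way to picture the subscribe-instead-of-call idea is a small publish/subscribe layer on top of Elixir's standard `Registry` with duplicate keys. This is a sketch under assumptions: `Sketch.Events`, the topic name, and the message shape are hypothetical, and a project already using Phoenix might well reach for `Phoenix.PubSub` instead.

```elixir
defmodule Sketch.Events do
  # Hypothetical event bus built on the standard-library Registry.
  @registry Sketch.Events.Registry

  def start_link do
    Registry.start_link(keys: :duplicate, name: @registry)
  end

  # Called by the consumer (e.g. a dashboard process) to receive events.
  def subscribe(topic) do
    {:ok, _owner} = Registry.register(@registry, topic, nil)
    :ok
  end

  # Called by the producer; it never references the consumer's module.
  def broadcast(topic, message) do
    Registry.dispatch(@registry, topic, fn entries ->
      for {pid, _value} <- entries, do: send(pid, {:event, topic, message})
    end)
  end
end

{:ok, _registry} = Sketch.Events.start_link()
:ok = Sketch.Events.subscribe(:orchestrator)
:ok = Sketch.Events.broadcast(:orchestrator, {:snapshot, 3})

received =
  receive do
    {:event, :orchestrator, payload} -> payload
  after
    100 -> :timeout
  end
```

With this shape the producer depends only on the bus, so the compile-time edge from the producer to the consumer disappears and the cycle is broken at that point.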

### Priority 4: Struct Encapsulation

- Add accessor functions to Linear.Issue and enforce access through module API

- Rename duplicate State structs to avoid confusion (Orchestrator.State, WorkflowStore.State)

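---

## 10. Full Heatmap

| Module | Score | Zone | Complexity | Centrality | Max Coupling | Tests |
|---|---|---|---|---|---|---|
| Config | 83 | RED | 169 | 10 | 45 | No |
| Orchestrator | 75 | RED | 198 | 3 | 69 | No |
| StatusDashboard | 73 | RED | 331 | 2 | 75 | No |
| Codex.AppServer | 50 | YELLOW | 107 | 1 | 25 | No |
| Linear.Client | 44 | YELLOW | 80 | 2 | 13 | No |
| Workspace | 39 | YELLOW | 43 | 2 | 13 | No |
| Tracker | 38 | YELLOW | 7 | 4 | 12 | No |
| Presenter | 36 | YELLOW | 26 | 2 | 10 | No |
| Linear.Issue | 36 | YELLOW | 1 | 4 | 10 | No |
| Mix.Tasks.PrBody.Check | 36 | YELLOW | 38 | 0 | 16 | No |
| Workflow | 35 | YELLOW | 18 | 3 | 5 | No |
| Codex.DynamicTool | 32 | YELLOW | 30 | 1 | 4 | No |
| AgentRunner | 32 | YELLOW | 28 | 1 | 6 | No |
| WorkflowStore | 32 | YELLOW | 28 | 1 | 6 | No |
| Endpoint | 31 | YELLOW | 0 | 3 | 0 | No |
| DashboardLive | 30 | YELLOW | 24 | 0 | 7 | No |
| HttpServer | 30 | YELLOW | 16 | 1 | 5 | No |
| ObservabilityApiController | 30 | YELLOW | 10 | 0 | 10 | No |
| PromptBuilder | 29 | GREEN | 15 | 1 | 3 | No |
| Mix.Tasks.Workspace.BeforeRemove | 29 | GREEN | 20 | 0 | 4 | No |
| Tracker.Memory | 29 | GREEN | 11 | 0 | 8 | No |
| ObservabilityPubSub | 29 | GREEN | 3 | 2 | 2 | No |
| Linear.Adapter | 28 | GREEN | 13 | 0 | 6 | No |
| StaticAssetController | 28 | GREEN | 6 | 0 | 8 | No |
| State (WorkflowStore) | 28 | GREEN | 28 | 0 | 0 | No |
| StaticAssets | 28 | GREEN | 2 | 1 | 4 | No |
| Mix.Tasks.Specs.Check | 26 | GREEN | 6 | 0 | 3 | No |
| Layouts | 26 | GREEN | 2 | 0 | 2 | No |
| ErrorJSON | 25 | GREEN | 1 | 0 | 1 | No |
| Router | 25 | GREEN | 0 | 0 | 0 | No |
| SymphonyElixir | 25 | GREEN | 3 | 0 | 1 | No |
| ErrorHTML | 25 | GREEN | 1 | 0 | 2 | No |
| Application | 25 | GREEN | 3 | 0 | 0 | No |
| SpecsCheck | 11 | GREEN | 33 | 1 | 13 | Yes |
| CLI | 9 | GREEN | 29 | 0 | 15 | Yes |
| LogFile | 5 | GREEN | 10 | 1 | 6 | Yes |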

---

## 11. Corrections vs Previous Report (2026-03-05)

The previous report was generated with Giulia v0.1.0.100, which had a bug in test detection. The following data points were incorrect:

| Data Point | Old (Incorrect) | New (Correct) |
|---|---|---|
| Zone distribution | RED 3 (8%), YELLOW 18 (50%), GREEN 15 (42%) | RED 3 (8%), YELLOW 15 (42%), GREEN 18 (50%) |
| Tested modules | 0 ("Zero tests detected") | 3 (SpecsCheck, CLI, LogFile) |
| SpecsCheck zone | YELLOW (score 36) | GREEN (score 11) |
| CLI zone | YELLOW (score 34) | GREEN (score 9) |
| LogFile zone | YELLOW (score 30) | GREEN (score 5) |

All other data (module counts, function counts, specs, change risk scores, god module rankings, dependency cycles, blast radius, structural audit findings) remains identical and was correctly reported.


*Report generated by Giulia Code Intelligence v0.1.0.127 (Build 127)*

*Analyzed via AST parsing + Knowledge Graph (628 vertices, 668 edges)*

*Previous report: 2026-03-05, Giulia v0.1.0.100 (Build 91)*


> Every consumer directly pattern-matches or manipulates struct fields instead of using accessor functions.

Since when do we write accessor functions in Elixir?

I think your LLM is leaking Java.


Good catch @FlyingNoodle — you’re absolutely right.

That paragraph was poorly framed. Pattern matching on struct fields is idiomatic Elixir, not a “leak.”

The underlying metric the analysis was trying to surface is coupling: if a struct’s shape changes, how many modules break? For app-internal structs the compiler has your back, so it’s a non-issue.

The concern only applies at library/context boundaries, where @opaque or a module API can help — but that’s a design choice, not a rule. I’ve updated the report.

The old text:

> All structs that are shared across modules have 100% logic leak rate. Every consumer directly pattern-matches or manipulates struct fields instead of using accessor functions. This creates tight coupling: any field rename or restructuring breaks all consumers.

Needs to be replaced with:

> All structs shared across modules are accessed by direct field matching in every consumer. This is idiomatic Elixir: structs are transparent data and pattern matching on them is expected. However, it does mean that any field rename or restructuring will break all consumers simultaneously. For application-internal structs this is acceptable (the compiler catches it). For structs that cross library/context boundaries, consider defining a public API in the owning module (e.g. Issue.title(issue)) or using @opaque typespecs to signal that callers should not depend on the internal shape.
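For the boundary case only, a rough sketch of what that could look like is below. It is an illustration, not Symphony's `Linear.Issue`: the module name `Sketch.Issue`, its fields, and `new/2` are made up for the example, and only `title/1` echoes the `Issue.title(issue)` call mentioned above.

```elixir
defmodule Sketch.Issue do
  # Hypothetical struct; field names are assumptions for illustration.
  @enforce_keys [:id, :title]
  defstruct [:id, :title, state: "Todo"]

  # @opaque tells Dialyzer that code outside this module should not
  # inspect the struct's internal shape.
  @opaque t :: %__MODULE__{id: String.t(), title: String.t(), state: String.t()}

  @spec new(String.t(), String.t()) :: t()
  def new(id, title), do: %__MODULE__{id: id, title: title}

  # Public API for callers on the other side of the boundary.
  @spec title(t()) :: String.t()
  def title(%__MODULE__{title: title}), do: title
end

issue = Sketch.Issue.new("ABC-1", "Fix flaky test")
title = Sketch.Issue.title(issue)
```

Note that `@opaque` is only enforced by Dialyzer, not by the compiler, and inside a single application a plain `@type` with direct pattern matching remains the idiomatic choice.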

Thanks for the feedback.

Even the original AST appears to be written by an LLM. The “Not Python. Not Go. Not Node. Elixir.” is a dead giveaway.

Do we need some policies regarding posting LLM content?


Is Giulia open source, or do you have a link to it? I’ve searched for it but can’t find it, so I presume it’s not?

Not open source yet, still in active development.