Is it possible to create a ComfyUI-like system in Elixir or something similar?

This idea goes beyond just building a “ComfyUI for Phoenix.” I want to combine the BEAM’s parallelism and fault tolerance with Elixir’s elegant syntax (perhaps even a friendly DSL), which are precisely the reasons I started learning Elixir.

Before posting, I looked at existing Flow-Based Programming (FBP) projects. However, most packages either don’t fully match my vision or are domain-specific (e.g., aimed at web applications, like Ash/Reactor).

While I welcome discussions about flow programming/job orchestration feasibility in Elixir, I want to ground this conversation in practical context to avoid purely theoretical debates.

My motivation comes from an ongoing project: a singing synthesizer interface/app that integrates multiple models (the initial goal is an Elixir wrapper for DiffSinger with a simple WebUI).

For those familiar with singing synthesizers like Vocaloid: generating audio from lyrics involves multiple steps and requires parameter adjustments at several levels of abstraction (note duration, syllable timing, pitch curves, etc.). On top of that, users have to choose among different models and phoneme dictionaries, which inspired me to build a tool that lowers the barrier to entry.

A ComfyUI-like system seems ideal for this scenario. A “track” could be defined as a workflow of interdependent tasks, where a task may depend on the results of other tasks.
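To make the "workflow of interdependent tasks" idea concrete, here is a minimal sketch that models a track as a map from each task to the tasks it depends on and computes a valid execution order. The module name `TrackSketch` and the stage names are hypothetical and only for illustration; they are not taken from the QyEditor repo.

```elixir
defmodule TrackSketch do
  @moduledoc "Hypothetical sketch: a track as a map of task => list of dependencies."

  # Returns {:ok, order} where every task comes after its dependencies,
  # or {:error, :unresolvable} for a cycle or a dependency on an unknown task.
  def order(graph), do: do_order(graph, [], MapSet.new())

  defp do_order(graph, acc, _done) when map_size(graph) == 0 do
    {:ok, Enum.reverse(acc)}
  end

  defp do_order(graph, acc, done) do
    # A task is ready once all of its dependencies have been scheduled.
    ready =
      graph
      |> Enum.filter(fn {_task, deps} -> Enum.all?(deps, &MapSet.member?(done, &1)) end)
      |> Enum.map(fn {task, _deps} -> task end)
      |> Enum.sort()

    case ready do
      [] ->
        {:error, :unresolvable}

      _ ->
        do_order(Map.drop(graph, ready), Enum.reverse(ready) ++ acc, Enum.into(ready, done))
    end
  end
end

# Illustrative stage names only.
track = %{
  lyrics: [],
  phonemes: [:lyrics],
  durations: [:phonemes],
  pitch: [:phonemes, :durations],
  audio: [:durations, :pitch]
}

IO.inspect(TrackSketch.order(track))
```

Tasks that become ready in the same round do not depend on each other, so a scheduler could in principle run them in parallel.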

I’ve created a repo, SynapticStrings/QyEditor, described as a “Lightweight synthesizer interface.” For now the project is mostly in Chinese, partly because of my limited English and partly because I am intentionally delaying internationalization (docs, comments, etc.) until the core is stable.

The architecture splits into two applications:

  1. :qy_core handles parameter manipulation and chaining (similar to Plug’s philosophy).
  2. :qy_flow manages parallelism, task scheduling, and process orchestration using libraries like GenStage/Flow.
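As a rough illustration of this split, the sketch below chains parameter steps over a context struct in the spirit of Plug (each step takes a context and returns one, and a step can halt the chain), then runs several independent contexts concurrently. All names here (`CoreSketch`, `FlowSketch`, the parameter keys) are hypothetical, and `Task.async_stream/2` from the standard library stands in for GenStage/Flow to keep the example dependency-free.

```elixir
defmodule CoreSketch do
  # Hypothetical context struct, loosely analogous to Plug.Conn.
  defstruct params: %{}, halted: false, errors: []

  def new(params \\ %{}), do: %__MODULE__{params: params}

  def put(ctx, key, value), do: %{ctx | params: Map.put(ctx.params, key, value)}

  def halt(ctx, reason), do: %{ctx | halted: true, errors: [reason | ctx.errors]}

  # Runs steps (functions from context to context) in order,
  # stopping as soon as one of them halts the context.
  def run(ctx, steps) do
    Enum.reduce_while(steps, ctx, fn step, acc ->
      case step.(acc) do
        %__MODULE__{halted: true} = halted -> {:halt, halted}
        %__MODULE__{} = next -> {:cont, next}
      end
    end)
  end
end

defmodule FlowSketch do
  # Runs each context in its own BEAM process; results keep the input order.
  def run_all(contexts, steps) do
    contexts
    |> Task.async_stream(&CoreSketch.run(&1, steps))
    |> Enum.map(fn {:ok, ctx} -> ctx end)
  end
end

# Illustrative steps only.
steps = [
  &CoreSketch.put(&1, :tempo, 120),
  fn ctx ->
    if ctx.params[:lyrics], do: ctx, else: CoreSketch.halt(ctx, :missing_lyrics)
  end,
  &CoreSketch.put(&1, :pitch_curve, :flat)
]

contexts = [CoreSketch.new(%{lyrics: "la la"}), CoreSketch.new()]
IO.inspect(FlowSketch.run_all(contexts, steps))
```

Keeping the steps as plain functions means the chaining logic stays testable without any processes, while the scheduling layer only decides where and when each chain runs.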

Key questions:

  1. Is building a ComfyUI-like system in Elixir feasible?
  2. Am I on the right track with this architecture? What should be my next steps?

P.S. I know the best path would be to implement DiffSinger’s pipeline in Elixir first. However, calling its ONNX model via Ortex throws errors, and my limited Rust/ML debugging skills have kept me from resolving them.

It is very likely possible, but it would require a lot of community work and a few contributors.

It is even more feasible now with Pythonx.