Seeking thoughts on the design of my umbrella project: Multinode WhatsApp Gateway and Bot App

Hey fellow devs,

This is my first post in this awesome community, and I’m excited to share my project with you all. As a newbie, I’d love to get your feedback on my design.

Here’s the context: I have an umbrella project with two apps. The first app is a gateway that receives a high volume of messages from Meta’s WhatsApp Cloud API. These messages are destined for different business account numbers (in Meta’s jargon). To simplify, imagine receiving messages like {"1800555", "message 1 from whatsapp consumer"}, {"18007777", "message 2 from another whatsapp consumer"}, and so on.

The second app is a bot that processes these messages. Now, here’s how my architecture works: the gateway (a Plug endpoint) is a node called gateway_node@127.0.0.1. When a message arrives, it reads from Mnesia, which has a map of business accounts to destination nodes. The destination node is a copy of the bot app (app 2), which can be something like customer_services_node@127.0.0.1 or another_company@127.0.0.1.
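
For anyone following along, the lookup described above might look roughly like the sketch below. It is not the actual MyApp code: the `RoutingTable` module, the `:business_routes` table name, and its attributes are all assumptions made for illustration, and it uses a RAM-only table with dirty operations to keep things short.

```elixir
defmodule RoutingTable do
  # Hypothetical table: business account id -> {destination node, app name}
  @table :business_routes

  # Starts Mnesia and creates the in-memory table if it does not exist yet.
  def setup do
    :ok = :mnesia.start()

    case :mnesia.create_table(@table,
           attributes: [:business_account_id, :node, :app],
           ram_copies: [node()]
         ) do
      {:atomic, :ok} -> :ok
      {:aborted, {:already_exists, @table}} -> :ok
    end
  end

  # Adds or replaces the route for one business account at runtime.
  def register(business_account_id, target_node, app) do
    :mnesia.dirty_write({@table, business_account_id, target_node, app})
  end

  # Returns {:ok, node, app} for a known account, :error otherwise.
  def lookup(business_account_id) do
    case :mnesia.dirty_read(@table, business_account_id) do
      [{@table, ^business_account_id, target_node, app}] -> {:ok, target_node, app}
      [] -> :error
    end
  end
end
```

Returning `:error` for an unknown account gives the gateway an explicit branch for messages that have no registered destination.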

The advantage of this setup is that I can add nodes (business accounts) on demand without affecting the others.

So, what do you think? Is this approach too newbie-ish? Any recommendations?

Thanks in advance!

Node-per-tenant is somewhat unusual; more commonly you’d see a setup where a DynamicSupervisor on a single node manages one tree of processes per bot.
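
A minimal sketch of that layout follows. It is only an illustration: `BotRegistry`, `BotDynamicSupervisor`, `BotSupervisor`, and the internals of `Bot` are made-up names, and each bot is a single GenServer here rather than a full tree of processes.

```elixir
defmodule Bot do
  use GenServer

  # One Bot process per business account, registered under its id.
  def start_link(business_account_id) do
    GenServer.start_link(__MODULE__, business_account_id, name: via(business_account_id))
  end

  # Routes a message to the bot that owns this business account.
  def handle_message(business_account_id, message) do
    GenServer.call(via(business_account_id), {:message, message})
  end

  defp via(id), do: {:via, Registry, {BotRegistry, id}}

  @impl true
  def init(id), do: {:ok, %{id: id, history: []}}

  @impl true
  def handle_call({:message, message}, _from, state) do
    # State is private to this process, so accounts stay isolated.
    state = %{state | history: [message | state.history]}
    {:reply, {:ok, length(state.history)}, state}
  end
end

defmodule BotSupervisor do
  # Starts the registry and the (initially empty) dynamic supervisor.
  def start_link do
    children = [
      {Registry, keys: :unique, name: BotRegistry},
      {DynamicSupervisor, name: BotDynamicSupervisor, strategy: :one_for_one}
    ]

    Supervisor.start_link(children, strategy: :one_for_all)
  end

  # Adds a bot for a new business account while the app keeps running.
  def add_business(business_account_id) do
    DynamicSupervisor.start_child(BotDynamicSupervisor, {Bot, business_account_id})
  end
end
```

With something like this in place, calling `BotSupervisor.add_business("1800555")` starts a bot for that account without touching the ones already running.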

This lets all the instances of Bot share the rest of the BEAM support machinery, versus having separate copies in each client’s node.

This may not be possible if instances of Bot need a lot of resources - for instance, if you’re running LLM inference on a GPU in them - in which case splitting onto different physical nodes may be the only option.

But you’d need to restart the app to add a business, and that is what OP is trying to avoid.

Why won’t OP be able to do it without a restart? They can use the remote shell to trigger a function for creating a business. There’s a myriad other ways.

Sorry, I was already half sleeping and missed the ‘Dynamic’ part of ‘DynamicSupervisor’ :pensive:

Thanks @al2o3cr so much for the insight. I think I need to dust off my notes from Sasa Juric’s book to fully wrap my head around the DynamicSupervisor setup you described. Appreciate the guidance, and I’ll definitely consider the resource implications for the bot instances.

Totally get it @BartOtten, no one wants to restart the app mid-convo :slight_smile:

@krasenyp I’ve explored a solution that involves storing a table in memory (using :mnesia), connecting to the remote node, and registering the relationship between the business account and the new node.

The upside is that the app as a whole never has to interrupt conversations, and I can create new business accounts ‘on the fly’. Each bot app has its own state, conversation history, and client profile, and it’s crucial that this info is isolated between business accounts.

On top of that, I’m using :rpc.call and :rpc.cast to route to the destination node (the bot). The bot calls an LLM, and depending on the state, it performs long-running processes, image and audio processing, external API calls, and even calls a PyTorch model.
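
To make the difference between the two calls concrete, here is a small sketch. The `RpcDemo` module and its function names are invented for this example; only `:rpc.cast/4` and `:rpc.call/5` come from OTP.

```elixir
defmodule RpcDemo do
  # Fire-and-forget: :rpc.cast/4 always returns true and discards the result,
  # so the caller never learns whether the remote function ran or crashed.
  def fire_and_forget(target_node, mod, fun, args) do
    :rpc.cast(target_node, mod, fun, args)
  end

  # Synchronous: :rpc.call/5 waits up to timeout_ms for the result and
  # reports failures (node down, timeout, remote crash) as {:badrpc, reason}.
  def request(target_node, mod, fun, args, timeout_ms \\ 5_000) do
    case :rpc.call(target_node, mod, fun, args, timeout_ms) do
      {:badrpc, reason} -> {:error, reason}
      result -> {:ok, result}
    end
  end
end
```

Cast suits long-running bot work where the gateway should not block, but a message sent to an unreachable node is silently lost unless something else (a queue, a retry, an acknowledgement from the bot) covers that case.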

I’m sharing a snippet of the code that decides which node to send the request to. Guys, it’s working for now, but I’m not sure if it’s good practice. Any recommendations are welcome!

business_account_id = "...." # get from Meta request

# discover_node_target/1 reads the account -> node mapping from :mnesia
case MyApp.discover_node_target(business_account_id) do
  [_, target_node, :target_app_name] ->
    # fire-and-forget: start the bot's state machine on the target node
    :rpc.cast(
      target_node,
      MyApp.StateMachine,
      :new,
      [
        business_account_id,
        Keyword.get(sender, :sender_phone_number),
        Keyword.get(sender, :message),
        Keyword.get(sender, :wa_message_id),
        Keyword.get(sender, :flow),
        Keyword.get(sender, :audio_id),
        Keyword.get(sender, :scheduled),
        Keyword.get(sender, :forwarded)
      ]
    )
...

# output example
iex(node1@10.0.0.28)1> MyApp.discover_node_target("742393808949918") 
["742393808949918", :"customer_service@10.0.0.28", :target_app_name]