Best practice for SQS messages: JSON-encoded objects for the message body, or message attributes?

This is a general question, but what are people doing for their SQS message formats? In a lot of places I’ve worked, people just JSON-encoded an object, sent it as the SQS message body, and never used the SQS message attributes. Even the Broadway testing functions don’t put message attributes at the forefront: the envisioned use case is that all test messages in a pipeline share the same message attributes.

Is the JSON-encoding route the “proper” way? I can see some advantages to using message attributes (mostly in pattern matching), but it’s usually a chunk more work.
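To make the two options concrete, here’s a minimal sketch in Python (this is an Elixir/Broadway thread, but the shapes are language-agnostic). The event fields and attribute names are invented for illustration; the dicts mirror the argument shapes that AWS SDKs such as boto3 use for `send_message`, without actually calling AWS.

```python
import json

# Hypothetical event; the field names are illustrative only.
order = {"type": "order_created", "order_id": 123}

# Option 1: everything lives in a JSON-encoded message body.
body_style = {
    "MessageBody": json.dumps(order),
}

# Option 2: routing info is duplicated into SQS message attributes
# (each attribute is a {"DataType": ..., "StringValue": ...} envelope).
attribute_style = {
    "MessageBody": json.dumps(order),
    "MessageAttributes": {
        "event_type": {"DataType": "String", "StringValue": "order_created"},
    },
}

# With option 1, a consumer must decode the body before it can route:
decoded = json.loads(body_style["MessageBody"])
print(decoded["type"])  # order_created
```

With option 2, the routing key is visible without touching the body, at the cost of duplicating it and tying the consumer to SQS’s attribute envelope.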

What are people’s thoughts on this?


As always with these “best practice” types of questions, the answer is “it depends.” There is no single best way of doing it; it really depends upon what your payload is. If, for example, you were sending PDF files or images, then JSON is a horrible choice. If you’re sending some structured data payload, then JSON is a fine choice (though there are cases where CSV, XML, Protobuf, or another format might be the best fit).

My personal use involves integration with a third-party company, so they dictated the message format. They have a message attribute (I asked the Broadway team to expose those) that is a cryptographic IV (initialization vector), which I have to read and combine with a shared secret to decrypt the JSON payload.
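The flow looks roughly like this, sketched in Python. The attribute name `iv`, the payload fields, and the received-message shape are all hypothetical (the vendor dictates the real format), and the decryption step is left as a placeholder since the cipher is vendor-specific:

```python
import json

# A received-message shape loosely modeled on what SQS hands back;
# the "iv" attribute name and body fields are invented for this sketch.
received = {
    "Body": json.dumps({"ciphertext": "...opaque-base64..."}),
    "MessageAttributes": {
        "iv": {"DataType": "Binary", "BinaryValue": b"0123456789abcdef"},
    },
}

# Step 1: pull the IV out of the message attributes -- this is the
# "infrastructural metadata" needed before the body is even readable.
iv = received["MessageAttributes"]["iv"]["BinaryValue"]

# Step 2 (placeholder): combine the IV with the shared secret to
# decrypt the body, then JSON-decode the plaintext:
# plaintext = decrypt(shared_secret, iv, received["Body"])
# payload = json.loads(plaintext)

print(len(iv))  # 16-byte IV for a 128-bit block cipher
```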

I would not use message attributes for domain-knowledge purposes, though. I see them more as infrastructural metadata. I also tend to be cautious about buying too deeply into one vendor’s implementation: the less you depend upon their special features, the easier it will be to migrate to some other queueing service. Tradeoffs apply, and your situation may not match mine.


That’s helpful, thanks! Could you elaborate a bit more on what you would consider “domain knowledge purposes” vs. “infrastructural metadata”? For example, message attributes are handy in pattern matching to route execution, but there would be no significant change if the JSON were decoded and the pattern matching happened one level below handle_message/3.

My example of the decryption IV is an example of infrastructural metadata. You need that to even be able to read the message body. Other examples might be SQS configuration (I’m just guessing. I really haven’t studied what SQS exposes).

Routing is a fine idea for message attributes. But if “there would be no significant change” to pattern matching on the body instead, then I’d default to that, because it leaves me less coupled to SQS.
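The two routing styles can be sketched side by side in Python (again, the event names, handler names, and attribute names here are invented for illustration, not from either codebase):

```python
import json

def route_by_body(raw_body: str) -> str:
    """Decode the JSON body first, then dispatch on a field inside it.
    Routing stays independent of SQS-specific message attributes."""
    event = json.loads(raw_body)
    handlers = {
        "order_created": "handle_order_created",
        "order_cancelled": "handle_order_cancelled",
    }
    return handlers.get(event.get("type"), "handle_unknown")

def route_by_attribute(message_attributes: dict) -> str:
    """Dispatch on an SQS message attribute without touching the body --
    convenient, but couples the router to SQS's attribute envelope."""
    attr = message_attributes.get("event_type", {})
    handlers = {"order_created": "handle_order_created"}
    return handlers.get(attr.get("StringValue"), "handle_unknown")

print(route_by_body(json.dumps({"type": "order_created"})))
# handle_order_created
print(route_by_attribute(
    {"event_type": {"DataType": "String", "StringValue": "order_created"}}
))
# handle_order_created
```

Both arrive at the same handler; the body-based version just does the dispatch one JSON-decode later, which is the “no significant change” point above.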

When you control the format of the messages you put on and take off the queues, maybe there are design forces I’m not considering. I’m not in that situation and haven’t spent time weighing all the details of it.

By “no significant change”, I just mean that the pattern matching would move downstream (i.e. to somewhere after the JSON in the message body had been decoded). I can see that there are lots of ways to skin this cat…

Thanks for the follow-up!