Best Approach for Validating Thousands of Inputs in an Elixir SDK Project

franzin · January 25, 2025, 7:34pm

Hello everyone! I hope you’re all doing well.

I’m new to the world of Elixir, and recently I decided to create an SDK (I’ll share more details soon) as a way to practice.

So far, I haven’t encountered many issues developing the core of the SDK or the Clients themselves. However, I’m uncertain about the best way to handle the inputs coming from the Clients.

Most of the inputs are maps, which will be converted into JSON and sent to the service APIs via HTTPS requests. A malformed input would result in a rejected request by the API, which I consider unnecessary overhead. Additionally, the error returned is not always clear about what is wrong.

My goal is to validate these inputs — mainly checking for missing required fields — directly within the Client, to avoid burdening the API with simple validation logic.

The SDK will include several Clients, each with multiple operations, which will ultimately result in thousands of different inputs (Clients * Operations).

My question is: should I create a specific module/structure for each input, or would it be better to build a dynamic solution for map validation? What approach would you use to ensure validation is both performant and easy to maintain?

Thanks in advance for any suggestions or insights!

dimitarvp · January 25, 2025, 7:59pm

I’d come back and give you a more concrete response but for now I’d say avoid dynamic solutions as a matter of principle. You should prioritize writing code that would be painfully obvious to your future self a year later. That should override all other considerations. If that means the code gets to be more verbose and live in more files – so be it.

Additionally, you can code up a central router / dispatcher that “guesses” which service the input is meant for, using multi-clause pattern-matching functions – but that’s optional and it could backfire. If the clients explicitly identify the data they are sending and whom it is intended for (via an extra piece of data in their request to your server), then you’re better off using that.
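To make the dispatcher idea concrete, here is a minimal sketch. The module name `DispatchSketch`, the `"service"` key, and the `"instance_type"` / `"query"` field names are all hypothetical, chosen only for illustration. The first two clauses rely on an explicit identifier from the caller; the next two "guess" from the shape of the map.

```elixir
defmodule DispatchSketch do
  # Preferred: the caller names the target explicitly (hypothetical "service" key).
  def route(%{"service" => "virtual_machine"} = input), do: {:virtual_machine, input}
  def route(%{"service" => "log"} = input), do: {:log, input}

  # "Guessing" fallback: infer the target from which keys are present.
  # Clause order matters, so a map with both keys goes to :virtual_machine.
  def route(%{"instance_type" => _} = input), do: {:virtual_machine, input}
  def route(%{"query" => _} = input), do: {:log, input}

  # Nothing matched: report it instead of raising FunctionClauseError.
  def route(_other), do: {:error, :unknown_target}
end
```

Note how an input containing both `"instance_type"` and `"query"` is silently routed to the first matching clause. That is one way the guessing variant could backfire, and why the explicit identifier is the safer choice.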

al2o3cr · January 25, 2025, 9:10pm

Doing additional validation on every request seems like the more-likely cause of overhead since presumably you’re expecting substantially more correct requests than malformed ones.

In any case, consider JSON Schema if you need to validate a bunch of different JSON shapes; if you’re lucky, the API may already publish one.
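For a feel of what a schema-driven check looks like, here is a rough sketch that handles only the JSON Schema `required` keyword, using nothing but the standard library. It is not a JSON Schema validator; a real project would more likely pull in a dedicated library. The module name `RequiredCheck` and the `"name"` / `"zone"` fields are made up for the example.

```elixir
defmodule RequiredCheck do
  # Returns the names listed under "required" that are absent from the input.
  # Only the "required" keyword is honoured; "type", "properties", etc. are ignored.
  def missing_fields(%{"required" => required}, input) when is_map(input) do
    Enum.reject(required, &Map.has_key?(input, &1))
  end

  # A schema without "required" has nothing to report.
  def missing_fields(_schema, input) when is_map(input), do: []
end

# Hypothetical schema for one operation's input.
schema = %{
  "type" => "object",
  "required" => ["name", "zone"],
  "properties" => %{
    "name" => %{"type" => "string"},
    "zone" => %{"type" => "string"}
  }
}

RequiredCheck.missing_fields(schema, %{"name" => "vm-1"})
# => ["zone"]
```

The appeal is that each operation's rules become data (possibly published by the API already) rather than code you write by hand.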

franzin · January 26, 2025, 12:53am

Thank you so much for sharing your insights and perspectives about maintaining a well-defined codebase.

Regarding your suggestion about a general router, I believe it might not be necessary given how I’m planning to organize the SDK operations. Each operation will have its own dedicated function within the Client module (SDK.CLIENT.operation) to match different API endpoints. This way, pattern matching/validation can be performed at the operation level, e.g.:

```
<SDK>.VirtualMachine.create_instance(…, %CreateInstanceInput{}, …)
<SDK>.VirtualMachine.attach_disk(…, %AttachDiskInput{}, …)
<SDK>.Log.search_logs(…, %SearchLogsInput{}, …)
```
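A rough sketch of how one of these could look with a struct per input, assuming a placeholder SDK name `MySDK` and made-up fields (`:name`, `:zone`, `:tags`). The real operations would take more arguments and actually send the request; here the function only returns the body it would send.

```elixir
defmodule MySDK.VirtualMachine.CreateInstanceInput do
  # Keys listed here must be given when the struct is built,
  # otherwise struct construction raises ArgumentError.
  @enforce_keys [:name, :zone]
  defstruct [:name, :zone, tags: []]
end

defmodule MySDK.VirtualMachine do
  alias MySDK.VirtualMachine.CreateInstanceInput

  # The function head only accepts the dedicated input struct,
  # so a plain map or another operation's struct will not match.
  def create_instance(%CreateInstanceInput{} = input) do
    # Placeholder for the real HTTPS call: return the map that would be encoded as JSON.
    body = Map.from_struct(input)
    {:ok, body}
  end
end
```

Two caveats with this approach: `@enforce_keys` only checks that the keys are present, not that their values are non-nil or of the right type, and `struct/2` skips the check entirely (`struct!/2` enforces it).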

franzin · January 26, 2025, 1:05am

Thank you so much for the reply! I totally get your point, and it makes a lot of sense.

That being said, I still believe client-side validations are valuable for this SDK. Not only will they prevent unnecessary API calls, but they’ll also allow me to provide more user-friendly error messages compared to the current API responses. I believe this will make the SDK more developer-friendly and help users avoid headaches.
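Something along these lines is what I have in mind, as a sketch only. The module name `ValidateFirst`, the fields `:disk_id` and `:instance_id`, and the `send_fun` argument (standing in for the real HTTPS call) are all assumptions for the example.

```elixir
defmodule ValidateFirst do
  # Hypothetical required fields for an attach_disk operation.
  @required [:disk_id, :instance_id]

  # send_fun stands in for the real request; it is only called when validation passes.
  def attach_disk(input, send_fun) when is_map(input) and is_function(send_fun, 1) do
    with :ok <- validate(input) do
      send_fun.(input)
    end
  end

  # Treats an absent key and a nil value the same way: both count as missing.
  defp validate(input) do
    case Enum.filter(@required, &is_nil(Map.get(input, &1))) do
      [] ->
        :ok

      missing ->
        {:error, "attach_disk: missing required field(s): " <> Enum.join(missing, ", ")}
    end
  end
end

ValidateFirst.attach_disk(%{disk_id: "d-1"}, fn _input -> {:ok, :sent} end)
# => {:error, "attach_disk: missing required field(s): instance_id"}
```

When validation fails, the `with` returns the error tuple unchanged and the request function is never invoked, so no API call is made.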

I’ll explore the JSON Schema options; they seem like they could be a great fit. Thanks!