Recursively replace all JSON `$ref`s with the code that they refer to

Some JSON-based formats (e.g. OpenAPI, React-JSON-Schema-Forms etc.) let you use the $ref keyword as a rough equivalent of a macro (see JSON References | NIEM GitHub).

This creates an issue when decoding JSON to Elixir (e.g. with Poison or Jason). I may be processing a complex JSON structure, and instead of the data I need being under "someKey", I have "$ref": "/path/to/something".

What I need is a way to replace all $refs with the actual JSON data that they represent. I have no idea how to do this, especially since $refs can be nested (i.e. a $ref refers to a JSON object with other $refs inside it, and so on). Also, in OpenAPI it is possible to add summary and description attributes alongside the $ref, but no other attributes (How to use JSON references ($refs)). This creates further parsing complexity.

Is there some existing library that I can use maybe?


One approach would be preprocessing the JSON with a JS/TS library like json-schema-ref-parser to de-reference/resolve the references before decoding it in Elixir.

If you really wanted to do this in Elixir, I’d take a look at how the JS libraries handle tree traversal and re-implement it in Elixir. I imagine there would be opportunities for recursion.

Off the top of my head, Elixir’s built-in get_in/2 and update_in/3 functions in the Kernel module paired with the Access module will be helpful. There’s also a Pathex library that could be a good alternative for working with nested data structures.
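To make the recursion idea concrete, here is a minimal sketch of how get_in/2 could be used to inline $refs in an already-decoded document. The module name `RefResolver` and its functions are hypothetical, not from any library. It assumes refs are local JSON Pointers of the form "#/a/b" that only pass through objects (no array indices), leaves any other $ref untouched, and merges sibling keys such as "summary" and "description" over the resolved target.

```elixir
defmodule RefResolver do
  @moduledoc "Hypothetical sketch: inlines local `$ref`s in a decoded JSON term."

  # `doc` is what Jason.decode!/1 or Poison.decode!/1 returns (string keys).
  def resolve(doc), do: walk(doc, doc, [])

  # A map carrying a local "$ref": look the target up in the root document,
  # resolve it recursively (it may contain refs itself), then merge the
  # sibling keys over the result. `seen` holds the refs currently being
  # resolved, so a cycle raises instead of recursing forever.
  defp walk(%{"$ref" => "#/" <> pointer = ref} = node, root, seen) do
    if ref in seen, do: raise(ArgumentError, "circular $ref: #{ref}")

    path = pointer |> String.split("/") |> Enum.map(&unescape/1)

    case get_in(root, path) do
      nil ->
        # Missing (or null) target.
        raise ArgumentError, "unresolvable $ref: #{ref}"

      target ->
        resolved = walk(target, root, [ref | seen])
        overrides = Map.delete(node, "$ref")
        if is_map(resolved), do: Map.merge(resolved, overrides), else: resolved
    end
  end

  # Any other map (including one with a non-local "$ref"): recurse into values.
  defp walk(%{} = map, root, seen),
    do: Map.new(map, fn {key, value} -> {key, walk(value, root, seen)} end)

  defp walk(list, root, seen) when is_list(list),
    do: Enum.map(list, &walk(&1, root, seen))

  defp walk(scalar, _root, _seen), do: scalar

  # JSON Pointer escapes: "~1" stands for "/", "~0" stands for "~".
  defp unescape(token),
    do: token |> String.replace("~1", "/") |> String.replace("~0", "~")
end

doc = %{
  "definitions" => %{
    "name" => %{"type" => "string"},
    "user" => %{"properties" => %{"name" => %{"$ref" => "#/definitions/name"}}}
  },
  "someKey" => %{"$ref" => "#/definitions/user", "description" => "The owner"}
}

RefResolver.resolve(doc)["someKey"]
# => %{"description" => "The owner", "properties" => %{"name" => %{"type" => "string"}}}
```

Refs that point at other files or URLs are deliberately passed through unchanged here; following those needs a loader and a policy for what is safe to fetch.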

Edit to add: this GitHub issue might be relevant


Both of the examples you linked to are specifically about embedding references in schemas (either YAML- or JSON-formatted) to reduce duplication; I don’t believe a normal application decoding plain JSON would ever encounter a $ref outside of that context.

If you are expecting input like this, be careful about which URLs you follow and when. For instance, a JSON parser that loads these references before validating the shape of the data could accidentally become a denial-of-service reflector. Imagine a malicious payload like:

{
  "foo1": {
    "$ref": "http://example.com/oh_no1#foo"
  },
  "foo2": {
    "$ref": "http://example.com/oh_no2#foo"
  },
  "foo3": {
    "$ref": "http://example.com/oh_no3#foo"
  },
  "foo4": {
    "$ref": "http://example.com/oh_no4#foo"
  },
  ...and so on, thousands of times
}

A single request to an API that tries to “follow” all those refs would be multiplied into thousands of requests aimed at example.com.
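One way to guard against this is to vet every $ref in the decoded payload before following any of them. The sketch below is only an illustration: the module name `RefGuard`, the allow-listed host and the limit of 10 remote refs are all assumptions, not values from any spec or library. It treats refs starting with "#" as local and rejects everything else unless the host is explicitly allowed.

```elixir
defmodule RefGuard do
  @moduledoc "Hypothetical sketch: vet `$ref`s before any of them is followed."

  # Assumed policy values, chosen for illustration only.
  @allowed_hosts ["schemas.internal.example"]
  @max_remote_refs 10

  # Returns :ok or {:error, reason} without making a single request.
  def check(doc) do
    remote = doc |> collect_refs() |> Enum.reject(&String.starts_with?(&1, "#"))

    cond do
      length(remote) > @max_remote_refs ->
        {:error, :too_many_remote_refs}

      bad = Enum.find(remote, &(URI.parse(&1).host not in @allowed_hosts)) ->
        # Relative refs have no host (nil), so they are rejected as well.
        {:error, {:disallowed_ref, bad}}

      true ->
        :ok
    end
  end

  # Collect every string-valued "$ref" anywhere in the decoded term.
  defp collect_refs(%{} = map) do
    own =
      case map do
        %{"$ref" => ref} when is_binary(ref) -> [ref]
        _ -> []
      end

    own ++ Enum.flat_map(Map.values(map), &collect_refs/1)
  end

  defp collect_refs(list) when is_list(list), do: Enum.flat_map(list, &collect_refs/1)
  defp collect_refs(_scalar), do: []
end

payload = %{
  "foo1" => %{"$ref" => "http://example.com/oh_no1#foo"},
  "foo2" => %{"$ref" => "http://example.com/oh_no2#foo"}
}

{:error, {:disallowed_ref, _ref}} = RefGuard.check(payload)
```

Running a check like this after validating the overall shape of the data, and before any HTTP client is involved, keeps a single malicious request from fanning out.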
