I have a (very WIP) project that will be making some graph-structured data generally available. I’d really like it to be accessible and convenient for AI systems (e.g., LLMs) and their developers. So, I’d like to know whether there are any preferred data representation(s) for property graphs, and which pitfalls I should watch out for.
Background
Internally, the data is maintained in an Elixirish version of GraphQL. This (WIP) approach retains GraphQL’s ability to let clients dynamically request subsets of a published data set, but in a BEAM-friendly manner (Actors, messages, terms).
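To illustrate what "dynamically request subsets" means here, below is a minimal sketch of the idea, written in Python only because that is what many AI developers would use on the consuming side. The `select` function, the shape of the selection map, and the sample data are all hypothetical stand-ins, not the project's actual API.

```python
from typing import Any


def select(data: Any, selection: dict) -> Any:
    """Keep only the fields named in `selection`, recursing into maps and lists."""
    if isinstance(data, list):
        # Apply the same selection to every element of a list.
        return [select(item, selection) for item in data]
    if not isinstance(data, dict):
        # Scalars have no fields to pick from; return them unchanged.
        return data
    result = {}
    for key, sub in selection.items():
        if key not in data:
            continue
        value = data[key]
        # A nested dict descends further; any other value marks a leaf to keep.
        result[key] = select(value, sub) if isinstance(sub, dict) else value
    return result


# Hypothetical published data set: nested maps, lists, and scalars.
published = {
    "title": "Example data set",
    "license": "CC0",
    "items": [
        {"id": 1, "name": "alpha", "tags": ["x", "y"], "notes": "long text"},
        {"id": 2, "name": "beta", "tags": [], "notes": "more text"},
    ],
}

wanted = {"title": True, "items": {"id": True, "name": True}}
subset = select(published, wanted)
```

The client names the fields it wants and gets back a structure of the same shape with everything else dropped, which is roughly the GraphQL selection-set idea without the query syntax.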
So, the native data structures are dynamic (i.e., run-time friendly) Elixir terms: nested maps, lists, and scalars, but few atoms or structs. A GraphQL front end is an obvious addition, because the semantics are extremely similar. However, this is sort of a “raw” interface.
The next-level structure is a property graph (i.e., entities and relationships, both with attached attributes). Stored in a property graph database (think Neo4j, or a Gremlin-compatible store reached through a client such as Gremlex), this provides access, navigation, organization, processing, and storage.
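For concreteness, here is one way a tiny property graph might look when written as plain maps, lists, and scalars. This is only a sketch: the key names (`nodes`, `edges`, `labels`, `props`, `from`, `to`) and the `neighbors` helper are my own assumptions for illustration, not a settled format.

```python
import json

# Entities (nodes) and relationships (edges), both carrying attributes.
graph = {
    "nodes": [
        {"id": "n1", "labels": ["Person"], "props": {"name": "Ada", "born": 1815}},
        {"id": "n2", "labels": ["Machine"], "props": {"name": "Analytical Engine"}},
    ],
    "edges": [
        {"id": "e1", "type": "WROTE_ABOUT", "from": "n1", "to": "n2",
         "props": {"year": 1843}},
    ],
}


def neighbors(g, node_id):
    """Return (edge type, target node) pairs for edges leaving `node_id`."""
    by_id = {n["id"]: n for n in g["nodes"]}
    return [(e["type"], by_id[e["to"]]) for e in g["edges"] if e["from"] == node_id]


# Because it is only maps, lists, and scalars, it serializes to JSON as-is.
text = json.dumps(graph, indent=2)
```

A flat node list plus an edge list that refers to nodes by `id` avoids deep nesting and cycles, which should make it easy to serialize and to paste into a prompt.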
Although a property graph can be expressed in terms of maps and such, graph query languages such as Cypher let clients work with the graph without handling those underlying data structures directly.
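As a rough sketch of how the two views connect, a map-based graph can be turned into parameterized Cypher statements. The `to_cypher` function and the input key names (`label`, `type`, `props`, `from`, `to`) are hypothetical; the Cypher relies only on `CREATE`, `MATCH`, `SET ... = $param`, and query parameters. Labels and relationship types cannot be passed as ordinary parameters here, so they are checked and interpolated into the query text.

```python
import re

# Conservative pattern for labels and relationship types placed in query text.
IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def to_cypher(graph):
    """Turn a node/edge map into a list of (query, params) pairs."""
    statements = []
    for node in graph["nodes"]:
        label = node["label"]
        if not IDENT.match(label):
            raise ValueError(f"unsafe label: {label!r}")
        # Store the node id as a property so edges can find it later.
        props = dict(node["props"], id=node["id"])
        statements.append((f"CREATE (n:{label}) SET n = $props", {"props": props}))
    for edge in graph["edges"]:
        rel = edge["type"]
        if not IDENT.match(rel):
            raise ValueError(f"unsafe relationship type: {rel!r}")
        statements.append((
            f"MATCH (a {{id: $src}}), (b {{id: $dst}}) "
            f"CREATE (a)-[r:{rel}]->(b) SET r = $props",
            {"src": edge["from"], "dst": edge["to"], "props": edge["props"]},
        ))
    return statements


# Hypothetical sample graph.
sample = {
    "nodes": [
        {"id": "n1", "label": "Person", "props": {"name": "Ada"}},
        {"id": "n2", "label": "Machine", "props": {"name": "Analytical Engine"}},
    ],
    "edges": [
        {"type": "WROTE_ABOUT", "from": "n1", "to": "n2", "props": {"year": 1843}},
    ],
}

statements = to_cypher(sample)
```

Each pair could be handed to whatever database driver is in use. Keeping the property values in parameters rather than in the query string avoids quoting and injection problems.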
In summary, there will be a way to generate almost any desired concrete data representation. Which brings us back to the original question: what data format(s) would AI code and coders find the most palatable? (ducks)
-r