Best way to model & write functions for datatype that has many different "forms"

vshesh · January 29, 2021, 3:18am

I am trying to model musical chords and I have many ways of thinking about them:

First, a pitch class can be expressed one of a few ways:

:c
"C"
0

then there’s a pitch type (midi or type + octave)

# note pitch can have all the forms above
%Pitch{octave: 3, pitch: :c}
{:c, 0} # same thing here for the first element, can take three forms above
36
"C3" # string serialized version

I also have an interval type

Interval.interval(5) => %Interval{value: 5}
Interval.name(5) => "^"

I picked a single character to represent each value from 0-11

Then a chord can be:

# root is a pitch, so can also be "E3" or %Pitch{pitch: :e, octave: 3} or any other form
{root: 40, quality: :M} # root is a pitch, raw midi value :M -> major
{root: 40, quality: [interval(5), interval(7), interval(12)]} #  major expressed as a list of intervals, `interval` gives a tuple like `interval(5) => %Interval{value: 5}` etc 
{root: 40, quality: "5∆8"} # list of intervals represented as string where each interval is one character of the string
# ^ this is useful for data entry and creating 'databases' of chord qualities and voicing

{root: 40, quality: :M, voicing: :closed} # voicing style used to lookup voicing in a table, quality can be any quality type above. 
{root: 40, quality: :M, voicing: fn tones -> Enum.map(tones, fn tone -> Interval.new(tone).value) } # function voicing instead of atom
{root: 40, voicing: [interval(1), interval(5), interval(1), interval(3)]} # voicing is described as a tone_stack.
{root: 40, voicing: [0, 5, 7, 12]} # an exact voicing with deltas in midi pitch amounts
[40, 45, 47, 52] # "rendered" chord that can be sent to a synth

You could call the first one and the second one different concepts I guess? Chord and ChordVoicing or something?

What is the best way to represent all these options in Elixir? In ML type languages I’d use a sum type, not sure how to do it here.

Also, some of the variants are 100% interchangeable (isomorphic) - do i pick a canonical form or do I write functions in a way that works with both?

Eg
{root: 40, voicing: [0, 5, 7,12]} = iso = [40, 45, 47, 52]
{root: 40, quality: [interval(5), interval(7), interval(12)]} = iso = {root: 40, quality: "5∆8"} (just concatenate the names of the intervals in the forward direction and convert from character to value in the backward)

I’m not clear how best to represent the isomorphisms and the variants in the design.
The challenge is that every function has to be written to support all the different forms, and that I’m manually doing the conversions between them. On top of that, there’s isomorphisms within isomorphisms, so it quickly combinatorically expands! Or I have to write a “sanitizer” function that gives me a predictable form (even though different functions are best expressed with different forms) and then do the operation on that.

Is there some way to auto-generate the missing variants for isomorphic types like this?

for example (not in perfect syntax for brevity):

# transformation functions
to_midi(%Pitch{octave: o, pitch: p}) => 12*(o+1) + p
to_tuple(%Pitch{octave: o, pitch: p}) => {p, o} # how do i get the clause to_tuple(i :: int) => to_tuple(to_struct(i))
to_struct(i) => %Pitch{octave: div(i, 12) - 1, pitch: pitch(rem(i, 12))}

raise(x :: int) = x+1
# don't want to have to write raise(p :: Pitch.t) = to_struct(raise(to_midi(p)))

That doesn’t seem so bad because there’s one variant… now imagine a function on chords:

#transformation functions
to_midi(%ChordVoicing{root: r, voicing: v}) = Enum.map(v, fn x -> x + r end)
to_struct(midi :: [integer]) = Enum.map(midi, fn x -> x-List.first(midi) end)
# how do I "inherit" clauses for roots that are in the form %Pitch{pitch: :c, octave: 3}? 

#Then a simple function: 
voicing(%ChordVoicing{root: r, voicing: v}) = v
# now i need clauses for all types of voicing and all types of root (pitch descriptions)
# that's 4 types of root * 6 types of voicing = 24 function clauses.

You could say, “look, don’t try to make every function work on every type of input - just choose one input type for most functions, and expose the transforms where applicable”
That’s fine, except even for the translation functions I need n choose 2 such transformations! And after writing all of those, the calling code has the headache of figuring out which ones to use for its particular data type. The hope is that I can “just call a function” and not worry about how the data is represented, so long as it’s valid. I can either standardize the output representation if that is better design, or return whatever representation is given to me.

Thank you for the help in advance! This isn’t an elixir specific problem, I think I’d have the same question in python or haskell or clojure or anything else. Most programming languages do not have a construct for describing or using a “type” that can have multiple representations. The assumption is that the type is defined by its one representation.

vshesh · January 29, 2021, 8:38am

one option is to write a macro that takes some graph of translations and fills in a clique of such translations (generating all paths in the graph so it becomes a clique) so that I have all the direct options possible. Then I can use all those translations to wrap any function in the module to generate all the variants necessary.
If the function happens to use two of these isomorphic data types… that would be n^2 clauses for the definition. And how do I deal with a list of such types? I guess I would first translate everything to a known format, and then work with it? Or whatever function I use to process the list has to be expanded to work with all the variants too?

Writing a macro to wrap all these functions is not easy because I’d have to detect which args have the type and therefore need to be expanded. I can’t think of a straightforward way to do this without some kind of type annotation that elixir lacks. Use the guards? What if the representation is a String - how would I know it’s the right kind?

Another option is 2(n-1) translation functions - n-1 into a chosen “canonical form” and n-1 out of the canonical form. I’d need at least 2(n-1) functions for n representations anyway no matter how I arranged them (one going into each representation and one coming out of each representation), choosing a star pattern with one representation in the middle gives me a set of arrows that can be used to get from anywhere to anywhere predictably. This is the smallest number of translations I can write.

Assuming I did that. The next problem is excess translation. Consider this example from above:

voicing(%ChordVoicing{root: r, voicing: v}) = v

Now imagine for some reason I chose the [40, 45, 47, 52] version (same logic the other way around too) as the canonical definition of the chord voicing - I’d change the definition of the function to use the

voicing(list) = to_struct(list).v

Now say calling code was holding onto a struct as a representation (because some user entered it into a form, for example). They would have to call the function with:

ChordVoicing.voicing(ChordVoicing.from_struct(cv))

and so now I’ve added a nice NoOp in my code And the calling code is still doing translations anyway.

One way to fix this is to add only two variants so that in case you are using the variant on which the logic is performed you can “skip” translation:

voicing(%ChordVoicing{root: r, voicing: v}) = v
voicing(list) = to_struct(list).v

However, this changes the calling code’s pattern now, and I end up with inconsistent interfaces. For some functions i’ll have the [40, 45, 47, 52] and the %ChordVoicing{...} representation, some functions will have the [40, 45, 47, 52] and the %{root: 40, voicing: [interval(5), interval(7), interval(12)]} version.

Say I use the voicing function above… I would need to know which variants it supports and whether I can call it with the one I’m currently holding. This is not ideal as a dev, and causes me to look at the docs every time I need to use a function. The benefit of supporting all the variants is that I don’t need to think about which variant I have; I can treat it like a black box and do what I want to it.

ChordVoicing.voicing(cv) # I don't care what cv is, could be string, struct, list, w/e

Hopefully helps illuminate the problem and there’s some idiomatic elixir way to do this.

Qqwy · January 29, 2021, 11:55am

I don’t have a lot of time so I’ll only be able to give three brief tips (without much in-depth explanation) right now:

Do not use sum types for things that are isomorphic. Take the canonical example of a sum type, the boolean, for instance: false and true are not isomorph. Instead, if you have things that are isomorph, try to transform them into one canonical form.
Computers hate having multiple representations that mean the same thing, exactly because it means having to support all the different representations everywhere. Instead, transform input (from the user or some other application calling your library) once at the start, and then have all interesting functions (such as your voicing example) only work on this canonical representation. Now you only need n input transformation functions (and possibly a couple of output transformation functions. Probably less than your input transformation functions, actually.) rather than supporting all the formats everywhere.
Data representation: Use a struct wherever you need a product type, and atoms wherever you need a sum type (or, if a variant of the sum type contains further values, a tuple whose first element is the atom indicating the variant and the later elements the values). In some limited cases having plain integers instead of atoms makes more sense.

In your example I’d probably choose integer semitones as canonical pitch representation, á la MIDI. This should be a fine data model, as long as you do not want to express too much pitch bending or microtonal music in it .

aseigo · January 29, 2021, 12:25pm

Good advice in general, just thought to expand slightly on this point:

Having a single canonical representation to flow data between functions (with the user at the head and tail of that) is super useful. However, in problem domains where there is a fully faithful 1:1 transformation between the various forms (so, they are truly isomorphic), it can make sense sometimes to transform from the canonical representation to a more useful local transformation.

As a simple example, something using degrees versus radians … maybe everything is expressed in degrees, but some calculations are easier in radians … so those functions may do a local transformation, perform the more convenient algorithm, then convert back to canonical form before returning.

Which is to say: things that are more convenient to process in different forms can be done fully encapsulated (hidden) from the rest of the data interchange between functions.

So I might expand “only work on this canonical type” to the slightly more long-winded “only accept and return this canonical type, thought local transformations may occur for local convenience”

dimitarvp · January 29, 2021, 12:41pm

@aseigo and @Qqwy IMO both are achievable. Just make a strong sum type – either with a library like domo or with atoms / tuples – and then introduce a bunch of from_* and to_* functions. Cover them well with tests – preferably property tests – and it’s done once and for all.

I completely agree with you both that having a proper strong sum type that’s used as the common interface between all functions that must operate on this kind of data is the way to go.

vshesh · January 31, 2021, 2:01am

thank you all (I wish I could select all of the answers as the solution) I have settled on this for the “cannoical representations”:

# Pitch class also called a chroma, apparently
Chroma: :c | :"c#" | :d | ... etc # mapped to 0 - 11
Pitch: {Chroma, int}
Interval: m2 | M2 | m3 | M3 ... etc # mapped to 0-11 + a few upper structure intervals (13-21)
Scale: set(interval) when interval in [0,11]
Quality: Set(interval) # any interval this time is fine 
Chord: %Chord{root: Pitch, quality: Quality}
Voicing: [Interval] when interval in [0,11]
VoicedChord: %VoicedChord{root: Pitch, voicing: Voicing}

And I will write the appropriate to_ and from_ functions to do the translations for the rest!

What I understand from the approach is that all funcitons will work with this format, and if i have to convert to some other format in the process, I’ll include a clause for that too (because why not).