Best way to define a type that's only a limited set of strings

I want to define a type using typespec like

@type foo :: "bar" | "baz"

But the compiler shows the following error

unexpected expression in typespec: "bar"

I really don’t want to make it String.t or something similar, which is not specific enough. How can I fix it?

Thanks in advance.

You can’t express literal strings in typespecs. You might want to use atoms instead, which support this and are more appropriate.

I checked the documentation, and it seems there’s no way to use string literals in typespecs, but why? Why are lists, maps, and even binaries OK, but strings forbidden?

Because different types have different use cases. Reconsider why you have string literals instead of atoms, which are built for this use case and are supported as literals in typespecs.

I chose strings because they are user inputs, and I don’t want to fill the atom table with user-supplied atoms and then run out of memory some day. Also, I don’t want to whitelist all allowed input and convert them to atoms.

If it’s user input you have to use String.t since you don’t know what strings they will input. If you have validated that the user input is within your allowed set of accepted strings you can then convert to atoms without leaking atoms.

If you don’t want to do this you have to use String.t.
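A minimal sketch of that validate-then-convert step, assuming the allowed set is the "bar" | "baz" pair from the question. The module name AllowedInput and the function parse/1 are hypothetical. Because the mapping is a compile-time literal, no atoms are created from user input at runtime.

```elixir
defmodule AllowedInput do
  # Hypothetical module; the allowed values are taken from the original question.
  @type t :: :bar | :baz

  # Compile-time mapping from accepted strings to atoms.
  @allowed %{"bar" => :bar, "baz" => :baz}

  # Returns {:ok, atom} for an accepted string and :error for anything else.
  @spec parse(String.t()) :: {:ok, t()} | :error
  def parse(input) when is_binary(input), do: Map.fetch(@allowed, input)
end
```

With this in place, AllowedInput.parse("bar") returns {:ok, :bar} and any other string returns :error, so functions further in can be specced with AllowedInput.t() instead of String.t().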

If you don’t want to whitelist the input, then a typespec of "foo" | "bar" is a fiction since the user can enter anything that they want.

For future alchemists:

The nicest workaround that comes to my mind is something like this:

@typedoc """
"foo" | "bar" | "baz"
"""
@type almost_enum :: String.t

Afterwards you can use almost_enum in your typespecs and the possible values will show up in both the docs and the code in an easy to find place.

That being said, it’s still best to stick to atoms in that case whenever practical.

I know I’m reviving an oollllllllld topic here, but this was the highest-ranked Google result that wasn’t documentation I’d already read through.

By my reckoning, it makes sense to wish typespecs could use string literals, because API integrations don’t trade in atoms. And, structured as they are, it is still much more ergonomic to keep the keys they send over the wire in the string formats they were received in, than it is to write bespoke modules with struct definitions that enumerate all of those same keys and provide functions to convert maps parsed from JSON into those structs and then work with the data being received. Yet, trust is still rickety enough with APIs that it is also highly undesirable, and indeed a potential source of atom declaration leaks, to take JSON received from an API and parse it with, e.g., Jason.decode data, keys: :atoms, like some kind of cowboy.*

And I know that @ericmj’s reply about atoms being “built for this use case” was to @Aetherus’s particular one-string-or-another problem, but I struggle to see the same utility to atoms in my use case, where I want to document the anticipated shape of data, which is optimistically well-structured, but subject to additions, revisions, and removals, as seen fit by its third party developer – whom I am unlikely to be able to persuade into reconsidering any such changes – of that integration. I’m only contemplating the typespec to give myself and my language server hinting as to what keys I can expect to be able to work with while handling that data, so that I have quick access to that information when I would otherwise have to rummage through references to it in other parts of my code & through my curl history entries of when I’d received it.

Beyond that, it strikes me that, in a language where function signatures can be crafted to raise exceptions when their arguments fall outside of strictly pattern-matched clauses, any circumstance where a finite set of inputs constitutes the only success-path arguments to a function (or where said inputs exhibit a specialized workflow in contrast to generic, but still valid, arguments) is worth representing as such inside of the corresponding type definitions and documentation. To wit, the type definition for a hypothetical IP4.address or Color.rgba function would benefit far more from its signature being expressible as {0..255, 0..255, 0..255, 0..255} :: Ip4.Address.t | Color.Swatch.t than from it only being possible to say that the function receives arguments of {non_neg_integer, non_neg_integer, non_neg_integer, non_neg_integer}… and forgive my saying, but I sincerely doubt atoms would carry more utility in that situation.

*with utmost respect to the incomparable Erlang HTTP server, cowboy

You can already express integer literals as well as ranges with typespecs, so your IP address example would work as you wrote it.
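For illustration, a small sketch of a range-based tuple type. The module Ip4Example and its format/1 function are hypothetical names; only the 0..255 range syntax comes from the discussion.

```elixir
defmodule Ip4Example do
  # Integer ranges are valid typespec literals.
  @type octet :: 0..255
  @type address :: {octet(), octet(), octet(), octet()}

  # The guard mirrors the type at runtime, since typespecs alone are not enforced.
  @spec format(address()) :: String.t()
  def format({a, b, c, d})
      when a in 0..255 and b in 0..255 and c in 0..255 and d in 0..255 do
    Enum.join([a, b, c, d], ".")
  end
end
```

Calling Ip4Example.format({192, 168, 0, 1}) returns "192.168.0.1", while a tuple with an out-of-range element raises a FunctionClauseError.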

With respect to an external API, I would say that in this case the fact that dialyzer does not have string literals is probably pushing you towards a better design. If you just assume that your API response always has the same string keys then you could get into a situation where the API response changes and now some deep layer of your code is blowing up and it’s not immediately clear why. Instead a better pattern is to write a wrapper around this API, validate it has the keys that you expect it to, and then turn those string keys into atoms. That way if the response changes you have one central place that you can change and your internal code can keep relying on the fact that if it has a response the keys are always present.

Edit: I now see I skimmed over where you said this above approach is not ergonomic. I disagree, I think it feels less ergonomic until you are bitten in the ass by something you weren’t expecting to change, changing. Then you begin to appreciate keeping your external touch points of your system isolated into their own internal representations.
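A rough sketch of such a wrapper, under the assumption of an API response with "id" and "status" keys. That shape, the module name ApiResponse, and the status values are all made up for illustration.

```elixir
defmodule ApiResponse do
  # Hypothetical internal representation of an external API payload.
  @enforce_keys [:id, :status]
  defstruct [:id, :status]

  @type t :: %__MODULE__{id: integer(), status: :active | :inactive}

  # The single place where string keys and values are validated and converted.
  @spec from_map(map()) :: {:ok, t()} | {:error, term()}
  def from_map(%{"id" => id, "status" => status}) when is_integer(id) do
    case status do
      "active" -> {:ok, %__MODULE__{id: id, status: :active}}
      "inactive" -> {:ok, %__MODULE__{id: id, status: :inactive}}
      other -> {:error, {:unexpected_status, other}}
    end
  end

  # Anything that does not match the expected shape is rejected here.
  def from_map(_other), do: {:error, :unexpected_shape}
end
```

If the third party changes the payload, from_map/1 is the one function that starts returning errors, and the rest of the code keeps working against ApiResponse.t().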

That’s a fair point about failing when incoming data doesn’t have the expected shape, but I don’t think an internally-defined struct really confers any bit-in-the-ass guards over, e.g., ~W[expected keys in_the_map] |> Enum.map(&Map.fetch!(map, &1)), %{"expected" => expected, "keys" => keys, "in_the_map" => in_the_map} = map (which seems, in fact, more or less prerequisite to ferrying those values into an internally-defined struct), or the Map.take!/2 function you’ve just inspired me to write.
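For reference, one possible shape for that helper. Map.take!/2 is not part of Elixir's standard library, so this sketch puts a hypothetical take!/2 in a made-up MapUtil module and builds it on Map.fetch!/2.

```elixir
defmodule MapUtil do
  # Hypothetical helper: like Map.take/2, but raises if any key is missing.
  @spec take!(map(), [term()]) :: map()
  def take!(map, keys) when is_map(map) and is_list(keys) do
    # Map.fetch!/2 raises KeyError for the first key that is absent.
    Map.new(keys, fn key -> {key, Map.fetch!(map, key)} end)
  end
end
```

For example, MapUtil.take!(%{"a" => 1, "b" => 2}, ["a"]) returns %{"a" => 1}, and asking for a key that is not in the map raises a KeyError.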

I’m skeptical that an internal struct inherently defines a design that is “better”; as far as I can tell, it more just seems to be the design that Elixir and Erlang chose.

What I have come to always do is just use ecto (if you’re not using it for DB you don’t need to bring in the postgres adapter) and just structure interaction points as casting schemas. This gives you a central place for documenting the types, constraints and form, allows you to use the validations in ecto (or write your own) in a pipeline fashion, allows you to define additional types to express fine-grained values automatically on cast, serialisation when going out back to whatever, and to then “type” your functions to only accept casted values (ecto types) to work with - plus provides a very easy way to have fine grained error messages if needed.

It’s a bit more boilerplaty but I’ve found the end result to be way better than anything else I’ve seen (in and outside elixir) without much additional work. I even cast login params now, because why not.
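To make that concrete, here is a sketch of a casting schema written as a standalone script. The LoginParams module, its fields, and the role values are assumptions for illustration; the Mix.install line is only for running it outside a Mix project, where Ecto would instead be listed as a dependency.

```elixir
# Standalone-script setup; inside a Mix project, add :ecto to deps instead.
Mix.install([{:ecto, "~> 3.10"}])

defmodule LoginParams do
  # Hypothetical schema used only for casting, with no database behind it.
  use Ecto.Schema
  import Ecto.Changeset

  @primary_key false
  embedded_schema do
    field :email, :string
    # Ecto.Enum casts the listed strings to atoms and rejects everything else.
    field :role, Ecto.Enum, values: [:admin, :member]
  end

  # Returns {:ok, %LoginParams{}} or {:error, changeset} with per-field errors.
  def cast_params(params) do
    %__MODULE__{}
    |> cast(params, [:email, :role])
    |> validate_required([:email, :role])
    |> validate_format(:email, ~r/@/)
    |> apply_action(:validate)
  end
end
```

Here %{"email" => "a@b.c", "role" => "admin"} casts to a struct whose role is the atom :admin, while an unknown role such as "root" comes back as {:error, changeset}.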

Okay, that is a design I can immediately see has a better structure than simple, parsed JSON. I think I might even try this if any of my projects develop complex enough entities.

We seem to have gotten onto a tangent, though; whether Ecto or structs result in more reliable coding patterns, I still believe IEx and hexdocs would benefit from string literals being acceptable input for typespecs :sweat_smile:

Yeah, in a way I agree; I’ve tried it as well. I think the issue is Dialyzer not supporting it (probably as a choice, since binary pattern matching is supported in the VM). As such, Elixir can’t either, because it would create wrong expectations as to what the type meant: since it can’t be enforced by Dialyzer, you would have a spec that is meaningless for anything other than pure documentation.

On the other hand, it kinda forces you to pick an enforceable representation (yeah, sometimes it can be slightly annoying…). With Elixir proper you can use Ecto and set custom types for the stringy things that you want to enforce, casting them into valid atoms, which then allows you to have proper typespecs. Structs are also much faster to serialise than stringed maps. But I can see a situation where, optimally and for documentation purposes, you may want to specify the keys a map has/can have as strings, for instance in an interface, even if that interface then does proper casting itself.

Hey peeps, I would like to have string literals as well.

Imagine the following module,

defmodule GoodTypeProvider do
  use OnePiece.Commanded.TypeProvider

  register_mapping("something_happened", SomethingHappened)
  register_mapping("something_else_happened", SomethingElseHappened)
end

I would love to generate a typespec like the following in GoodTypeProvider:

@spec to_string(SomethingHappened.t()) :: "something_happened"

There is honestly no reason why this shouldn’t be possible by careful introspection of the bytecode. At some point I’ll get time to finish up Mavis and maybe we’ll get this feature out.