Best practice between `String.t()` and `binary()`?

I’ve learnt that phx.gen.auth uses Base.url_encode64() to convert binary() to String.t(). If I just pass a raw binary instead, the consumer malfunctions without raising an exception.

It also seems hard to draw a clear line between them. There is no built-in guard for String.t(). Is Elixir completely incapable of telling them apart? And is Base.url_encode64 the best practice for converting one into the other? What other options do I have?


From the typespec docs:

Note that String.t() and binary() are equivalent to analysis tools. Although, for those reading the documentation, String.t() implies it is a UTF-8 encoded binary.

Let’s say you have two binaries:

iex(1)> <<0, 0, 0>>
<<0, 0, 0>>
iex(2)> <<?a, ?b, ?c>>
"abc"

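binary() would describe both of them, but a reader would only expect "abc" from String.t(). (Strictly speaking, <<0, 0, 0>> is still valid UTF-8, just not printable, which is why IEx shows it as raw bytes.) Tooling, however, doesn’t differentiate at the type level because String.t() is just an alias for binary().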
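To make this concrete, here is a small sketch of what the standard library can and cannot tell you about those two values at runtime. The variable names are just for illustration:

```elixir
bytes = <<0, 0, 0>>
text = <<?a, ?b, ?c>>

# Both values are binaries, so is_binary/1 cannot tell them apart.
IO.inspect(is_binary(bytes))          # true
IO.inspect(is_binary(text))           # true

# String.printable?/1 is what separates them: NUL bytes are not printable.
IO.inspect(String.printable?(bytes))  # false
IO.inspect(String.printable?(text))   # true

# String.valid?/1 only checks the UTF-8 encoding; a lone 0xFF byte fails it.
IO.inspect(String.valid?(text))       # true
IO.inspect(String.valid?(<<255>>))    # false
```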

Base.url_encode64 just converts any binary, which could be a human-readable string, to a URL-safe Base64 string. Its output is plain ASCII, so it is guaranteed to be a valid UTF-8 encoded binary.
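A quick sketch of that round trip, using an arbitrary three-byte binary that is not valid UTF-8 (the sample bytes are my own choice, not something phx.gen.auth uses):

```elixir
# Arbitrary sample bytes; 0xFF can never appear in valid UTF-8.
raw = <<255, 254, 0>>

# Encode to the URL-safe Base64 alphabet (A-Z, a-z, 0-9, "-", "_").
token = Base.url_encode64(raw)

IO.inspect(String.valid?(raw))    # false
IO.inspect(String.valid?(token))  # true

# Decoding gives back the original bytes, so nothing is lost.
{:ok, decoded} = Base.url_decode64(token)
IO.inspect(decoded == raw)        # true
```

So the encoding does not turn bytes into "text" in any meaningful sense; it just gives you a reversible, ASCII-only representation that is safe to put in a URL.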


That’s where I’m starting from.

If analysis tools don’t tell them apart, shouldn’t Elixir ship with a guard like is_string/1 that checks whether a value qualifies as a String.t()? Then we could at least control them at runtime.

is_binary/1 already does something similar on top of is_bitstring/1: it narrows a bitstring down to one whose size is a whole number of bytes.

The reason is that a string is actually a binary: a UTF-8 encoded binary.

edit: String.valid?/1, like @brettbeatty said.

Base.url_encode64 is not for converting between binaries and strings. It is for encoding a binary into a base64 encoded string, which can be used as a part of a valid URL.

In practice, I use typespecs to give my API consumers accurate information about the public interfaces. For example:

@spec hello(String.t()) :: String.t()
def hello(message), do: message

If you look at the source code of the String module, it uses the same method. For example:

@spec split(t, pattern | Regex.t(), keyword) :: [t]
def split(string, pattern, options \\ [])

def split(string, %Regex{} = pattern, options) when is_binary(string) and is_list(options) do
  Regex.split(pattern, string, options)
end

You can see the guard is_binary(string) - I think this is the official way to do string checking.
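Here is a minimal sketch of that convention in your own code. The Greeter module is hypothetical, and the point is only what the guard does and does not catch:

```elixir
defmodule Greeter do
  # The spec documents the intent (UTF-8 text); the guard only enforces "binary".
  @spec hello(String.t()) :: String.t()
  def hello(message) when is_binary(message), do: "Hello, " <> message
end

IO.inspect(Greeter.hello("world"))  # "Hello, world"

# A non-binary argument fails the guard and raises FunctionClauseError.
result =
  try do
    apply(Greeter, :hello, [:world])
  rescue
    FunctionClauseError -> :no_matching_clause
  end

IO.inspect(result)  # :no_matching_clause

# Invalid UTF-8 still passes the guard, because it is a binary.
IO.inspect(String.valid?(Greeter.hello(<<255>>)))  # false
```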

Maybe, you can just follow this convention for now. :wink:


The problem with that is you have to look at every piece of the binary to check that it’s a valid string, which you can’t really do in a guard. If you need to branch on it, something like String.valid?/1 may get you what you want.
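For example, a rough sketch of branching on it: use the is_binary/1 guard for the cheap check, then String.valid?/1 in the function body. The TextOrBytes module and its return atoms are made up for illustration:

```elixir
defmodule TextOrBytes do
  # Hypothetical helper: the guard does the cheap check,
  # String.valid?/1 scans the whole binary in the function body.
  @spec classify(term()) :: :string | :binary | :other
  def classify(value) when is_binary(value) do
    if String.valid?(value), do: :string, else: :binary
  end

  def classify(_other), do: :other
end

IO.inspect(TextOrBytes.classify("abc"))    # :string
IO.inspect(TextOrBytes.classify(<<255>>))  # :binary
IO.inspect(TextOrBytes.classify(123))      # :other
```

Keep in mind the check is linear in the size of the binary, so it is best done once at the boundary where the data enters your system rather than in every function.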
