How do you validate URL inputs in your Phoenix project forms?

I’m looking for a way to validate any URL, but also specific kinds of URL (for example YouTube channel or Facebook page URLs).

Is there a recommended or common way of doing that? An Ecto type library?

Thanks

Do you want to only validate the format or that the url really exists and returns something?


For now the url format is my only concern.

There is validate_format/4, where you can pass a regexp for URL validation, or you can use something like EctoFields.URL.


So if I get it right, for some specific url I can combine validate_format/4 and EctoFields.URL.

Edit: If I don’t want to write some long regex like http://www\.youtube\.com\/(.+)|https://www\.youtube\.com\/(.+). ^^

Then something like this https://stackoverflow.com/questions/39041335/how-to-validate-url-in-elixir


If you want to validate a specific format for YouTube, then I think you will have to write a regexp for YouTube and handle different cases.

http://www\.youtube\.com\/(.+)|https://www\.youtube\.com\/(.+) is of course not optimal. It could be shortened to http(s)?:\/\/www\.youtube\.com\/(.+).

Maybe also see this topic for inspiration: Validating youtube link
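To try the shorter pattern outside a changeset, here is a minimal sketch using only the built-in Regex module. The module name `YoutubeUrlSketch` and the sample URLs are made up for illustration, and the pattern is anchored at both ends, which the inline version above is not.

```elixir
defmodule YoutubeUrlSketch do
  # Hypothetical helper, not part of Ecto or any library.
  # Same idea as the shortened pattern, anchored at both ends.
  def youtube_url?(url) when is_binary(url) do
    Regex.match?(~r{^https?://www\.youtube\.com/.+$}, url)
  end

  # Anything that is not a string is rejected
  def youtube_url?(_other), do: false
end

IO.inspect(YoutubeUrlSketch.youtube_url?("https://www.youtube.com/channel/abc"))
IO.inspect(YoutubeUrlSketch.youtube_url?("https://www.example.com/channel/abc"))
```

The same regex could be handed to validate_format in a changeset; note that it still rejects youtube.com without the www prefix.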


I have a small lib called ex_url that parses URLs that you might find helpful.


Thanks to all the suggestions, finally I managed to write a really naive module for my url validation with Ecto.

My use case is very simple. In my project I want users to add their personal website url or their social link (facebook, twitter, etc.).

  • for some personal website any valid url is okay.
  • for some social link I need to ensure that the said social network name is included in a valid url.

All these validations are just for the convenience of users browsing the site. I won’t do anything with the urls apart from displaying them on the owner’s profile page.

I chose the regex solution suggested by @egze because that way I won’t have to add any extra dependency to my project. Thanks also to @wolfiton, who provided me with links to great resources on regex writing.

So see below a sample of the module

defmodule Easy.Url do
  @moduledoc """
  A module for url validation with Ecto
  """

  alias Ecto.{Changeset}
  import Changeset, only: [validate_format: 3]

  @before_tld "^(((ht|f)tp(s?))\://)?(www.|[a-zA-Z].)[a-zA-Z0-9\-\.]"
  @tld "([a-zA-Z])"
  @after_tld "(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\;\?\'\\\+&%\$#\=~_\-]+))*$"

  @regex Regex.compile!("#{@before_tld}+\.#{@tld}#{@after_tld}")

  @doc """
  Validate an url
  """
  def validate_url(changeset, field), do: validate_format(changeset, field, @regex)

  @doc """
  Validate an url for only some given top level domains
  """
  def validate_url(changeset, field, tld_list) do
    # Group the alternatives so that "|" only applies to the TLD part of the pattern
    tld = "(" <> Enum.join(tld_list, "|") <> ")"

    case Regex.compile("#{@before_tld}+\.#{tld}#{@after_tld}") do
      {:ok, regex} ->
        validate_format(changeset, field, regex)

      _ ->
        validate_url(changeset, field)
    end
  end

  @doc """
  Validate an url while ensuring it contains a given domain
  """
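  def validate_specific_url(changeset, field, domain) do
    # Escape the domain so that a "." in it only matches a literal dot
    domain_regex = Regex.compile!(Regex.escape(domain))

    changeset
    |> validate_url(field)
    |> validate_format(field, domain_regex)
  end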
end

Edit: Any suggestion is welcome of course.

Unfortunately your regex won’t parse some valid URLs like:

iex> Regex.match? @regex, "https://user:pass@my_site.com"          
false
iex> Regex.match? @regex, "https://my_site.com?tracking_token=some_token"   
false
iex> Regex.match? @regex, "https://my_site.com/" 

All of which can be parsed by the built-in URI module (i.e. no new dependency). For example:

iex> URI.parse "https://user:pass@my_site.com/"       
%URI{
  authority: "user:pass@my_site.com",
  fragment: nil,
  host: "my_site.com",
  path: "/",
  port: 443,
  query: nil,
  scheme: "https",
  userinfo: "user:pass"
}
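As a minimal sketch of how the parsed struct could drive a format check: the helper below accepts a URL only when URI.parse/1 finds an http or https scheme and a non-empty host. The module name `UriCheckSketch` and the choice of allowed schemes are assumptions for illustration, not something the URI module prescribes.

```elixir
defmodule UriCheckSketch do
  # Hypothetical helper: the URL must have an http(s) scheme and a host.
  def valid_url?(url) when is_binary(url) do
    case URI.parse(url) do
      %URI{scheme: scheme, host: host}
      when scheme in ["http", "https"] and is_binary(host) and host != "" ->
        true

      _ ->
        false
    end
  end

  # Anything that is not a string is rejected
  def valid_url?(_other), do: false
end

IO.inspect(UriCheckSketch.valid_url?("https://user:pass@my_site.com"))
IO.inspect(UriCheckSketch.valid_url?("my_site.com"))
```

This accepts the URLs that the regex above rejected, but it says nothing about whether the host itself is well formed.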

So @kip, you can use regex together with URI.parse to check whether URLs for certain websites and profiles are valid, like Instagram and Facebook profiles or Vimeo videos.

My question is:

How can you combine URI.parse with a regex? Does it have a function to do this?

I can’t find anything here https://hexdocs.pm/elixir/URI.html#parse/1. Is there a URI.parse/2 that takes a regex?

Or can this idea be done using a pipe, something like this:

uri_to_validate
|> URI.parse
|> regex_fn

Can you help me understand what this means? Maybe a simple example? I think what you’re saying is “if the host is instagram, check that the path of the URI conforms to how instagram represents profile URLs” but I’m not sure.

Yes, to check whether the given url is valid for that specific social website. For example, how to validate a Facebook profile url using URI.parse and a regex.

Something like this?

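defmodule Easy.Url do
  @moduledoc """
  A module for url validation
  """

  @doc """
  Validate an url, with extra checks for some social network hosts
  """
  def validate_url(url) do
    url
    |> URI.parse()
    |> validate_social_url(url)
  end

  def validate_social_url(%URI{host: "instagram.com"} = uri, url) do
    # uri.path is nil when the url has no path, so fall back to ""
    case Regex.match?(insta_regex(), uri.path || "") do
      true -> {:ok, url}
      false -> {:error, "Invalid profile url: #{url}"}
    end
  end

  def validate_social_url(%URI{host: "facebook.com"}, url) do
    # Placeholder: Facebook-specific path checks would go here
    {:ok, url}
  end

  def validate_social_url(_other_uri, url) do
    {:ok, url}
  end

  # Placeholder pattern, not Instagram's actual rules: one path segment
  # made of letters, digits, dots and underscores
  defp insta_regex, do: ~r{^/[A-Za-z0-9_.]+/?$}
end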

Thanks for the suggestion. I will try then to refine my module.

Just one thing bothers me about URI.parse.

iex(39)> URI.parse "http://                      google.fr"
%URI{
  authority: "                      google.fr",
  fragment: nil,
  host: "                      google.fr",
  path: nil,
  port: 80,
  query: nil,
  scheme: "http",
  userinfo: nil
}

In the above example it accepts blanks in the host string. Is that okay?

Thanks for the example, I will have to try it sometime.

One thing is still unclear to me: can this example be rewritten using defp, with all the transformations in a pipe, or is it better to pattern match?

That looks like a bug in URI to me.
Not a bug. URI.parse is a parser, not a validator. So likely some additional work is required.
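As a sketch of what that additional work might look like for the blank-host case: the helper below parses the URL and then checks the host against a simple pattern. The module name `HostCheckSketch` and the host rule (dot-separated labels of letters, digits, hyphens and underscores) are assumptions made for illustration. The rule is deliberately loose and is not a full implementation of any hostname standard.

```elixir
defmodule HostCheckSketch do
  # Hypothetical check layered on top of URI.parse/1:
  # the host must be present and must match host_regex/0.
  def valid_host?(url) when is_binary(url) do
    case URI.parse(url) do
      %URI{host: host} when is_binary(host) -> Regex.match?(host_regex(), host)
      _ -> false
    end
  end

  # Anything that is not a string is rejected
  def valid_host?(_other), do: false

  # Assumed rule: at least two dot-separated labels, no blanks.
  # Underscores are allowed so that hosts like my_site.com pass.
  defp host_regex, do: ~r/\A[A-Za-z0-9_-]+(\.[A-Za-z0-9_-]+)+\z/
end

IO.inspect(HostCheckSketch.valid_host?("http://google.fr"))
IO.inspect(HostCheckSketch.valid_host?("http://                      google.fr"))
```

Because the rule requires at least one dot, single-label hosts such as localhost are rejected too, which may or may not be what you want.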


I would say pattern match for two reasons:

  1. There is only one step here: validate the URL
  2. Separation of concerns. The eye scans the different function heads very easily, one for each social network. And when there is a new social network to add, it’s just a matter of adding a function clause.
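To make the second point concrete, here is a minimal sketch with one private function head per network, which also shows that the clauses can be defp behind a single public entry point. The module name `SocialUrlSketch`, the twitter.com clause, and the profile path pattern are all assumptions for illustration, not the real URL rules of those sites.

```elixir
defmodule SocialUrlSketch do
  # Single public entry point: parse once, then dispatch on the host.
  def validate(url) when is_binary(url) do
    url |> URI.parse() |> check(url)
  end

  # One clause per network.
  defp check(%URI{host: "instagram.com", path: path}, url), do: profile(path, url)
  # Supporting another network is just one more clause.
  defp check(%URI{host: "twitter.com", path: path}, url), do: profile(path, url)
  # Any other url with a host is accepted as is.
  defp check(%URI{host: host}, url) when is_binary(host) and host != "", do: {:ok, url}
  defp check(_uri, url), do: {:error, "Invalid url: #{url}"}

  # Assumed profile shape: exactly one path segment, e.g. "/some_user"
  defp profile(path, url) do
    if is_binary(path) and Regex.match?(~r{^/[A-Za-z0-9_.]+/?$}, path) do
      {:ok, url}
    else
      {:error, "Invalid profile url: #{url}"}
    end
  end
end

IO.inspect(SocialUrlSketch.validate("https://instagram.com/some_user"))
IO.inspect(SocialUrlSketch.validate("https://twitter.com/a/b"))
```

Hosts are matched exactly here, so www.instagram.com falls through to the generic clause; a real version would probably normalise the host first.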

Thanks @kip, this makes more sense now, and I can see how it can be extended to other url validations, not just social ones.