This library contains common helpers used with Ecto.Changeset.
I noticed myself copying over validators from various projects multiple times. So I figured I’d create a common library. I know everyone likes to have their own validation logic but I tried keeping it as generic as possible so it’ll be useful for many situations.
For now, it contains validators for the following cases:
Date, DateTime and Time
EmailValidator: validate emails and exclude temporary email providers
URLValidator: validate URLs with various levels of strictness
StringValidator: validate a string has a given prefix
PostalCodeValidator: validate postal codes for multiple countries (to be improved)
SocialSecurityValidator: validate social security number (SSN)
LuhnValidator: validate Luhn-type numbers such as credit card numbers and other administrative codes.
Happy to receive feedback, pull requests from motivated folks and ideas for improvement. I hope this lib will grow into a set of good common Ecto helpers and tools we can all benefit from.
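For readers unfamiliar with the Luhn check mentioned in the list above, here is a minimal sketch of the checksum in plain Elixir. The module name LuhnSketch and its valid?/1 function are made up for illustration and are not the library's API; stripping spaces and dashes before checking is also an assumption.

```elixir
defmodule LuhnSketch do
  @doc "Returns true when the digits in `number` pass the Luhn checksum."
  def valid?(number) when is_binary(number) do
    # Assumption: spaces and dashes are separators and can be ignored.
    digits = String.replace(number, ~r/[\s-]/, "")

    if Regex.match?(~r/\A\d{2,}\z/, digits) do
      sum =
        digits
        |> String.to_charlist()
        |> Enum.reverse()
        |> Enum.map(&(&1 - ?0))
        |> Enum.with_index()
        |> Enum.map(fn
          # Every second digit, counting from the right, is doubled;
          # a two-digit result is reduced by subtracting 9.
          {d, i} when rem(i, 2) == 1 -> if d * 2 > 9, do: d * 2 - 9, else: d * 2
          {d, _i} -> d
        end)
        |> Enum.sum()

      rem(sum, 10) == 0
    else
      false
    end
  end
end

LuhnSketch.valid?("4111 1111 1111 1111")
LuhnSketch.valid?("79927398710")
```

The first call uses a widely published test card number and passes the check, while the second has a wrong last digit and fails it.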
If you are making it into a library, then make it a correct validator. Or at least mention that it will match only a subset of possible addresses.
Why use a custom regex for validating URLs instead of, well, the URI module? Also, why does the regex strictly allow only the HTTP(S) and FTP protocols? What about others, for example Gopher or IRC?
Well, without a DB to look up these postal codes, this validator makes little sense. In most cases you can get a list of all possible postal codes for a given country, so it should be quite simple to turn them into an in-memory database.
What about adding, for example, an Ecto.Type to validate the length of string fields?
I just made a validator helper and you may be interested in it…
@doc """
Dynamically calls the given anonymous validation function, with the same
options, for each field in a list.
"""
def validate_many(changeset, fields, validator, opts \\ []) when is_list(fields) do
  Enum.reduce(fields, changeset, fn field, changeset ->
    # Each call receives the changeset returned by the previous one.
    validator.(changeset, field, opts)
  end)
end
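To show how the validate_many/4 helper above folds one validator over several fields, here is a small dependency-free sketch. It assumes a plain map with :changes and :errors keys as a stand-in for a real Ecto.Changeset, and validate_max_length/3 is a hypothetical validator written only for this example.

```elixir
defmodule ValidateManySketch do
  # Same shape as the helper above: fold the validator over every field.
  def validate_many(changeset, fields, validator, opts \\ []) when is_list(fields) do
    Enum.reduce(fields, changeset, fn field, acc -> validator.(acc, field, opts) end)
  end

  # Hypothetical stand-in for an Ecto validator, operating on a plain map
  # with :changes and :errors keys instead of a real changeset.
  def validate_max_length(%{changes: changes, errors: errors} = cs, field, opts) do
    max = Keyword.fetch!(opts, :max)
    value = Map.get(changes, field)

    if is_binary(value) and String.length(value) > max do
      %{cs | errors: errors ++ [{field, "should be at most #{max} character(s)"}]}
    else
      cs
    end
  end
end

changeset = %{
  changes: %{first_name: "Ada", last_name: String.duplicate("x", 30)},
  errors: []
}

result =
  ValidateManySketch.validate_many(
    changeset,
    [:first_name, :last_name],
    &ValidateManySketch.validate_max_length/3,
    max: 25
  )

# Only :last_name exceeds the limit, so it is the only field with an error.
Keyword.keys(result.errors)
```

With a real changeset, the same call shape should work with a capture such as &Ecto.Changeset.validate_length/3, since that function takes the changeset, the field and a keyword list of options.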
What do you mean by this? There’s already a validate_length/3 function for that purpose, so what would the Ecto.Type add?
Your validate_many/4 helper is interesting indeed! I’ll look into it!
I don’t know if I’ll include the normalize/2 helpers. It might depend a lot on how people manage their Ecto.Multi transactions, but I can clearly see how it helps with Phoenix controllers. Having said that, I could add a “Guides” section to the documentation where such examples could live, so people searching for a solution could still find it there. I’ll think about it.
Also, maybe going forward, a good way to approach it would be to open 1 issue per suggestion on the GitHub repo so other people can upvote and see what is the most requested/popular. I’m just cautious in adding too much clutter to the lib.
Thanks for taking a look at the library and for the strict feedback. As a reminder, it’s a young library, so it’ll get better with time (hopefully!).
Concerning EmailValidator: indeed, the regular expression used is the one browsers use to validate email-type fields, so it’s a bit stricter than RFC 5322. I pondered for a while whether I should include the “real” RFC 5322 email regexp, but it accepts some very exotic email formats, so I preferred to choose a “sane default” that most people would be comfortable with. I’d be happy to see a pull request that improves the strictness of the validation (with various options). In any case, I’ll make the documentation more explicit about this and will try to update it in the upcoming releases.
About URLValidator, a similar issue exists. As you said, I first tried just using the URI.parse/1 function, but it never really errors and accepts pretty much anything. So I added the :http_uri.parse function, which does a better check, as well as a “sane default” regex that can be enabled for the most common use cases. Here too, most of it is configurable through options, so you can have a very loose validation (only using URI.parse/1, for example) or a stricter one, depending on your use case. Your comment made me realise that a few of the option names could be improved to make them clearer.
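To make the leniency of URI.parse/1 concrete, here is a rough sketch of a stricter check layered on top of it. The module name UrlSketch, the :schemes option and its default are assumptions for this example only; they are not the options URLValidator actually exposes.

```elixir
defmodule UrlSketch do
  # Assumption: only http and https are accepted unless the caller says otherwise.
  @default_schemes ["http", "https"]

  def valid?(url, opts \\ []) when is_binary(url) do
    schemes = Keyword.get(opts, :schemes, @default_schemes)

    # URI.parse/1 does not reject malformed input, so the parsed parts
    # are checked explicitly: a scheme from the allowed list and a host.
    case URI.parse(url) do
      %URI{scheme: scheme, host: host} when is_binary(scheme) and is_binary(host) ->
        host != "" and String.downcase(scheme) in schemes

      _ ->
        false
    end
  end
end

# URI.parse/1 happily returns a struct for this input, with no scheme or host.
URI.parse("not a url")

UrlSketch.valid?("https://example.com/path")
UrlSketch.valid?("gopher://example.org/1", schemes: ["gopher"])
```

Making the scheme list an option is one way to cover the Gopher or IRC cases raised earlier without loosening the default. On Elixir 1.13 and later, URI.new/1 may also be worth a look, since it returns an error tuple for invalid input instead of a best-effort struct.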
Finally, about PostalCodeValidator: it’s still a work in progress indeed, same as SocialSecurityValidator. I’ll need to find a nice CSV file with postal code regular expressions for all countries so I can generate all the cases. If you have a good resource for that, one that has been checked by locals so we know the regexps are correct, I’d be happy to include it in my next update.
Generally I’d suggest just using a database instead of a regex. Postal codes are more akin to random strings than consecutive ranges of numbers. At least here in Germany, historical changes mean you’ll often find gaps in the postal code sequence, for reasons like consolidating the postal codes of some areas.
With such an Ecto.Type we would no longer need to call validate_length/3 explicitly.
We would pass the validation options directly when defining the schema fields:
field :name, MyStringType, min: 2, max: 25
Moreover, if we don’t pass any option, it could set the max length to 255, for example. The default Ecto :string type is used for both the :string and :text migration column types. So for non-:text columns we are forced to add a max-length validation of 255. If we don’t, we get an exception when users submit forms with strings longer than 255 characters.
Imho this muddies the waters of the clean separation between validation and casting. A type is only responsible for the latter, not the former. Casting is converting data from external sources to a proper Elixir datatype, while validation checks whether the data is valid. There’s some overlap for types that are more restrictive in their runtime representation, but tbh a :name is unlikely to need that.
Check out GeoNames. However, I think that in such a case it should be an independent library, as it can get pretty huge with all the priv files that will contain the list of all postal codes for all countries. If that interests anybody, I was working (and I need to get back to it) on a NIF library that would wrap Rust’s fst crate for such indices. It would allow quicker checks and searches of the data.
Why not, but I’m not sure I’d be ready to package the database of all the world’s postal codes into this library (and keeping it up to date would also be a huge undertaking).
I’m open to suggestions, though.
In an upcoming release, I’m thinking of adding a phone number validator (using ex_phone_number), but that is a lighter database.
EDIT: I hadn’t seen your reply yet @hauleth but if we manage to package the postal codes too that would be amazing. Still, it may start to be a bit huge for a common set of ecto helpers library. Or we could add it as an optional dependency that can be validated against using an option in the validator.
Yeah, see @hauleth’s answer. But as with emails, there’s a huge difference between validating a postal code by format (which is a sanity check at best) and actually validating that a postal code is correct (and exists).
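The difference between a format check and an existence check can be sketched as follows. The names PostalSketch, well_formed?/1 and exists?/2 are hypothetical, the five-digit pattern covers Germany only, and the two codes in the set are a tiny illustrative sample rather than a real dataset.

```elixir
defmodule PostalSketch do
  # Format check only: German postal codes are written as five digits.
  def well_formed?(code) do
    is_binary(code) and Regex.match?(~r/\A\d{5}\z/, code)
  end

  # Existence check: the code must also be present in a set of known codes,
  # which in practice would be loaded from a maintained dataset.
  def exists?(code, known_codes) do
    well_formed?(code) and MapSet.member?(known_codes, code)
  end
end

# Illustrative sample only; a real set would hold every code for the country.
known = MapSet.new(["10115", "80331"])

PostalSketch.well_formed?("00000")
PostalSketch.exists?("00000", known)
PostalSketch.exists?("10115", known)
```

Here "00000" passes the format check but is rejected by the lookup because it is absent from the sample, which is the gap a regex alone cannot close.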
The EmailValidator regex still doesn’t seem fully valid, as it does not allow [::1] as a domain part. Additionally, it allows .foo. as a domain, which is incorrect.
I’ve improved the URLValidator by better documenting what is and what isn’t supported.
I’ve added all countries for the PostalCodeValidator (still no full database but a first sanity check of the format). The formats come from http://i18napis.appspot.com/, recommended by the (now deprecated) Unicode postal code database.
I’ve added better documentation to EmailValidator and an opt-in approach to which checks you want to apply. I also added the email validator from the pow package and fixed some of the shortcomings found by @hauleth.
I’ve added the validate_many helper as it’s indeed common to validate multiple fields with the same options.
As usual, happy to have your feedback, issues and pull requests.
Good catch, thanks! Didn’t validate the domain properly.
I didn’t add IP addresses as they’re strongly discouraged by RFC 3696:
The domain name can also be replaced by an IP address in
square brackets, but that form is strongly discouraged except for
testing and troubleshooting purposes.
@achedeuzot you may want to take a look at the changes in https://github.com/danschultzer/pow/issues/560 as I didn’t validate the domain properly. Each DNS label should be validated. There is also some additional validation that I didn’t include, such as checking for reserved domains (like example.com; these domains can’t receive e-mail).
Thanks for the link to the pow issue. I’ll update the validator check to better reflect your updates. It looks far better than mine, which only addressed a few specific issues.
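As a rough illustration of per-label validation, the sketch below splits the domain on dots and checks each label on its own. DomainSketch is a hypothetical module, not the code from pow or from this library. Requiring at least two labels is an assumption, and IP literals, internationalized names and reserved domains are deliberately left out.

```elixir
defmodule DomainSketch do
  def valid?(domain) when is_binary(domain) do
    labels = String.split(domain, ".")

    # Assumption: a usable mail domain has at least two labels.
    length(labels) >= 2 and Enum.all?(labels, &valid_label?/1)
  end

  defp valid_label?(label) do
    # 1 to 63 bytes, letters, digits and hyphens only,
    # with no hyphen at the start or the end of the label.
    byte_size(label) in 1..63 and
      Regex.match?(~r/\A[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?\z/, label)
  end
end

DomainSketch.valid?("example.com")
DomainSketch.valid?(".foo.")
```

Splitting ".foo." yields empty labels on both sides, so it is rejected without any special casing, which is the main benefit of checking labels one by one instead of matching the whole domain with a single pattern.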