Are there any libraries to normalize emails?

Hey community,

Does anyone know if there is a package to normalize emails?

Something like this: https://github.com/johno/normalize-email

Thanks in advance

I don’t know if you noticed, but the same author has an Elixir package that does the same thing (well, it’s far more basic). I didn’t even realise it was the same author until I was reading the source code.

It’s not maintained but this should be enough to get you started.

2 Likes

I didn’t notice that there was a port of the node package.

Thanks

I will probably have to rewrite it with the new features from the node package, but I was hoping that there was a package already built by someone else that does this.

If you’re building a service that uses emails for logins though, please don’t strip the +, it’s so very useful.

5 Likes

Can you please explain why?

Because as of now, what I know about this best practice goes like this:

  • instead of having multiple entries for one user (one user having 5 entries for the same email), the email must be normalized

  • also a normalized email is easier to validate

Please share your point of view on this.

Thanks in advance

While Google and other providers might do magic with the pluses, at work we literally have pluses in our email addresses. Stripping them makes the email address nonexistent.

2 Likes

Well, I suppose enforcing uniqueness based on the “normalized” email may not be an issue for me personally, as long as the email address as given with the + is preserved when actually sending the emails. I use + suffixes in my email addresses all the time when signing up for services that I wish to have automatically categorized by Google. It also makes it more obvious when a service has shared my email with some third party.
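A rough sketch of that idea, in Python for illustration: keep the address exactly as the user typed it for sending, and derive a separate key that is used only for uniqueness checks. The names `StoredEmail`, `make_stored_email` and `PLUS_TAG_DOMAINS` are made up for this example, and treating only gmail.com as a domain where + tags can be dropped is an assumption, not a rule.

```python
from dataclasses import dataclass

# Assumption: only these domains are known to deliver "local+tag" to "local".
PLUS_TAG_DOMAINS = {"gmail.com"}


@dataclass(frozen=True)
class StoredEmail:
    original: str    # exactly what the user entered; always send mail here
    unique_key: str  # derived value; use only to detect duplicate sign-ups


def make_stored_email(address: str) -> StoredEmail:
    address = address.strip()
    # Split on the last "@" so the domain is everything after it.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")

    domain = domain.lower()  # domain names are case-insensitive
    key_local = local
    if domain in PLUS_TAG_DOMAINS:
        # Drop the "+tag" part and the case only where we believe it is safe.
        key_local = local.split("+", 1)[0].lower() or local

    return StoredEmail(original=address, unique_key=f"{key_local}@{domain}")
```

With this, `myemail+gogo@gmail.com` and `myemail@gmail.com` share the key `myemail@gmail.com`, while an address on any other domain keeps its + in both fields.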

Thanks, so I can drop the normalize-email theory and just use emails as they are.

1 Like

Can you please show me an example of what an email with suffixes that helps you identify who is trying to share your emails looks like?

Thanks

I create emails like myemail+elixirforum@mydomain.com, so if this email is leaked, I know who to blame.

4 Likes

Well, suppose I sign up for in-flight wifi: I’ll often use something like myemail+gogo@gmail.com for GoGo in-flight wifi. If I suddenly start getting advertisements from some company and they still have the +gogo suffix in there, I know who shared it.

I’ll often just use a +temp suffix for short term transactional emails I want to just generally ignore after I do my particular transaction.

4 Likes

Thanks, but doesn’t this way of using emails pile up into a big stack of emails and passwords that you have to keep in mind?

Gmail has all + suffixed emails go to the root email inbox; they aren’t different accounts. So for example, if I have foo+bar@gmail.com, foo+baz@gmail.com and foo+qux@gmail.com, they all show up under my regular foo@gmail.com account, but tagged as bar, baz, etc. You can then have custom inbox rules for those various tags to mark them as read, archive them, etc.

4 Likes

That’s cool, I didn’t know that you could do that.

Thanks again for the clarification

1 Like

Thank you to everyone who replied in this thread.

I learned a lot today.

1 Like

Just to inform you, the tagging feature (the +something) isn’t standardized anywhere. It is just the way Gmail handles it, and different providers can handle tags in different ways: for example, MMDF uses = (equals sign) and Qmail uses - (hyphen). Some MTA servers allow this to be configured (Postfix and Exim). The same goes for dots: some servers ignore them (Gmail), others do not. So there is no “safe” way to normalize email addresses. Even capitalization matters, as Jon@doe.com is a different address from jon@doe.com (TBH I am not aware of any MTA server that would differentiate between them, but the standard is clear on that matter). Capitalization matters only for the local part, though, so jon@doe.com and jon@DoE.com are the same address.
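The one transformation that follows from the above without guessing at provider behaviour is lowercasing the domain and leaving the local part alone. A minimal sketch in Python (the function name `lowercase_domain_only` is just for this example):

```python
def lowercase_domain_only(address: str) -> str:
    """Lowercase the domain; keep the local part byte-for-byte."""
    # Split on the last "@" so the domain is everything after it.
    local, sep, domain = address.rpartition("@")
    if not sep:
        raise ValueError(f"no '@' in address: {address!r}")
    return f"{local}@{domain.lower()}"
```

So `jon@DoE.com` becomes `jon@doe.com`, while `Jon@doe.com` stays distinct from `jon@doe.com`. Anything beyond this (tags, dots, local-part case) depends on how the receiving server is configured.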

5 Likes

The only way to correctly verify an email is to send a confirmation email. Usually you only validate the domain.
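In practice that could look roughly like the Python sketch below: a deliberately permissive shape check that does not reject characters such as +, followed by a random token to put into the confirmation link. Both function names are hypothetical, requiring a non-empty local part and domain is an assumption of this sketch, and the actual sending of the email is left out.

```python
import secrets


def has_basic_email_shape(address: str) -> bool:
    """Permissive check: something before and after the last '@'."""
    local, sep, domain = address.rpartition("@")
    # No character blacklist here: '+', '=', '-' and dots are all allowed.
    return bool(sep and local and domain)


def new_confirmation_token() -> str:
    """Random URL-safe token to embed in the confirmation link."""
    return secrets.token_urlsafe(32)
```

The address only counts as verified once the user has clicked the link carrying that token; the shape check is just there to catch obvious typos early.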

2 Likes

Ditto for things at my job as well.

Plus, even on my Google account, things like <myname>@gmail.com and <myname>+id@gmail.com go through very different sets of filters, and if something is sent to the wrong one it is likely I’ll never see it.

+ is not some magic character in email; literally everything is valid, right on down to @a being a valid email address (though not that useful on the Internet, it might be on an internal network).

This. You contact the mail server and ask if the account exists; that’s the only way to know for sure.

3 Likes

To add to that, my company wanted to order equipment for our VR “lab”, ~10k EUR of volume.

We weren’t able to order there, because their “verification” system insisted that the emails we fed in were invalid because of the +.

We weren’t even able to reach support because of this until we managed to find a deeply buried phone number.

Over hours of phone calls, our orders person was sent from one support level to the next until he got to someone who actually wanted to help. That person called back after an hour or so, during which they had tried to reach the correct person internally. And even though we argued with correct quotations from the RFC, the final answer was that + is not valid in email addresses and that we should configure our mail server correctly.

We ended up buying the same stuff from another seller for less…

4 Likes

Uh, it ‘is’ not only valid in emails but very widely used…

I don’t suppose you want to namedrop who this is? My work campus is thinking of a VR lab and I’ll keep some things in mind…

And maybe who this is too? ^.^