Regex for Domain Name

Hi everyone, this is my first time writing a regular expression. I have one that is meant to validate domain names, but I think the syntax is wrong for Elixir. Any help?

~r/^((?!-)[A-Za-z0-9-]{1, 63}(?<!-)\\.)+[A-Za-z]{2, 6}$/

The expected behavior is to match domain names in either of the following formats:

domain_name.domain_name.extension

or

domain_name.extension

The regular expression should make sure that:

  • Each domain_name is at most 63 characters long.
  • Each domain_name can contain letters (both lower and upper case), numbers, and hyphens.
  • A domain_name cannot start or end with a hyphen.
  • There is at least one domain_name.
  • The TLD can contain letters (both lower and upper case) and numbers, and is 2 to 6 characters long.
  • All domain_names and the TLD are separated by periods.

Escape the . only once (\. instead of \\.) and remove the spaces inside the curly brackets, and it works:

iex> ~r/^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\.)+[A-Za-z]{2,6}$/ |> Regex.match?("domain-name.domain-name.ext")
true
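
To see why both changes matter, here is a small sketch. In a ~r sigil the backslash reaches the regex engine as written, so `\\.` asks for a literal backslash followed by any character, which a normal domain never contains. Older PCRE versions also treat a quantifier with a space in it, such as `{1, 63}`, as literal text, so `{1,63}` is the safe spelling. The module name `DomainRegexDemo` and the sample inputs are made up for illustration.

```elixir
defmodule DomainRegexDemo do
  # Hypothetical helper wrapping the corrected pattern from this thread.
  def valid?(name) do
    Regex.match?(~r/^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\.)+[A-Za-z]{2,6}$/, name)
  end
end

samples = [
  "domain-name.domain-name.ext",
  # Label starts with a hyphen.
  "-bad.example.com",
  # Label ends with a hyphen.
  "bad-.example.com",
  # TLD is shorter than 2 characters.
  "example.c",
  # Label is 64 characters long, one over the limit.
  String.duplicate("a", 64) <> ".com"
]

for name <- samples do
  IO.puts("#{name} => #{DomainRegexDemo.valid?(name)}")
end

# The doubled backslash needs a literal backslash in the input:
IO.inspect(Regex.match?(~r/^a\\.b$/, "a.b"))    # false
IO.inspect(Regex.match?(~r/^a\\.b$/, "a\\xb"))  # true
```

Only the first sample should come back as true; the others each break one of the rules listed in the question.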

Thanks a lot! Works now :smiley:


This is the regex I used when I was trying to check email addresses meaningfully. It isn’t complete on the mailbox side (though it’s very good), and it is nearly complete on the domain side. I just noticed one error: the length of the punycode prefix is not counted as part of a label’s length. That’s not hard to deal with, but it will result in two branches for length checking:

  @email_re ~r/
    \A
    (?=
      # Lookahead - the entire address must not exceed 254 characters and
      # must be at least six characters long.
      [-a-z0-9@.!#$%&'*+\/=?^_`{|}~]{6,254}
      \z
    )
    (?=
      # Lookahead - the mailbox part of the address must be between 1..64
      # characters long.
      [-a-z0-9.!#$%&'*+\/=?^_`{|}~]{1,64}
      @
    )
    (?<mailbox>
      # The mailbox part of the address must begin with a printable character.
      [a-z0-9!#$%&'*+\/=?^_`{|}~-]+
      # The mailbox part of the address may contain periods, but must not
      # contain multiple periods in a row.
      (?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*
    )
    # The mailbox part is separated from the domain part with a literal @.
    @
    (?<domain>
      # The domain part must have at least one non-TLD part. Practically, for
      # SMTP, there is a maximum of 125 non-TLD parts -- subdomains, domain,
      # and TLD.
      (?:
        # The domain part may not start with a dash.
        (?!-)
        # The domain part may start with xn-- for punycode encoding.
        (xn--)?
        # The domain part must have at least one alphanumeric.
        [a-z0-9]
        # The domain part may have 0-62 alphanumeric or dash
        # characters.
        [-a-z0-9]{0,62}
        # The domain part may not end with a dash
        (?<![-])
        # Each hostname label in the domain part terminates with a period literal.
        \.
      ){1,}
      # The TLD may not start with a dash.
      (?!-)
      # The TLD may start with xn-- for punycode
      (xn--)?
      # The TLD starts with an alphanumeric.
      [a-z0-9]
      # After that, the TLD has 1..62 more alphanumeric or dash characters.
      [-a-z0-9]{1,62}
      # The TLD may not end with a dash.
      (?<!-)
      (?<!-)
    )
    \z
    /ix
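
One way to sidestep the punycode length issue mentioned above, instead of adding a second branch to the regex, is to split the domain on periods and check each label separately, so the `xn--` prefix counts toward the 63-character limit like any other characters. This is only a sketch under that assumption: `DomainLabels` is a hypothetical module, it is not the regex above, and it does not check the overall address length or apply the stricter TLD rule.

```elixir
defmodule DomainLabels do
  # Hypothetical helper: validates a domain label by label.
  # Requires at least two labels (one name plus a TLD).
  def valid?(domain) when is_binary(domain) do
    labels = String.split(domain, ".")
    length(labels) >= 2 and Enum.all?(labels, &valid_label?/1)
  end

  # A label is 1..63 bytes, alphanumerics and dashes only, and
  # neither starts nor ends with a dash. An "xn--" prefix is part
  # of the label, so it is included in the length check.
  defp valid_label?(label) do
    byte_size(label) in 1..63 and
      Regex.match?(~r/\A[a-z0-9](?:[-a-z0-9]*[a-z0-9])?\z/i, label)
  end
end

IO.inspect(DomainLabels.valid?("example.com"))
IO.inspect(DomainLabels.valid?("xn--" <> String.duplicate("a", 60) <> ".com"))
```

The second call uses a 64-character label once the prefix is included, so it should be rejected, whereas a pattern of the form `(xn--)?[a-z0-9][-a-z0-9]{0,62}` would accept it.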