DomainTwistex

nix2intel · January 16, 2025, 8:23pm

This library helps organizations protect themselves against several types of domain-based cyber attacks. It specifically looks for:

Typosquatting: When attackers register misspelled versions of legitimate domains (like “gooogle[.]com” instead of “google.com”)
Phishing Infrastructure: Domains set up to impersonate trusted organizations in email-based attacks
Brand Impersonation: Domains created to look like legitimate company websites

How It Works

The library combines Elixir’s powerful ecosystem with Rust’s performance through Rustler NIFs. This makes it particularly effective for scanning and analyzing large numbers of potential threat domains quickly and efficiently.

Features

Domain Variation Generator: Creates possible misspellings that attackers might use, including:
- Missing characters (microsft[.]com)
- Replaced characters (rnicrosoft[.]com)
- Extra characters (micrrosoft[.]com)
- Swapped characters (micrsooft[.]com)
- Common typing mistakes
- Look-alike characters (using similar-looking letters)
Email Server Detection: Checks if suspicious domains are set up to receive email
Fast, Concurrent Analysis: Examines many domains simultaneously for quick results
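
As a rough illustration of the variation classes listed above, here is a minimal, standard-library-only Rust sketch. It is not the library's code (the library delegates this to the twistrs crate); the function name `typo_variants` and the tiny look-alike table are assumptions made up for this example.

```rust
use std::collections::BTreeSet;

/// Generate simple typo variants of a domain label (the part before the TLD).
/// Hypothetical helper for illustration only, not the library's API.
fn typo_variants(label: &str) -> BTreeSet<String> {
    let chars: Vec<char> = label.chars().collect();
    let mut out = BTreeSet::new();

    for i in 0..chars.len() {
        // Missing character: drop the character at position i.
        let mut v = chars.clone();
        v.remove(i);
        out.insert(v.iter().collect::<String>());

        // Extra character: double the character at position i.
        let mut v = chars.clone();
        v.insert(i, chars[i]);
        out.insert(v.iter().collect::<String>());

        // Swapped characters: exchange neighbours i and i + 1.
        if i + 1 < chars.len() {
            let mut v = chars.clone();
            v.swap(i, i + 1);
            out.insert(v.iter().collect::<String>());
        }

        // Replaced character: a tiny, illustrative look-alike table.
        let lookalike: Option<&str> = match chars[i] {
            'm' => Some("rn"),
            'l' => Some("1"),
            'o' => Some("0"),
            _ => None,
        };
        if let Some(rep) = lookalike {
            let mut s: String = chars[..i].iter().collect();
            s.push_str(rep);
            s.extend(chars[i + 1..].iter());
            out.insert(s);
        }
    }

    // The original label is not a typo of itself, and an empty label is useless.
    out.remove(label);
    out.remove("");
    out
}
```

Calling `typo_variants("microsoft")` yields a set that includes the four examples from the list: `microsft`, `rnicrosoft`, `micrrosoft`, and `micrsooft`. A real engine covers many more classes (keyboard adjacency, homoglyphs, TLD swaps), which is why reusing an existing one is attractive.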

Contributing

We welcome contributions! You can:

Report bugs
Suggest new features
Submit code changes
Help improve documentation

License

BSD-3-Clause

benwilson512 · January 16, 2025, 8:28pm

Hey @nix2intel can you say a bit more about the package and its uses here in your post?

nix2intel · January 16, 2025, 10:05pm

Yeah, for sure. It is a great way to find potential phishing, typosquatting, and brand impersonation domains targeting your company domains. I updated the post above, thank you.

linusdm · January 16, 2025, 10:43pm

Maybe I’m getting old, but these questions present themselves:

what is typosquatting?
what is typically done with all those domain name permutations? What’s the purpose?
what is “concurrent domain analysis”?
what does it mean to validate an MX record in this context?
how does generating domain permutations help with finding phishing attacks?

I’m genuinely curious.

nix2intel · January 16, 2025, 11:04pm

Actually a great question, and I apologize for all the jargon. I forget sometimes that I’m in a cybersecurity bubble.

First, let’s talk about typosquatting. This is when attackers deliberately register misspelled versions of legitimate domain names. For example, they might register “gooogle[.]com” hoping to catch people who accidentally type an extra ‘o’ when trying to reach google.com. It’s actually a form of cybersquatting, which means registering domains that infringe on someone else’s brand or trademark.

These typosquatting techniques are often used in phishing attacks, where criminals try to trick people into revealing sensitive information by pretending to be a trustworthy entity. They might send emails that appear to come from these slightly misspelled domains, hoping recipients won’t notice the subtle difference.

An even more targeted version of this is used in Business Email Compromise (BEC) attacks. In a BEC attack, criminals specifically impersonate executives or business partners to trick employees into taking actions like making wire transfers or sharing sensitive information. They might register domains like “companynamefinance[.]com” that look legitimate at first glance.

The tool I wrote helps defend against these threats by:

Generating possible misspellings of legitimate domains (the NIF does this)
Checking whether these domains are registered (our Elixir implementation)
Looking for signs they might be used maliciously, like mail server setups. This is generally used by threat intelligence analysts or anyone going through their environment with a fine-toothed comb. Right now it just grabs the MX records for the domain variations, though it could be expanded to do more in the future. Still, if I use Microsoft mail servers and Gmail, GoDaddy, or a self-hosted mail server shows up, that can be a huge red flag.

I made it run these checks concurrently because there can be hundreds or thousands of possible variations to check. The faster we can identify potentially malicious domains, the better chance we have of preventing successful attacks. What makes this particularly useful is that it helps catch these threats before they’re used in attacks. Think of it like a radar system for suspicious domains - it helps security teams spot potential threats before they become actual problems.
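
One way to read the mail-provider red flag described above is as a suffix check: given the MX hosts found for a look-alike domain, report the ones that do not belong to a provider the organization expects. The Rust sketch below assumes exactly that reading; the function name `unexpected_mx` and the example suffix are hypothetical and are not part of the library.

```rust
/// Return the MX hosts that do not fall under any expected provider suffix.
/// Hypothetical helper: the library itself only collects MX records.
fn unexpected_mx<'a>(mx_hosts: &[&'a str], expected_suffixes: &[&str]) -> Vec<&'a str> {
    mx_hosts
        .iter()
        .copied()
        .filter(|host| {
            // DNS names are case-insensitive and may carry a trailing dot.
            let host = host.trim_end_matches('.').to_ascii_lowercase();
            !expected_suffixes.iter().any(|suffix| {
                let suffix = suffix.to_ascii_lowercase();
                // Match the suffix itself or any subdomain of it.
                host == suffix || host.ends_with(&format!(".{suffix}"))
            })
        })
        .collect()
}
```

For an organization on Microsoft 365, one might pass `["mail.protection.outlook.com"]` as the expected suffix; a look-alike domain whose MX points at `gmail-smtp-in.l.google.com` would then be returned for an analyst to review. A hit is a prompt for a closer look, not proof of malice.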

linusdm · January 16, 2025, 11:23pm

Thanks for clarifying! Interesting.

D4no0 · January 16, 2025, 11:31pm

I feel using a NIF for this is overkill, as ultimately the bottleneck will always be the requests you send to those permuted domains.

dimitarvp · January 16, 2025, 11:45pm

What’s the significance of a NIF (Rust) here? Why is it needed?

nix2intel · January 17, 2025, 1:14am

I don’t disagree, and at some point I may come back and write the permutation engine in Elixir. I mention the NIF because I want folks to know what they are signing up for; as I understand it, NIFs can bring down the BEAM, correct?

I am very new to Elixir, and this was more about getting things done for work (we were using dnstwist, written in Python, hosted behind an API). twistrs was already written for the permutations, so it felt like a quick win: a way to solve a problem and add a nice library (work is great about letting us open source libraries, but not main functionality).

I have been doing a lot of dev work since moving from security architecture to threat intelligence, and I used Elixir to rewrite a tool at work that parses SEC filings looking for data breaches. The first iteration was terrible and would fail whenever a websocket connection failed; I was researching how to fix it and discovered Elixir. It has been a game changer at work, and this will go a long way toward updating our stack.

Having said that, could you look at my Task implementation on the Elixir side and suggest speedups? As you said, requests are going to be the most time-consuming piece. I’m considering FLAME on a Kubernetes cluster, but that may be overkill? I’ve been using Elixir for less than a year and am still very much learning.

nix2intel · January 17, 2025, 1:16am

There was already a great permutation engine in Rust. I wanted to learn about NIFs, and this lets me replace a key component at work, currently written in Python, with what is mostly an Elixir library.

nix2intel · January 17, 2025, 1:19am

Thank you! Being new to Elixir, I’d love feedback on the code and on how I’m being dumb! I did Java way back in the day and have mostly done Bash and Python in the security space, but right now I’m loving Elixir more and more each day.

D4no0 · January 17, 2025, 8:52am

Exactly, this is a necessary evil to reach native performance speeds or as in your case, borrow implementation of a library from another ecosystem. I think with dirty schedulers it’s much safer these days, but the general idea is that they should be used when other options are exhausted.

Well, this is exactly what you should have posted in your description of the library, as people got confused about why you would resort to a NIF for something that could be implemented in Elixir.

Looks fine for the first iteration, I would personally just focus on making it work correctly (for example so it’s not opening too many network connections and killing your server or network).
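
To make the "not too many connections" point concrete, here is a minimal sketch of a bounded worker pool, written in Rust only to match the other snippet in this thread. The name `check_bounded` and the `check` callback (a stand-in for a DNS or HTTP lookup) are assumptions for illustration, not code from the library.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Run `check` over `domains` with at most `max_workers` checks in flight.
/// Results come back in arbitrary order. Hypothetical helper for illustration.
fn check_bounded<F>(domains: Vec<String>, max_workers: usize, check: F) -> Vec<(String, bool)>
where
    F: Fn(&str) -> bool + Send + Sync + 'static,
{
    let queue = Arc::new(Mutex::new(domains));
    let check = Arc::new(check);
    let (tx, rx) = mpsc::channel();

    // A fixed number of workers is what caps the number of open connections.
    let workers: Vec<_> = (0..max_workers.max(1))
        .map(|_| {
            let (queue, check, tx) = (Arc::clone(&queue), Arc::clone(&check), tx.clone());
            thread::spawn(move || loop {
                // Hold the lock only long enough to take one domain.
                let next = queue.lock().unwrap().pop();
                match next {
                    Some(domain) => {
                        let hit = (*check)(&domain);
                        tx.send((domain, hit)).unwrap();
                    }
                    None => break,
                }
            })
        })
        .collect();

    // Drop the original sender so `rx` ends once every worker has finished.
    drop(tx);
    let results: Vec<_> = rx.iter().collect();
    for worker in workers {
        worker.join().unwrap();
    }
    results
}
```

On the Elixir side, the same cap is usually expressed with the `max_concurrency` option of `Task.async_stream`, so no hand-rolled pool is needed there.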

For an implementation that is more production-ready, take a look at SSL MOON. That implementation uses Oban both for rate-limiting how many concurrent requests are made and for persistence across upgrades and restarts of the server.

BTW I’ve been looking for someone to use that project in production, if you find that could be something useful for your company, let me know, I might be able to give support and implement the missing features.

Lucassifoni · January 17, 2025, 9:09am

A bit tangential, but way back I enjoyed this paper about domain permutations, not with typosquatting but bitsquatting : Wayback Machine

This is old (BlackHat 2011) and I am not sure whether hardware is now more resilient to those errors.

nix2intel · January 17, 2025, 12:55pm

SSL MOON looks amazing! Will be tearing through it to learn and ingest, thanks!

filmor · January 17, 2025, 9:43pm

The Rustler version you are using there is quite old and lacks (e.g.) forward compatibility in the build process
You have committed the binaries
You can simplify the NIF function quite a bit by using more of Rustler’s (and Rust’s) features:

use rustler::NifResult;
use std::collections::HashSet;
use twistrs::permutate::Domain;

// 1. Rustler has builtin support for maps with atom keys
// -------------------------------------------------------
#[derive(rustler::NifMap)]
struct Result {
    fqdn: String,
    tld: String,
    kind: String,
}

// 2. You can return anything that is convertible to a Term, you don't need to do inline encoding (and thus don't need `env`, together with `NifMap`)
// ------------------------------------------
#[rustler::nif]
fn generate_permutations(domain_str: String) -> NifResult<Vec<Result>> {
    let domain = match Domain::new(&domain_str) {
        Ok(d) => d,
        Err(_) => return Ok(Default::default()),
    };

    // 3. No need to convert the HashSet into a Vec if you just want to
    //    iterate over it again
    // -----------------------------------------
    let perms = match domain.all() {
        Ok(p) => p.collect::<HashSet<_>>(),
        Err(_) => return Ok(Default::default()),
    };

    let results = perms
        .iter()
        .map(|p| Result {
            fqdn: p.domain.fqdn.clone(),
            tld: p.domain.tld.clone(),
            kind: format!("{:?}", p.kind),
        })
        .collect();

    Ok(results)
}

rustler::init!("Elixir.DomainTwistex");

It would be even shorter if you didn’t use NifResult (which is unnecessary as you don’t report any errors back).

nix2intel · January 17, 2025, 11:43pm

Holy cow, thank you! I’ve never written Rust code before and am thus terrible at it. I’ll be sure to make these changes!

nix2intel · January 18, 2025, 12:23am

For anyone interested, I made some recent changes and will be implementing what @filmor just suggested in the coming days. As an example, I ran this tool against the elixirforum.com domain and it found this:

  %{
    nameservers: ["ns2.ownidentity.com", "ns1.ownidentity.com"],
    kind: "Tld",
    fqdn: "elixirforum.it",
    ip_addresses: ["217.70.146.10"],
    mx_records: [%{priority: 10, server: "gmail-smtp-in.l.google.com"}],
    resolvable: true,
    server_response: %{
      server: "Microsoft-IIS/10.0",
      headers: %{
        "Accept-Ranges" => "bytes",
        "Connection" => "close",
        "Content-Length" => "444",
        "Content-Type" => "text/html",
        "Date" => "Sat, 18 Jan 2025 00:14:52 GMT",
        "ETag" => "\"0c4edd728bd71:0\"",
        "Last-Modified" => "Thu, 25 Feb 2021 03:46:48 GMT",
        "Server" => "Microsoft-IIS/10.0",
        "X-Powered-By" => "ASP.NET"
      },
      status_code: "200"
    },
    txt_records: [],
    tld: "it"
  },

Is it malicious? Probably not, but it is something I would potentially check elixirforum’s mail logs for. The domain is hosting an unconfigured Plesk server that may be on a Windows machine (I’m always skeptical, as I sometimes spoof headers as well) and is configured to use Gmail for its mail servers. In our environment we’d likely add rules to either block the domain outright to avoid potential business email compromise attempts, or make rules in our organizational email for it. We would also monitor for changes to things like this.