Zeek is a powerful tool for monitoring your networks. It has many capabilities, but the best of all is the Zeek scripting language, which lets you extend what you can see, detect, and log.

Nowadays, many attacks target users by email. The attacker sends an email with a malicious URL, and when the user clicks on it, GAME OVER. So one of our tasks as Security Engineers is to extract these URLs, analyze them, and block them if the verdict is malicious.

As mentioned earlier, Zeek lets us extend its framework. In this tutorial, we will cover how to make Zeek extract URLs from SMTP connections. Zeek can already extract URLs with the find_all_urls function (https://docs.zeek.org/en/master/scripts/base/utils/urls.zeek.html), but here we will extract URLs with our own custom URL regex and then log them to the default smtp.log as an extra field.

Let’s create the script that makes Zeek extract URLs.

Create a file named `custom-smtp-url-extraction.zeek` and add the following code:

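@load base/protocols/smtp

module Custom_Smtp_Url_Extraction;

export {
    # Extra smtp.log column that holds every URL found in the message.
    redef record SMTP::Info += {
        smtp_urls: set[string] &log &default=set();
    };

    # Custom URL pattern; it can be overridden with redef.
    option url_regex = /https?:\/\/[a-z0-9A-Z\/\.\_\-\?\#\=\:]*/ &redef;
}

# Raised with the decoded content of each MIME entity.
event mime_entity_data(c: connection, length: count, data: string)
    {
    # Only handle MIME data that belongs to an SMTP session.
    if ( c?$smtp )
        {
        # Collect every substring of the entity that matches the pattern.
        local urls: set[string] = find_all(escape_string(data), url_regex);

        if ( |urls| > 0 )
            {
            for ( url in urls )
                add c$smtp$smtp_urls[url];
            }
        }
    }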

Before we move to the next step, let’s go through the code above line by line.

With the @load directive, we declare the dependencies that Zeek needs for the script. Here we need the SMTP protocol scripts, since email delivery is based on SMTP.

In the export section (https://docs.zeek.org/en/master/script-reference/statements.html#keyword-export) we declare the identifiers that should be visible outside the module, and we redefine existing ones.

With redef record SMTP::Info we add an extra field to smtp.log, named smtp_urls, of type set[string]. The &log attribute tells Zeek to write the field to the log, and &default=set() assigns an empty set as the default value.

Below that, we declare the url_regex option. I know, urls.zeek already exists and includes the find_all_urls function (https://docs.zeek.org/en/master/scripts/base/utils/urls.zeek.html?highlight=find_all#id-find_all_urls), which promises to extract all URLs from a string. So why write our own?

For some reason, the default regex didn’t work as expected in my setup, so I created my own. Maybe I did something badly wrong, but let’s move on!
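A quick way to sanity-check the custom regex is to run it against a hand-written string, without any traffic. The following is a minimal sketch and not part of the original script: the file name test-url-regex.zeek, the constant name test_regex, and the sample text are made up for illustration, and the pattern is copied from the export section above.

```zeek
# test-url-regex.zeek (hypothetical file name, for illustration only)

# Same pattern as url_regex in custom-smtp-url-extraction.zeek.
const test_regex = /https?:\/\/[a-z0-9A-Z\/\.\_\-\?\#\=\:]*/;

event zeek_init()
    {
    # Made-up sample body containing two URLs.
    local sample = "Reset your password at http://example.com/login?id=1 or visit https://example.org/a-b#top today";

    # find_all returns a set with every substring that matches the pattern.
    local urls: set[string] = find_all(sample, test_regex);

    for ( url in urls )
        print url;
    }
```

Running `zeek test-url-regex.zeek` should print the matched URLs; because the result is a set, the order is not guaranteed. Note that the character class contains neither `&` nor `%`, so a URL with those characters would be cut off at that point.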

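Next comes the mime_entity_data event handler. We first check that the connection has an SMTP record (c?$smtp), then run find_all over the entity data with our regex. If the returned set is not empty, we add each URL to the smtp_urls field, which ends up in smtp.log.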

Finally, we have to tell Zeek where to find the script file. To do so, we append the following directive to /opt/zeek/share/zeek/site/local.zeek:

@load <path to file>/custom-smtp-url-extraction
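Because url_regex is declared as an option in the export section, it should also be possible to swap the pattern from local.zeek without editing the script. The line below is a hedged sketch, placed after the @load directive: the wider character class (adding `%` and `&`) is only an example of a change and is not the author's pattern.

```zeek
# Optional, in local.zeek after the @load line for the script.
# Replaces the script's pattern at parse time; the extra \% and \& in the
# character class are an illustrative assumption, not a tested recommendation.
redef Custom_Smtp_Url_Extraction::url_regex = /https?:\/\/[a-z0-9A-Z\/\.\_\-\?\#\=\:\%\&]*/;
```

Running zeekctl check after such a change will report a syntax problem in the pattern before anything is deployed.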

With the changes in place, and before running the deploy command, remember to run zeekctl check to detect any errors in the code or in the configuration. (FYI: zeekctl deploy always checks for errors and stops the deployment if it finds any, but zeekctl check is a handy way to validate changes on their own.)

root@ubuntu:/opt/zeek/etc# zeekctl check
logger-1 scripts are ok.
manager scripts are ok.
proxy-1 scripts are ok.
worker-1-1 scripts are ok.
worker-1-2 scripts are ok.
worker-2-1 scripts are ok.
worker-2-2 scripts are ok.

root@ubuntu:/opt/zeek/etc# zeekctl deploy
checking configurations ...
installing ...
removing old policies in /opt/zeek/spool/installed-scripts-do-not-touch/site ...
removing old policies in /opt/zeek/spool/installed-scripts-do-not-touch/auto ...
creating policy directories ...
installing site policies ...
generating cluster-layout.zeek ...
generating local-networks.zeek ...
generating zeekctl-config.zeek ...
generating zeekctl-config.sh ...
stopping ...
stopping workers ...
worker-1-1 did not terminate ... killing ...
worker-1-2 did not terminate ... killing ...
worker-2-1 did not terminate ... killing ...
worker-2-2 did not terminate ... killing ...
stopping proxy ...
stopping manager ...
stopping logger ...
starting ...
starting logger ...
starting manager ...
starting proxy ...
starting workers .

Finally, we can open the current smtp.log (in a default zeekctl installation under /opt/zeek, this is /opt/zeek/logs/current/smtp.log) and verify that the extra smtp_urls field has been added, along with the URLs that Zeek extracted.

The idea behind all that is to extract the URLs that exist in the emails and then send them to a sandbox or a SIEM for analysis.
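As a starting point for that hand-off, Zeek raises the SMTP::log_smtp event each time a record is written to smtp.log, so the extracted URLs can also be handled inside Zeek. The sketch below only prints them. The file name smtp-url-handoff.zeek and the relative @load path are assumptions, and a real integration with a sandbox or SIEM would replace the print statement.

```zeek
# smtp-url-handoff.zeek (hypothetical file name)
@load base/protocols/smtp
# Assumes this file sits in the same directory as the extraction script.
@load ./custom-smtp-url-extraction

event SMTP::log_smtp(rec: SMTP::Info)
    {
    # Skip messages in which no URL was found.
    if ( |rec$smtp_urls| == 0 )
        return;

    # Print the connection UID next to each extracted URL.
    for ( url in rec$smtp_urls )
        print fmt("%s %s", rec$uid, url);
    }
```

Pairing each URL with the uid makes it possible to pivot back to the matching smtp.log and conn.log entries once a verdict comes back.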