Will Pearce

More by Will Pearce
Machine Learning for Red Teams, Part 1
Published 2018-11-14

TLDR: It’s possible to detect a sandbox using a process list with machine learning.

Introduction

For attackers, aggressive collection of data often leads to the disclosure of infrastructure, initial access techniques, and malware being unceremoniously pulled apart by analysts. The application of machine learning in the defensive space has not only increased the cost of being an attacker, but has also significantly limited a technique’s operational life. In the world that attackers currently find themselves in:

  • Mass data collection and analysis is accessible to defensive software, and by extension, defensive analysts
  • Machine learning is being used everywhere to accelerate defensive maturity
  • Attackers are always at a disadvantage, as we as humans try to defeat auto-learning systems that use every bypass attempt to learn more about us, and predict future bypass attempts. This is especially true for public research, and static bypasses. 

However, as we will present here, machine learning isn’t just for blue teams. This post will explore how attackers can make use of the little data they have to perform their own machine learning. We will present a case study that focuses on initial access. By the end of the post, we hope that you will have a better understanding of machine learning, and how we as attackers can apply machine learning for our own benefit.

A Process List As Data

Before discussing machine learning, we need to take a closer look at how we as attackers process information. I would argue that attackers gather less than 1% of the information available to them on any given host or network, and use less than 3% of the collected information to make informed decisions (don’t get too hung up on the percentages). Increasing data collection efforts in the name of machine learning would come at a cost to stealth, with no foreseeable benefit. Gathering more data isn’t the best solution for attackers; attackers need to increase their data utilization. However, increasing data utilization is difficult due to the textual nature of command output. For example, other than showing particular processes, architecture, and users, what more can the following process list really provide?

PID   ARCH  SESS  SYSTEM NAME    OWNER                         PATH
1     x64   0     smss.exe       NT AUTHORITY\SYSTEM           \SystemRoot\System32\smss.exe
4     x64   0     csrss.exe      NT AUTHORITY\SYSTEM           C:\Windows\system32\csrss.exe
236   x64   0     wininit.exe    NT AUTHORITY\SYSTEM           C:\Windows\system32\wininit.exe
312   x64   0     csrss.exe      NT AUTHORITY\SYSTEM           C:\Windows\system32\csrss.exe
348   x64   1     winlogon.exe   NT AUTHORITY\SYSTEM           C:\Windows\system32\winlogon.exe
360   x64   1     services.exe   NT AUTHORITY\SYSTEM           C:\Windows\system32\services.exe
400   x64   0     lsass.exe      NT AUTHORITY\SYSTEM           C:\Windows\system32\lsass.exe
444   x64   0     lsm.exe        NT AUTHORITY\SYSTEM           C:\Windows\system32\lsm.exe
452   x64   0     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\system32\svchost.exe
460   x64   0     svchost.exe    NT AUTHORITY\NETWORK SERVICE  C:\Windows\system32\svchost.exe
564   x64   0     svchost.exe    NT AUTHORITY\LOCAL SERVICE    C:\Windows\system32\svchost.exe
632   x64   0     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\system32\svchost.exe
688   x64   0     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\system32\svchost.exe
804   x64   0     svchost.exe    NT AUTHORITY\LOCAL SERVICE    C:\Windows\system32\svchost.exe
852   x64   0     spoolsv.exe    NT AUTHORITY\SYSTEM           C:\Windows\System32\spoolsv.exe
964   x64   0     taskhost.exe   Admin-PC\Admin                C:\Windows\system32\taskhost.exe
1004  x64   1     dwm.exe        Admin-PC\Admin                C:\Windows\system32\Dwm.exe
1044  x64   1     svchost.exe    NT AUTHORITY\LOCAL SERVICE    C:\Windows\system32\svchost.exe
1052  x64   0     explorer.exe   Admin-PC\Admin                C:\Windows\Explorer.EXE
1064  x64   1     svchost.exe    NT AUTHORITY\NETWORK SERVICE  C:\Windows\System32\svchost.exe
1120  x64   0     svchost.exe    NT AUTHORITY\LOCAL SERVICE    C:\Windows\System32\svchost.exe
1188  x64   0     UI0Detect.exe  NT AUTHORITY\SYSTEM           C:\Windows\system32\UI0Detect.exe
1344  x64   0     svchost.exe    NT AUTHORITY\NETWORK SERVICE  C:\Windows\system32\svchost.exe
1836  x64   0     taskhost.exe   NT AUTHORITY\LOCAL SERVICE    C:\Windows\system32\taskhost.exe
1972  x64   0     taskhost.exe   Admin-PC\Admin                C:\Windows\system32\taskhost.exe
508   x64   1     WinSAT.exe     Admin-PC\Admin                C:\Windows\system32\winsat.exe
828   x64   1     conhost.exe    Admin-PC\Admin                C:\Windows\system32\conhost.exe
652   x64   1     unsecapp.exe   NT AUTHORITY\SYSTEM           C:\Windows\system32\wbem\unsecapp.exe
684   x64   0     WmiPrvSE.exe   NT AUTHORITY\SYSTEM           C:\Windows\system32\wbem\wmiprvse.exe
2712  x64   0     conhost.exe    Admin-PC\Admin                C:\Windows\system32\conhost.exe
2796  x64   1     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\System32\svchost.exe
2852  x64   0     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\System32\svchost.exe
2928  x64   0     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\System32\svchost.exe

Textual data also makes it hard to describe differences between two process lists. How would you describe the differences between the process lists on different hosts?

A solution to this problem already exists – we can describe a process list numerically. Looking at the process list above, we can derive some simple numerical data:

  • There are 33 processes
  • The ratio of processes to users is 8.25
  • There are 4 observable users

By describing items numerically, we can start to analyze differences, rank, and categorize items. Let’s add a second process list.

                     Process List A  Process List B
Process Count        33              157
Process Count/Users  8.25            157
User Count           4               1

Viewing the numerical descriptions side-by-side reveals clear differences between the two process lists. We can now measure a process list on any given host, without knowing exactly what processes are running. So far, that doesn’t seem so useful, but with the knowledge that Process List A is a sandbox, and Process List B is not, we can examine the four new process lists below. Which of them are sandboxes?

               Process List C  Process List D  Process List E  Process List F
Process Count  30              84              195             34
Process/User   7.5             84              195             8.5
User Count     4               1               1               4

How might we figure this out? Our solution was to sum the values of each column, then calculate the average of the host totals. For each host total, below average values are marked as 1 for a sandbox, and above average values are marked as 0 for a normal host.

                    A      B    C     D    E    F     Host Score Average
Process Count       33     157  30    84   195  34
Process Count/User  8.25   157  7.5   84   195  8.5
User Count          4      1    4     1    1    4
Host Total          45.25  315  41.5  169  391  46.5  168.04
Sandbox Score       1      0    1     0    0    1
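
The averaging heuristic described above can be sketched in a few lines of Python. This is only an illustration: each host total is computed directly from the three feature rows shown in the tables, and the variable names are ours rather than anything from the original tooling.

```python
# Feature values copied from the tables above:
# (process count, processes per user, user count).
hosts = {
    "A": (33, 8.25, 4),
    "B": (157, 157, 1),
    "C": (30, 7.5, 4),
    "D": (84, 84, 1),
    "E": (195, 195, 1),
    "F": (34, 8.5, 4),
}

# Sum each host's column to get its total.
totals = {name: sum(values) for name, values in hosts.items()}

# The average of the host totals is the (arbitrary) decision threshold.
average = sum(totals.values()) / len(totals)

# Below-average totals are scored 1 (sandbox), everything else 0 (normal host).
scores = {name: int(total < average) for name, total in totals.items()}

print(round(average, 2))
print(scores)
```

Note how close host D sits to the threshold: one more unusually busy host in the data would move the average and could flip that score, which is the kind of volatility a learned model is meant to smooth out.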

Our solution seems to work out well; however, it was completely arbitrary. It’s likely that before working through our solution, you had already figured out which process lists were sandboxes. Not only did you correctly categorize the process lists, but you did so without textual data, against four process lists you had never seen! Using the same data points, we can use machine learning to correctly categorize a process list.

Machine Learning for Lowly Operators

The mathematical techniques used in machine learning attempt to replicate human learning. Much like the human brain has neurons, synapses, and electrical impulses that are all connected, artificial neural networks have nodes, weights, and an activation function that are all connected. Through repetition and making small adjustments between each iteration, both humans and artificial neural networks are able to adjust in order to get closer to an expected output. Effectively, machine learning attempts to replicate your brain with math. Both networks operate in a similar fashion as well.

In biology, an electrical impulse is introduced into a neural network, the electrical impulse travels across a synapse, and is processed by a neuron. The strength of the electrical impulse received from the synapse determines whether or not the neuron is activated. Performing the same action repeatedly strengthens the synapses between particular neurons.
In machine learning, an input is introduced into an artificial neural network. The input travels along a link weight into a node where it is passed into an activation function. The output of the activation function determines whether or not the node is activated. By iteratively examining outputs relative to a target value, link weights can be adjusted to reduce the error.
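
To make the machine learning half of that comparison concrete, here is a minimal sketch of a single node. It assumes a sigmoid activation function and a simple gradient-style weight update; the input values, starting weights, target, and learning rate are all hypothetical numbers chosen for illustration.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, weights):
    """Weighted sum of the inputs passed through the activation function."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights))
    return sigmoid(weighted_sum)


inputs = [0.5, -1.0, 0.25]   # hypothetical input values
weights = [0.4, 0.3, -0.2]   # hypothetical link weights
target = 1.0                 # the value we want the node to produce
learning_rate = 0.5          # hypothetical step size

before = node_output(inputs, weights)

# Nudge each weight in the direction that reduces the squared error.
# For a sigmoid node the update is (target - out) * out * (1 - out) * input.
delta = (target - before) * before * (1.0 - before)
weights = [w + learning_rate * delta * i for w, i in zip(weights, inputs)]

after = node_output(inputs, weights)
print(before, after)
```

After the single adjustment the node's output sits slightly closer to the target. Repeating that small correction many times is all that "training" means here.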

Artificial Neural Networks (ANNs) can have an arbitrary size. The network explored in this post has 3 inputs, 3 hidden layers, and a single output. One thing to notice about the larger ANN is the number of connections between each node. Each connection represents an additional calculation we can perform, increasing both the efficiency and the accuracy of the network. Additionally, as the ANN increases in size, the math doesn’t change (unless you want to get fancy), only the number of calculations.

Gathering and Preparing Data

Gathering a dataset of process lists is relatively easy. Any document with a macro will be executed in a sandbox by any half-decent mail filter, and the rest are normal hosts. To get a process list from a sandbox, or a remote system, a macro will need to gather and post a process list back for collection and processing. For processing, the dataset needs to be parsed. The process count, process to user ratio, and unique process count need to be calculated and saved. Finally, each item in the dataset needs to be correctly labelled with either 0 or 1. Alternatively, the macro could gather the numerical data from the process list and post the results back. Choose your own adventure. We prefer to have the raw process list for operational purposes.
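
As a rough sketch of the parsing step, the three values can be derived from a process list once it has been reduced to (name, owner) pairs. That input shape, the function name, and the small sample below are assumptions made for illustration, not the format our collection actually used.

```python
def process_list_features(processes):
    """Return (process count, process-to-user ratio, unique process count).

    `processes` is a list of (name, owner) pairs; this input shape is an
    assumption made for the sketch.
    """
    process_count = len(processes)
    user_count = len({owner for _, owner in processes})
    # Compare names case-insensitively so Dwm.exe and dwm.exe count once.
    unique_process_count = len({name.lower() for name, _ in processes})
    ratio = process_count / user_count if user_count else 0.0
    return process_count, ratio, unique_process_count


sample = [
    ("smss.exe", r"NT AUTHORITY\SYSTEM"),
    ("csrss.exe", r"NT AUTHORITY\SYSTEM"),
    ("svchost.exe", r"NT AUTHORITY\NETWORK SERVICE"),
    ("svchost.exe", r"NT AUTHORITY\LOCAL SERVICE"),
    ("explorer.exe", r"Admin-PC\Admin"),
    ("taskhost.exe", r"Admin-PC\Admin"),
]

features = process_list_features(sample)
label = 1  # 1 = sandbox, 0 = normal host, assigned by whoever collected the sample
print(features, label)
```

Keeping the raw list and deriving the numbers afterwards means new features can be added later without collecting the data again.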

There is one more transformation we need to make to the process list data set. Earlier we compared the sum of each process list to the average of each process list total. Using an average in this way is problematic, as very large or very small process list results could adjust the average significantly. Significant shifts would reclassify potentially large numbers of hosts, introducing volatility into our predictions. To help with this, we scale (normalize) the data set. There are a few techniques to do this. We tested all the scaling functions from scikit-learn and chose the StandardScaler transform. What is important here is that overly large or small values no longer have such a volatile effect on classification.

                      A       B       C       D       E       F
Process Count         -0.786  1.285   -0.836  -0.065  1.92    -0.770
Process Count/User    -0.501  1.652   -0.663  0.521   0.846   -0.648
Unique Process Count  0.812   -0.902  0.813   -0.902  -0.902  0.813
Label                 1       0       1       0       0       1
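
For readers who have not used it, standard scaling subtracts a column's mean from each value and divides by the column's standard deviation. The sketch below does this by hand in plain Python rather than calling scikit-learn, and it scales only the six raw process counts from earlier in the post, so its output will not match the table above, which was scaled against our full data set.

```python
import math


def standard_scale(column):
    """Rescale values to zero mean and unit variance (population std)."""
    mean = sum(column) / len(column)
    variance = sum((x - mean) ** 2 for x in column) / len(column)
    std = math.sqrt(variance) or 1.0  # avoid dividing by zero for constant columns
    return [(x - mean) / std for x in column]


process_counts = [33, 157, 30, 84, 195, 34]  # raw process counts for hosts A-F
scaled = standard_scale(process_counts)
print([round(x, 3) for x in scaled])
```

The ordering of the hosts is preserved, but a single very large process count now shifts every value by a bounded amount instead of dragging a raw average around.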

Building and Training a Network

The data used in the example above is pulled from our data set. With it we can start to explore how machine learning can help attackers detect sandboxes. At a high level, in order to successfully train an artificial neural network, we will iteratively:

  1. Introduce scaled data into the artificial neural network.
  2. Calculate the output of the activation function.
  3. Provide feedback to the network in the form of 0 or 1 (its label).
  4. Calculate the difference between the output and the feedback.
  5. Update the link weights in an attempt to reduce the difference calculated in step 4.

Some of you may be wondering about step 3. A small, but significant, detail responsible for our earlier success at detecting a sandbox was the fact we told you “Process List A” was a sandbox. From then on, the values of Process List A provided a reference point for everything else. An artificial neural network requires a similar reference point in order to measure how “wrong” it was.
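
The five steps can be sketched as a small training loop. This is not our scratchpad code: it is a plain-Python illustration that assumes a single hidden layer of 3 nodes, sigmoid activations, a bias weight per node, and hypothetical values for the learning rate and number of passes. It trains only on the six scaled hosts from the table above, which is far too little data for a real model.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Scaled features and labels for hosts A-F, copied from the table above.
samples = [
    ([-0.786, -0.501, 0.812], 1),
    ([1.285, 1.652, -0.902], 0),
    ([-0.836, -0.663, 0.813], 1),
    ([-0.065, 0.521, -0.902], 0),
    ([1.92, 0.846, -0.902], 0),
    ([-0.770, -0.648, 0.813], 1),
]

random.seed(0)
N_IN, N_HIDDEN = 3, 3   # assumed layout: 3 inputs, one hidden layer of 3 nodes
LEARNING_RATE = 0.5     # assumed value
EPOCHS = 2000           # assumed value

# Link weights (plus one bias weight per node), initialised to small random values.
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(N_IN + 1)] for _ in range(N_HIDDEN)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(N_HIDDEN + 1)]


def forward(x):
    """Steps 1-2: feed scaled data through the network."""
    xb = x + [1.0]  # the constant 1.0 feeds the bias weight
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    hb = hidden + [1.0]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
    return xb, hb, output


for _ in range(EPOCHS):
    for x, label in samples:
        xb, hb, output = forward(x)
        # Steps 3-4: compare the output with the label (the feedback).
        error = label - output
        # Step 5: push the error backwards and adjust the link weights.
        d_out = error * output * (1.0 - output)
        d_hidden = [d_out * w_out[j] * hb[j] * (1.0 - hb[j]) for j in range(N_HIDDEN)]
        for j in range(N_HIDDEN + 1):
            w_out[j] += LEARNING_RATE * d_out * hb[j]
        for j in range(N_HIDDEN):
            for i in range(N_IN + 1):
                w_hidden[j][i] += LEARNING_RATE * d_hidden[j] * xb[i]

predictions = [round(forward(x)[2]) for x, _ in samples]
print(predictions)
```

The raw output of `forward` is a value between 0 and 1, so it can be read as a rough confidence before it is rounded to a class.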

Using scikit-learn, we trained 3 models on 190 unique process lists. The data was scaled in 3 different ways, and even combined. In the end we simply chose the model that performed the best. When the network receives process list data it has never seen before, the network will (hopefully) output an accurate prediction.

Figure: Classifying a host as 1 or 0
Figure: Confidence level of prediction for the above categorization

The scratchpad code for the neural network can be found here. We opted not to explain all the math involved, although we did write it up. If you have questions, feel free to send them our way. Otherwise, if you’re just generally interested, we highly recommend Tariq Rashid’s book, “Make Your Own Neural Network”.

Weaponizing An Artificial Neural Network

Time to put the ANN to work. To test our proof-of-concept, we wrote a simple macro that:

  1. Collected a process list
  2. Calculated the inputs (process count, process/user count, and user count)
  3. Posted the values back to our server and ran them through the neural network for a prediction
  4. Staged code if the neural network predicted a normal host, and otherwise did nothing.

Next, we uploaded the malicious document to several online malware scanners (VirusTotal, malware-analysis, etc.) and waited. We executed the macro on a non-sandbox (first highlighted post-back), then after a few minutes 2 different sandboxes executed the macro and posted the calculated values back. Running the post-back values through the neural network provided 3 accurate predictions!

Figure: Macro post-backs (normal, sandbox, sandbox)
Figure: Probabilities and classifications for macro post-backs

From the predictions, additional logic would be used to decide whether or not to deploy malware. To recap, we:

  1. Derived numerical values from a process list
  2. Built a dataset of those values, and properly scaled them
  3. Trained an artificial neural network to successfully categorize a sandbox, based on the dataset
  4. Wrote a macro to post-back required values
  5. Sent in some test payloads and used the post-back values to predict a categorization

Conclusion

Hopefully, this was a good introduction to how attackers can harness the power of machine learning. We were able to successfully classify a sandbox from the wild. Most notably, the checks were not static and harnessed knowledge of every process list in the data set. For style points, the network we created could be embedded in an Excel document, making the checks client-side.

Regardless of where the ANN we created sits, machine learning will no doubt change the face of offensive security. From malware with embedded networks, to operator assistance, the possibilities are endless (and very exciting).

Credits and Sources

First and foremost, I want to thank Tariq Rashid (@rzeta0) for his book, “Make Your Own Neural Network”. It’s everything you want to know about machine learning, without all the “math-splaining.” Tariq also kindly answered a few questions I had along the way.

Secondly, I would like to thank James McCaffrey of Microsoft for checking my math and giving me back some of my sanity.

If this post piqued your interest in machine learning, I highly recommend “Make Your Own Neural Network” as a starting place.

Here are some other links that were helpful to us (if you go down this road):

An Approach to Bypassing Mail Filters
Published 2018-09-10

TLDR

By “nulling” the first one or two bytes of a docm file, some spam filters will allow a malicious document to be delivered despite being explicitly blocked. Upon opening the docm file, Microsoft Word gives the option to repair the document, which allows for the potential execution of a macro embedded in the document.

A number of vendors have independently verified this bypass as an issue, so naturally it’s time to write a blog post. While macro-enabled documents were the focus of our testing, the same methodology could apply to many other file types and applications.

Background

Suppose an email filter is configured to block macro-enabled Word documents (docm). How does that filter decide whether a particular sample is of a “docm” type to support a filtering decision?

Generally, we believe this occurs in two ways, although not always in tandem or mutually exclusive:

  1. Extension: The user-friendly way to mark files, this is no more than a string “extension” added to the end of a file name. Simply put, extensions provide a nice way to keep track of file types, and are generally used as a shortcut for the operating system to improve user experience. Windows will use a file extension to look up file registration info in the registry, allowing Windows to pass execution to the correct program with the file as an (optional) argument.
  2. Header: Files can be further recognized by a particular file header, as a result of the file format. Most file types have a well-defined structure, and the first 2-24 bytes, commonly referred to as “Magic Bytes”, usually provide a good indication of a file’s contents. For anyone curious, a good list of these is available at https://www.garykessler.net/library/file_sigs.html. Some quick examples:
    1. MZ (4d 5a) for PE files (exe, dll, sys)
    2. PK (50 4b) for zip files (zip, docm)
  3. Contents (Bonus) : With the use of more exhaustive parsing, a file could be identified based on its holistic structure matching a known format. Naturally this is rare in filtering mechanisms due to the computational cost of parsing.
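
To illustrate the difference between the header and contents approaches, here is a minimal sketch. The function names and the two-entry signature table are ours, and the contents check is deliberately crude: it only looks for the zip end-of-central-directory marker (`PK\x05\x06`) rather than parsing the archive. The sample file is an ordinary in-memory zip standing in for an OpenXML document.

```python
import io
import zipfile

# A few well-known signatures; real filters consult far larger tables.
SIGNATURES = {
    b"MZ": "pe",   # exe, dll, sys
    b"PK": "zip",  # zip, docm and other OpenXML containers
}


def sniff_by_header(data):
    """Identify a file purely from its leading magic bytes."""
    for magic, kind in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return "unknown"


def sniff_by_contents(data):
    """Crude structural check: is a zip end-of-central-directory marker present?"""
    return "zip" if b"PK\x05\x06" in data else "unknown"


# Build a small zip archive in memory to stand in for a document.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", "<doc/>")
original = buffer.getvalue()

# The same bytes with only the first byte of the header changed to 0x00.
tampered = b"\x00" + original[1:]

print(sniff_by_header(original), sniff_by_header(tampered))
print(sniff_by_contents(original), sniff_by_contents(tampered))
```

A check that trusts only the leading bytes loses track of the file as soon as one byte changes, while even a shallow look at the rest of the structure still recognizes it. That gap is what the rest of this post explores.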

With regard to bypassing a mail filter, consider the following:

  • A malicious file is created which would normally match an explicit block rule.
  • The contents of the file header are tampered with to bypass “in-transit” filtering.
  • A file extension is chosen to direct execution to a specific application.
  • The file is still considered “valid” by the application and executed (semi-)normally.

Testing Parameters

As mentioned above, we were specifically interested in the filtering mechanisms for “Macro-Enabled Word Documents” or DOCM files.

Payload: PlanetExpress.docm, an OpenXML (Office 2007+) formatted document with the following embedded macro.

Sub Document_Open()
    MsgBox "Hello!"
End Sub

Delivery Technique: PlanetExpress.docm directly attached to an e-mail that is delivered to Outlook.

Defense: A mail product configured to block all macro-enabled files, but allow any traditional documents. Each mail filtering product has a different process for handling incoming documents. For example, with the first product we tested, all documents were inspected by the filter regardless of the blocking rules put in place. Each document was opened, parsed, and inspected. If the document was deemed safe, a link to the document would be sent to the user, rather than the document itself.

The Classic Phish

For our first test, we simply changed the file extension from .docm to .doc. This is a common technique, and probably the “original” technique for hiding the fact that a document with a macro is being sent in. This technique can fool users but not a modern mail filter; as expected, this file was blocked every time.

Testing the Extension (Attempt #1)

For our first real attempt at bypassing the filter, we removed the extension completely. To our surprise, the mail filter allowed the document through unopened, unparsed, and uninspected. However, if the user attempted to open the PlanetExpress file, Windows would present a prompt asking which application should be used to handle the file.

Testing the Extension and the File Header (Attempt #2)

For our next attempt, we changed the file extension from .docm to .doc, and nulled the first byte of the file header, so that the leading “PK” bytes (50 4b) became 00 4b.

Again, to our surprise, this got past the filter and was delivered to the user’s inbox (unopened, unparsed, and uninspected). However, when the user opened the document, Word brought up the ‘File Conversion’ dialog box.

Putting the Pieces Together (Attempt #3)

We knew from the previous test that we could use any extension, so long as the first byte or two were nulled out. The challenge then became to find something a Windows program could execute, despite being a corrupt file. In our final test, we kept the original docm extension and nulled the first byte of the file header. The email was again delivered, but interestingly enough, Word handled the corruption differently:

  1. When the corruption is detected, it apologizes for not being able to open the document.
  2. Clicking ‘OK’, we get an option to repair the file.
  3. Clicking ‘Yes’, we get a brand new file called ‘Document1’.

Figure: Corruption Detection
Figure: Corruption Repair
Figure: Word produces a repaired document

However, when we click ‘Enable Content’, we don’t get the message box from our macro. Why? Because Word has created a new “Document1”, and our old macro, while still there, won’t be executed until the next time we open the file. This is not very useful from an attack standpoint, but a simple macro change will get us execution. Change the macro to:

Sub Document_New()
    MsgBox "Hello!"
End Sub

Now when we click through the prompts and enable macros, we are warmly greeted by some macro execution.

Execution!

Conclusion

This was a relatively simple attack that came from simply questioning our assumptions about mail filters. While the addition of some user prompts, and the requirement for a docm extension are not ideal, this does represent an interesting proof of concept that has successfully bypassed mail filtering in the real world. A number of other scenarios involving extensions, original formats, and header tampering have since been attempted, but we feel the attempts above concisely convey the general process.

Here are the steps for the working attack:

  1. Create a macro-enabled Word document
  2. Add a Document_New macro as above and save the document
  3. Open the document in a hex editor, null out the first byte, and re-save the file
  4. Email the document to your victim and wait patiently for profit

Vendor Response

  • Mimecast: Fixed (2017)
  • Barracuda: Fixed (2018)
  • Microsoft: Non-issue citing “user action required” (2017)
[post_title] => An Approach to Bypassing Mail Filters [post_excerpt] => [post_status] => publish [comment_status] => open [ping_status] => open [post_password] => [post_name] => bypassing-mail-filters [to_ping] => [pinged] => [post_modified] => 2021-04-13 00:05:17 [post_modified_gmt] => 2021-04-13 00:05:17 [post_content_filtered] => [post_parent] => 0 [guid] => http://www.netspi.com/?p=23536 [menu_order] => 221 [post_type] => post [post_mime_type] => [comment_count] => 0 [filter] => raw ) ) [post_count] => 2 [current_post] => -1 [in_the_loop] => [post] => WP_Post Object ( [ID] => 23535 [post_author] => 76 [post_date] => 2018-11-14 07:00:41 [post_date_gmt] => 2018-11-14 07:00:41 [post_content] =>

TLDR: It’s possible to detect a sandbox using a process list with machine learning.

Introduction

For attackers, aggressive collection of data often leads to the disclosure of infrastructure, initial access techniques, and malware being unceremoniously pulled apart by analysts. The application of machine learning in the defensive space has not only increased the cost of being an attacker, but has also limited a techniques’ operational life significantly. In the world that attackers currently find themselves in:

  • Mass data collection and analysis is accessible to defensive software, and by extension, defensive analysts
  • Machine learning is being used everywhere to accelerate defensive maturity
  • Attackers are always at a disadvantage, as we as humans try to defeat auto-learning systems that use every bypass attempt to learn more about us, and predict future bypass attempts. This is especially true for public research, and static bypasses. 

However, as we will present here, machine learning isn’t just for blue teams. This post will explore how attackers can make use of the little data they have to perform their own machine learning. We will present a case study that focuses on initial access. By the end of the post, we hope that you will have a better understanding of machine learning, and how we as attackers can apply machine learning for our own benefit.

A Process List As Data

Before discussing machine learning, we need to take a closer look at how we as attackers process information. I would argue that attackers gather less than 1% of the information available to them on any given host or network, and use less than 3% of the collected information to make informed decisions (don’t get too hung up on the percentages). Increasing data collection efforts in the name of machine learning would come at a cost to stealth, with no foreseeable benefit. Gathering more data isn’t the best solution for attackers; attackers need to increase their data utilization. However, increasing data utilization is difficult due to the textual nature of command output. For example, other than showing particular processes, architecture, and users, what more can the following process list really provide?

PID   ARCH  SESS  SYSTEM NAME    OWNER                         PATH
1     x64   0     smss.exe       NT AUTHORITY\SYSTEM           \SystemRoot\System32\smss.exe
4     x64   0     csrss.exe      NT AUTHORITY\SYSTEM           C:\Windows\system32\csrss.exe
236   x64   0     wininit.exe    NT AUTHORITY\SYSTEM           C:\Windows\system32\wininit.exe
312   x64   0     csrss.exe      NT AUTHORITY\SYSTEM           C:\Windows\system32\csrss.exe
348   x64   1     winlogon.exe   NT AUTHORITY\SYSTEM           C:\Windows\system32\winlogon.exe
360   x64   1     services.exe   NT AUTHORITY\SYSTEM           C:\Windows\system32\services.exe
400   x64   0     lsass.exe      NT AUTHORITY\SYSTEM           C:\Windows\system32\lsass.exe
444   x64   0     lsm.exe        NT AUTHORITY\SYSTEM           C:\Windows\system32\lsm.exe
452   x64   0     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\system32\svchost.exe
460   x64   0     svchost.exe    NT AUTHORITY\NETWORK SERVICE  C:\Windows\system32\svchost.exe
564   x64   0     svchost.exe    NT AUTHORITY\LOCAL SERVICE    C:\Windows\system32\svchost.exe
632   x64   0     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\system32\svchost.exe
688   x64   0     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\system32\svchost.exe
804   x64   0     svchost.exe    NT AUTHORITY\LOCAL SERVICE    C:\Windows\system32\svchost.exe
852   x64   0     spoolsv.exe    NT AUTHORITY\SYSTEM           C:\Windows\System32\spoolsv.exe
964   x64   0     taskhost.exe   Admin-PC\Admin                C:\Windows\system32\taskhost.exe
1004  x64   1     dwm.exe        Admin-PC\Admin                C:\Windows\system32\Dwm.exe
1044  x64   1     svchost.exe    NT AUTHORITY\LOCAL SERVICE    C:\Windows\system32\svchost.exe
1052  x64   0     explorer.exe   Admin-PC\Admin                C:\Windows\Explorer.EXE
1064  x64   1     svchost.exe    NT AUTHORITY\NETWORK SERVICE  C:\Windows\System32\svchost.exe
1120  x64   0     svchost.exe    NT AUTHORITY\LOCAL SERVICE    C:\Windows\System32\svchost.exe
1188  x64   0     UI0Detect.exe  NT AUTHORITY\SYSTEM           C:\Windows\system32\UI0Detect.exe
1344  x64   0     svchost.exe    NT AUTHORITY\NETWORK SERVICE  C:\Windows\system32\svchost.exe
1836  x64   0     taskhost.exe   NT AUTHORITY\LOCAL SERVICE    C:\Windows\system32\taskhost.exe
1972  x64   0     taskhost.exe   Admin-PC\Admin                C:\Windows\system32\taskhost.exe
508   x64   1     WinSAT.exe     Admin-PC\Admin                C:\Windows\system32\winsat.exe
828   x64   1     conhost.exe    Admin-PC\Admin                C:\Windows\system32\conhost.exe
652   x64   1     unsecapp.exe   NT AUTHORITY\SYSTEM           C:\Windows\system32\wbem\unsecapp.exe
684   x64   0     WmiPrvSE.exe   NT AUTHORITY\SYSTEM           C:\Windows\system32\wbem\wmiprvse.exe
2712  x64   0     conhost.exe    Admin-PC\Admin                C:\Windows\system32\conhost.exe
2796  x64   1     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\System32\svchost.exe
2852  x64   0     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\System32\svchost.exe
2928  x64   0     svchost.exe    NT AUTHORITY\SYSTEM           C:\Windows\System32\svchost.exe

Textual data also makes it hard to describe differences between two process lists. How would you describe the differences between the process lists on different hosts?

A solution to this problem already exists – we can describe a process list numerically. Looking at the process list above, we can derive some simple numerical data:

  • There are 33 processes
  • The ratio of processes to users is 8.25
  • There are 4 observable users
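The three values above can be derived with a few lines of code. The following is a minimal sketch, assuming the process list has already been parsed into name and owner pairs; the `Process` tuple and the `describe_process_list` function are hypothetical names used for illustration, not part of the original tooling.

```python
from collections import namedtuple

# Hypothetical container for one parsed row of a process list.
Process = namedtuple("Process", ["name", "owner"])


def describe_process_list(processes):
    """Return (process count, processes per user, user count)."""
    process_count = len(processes)
    # Each distinct owner string counts as one observable user.
    user_count = len({p.owner for p in processes})
    ratio = process_count / user_count if user_count else 0.0
    return process_count, ratio, user_count


# A shortened, illustrative process list (not the full list above).
sample = [
    Process("smss.exe", r"NT AUTHORITY\SYSTEM"),
    Process("svchost.exe", r"NT AUTHORITY\NETWORK SERVICE"),
    Process("svchost.exe", r"NT AUTHORITY\LOCAL SERVICE"),
    Process("explorer.exe", r"Admin-PC\Admin"),
    Process("conhost.exe", r"Admin-PC\Admin"),
]

features = describe_process_list(sample)
print(features)  # (5, 1.25, 4)
```

Running the same function over the full 33-row list would give the three numbers listed above.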

By describing items numerically, we can start to analyze differences, rank, and categorize items. Let’s add a second process list.

                     Process List A   Process List B
Process Count        33               157
Process Count/Users  8.25             157
User Count           4                1

Viewing the numerical descriptions side-by-side reveals clear differences between the two process lists. We can now measure a process list on any given host, without knowing exactly what processes are running. So far, that doesn’t seem so useful, but with the knowledge that Process List A is a sandbox, and Process List B is not, we can examine the four new process lists below. Which of them are sandboxes?

               Process List C   Process List D   Process List E   Process List F
Process Count  30               84               195              34
Process/User   7.5              84               195              8.5
User Count     4                1                1                4

How might we figure this out? Our solution was to sum the values in each host’s column, then calculate the average of the host totals. Host totals below the average are marked as 1 for a sandbox, and host totals above the average are marked as 0 for a normal host.

                    A       B     C      D     E     F      Host Score Average
Process Count       33      157   30     84    195   34
Process Count/User  8.25    157   7.5    84    195   8.5
User Count          4       1     4      1     1     4
Host Total          45.25   315   41.5   169   391   46.5   168.04
Sandbox Score       1       0     1      0     0     1
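The scoring above is simple enough to express directly. This sketch uses the three rows from the table for hosts A through F; the `score_hosts` function name is hypothetical.

```python
# Process count, process count per user, and user count for hosts A-F,
# copied from the table above.
hosts = {
    "A": (33, 8.25, 4),
    "B": (157, 157, 1),
    "C": (30, 7.5, 4),
    "D": (84, 84, 1),
    "E": (195, 195, 1),
    "F": (34, 8.5, 4),
}


def score_hosts(hosts):
    """Sum each host's values, then mark below-average totals as sandboxes."""
    totals = {name: sum(values) for name, values in hosts.items()}
    average = sum(totals.values()) / len(totals)
    scores = {name: 1 if total < average else 0 for name, total in totals.items()}
    return average, scores


average, scores = score_hosts(hosts)
print(round(average, 2))  # 168.04
print(scores)             # {'A': 1, 'B': 0, 'C': 1, 'D': 0, 'E': 0, 'F': 1}
```

Note how a single very large host total would shift the average and could flip the score of hosts near the threshold, such as D.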

Our solution seems to work well; however, it was completely arbitrary. It’s likely that before working through our solution, you had already figured out which process lists were sandboxes. Not only did you correctly categorize the process lists, but you did so without textual data, against four process lists you’d never seen! Using the same data points, we can use machine learning to correctly categorize a process list.

Machine Learning for Lowly Operators

The mathematical techniques used in machine learning attempt to replicate human learning. Much like the human brain has neurons, synapses, and electrical impulses that are all connected, artificial neural networks have nodes, weights, and an activation function that are all connected. Through repetition and making small adjustments between each iteration, both humans and artificial neural networks are able to adjust in order to get closer to an expected output. Effectively, machine learning attempts to replicate your brain with math. Both networks operate in a similar fashion as well.

In biology, an electrical impulse is introduced into a neural network, the electrical impulse travels across a synapse, and is processed by a neuron. The strength of the electrical impulse received from the synapse determines whether or not the neuron is activated. Performing the same action repeatedly strengthens the synapses between particular neurons.
In machine learning, an input is introduced into an artificial neural network. The input travels along a link weight into a node where it is passed into an activation function. The output of the activation function determines whether or not the node is activated. By iteratively examining outputs relative to a target value, link weights can be adjusted to reduce the error.
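To make the flow through a single node concrete, here is a minimal sketch. It assumes a sigmoid activation function, which this post does not specify; the input values, weights, and function names are made up for illustration.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def node_output(inputs, weights, bias=0.0):
    """Multiply each input by its link weight, sum, then apply the activation."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(weighted_sum)


# Three illustrative inputs and link weights (not values from our data set).
out = node_output([0.5, -1.0, 2.0], [0.4, 0.3, -0.2])
print(round(out, 4))  # 0.3775
```

An output close to 1 can be read as the node "firing" and an output close to 0 as the node staying quiet. Training is the process of nudging the weights so the output lands nearer the target.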

Artificial Neural Networks (ANNs) can have an arbitrary size. The network explored in this post has 3 inputs, 3 hidden layers, and a single output. One thing to notice about a larger ANN is the number of connections between each node. Each connection represents an additional calculation we can perform, increasing both the efficiency and the accuracy of the network. Additionally, as the ANN increases in size, the math doesn’t change (unless you want to get fancy), only the number of calculations.

Gathering and Preparing Data

Gathering a dataset of process lists is relatively easy. Any document with a macro will be executed in a sandbox by any half decent mail filter, and the rest are normal hosts. To get a process list from a sandbox, or a remote system, a macro will need to gather and post a process list back for collection and processing. For processing, the dataset needs to be parsed. The process count, process to user ratio, and unique process count need to be calculated and saved. Finally, each item in the dataset needs to be correctly labelled with either 0 or 1. Alternatively, the macro could gather the numerical data from the process list and post the results back. Choose your own adventure. We prefer to have the raw process list for operational purposes.

There is one more transformation we need to make to the process list data set. Earlier we compared each process list’s total to the average of all the totals. Using an average in this way is problematic, as very large or very small process list results could adjust the average significantly. Significant shifts would reclassify potentially large numbers of hosts, introducing volatility into our predictions. To help with this, we scale (normalize) the data set. There are a few techniques to do this. We tested all the scaling functions from scikit-learn and chose the StandardScaler transform. What is important here is that overly large or small values no longer have such a volatile effect on classification.

                      A       B       C       D       E       F
Process Count         -0.786  1.285   -0.836  -0.065  1.92    -0.770
Process Count/User    -0.501  1.652   -0.663  0.521   0.846   -0.648
Unique Process Count  0.812   -0.902  0.813   -0.902  -0.902  0.813
Label                 1       0       1       0       0       1
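For readers who want to see what this kind of scaling does, the sketch below standardizes a single feature by subtracting the mean and dividing by the population standard deviation, which is the calculation scikit-learn's StandardScaler applies per feature. It is written with the standard library only, and the `standardize` function name is hypothetical. The input is the process count for the six hosts shown earlier, so the results will not match the table above, which was presumably scaled against our full data set.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Process counts for hosts A-F from the earlier tables.
process_counts = [33, 157, 30, 84, 195, 34]
scaled = standardize(process_counts)
print([round(v, 3) for v in scaled])
```

After scaling, every feature sits on a comparable range, so a feature with large raw values (such as process count) no longer dominates one with small raw values (such as user count).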

Building and Training a Network

The data used in the example above is pulled from our data set. With it we can start to explore how machine learning can help attackers detect sandboxes. At a high level, in order to successfully train an artificial neural network, we will iteratively:

  1. Introduce scaled data into the artificial neural network.
  2. Calculate the output of the activation function.
  3. Provide feedback to the network in the form of 0 or 1 (its label).
  4. Calculate the difference between the output and the feedback.
  5. Update the link weights in an attempt to reduce the difference calculated in step 4.

Some of you may be wondering about step 3. A small, but significant, detail responsible for our earlier success at detecting a sandbox was the fact we told you “Process List A” was a sandbox. From then on, the values of Process List A provided a reference point for everything else. An artificial neural network requires a similar reference point in order to measure how “wrong” it was.
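The five training steps can be sketched with a deliberately simplified network: a single sigmoid node with no hidden layers, trained on the six scaled rows from the table in the previous section. This is not the network we trained. The sigmoid activation, the weight update rule (the log-loss gradient for a sigmoid node), the learning rate, and the epoch count are all assumptions chosen to keep the example short.

```python
import math

# Scaled features (process count, process count/user, unique process count)
# and labels for hosts A-F, copied from the scaled table in this post.
DATA = [
    ((-0.786, -0.501, 0.812), 1),
    ((1.285, 1.652, -0.902), 0),
    ((-0.836, -0.663, 0.813), 1),
    ((-0.065, 0.521, -0.902), 0),
    ((1.92, 0.846, -0.902), 0),
    ((-0.770, -0.648, 0.813), 1),
]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, inputs):
    """Step 2: weighted sum of the inputs passed through the activation."""
    return sigmoid(sum(w * i for w, i in zip(weights, inputs)) + bias)


def train(data, epochs=500, learning_rate=0.1):
    """Repeat steps 1-5 for every row, for a fixed number of epochs."""
    weights = [0.0, 0.0, 0.0]  # assumed starting point
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in data:                   # steps 1 and 3
            output = predict(weights, bias, inputs)  # step 2
            error = label - output                   # step 4
            # Step 5: nudge each link weight in the direction that shrinks the error.
            weights = [w + learning_rate * error * i
                       for w, i in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(DATA)
predictions = [round(predict(weights, bias, inputs)) for inputs, _ in DATA]
print(predictions)  # [1, 0, 1, 0, 0, 1]
```

Six rows are far too few to say anything about accuracy on unseen hosts; the point is only to show how the label in step 3 drives the weight updates in step 5.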

Using scikit-learn, we trained 3 models on 190 unique process lists. The data was scaled in 3 different ways, and even combined. In the end we simply chose the model that performed the best. When the network receives process list data it has never seen before, the network will (hopefully) output an accurate prediction.

Classifying a host as 1 or 0
Confidence level of prediction for the above categorization

The scratchpad code for the neural network can be found here. We opted not to explain all the math involved, although we did write it up. If you have questions, feel free to send them our way. Otherwise, if you’re just generally interested, we highly recommend Tariq Rashid’s book, “Make Your Own Neural Network”.

Weaponizing An Artificial Neural Network

Time to put the ANN to work. To test our proof-of-concept, we wrote a simple macro that:

  1. Collected a process list
  2. Calculated the inputs (process count, process/user count, and user count)
  3. Posted the values back to our server and ran them through the neural network for a prediction
  4. Staged code if the neural network predicted a normal host; otherwise, did nothing.

Next, we uploaded the malicious document to several online malware scanners (VirusTotal, malware-analysis, etc.) and waited. We executed the macro on a non-sandbox (first highlighted post-back), then after a few minutes two different sandboxes executed the macro and posted the calculated values back. Running the post-back values through the neural network provided 3 accurate predictions!

Macro post-backs (normal, sandbox, sandbox)
Probabilities and classifications for macro post-backs

From the predictions, additional logic would be used to decide whether or not to deploy malware. Just to recap, we:

  1. Derived numerical values from a process list
  2. Built a dataset of those values, and properly scaled them
  3. Trained an artificial neural network to successfully categorize a sandbox, based on the dataset
  4. Wrote a macro to post-back required values
  5. Sent in some test payloads and used the post-back values to predict a categorization

Conclusion

Hopefully, this was a good introduction on how attackers can harness the power of machine learning. We were able to successfully classify a sandbox from the wild. Most notably, the checks were not static and harnessed knowledge of every process list in the data set. For style points, the network we created could be embedded in an Excel document, and make the checks client side.

Regardless of where the ANN we created sits, machine learning will no doubt change the face of offensive security. From malware with embedded networks, to operator assistance, the possibilities are endless (and very exciting).

Credits and Sources

First and foremost, I want to thank Tariq Rashid (@rzeta0) for his book, “Make Your Own Neural Network”. It’s everything you want to know about machine learning, without all the “math-splaining.” Tariq also kindly answered a few questions I had along the way.

Secondly, I would like to thank James McCaffrey of Microsoft for checking my math and giving me back some of my sanity.

If this post piqued your interest in machine learning, I highly recommend “Make Your Own Neural Network” as a starting place.

Here are some other links that were helpful to us (if you go down this road):
