This installment is part of a broader learning series to help you become a Jupyter Notebook ninja in Microsoft Sentinel. The installments will be bite-sized to enable you to easily digest the new content.
KNOWLEDGE CHECK: And, once you've completed all of the parts of this series, you can take the Knowledge Check. If you score 80% or more in the Knowledge Check, you can expect your very own Notebooks Ninja participation certificate from us.
Jupyter Notebooks are a fantastic resource for security analysts, providing a range of powerful and flexible capabilities. Microsoft Sentinel’s integration with Notebooks can provide a quick and straightforward way for security analysts to use Notebooks, however for those new to Notebooks and coding they can be a little daunting.
In this blog we will cover some of the basics of creating your first Microsoft Sentinel Notebook using Python, including how to troubleshoot some common issues you may come across.
Before we begin, make sure to familiarize yourself with Notebooks in Microsoft Sentinel via Azure Machine Learning.
If you wish to learn more about this topic, we are running introductory training on December 16th, 2021: Become a Jupyter Notebooks Ninja – MSTICPy Fundamentals to Build Your Own Notebooks. Sign Up Here
One of the important things about using Python in Notebooks is that you can install and use code libraries (referred to as packages) created by others, allowing you to access the functionality they provide without having to code them yourself.
There are several ways to install Python packages depending on how you want to find and access the packages, however the simplest and easiest is using pip. Pip (https://pypi.org/project/pip/) is the package installer for Python and makes finding and installing Python packages simple.
You can use pip to install packages via the command line, or if you are using a Notebook, directly in a Notebook cell. Installing directly in a Notebook is often preferred as it ensures that you are installing the package in the same Python environment the Notebook is being executed in. To install via a Notebook code cell, we need to use `%pip` followed by install and the package name. e.g.:
%pip install requests
If you already have a package installed but you want to update to the latest version, you can add the `--upgrade` parameter to the command used:
%pip install –upgrade requests
You may also want to install a specific version of a package. This can be done by specifying the version number.
%pip install requests==2.22.0
Once a package is installed, you need to import the package before it can be used. This is done with the `import` statement.
There are 2 ways to import things in Python:
- `import <package>` - this will do a standard import of the package
- `from <package> import <item>` - this imports a specific item from the package
You can also import packages and rename them for ease when calling them later:
`import <package> as <alias>`
import pandas as pd
For example, the popular Machine Learning tool package scikit-learn is installed with:
%pip install scikit-learn
However, it is imported with:
Now that we know how to install and import packages, we can install packages that will be useful to us in creating our Notebook. MSTICPy is a package created by the Microsoft Threat Intelligence Center (MSTIC) and provides a range of tools to make security analysis and investigations in Notebooks quicker and easier. You cand find out more about MSTICPy here:
We can now install MSTICPy. To make sure we get the latest version if we already have it installed, we are going to use the –upgrade parameter.
%pip install --upgrade msticpy
Now we could import MSTICPy with `import msticpy` however it is a big package with a lot of features, so to make it easier we have a function called `init_notebook` that conducts several checks to make sure the environment is good, handles key imports and set up for us.
import msticpy msticpy.init_notebook(globals())
MSTICPy can handle connections to a variety of data sources and services, including Microsoft Sentinel. As such it needs to handle several configuration details and credentials, things such as the Microsoft Sentinel workspaces you want to get data from, or API (Application Programming Interfaces) keys for external services such as Virus Total.
To make it easier to manage and re-use the configuration and credentials for these things MSTICPy has its own config file that holds these items - `msticpyconfig.yaml`.
The first time you use MSTICPy you need to populate your msticpyconfig.yaml file. This is a one-time activity once you have created it, you can simply re-use in future. To help with the set-up we have created several Notebook widgets to help you populate the file.
We have also created a Notebook to help you create to file. Once you have run the ‘Getting Started’ Notebook it is recommended that you run the ‘Configuring your Notebook Environment’ Notebook before creating your first Notebook, you can find this in the Microsoft Sentinel portal.
You can also find more documentation on the config file and creation of it, in the MSTICPy docs
Querying data from Microsoft Sentinel is handled by MSTICPy's `QueryProvider`. The first step is to initialize a QueryProvider and tell it we want to use the Microsoft Sentinel Query provider.
The other thing we want to provide the QueryProvider with is some details of the workspace we want to connect to. We *could* do this manually, but it is much easier to get details from the configuration we set up earlier. We can do this with `WorkspaceConfig`
from msticpy.nbtools import nbinit nbinit.init_Notebook(namespace=globals()) qry_prov=QueryProvider("MicrosoftSentinel") ws_config = WorkspaceConfig(workspace="MyWorkspace")
What WorkspaceConfig is doing is creating the connection string used by the QueryProvider. We can see what that connection string looks like with:
Once set up we can tell the `QueryProvider` to `connect` which will kick off the authentication process. There are several ways that we can handle that authentication but when starting off we can use the default options that prompts the user to log in using a Device Code.
This will then display a code in the Notebook cell output and prompt you to open a browser and end the code shown. You will then login as normal using your Azure AD (Azure Active Directory) credentials.
You can then go back to the Notebook and see that the authentication has been completed:
Now that we are connected to Microsoft Sentinel, we can start to look at running some queries to get some data. MSTICPy comes with several built-in Microsoft Sentinel queries to get some common datasets into the Notebook. These are different to the queries included in the Microsoft Sentinel GitHub and are more focused on collecting common sets of data that users might need to answer analytical questions.
You can see a list of the MSTICPy queries with `.list_queries.`
You can also use `.browse_queries()` to see the available queries in an interactive browser widget.
Now that we have found a query that we want to run we simply pass its name to the `QueryProvider` and that in turn returns to results of the query in a Pandas Data Frame. Most queries support additional parameters, but we are showing one here that does not need any parameters.
More typically the query function will expect parameters such as the host name or IP address that you are searching for.
If you try to run a query without supplying the required parameter, it will return an error message including the help for the query with the parameter definitions.
Most queries also require date/time parameters for the beginning/ending bounds of the query. By default, these are supplied by a time range set in the query provider. Each instance of a query provider has its own time range. You can change the default query range by running the following.
This brings up a widget letting you change the defaults for this query provider. You can also supply "start” and “end” parameters to the query function – either as Python datetimes or as time strings:
from datetime import datetime qry_prov.LinuxSyslog.user_logon( host_name="mylxhost", start="2021-11-19 20:30", end=datetime.utcnow() )
In addition to the stock query, we can customize certain elements of the query.
For example, if we want to append a line with `| take 10` to the query we have selected to limit the number of results returned we can pass that in with the `add_query_items` parameter:
qry_prov.SecurityAlert.list_alerts(add_query_items="| take 10")
Data returned by the `QueryProvider` comes back in a Pandas Data Frame. This provides us with a powerful and flexible way to access our data.
One of the core things we want to do is look at specific rows in our table. Each table has an index that can be used to call a row using `.loc`, alternatively we can return a row by its position in the table with `.iloc`
We can also choose just to return specific columns by providing a list of them to the Data Frame (note the "[:5]” means return the last 5 rows):
alert_df.iloc[:5][["AlertName", "AlertSeverity", "Description"]]
We can also do things such as search for rows with specific data:
One of the powerful elements of Notebooks is combining data from Microsoft Sentinel with data from other sources. One of the most common sources of this data in security is Threat Intelligence (TI) data. MSTICPy has support for several Threat Intelligence data sources including:
The first step in using these TI sources is to create a `TILookup` object. This can then be used to perform lookups against the various supported providers.
Lookups can be done against individual items via `.lookup_ioc` or against multiple items with `.lookup_iocs` and you can configure things such as which Threat Intelligence sources are used.
ti = TILookup() ti.lookup_iocs(signin_df, obs_col="IPAddress", providers=["GreyNoise"])
To make viewing results easier there is a widget to allow you to interactively browse results:
MSTICPy also has integration with a range of Azure APIs that can be used to retrieve additional information or perform actions such as get Microsoft Sentinel incidents.
from msticpy.data.azure_sentinel import AzureSentinel azs = AzureSentinel() azs.connect() azs.get_incident(incident_id = "7c768f11-31f1-46ca-8a5c-25df2e6b7021", sub_id = "8df49d90-99eb-4c31-985d-64b3f33caa93", res_grp= "sent", ws_name="workspace")
You can find out more about MSTICPy’s support for Azure APIs in the documentation: https://msticpy.readthedocs.io/en/latest/data_acquisition/AzureData.html & https://msticpy.readthedocs.io/en/latest/data_acquisition/AzureSentinel.html
The ability to create complex, interactive visualizations is one of the key benefits of Notebooks, allowing analysts to see data in a unique way and use it to identify patterns of anomalies that may not otherwise be possible to identify.
Creating these visualizations from scratch can be quite a complex task and involve a lot of code if starting from nothing. To make the process easier MSTICPy contains several common visualizations work out the box with common data sources from Microsoft Sentinel, and that can quickly and easily be called with minimal code.
Understanding when events occurred and in what order is a key component of many security investigations. MSTICPy can plot diverse types of timelines with several types of data.
user_df = qry_prov.Azure.list_aad_signins_for_account(account_name="email@example.com") timeline.display_timeline(user_df, source_columns=["UserPrincipalName", "ResultType"]
We can also plot time lines showing events with a duration rather than a single time stamp with ` display_timeline_duration`:
timeline_duration.display_timeline_duration(alert_df, group_by="AlertName", time_column="StartTimeUtc", end_time_column="EndTimeUtc")
alert_df.mp_plot.timeline(group_by="Severity", source_columns=["AlertName", "TimeGenerated"])
The Matrix Plot graph in MSTICPy allows you to plot the interactions between two elements in your data. This can be useful for seeing the relationships between points in a dataset, for example if you wanted to see how often certain IP addresses are communicating with each other in a network you can create a matrix plot with a source IP address on one axis, and a destination IP address on the other axis.
As with the timeline plots, the matrix plot can be created directly from a DataFrame using `mp_plot`:
network_data.mp_plot.matrix(x="SourceIP", y="DestinationIP", title="IP Interaction")
We have seen a couple of widgets already in the query and threat intelligence result browsers. These widgets make Notebooks much more accessible by providing a visual way to interact and customize them without having to write any code. MSTICPy includes a number visual, interactive widgets to allow users to select various parameters to customize the Notebook.
network_vendor_data_q = "CommonSecurityLog | summarize by DeviceVendor" network_vendor_data = qry_prov.exec_query(network_vendor_data_q) network_selector = nbwidgets.SelectItem( item_list=network_vendor_data["DeviceVendor"].to_list(), description='Select a vendor', action=print, auto_display=True );
q_times = nbwidgets.QueryTime(units='day', max_before=20, before=5, max_after=1) q_times.display()
security_alerts = qry_prov.SecurityAlert.list_alerts(add_query_items="| take 10") alert_select = nbwidgets.SelectAlert(alerts=security_alerts, action=nbdisplay.display_alert) display(Markdown('### Alert selector with action=DisplayAlert')) display(HTML("<b> Alert selector with action=DisplayAlert </b>")) alert_select.display()
What you have seen here is just a tiny taster of what Microsoft Sentinel Notebooks can do. However, luckily, we have a lot of additional resources to help you learn what you need and get started with Notebooks.
We recommend that you do the following:
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.