Finding Transport Rule Size Part 2 – Regex Limit
Published Aug 11 2022 12:48 PM

In part one of the Finding Transport Rule Size blog post series, we explained how to determine the size of an Exchange Transport Rule (ETR). Since then, we have received questions from customers who want to know how to determine the character limit for regular expressions (sometimes called regex) used in ETRs. So, in this post, we will go through how the regular expression limit works and how the size is calculated.

We have seen a trend in unnecessary pattern matching in transport rules for scenarios that are better covered by simple word matching. Pattern matching is more complex in nature, can lead to unpredictable results, and can cause slowness because of longer rule evaluation times. Before using regular expressions in pattern matches, see if the same can be achieved using simple word matching. Where possible, you should use the simpler equivalent "Contains Words" predicate. Finally, the regular expressions and scripts in this post are not supported by Microsoft Support and it’s your responsibility to test them before using them in your production environment.

Currently, we limit the characters used in regular expression properties in ETRs to 20 KB (20,480 bytes), counting one byte per character. This limit is not applied per rule; it applies to the sum across all existing rules that use a specific set of property types. The properties that count toward this limit are all predicates of the property types Patterns, Words, and CharacterSets.

If you reach the limit, the following error message will appear:

RegexSize01.jpg

Unfortunately, there is no out-of-the-box method to determine the total KB size of these properties, but we can figure it out with PowerShell.

Let’s start with the full script block, which was made to be as simple as possible. We encourage you to improve it to suit your needs, such as adding it to a function or merging with the script from part one of this blog series.

 

# Running total across all rules, in bytes (one byte per character)
[int]$total = 0
$traportRules = Get-TransportRule
# Names of every rule property that ends with "Words" or "Patterns"
[System.Collections.Generic.List[string]]$predicates = ($traportRules | Get-Member -MemberType Property | where {$_.Name -Like "*Words" -or $_.Name -Like "*Patterns"}).Name
foreach ($transportRule in $traportRules) {
	[int]$totalperRule = 0
	foreach ($predicate in $predicates) {
		# Skip predicates that are not set on this rule
		if ($transportRule.$predicate) {
			# Join all values of the predicate and count the characters
			$predicateLenght = ($transportRule.$predicate -join '' | Measure-Object -Character).Characters
			# These two predicates are counted twice
			if ($predicate -eq "SubjectOrBodyContainsWords" -or $predicate -eq "SubjectOrBodyMatchesPatterns") {
				$totalperRule = $totalperRule + [convert]::ToInt32($predicateLenght, 10) * 2
			}
			else {
				$totalperRule = $totalperRule + [convert]::ToInt32($predicateLenght, 10)
			}
		}
	}
	Write-Output "The Transport Rule $($transportRule) is $($transportRule.State) and has $($totalperRule) bytes"
	# Add this rule's size to the running total
	$total = $total + $totalperRule
}
Write-Output "The total character size in your transport rules is $($total) bytes. The maximum size allowed is 20 KB (20,480 bytes)"

 

Now let’s break the code into pieces to understand how it works. First, we get all ETR properties whose names end with “Words” or “Patterns.” All ETR predicates that count toward the regex limit should end with “Patterns” or “Words,” so this is the easiest way to get them all.

 

[System.Collections.Generic.List[string]]$predicates = ($traportRules | Get-Member -MemberType Property | where {$_.Name -Like "*Words" -or $_.Name -Like "*Patterns"}).Name

 

For any predicate ending with “Words” or “Patterns,” we join the values and get the number of characters:

 

if ($transportRule.$predicate) {
	$predicateLenght = ($transportRule.$predicate -join '' | Measure-Object -Character).Characters
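To see what the join-and-measure step produces, here is a minimal sketch that runs without an Exchange connection. The `$values` array is hypothetical sample data standing in for the values of one predicate. Because the values are joined with an empty string, no separator characters are added to the count.

```powershell
# Hypothetical sample data standing in for the values of one predicate.
$values = 'abc', 'de'

# Join the values with no separator, then count the characters of the result.
$count = ($values -join '' | Measure-Object -Character).Characters

$count   # 5 (3 characters from 'abc' plus 2 from 'de')
```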

 

If the predicate in use is SubjectOrBodyContainsWords or SubjectOrBodyMatchesPatterns, we count it twice. We convert the character count to an integer and multiply it by two:

 

if ($predicate -eq "SubjectOrBodyContainsWords" -or $predicate -eq "SubjectOrBodyMatchesPatterns") {
	$totalperRule = $totalperRule + [convert]::ToInt32($predicateLenght, 10) * 2

 

For all other predicates, we simply convert the character count to an integer and add it once:

 

else {
	$totalperRule = $totalperRule + [convert]::ToInt32($predicateLenght, 10)
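The counting rules described so far can be tried on sample data without connecting to Exchange Online. The sketch below is only an illustration: `Get-RuleRegexSize` and `$sampleRule` are hypothetical names that are not part of the script, the hashtable stands in for a real rule object, and `.Length` is used in place of `Measure-Object -Character` for brevity.

```powershell
# Hypothetical helper: computes the counted size of one rule from a hashtable
# that maps predicate names to their values (a stand-in for a real rule object).
function Get-RuleRegexSize {
    param([hashtable]$Predicates)

    # Predicates that are counted twice, as described in the post.
    $doubled = 'SubjectOrBodyContainsWords', 'SubjectOrBodyMatchesPatterns'

    $size = 0
    foreach ($name in $Predicates.Keys) {
        # Join all values with no separator and take the character count.
        $length = (@($Predicates[$name]) -join '').Length
        if ($doubled -contains $name) { $length *= 2 }
        $size += $length
    }
    return $size
}

# Sample rule (made-up values for illustration only).
$sampleRule = @{
    SubjectContainsWords         = 'invoice', 'payment'   # 7 + 7 = 14
    SubjectOrBodyMatchesPatterns = '\d{4}-\d{4}'          # 11 characters, counted twice = 22
}

$result = Get-RuleRegexSize -Predicates $sampleRule
$result   # 36
```

The doubled predicate contributes 22 of the 36 bytes here even though its pattern is shorter than the word list, which shows why SubjectOrBody predicates consume the limit faster.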

 

At the end of each iteration of the outer loop, we add the total for the current rule to the running total accumulated over all previous rules:

 

$total = $total + $totalperRule 

 

Although the output shows whether each rule is enabled or disabled, a disabled rule still counts toward the limit; regardless of the ETR state, all rules count toward the 20 KB limit. The state is included to give you an idea of which rules could be deleted to free up some space. Since the limit is not adjustable in Exchange Online, this script can help you proactively avoid reaching the 20 KB limit.

RegexSize02.jpg

Now that you know how to determine the usage of your ETRs, we encourage you to monitor it and avoid reaching the limit. This can help you avoid unexpected issues. For example, when you rename a rule, the number of characters is counted as if it were a new rule. If you are already at 20,479 bytes, you have only one byte available. If you rename a rule that uses 2 bytes or more, you will get the following error:

RegexSize03.jpg
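The arithmetic behind the rename example can be sketched as a small check. `Test-RegexHeadroom` is a hypothetical helper, not an Exchange cmdlet; it assumes you already know the total reported by the script and the size of the rule you are about to rename, and it takes the 20,480-byte limit from this post as its default.

```powershell
# Hypothetical helper: reports how many bytes remain under the limit and
# whether an operation that needs $NeededBytes would still fit.
function Test-RegexHeadroom {
    param(
        [int]$UsedBytes,            # total reported by the script above
        [int]$NeededBytes,          # size of the rule being added or renamed
        [int]$LimitBytes = 20480    # 20 KB limit stated in this post
    )
    $available = $LimitBytes - $UsedBytes
    [pscustomobject]@{
        AvailableBytes = $available
        Fits           = ($NeededBytes -le $available)
    }
}

# The scenario from the text: 20,479 bytes used, renaming a 2-byte rule.
$check = Test-RegexHeadroom -UsedBytes 20479 -NeededBytes 2
$check.AvailableBytes   # 1
$check.Fits             # False
```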

If you want to know more about regular expressions and how to develop and add them to ETRs, have a look at this quick reference on regex development and this article on regular expressions in transport rules.

We hope that this blog post is useful. Please use the Comments section to ask questions or provide suggestions.

Thanks to Arindam Thokder, Nino Bilic and Yiran Lin for their support and review of this article.

Denis Vilaça Signorelli
Service Engineer
