Log obfuscation: Hash or mask sensitive data in your logs
With log obfuscation rules, you can prevent certain types of information from being saved in New Relic.
Requirements
Our log obfuscation feature is available as part of our Data Plus option.
What is log obfuscation?
Our service automatically masks number patterns that we identify as likely being sensitive items, such as credit card or Social Security numbers.
If you need additional obfuscation, one option is to adjust the configuration of the log forwarder you use (for example, our infrastructure agent). But an easier option is to use our log obfuscation feature, available with Data Plus. This feature lets you set up log obfuscation rules directly from the log management UI, or via our NerdGraph API, without lengthy manual configuration. You'll define regular expressions matching your sensitive information, and then create rules to obfuscate that data. You can choose either to have sensitive information masked or hashed.
Definitions
Here are some important terms:
Obfuscation rules define what logs to apply obfuscation actions to.
Obfuscation rule actions define what attributes to look at, what text to obfuscate, and how to obfuscate (either by masking or hashing).
Obfuscation expressions are named regular expressions identifying what text to obfuscate.
Masking completely removes information, replacing it with X characters. You cannot search for specific values once this is done.
Hashing hides information. You can use the hashing tool to get the hash of a sensitive value, and then search for logs containing that hash.
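To make the difference between the two methods concrete, here is a minimal Python sketch. The function names mask and hash_sha256 are hypothetical, and the sketch assumes hashing is a plain, unsalted SHA-256 over the UTF-8 bytes of the value; the exact way our pipeline computes the hash is not described here, so use the hashing tool in the UI for values you actually plan to search for.

```python
import hashlib


def mask(value: str) -> str:
    """Replace every character with X; the original value cannot be recovered."""
    return "X" * len(value)


def hash_sha256(value: str) -> str:
    """Return the hex SHA-256 digest of the value's UTF-8 bytes (assumed scheme)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


masked = mask("01/02/2003")
hashed = hash_sha256("123-12-1234")

print(masked)       # XXXXXXXXXX
print(len(hashed))  # 64 hex characters
```

Because the same input always produces the same digest, a hashed value stays searchable if you know the cleartext, while a masked value does not.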
How obfuscation works
The JSON objects displayed in the following example are simplified versions of the payloads used by our NerdGraph API. This will help you correlate the different API operations with their UI counterparts.
Example: Log record before obfuscation
Imagine you have the following log record:
{
"message":"The credit card number 4321-5678-9876-2345 belongs to user user@email.com (born on 01/02/2003) with SSN 123-12-1234",
"creditCardNumber":"4321-5678-9876-2345",
"ssn":"123-12-1234",
"department":"sales",
"serviceName":"loginService"
}
This log record contains several pieces of sensitive data. Ideally, you would like your log to end up looking like this:
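{
"message":"The credit card number 9aa9bc1528859aee1b1df75795f1ebd54beb2f0d26c8a1d4580a71a07189cdd5 belongs to user user@email.com (born on XXXXXXXXXX) with SSN 30e6897f76dc102e32ee1d781c43417d259e586eac15c963d75ab8b5187769da",
"creditCardNumber":"9aa9bc1528859aee1b1df75795f1ebd54beb2f0d26c8a1d4580a71a07189cdd5",
"ssn":"30e6897f76dc102e32ee1d781c43417d259e586eac15c963d75ab8b5187769da",
"department":"sales",
"serviceName":"loginService"
}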
Here is the basic process you would use to obfuscate the sensitive data in this example.
You decide you want to apply the following obfuscation actions to all the logs coming from that service:
HASH the credit card number present in the message and creditCardNumber attributes.
MASK the birth date present in the message attribute.
HASH the Social Security number present in the message and ssn attributes.
The first thing you need to do is to define some obfuscation expressions that allow you to capture this sensitive information.
Tip
The examples below show regular expressions you'd use directly in the UI. Later in this document, we discuss how you could use escaped versions of these regular expressions for NerdGraph.
Obfuscation expression
Definition
Credit card number
We need to capture 4 groups of 4 digits separated by hyphens:
{
"name":"Credit Card Number",
"regex":"(\d{4}-\d{4}-\d{4}-\d{4})"
}
Social Security number
We need to capture 3 groups of 3, 2, and 4 digits separated by hyphens:
{
"name":"Social Security Number",
"regex":"(\d{3}-\d{2}-\d{4})"
}
Born date (loginService specific)
In this example, the format of the birth date is specific to the Login Service. We define the portion to obfuscate based on the words surrounding the date in "(born on 01/02/2003)":
{
"name":"Born date - loginService specific",
"regex":"born on (.*)\)"
}
Each obfuscation expression defines how to capture some sensitive information out of a string (using a regex) and associates it with some friendly name so that you can easily identify it later.
Obfuscation expressions can be reusable; they are totally agnostic to how the log attribute containing the sensitive data is named. For instance, the Social Security expression defined above could be applied to a log attribute named ssn, socialSecurityNumber, or socSecNum.
You can also create non-reusable obfuscation expressions, like Born date (loginService specific), that are tightly coupled to the log attribute's format. For example, this expression captures whatever comes after born on and before the closing parenthesis ).
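Before saving expressions, it can help to try them against a sample message. The following sketch uses Python's re module as a stand-in for RE2 (these simple patterns behave the same way in both). The credit card pattern is written from the description "4 groups of 4 digits separated by hyphens", and the captured helper is a hypothetical name used only for illustration.

```python
import re

MESSAGE = (
    "The credit card number 4321-5678-9876-2345 belongs to user "
    "user@email.com (born on 01/02/2003) with SSN 123-12-1234"
)

# UI-style (unescaped) regular expressions, keyed by expression name.
EXPRESSIONS = {
    "Credit Card Number": r"(\d{4}-\d{4}-\d{4}-\d{4})",
    "Social Security Number": r"(\d{3}-\d{2}-\d{4})",
    "Born date - loginService specific": r"born on (.*)\)",
}


def captured(name: str, text: str) -> list:
    """Return the text of capture group 1 for every match of the named expression."""
    return [m.group(1) for m in re.finditer(EXPRESSIONS[name], text)]


for name in EXPRESSIONS:
    print(name, "->", captured(name, MESSAGE))
```

Only the text inside the capture group is reported, which is why the birth date expression returns the date without the surrounding "born on" and ")".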
Nested regular expression capture groups not supported
The main responsibility of this feature is to replace sensitive data with the hash or mask in an easy and performant way.
If you want or need to group different matches inside a capture group, remember to use the non-capture group syntax (?:{EXPRESSION}) to avoid creating nested capture groups.
For example, avoid (([A-Z]{12})|([a-z]{6})) and instead use ((?:[A-Z]{12})|(?:[a-z]{6})).
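One quick way to check whether an expression contains nested capture groups is to count its groups. This sketch uses Python's re module rather than RE2, and a balanced form of the non-capturing pattern; a count of 1 means there is a single capture group and nothing nested inside it.

```python
import re

# Nested capture groups: one outer group plus two inner groups.
nested = re.compile(r"(([A-Z]{12})|([a-z]{6}))")

# Same alternation, but the inner groups are non-capturing.
flat = re.compile(r"((?:[A-Z]{12})|(?:[a-z]{6}))")

print(nested.groups)  # 3
print(flat.groups)    # 1
print(flat.fullmatch("abcdef").group(1))  # abcdef
```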
Now that we have defined how to capture our sensitive data, we need to specify which logs need to be obfuscated (the ones of the Login Service) and how (with the obfuscation actions we defined). To achieve this, we define an obfuscation rule.
{
"name":"Obfuscate Login Service Logs",
"filter":"serviceName = 'loginService' AND department = 'sales'",
"actions":[
{
"attributes":["message","creditCardNumber"],
"expression":{"name":"Credit Card Number"},
"method":"HASH_SHA256"
},
{
"attributes":["message"],
"expression":{"name":"Born date - loginService specific"},
"method":"MASK"
},
{
"attributes":["message","ssn"],
"expression":{"name":"Social Security Number"},
"method":"HASH_SHA256"
}
]
}
This rule contains three main components:
Obfuscation rule component
Description
Name
The name helps to easily identify what the rule does. In this example, this rule defines how to obfuscate the different attributes of the logs coming from the Login Service.
Filter
The filter uses NRQL format to tell our system how to identify the target logs coming from the Login Service. This example queries for logs where serviceName = loginService and department = sales.
Actions
Finally, this rule defines the set of obfuscation actions to apply to the logs matching the filter. Each action defines:
Which previously created obfuscation expression to use to extract the sensitive information from each set of attributes
Which obfuscation method (HASH_SHA256 or MASK) to be applied to obfuscate this data
Note that when defining obfuscation rules via NerdGraph, you will need to specify the id of the obfuscation expressions instead of their names. To make the previous example more readable, we used the obfuscation expression names instead.
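The following Python sketch illustrates how a rule like this one could be evaluated against a log record. It is not how our pipeline is implemented: the NRQL filter is replaced by a plain Python predicate, expressions are looked up by name rather than by id, hashing is assumed to be unsalted SHA-256, and obfuscate_text and apply_rule are hypothetical helper names.

```python
import hashlib
import re

EXPRESSIONS = {
    "Credit Card Number": r"(\d{4}-\d{4}-\d{4}-\d{4})",
    "Social Security Number": r"(\d{3}-\d{2}-\d{4})",
    "Born date - loginService specific": r"born on (.*)\)",
}

RULE = {
    "name": "Obfuscate Login Service Logs",
    # Stand-in for the NRQL filter; this is a Python predicate, not an NRQL parser.
    "matches": lambda r: r.get("serviceName") == "loginService"
    and r.get("department") == "sales",
    "actions": [
        {"attributes": ["message", "creditCardNumber"],
         "expression": "Credit Card Number", "method": "HASH_SHA256"},
        {"attributes": ["message"],
         "expression": "Born date - loginService specific", "method": "MASK"},
        {"attributes": ["message", "ssn"],
         "expression": "Social Security Number", "method": "HASH_SHA256"},
    ],
}


def obfuscate_text(text: str, pattern: str, method: str) -> str:
    """Replace only capture group 1 of each match with its mask or hash."""
    def replace(match):
        secret = match.group(1)
        if method == "MASK":
            hidden = "X" * len(secret)
        else:
            hidden = hashlib.sha256(secret.encode("utf-8")).hexdigest()
        whole, offset = match.group(0), match.start(0)
        return whole[:match.start(1) - offset] + hidden + whole[match.end(1) - offset:]
    return re.sub(pattern, replace, text)


def apply_rule(record: dict, rule: dict) -> dict:
    """Return an obfuscated copy of the record if the rule's filter matches it."""
    if not rule["matches"](record):
        return record
    result = dict(record)
    for action in rule["actions"]:
        pattern = EXPRESSIONS[action["expression"]]
        for attribute in action["attributes"]:
            if isinstance(result.get(attribute), str):
                result[attribute] = obfuscate_text(
                    result[attribute], pattern, action["method"])
    return result


record = {
    "message": "The credit card number 4321-5678-9876-2345 belongs to user "
               "user@email.com (born on 01/02/2003) with SSN 123-12-1234",
    "creditCardNumber": "4321-5678-9876-2345",
    "ssn": "123-12-1234",
    "department": "sales",
    "serviceName": "loginService",
}
obfuscated = apply_rule(record, RULE)
print(obfuscated["message"])
```

Attributes that no action targets, such as department and serviceName, pass through unchanged, and records that do not match the filter are returned as they are.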
As a final example, imagine we also needed to obfuscate logs coming from another service named "Checkout Service" that have an attribute serviceName = checkoutService as well as a ccn attribute that contains credit card information:
{
"message":"Order completed",
"ccn":"4321-5678-9876-2345",
"department":"sales",
"serviceName":"checkoutService"
}
To obfuscate the logs from this service, we would only have to define another obfuscation rule targeting these specific logs, and we would simply reuse the previously created Credit card number obfuscation expression:
{
"name":"Obfuscate Checkout Service Logs",
"filter":"serviceName = 'checkoutService' AND department = 'sales'",
"actions":[
{
"attributes":["ccn"],
"expression":{"name":"Credit Card Number"},
"method":"HASH_SHA256"
}
]
}
Checklist: Steps to obfuscate logs
To obfuscate your logs:
Study the shape of your logs by identifying patterns of sensitive data that appear in them. For example:
Do all your logs contain sensitive information? Or can you be more specific (only the logs from service A or region B)?
What sensitive information do they contain: credit card numbers, driver's license numbers, national IDs, biometrics, other values?
Create obfuscation expressions to identify how to extract sensitive data.
Define which obfuscation actions need to be applied to each of them. Ask yourself: Will I need to query my logs using this sensitive information later (consider using HASH), or do I need to remove this information entirely from my logs (consider using MASK)?
Tip
The Logs obfuscation UI includes a Hashing tool so that you can find a hash from a known value and copy it for use with other expressions and rules.
CPU limits
Obfuscation has per-minute CPU limits. If an account hits these resource limits, logs won't be obfuscated as expected. To check your CPU limits, go to your system Limits page in the Data management UI.
If you exceed the obfuscation per-minute CPU limits and logs cannot be obfuscated or hashed, the attribute the obfuscation rule was applied to will be dropped and replaced with text indicating why the attribute was dropped. For example, if the obfuscation rule is applied to the message field, and the CPU per-minute limit is reached, the resulting log will look like this:
{
...
"message":"<OBFUSCATION> The account is over its obfuscation per-minute limit, attribute dropped",
...
}
This is to prevent PII or other sensitive data from being ingested inadvertently.
In addition, the following NrIntegrationError will be logged to the account:
{
"category":"RateLimit",
"level":"error",
"limitName":"Log API obfuscation per account per minute",
"message":"You’ve exceeded our limit of per-account obfuscation time per-minute for the Log ingestion pipeline. Please reduce your usage or contact New Relic support.",
"name":"ObfuscationTimeLimitReached",
"newRelicFeature":"Logs",
"rateLimitType":"ObfuscationTimePerMinute",
"timestamp":1678819264283
}
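If you export or post-process your logs, you may want to flag records where an attribute was dropped because of the CPU limit. The sketch below assumes that every placeholder starts with the <OBFUSCATION> prefix, as in the example above; only that one example is documented here, so treat the prefix check as an assumption. The dropped_attributes function is a hypothetical helper.

```python
# Prefix taken from the example placeholder text; assumed to be stable.
DROPPED_PREFIX = "<OBFUSCATION>"


def dropped_attributes(record: dict) -> list:
    """Return the names of string attributes replaced by the placeholder text."""
    return [
        name
        for name, value in record.items()
        if isinstance(value, str) and value.startswith(DROPPED_PREFIX)
    ]


sample = {
    "message": "<OBFUSCATION> The account is over its obfuscation "
               "per-minute limit, attribute dropped",
    "serviceName": "loginService",
}
print(dropped_attributes(sample))  # ['message']
```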
To evaluate how well your obfuscation rules are working and see which ones are being skipped, go to Logs Obfuscation > Health. Obfuscation rules are CPU intensive, so these charts can help you decide which rules are most impacted by any resource limitations.
Obfuscation expressions
You can create, read, update, or delete obfuscation expressions by using the New Relic UI or by using NerdGraph, our GraphQL API.
one.newrelic.com > All capabilities > Logs > Obfuscation: First create one or more obfuscation expressions, then create your obfuscation rules.
Use either of these options to create an obfuscation expression:
Enter a name for your new obfuscation expression, as well as a regular expression matching the sensitive data you want to capture. Use RE2 syntax.
The following example shows a basic obfuscation expression that will match credit card numbers:
Using NerdGraph:
Use the logConfigurationsCreateObfuscationExpression mutation under logConfigurations. Refer to the populated example in GraphiQL as well as the related documentation in api.newrelic.com/graphiql.
Important
You must escape regular expressions when using them in NerdGraph. For example: ((?:(?:4\\d{3})|(?:5[1-5]\\d{2})|6(?:011|5[0-9]{2}))(?:-?|\\040?)(?:\\d{4}(?:-?|\\040?)){3}|(?:3[4,7]\\d{2})(?:-?|\\040?)\\d{6}(?:-?|\\040?)\\d{5})
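If you generate NerdGraph requests from a script, the escaping amounts to doubling each backslash so that it survives inside a quoted string. Here is a minimal Python sketch; escape_for_nerdgraph is a hypothetical helper name, and it only handles backslashes (a regular expression containing double quotes would need those escaped too).

```python
import json


def escape_for_nerdgraph(regex: str) -> str:
    """Double every backslash so the regex can be placed inside a quoted string."""
    return regex.replace("\\", "\\\\")


ui_regex = r"(\d{3}-\d{2}-\d{4})"
escaped = escape_for_nerdgraph(ui_regex)
print(escaped)  # (\\d{3}-\\d{2}-\\d{4})

# json.dumps applies the same backslash escaping and also adds the quotes.
print(json.dumps(ui_regex))  # "(\\d{3}-\\d{2}-\\d{4})"
```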
Use either of these options to query an obfuscation expression:
Select the Expressions tab to view all the available obfuscation expressions and their definitions.
Using NerdGraph:
Use the obfuscationExpressions fetcher under actor.account.logConfigurations. Refer to the populated example in GraphiQL as well as the related documentation in api.newrelic.com/graphiql.
Use either of these options to update an obfuscation expression:
Click the ellipsis icon ... at the right side of the obfuscation expression you want to edit, and click Edit.
Modify fields as needed, and click Update.
Using NerdGraph:
Use the logConfigurationsUpdateObfuscationExpression mutation under logConfigurations. Refer to the populated example in GraphiQL as well as the related documentation in api.newrelic.com/graphiql.
Important
You must escape regular expressions when using them in NerdGraph. For example: ((?:(?:4\\d{3})|(?:5[1-5]\\d{2})|6(?:011|5[0-9]{2}))(?:-?|\\040?)(?:\\d{4}(?:-?|\\040?)){3}|(?:3[4,7]\\d{2})(?:-?|\\040?)\\d{6}(?:-?|\\040?)\\d{5})
You do not need to specify all the fields of an obfuscation expression when updating it, only the id (mandatory) and the fields that you wish to modify.
Important
You will not be able to delete an obfuscation expression if it is currently being used by an obfuscation rule.
Use either of these options to delete an obfuscation expression:
Click the ellipsis icon ... at the right side of the obfuscation expression you want to delete, and click Delete.
Confirm you want to delete the expression by clicking Delete.
Using NerdGraph:
Use the logConfigurationsDeleteObfuscationExpression mutation under logConfigurations. Refer to the populated example in GraphiQL as well as the related documentation in api.newrelic.com/graphiql.
The mutation will return a snapshot of the expression before being deleted.
Sample expressions
Below, we've provided some sample regex expressions to obfuscate some of the most common sensitive data types. Obfuscation expressions must be created for each New Relic account where those expressions will be in use.
Tip
The following examples are regular expressions you could use in the UI. To use these in GraphQL, you'd need to escape them as shown in this example.
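Obfuscation rules
You can create, read, update, or delete obfuscation rules by using the New Relic UI or by using NerdGraph.
Use either of these options to create an obfuscation rule:
Enter a name for your new obfuscation rule, as well as matching criteria (in NRQL format) to capture the target set of logs you want to obfuscate.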
Add new actions (the first one is added automatically) to specify which obfuscation expression (regex) to apply to each set of attributes, and whether to MASK or HASH the captured data.
To specify multiple attributes, separate them with commas.
MASK will replace all matching characters with X characters. If you use MASK, you will not be able to query for a particular obfuscated value later.
HASH will replace sensitive data with its SHA-256 hash value. If you use HASH, you will be able to query for a value by using our hashing tool, provided you know its cleartext value.
Your rule should look something like this:
Click Create rule to create and activate your obfuscation rule.
Using NerdGraph:
Use the logConfigurationsCreateObfuscationRule mutation under logConfigurations. Refer to the populated example in GraphiQL as well as the related documentation in api.newrelic.com/graphiql.
Important
You must specify the obfuscation expressionId in order to link it to a given obfuscation action. To retrieve this id, you can query the obfuscationExpressions fetcher.
Use either of these options to query an obfuscation rule:
Select the Rules tab (default) to view all the available obfuscation rules and their definitions.
Using NerdGraph:
Use the obfuscationRules fetcher under actor.account.logConfigurations. Refer to the populated example in GraphiQL as well as the related documentation in api.newrelic.com/graphiql.
Use either of these options to update an obfuscation rule:
Click the ellipsis icon ... at the right side of the obfuscation rule you want to edit, and click Edit.
Modify the fields as needed, and click Update rule.
Using NerdGraph:
Use the logConfigurationsUpdateObfuscationRule mutation under logConfigurations. Refer to the populated example in GraphiQL as well as the related documentation in api.newrelic.com/graphiql.
Important
You must specify the obfuscation expressionId in order to link it to a given obfuscation action. To retrieve this id, you can use the obfuscationRules fetcher, as described in the instructions for querying an obfuscation rule.
You do not need to specify all the fields of an obfuscation rule when updating it, only the id (mandatory) and the fields that you wish to modify. Here is an example to only update the name.
Use either of these options to delete an obfuscation rule:
Click the ellipsis icon ... at the right side of the obfuscation rule you want to delete, and click Delete.
Confirm you want to delete the rule by clicking Delete.
Using NerdGraph:
Use the logConfigurationsDeleteObfuscationRule mutation under logConfigurations. Refer to the populated example in GraphiQL as well as the related documentation in api.newrelic.com/graphiql.
The mutation will return a snapshot of the rule before being deleted.