LogMasker – Log Masking library for Log4j and Logback

LogMasker is an easy-to-use library that masks sensitive information before your application writes it to the log; in other words, it is a Personally Identifiable Information (PII) reduction library for logs. It masks directly on the log stream, which minimizes the risk of sensitive data appearing in any printed log.

It does this by intercepting each log event and masking it before it is written. It works with Log4j2 as well as Logback.

For more information, visit the project’s GitLab page: LogMasker on GitLab

Download

To download the latest libraries, open the Tags page from the menu on the left. There, select the version you want; a ‘Download’ icon appears on the right. From it you can retrieve the build artifact that contains the jar files for both Log4j2 and Logback.

What sensitive data can be masked

Currently, LogMasker can mask the following sensitive information:

  • Email addresses
  • IPv4 addresses
  • IBANs
  • Card numbers (PANs)
  • Passwords (if marked accordingly)

The set of active maskers is configurable, and the library lets you write your own masker and include it in the masking process. By default, all of the maskers above are used.
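To make the idea of masking concrete, here is a minimal Java sketch of what an email masker could do to a log line. It is not LogMasker's implementation: the class name, the simplified regular expression, and the choice of replacing each character with * while keeping the @ are all assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only; not taken from the LogMasker source.
final class EmailMaskSketch {
    // Deliberately simple pattern; the real email grammar is broader.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Replaces every character of each match with '*', keeping the '@' visible.
    static String maskEmails(String line) {
        Matcher m = EMAIL.matcher(line);
        StringBuilder out = new StringBuilder(line.length());
        int last = 0;
        while (m.find()) {
            out.append(line, last, m.start());      // copy text before the match
            for (int i = m.start(); i < m.end(); i++) {
                out.append(line.charAt(i) == '@' ? '@' : '*');
            }
            last = m.end();
        }
        out.append(line, last, line.length());      // copy the remaining tail
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "Login from *********@********** succeeded"
        System.out.println(maskEmails("Login from testemail@domain.com succeeded"));
    }
}
```

The masked output keeps the length of the original value, which makes the log line easy to compare with the unmasked one; whether the library preserves length in the same way is not stated here.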

Performance

Each masker adds an additional layer of processing, so it is recommended that you enable only the maskers you need, especially if you have high throughput and write a lot of logs. There are performance tests for each masker as well as for the integration with Log4j. Results on my machine are as follows:

Masker                                     Number of lines masked    Average time in ms (for all lines)
Email masker                               100000                    30ms
Password masker                            100000                    112ms
IP masker                                  100000                    26ms
Card number masker                         100000                    50ms
IBAN masker (all countries)                100000                    490ms
Masking Converter (all maskers)            100000                    610ms
Masking Converter Exclusive (all maskers)  100000                    180ms
Log4j Integrated                           1000 (one log event)      295ms

The tests were executed on a set of randomly selected lines. Each line includes at most one item that can be masked.

License

The library is developed under Apache License v2.0. For more information read the License file.

Why some code is not elegant

The library was built with speed in mind. As a result, some parts of the maskers are not that elegant. Methods are longer than they normally should be, because they were written to create as few new objects as possible for each log event. Primitives are also used wherever possible.
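As an illustration of this style, the following Java sketch masks 16-digit runs directly inside a char array, using only primitive counters and allocating nothing per call. It is a hypothetical example rather than code from the library: the class name, the keepStart and keepEnd parameters, and the fixed length of 16 digits are assumptions, and no checksum validation is performed.

```java
// Illustrative only; shows the "one long method, primitives, no allocation" style.
final class InPlaceCardMaskSketch {
    // Masks every run of exactly 16 digits in place, keeping the first
    // keepStart and the last keepEnd digits (both assumed non-negative).
    // Returns the number of runs that were masked.
    static int maskCards(char[] buf, int keepStart, int keepEnd) {
        int masked = 0;
        int i = 0;
        while (i < buf.length) {
            if (buf[i] < '0' || buf[i] > '9') {
                i++;
                continue;
            }
            int start = i;
            // Advance to the end of the current digit run.
            while (i < buf.length && buf[i] >= '0' && buf[i] <= '9') {
                i++;
            }
            if (i - start == 16) {
                // Overwrite only the middle digits.
                for (int j = start + keepStart; j < i - keepEnd; j++) {
                    buf[j] = '*';
                }
                masked++;
            }
        }
        return masked;
    }

    public static void main(String[] args) {
        char[] line = "Paid with 4111111111111111 today".toCharArray();
        maskCards(line, 6, 4);
        // prints "Paid with 411111******1111 today"
        System.out.println(new String(line));
    }
}
```

The single loop is longer than a regex-based version would be, but it never creates a Matcher, a substring, or a StringBuilder, which is the trade-off described above.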

Integration with Log4j 2

The library can be easily integrated with Log4j 2. Once imported into your project, it provides a custom message converter (LogEventPatternConverter) which, when used, masks all incoming data. To use it, replace the %m inside your message pattern with %msk or %mask.

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN" packages="com.ppopescu.logging">
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level - %msk%n"/>
        </Console>
    </Appenders>
    <Loggers>
        <Root level="info">
            <AppenderRef ref="Console"/>
        </Root>
    </Loggers>
</Configuration>

Integration with Logback

The library can be easily added and integrated with Logback. Once imported into your project, it will provide a custom message converter (MessageConverter) which can be used for all messages. To do this, include the following line in your configuration file:

<conversionRule conversionWord="mask" converterClass="com.ppopescu.logging.LogbackMaskingConverter" />

After that, replace the %message converter with the new %mask one. Example logback.xml file:

<configuration>
    <conversionRule conversionWord="mask" converterClass="com.ppopescu.logging.LogbackMaskingConverter" />

    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
        <encoder>
            <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger{36} - %mask%n</pattern>
        </encoder>
    </appender>

    <root level="debug">
        <appender-ref ref="STDOUT" />
    </root>
</configuration>

Configuration options

The converter lets you configure which maskers are used. Below you will find the available maskers and examples for Log4j2. The same options are available for Logback by adapting them to the Logback XML format: in Logback, options are separated by , inside a single group instead of being placed in separate groups delimited by { and }.

Example:

Log4j2 XML         Logback XML
%msk{email}{ip}    %mask{email, ip}
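The relationship between the two option formats can be sketched as a small Java helper that rewrites a Log4j2-style token into the Logback-style one. This helper is purely illustrative and is not part of LogMasker; the class and method names are hypothetical, and it assumes that option values themselves never contain { or }.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only; converts "%msk{a}{b}" into "%mask{a, b}".
final class OptionFormatSketch {
    static String toLogback(String log4jToken) {
        List<String> options = new ArrayList<>();
        int open = log4jToken.indexOf('{');
        while (open >= 0) {
            int close = log4jToken.indexOf('}', open);
            if (close < 0) {
                throw new IllegalArgumentException("Unbalanced '{' in " + log4jToken);
            }
            // Collect the text between one pair of braces as one option.
            options.add(log4jToken.substring(open + 1, close).trim());
            open = log4jToken.indexOf('{', close);
        }
        // Logback takes all options in one group, separated by commas.
        return options.isEmpty() ? "%mask" : "%mask{" + String.join(", ", options) + "}";
    }

    public static void main(String[] args) {
        // prints "%mask{email, ip}"
        System.out.println(toLogback("%msk{email}{ip}"));
    }
}
```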

Maskers and their associated option:

Masker                Option to include it    Additional configuration options
Email masker          {email}                 -
Password masker       {pass}                  -
IP masker             {ip}                    -
Card number masker    {card}                  :noStartDigits|noEndDigits
IBAN masker           {iban}                  :CC|CC|CC

Configuring which maskers to use

By default, all of the maskers above are used, but this can easily be changed based on your needs. To do this, include the masker’s keyword in your Log4j2 XML as an option for the %msk pattern. Additionally, there is an {ALL} option that includes all of the above maskers. This is useful for adding your own masker alongside the predefined ones.

Example configuration that only masks emails and IPs:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN" packages="com.ppopescu.logging">
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level - %msk{email}{ip}%n"/>
        </Console>
    </Appenders>
    <Loggers>
        <Root level="info">
            <AppenderRef ref="Console"/>
        </Root>
    </Loggers>
</Configuration>

Configuring countries for IBAN masker

Because IBAN formats differ by country and masking all of them may take a long time, you can restrict the masker to certain countries. To do this, include the country codes, separated by |, as an option for the IBAN masker. By default, all 79 countries are used.

Example configuration that only masks IBANs in Romania and Germany:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN" packages="com.ppopescu.logging">
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level - %msk{iban:RO|DE}%n"/>
        </Console>
    </Appenders>
    <Loggers>
        <Root level="info">
            <AppenderRef ref="Console"/>
        </Root>
    </Loggers>
</Configuration>
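To illustrate why restricting the country list saves work, here is a hedged Java sketch of how an option such as iban:RO|DE could be turned into a set of country codes that is consulted before any per-country format check. The class and method names are hypothetical and the library's actual parsing may differ; treating an empty set as "all countries" is also an assumption of this sketch.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative only; not taken from the LogMasker source.
final class IbanOptionSketch {
    // Parses "iban:RO|DE" into [RO, DE]; a bare "iban" yields an empty set.
    static Set<String> countries(String option) {
        Set<String> codes = new LinkedHashSet<>();
        int colon = option.indexOf(':');
        if (colon < 0) {
            return codes;
        }
        for (String code : option.substring(colon + 1).split("\\|")) {
            String trimmed = code.trim();
            if (!trimmed.isEmpty()) {
                codes.add(trimmed.toUpperCase(Locale.ROOT));
            }
        }
        return codes;
    }

    // A candidate is only worth validating if its two-letter prefix is enabled.
    static boolean shouldCheck(String candidate, Set<String> codes) {
        if (candidate.length() < 2) {
            return false;
        }
        String prefix = candidate.substring(0, 2).toUpperCase(Locale.ROOT);
        return codes.isEmpty() || codes.contains(prefix);
    }

    public static void main(String[] args) {
        Set<String> codes = countries("iban:RO|DE");
        System.out.println(codes);                                         // prints "[RO, DE]"
        System.out.println(shouldCheck("RO49AAAA1B31007593840000", codes)); // prints "true"
        System.out.println(shouldCheck("FR1420041010050500013M02606", codes)); // prints "false"
    }
}
```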

Including a custom masker

You can include custom maskers with ease. To do this, first write one or more masker classes that implement the LogMasker interface (from com.ppopescu.logging.masker). Next, specify the package that contains them as an option to the masker configuration inside the XML file. Multiple packages can be included by separating them with |.

Example configuration file that uses a custom masker:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN" packages="com.ppopescu.logging">
    <Appenders>
        <Console name="Console" target="SYSTEM_OUT">
            <PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level - %msk{custom:com.ppopescu.example}%n"/>
        </Console>
    </Appenders>
    <Loggers>
        <Root level="info">
            <AppenderRef ref="Console"/>
        </Root>
    </Loggers>
</Configuration>
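The shape of a custom masker might look like the Java sketch below. The method signature of the real LogMasker interface is not shown in this document, so the sketch declares its own stand-in interface (MaskerStandIn, a hypothetical name) instead of guessing; consult com.ppopescu.logging.masker.LogMasker for the methods you actually need to implement. The TokenMasker class and its token= key are likewise invented for the example.

```java
// Illustrative only; MaskerStandIn is NOT the library's LogMasker interface.
final class CustomMaskerSketch {
    // Hypothetical stand-in for the real interface, whose signature may differ.
    interface MaskerStandIn {
        String mask(String message);
    }

    // Masks the value that follows the first "token=" up to the next whitespace.
    static final class TokenMasker implements MaskerStandIn {
        private static final String KEY = "token=";

        @Override
        public String mask(String message) {
            int at = message.indexOf(KEY);
            if (at < 0) {
                return message; // nothing to mask
            }
            int start = at + KEY.length();
            int end = start;
            while (end < message.length() && !Character.isWhitespace(message.charAt(end))) {
                end++;
            }
            StringBuilder out = new StringBuilder(message.length());
            out.append(message, 0, start);
            for (int i = start; i < end; i++) {
                out.append('*');
            }
            out.append(message, end, message.length());
            return out.toString();
        }
    }

    public static void main(String[] args) {
        // prints "auth token=****** ok"
        System.out.println(new TokenMasker().mask("auth token=abc123 ok"));
    }
}
```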

Masking order

The maskers are applied sequentially, in the order in which their options appear inside the XML file. If no configuration options are provided, or if the {ALL} option is used, the order is as follows: email, password, IP, card number, IBAN.

Exclusive option

If you can guarantee that there will always be just ONE maskable type inside a log event, you can specify the {e}, {ex}, or {exclusive} option. With this option, the converter does not continue to the next masker once a masker has found a match.

For example, assume that you have the following maskers configured, in this order: IP, Email, IBAN. For the log line "This is a log with an email testemail@domain.com included", only the IP and Email maskers are executed, saving computational time.

NOTE: Only use this if you can guarantee that only one type is present.
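The difference between the default sequential mode and the exclusive mode can be sketched in Java as a loop that optionally stops after the first masker that changes the message. This is a simplified model rather than the converter's code: maskers are represented as plain UnaryOperator<String> functions, and "found a match" is approximated by "the output differs from the input", both of which are assumptions of the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Illustrative only; models the masking chain with plain functions.
final class MaskingChainSketch {
    // Applies maskers in list order; in exclusive mode, stops after the
    // first masker whose output differs from its input.
    static String apply(List<UnaryOperator<String>> maskers, String message, boolean exclusive) {
        String current = message;
        for (UnaryOperator<String> masker : maskers) {
            String next = masker.apply(current);
            boolean matched = !next.equals(current);
            current = next;
            if (exclusive && matched) {
                break;
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // Toy stand-ins for the IP, Email and IBAN maskers, in that order.
        UnaryOperator<String> ip = s -> s;
        UnaryOperator<String> email = s -> s.replace("testemail@domain.com", "***");
        UnaryOperator<String> iban = s -> {
            System.out.println("IBAN masker ran");
            return s;
        };
        String line = "This is a log with an email testemail@domain.com included";
        // In exclusive mode the IBAN masker is never invoked for this line.
        System.out.println(apply(Arrays.asList(ip, email, iban), line, true));
    }
}
```

With exclusive set to false, the same call would also run the IBAN stand-in, which is the extra work the option is meant to avoid.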