The OpenAPI Generator is a popular tool with more than 20k stars on GitHub that allows users to automatically generate source code based on an OpenAPI spec. This code generation is also available via a web API, which can be self-hosted but is also publicly available at https://api.openapi-generator.tech/.
In our continuous effort to help secure open-source projects and improve our Clean Code solution, we regularly scan open-source projects via SonarQube Cloud and evaluate the findings. Anyone can do the same: SonarQube Cloud is a free code analysis product for open-source projects, regardless of their size or language.
When scanning the code base of the OpenAPI Generator, SonarQube Cloud reported a complex taint flow vulnerability that propagates user-controlled data through 28 steps to a dangerous sink.
In this blog post, we will explain the technical details behind this taint flow vulnerability, which became CVE-2024-35219, a critical arbitrary file read and deletion vulnerability in the OpenAPI Generator.
Impact
OpenAPI Generator versions 7.5.0 and below are prone to an Arbitrary File Read/Delete vulnerability. Attackers can exploit this vulnerability to read and delete files and folders from an arbitrary, writable directory.
The vulnerability is tracked as CVE-2024-35219 and has been fixed with pull request #18652, which is included in version 7.6.0.
Technical Details
In this section, we will first explain taint analysis, the technique that SonarQube Cloud uses to identify taint flow vulnerabilities, and then examine the specific vulnerability in the OpenAPI Generator.
Taint Analysis
Taint analysis is one of the techniques that the engine powering SonarQube Server and SonarQube Cloud uses to identify security vulnerabilities in the analyzed source code. So, what is taint analysis?
An application’s logic is all about data, which is passed from one part of the code to another. For example, when you call a method, you pass some data to it as a parameter. This method may call another method and pass the parameter on again. This flow of data can be visualized as a graph like this:
The entry points to this data flow are called Sources. An example of this could be the request body of an API handler method. At this point, an attacker could feed some data to the application and thus control the data that is passed onwards.
The counterpart to a Source is a dangerous Sink at the end of a flow. A Sink is a function or method that is known to be security-relevant when attacker-controlled data reaches it.
From a security point of view, the big question is whether an attacker can reach a security-sensitive sink. In other words: Is there a path from a Source to a Sink?
In the above example, the answer is yes. Data originating from an attacker-controllable Source eventually reaches a dangerous Sink. The steps between the Source and the Sink are called Passthroughs, as they simply pass the data on.
In a real application, a shallow taint flow like the above example is not very realistic. It would have been easily spotted manually and would never have made it to production. However, a huge advantage of taint analysis is that it can follow all code paths and even find very complex taint flows to a deeply hidden Sink in the application’s source code:
In this example, the tainted data from a Source traverses many method and function calls before reaching a Sink. This critical flow is much harder to identify manually.
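The question "is there a path from a Source to a Sink?" can be illustrated with a toy model. The following is a minimal sketch, assuming the data flow has already been reduced to a simple directed graph; the class name, method names, and node names are hypothetical, and a real analysis engine works on a far richer program representation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy data-flow graph: nodes are code locations, an edge means "data flows to".
// Hypothetical illustration only; not how any particular analyzer is implemented.
class TaintFlowSketch {
    private final Map<String, List<String>> edges = new HashMap<>();

    void addFlow(String from, String to) {
        edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // Breadth-first search from the Source. Returns the flow (Source, Passthroughs,
    // Sink) if the Sink is reachable, or an empty list otherwise.
    List<String> findFlow(String source, String sink) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(source, null);
        queue.add(source);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (node.equals(sink)) {
                // Walk the predecessor links back to the Source to rebuild the flow.
                List<String> path = new ArrayList<>();
                for (String n = node; n != null; n = parent.get(n)) {
                    path.add(0, n);
                }
                return path;
            }
            for (String next : edges.getOrDefault(node, List.of())) {
                if (!parent.containsKey(next)) {
                    parent.put(next, node);
                    queue.add(next);
                }
            }
        }
        return List.of();
    }
}
```

Every intermediate node on the returned path corresponds to a Passthrough. The deeper the graph, the longer that path gets, which is why such flows are hard to follow by hand.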
OpenAPI Generator Vulnerability
With this background knowledge, let’s have a look at the taint flow vulnerability SonarQube Cloud reported for the OpenAPI Generator:
Click here to see the issue on SonarQube Cloud yourself.
On the left side of the SonarQube Cloud UI, we can see all the steps of the vulnerable flow. The first step is highlighted as SOURCE. This is the entry point where attackers might be able to feed in data to the application. In this case, the entry point is the @RequestBody sent to the API endpoint /gen/clients/{language}, as we can see in the source code on the right side. SonarQube Cloud highlights the source code so that the flow can be easily tracked here. Following the flow, we can see that the request body is supposed to contain a GeneratorInput object, which is highlighted as the next step.
An example request to the /gen/clients/{language} endpoint with a GeneratorInput object looks like this:
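As a rough sketch of what such a request could look like, the following Java snippet builds (but does not send) a benign request. The base URL, the class name, and the JSON field names other than options are assumptions made for illustration; only the endpoint path and the options member are named in the text.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class GenerateClientRequestSketch {
    // Hypothetical base URL of a self-hosted instance; adjust to your deployment.
    static final String BASE_URL = "http://localhost:8080";

    // Assumed JSON shape: a reference to the spec plus a free-form options map.
    static String body() {
        return "{"
            + "\"openAPIUrl\": \"https://example.com/petstore.yaml\", "
            + "\"options\": {\"packageName\": \"petstore_client\"}"
            + "}";
    }

    // Builds a POST request to /gen/clients/{language}; nothing is sent here.
    static HttpRequest build(String language) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/gen/clients/" + language))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body()))
                .build();
    }
}
```

The point to note is that the options map is free-form, so every key and value in it is under the control of whoever sends the request.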
The provided JSON body is mapped to a GeneratorInput object that looks like this:
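A simplified stand-in for such a request model might look as follows. This is a hypothetical sketch, not the project's actual class: only the options member and its getter are mentioned in the text, while the spec field and all types are assumptions.

```java
import java.util.Map;

// Simplified, hypothetical stand-in for the request model; the real class may differ.
class GeneratorInputSketch {
    private Object spec;                  // inline OpenAPI document (assumed field)
    private Map<String, String> options;  // free-form generator options (type assumed)

    Object getSpec() { return spec; }
    void setSpec(Object spec) { this.spec = spec; }

    Map<String, String> getOptions() { return options; }
    void setOptions(Map<String, String> options) { this.options = options; }
}
```

Because the JSON body is deserialized directly into this object, everything returned by getOptions has to be treated as tainted.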
By following the flow on SonarQube Cloud, we can see that this GeneratorInput object is eventually passed to a call to Generator::generate as the opts parameter. If the options member is set (opts.getOptions), the destPath is populated with the outputFolder option:
This destPath is further concatenated to the final outputFolder directory used to store all generated source code files. This directory is passed to a call to the zip.compressFiles method, which is used to store all generated source code files in a zip archive:
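The effect of this concatenation can be sketched in isolation. The snippet below is a simplified illustration under stated assumptions, not the project's code: the class name, the method signatures, the fallback value, and the temporary directory are hypothetical, and only the names destPath and outputFolder come from the text.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

class OutputFolderSketch {
    // Mirrors the described logic: use the "outputFolder" option if it is present,
    // otherwise fall back to a default (the fallback is an assumption).
    static String destPath(Map<String, String> options, String fallback) {
        if (options != null && options.containsKey("outputFolder")) {
            return options.get("outputFolder");
        }
        return fallback;
    }

    // Plain concatenation onto a temporary base directory, with no containment check.
    // normalize() is only used here to make the resulting location visible.
    static Path outputFolder(String tempDir, String destPath) {
        return Paths.get(tempDir + "/" + destPath).normalize();
    }
}
```

On a POSIX system, calling outputFolder("/tmp/gen123", "java-client") stays below /tmp/gen123, whereas a value such as ../../home/user/.ssh resolves to a location outside of it, because nothing restricts the option to the temporary directory.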
Further following the flow on SonarQube Cloud, we can see that the zip.compressFiles method iterates over all files and folders in the provided directory and stores them in a zip archive via addFolderToZip and addFileToZip:
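The general shape of such a recursive archiver can be sketched with the standard java.util.zip API. This is a minimal illustration that reuses the method names from the text; the class name, signatures, and in-memory output are assumptions, and the project's actual implementation may differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class ZipSketch {
    // Zips everything below dir into an in-memory archive.
    static byte[] compressFiles(File dir) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            addFolderToZip("", dir, zip);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Lists the folder and recurses into sub-folders; plain files are added directly.
    static void addFolderToZip(String prefix, File folder, ZipOutputStream zip) throws IOException {
        File[] children = folder.listFiles(); // operates on whatever directory the caller supplied
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            String name = prefix + child.getName();
            if (child.isDirectory()) {
                addFolderToZip(name + "/", child, zip);
            } else {
                addFileToZip(name, child, zip);
            }
        }
    }

    // Copies a single file into the archive under the given entry name.
    static void addFileToZip(String name, File file, ZipOutputStream zip) throws IOException {
        zip.putNextEntry(new ZipEntry(name));
        Files.copy(file.toPath(), zip);
        zip.closeEntry();
    }
}
```

Code like this is harmless as long as the directory is chosen by the application. It archives whatever it is pointed at, so the safety of the whole operation depends on where the directory path came from.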
In the case of the user-controlled outputFolder, the flow continues with a call to addFolderToZip as we can see in the SonarQube Cloud UI. Here, this complex taint flow ends with the invocation of the listFiles method on a user-controlled File object in step 28. This is the final Sink:
As indicated by the message beneath the Sink, processing this File object – in this case, with a call to listFiles – is dangerous, because the path of the File object was constructed based on user-controlled data.
Security Impact
Since attackers can control this path and there is no verification that the provided directory resides within the intended temporary folder, attackers can use a path traversal sequence (../) to target an arbitrary, writable folder. The zip.compressFiles method recursively adds all files and folders from this directory to the zip archive, which can then be downloaded. For example, the following request can be used to set the directory to /home/user/.ssh:
The generated source code files will be stored in /home/user/.ssh. All files and folders in this directory will be added to the zip archive, which can be downloaded via the /gen/download/{fileId} endpoint. This way, all files and folders from an arbitrary folder can be exfiltrated, including a potentially existing SSH key (id_rsa) in this case:
However, downloading the generated zip archive has an additional, destructive effect: the parent folder of the directory, including all files and folders, will be deleted after the zip archive has been generated. In this case, this includes all files and folders in /home/user:
Thus, attackers can use this vulnerability not only to read arbitrary files and folders, but also to delete them.
Patch
While identifying a vulnerability as deeply hidden as this can be difficult, the actual patching process is typically straightforward. The issue was fixed by removing the code that concatenates the attacker-controllable option into the destination folder:
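Where a user-supplied sub-path cannot simply be dropped, a common alternative mitigation is a containment check. The following sketch is not the actual patch described above, which removed the option handling altogether; the class and method names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathContainmentSketch {
    // Resolves child against base and rejects any result that leaves base.
    // Note: this does not resolve symbolic links; Path.toRealPath() can be used
    // for that on paths that already exist.
    static Path resolveInside(Path base, String child) {
        Path normalizedBase = base.toAbsolutePath().normalize();
        Path resolved = normalizedBase.resolve(child).normalize();
        if (!resolved.startsWith(normalizedBase)) {
            throw new IllegalArgumentException("Path escapes base directory: " + child);
        }
        return resolved;
    }
}
```

With this check, a benign value such as client is accepted, while a value containing enough ../ sequences to leave the base directory, or an absolute path, is rejected with an IllegalArgumentException. Removing the attacker-controlled input entirely, as the maintainers did, is the more robust choice when the feature is not needed.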
Timeline
Date | Action |
2024-04-29 | We report all issues to the OpenAPI Generator maintainers. |
2024-05-11 | We reach out to the maintainers again to ask for the status. |
2024-05-13 | The maintainers share a fix with us for review. |
2024-05-21 | The fix is released as part of version 7.6.0. |
2024-05-21 | CVE-2024-35219 is assigned. |
2024-05-27 | The related security advisory is made public. |
Summary
In this blog post, we have seen how taint analysis can uncover deeply hidden vulnerabilities in source code. By tracking data from its origin (Source) to its ultimate use (Sink), this method can unveil complex taint flows that could lead to severe security vulnerabilities.
We examined a real-world example by covering a critical vulnerability in the OpenAPI Generator, which is based on a complex taint flow that SonarQube Cloud detected. This discovery highlights the importance of leveraging SAST-based tools like SonarQube Server and SonarQube Cloud to safeguard your application against these deeply hidden vulnerabilities.
Finally, we would like to thank the OpenAPI Generator maintainers for providing a comprehensive patch and for transparently informing all users.