File upload vulnerabilities

Context:

File upload vulnerabilities occur when a web server allows users to upload files to its file system without sufficiently validating elements such as their name, type, content, or size. Failure to properly restrict these could mean that even a basic image upload function can be used to upload arbitrary and potentially dangerous files instead. This could even include server-side script files that enable remote code execution.

In some cases, simply uploading the file is enough to cause damage. Other attacks may involve a follow-up HTTP request for the file, usually to trigger its execution by the server.

The impact of file upload vulnerabilities generally depends on two key factors:

What aspect of the file the website fails to properly validate, whether it be its size, type, content, etc.
What restrictions are imposed on the file once it has been successfully uploaded.

In the worst case scenario, the file type is not properly validated and the server configuration allows certain file types (such as .php and .jsp) to be executed as code. In this case, an attacker could potentially upload a server-side code file that acts as a web shell, granting them full control over the server.

If the file name is not properly validated, this could allow an attacker to overwrite critical files simply by uploading a file with the same name. If the server is also vulnerable to directory traversal, this could mean that attackers are even able to upload files to unexpected locations.

Failing to ensure that the file size falls within expected thresholds could also allow for a form of denial of service (DoS) attack, where the attacker fills up available disk space.
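The size check described above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the 5 MiB limit, the chunk size, and the names read_limited and UploadTooLarge are all assumptions chosen for the example. The point is to stop reading as soon as the limit is exceeded instead of trusting a client-supplied length.

```python
import io

# Hypothetical limit for this example; pick a value that fits the application.
MAX_UPLOAD_BYTES = 5 * 1024 * 1024


class UploadTooLarge(Exception):
    """Raised when an upload exceeds the configured size limit."""


def read_limited(stream, limit=MAX_UPLOAD_BYTES, chunk_size=64 * 1024):
    """Read an upload stream in chunks, aborting once it exceeds the limit."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > limit:
            # Stop immediately so an oversized body is never fully buffered.
            raise UploadTooLarge(f"upload exceeds {limit} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    print(len(read_limited(io.BytesIO(b"small file"))))
```

Counting the bytes actually received, rather than relying on the Content-Length header, means the limit still holds if the header is missing or wrong.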

Given the fairly obvious dangers, it is rare for live websites to have no restrictions on the files users are allowed to upload. More commonly, developers implement what they think is robust validation that is either inherently flawed or can be easily circumvented.

For example, they may attempt to blacklist dangerous file types, but fail to account for parsing gaps when checking file extensions. As with any blacklist, it is also easy to accidentally omit more obscure file types that may still be dangerous.

In other cases, the website may attempt to check the file type by verifying properties that can be easily manipulated by an attacker. Ultimately, even robust validation measures can be applied inconsistently across the network of hosts and directories that make up the website, leading to gaps that can be exploited.

Exploiting unrestricted file uploads to deploy a web shell

From a security perspective, the most dangerous situation is when a website permits the upload of server-side scripts, such as PHP, Java, or Python files, and is configured to execute them as code. This creates an easy way to establish a web shell on the server.

If an attacker succeeds in uploading a web shell, they can gain complete control over the server. This allows them to access and modify any files, steal sensitive data, and launch attacks on internal infrastructure and other servers outside the network. For instance, the PHP code snippet below can be used to read any file from the server's file system:

 <?php echo file_get_contents('/path/to/file'); ?>

Once uploaded, sending a request for this malicious file will return the content of the target file in the response. A more flexible web shell may look like this:

 <?php echo system($_GET['command']); ?>

With this script, an attacker can pass an arbitrary system command via a query parameter like this:

 GET /test/exploit.php?command=id HTTP/1.1

Exploiting flawed file upload validation

This section explores the ways in which web servers attempt to validate and sanitize file uploads, as well as how attackers can exploit the flaws in these mechanisms to acquire a web shell for remote code execution.

Flawed file type validation

When users submit HTML forms, the browser usually sends the data in a POST request with the content type application/x-www-form-urlencoded. This is acceptable for transmitting basic text such as names and addresses, but it is not suitable for sending large amounts of binary data such as an image or PDF file. In this case, multipart/form-data is the recommended content type.

Imagine a form that asks users to upload an image, provide a description, and enter their username. Submitting such a form will generate a request that looks like the following:

 POST /images HTTP/1.1
Host: basic-website.com
Content-Length: 12345
Content-Type: multipart/form-data; boundary=---------------------------012345678901234567890123456

---------------------------012345678901234567890123456
Content-Disposition: form-data; name="image"; filename="test.jpg"
Content-Type: image/jpeg

[...binary content of test.jpg...]

---------------------------012345678901234567890123456
Content-Disposition: form-data; name="description"

This is a description of my image.

---------------------------012345678901234567890123456
Content-Disposition: form-data; name="username"

wiener
---------------------------012345678901234567890123456--

As shown above, the message body is divided into separate parts for each form entry, with a Content-Disposition header providing basic details about the corresponding input field. These parts may also have their own Content-Type header to inform the server of the MIME type of the data submitted.

One method used by websites to validate file uploads is to check if the input-specific Content-Type header matches the expected MIME type. For example, if the server only permits image files, it may only allow types such as image/jpeg and image/png.

However, if the value of this header is implicitly trusted by the server and no further validation is conducted to ensure that the file contents match the supposed MIME type, this defense can be easily bypassed.
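To make the weakness concrete, here is a minimal sketch of a check that trusts the per-part Content-Type header. The function name naive_type_check and the allowed set are hypothetical and only illustrate the pattern described above; no particular framework is implied.

```python
# Hypothetical allowlist of MIME types for an image upload feature.
ALLOWED_MIME_TYPES = {"image/jpeg", "image/png"}


def naive_type_check(declared_content_type: str, body: bytes) -> bool:
    """Accept the upload if the client-declared MIME type is allowed.

    The body is deliberately ignored, which is exactly the flaw: the
    declared type is attacker-controlled and says nothing about the content.
    """
    return declared_content_type in ALLOWED_MIME_TYPES


if __name__ == "__main__":
    php_payload = b"<?php echo system($_GET['command']); ?>"
    # The attacker simply labels the script as an image in the request.
    print(naive_type_check("image/jpeg", php_payload))  # accepted
    print(naive_type_check("application/x-httpd-php", php_payload))  # rejected
```

Because the header travels in the same request as the file, changing it costs the attacker nothing. Any check of this kind needs to be backed by validation of the file name and content on the server.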

Preventing execution of files in user-accessible directories

Preventing the execution of files in user-accessible directories is the second line of defense after preventing the upload of dangerous file types. Servers generally only execute scripts whose MIME type they have been explicitly configured to execute. Otherwise, the server may return an error message or serve the file's contents as plain text instead:

 GET /static/exploit.php?command=id HTTP/1.1
Host: basic-website.com


HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 39

<?php echo system($_GET['command']); ?>

It is important to note that the configuration may differ between directories. A directory to which user-supplied files are uploaded is likely to have stricter controls than other locations on the file system. If an attacker finds a way to upload a script to a different directory that is not intended to contain user-supplied files, the server may execute the script after all.

Insufficient blacklisting of dangerous file types

Blacklisting potentially dangerous file extensions such as .php is one way to prevent users from uploading malicious scripts, but it is inherently flawed as it is difficult to block all possible file extensions that can execute code. These blacklists can be bypassed by using lesser-known alternative file extensions that may still be executable, such as .php5, .shtml, etc.

Servers typically will not execute files unless they have been configured to do so. For example, before an Apache server will execute PHP files requested by a client, developers may need to add directives such as the following to its configuration:

 LoadModule php_module /usr/lib/apache2/modules/libphp.so
AddType application/x-httpd-php .php

Developers can also create directory-specific configuration files to override or add to global settings. Apache servers load directory-specific configuration from a .htaccess file, while IIS servers use a web.config file. For example, a web.config file can include the following directives to allow JSON files to be served to users:

 <staticContent>
    <mimeMap fileExtension=".json" mimeType="application/json" />
</staticContent>

While web servers use these configuration files when present, they are typically not accessible through HTTP requests. However, some servers may allow the upload of a malicious configuration file, enabling the mapping of arbitrary and custom file extensions to executable MIME types, even if the required file extension is blacklisted.

Even the most comprehensive blacklists can potentially be bypassed using obfuscation techniques. For instance, suppose the validation code is case-sensitive and fails to recognize that exploit.pHp is in fact a .php file. If the code that subsequently maps the file extension to a MIME type is not case-sensitive, this discrepancy allows an attacker to get a malicious PHP file past validation and have the server execute it.
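The case-sensitivity gap can be sketched in a few lines. Both function names and the blocked set below are hypothetical and exist only for illustration. The first check compares the extension as written, while the second normalizes it before the comparison.

```python
import os

# Hypothetical blacklist; as noted above, lists like this are easy to leave incomplete.
BLOCKED_EXTENSIONS = {".php", ".php5", ".shtml"}


def case_sensitive_is_blocked(filename: str) -> bool:
    """Flawed check: compares the extension exactly as the client wrote it."""
    return os.path.splitext(filename)[1] in BLOCKED_EXTENSIONS


def normalized_is_blocked(filename: str) -> bool:
    """Lowercases the extension first, so case variations no longer slip past."""
    return os.path.splitext(filename)[1].lower() in BLOCKED_EXTENSIONS


if __name__ == "__main__":
    print(case_sensitive_is_blocked("exploit.pHp"))  # missed by the flawed check
    print(normalized_is_blocked("exploit.pHp"))      # caught after normalization
```

Normalizing case closes this particular gap only. It does not address the other weaknesses of blacklisting.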

Other obfuscation techniques include providing multiple extensions, adding trailing characters, using URL encoding or double URL encoding, adding semicolons or URL-encoded null byte characters, and using multi-byte Unicode characters that may be converted to null bytes and dots after Unicode conversion or normalization.
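As one illustration of the encoding tricks listed above, the sketch below shows how a double URL-encoded dot survives a single round of decoding. It assumes a hypothetical pipeline in which the validator decodes the name once and a later component decodes it again. The helper names are made up for the example, and only the standard library's urllib.parse.unquote is used.

```python
from urllib.parse import unquote


def decode_once(name: str) -> str:
    """What a validator sees if it URL-decodes the file name a single time."""
    return unquote(name)


def decode_fully(name: str, max_rounds: int = 5) -> str:
    """Decode repeatedly until the name stops changing, then return it.

    Refuses names that are still changing after max_rounds, since deeply
    nested encoding is itself suspicious.
    """
    for _ in range(max_rounds):
        decoded = unquote(name)
        if decoded == name:
            return name
        name = decoded
    raise ValueError("file name has too many layers of URL encoding")


if __name__ == "__main__":
    submitted = "exploit%252Ephp"
    print(decode_once(submitted))   # still contains an encoded dot
    print(decode_fully(submitted))  # the name a second decoding step would produce
```

A validator that inspects only the once-decoded name sees no .php extension here, which is why validation should run on the same fully decoded value that is eventually used to store the file.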

Defenses may involve removing or replacing dangerous extensions to prevent file execution. However, if this transformation is not applied recursively, an attacker can position the forbidden string so that removing it leaves a valid file extension behind. For example, stripping .php from the following file name leaves exploit.php:

 exploit.p.phphp

These are just some of the many ways in which file extensions can be obfuscated.
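The effect of stripping a forbidden string only once can be reproduced with a short sketch. The function names are hypothetical, and the file name exploit.p.phphp is used as an illustrative input in which a single removal of .php reassembles the extension.

```python
def strip_once(filename: str, forbidden: str = ".php") -> str:
    """Flawed sanitizer: removes the forbidden string in a single pass."""
    return filename.replace(forbidden, "")


def strip_recursively(filename: str, forbidden: str = ".php") -> str:
    """Keeps removing the forbidden string until none remains."""
    while forbidden in filename:
        filename = filename.replace(forbidden, "")
    return filename


if __name__ == "__main__":
    name = "exploit.p.phphp"
    print(strip_once(name))         # exploit.php
    print(strip_recursively(name))  # exploit
```

Even the recursive version is only a patch. Rejecting a file name that contains a forbidden extension is usually simpler to reason about than trying to repair it.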

Erroneous file content validation

More secure servers do not implicitly trust the Content-Type specified in a request, and instead attempt to verify that the file content matches what is expected. For example, an image upload function may check intrinsic properties of an image such as its dimensions to verify that it is actually an image file, and reject uploads that do not meet the expected criteria.

Similarly, some file types have specific sequences of bytes in their header or footer that can be used as a fingerprint or signature to determine whether the content matches the expected type. For instance, JPEG files always begin with the bytes FF D8 FF.

While this is a more reliable method of validating file type, it is still not completely foolproof. With tools like ExifTool, it is possible to create polyglot JPEG files containing malicious code in their metadata, making it challenging to detect and prevent such files from being uploaded.
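A signature check of the kind described above might look like the following sketch. The function name looks_like_jpeg is hypothetical, and the sample bytes are a deliberately simplified stand-in for a real polyglot file, not a valid JPEG. It shows why matching the leading bytes alone says little about the rest of the content.

```python
# JPEG files begin with these three bytes.
JPEG_SIGNATURE = b"\xff\xd8\xff"


def looks_like_jpeg(data: bytes) -> bool:
    """Signature-only check: inspects nothing beyond the first three bytes."""
    return data.startswith(JPEG_SIGNATURE)


if __name__ == "__main__":
    # Simplified stand-in for a polyglot: a JPEG-style prefix followed by PHP code.
    suspicious = JPEG_SIGNATURE + b"\xe0" + b"<?php echo system($_GET['command']); ?>"
    print(looks_like_jpeg(suspicious))        # passes the signature check
    print(b"<?php" in suspicious)             # yet still carries a script payload
    print(looks_like_jpeg(b"<?php phpinfo(); ?>"))  # plain script is rejected
```

A check like this raises the bar but should be combined with strict extension validation and a server configuration that does not execute files in the upload directory.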

Exploiting file upload race conditions

Modern frameworks have hardened against file upload attacks by implementing precautions such as uploading to a sandboxed temporary directory and randomizing the name to avoid overwriting existing files. The temporary file is then validated and transferred to its intended destination only if it is deemed safe.

However, developers may still implement their own file upload processing independently of any framework. This can introduce dangerous race conditions that may allow an attacker to bypass even the most robust validation.

For instance, some websites upload the file directly to the main file system and then delete it if it fails validation. This behavior is typical of websites that rely on antivirus software and similar tools to scan for malware. During the short period that the file exists on the server, which may be only a few milliseconds, the attacker can potentially still execute it.

These vulnerabilities are often subtle, making them difficult to detect in black box testing, unless the relevant source code is disclosed.
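One way to avoid the window described above is to validate the file before it ever reaches a location the web server can serve. The sketch below is a simplified illustration under stated assumptions: is_safe is a placeholder validator, store_validated is a hypothetical helper, and the staging directory is assumed to live outside the web root.

```python
import os
import shutil
import tempfile


def is_safe(data: bytes) -> bool:
    """Placeholder validator for this example; real validation would do more."""
    return data.startswith(b"\xff\xd8\xff")


def store_validated(data: bytes, final_dir: str, final_name: str):
    """Stage the upload in a private temporary directory, validate, then move.

    Returns the final path, or None if validation fails. The file never
    exists inside final_dir unless it has already passed validation.
    """
    with tempfile.TemporaryDirectory() as staging_dir:
        staged_path = os.path.join(staging_dir, "upload.tmp")
        with open(staged_path, "wb") as staged:
            staged.write(data)
        # Validate the bytes that were actually written to disk.
        with open(staged_path, "rb") as staged:
            if not is_safe(staged.read()):
                return None  # the staging directory is removed on exit
        final_path = os.path.join(final_dir, final_name)
        shutil.move(staged_path, final_path)
        return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as uploads:
        print(store_validated(b"\xff\xd8\xff\xe0...", uploads, "photo.jpg"))
        print(store_validated(b"<?php ?>", uploads, "exploit.php"))
```

The contrast with the save-then-delete pattern is that a rejected file is never written to the destination directory, so there is no interval in which it could be requested.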

Race conditions in URL-based file uploads

Race conditions can also occur in functions that allow users to upload a file by providing a URL, as the server needs to retrieve the file from the internet and create a local copy before validation can occur.

Since the file is loaded via HTTP, developers cannot rely on the framework's built-in mechanisms to securely validate files. They may create their own processes to temporarily store and validate the file, which may not be as secure.

For example, developers may store the file in a temporary directory with a randomized name, which in theory should prevent exploitation of race conditions. However, if the randomized directory name is generated using pseudo-random functions like PHP's uniqid(), it may be susceptible to brute-forcing.
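The difference between a time-derived name and an unpredictable one can be sketched as follows. PHP's uniqid() is based on the current time in microseconds. The 13-character hexadecimal layout used by time_based_id below (8 digits of seconds, 5 of microseconds) is an assumption made for illustration, and all function names are hypothetical.

```python
import secrets


def time_based_id(seconds: int, microseconds: int) -> str:
    """Illustrative time-derived identifier (assumed layout, see lead-in)."""
    return "%08x%05x" % (seconds, microseconds)


def candidate_ids(seconds: int):
    """All identifiers that could have been generated within a known second."""
    for microseconds in range(1_000_000):
        yield time_based_id(seconds, microseconds)


def unpredictable_id() -> str:
    """Directory name drawn from a cryptographically secure source."""
    return secrets.token_hex(16)


if __name__ == "__main__":
    print(time_based_id(1_700_000_000, 123_456))
    print(unpredictable_id())
```

If an attacker can estimate the second in which the directory was created, a time-derived name leaves at most one million candidates for that second. A 128-bit random token offers no comparable shortcut.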

Attackers may try to prolong the time it takes to process the file, thereby lengthening the window for brute-forcing the directory name. One way to achieve this is to upload a larger file. If the file is processed in chunks, the attacker can place the payload at the start, followed by a large number of arbitrary padding bytes.

Exploiting file upload vulnerabilities without remote code execution

The examples we have explored thus far have focused on uploading server-side scripts for remote code execution, which is the most severe consequence of an insecure file upload function. However, attackers can exploit these vulnerabilities in other ways as well.

Uploading malicious client-side scripts

Even if you cannot execute scripts on the server, you can still upload scripts for client-side attacks. For instance, if you can upload HTML files or SVG images, you can potentially use <script> tags to create stored XSS payloads.

If the uploaded file is displayed on a page visited by other users, their browser will execute the script when it attempts to render the page. It is important to note that due to same-origin policy restrictions, such attacks will only work if the uploaded file is served from the same origin as the one to which you uploaded it.

Exploiting vulnerabilities in the parsing of uploaded files

If the uploaded file appears to be stored and served securely, attackers can resort to exploiting vulnerabilities specific to the parsing or processing of different file formats. For instance, if the server parses XML-based files, such as Microsoft Office .docx or .xlsx files, this could be a potential vector for XXE injection attacks.

Uploading files using PUT

It is worth noting that some web servers can be configured to support PUT requests. If appropriate defenses are not in place, attackers can use this as an alternative means of uploading malicious files, even when an upload function is not available through the web interface.

 PUT /images/exploit.php HTTP/1.1
Host: website.com
Content-Type: application/x-httpd-php
Content-Length: 49

<?php echo file_get_contents('/path/to/file'); ?>

How to prevent file upload vulnerabilities

Allowing users to upload files is a common feature and does not have to be dangerous if the necessary precautions are taken. Implementing the following practices can be an effective way to protect your websites from these vulnerabilities:

Check the file extension against a whitelist of allowed extensions rather than a blacklist of prohibited extensions. It is easier to identify the extensions you want to allow than to guess which ones an attacker might try to upload.
Ensure that the file name does not contain any substrings that could be interpreted as a directory or traversal sequence (such as "../").
Rename uploaded files to avoid collisions that could result in overwriting existing files.
Do not upload files to the server's permanent file system until they have been fully validated.
Whenever possible, use a well-established framework for preprocessing file uploads instead of attempting to develop your own validation mechanisms.
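The first three practices above can be combined in a small helper. The following is a minimal sketch under stated assumptions, not a replacement for a framework's upload handling: the allowed extensions, the function name safe_storage_name, and the choice of a 128-bit random name are all illustrative.

```python
import os
import secrets

# Hypothetical whitelist for an image upload feature.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def safe_storage_name(client_filename: str) -> str:
    """Validate a client-supplied file name and return a server-chosen name.

    Raises ValueError if the name contains path separators, traversal
    sequences, or null bytes, or if its extension is not whitelisted.
    """
    if any(token in client_filename for token in ("/", "\\", "..", "\x00")):
        raise ValueError("file name contains a path or traversal sequence")
    extension = os.path.splitext(client_filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError("file extension is not allowed")
    # Discard the client's name entirely to avoid collisions and overwrites.
    return secrets.token_hex(16) + extension


if __name__ == "__main__":
    print(safe_storage_name("holiday.JPG"))
    for bad_name in ("exploit.php", "../../etc/passwd.jpg"):
        try:
            safe_storage_name(bad_name)
        except ValueError as error:
            print(bad_name, "->", error)
```

Because the stored name keeps only the final, whitelisted extension, a name such as exploit.php.jpg ends up as a random name ending in .jpg. Content validation and the server's execution settings still need to be handled separately.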