
CGI Programming with Perl

8.5. Data Storage

There are a number of security issues specifically related to reading and writing data. We'll discuss data storage in much greater detail in Chapter 10, "Data Persistence"; for now, let's review the security issues.

8.5.1. Dynamic Filenames

You should be extra careful when opening files whose names are dynamically generated from user input. For example, you may have data arranged according to date, with a separate directory for each year and a separate file for each month. If you have a CGI script that allows the user to search for records in these files according to month and year, you would not want to use this code:

#!/usr/bin/perl -wT

use strict;
use CGI;
use CGIBook::Error;

my $q = CGI->new;
my @missing;

my $month = $q->param( "month" ) or push @missing, "month";
my $year  = $q->param( "year"  ) or push @missing, "year";
my $key   = quotemeta( $q->param( "key" ) ) or push @missing, "key";

if ( @missing ) {
    my $fields = join ", ", @missing;
    error( $q, "You left the following required fields blank: $fields."  );
}

local *FILE;

## This is INSECURE unless you first check the validity of $year and $month
open FILE, "/usr/local/apache/data/$year/$month" or
    error( $q, "Invalid month or year" );

print $q->header( "text/html" ),
      $q->start_html( "Results" ),
      $q->h1( "Results" ),
      $q->start_pre;

while (<FILE>) {
    print if /$key/;
}

print $q->end_pre,
      $q->end_html;

Any user who supplied "../../../../../etc/passwd" as a month could browse /etc/passwd -- probably not a feature you want to provide. Assuming that your web form passes two-digit numbers for months and years, you should add the following lines before the call to open:

## \A and \z anchor the whole string; unlike $, \z does not allow a trailing newline
unless ( $year =~ /\A\d\d\z/ and $month =~ /\A\d\d\z/ ) {
    error( $q, "Invalid month or year" );
}
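The same check can be folded into a small helper that either returns a safe path or nothing at all. This is only a sketch: the helper name data_file_for is hypothetical, and it assumes the two-digit year and month layout and the /usr/local/apache/data root used in this chapter's examples.

```perl
use strict;
use warnings;

# Root of the data tree, as used in this chapter's examples.
my $DATA_ROOT = "/usr/local/apache/data";

# Hypothetical helper: returns the data file path for a two-digit year and
# month, or undef if either value is missing or is not exactly two digits.
sub data_file_for {
    my ( $year, $month ) = @_;
    return unless defined $year and defined $month;

    # Build the path only from what the patterns captured.
    my ($yy) = $year  =~ /\A(\d\d)\z/ or return;
    my ($mm) = $month =~ /\A(\d\d)\z/ or return;

    return "$DATA_ROOT/$yy/$mm";
}

# Try one well-formed month and one path traversal attempt.
for my $month ( "07", "../../../../../etc/passwd" ) {
    my $path = data_file_for( "99", $month );
    print defined $path ? "accepted: $path\n" : "rejected: $month\n";
}
```

Opening the returned path with the three-argument form of open, as in open my $fh, '<', $path, also keeps any character in the name from being interpreted as an open mode or a pipe.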

You may have noticed that taint mode is enabled and wondered why it did not catch this security problem. Remember, taint mode exists to keep you from accidentally using data that comes from outside your program to change resources outside your program. This code does not attempt to change any outside resources, so taint mode sees no reason to stop the script from reading /etc/passwd. Taint mode will only stop you from opening a file with a user-supplied filename if you are opening the file to write to it.
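One way to see this difference for yourself is to run a child perl with -T and pass it a filename on the command line, since command-line arguments are tainted just as form input is. The following is a rough sketch, not part of the original script; the helper name open_succeeds_under_taint is hypothetical, and a temporary file stands in for a real data file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A scratch file standing in for a data file.
my ( $tmp_fh, $tmp_path ) = tempfile( UNLINK => 1 );
print {$tmp_fh} "sample data\n";
close $tmp_fh;

# Hypothetical helper: start a child perl with -T and have it open the file
# named by its (tainted) first argument in the given mode. The child wraps
# the open in eval so that a taint failure becomes exit status 1.
# Returns true if the open succeeded in the child.
sub open_succeeds_under_taint {
    my ( $mode, $file ) = @_;
    my $child = "exit( eval { open( my \$fh, '$mode', \$ARGV[0] ) or die; 1 } ? 0 : 1 )";
    system( $^X, '-T', '-e', $child, $file );
    return $? == 0;
}

printf "read:  %s\n", open_succeeds_under_taint( '<', $tmp_path ) ? "allowed" : "blocked";
printf "write: %s\n", open_succeeds_under_taint( '>', $tmp_path ) ? "allowed" : "blocked";
```

On a typical Unix installation this should report that the read is allowed and the write is blocked, which is why validating filenames remains your job even with -T enabled.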

In this example, we were reading from a text file, but this security issue applies to other forms of data storage too. We could have just as easily been reading from a DBM file instead. Likewise, when you use an RDBMS, you must specify what database you wish to connect to, and it is very poor design to allow the user to specify what database to open and read.
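If a form really does need to choose among several data sources, one common approach is to have it submit a short name and look that name up in a fixed table inside the script, so that user input never reaches the connection string. The sketch below assumes this approach; the table, the helper name dsn_for, and the data source names are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical table mapping the names a form may submit to the data sources
# the script is willing to open. Both the keys and the DSN strings are
# illustrative only.
my %DSN_FOR = (
    orders  => "dbi:mysql:database=shop_orders",
    catalog => "dbi:mysql:database=shop_catalog",
);

# Returns the DSN for a known choice, or undef for anything else.
sub dsn_for {
    my ($choice) = @_;
    return unless defined $choice and exists $DSN_FOR{$choice};
    return $DSN_FOR{$choice};
}

for my $choice ( "orders", "mysql" ) {
    my $dsn = dsn_for($choice);
    print defined $dsn ? "$choice -> $dsn\n" : "$choice -> refused\n";
}
```

Because the user-supplied value is only ever used as a hash key, an unexpected value simply fails the lookup.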

8.5.2. Location of Files

Your data files should not be directly browsable by the user, so they should not be in the web server's document tree. This is a mistake people frequently make when installing third-party web applications. Many freely available web applications are distributed with all of their files -- including configuration files that contain important data like administrative passwords -- in one directory to make them easy to install. If you install the application as it comes packaged, then anyone who is familiar with the application can access the configuration information and possibly exploit it. Often these applications allow you to change filenames relatively easily, so some developers try to hide important data files by renaming them from their default names to more obscure names. A much better solution is to move them out of the web document tree altogether.

Unless you store all of your data in an RDBMS, you should have a standard data tree just like your web document tree where you can store all your application data. Give each web application a subdirectory under the root data directory. Do not configure the web server to serve files out of this directory. In our examples, we use /usr/local/apache/data as the root of our data tree.
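A script can double-check this layout at startup by confirming that its data directory does not sit beneath the document root. The following sketch assumes a document root of /usr/local/apache/htdocs, which is not specified in this chapter, and the helper name is_inside is hypothetical. It compares path components only and does not resolve symbolic links or ".." segments.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: true if $path is $tree itself or lies beneath it,
# judged purely by comparing cleaned-up path components.
sub is_inside {
    my ( $path, $tree ) = @_;
    my @path = File::Spec->splitdir( File::Spec->canonpath($path) );
    my @tree = File::Spec->splitdir( File::Spec->canonpath($tree) );
    return 0 if @path < @tree;
    for my $i ( 0 .. $#tree ) {
        return 0 if $path[$i] ne $tree[$i];
    }
    return 1;
}

my $doc_root  = "/usr/local/apache/htdocs";    # assumed document root
my $data_root = "/usr/local/apache/data";      # data tree used in our examples

if ( is_inside( $data_root, $doc_root ) ) {
    print "WARNING: $data_root is inside the document tree\n";
}
else {
    print "$data_root is outside the document tree\n";
}
```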

8.5.3. File Permissions

You should use your web server's filesystem to help you control read and write access to data files. On Unix systems, each directory and file has an owner, a group, and a set of permissions. The web server also runs as a particular user and group, such as nobody.

The web server should not have write access to any file it doesn't need to write to. This simple guideline may sound obvious, but it is often ignored in practice.

Data files that your scripts only need to read should be owned by a user other than nobody, and they should have a restrictive file permission like 0644, which lets the owner write and everyone else, including the web server, only read. If the web server needs to be able to write to a file and it is not the creator of the file, you may want to set the group of the file to nobody and enable the group write bit by setting its permission to 0664.
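These modes can be set from Perl with the built-in chmod and read back with stat. The sketch below works in a temporary directory with made-up filenames; it does not change ownership or group, since that normally requires privileges a test script should not have.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory standing in for an application's data directory.
my $dir = tempdir( CLEANUP => 1 );

# Create an empty file and give it exactly the requested mode.
sub make_file {
    my ( $name, $mode ) = @_;
    my $path = "$dir/$name";
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh;
    chmod $mode, $path or die "Cannot chmod $path: $!";
    return $path;
}

# Return just the permission bits of a file.
sub mode_of {
    my ($path) = @_;
    return ( stat $path )[2] & 07777;
}

my $read_only = make_file( "lookup.txt",    0644 );    # scripts only read this
my $writable  = make_file( "guestbook.txt", 0664 );    # group may also write

printf "%-14s %04o\n", "lookup.txt",    mode_of($read_only);
printf "%-14s %04o\n", "guestbook.txt", mode_of($writable);
```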

If the web server needs to be able to create files or subdirectories within a directory, then that directory must be writable by the web server. Assign its group to nobody and change the permissions to 0775; otherwise, directories should be 0755. Keep in mind that if you make a directory writable, then existing files in it can be deleted or replaced even if the files themselves are read-only.
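The last point is easy to verify on a Unix system: removing a file requires write permission on the directory that holds it, not on the file. This sketch uses a temporary directory and a made-up filename.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A writable scratch directory standing in for a data directory.
my $dir = tempdir( CLEANUP => 1 );
chmod 0775, $dir or die "Cannot chmod $dir: $!";

# Create a file and make it read-only for everyone, including its owner.
my $file = "$dir/readonly.txt";
open my $fh, '>', $file or die "Cannot create $file: $!";
print {$fh} "original contents\n";
close $fh;
chmod 0444, $file or die "Cannot chmod $file: $!";

# unlink needs write permission on the directory, not on the file itself.
my $deleted = unlink $file;

print $deleted ? "read-only file was deleted\n" : "delete failed: $!\n";
```

So a mode of 0444 on a file offers little protection if the web server can write to the directory that contains it.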




Copyright © 2001 O'Reilly & Associates. All rights reserved.