
Apache: The Definitive Guide

Chapter 7. Indexing

Contents:

Making Better Indexes in Apache
Making Our Own Indexes
Imagemaps

As we saw back in site.first (see Chapter 3, "Toward a Real Web Site"), if there is no index.html file in ... /htdocs, Apache concocts one called "Index of /", where "/" means the DocumentRoot directory. For many purposes this will, no doubt, be enough. But since this jury-rigged index is the first thing a client sees, you may want to do more.

7.1. Making Better Indexes in Apache

There is a wide range of possibilities; some are demonstrated at ... /site.fancyindex:

User webuser
Group webgroup
ServerName www.butterthlies.com
DocumentRoot /usr/www/site.fancyindex/htdocs

<Directory /usr/www/site.fancyindex/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_summer.html catalog_autumn.html
IndexIgnore *.jpg
IndexIgnore  ..
IndexIgnore  icons HEADER README
AddIconByType (CAT,icons/bomb.gif) text/*
DefaultIcon icons/burst.gif
#AddIcon (DIR,icons/burst.gif) ^^DIRECTORY^^
HeaderName HEADER
ReadmeName README
</Directory>

When you type go on the server and access http://www.butterthlies.com/ on the browser, you should see a rather fancy display:

Welcome to BUTTERTHLIES INC

        Name                 Last Modified      Size  Description
--------------------------------------------------------------------
<bomb>  catalog_autumn.html  23-Jul-1998 09:11  1k    One of our wonderful catalogs
<bomb>  catalog_summer.html  25-Jul-1998 10:31  1k    One of our wonderful catalogs
<burst> index.html.ok        23-Jul-1998 09:11  1k
--------------------------------------------------------------------
Butterthlies Inc, Hopeful City, Nevada 99999

(This output is from Apache 1.3; the year is displayed in four-digit format to cope with the Year 2000 problem.) How does all this work? As you can see from the httpd.conf file, this smart formatting is configured directory by directory. The key directive is IndexOptions.

7.1.1. IndexOptions

IndexOptions option  option ...
Server config, virtual host, directory, .htaccess

This directive was altered by the Apache Group as we went to press with this edition of the book; therefore, its behavior is different before and after Apache version 1.3.2. The options are as follows:

FancyIndexing

Turns on fancy indexing of directories (see Section 7.1.2, "FancyIndexing", later in this chapter).

Note that in versions of Apache prior to 1.3.2, the FancyIndexing and IndexOptions directives will override each other. You should use IndexOptions FancyIndexing in preference to the standalone FancyIndexing directive. As of Apache 1.3.2, a standalone FancyIndexing directive is combined with any IndexOptions directive already specified for the current scope.

IconHeight[=pixels] (Apache 1.3 and later)

The presence of this option, when used with IconWidth, will cause the server to include HEIGHT and WIDTH attributes in the <IMG> tag for the file icon. This allows browsers to precalculate the page layout without having to wait until all the images have been loaded. If no value is given for the option, it defaults to the standard height of the icons supplied with the Apache software.

IconsAreLinks

This option makes the icons part of the anchor for the filename, for fancy indexing.

IconWidth[=pixels] (Apache 1.3 and later)

The presence of this option, when used with IconHeight, will cause the server to include HEIGHT and WIDTH attributes in the <IMG> tag for the file icon. This allows browsers to precalculate the page layout without having to wait until all the images have been loaded. If no value is given for the option, it defaults to the standard width of the icons supplied with the Apache software.

NameWidth=[n | *] (Apache 1.3.2 and later)

The NameWidth keyword allows you to specify the width of the filename column in bytes. If the keyword value is "*", then the column is automatically sized to the length of the longest filename in the display.

ScanHTMLTitles

Enables the extraction of the title from HTML documents for fancy indexing. If the file does not have a description given by AddDescription, then httpd will read the document for the value of the <TITLE> tag. This process is CPU- and disk-intensive.

SuppressColumnSorting

If specified, Apache will not make the column headings in a fancy indexed directory listing into links for sorting. The default behavior is for them to be links; selecting the column heading will sort the directory listing by the values in that column. Only available in Apache 1.3 and later.

SuppressDescription

This option will suppress the file description in fancy indexing listings.

SuppressHTMLPreamble (Apache 1.3 and later)

If the directory actually contains a file specified by the HeaderName directive, the module usually includes the contents of the file after a standard HTML preamble (<HTML>, <HEAD>, etc.). The SuppressHTMLPreamble option disables this behavior, causing the module to start the display with the header file contents. The header file must contain appropriate HTML instructions in this case. If there is no header file, the preamble is generated as usual.

SuppressLastModified

This option will suppress the display of the last modification date in fancy indexing listings.

SuppressSize

This option will suppress the file size in fancy indexing listings.
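
As an illustration of how several of these options combine, here is a minimal sketch. The directory path is the one from our example site; the pixel values are assumptions of ours, chosen for icons 20 pixels wide and 22 pixels high, and should be adjusted to the icons actually in use. NameWidth requires Apache 1.3.2 or later.

```apache
<Directory /usr/www/site.fancyindex/htdocs>
# Everything goes on one IndexOptions line, so the result does not
# depend on whether this version of Apache merges multiple directives.
# IconHeight=22 and IconWidth=20 are assumed values for illustration.
IndexOptions FancyIndexing IconsAreLinks IconHeight=22 IconWidth=20 NameWidth=* SuppressSize
</Directory>
```

With these settings the listing should show clickable icons, a filename column wide enough for the longest name, and no size column.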

There are some noticeable differences in the behavior of the IndexOptions directive in recent (post-1.3.0) versions of Apache. In Apache 1.3.2 and earlier, the default is that no options are enabled. If multiple IndexOptions could apply to a directory, then the most specific one is taken complete; the options are not merged. For example, if the specified directives are:

<Directory /web/docs> 
IndexOptions FancyIndexing 
</Directory>
<Directory /web/docs/spec> 
IndexOptions ScanHTMLTitles 
</Directory>

then only ScanHTMLTitles will be set for the /web/docs/spec directory.

Apache 1.3.3 introduced some significant changes in the handling of IndexOptions directives. In particular:

  • Multiple IndexOptions directives for a single directory are now merged together. The result of the previous example will now be the equivalent of IndexOptions FancyIndexing ScanHTMLTitles.

  • Incremental syntax (i.e., prefixing keywords with "+" or "-") has been added.

Whenever a "+" or "-" prefixed keyword is encountered, it is applied to the current IndexOptions settings (which may have been inherited from an upper-level directory). However, whenever an unprefixed keyword is processed, it clears all inherited options and any incremental settings encountered so far. Consider the following example:

IndexOptions +ScanHTMLTitles -IconsAreLinks FancyIndexing 
IndexOptions +SuppressSize

The net effect is equivalent to IndexOptions FancyIndexing +SuppressSize, because the unprefixed FancyIndexing discarded the incremental keywords before it but allowed them to start accumulating again afterward.

To unconditionally set the IndexOptions for a particular directory, clearing the inherited settings, specify keywords without either "+" or "-" prefixes.
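
To make the incremental syntax concrete, here is a sketch for Apache 1.3.3 or later that reuses the /web/docs tree from the earlier example. The comments describe the result we would expect from the rules just given.

```apache
<Directory /web/docs>
IndexOptions FancyIndexing ScanHTMLTitles
</Directory>

# Only prefixed keywords appear here, so they adjust the settings
# inherited from /web/docs rather than replacing them.
# Expected effective options: FancyIndexing SuppressSize
<Directory /web/docs/spec>
IndexOptions -ScanHTMLTitles +SuppressSize
</Directory>
```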

7.1.2. FancyIndexing

FancyIndexing on_or_off
Server config, virtual host, directory, .htaccess

FancyIndexing turns fancy indexing on or off. When it is on, the user can click on a column title to sort the entries by that column's values; clicking again reverses the sort. Sorting can be turned off with the SuppressColumnSorting keyword for IndexOptions (see earlier in this chapter).

We can specify a description for individual files or for a list of them. We can exclude files from the listing with IndexIgnore.

7.1.3. IndexIgnore

IndexIgnore file1 file2 ...
Server config, virtual host, directory, .htaccess

IndexIgnore is followed by a list of files or wildcards to describe files. As we see in the following example, multiple IndexIgnores add to the list rather than replacing each other. By default, the list includes ".".

Here we want to ignore the *.jpg files (which are, after all, no use without the .html files that display them) and the parent directory, known to Unix and to Win32 as "..":

...
<Directory /usr/www/fancyindex.txt/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg ..
</Directory>

You might want to use IndexIgnore for security reasons as well: what the eye doesn't see, the mouse finger can't steal.[51] You can put in extra IndexIgnore lines, and the effects are cumulative, so we could just as well write:

<Directory /usr/www/fancyindex.txt/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore ..
</Directory>

[51] Well, OK, you should never rely on this, but it doesn't hurt, right?
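
The wildcard forms can be sketched in the same way. The patterns below are illustrative choices of ours rather than part of the example site: they hide backup files and any header or footer files from the listing.

```apache
# Hide backup copies and editor leftovers
IndexIgnore *.bak *~
# Hide header and footer files, with or without an .html extension
IndexIgnore HEADER* README*
```

Because the effects are cumulative, these two lines behave the same as a single IndexIgnore listing all four patterns.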

We can add visual sparkle to our page, without which success on the Web is most unlikely, by giving icons to the files with the AddIcon directive. Apache has more icons than you can shake a stick at in its ... /icons directory. Without spending some time exploring, one doesn't know precisely what each one looks like, but bomb.gif sounds promising. The icons directory needs to be specified relative to the DocumentRoot directory, so we have made a subdirectory ... /htdocs/icons and copied bomb.gif into it. We can attach the bomb icon to all displayed .html files with:

...
AddIcon icons/bomb.gif  .html

7.1.4. AddIcon

AddIcon icon_name name
Server config, virtual host, directory, .htaccess

AddIcon expects the URL of an icon, followed by a file extension, a wildcard expression, a partial filename, or a complete filename to describe the files to which the icon will be added. We can iconify subdirectories off the DocumentRoot with ^^DIRECTORY^^, or make blank lines format properly with ^^BLANKICON^^. Since we have the convenient icons directory to practice with, we can iconify it with:

AddIcon /icons/burst.gif ^^DIRECTORY^^

Or we can make it disappear with:

...
IndexIgnore  icons
...

Not all browsers can display icons. We can cater to those that cannot by providing a text alternative alongside the icon URL:

AddIcon (DIR,/icons/burst.gif) ^^DIRECTORY^^

This line will print the word DIR where the burst icon would have appeared to mark a directory (that is, the text becomes the ALT attribute of the icon's <IMG> tag). You could, if you wanted, print the word "Directory" or "This is a directory." The choice is yours.

Examples:

AddIcon (IMG,/icons/image.xbm) .gif .jpg .xbm 
AddIcon /icons/dir.xbm ^^DIRECTORY^^ 
AddIcon /icons/backup.xbm *~

AddIconByType should be used in preference to AddIcon, when possible.

7.1.5. AddAlt

AddAlt string file file ...
Server config, virtual host, directory, .htaccess

AddAlt sets alternate text to display for the file if the client's browser can't display an icon. The string must be enclosed in double quotes.
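
A brief sketch of AddAlt in use; the file patterns and strings are our own illustrative choices:

```apache
# Text shown in place of the icon for PDF files
AddAlt "PDF file" *.pdf
# One string can cover several file patterns
AddAlt "Compressed file" *.gz *.zip *.Z
```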

7.1.6. AddDescription

AddDescription string file1 file2 ...
Server config, virtual host, directory, .htaccess

AddDescription expects a description string in double quotes, followed by a file extension, partial filename, wildcards, or full filename:

<Directory /usr/www/fancyindex.txt/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore ..
AddIcon (CAT,icons/bomb.gif)  .html
AddIcon (DIR,icons/burst.gif) ^^DIRECTORY^^
AddIcon icons/blank.gif ^^BLANKICON^^
DefaultIcon icons/blank.gif
</Directory>

Having achieved these wonders, we might now want to be a bit more sensible and choose our icons by MIME type using the AddIconByType directive.

7.1.7. DefaultIcon

DefaultIcon url
Server config, virtual host, directory, .htaccess

DefaultIcon sets a default icon to display for unknown file types. url points to the icon.
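
For example, assuming an icon named unknown.xbm is available under /icons (the path is an assumption, not something set up on our example site):

```apache
# Used for any file that no AddIcon, AddIconByType, or
# AddIconByEncoding rule matches
DefaultIcon /icons/unknown.xbm
```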

7.1.8. AddIconByType

AddIconByType icon mime_type1 mime_type2 ...
Server config, virtual host, directory, .htaccess

AddIconByType takes as an argument an icon URL, followed by a list of MIME types. Apache looks for the type entry in mime.types, either with or without a wildcard. We have the following MIME types:

...
text/html html htm
text/plain txt
text/richtext rtx
text/tab-separated-values tsv
text/x-setext etx
...

So, we could have one icon for all text files by including the line:

AddIconByType (TXT,icons/bomb.gif) text/*

Or we could be more specific, using four icons, a.gif, b.gif, c.gif, and d.gif:

AddIconByType (TXT,/icons/a.gif) text/html
AddIconByType (TXT,/icons/b.gif) text/plain
AddIconByType (TXT,/icons/c.gif) text/tab-separated-values
AddIconByType (TXT,/icons/d.gif) text/x-setext

Let's try out the simpler case:

<Directory /usr/www/fancyindex.txt/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore ..
AddIconByType (CAT,icons/bomb.gif)  text/*
AddIcon (DIR,icons/burst.gif) ^^DIRECTORY^^
</Directory>

For a further refinement, we can use AddIconByEncoding to give a special icon to encoded files.

7.1.9. AddAltByType

AddAltByType string mime_type1 mime_type2 ...
Server config, virtual host, directory, .htaccess

AddAltByType provides a text string for the browser to display if it cannot show an icon. The string must be enclosed in double quotes.
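
A short sketch, with strings and MIME types chosen by us for illustration:

```apache
# Alternate text for one specific MIME type
AddAltByType "plain text" text/plain
# A wildcard covers a whole family of types
AddAltByType "Image" image/*
```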

7.1.10. AddIconByEncoding

AddIconByEncoding icon mime_encoding1 mime_encoding2 ...
Server config, virtual host, directory, .htaccess

AddIconByEncoding takes an icon name followed by a list of MIME encodings. For instance, x-compress files can be iconified with:

...
AddIconByEncoding (COMP,/icons/d.gif) x-compress
...

7.1.11. AddAltByEncoding

AddAltByEncoding string mime_encoding1 mime_encoding2 ...
Server config, virtual host, directory, .htaccess

AddAltByEncoding provides a text string for the browser to display if it can't put up an icon. The string must be enclosed in double quotes.
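
A short sketch; the strings are our own, and x-gzip and x-compress are the encodings we assume the server has been told about:

```apache
# Alternate text for files served with a gzip encoding
AddAltByEncoding "gzip" x-gzip
# Alternate text for files served with a compress encoding
AddAltByEncoding "Compressed" x-compress
```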

Next, in our relentless drive for perfection, we can print standard headers and footers to our menus with the HeaderName and ReadmeName directives.

7.1.12. HeaderName

HeaderName filename
Server config, virtual host, directory, .htaccess

This directive inserts a header, read from filename, at the top of the index. The name of the file is taken to be relative to the directory being indexed. Apache will look first for filename.html and, if that is not found, then filename.

7.1.13. ReadmeName

ReadmeName filename
Server config, virtual host, directory, .htaccess

This directive appends a footer, read from filename, to the end of the index. The name of the file is taken to be relative to the directory being indexed. Apache tries to include filename.html as an HTML document and, if that fails, filename as text.

If we simply call the file HEADER, Apache will look first for HEADER.html and display it if found. If not, it will look for HEADER and display that. The HEADER file can be:

Welcome to BUTTERTHLIES, Inc.

and the README file:

Butterthlies Inc., Hopeful City, Nevada 99999

to correspond with our index.html. We don't want HEADER and README to appear in the menu themselves, so we add them to the IndexIgnore directive:

<Directory /usr/www/fancyindex.txt/htdocs>
FancyIndexing on
AddDescription "One of our wonderful catalogs" catalog_autumn.html catalog_summer.html
IndexIgnore *.jpg
IndexIgnore .. icons HEADER README
AddIconByType (CAT,icons/bomb.gif)  text/*
AddIcon (DIR,icons/burst.gif) ^^DIRECTORY^^
HeaderName HEADER
ReadmeName README
</Directory>

Since HEADER and README can be HTML documents, you can wrap the directory listing up in a whole lot of fancy interactive stuff if you want.
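
If the header is to supply the whole opening of the page itself, the SuppressHTMLPreamble option described earlier keeps Apache from emitting its standard preamble first. A minimal sketch for Apache 1.3 or later, assuming a HEADER.html file exists in the directory and begins with its own <HTML> and <HEAD> tags:

```apache
<Directory /usr/www/site.fancyindex/htdocs>
# Both options share one IndexOptions line so that the result does not
# depend on how this version of Apache merges multiple directives.
IndexOptions FancyIndexing SuppressHTMLPreamble
HeaderName HEADER
ReadmeName README
# Keep the header and footer files out of the listing itself
IndexIgnore HEADER* README*
</Directory>
```

If HEADER.html is missing, Apache generates the standard preamble as usual, so the listing remains valid HTML either way.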

But, on the whole, FancyIndexing is just a cheap and cheerful way of getting something up on the Web. For an elegant Net solution, study the next section.




Copyright © 2001 O'Reilly & Associates. All rights reserved.