Restricting Access and Serving Nonstandard Document Types

in your web space

Technical Bulletin 15
updated - 3/02 jah


Table of Contents

  1. Introduction
  2. General Security Issues
  3. Restricting Access to Lehigh
  4. Restricting Access via AFS (Lehigh) ID and password
  5. Restricting Access by Password
  6. Combining Access Restrictions
  7. Serving Nonstandard Document Types
  8. Combining .htaccess File Options

Introduction

Lehigh's web server will generally serve files to anyone on the Internet who requests them. For most purposes, this is good. But what if you would like to restrict access to certain files? This bulletin discusses three types of restriction: to people at Lehigh, to people who have a Lehigh (AFS) ID and password, and to people who have an ID and password you've given them.

Lehigh's web server, like most web servers, generally serves HTML (HyperText Markup Language) files. It also serves JPEG and GIF graphics files, and a few other file types, as standard fare. But what if you would like to share, say, Microsoft Word or PowerPoint files via the Web? This can be done, but requires configuration on both the server end and the client end.

In each of the above two cases, you wish Lehigh's web server to consider certain documents as extraordinary, and not treat them in the standard way. One or more files called .htaccess placed in your web data directories can instruct Lehigh's server to treat documents in a nonstandard way, allowing you to both restrict access to documents and to serve nonstandard document types. (Note that UNIX treats files which begin with a period (.) as hidden, including the .htaccess file. The hidden files will not appear in the standard directory listing--ls--but can be seen with a complete directory listing--ls -a.)

Notation conventions used in this document include:

  1. Input and output likely to be seen on a command line are shown in a typewriter-style font, like command.
  2. If part of a command line needs particular information you must specify, it is emphasized, like command your_userid.

General Security Issues

Lehigh's web server has access to all the files you give it access to, and it will serve those documents to anyone who asks to see them. Which documents are they? Generally, all the files in the public directory within your home directory in AFS file space. You probably store all of your WWW files in your public/www-data/ directory, because of the convention on Lehigh's web server that ~your_userid/ is a shortcut for your_userid/public/www-data/.

AFS permissions

Generally, Lehigh's web server has access to all of the files in the public directory within your home directory in AFS file space. This is because the AFS permissions have been automatically set for you to allow the web server access.

The AFS permission which allows the web server access to your files is the permission system:anyuser rl. (You can look at the AFS permissions for a directory by executing the command fs la in that directory.) This means that any files to which the web server has access are also accessible to anyone who has access to the AFS file system. This includes Carnegie Mellon University, Michigan State University, and, of course, everyone at Lehigh.

Therefore, the "restricted" access which we will discuss in the rest of this bulletin should be understood in the correct context: Lehigh's web server will honor these restrictions. Lehigh's web server, though, has nothing to do with AFS access or permissions (except that it must have permission to see the files), so the files may still be seen by any AFS user.

NOTE: This means that anyone at Lehigh can see any of your web pages by going into AFS, whether or not restrictions are placed on those pages or directories.

The index.html file

You may have noticed that sometimes Lehigh's web server returns the index to a directory.
 
Example: The URL http://www.lehigh.edu/computing/web/limits returns an index to the AFS directory /home/inlts/public/www-data/computing/web/limits, which has the shortcut /computing/web/limits.

Specifically, the server generates an index when the path specified in the URL is a directory and when one more condition is satisfied--the directory does not contain a file called index.html, or a pointer to another index file.

If, on the other hand, the directory does contain a file called index.html, the server returns the index.html file rather than generating an index.
 
Example: The URL http://www.lehigh.edu/computing/software returns the document /computing/software/index.html. (The same document would also be returned by the URL http://www.lehigh.edu/computing/software/index.html.)

Use of the index.html file can be considered a security feature: it keeps people from looking at a directory listing of your files through a web browser.

If you have set up your webspace since October 15, 1999, it will include a symbolic link that aliases your userid.html file to index.html. You can remove this symbolic link, if you choose, by going into your webspace (from the Network Server, enter shell, then cd webspace) and entering rm index.html.

A cleaner way to set the index file to be your home page, if you already have a home page home.html in the same directory, is to add a DirectoryIndex command to a hidden .htaccess file in your directory, so:

DirectoryIndex home.html
Now, instead of looking for index.html, the server will look for home.html. This command is 'inherited' by all the subdirectories under that directory, so you may want to include an order of precedence:
DirectoryIndex home.html index.html mine.html
This means that the server, in that directory and any subdirectories, will first look for a file called home.html, then for index.html, and finally, as a last resort, mine.html.
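The lookup order described above can be sketched in a few lines of Python. This is only an illustration of the precedence rule, not the server's actual code; the function name resolve_index is hypothetical.

```python
import os
import tempfile


def resolve_index(directory, directory_index=("index.html",)):
    """Return the path of the first index file found in `directory`.

    `directory_index` plays the role of the names on a DirectoryIndex
    line, tried from left to right. Returns None when no listed file
    exists, which is the case where the server would generate an index
    of the directory instead.
    """
    for name in directory_index:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

Calling resolve_index(some_dir, ("home.html", "index.html", "mine.html")) mirrors the second DirectoryIndex line shown above: mine.html is returned only if neither home.html nor index.html is present.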


Restricting Access to Lehigh

The hidden file .htaccess can be used to restrict access to your files, so that only people at Lehigh can view them.
 
Example: The URL http://www.lehigh.edu/computing/web/limits/lehigh.only can only be viewed from Lehigh.

If there is a file called .htaccess in the directory containing the requested file, or in any parent directory, the rules contained in the .htaccess file apply to the requested file. The directory containing the file in the example above has a .htaccess file which restricts access to that directory so that only people accessing the page from Lehigh can see it. This .htaccess file contains:

<Limit GET>
order deny,allow
deny from all
allow from 128.180
</Limit>
This file is specifying an algorithm: Deny access to everyone, unless they are from 128.180, meaning that their IP address begins with 128.180 (which covers all computers at Lehigh).

To restrict access to the files in a directory, create a file called .htaccess in that directory which contains the text shown above. 
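The "deny everyone, unless they are from 128.180" rule can be illustrated with a short Python sketch. It assumes that a partial address such as 128.180 is compared against the leading octets of the visitor's IP address; the function name host_allowed is hypothetical, and the real server performs this check itself.

```python
def host_allowed(ip, allow_prefixes=("128.180",)):
    """Mimic: order deny,allow / deny from all / allow from <prefixes>.

    Everyone is denied unless the leading octets of `ip` match one of
    the allowed partial addresses. Matching is done octet by octet, so
    128.180 matches 128.180.2.9 but not 128.18.0.1.
    """
    octets = ip.split(".")
    for prefix in allow_prefixes:
        wanted = prefix.rstrip(".").split(".")
        if octets[:len(wanted)] == wanted:
            return True  # an "allow from" line matched
    return False  # fell through to "deny from all"
```

Under this sketch, host_allowed("128.180.2.9") is True, while an address outside the 128.180 range is refused.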


Restricting Access via AFS (Lehigh) ID and password

All Lehigh users have an AFS ID and password. That AFS ID is the same one they use to check their mail, log into the Network Server, use the Compute Servers, etc. Therefore, it is possible to restrict use of files in a directory to users who prove that they are valid Lehigh users by entering their AFS ID and password.
Example: The URL http://www.lehigh.edu/computing/web/limits/afs/ can only be viewed by people with Lehigh userids.

If there is a file called .htaccess in the directory containing the requested file, or in any parent directory, the rules contained in the .htaccess file apply to the requested file. The directory containing the file in the example above has a .htaccess file which restricts access to that directory so that only people who supply a valid AFS ID and password can see it. This .htaccess file contains:

AuthExternal    afs
AuthName        AFSauthentication
AuthType        Basic
require         valid-user
This file is specifying an algorithm: Deny access to everyone, unless they can authenticate with an AFS ID and password. The browser will show a window or entry box prompting the user for AFS ID and password. If AFS says that combination is correct, the user is allowed access. (Note: the AFS ID is not recorded anywhere.)

To restrict access to the files in a directory, create a file called .htaccess in that directory which contains the text shown above.

You can also specify that only specific people at Lehigh be allowed to access your pages:

AuthExternal    afs
AuthName        AFSauthentication
AuthType        Basic
require user    userid1 userid2 userid3
where the userids are the Lehigh userids of the people you wish to allow access. (To create a file of people to allow access, create a group file (described below), point to it in the .htaccess file (AuthGroupFile, described below), and change the require line to: require group groupname.)


Restricting Access by Password

An example

The .htaccess file can also be used to restrict access to your files to individuals you have given a name and password.
 
Example:
  1. The directory http://www.lehigh.edu/comp/web/limits/password is restricted to those who can supply the correct name and password. In this example, the name is jiminy and the password is cricket.
  2. If you follow the above link or otherwise try to access this restricted directory, you will come to a dialog box requesting name and password:

    [Image: authorization dialog box]

  3. After providing the correct name and password, you will be served the requested document, in this case, an index of the directory restricted/.

Setting up the .htaccess file

If there is a file called .htaccess in the directory containing the requested file, or in any parent directory, the access rules contained in the .htaccess file apply to the requested file. The directory containing the file in the above example has a .htaccess file which restricts access to that directory to those who can supply the correct name and password. In this case, the .htaccess file contains:
AuthUserFile /afs/cc/home/inlts/public/comp/web/limits/restricted/.passwordfile
AuthName TB15RealmA
AuthType Basic
require user jiminy
This .htaccess file requires some explanation:
  1. AuthUserFile defines the file which contains the name and password information. The following section discusses this file.
  2. AuthName defines the name of the authentication realm. To understand the function of the authentication realm, it helps to understand that the server requests authentication for every document which is in a protected area. You don't need to provide the authentication name and password every time, though, because your browser stores the name and password after you enter them, and provides them automatically to the server on subsequent requests. If you go to a second protected area, your browser will ask for the new name and password. How does the browser know which name and password go with which protected area? By distinguishing between different authentication realms, which should have different names specified by AuthName.
  3. AuthType Basic is specified. This is the only authentication scheme defined in HTTP/1.0 (HyperText Transfer Protocol).
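The role of the realm can be sketched in Python. This is a simplified model of what a browser does, not any particular browser's code; the class name CredentialCache and its methods are hypothetical. The one detail taken from the protocol itself is that Basic authentication sends the name and password as a base64-encoded "name:password" string in the request.

```python
import base64


def basic_auth_header(name, password):
    """Build the value of an Authorization header for Basic authentication."""
    token = base64.b64encode(f"{name}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


class CredentialCache:
    """Remember one name/password pair per (host, realm)."""

    def __init__(self):
        self._by_realm = {}

    def remember(self, host, realm, name, password):
        self._by_realm[(host, realm)] = (name, password)

    def header_for(self, host, realm):
        """Return the header to send, or None if the user must be prompted."""
        creds = self._by_realm.get((host, realm))
        if creds is None:
            return None
        return basic_auth_header(*creds)
```

After the user enters jiminy/cricket for the realm TB15RealmA, later requests in that realm can be answered from the cache, while a request for a different realm on the same host still returns None and so leads to a new prompt.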
Create your .htaccess file as follows:
  1. AuthUserFile file
     Substitute the name of your password file. (The next section discusses how to create this file.) Keep the password file in a restricted area. The password file name must start with /afs/cc/home/..., and most likely starts with /afs/cc/home/your_userid/public/www-data/...
  2. AuthName realm
     Use a name which is descriptive of the restricted area.
  3. AuthType Basic
  4. require user username
     The username here is the one which users must enter to access the restricted area.

Setting up the password file

The password file is not created manually, but with a command run at the UNIX command line. This command, htpasswd, can be run on the Compute Server (use SSH to connect to CS/Rigel)  but not on the Network Server or the SGI workstations.
 
Example: In order to create the jiminy/cricket username/password combination, I did the following:
>htpasswd -c .passwordfile jiminy
Adding password for jiminy.
New password:
Re-type new password:
>
(Feel free to look at the resulting password file, if you like. You will see that the passwords are encoded, so they cannot be read directly.) 

A few comments on this example:

  1. In the example above, -c was necessary to create a new file; by default, the htpasswd command adds username/password combinations to an existing file.
  2. I specified the username on the command line, and was prompted for the password.
  3. The file .passwordfile is created in my present working directory.
Create your password file using the htpasswd command:
htpasswd -c file user
After setting up both the .htaccess file and the password file referred to in it, the files in the directory with the .htaccess file will be restricted.
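If you want to confirm which usernames a password file contains, a small Python sketch like the following can help. It assumes the usual htpasswd layout of one "username:encoded-password" entry per line; the function name users_in_password_file is hypothetical, and the encoded value in any sample you try should be treated as opaque.

```python
def users_in_password_file(text):
    """Return the usernames listed in the contents of a password file.

    Each non-empty line is expected to look like
    "username:encoded-password"; only the part before the first colon
    is read. The encoded password is never decoded.
    """
    users = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        name, _encoded = line.split(":", 1)
        users.append(name)
    return users
```

This is useful for checking that the username on your require user line actually has an entry in the file named by AuthUserFile.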

Restricting Access by Group

Access can also be restricted to a group of users, each of whom has an individual password.
 
Example: The directory http://www.lehigh.edu/computing/web/limits/group is restricted by group. In this example, there is a group containing three name/password combinations which allow access: manny/pep1, moe/pep2, and jack/pep3.

Three files are required to set up access restriction by group:

  1. The .htaccess file, as we've seen before.
  2. A group file, which is new, defines the users who are in the group.
  3. A password file, as we've also seen before. In this case, though, the password file must contain an entry for each user.
In this case, the .htaccess file contains:
AuthUserFile /afs/cc/home/inlts/public/www-data/comp/web/limits/group/.passwordfile
AuthGroupFile /afs/cc/home/inlts/public/www-data/comp/web/limits/group/.groupfile
AuthName TB15RealmB
AuthType Basic
require group pep-boys
The group file defines the group. In this case, the group file .groupfile contains:
pep-boys: manny moe jack
The password file .passwordfile used in this case was created with a sequence of three UNIX commands:
htpasswd -c .passwordfile manny
htpasswd .passwordfile moe
htpasswd .passwordfile jack
In order to restrict access by group, then:
  1. Create an .htaccess file as described above, under "Setting up the .htaccess file," but with two changes:
    1. Add a line:
       AuthGroupFile groupfile
    2. Change the line
       require user username
       to
       require group groupname
  2. Create a password file as described above, under "Setting up the password file," with an entry for each user in the group.
  3. Create a group file, with the name given in the AuthGroupFile line, defining the users who are in the group. Use the example as a template.
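The relationship between the group file and the require group line can be sketched in Python. This only illustrates the membership test, assuming the "groupname: user1 user2 ..." layout shown in the example; the function names are hypothetical, and the password check itself is left to the server.

```python
def parse_group_file(text):
    """Parse group file contents into {group name: set of usernames}."""
    groups = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        name, members = line.split(":", 1)
        groups[name.strip()] = set(members.split())
    return groups


def user_in_group(groups, group, user):
    """True if `user` is listed as a member of `group`."""
    return user in groups.get(group, set())
```

With the example group file, parse_group_file("pep-boys: manny moe jack") yields a single group, and user_in_group(groups, "pep-boys", "moe") is True. A user who is not listed in the group fails this test even with a valid entry in the password file.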


Combining Access Restrictions

It is possible to provide multiple tests for access to a directory by entering multiple restrictions in order, along with a satisfy stipulation. To make the user satisfy all of the requirements, include a 'satisfy all' line; to allow access to a user who satisfies any one of the requirements, include a 'satisfy any' line.

In order to restrict access to people with AFS IDs and passwords, who also happen to be coming in from Lehigh IP addresses:

<Limit GET>
order deny,allow
deny from all
allow from 128.180
AuthExternal    afs
AuthName        AFSauthentication
AuthType        Basic
require         valid-user
satisfy all
</Limit>
To require anyone not coming in from Lehigh to give a correct AFS ID and password for access:
<Limit GET>
order deny,allow
deny from all
allow from 128.180
AuthExternal    afs
AuthName        AFSauthentication
AuthType        Basic
require         valid-user
satisfy any
</Limit>
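The difference between the two satisfy settings comes down to "and" versus "or". A minimal Python sketch, with the hypothetical function name access_granted, where host_ok stands for the result of the allow/deny test and auth_ok for the result of the ID and password test:

```python
def access_granted(host_ok, auth_ok, satisfy="all"):
    """Combine the host test and the password test.

    satisfy="all": both tests must pass.
    satisfy="any": passing either test is enough.
    """
    if satisfy == "all":
        return host_ok and auth_ok
    if satisfy == "any":
        return host_ok or auth_ok
    raise ValueError("satisfy must be 'all' or 'any'")
```

For a visitor at Lehigh who has not supplied a password, access_granted(True, False, "all") is False but access_granted(True, False, "any") is True, which matches the two examples above.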


Serving Nonstandard Document Types

Lehigh's server can serve nonstandard document types, such as Microsoft Word or PowerPoint documents, to any user on the Web. Using the server to do this requires configuration of both the web server and the user's client.

The server setup uses the .htaccess file mechanism. Setting up the client is dependent on the client being used; we will discuss setting up Netscape 2.0, though other client setups will be similar.

Configuring the client

Before you can see this in action, you must have your client (browser) configured. Therefore, we will first go through the configuration of one browser, Netscape 2.0:

  1. Under the Options menu, choose the General Preferences item.
  2. In the Preferences window, choose the Helpers tab.
  3. Press the Create New Type button.
  4. In the Configure New Mime Type window, for Mime Type, type application, and for Mime SubType, type msword, then press the OK button.
  5. In the File Extensions field, type doc.
  6. In the area labelled Action, select Launch the Application, then press the Browse button and find an application which will be used to view the Microsoft Word documents. Appropriate applications would be the Microsoft Word Viewer, Microsoft Word, or another application (such as WordPerfect) which can interpret Microsoft Word documents.
Configuring browsers other than Netscape 2.0 is very similar.
 
Examples:
  1. The document http://www.lehigh.edu/~ludoc/bull/tb15/example/nonstandard/speedup.doc is a Microsoft Word document. If you've configured your browser as above, when you click on the link to the Microsoft Word document, your browser should automatically launch the application you've chosen and display the document. An application launched by Netscape in this manner is called a helper application.
  2. A second example can be viewed after you configure your browser with a helper application for PowerPoint documents, with MIME type/subtype as application/powerpoint and extension ppt: http://www.lehigh.edu/~ludoc/bull/tb15/example/nonstandard/tables.ppt
  3. A third example can be viewed after you configure your browser with a helper application for WordPerfect documents, with MIME type/subtype as application/wordperfect and extension wpd: http://www.lehigh.edu/~ludoc/bull/tb15/example/nonstandard/wpinstal.wpd.

Configuring the server: MIME Types

Another .htaccess file is required to configure the server. If there is a file called .htaccess in the directory containing the requested file, or in any parent directory, the rules contained in the .htaccess file apply to the requested file. In the case of the above examples, the .htaccess file reads as follows:
AddType application/powerpoint .ppt
AddType application/msword .doc
AddType application/wordperfect .wpd
Whenever a web server sends information to a web client, it first sends a tag describing the information it is sending. The tag allows the client to process the information properly. These tags are called MIME types (Multipurpose Internet Mail Extensions types), descriptions originally meant for use with e-mail attachments.

There are standard MIME types every graphical browser uses. HTML documents have the MIME type text/html. GIF graphics have the MIME type image/gif, and JPEG graphics have the MIME type image/jpeg. Every document must have a MIME type in order for a browser to know how to handle it.

In the example above, Lehigh's web server sent a Microsoft Word document to your browser. The browser knew how to handle the document (treat it as a Microsoft Word document) only because the server had tagged the document with MIME type application/msword. Furthermore, the server knew to attach the MIME type application/msword to the file because it had the file extension .doc.

So, there is a two-step process in order for a client to properly handle the document:

  1. Lehigh's server must attach the correct MIME type, based on the file extension.
  2. The client must know how to handle a document of the given MIME type.
It may already be clear at this point that the function of the .htaccess file, in this case, is to teach the server new MIME types. Lehigh's server already knows how to handle many file types, and attaches the correct MIME types automatically; the list of MIME types and file extensions it is configured to use automatically is in the file MIME types. If you wish to use MIME types which are not configured already, you must include statements in a .htaccess file which tell the server how to map file extensions to MIME types.
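The first step, mapping a file extension to a MIME type, can be sketched in Python. This is an illustration only, not the server's code: the function names are hypothetical, and the fallback type text/plain is an assumption made for the sketch rather than a statement about Lehigh's configuration.

```python
import os


def parse_addtype(lines):
    """Build {extension: MIME type} from AddType lines.

    Each line is expected to look like "AddType type/subtype .ext";
    other lines are ignored. Extensions are stored without the leading
    period and in lower case.
    """
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "AddType":
            mime_type = parts[1]
            for ext in parts[2:]:
                mapping[ext.lower().lstrip(".")] = mime_type
    return mapping


def content_type_for(filename, mapping, default="text/plain"):
    """Pick the MIME type for `filename` from its extension."""
    ext = os.path.splitext(filename)[1].lower().lstrip(".")
    return mapping.get(ext, default)
```

Given the three AddType lines from the example .htaccess file, content_type_for("speedup.doc", mapping) returns application/msword, which is the tag the browser then uses to choose a helper application.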

Summary

In order to serve nonstandard document types:
  1. Configure the client: configure the browser to launch an appropriate helper application for the MIME type. Note that this must be done by each person who wishes to view the document.
  2. Configure the server: if the MIME type you wish to use is not in the configuration of Lehigh's server, add an .htaccess file that defines the proper MIME type.

Combining .htaccess File Options

A .htaccess file can contain both AddType and <Limit> statements:
AddType application/powerpoint .ppt
AddType application/msword .doc
<Limit GET>
order deny,allow
deny from all
allow from lehigh.edu
</Limit>
This file would both limit access to Lehigh, and allow serving Word and PowerPoint documents.


References

Both of the following documents are part of the documentation for NCSA httpd, the server Lehigh uses as its web server:

http://hoohoo.ncsa.uiuc.edu/docs/tutorials/user.html
http://hoohoo.ncsa.uiuc.edu/docs/setup/access/Overview.html


Written by drb2@lehigh.edu.
Last modified by drb2@lehigh.edu on 12 March 1997.