Restricting Access and Serving Nonstandard Document Types
in your web space
Technical Bulletin 15
updated - 3/02 jah
Table of Contents
- Introduction
- General Security Issues
- Restricting Access to Lehigh
- Restricting Access via AFS (Lehigh) ID and password
- Restricting Access by Password
- Restricting Access by Group
- Combining Access Restrictions
- Serving Nonstandard Document Types
- Combining .htaccess File Options
Introduction
Lehigh's web server will generally serve files to anyone on the Internet
who requests them. For most purposes, this is good. But what if you would
like to restrict access to certain files? Three types of restriction are
discussed below: restricting access to those at Lehigh, to those who have a
Lehigh (AFS) ID and password, and to those who have an ID and password
you've given them.
Lehigh's web server, like most web servers, generally serves HTML (HyperText
Markup Language) files. It also serves JPEG and GIF graphics files, and a
few other file types, as standard fare. But what if you would like to share,
say, Microsoft Word or PowerPoint files via the Web? This can be done, but
requires configuration on both the server end and the client end.
In each of the above two cases, you wish Lehigh's web server to consider
certain documents as extraordinary, and not treat them in the standard way.
One or more files called .htaccess placed in your web data directories
can instruct Lehigh's server to treat documents in a nonstandard way, allowing
you both to restrict access to documents and to serve nonstandard document
types. (Note that UNIX treats files whose names begin with a period (.)
as hidden, and this includes the .htaccess file. Hidden files do
not appear in the standard directory listing--ls--but can be seen
with a complete directory listing--ls -a.)
Notation conventions used in this document include:
- Input and output likely to be seen on a command line are shown in
a typewriter-style font, like command.
- If part of a command line needs particular information you must
specify, it is emphasized, like command your_userid.
General Security Issues
Lehigh's web server has access to all the files you give it access to,
and it will serve those documents to anyone who asks to see them. Which documents
are they? Generally, all the files in the public directory within
your home directory in AFS file space. You probably store all of your WWW
files in your public/www-data/ directory, because of the convention
on Lehigh's web server that ~your_userid/ is a shortcut
for your_userid/public/www-data/.
AFS permissions
Generally, Lehigh's web server has access to all of the files in the public
directory within your home directory in AFS file space. This is because the
AFS permissions have been automatically set for you to allow the web server
access.
The AFS permission which allows the web server access to your files is
the permission system:anyuser rl. (You can look at the AFS permissions
for a directory by executing the command fs la in that directory.)
This means that any files to which the web server has access are also accessible
to anyone who has access to the AFS file system. This includes users at Carnegie
Mellon University, Michigan State University, and, of course, everyone at
Lehigh.
Therefore, the "restricted" access which we will discuss in the rest of
this bulletin should be understood in the correct context: Lehigh's web
server will honor these restrictions. Lehigh's web server, though, has nothing
to do with AFS access or permissions (except that it must have permission
to see the files), so the files may still be seen by any AFS user.
NOTE: This means that anyone at Lehigh can see any
of your web pages by going into AFS, whether or not restrictions are placed
on those pages or directories.
The index.html file
You may have noticed that sometimes Lehigh's web server returns an index
of a directory.
Specifically, the server generates an index when the path specified in
the URL is a directory and when one more condition is satisfied--the
directory does not contain a file called index.html, or a pointer
to another index file.
If, on the other hand, the directory does contain a file called index.html,
the server returns the index.html file rather than generating an
index.
Use of the index.html file can be considered a security feature:
it keeps people from looking at a directory listing of your files through a web browser.
If you have set up your webspace since October 15, 1999, it will include
a symbolic link that aliases your userid.html file to index.html.
You can remove this symbolic link, if you choose, by going into your webspace
(from the Network Server, enter shell, then cd webspace) and
entering rm index.html.
A cleaner way to set the index file to be your home page, if you already
have a home page home.html in the same directory, is to add a DirectoryIndex
command to a hidden .htaccess file in your directory, so:
DirectoryIndex home.html
Now, instead of looking for index.html, the server will look for home.html.
This command is 'inherited' by all the subdirectories under that directory,
so you may want to include an order of precedence:
DirectoryIndex home.html index.html mine.html
This means that the server, in that directory and any subdirectories,
will first look for a file called home.html, then for index.html, and finally,
as a last resort, mine.html.
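The order of precedence can be sketched in a few lines of Python. This is only an illustration of the lookup described above, not the server's actual code; the function name choose_index and the sample file names are made up for the example.

```python
def choose_index(files_in_directory, directory_index_line="DirectoryIndex index.html"):
    """Return the first DirectoryIndex candidate present in the directory.

    Hypothetical helper: it mimics the lookup order described in the text.
    """
    # Everything after the directive name is a candidate, in priority order.
    candidates = directory_index_line.split()[1:]
    for name in candidates:
        if name in files_in_directory:
            return name
    # No candidate found: the server would generate a directory index instead.
    return None


line = "DirectoryIndex home.html index.html mine.html"
print(choose_index({"index.html", "mine.html", "notes.txt"}, line))
print(choose_index({"notes.txt"}, line))
```

In the first call there is no home.html, so the sketch falls through to index.html; in the second call nothing matches, which corresponds to the server generating an index of the directory.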
Restricting Access to Lehigh
The hidden file .htaccess can be used to restrict access to your
files, so that only people at Lehigh can view them.
If there is a file called .htaccess in the directory containing
the requested file, or in any parent directory, the rules contained in the
.htaccess file apply to the requested file. For example, a directory can
have a .htaccess file which restricts access to that directory so that only
people accessing the page from Lehigh can see it. Such a .htaccess file contains:
<Limit GET>
order deny,allow
deny from all
allow from 128.180
</Limit>
This file is specifying an algorithm: Deny access to everyone, unless
they are from 128.180, meaning that their IP address begins with
128.180 (which covers all computers at Lehigh).
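The algorithm can be sketched in Python as follows. This is a simplified model of the deny-then-allow logic, not the server's implementation; the function names are invented for the example, and the assumption is that a partial address such as 128.180 is compared one whole octet at a time.

```python
def matches_prefix(ip_address, prefix):
    """True if the leading octets of ip_address equal the octets in prefix."""
    prefix_octets = prefix.strip(".").split(".")
    return ip_address.split(".")[: len(prefix_octets)] == prefix_octets


def lehigh_only(ip_address, allowed_prefixes=("128.180",)):
    """Sketch of 'order deny,allow' with 'deny from all' plus 'allow from' lines."""
    allowed = False  # deny from all: every client starts out denied
    for prefix in allowed_prefixes:
        if matches_prefix(ip_address, prefix):
            allowed = True  # an allow line overrides the earlier deny
    return allowed


print(lehigh_only("128.180.2.9"))   # a Lehigh address
print(lehigh_only("128.18.0.1"))    # shares leading digits only, not whole octets
```

Comparing whole octets, rather than raw text, keeps an address such as 128.18.0.1 from being mistaken for a Lehigh address.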
To restrict access to the files in a directory, create a file called .htaccess
in that directory which contains the text shown above.
Restricting Access via AFS (Lehigh) ID and password
All Lehigh users have an AFS ID and password. That AFS ID is
the same one they use to check their mail, log into the Network Server,
use the Compute Servers, etc. Therefore, it is possible to restrict use of
files in a directory to users who prove that they are valid Lehigh users
by entering their AFS ID and password.
Example
The URL http://www.lehigh.edu/computing/web/limits/afs/
can only be viewed by people with Lehigh userids.
If there is a file called .htaccess in the directory containing
the requested file, or in any parent directory, the rules contained in the
.htaccess file apply to the requested file. The directory containing
the file in the example above has a .htaccess file which restricts
access to that directory so that only people who supply a valid AFS ID and
password can see it. This .htaccess file contains:
AuthExternal afs
AuthName AFSauthentication
AuthType Basic
require valid-user
This file is specifying an algorithm: Deny access to everyone, unless they
can authenticate with an AFS ID and password. The browser will show a window
or entry box prompting the user for an AFS ID and password. If AFS says that
the combination is correct, the user is allowed access. (Note: the AFS ID is
not recorded anywhere.)
To restrict access to the files in a directory, create a file called .htaccess
in that directory which contains the text shown above.
You can also specify that only specific people at Lehigh be allowed to
access your pages:
AuthExternal afs
AuthName AFSauthentication
AuthType Basic
require user userid1 userid2 userid3
where userid1, userid2, and userid3 are the Lehigh userids of the people you
wish to allow access. (To allow access to a list of people kept in a file,
create a group file (described below), point to it in the .htaccess file with
AuthGroupFile (also described below), and change the require line to:
require group groupname.)
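The difference between require valid-user and require user can be sketched in Python. This is a simplified model for illustration only: the function name is hypothetical, the userids abc1 and xyz2 are made up, and the sketch assumes the password check has already happened elsewhere.

```python
def require_satisfied(require_line, authenticated_user):
    """Evaluate 'require valid-user' or 'require user a b c' for one user.

    authenticated_user is the userid whose password was already verified,
    or None if authentication failed.
    """
    words = require_line.split()
    if authenticated_user is None or len(words) < 2 or words[0] != "require":
        return False
    if words[1] == "valid-user":
        # Any user who authenticated successfully is accepted.
        return True
    if words[1] == "user":
        # Only the userids listed on the line are accepted.
        return authenticated_user in words[2:]
    raise ValueError("this sketch handles only 'valid-user' and 'user'")


print(require_satisfied("require valid-user", "abc1"))
print(require_satisfied("require user abc1 xyz2", "abc1"))
print(require_satisfied("require user abc1 xyz2", "qrs3"))
```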
Restricting Access by Password
An example
The .htaccess file can also be used to restrict access to your
files to individuals you have given a name and password.
Example
- The directory http://www.lehigh.edu/comp/web/limits/password
is restricted to those who can supply the correct name and password. In this
example, the name is jiminy and the password is cricket.
- If you follow the above link or otherwise try to access this restricted
directory, you will be shown a dialog box requesting a name and password.
- After providing the correct name and password, you will be
served the requested document, in this case, an index of the directory restricted/.
Setting up the .htaccess file
If there is a file called .htaccess in the directory containing
the requested file, or in any parent directory, the access rules contained
in the .htaccess file apply to the requested file. The directory
containing the file in the above example has a .htaccess file which
restricts access to that directory to those who can supply the correct name
and password. In this case, the .htaccess
file contains:
AuthUserFile /afs/cc/home/inlts/public/comp/web/limits/restricted/.passwordfile
AuthName TB15RealmA
AuthType Basic
require user jiminy
This .htaccess file requires some explanation:
- AuthUserFile defines the file which contains the name and
password information. The following section discusses this file.
- AuthName defines the name of the authentication realm.
To understand the function of the authentication realm, it helps to understand
that the server requests authentication for every document which
is in a protected area. You don't need to provide the authentication name
and password every time, though, because your browser stores the name and
password after you enter it, and provides it automatically to the server
on subsequent requests. If you go to a second protected area, your browser
will ask for the new name and password. How does the browser know which
name and password go with which protected area? By distinguishing between
different authentication realms, which should have different names specified
by AuthName.
- AuthType Basic is specified. This
is the only authorization type described in HTTP/1.0 (HyperText Transfer Protocol).
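To make the realm mechanism concrete, here is a small Python sketch of what a browser does with Basic authentication. It relies on one fact about the protocol: the name and password are joined with a colon and base64-encoded in the Authorization header. The RealmCache class is a hypothetical model of the browser's memory, not code from any real browser.

```python
import base64


def basic_auth_header(name, password):
    """Build the value of the Authorization header for Basic authentication."""
    token = base64.b64encode(f"{name}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def decode_basic_auth(header_value):
    """Recover (name, password) from a Basic Authorization header value."""
    scheme, token = header_value.split(" ", 1)
    if scheme != "Basic":
        raise ValueError("not a Basic authentication header")
    name, password = base64.b64decode(token).decode("utf-8").split(":", 1)
    return name, password


class RealmCache:
    """Hypothetical model of a browser remembering one login per realm."""

    def __init__(self):
        self._logins = {}

    def remember(self, host, realm, name, password):
        self._logins[(host, realm)] = (name, password)

    def header_for(self, host, realm):
        # Returns None for an unknown realm, in which case the browser prompts.
        login = self._logins.get((host, realm))
        return basic_auth_header(*login) if login else None


cache = RealmCache()
cache.remember("www.lehigh.edu", "TB15RealmA", "jiminy", "cricket")
print(cache.header_for("www.lehigh.edu", "TB15RealmA") is not None)
print(cache.header_for("www.lehigh.edu", "TB15RealmB"))
```

Note that decode_basic_auth recovers the password without any key: Basic authentication encodes the name and password but does not encrypt them.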
Create your .htaccess file as follows:
- AuthUserFile file
Substitute the name of your password file. (The next section discusses
how to create this file.) Keep the password file in a restricted area. The
password file name must start with /afs/cc/home/..., and
most likely starts with /afs/cc/home/your_userid/public/www-data/...
- AuthName realm
Use a name which is descriptive of the restricted area.
- AuthType Basic
- require user username
username here is the one which users must enter to access
the restricted area.
Setting up the password file
The password file is not created manually, but with a command run at the
UNIX command line. This command, htpasswd, can be run on the Compute
Server (use SSH to connect to CS/Rigel) but not on the Network
Server or the SGI workstations.
Example
In order to create the jiminy/cricket username/password
combination, I did the following:
>htpasswd -c .passwordfile jiminy
Adding password for jiminy.
New password:
Re-type new password:
>
(Feel free to look at the resulting password file,
if you like. You will see that the passwords are encoded, so they cannot be
read directly.)
A few comments on this example:
- In the example above, -c was necessary to create a new file;
by default, the htpasswd command adds username/password combinations
to an existing file.
- I specified the username on the command line, and was prompted for
the password.
- The file .passwordfile is created in my present working
directory.
Create your password file using the htpasswd command:
- htpasswd -c file user
After setting up both the .htaccess file and the password file
referred to in it, the files in the directory with the .htaccess
file will be restricted.
Restricting Access by Group
Access can also be restricted to a group of users, each of whom has an
individual password.
Three files are required to set up access restriction by group:
- The .htaccess file, as we've seen before.
- A group file, which is new, defines the users who are in
the group.
- A password file, as we've also seen before. In this case, though,
the password file must contain an entry for each user.
In this case, the .htaccess
file contains:
AuthUserFile /afs/cc/home/inlts/public/www-data/comp/web/limits/group/.passwordfile
AuthGroupFile /afs/cc/home/inlts/public/www-data/comp/web/limits/group/.groupfile
AuthName TB15RealmB
AuthType Basic
require group pep-boys
The group file defines the group. In this case, the group file .groupfile contains:
pep-boys: manny moe jack
The password file .passwordfile
used in this case was created with a sequence of three UNIX commands:
htpasswd -c .passwordfile manny
htpasswd .passwordfile moe
htpasswd .passwordfile jack
In order to restrict access by group, then:
- Create an .htaccess file as described above, under Setting up the .htaccess file, but with two
changes:
- Add a line:
- AuthGroupFile groupfile
- Change the require line from
- require user username
to
- require group groupname
- Create a password file as described above, under "Setting up the
password file."
- Create a group file, with the name given in the AuthGroupFile
line, defining the users who are in the group. Use the example as a template.
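Since the password file must contain an entry for each user in the group, it can be worth checking the two files against each other. The Python sketch below does that under two assumptions: that the group file has the form shown in the example (group name, colon, members separated by spaces), and that each line of the password file has the form name:encoded-password. The function names and the placeholder password strings are invented for the example.

```python
def parse_group_file(text):
    """Map each group name to its list of members."""
    groups = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        name, members = line.split(":", 1)
        groups[name.strip()] = members.split()
    return groups


def password_file_users(text):
    """Collect the user names that have an entry in the password file."""
    return {line.split(":", 1)[0] for line in text.splitlines() if ":" in line}


def members_missing_passwords(group_name, group_text, password_text):
    """List group members who have no entry in the password file."""
    members = parse_group_file(group_text).get(group_name, [])
    known = password_file_users(password_text)
    return [member for member in members if member not in known]


group_text = "pep-boys: manny moe jack\n"
# Placeholder strings stand in for the encoded passwords htpasswd would write.
password_text = "manny:PLACEHOLDER1\nmoe:PLACEHOLDER2\n"
print(members_missing_passwords("pep-boys", group_text, password_text))
```

Here jack is reported as missing, which corresponds to forgetting the third htpasswd command in the example above.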
Combining Access Restrictions
It is possible to provide multiple tests for access to a directory. Entering
multiple restrictions in order, along with a satisfy stipulation, will do
it. To make the user satisfy all of the requirements, include a 'satisfy
all' line; to allow the user access after satisfying any one of the requirements,
include a 'satisfy any' line.
In order to restrict access to people with AFS IDs and passwords, who also
happen to be coming in from Lehigh IP addresses:
<Limit GET>
order deny,allow
deny from all
allow from 128.180
AuthExternal afs
AuthName AFSauthentication
AuthType Basic
require valid-user
satisfy all
</Limit>
To require anyone not coming in from Lehigh to give a correct AFS ID and
password for access:
<Limit GET>
order deny,allow
deny from all
allow from 128.180
AuthExternal afs
AuthName AFSauthentication
AuthType Basic
require valid-user
satisfy any
</Limit>
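The effect of the satisfy line in the two files above can be summarized in a short Python sketch. It is a simplified model, with an invented function name, in which the address test and the password test have already been reduced to two yes/no answers.

```python
def access_granted(from_lehigh_address, valid_afs_login, satisfy="all"):
    """Combine an address test and a login test the way 'satisfy' describes."""
    if satisfy == "all":
        # Both tests must pass.
        return from_lehigh_address and valid_afs_login
    if satisfy == "any":
        # Passing either test is enough.
        return from_lehigh_address or valid_afs_login
    raise ValueError("satisfy must be 'all' or 'any'")


# Off-campus user with a valid AFS login:
print(access_granted(False, True, satisfy="all"))
print(access_granted(False, True, satisfy="any"))
```

With satisfy all, the off-campus user is refused even with a valid login; with satisfy any, the valid login is enough.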
Serving Nonstandard Document Types
Lehigh's server can serve nonstandard document types, such as Microsoft
Word or PowerPoint documents, to any user on the Web. Using the server to do
this requires configuration of both the web server and the user's client.
The server setup uses the .htaccess file mechanism. Setting up
the client is dependent on the client being used; we will discuss setting
up Netscape 2.0, though other client setups will be similar.
Configuring the client
Before you can see this in action, you must have your client (browser)
configured. Therefore, we will first go through the configuration of one
browser, Netscape 2.0:
- Under the Options menu, choose the General Preferences
item.
- In the Preferences window, choose the Helpers tab.
- Press the Create New Type button.
- In the Configure New Mime Type window, for Mime Type,
type application, and for Mime SubType, type msword
then press the OK button.
- In the File Extensions field, type doc.
- In the area labelled Action, select Launch the Application,
then press the Browse button and find an application which will be
used to view the Microsoft Word documents. Appropriate applications would
be the Microsoft Word Viewer, Microsoft Word, or another application (such
as WordPerfect) which can interpret Microsoft Word documents.
Configuring browsers other than Netscape 2.0 is very similar.
Configuring the server: MIME Types
Another .htaccess file is required to configure the server. If
there is a file called .htaccess in the directory containing the
requested file, or in any parent directory, the rules contained in
the .htaccess file apply to the requested file. In this example,
the .htaccess file reads as follows:
AddType application/powerpoint .ppt
AddType application/msword .doc
AddType application/wordperfect .wpd
Whenever a web server sends information to a web client, it first sends
a tag describing the information it is sending. The tag allows the client
to process the information properly. These tags are called MIME types
(Multipurpose Internet Mail Extensions types); they were originally meant
for use with e-mail attachments.
There are standard MIME types every graphical browser uses. HTML documents
have the MIME type text/html. GIF graphics have the MIME type image/gif,
and JPEG graphics have the MIME type image/jpeg. Every document
must have a MIME type in order for a browser to know how to handle it.
Suppose Lehigh's web server sends a Microsoft Word document
to your browser. The browser knows how to handle the document (treat it as
a Microsoft Word document) only because the server has tagged the document
with MIME type application/msword. Furthermore, the server knows
to attach the MIME type application/msword to the file because it
has the file extension .doc.
So, there is a two step process in order for a client to properly handle
the document:
- Lehigh's server must attach the correct MIME type, based on the file
extension.
- The client must know how to handle a document of the given MIME
type.
It may already be clear at this point that the function of the .htaccess file, in this case,
is to teach the server new MIME types. Lehigh's server already knows how
to handle many file types, and attaches the correct MIME types automatically;
the list of MIME types and file extensions it is configured to use automatically
is kept in the server's MIME types file. If you wish to use
MIME types which are not configured already, you must include statements
in a .htaccess file which tell the server how to map file extensions
to MIME types.
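The mapping from file extension to MIME type can be sketched in Python. This illustrates the idea rather than the server's own code: the function names are invented, and treating extensions as case-insensitive is an assumption of the sketch.

```python
import os


def parse_addtype(lines):
    """Build an extension-to-MIME-type table from AddType lines."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "AddType":
            mime_type = parts[1]
            for extension in parts[2:]:
                # Assumption: extensions are matched without regard to case.
                table[extension.lstrip(".").lower()] = mime_type
    return table


def mime_type_for(filename, table, default=None):
    """Look up the MIME type the server would attach to filename."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return table.get(extension, default)


table = parse_addtype([
    "AddType application/powerpoint .ppt",
    "AddType application/msword .doc",
])
print(mime_type_for("syllabus.doc", table))
print(mime_type_for("slides.ppt", table))
print(mime_type_for("letter.wpd", table))
```

The last lookup returns None because no AddType line covers .wpd, which is the situation in which you would add another statement to the .htaccess file.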
Summary
In order to serve nonstandard document types:
- Configure the client: configure the browser to launch an appropriate
helper application for the MIME type. Note that this must be done for each person
who wishes to view the document.
- Configure the server: if the MIME type you wish to use is not in
Lehigh's server's configuration, add a
.htaccess file which defines the proper MIME type.
Combining .htaccess File Options
A .htaccess file can contain both AddType and <Limit>
statements:
AddType application/powerpoint .ppt
AddType application/msword .doc
<Limit GET>
order deny,allow
deny from all
allow from lehigh.edu
</Limit>
This file would both limit access to Lehigh, and allow serving Word and
PowerPoint documents.
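Because a .htaccess file applies to the directory containing it and to everything beneath it, more than one .htaccess file may affect a single document. The Python sketch below lists the files that would apply to a requested path, outermost first. It works on a plain list of path names rather than a real file system, and the function name and the sample paths are made up for the example.

```python
from pathlib import PurePosixPath


def applicable_htaccess(requested_file, existing_files):
    """Return the .htaccess files that apply to requested_file, outermost first."""
    existing = {PurePosixPath(path) for path in existing_files}
    directory = PurePosixPath(requested_file).parent
    # The file's own directory, then each parent directory in turn.
    chain = [directory, *directory.parents]
    hits = [d / ".htaccess" for d in chain if d / ".htaccess" in existing]
    return [str(path) for path in reversed(hits)]


files = [
    "/www-data/.htaccess",
    "/www-data/private/notes/.htaccess",
    "/www-data/other/.htaccess",
]
print(applicable_htaccess("/www-data/private/notes/a.html", files))
```

In this example the file in /www-data/other/ is not listed, since that directory is not a parent of the requested document.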
References
Both of the following documents are part of the documentation for NCSA
httpd, the server Lehigh uses as its web server:
http://hoohoo.ncsa.uiuc.edu/docs/tutorials/user.html
http://hoohoo.ncsa.uiuc.edu/docs/setup/access/Overview.html
Written by drb2@lehigh.edu.
Last modified by drb2@lehigh.edu on 12 March 1997.