Web Basics (Day 1 of 1): Fundamental Concepts

What is a Network?
What is the Internet?
What is the World-Wide Web?

What is a Network?

A network is a set of computers that have been interconnected so that they can communicate with one another, and thereby share information and resources.

Physical Infrastructure

[Figure: Diagram of Example Network]

Networks come in various sizes. In a small network, sometimes known as a "Local Area Network" or LAN, all of the computers are connected to a single "hub" that allows them to communicate with other computers on the same network. Larger networks can be created by connecting two or more smaller networks together. Such interconnected networks are known as "internets" (small "i"). In such cases, each smaller network is referred to as a "subnet." Special devices known as "routers" connect these subnets to a high-speed backbone, or to external long-haul networks, as shown in the schematic here. Notice that this means that a particular computer can participate in both local and wide-area networking at the same time, using the same hardware.

Information Flow

To properly understand modern computer networks, it is important to recognize some things about the way information moves within the network. Unlike the early telephone system, which used special switches to create a dedicated circuit between a sender and a receiver (this is called a "circuit-switched network"), computer networks use a method called "packet switching." What this means is that every communication is broken down into small units, called packets, that are independently routed to their destinations.

Not every computer is directly connected to every other computer; instead, information must often pass through a series of routers or other computers to reach its destination, which is specified by an address. Each computer on the network (each "node") has a globally unique address; no other computer anywhere in the world has the same address.

In some cases, there may be more than one path to a given destination. Different packets, even from within the same message, may take different paths to reach their destinations (and may not arrive in the order in which they were sent). Some packets may be lost or become corrupted in transit. The network software is responsible for sorting all of this out: making sure that every packet arrives intact (resending it if necessary) and is returned to its proper place within the original message.
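The behavior described above can be sketched in a few lines of Python. This is only an illustrative simulation, not how any real network is implemented: the function names, the packet size, the loss rate, and the idea of resending in "rounds" are all assumptions made for the example.

```python
import random

def to_packets(message, size):
    """Split a message into (sequence_number, chunk) packets."""
    return [(seq, message[i:i + size])
            for seq, i in enumerate(range(0, len(message), size))]

def send(packets, loss_rate, rng):
    """Simulate an unreliable network: drop some packets, reorder the rest."""
    delivered = [p for p in packets if rng.random() >= loss_rate]
    rng.shuffle(delivered)
    return delivered

def transfer(message, size=8, loss_rate=0.3, seed=1):
    """Deliver a message, resending lost packets until all have arrived."""
    rng = random.Random(seed)
    packets = to_packets(message, size)
    received = {}
    rounds = 0
    while len(received) < len(packets):
        # Only the packets that have not yet arrived are sent again.
        missing = [p for p in packets if p[0] not in received]
        for seq, chunk in send(missing, loss_rate, rng):
            received[seq] = chunk
        rounds += 1
    # Sequence numbers let the receiver restore the original order.
    return "".join(received[seq] for seq in sorted(received)), rounds

text = "Every communication is broken down into small units."
result, rounds = transfer(text)
print(result == text, "after", rounds, "round(s)")
```

The sequence number attached to each packet is what makes it possible to put the message back together no matter what order the packets arrive in.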

This is a little bit like taking a large cargo, loading it onto trucks, and sending it across the country via the highway system. A good dispatcher can compensate for slow roads and places where bridges are out by routing traffic around them. Rather than clogging one small road, the trucks can be divided among several parallel routes, and all can thus reach their destinations faster. The end result is a system that is robust and highly reliable, because of its flexibility.

Protocols and Layering

One term that is frequently mentioned in connection with discussions about networks is "protocol." In diplomatic circles, this term refers to formalized customs or procedures governing conversations, debates, meetings, and other sorts of diplomatic exchange. This includes proper forms of address, ritual greetings ("Good Morning, Your Highness" and so on), who gets to speak first, when one is entitled to make a rebuttal, what sort of terminology is appropriate, and other similar sorts of things.

Well, with respect to computers, the term still means pretty much the same thing. It describes formalized rules or accepted customary usage governing communications between computers. The exact sequence of messages used to request, establish, maintain, or break off communication, and the format of their contents, is specified by a protocol. This is what makes computer-based communication possible. If both the sending and receiving computer programs didn't agree on how they were going to communicate, each would see the other's messages as nothing more than gibberish.

But the rules aren't specified all at once, at one level. Just as a manager might delegate certain details to subordinates, for example specifying that a package be delivered, but leaving the decision of whether to use Federal Express or UPS or some other shipper up to an assistant, computer protocols are divided into several layers. Lower-level protocols specify details more closely related to the workings of the hardware, while higher-level protocols specify broad functions more closely related to behavior that a typical user would see. All of the protocols at the various levels need to work together, however, and so these related protocols are seen as belonging to a family: higher-level protocols built upon services provided by lower-level protocols.

What is the Internet?

"The" Internet (capital "I" this time) is the global information system consisting of computers on various interconnected networks using the TCP/IP family of protocols. Notice that the Internet is also an internet, according to our earlier definition, but that the reverse is not true. There are many other large networks (such as those operated by America Online and Compuserve, among others), which, whether they connect to the Internet or not, are not part of it (because they are based upon protocols other than TCP/IP). There are also private TCP/IP-based networks that are not connected to the Internet.

TCP/IP Protocol Family

The layers used in the TCP/IP family of protocols are shown in the table below. In fact, the family takes its name from two protocols at different levels: the transport-level protocol TCP, or "Transmission Control Protocol," and the network-level protocol IP, or "Internet Protocol." The protocols most users deal with directly, naturally, are the application-level protocols. Occasionally, however, it is useful to be aware of the other layers.

Layer         Protocols
Application   Telnet, FTP, SMTP, etc.
Transport     TCP, UDP
Network       IP
Link          Ethernet, Token-Ring, PPP

Starting from the bottom and working upward...

Link-layer protocols deal with physically interfacing with the medium (such as coaxial cable) used to interconnect the computers into a network. This includes the operating system device driver software and the network interface card. Notice that different portions of the Internet may actually be built upon different physical network types.

Network-layer protocols deal with assigning addresses to all of the computers on the Internet, and with the movement of packets between source and destination. Routing of packets takes place at this level. (There are some other protocols at this level, such as ICMP, that are used for reporting errors and diagnosing delivery problems.)

Transport-layer protocols deal with data flow between computers. Dividing messages into packets, reassembling messages from packets, acknowledging receipt of packets, and arranging for retransmission of missing or damaged packets all take place here. TCP, in particular, uses the IP packet system to create reliable connections. (UDP provides faster, connectionless delivery that is not guaranteed to be reliable, for specialized purposes like real-time audio or video, where occasional packet loss would not be disastrous.)

Application-layer protocols deal with the specifics of a particular application, such as electronic mail. Each application corresponds to a particular type of service supported by the Internet.
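One way to picture how the four layers cooperate is to imagine each layer wrapping the data in its own header on the way down, and the receiving computer unwrapping the headers in reverse order on the way up. The Python sketch below is a deliberately simplified model of this idea; the bracketed text headers are made up for illustration and do not resemble the real header formats used by any of these protocols.

```python
LAYERS = ["Application", "Transport", "Network", "Link"]  # top to bottom

def encapsulate(data):
    """Wrap data in one (made-up) header per layer on the way down."""
    for layer in LAYERS:
        data = f"[{layer}]{data}"
    return data

def decapsulate(frame):
    """Strip the headers in reverse order on the way back up."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not frame.startswith(header):
            raise ValueError(f"expected {layer} header")
        frame = frame[len(header):]
    return frame

frame = encapsulate("hello")
print(frame)               # [Link][Network][Transport][Application]hello
print(decapsulate(frame))  # hello
```

Notice that each layer only ever looks at its own header; what lies inside is simply cargo to be passed along.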

Client-Server Computing

An important aspect of most Internet applications is that they are based on a client-server model. This is basically a divide-and-conquer strategy for managing information and communication resources. In this division of labor, one program, known as "the server," is seen as holding, or controlling access to, some resource (or alternatively as providing some service). Another program (usually, but not always, on another computer), known as "the client," is seen as making a request for, and becoming the recipient of, this resource or service. Notice that the complete application involves both the client and the server; neither alone is sufficient.

Typically, the user interacts directly with the client program on his or her local computer, and uses it to retrieve information from some remote computer that is running the server program. Often, the server program is little more than a robot: it runs continuously, waiting for incoming requests, which it then attempts to fulfill. The "conversation" between client and server is a series of requests and responses: the client asks for something, and the server either satisfies the request (by providing whatever was asked for) or returns an error message indicating why it can't.
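The request-and-response conversation can be sketched as a tiny Python program. To keep the example self-contained, the "client" and "server" here run in one program and talk over a connected pair of sockets rather than across a real network; the one-line request format, the OK/ERROR replies, and the resource named "motd" are all invented for this illustration and do not correspond to any real Internet protocol.

```python
import socket
import threading

RESOURCES = {"motd": "Welcome to the server."}  # hypothetical resource table

def serve(conn):
    """Server: wait for one-line requests and answer each in turn."""
    with conn, conn.makefile("r", encoding="utf-8") as reader:
        for line in reader:  # loop ends when the client disconnects
            name = line.strip()
            if name in RESOURCES:
                reply = f"OK {RESOURCES[name]}\n"
            else:
                reply = f"ERROR no such resource: {name}\n"
            conn.sendall(reply.encode("utf-8"))

def request(conn, reader, name):
    """Client: send one request and wait for the matching response."""
    conn.sendall((name + "\n").encode("utf-8"))
    return reader.readline().strip()

# A connected pair of sockets stands in for the network.
client_end, server_end = socket.socketpair()
worker = threading.Thread(target=serve, args=(server_end,))
worker.start()

reader = client_end.makefile("r", encoding="utf-8")
good = request(client_end, reader, "motd")
bad = request(client_end, reader, "missing")
reader.close()
client_end.close()  # hanging up lets the server loop finish
worker.join()

print(good)  # OK Welcome to the server.
print(bad)   # ERROR no such resource: missing
```

The server does nothing on its own initiative: it only responds, either with the resource or with an error explaining why it can't.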

Sideline: The Domain Name System

One of the most fundamental of the underlying Internet services is one most users take for granted; they usually aren't aware they are using a service at all. This is DNS, the Domain Name System. DNS is what makes it possible to specify a computer by name, for example "cs1.cc.lehigh.edu", instead of by number (128.180.1.27, in the case of cs1).

In the discussion above, we mentioned that addresses are assigned to individual computers throughout the Internet at the network layer. These addresses are known as IP addresses, and all routing of packets from one computer to another on the Internet uses them to specify source and destination. Whenever you specify a particular computer by name, this name must first be translated (the technical term is "resolved") into an IP address. So, for example, if you try to connect to a remote computer by name using Telnet, you are actually using two services: Telnet and DNS. Telnet requests DNS resolution to obtain an IP address it can use to find the remote computer it is supposed to connect to.

There is no centralized listing of all of the computers on the Internet, either by name or in any other form. DNS is a distributed database of names. DNS servers at many sites around the globe contain information about the computers at that site, plus information about other nearby DNS servers which can be queried about computers outside the site (Lehigh maintains two official DNS servers as part of its campus network). DNS servers also cache (remember) addresses for computers that have been requested recently, in case they are needed again. Requests for distant computers are passed along until an address is found, a server is found that can definitely assert that no such machine exists, or a specified timeout period expires (this last case means that sometimes the system will fail to find a distant computer which does exist--in such instances, because of caching, trying again may solve the problem).
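The combination of local knowledge, passing requests along, and caching can be illustrated with a toy model in Python. This is a rough sketch only: real DNS servers are organized and queried in a more elaborate way, and the class name, the single "upstream" server, and the host "www.example.org" with its address are assumptions made for the example (only the cs1 name and address come from the text above).

```python
class NameServer:
    """A toy name server: knows its own site and where to ask next."""

    def __init__(self, records, upstream=None):
        self.records = dict(records)  # names this server is responsible for
        self.upstream = upstream      # another server to query, if any
        self.cache = {}               # recently resolved names
        self.queries = 0              # how many requests reached this server

    def resolve(self, name):
        self.queries += 1
        if name in self.records:
            return self.records[name]
        if name in self.cache:
            return self.cache[name]
        if self.upstream is None:
            # This server can definitely assert that no such machine exists.
            raise LookupError(f"no such host: {name}")
        address = self.upstream.resolve(name)
        self.cache[name] = address    # remember it in case it is needed again
        return address

# Hypothetical distant server and a campus server that refers to it.
distant = NameServer({"www.example.org": "192.0.2.10"})
campus = NameServer({"cs1.cc.lehigh.edu": "128.180.1.27"}, upstream=distant)

print(campus.resolve("cs1.cc.lehigh.edu"))  # 128.180.1.27
print(campus.resolve("www.example.org"))    # 192.0.2.10
print(campus.resolve("www.example.org"))    # 192.0.2.10
print(distant.queries)                      # 1
```

The second request for the distant name never leaves the campus server, because the answer was cached the first time.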

Sideline: Local Area Networks

One common source of confusion about networking concerns the way Local Area Networks (LANs) fit into the picture. It's not as hard as it seems. There are several areas of difference, and they are much more significant than the areas of overlap. LANs, per se, are not considered to be part of the Internet, even though the same physical hardware (Ethernet cable and network interface cards, mostly) may be used to access both.

For one thing, LANs are typically used to share different kinds of resources than are made available by Internet applications: LANs are often used to provide access to shared application programs (which are run directly from the LAN as if they were on the user's hard drive), shared disk space for common storage, and shared peripheral devices (such as printers and scanners). LANs (as the name implies) are typically used for creating smaller networks, usually within a single organization (or department). They can also be used to provide internal electronic mail and messaging services, but Lehigh does not use them in this way, preferring to use Internet services for this.

More importantly, a LAN uses different protocols for moving information around the network than do Internet applications. LANs are also packet-based, and may use the same link-layer protocols (such as Ethernet), along with the same underlying hardware, but all higher-level functions are provided using different protocols. In Lehigh's case, where our LANs are based on Novell NetWare, the LAN protocol is called IPX. Both types of packets (IP and IPX) circulate through the network simultaneously, so a given portion of the network may be both a subnet of the Internet and a LAN at the same time. In fact, a single user may be actively using both types of functions at once: for example, loading and running an application program, such as Telnet, from the LAN in order to access an Internet service. That doesn't mean, however, that the two are the same thing.

What is the World-Wide Web?

The World-Wide Web is a collection of documents and services, distributed across the Internet, and linked together by hypertext links into an interconnected whole.

As you can see from this definition, the web is not the same thing as the Internet, but it is definitely related. It is a subset of the Internet, taken from a particular point-of-view.

Much of what is available via the web consists of web documents (sometimes misleadingly referred to as "home pages"). These are special multipart documents (documents which, although they are viewed as a single entity, actually consist of several separate and distinct files), which can incorporate hypertext features (links that lead you from document to document) and multimedia (special non-textual features, such as graphics, animations, video, sound, and interactive elements, including specially embedded programs). The base document of such a multipart document is a file which is mostly just text, plus some specialized commands (called "markup") which describe the structure of the overall compound document and determine where and how the other components (such as image files) are to be embedded within it. It is this markup which also creates the links between documents. The rules governing this markup form a simple language called HTML (which just stands for "Hyper Text Markup Language").
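To make this concrete, the Python sketch below holds a very small base document and uses the standard library's HTML parser to pick out the two things the markup contributes: links to other documents and references to separately stored components. The page contents and the file names "next.html" and "diagram.gif" are invented for the example.

```python
from html.parser import HTMLParser

# A made-up base document: mostly text, plus markup.
PAGE = """<html>
<head><title>Sample Page</title></head>
<body>
<h1>Sample Page</h1>
<p>See the <a href="next.html">next page</a>.</p>
<img src="diagram.gif" alt="Diagram">
</body>
</html>"""

class ComponentFinder(HTMLParser):
    """Collect the links and embedded files named in the markup."""

    def __init__(self):
        super().__init__()
        self.links = []     # other documents this one points to
        self.embedded = []  # separate files displayed as part of this one

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])
        elif tag == "img" and "src" in attrs:
            self.embedded.append(attrs["src"])

finder = ComponentFinder()
finder.feed(PAGE)
print("links:", finder.links)        # links: ['next.html']
print("embedded:", finder.embedded)  # embedded: ['diagram.gif']
```

What the reader sees as one page is thus two files here: the text file containing the markup, and the image file it names.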

However, the web incorporates more than just a set of documents of a special type. It also provides a way of accessing various Internet services, many of which were not specifically designed for use with the web. It is therefore also a new form of user interface to many different types of information available on the Internet.



URL: http://www.lehigh.edu/~inwww/seminar/basics/net-intro.html
Copyright © 1997 Lehigh University Information Resources