Overview of the Core
The main concept in the Library is a "request/response" model in which an
application issues a request for a URI (URL). The Library then tries
to fulfill the request as efficiently as possible by fetching
the URL from the origin server, a proxy server, a gateway, the
local file system, or a locally cached version. Data is delivered
back to the application as soon as it is ready, which keeps the
access delay for the application to a minimum. From version 3.0, the
Library supports threads, including its own platform-independent thread
model called "libwww threads". This allows
multiple requests to be handled simultaneously without blocking the
application while it waits for data.
Requests and Responses
The "request/response" model is illustrated in the control/data
diagram shown below. The diagram shows only the core modules - the
other modules are "pasted in" later. Note that the Library code is to
the right of the thick vertical line (green), and the application to
the left can be any type of application, for example a proxy or a
client. The architecture of the Library supports clients and
proxies in much the same way, because the difference matters little to
the Library: a client has a user interface whereas a server has a
network interface. It is a good idea to study the Line Mode Browser and the httpd as reference implementations using
the Library to see this duality.
Another thing to note is that, from version 3.1, the Library supports
large-scale data flow from the application to the network as well as
from the network to the application. This has an important impact on
the functionality that can be put into applications, for example
collaborative authoring via the Web. The
architecture behind this is described in the section "Post Webs - an API for PUT and POST".

The thin lines (red) are control flow, the thick lines (blue) are data
flow, and the "lightning" (magenta) is control flow resulting from
events handled by the Library. Let's see what happens when an
application issues a request. The description assumes an
event loop - this can be either the one provided by the Library or an
external event loop provided by the application. The section on libwww threads explains in more detail how this can
be set up. The numbers refer to the figure above.
- The event manager is waiting for an event from the application.
This can, for example, be a user clicking on a link or typing a
number on the keyboard.
- When an event arrives, the event manager calls the user event
handler provided by the application.
- The user event handler issues a request by calling the access
manager.
- The access manager contacts the cache manager to see if the object
is already cached. If data is to be sent to the network (for
example using the HTTP PUT method) then the cache manager is not
consulted.
- If the cache manager says "no" then the protocol manager is
contacted to download the object. If "yes" then the cache file is
accessed.
- The cache manager can also contact the protocol manager directly
if the cached object turns out to be stale or a reload has been
explicitly requested by the application.
- If the protocol manager can successfully access the data object
then the cache manager is contacted in order to cache or refresh the
object.
- When data arrives, from either the cache manager or the
protocol manager, it is passed to the format manager, which handles any
data format conversion requested by the application.
- The protocol manager can recursively call the access manager in case of
redirection or inadequate access authentication for the request
(after prompting the user).
- The converted data is handed either from the network to the
application or from the application to the network as it becomes
ready. If no data is ready, control is given back to the event
manager.
- When data is ready to be sent to or received from the network, the
event manager calls the protocol manager directly to handle the data.
- When the request terminates, the application is called with the
result of the request so that it can, for example, update a history list
of visited documents.
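The retrieval half of the steps above can be sketched in a few lines. This is a minimal illustration only, written in Python rather than in the Library's own C, and every name in it (Request, CacheManager, ProtocolManager, AccessManager, load, on_terminate) is hypothetical and not part of the libwww API. It assumes a dictionary standing in for the network, leaves out the event loop, redirection, authentication and the PUT/POST direction, and simplifies a forced reload to skipping the cache lookup altogether.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Request:
    uri: str
    force_reload: bool = False
    trace: List[str] = field(default_factory=list)  # managers visited, in order


class CacheManager:
    """Hypothetical in-memory cache keyed by URI."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def lookup(self, uri: str) -> Optional[bytes]:
        return self._store.get(uri)

    def save(self, uri: str, data: bytes) -> None:
        self._store[uri] = data


class ProtocolManager:
    """Stands in for the network: `origin` maps URIs to documents."""

    def __init__(self, origin: Dict[str, bytes]) -> None:
        self._origin = origin

    def fetch(self, uri: str) -> bytes:
        return self._origin[uri]


class AccessManager:
    def __init__(self, cache: CacheManager, protocol: ProtocolManager,
                 convert: Callable[[bytes], str],
                 on_terminate: Callable[[Request], None]) -> None:
        self.cache = cache
        self.protocol = protocol
        self.convert = convert            # plays the role of the format manager
        self.on_terminate = on_terminate  # application callback

    def load(self, request: Request) -> str:
        data = None
        if not request.force_reload:
            # Ask the cache first.
            request.trace.append("cache")
            data = self.cache.lookup(request.uri)
        if data is None:
            # Cache said "no" (or was skipped): go to the network,
            # then cache or refresh the object.
            request.trace.append("protocol")
            data = self.protocol.fetch(request.uri)
            self.cache.save(request.uri, data)
        # Convert the data to the format the application asked for.
        request.trace.append("format")
        result = self.convert(data)
        # Tell the application that the request has terminated.
        self.on_terminate(request)
        return result


history: List[str] = []
manager = AccessManager(
    CacheManager(),
    ProtocolManager({"http://example.org/": b"<p>hello</p>"}),
    convert=lambda raw: raw.decode("ascii"),
    on_terminate=lambda req: history.append(req.uri),
)
first = Request("http://example.org/")
second = Request("http://example.org/")
page = manager.load(first)   # goes to the "network" and fills the cache
manager.load(second)         # served from the cache
```

The trace list on each request records which managers were visited, so the first request passes through the cache, the protocol and the format stages, while the second one never reaches the protocol stage. The on_terminate callback is where an application could keep its history list.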
This description is the "macro" description of how the core modules
interact. In the rest of this document we shall see more of the
details of what is going on inside the core modules and what data
structures are involved. Note that by using a threaded model, the
Library can handle multiple requests simultaneously. An example of how
to do this is described in the section "Libwww Threads".
- Access Manager
- The access manager is the main entry point for requesting a data
object pointed to by a URI. It has a set of methods that allow the
application to request different services, for example to get a URI,
post to a URI, or search a URI. When the application issues a request,
the access manager does the following:
- Translates the URI according to the rules given, for example by a
rule file. It also looks for gateways or proxies that should be
contacted for a specific access method. Rules can be registered
dynamically as described in the User's Guide.
- If the request is on the local file system, the access manager
verifies that access to local files is allowed. This is not always
the case, for example when the Line
Mode Browser is used as a login shell for telnet sessions.
- Then the cache manager is contacted to see if the object
has already been accessed. The application might administer a memory
cache, in which case this is consulted before the cache.
- If the data object is not cached then the protocol module is
called to actually perform the access to the network.
- When a request is to be terminated, the access manager can log the
result of the request to a local file so that the "browse route" can
be reconstructed.
- Protocol Manager
- The protocol manager is invoked by the access manager in order to
access a document not found in memory or in cache. The manager
consists of a set of protocol modules handling the access schemes
HTTP, FTP, NNTP, Gopher, WAIS, Telnet, and access to the local file
system. The protocol modules are registered dynamically (although they
are statically linked) and the User's Guide describes how
modules can be registered. Each protocol module is responsible for
establishing the connection to the remote server (or the local
file system) and for extracting information using a specific access
method. When data arrives from the network, it is passed on to the
format manager.
- Format Manager
- The stream format manager takes care of the transportation of
streams of data from the network to the application and vice versa. It
also performs any parsing and data format conversion requested, based
on a set of registered format converters and a simple algorithm for
selecting the best conversion. Like the protocol modules, data format
converters can be registered dynamically, and the current set of
streams includes, among others, MIME, SGML, HTML, and LaTeX.
- Cache Manager
- The cache manager is used to save data objects once they have been
downloaded from the network. The cache uses the hierarchy indicated
in the URLs as a way to identify items in the cache. It is still under
construction and requires a lot of work before it is a highly efficient
cache manager.
- Error Manager
- This module manages an information stack which contains
information about all errors that occurred during the communication with a
remote server, or simply information about the current state. Using a
stack for this kind of information makes nested error messages
possible, where each message can be classified and filtered
according to its impact on the current request, for example "Fatal",
"Non-Fatal", "Warning" etc. The filtering can be used to decide which
level of messages will be passed back to the user.
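The idea of a classified, filterable information stack can be illustrated as follows. This is a sketch under stated assumptions, in Python rather than in the Library's C, and it does not reproduce the Library's own data structures: the names Severity, ErrorEntry, ErrorStack, push and report are hypothetical, the INFO level is added here for plain state information, and the numeric ordering of the levels is an assumption.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    # Assumed ordering, from least to most serious.
    INFO = 0
    WARNING = 1
    NON_FATAL = 2
    FATAL = 3


@dataclass
class ErrorEntry:
    severity: Severity
    message: str


class ErrorStack:
    """Hypothetical per-request stack of error and state information."""

    def __init__(self) -> None:
        self._entries: List[ErrorEntry] = []

    def push(self, severity: Severity, message: str) -> None:
        self._entries.append(ErrorEntry(severity, message))

    def report(self, minimum: Severity) -> List[str]:
        """Return messages at or above `minimum`, most recent first."""
        return [entry.message
                for entry in reversed(self._entries)
                if entry.severity >= minimum]


stack = ErrorStack()
stack.push(Severity.INFO, "Connecting to host")
stack.push(Severity.WARNING, "Retrying after timeout")
stack.push(Severity.FATAL, "Connection refused")
visible = stack.report(Severity.WARNING)  # the INFO entry is filtered out
```

Because entries are pushed as the request proceeds, reading the stack from the top gives the most recent (and usually most specific) message first, with the messages that led up to it nested beneath.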
- Net Manager
- The net manager provides an interface for handling asynchronous
sockets, which are an integral part of the Library.
- Event Manager
- The event manager is a "session layer" that decides which thread
should be the active thread. A thread can be either an internal libwww
thread or an external thread, for example a Posix thread, and the
event manager can itself be either the internal Library manager or an
external event manager. Currently the internal event manager uses a
select function call to decide which thread should be made the active
one; an external event manager, however, can use another decision
model. One of the design ideas behind the event manager is that it can
be extended to a full session layer manager handling, for example, the
control of an HTTP-NG connection. The event manager is described
together with the internal thread model in the section "Libwww Threads".
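To give a feel for how a select call can decide which request becomes active, here is a small sketch. It is written in Python rather than in the Library's C and is not the Library's event manager: the class EventManager and its methods register, unregister and run_once are hypothetical names, and local socket pairs stand in for network connections.

```python
import select
import socket
from typing import Callable, Dict, List


class EventManager:
    """Hypothetical select-based dispatcher: one handler per socket."""

    def __init__(self) -> None:
        self._handlers: Dict[socket.socket, Callable[[socket.socket], None]] = {}

    def register(self, sock: socket.socket,
                 handler: Callable[[socket.socket], None]) -> None:
        self._handlers[sock] = handler

    def unregister(self, sock: socket.socket) -> None:
        self._handlers.pop(sock, None)

    def run_once(self, timeout: float = 1.0) -> int:
        """Wait for readable sockets and call their handlers once."""
        if not self._handlers:
            return 0
        readable, _, _ = select.select(list(self._handlers), [], [], timeout)
        for sock in readable:
            handler = self._handlers.get(sock)  # may have been unregistered
            if handler is not None:
                handler(sock)
        return len(readable)


received: List[bytes] = []
events = EventManager()
a_app, a_net = socket.socketpair()  # stands in for one pending request
b_app, b_net = socket.socketpair()  # stands in for a second pending request
events.register(a_net, lambda s: received.append(s.recv(1024)))
events.register(b_net, lambda s: received.append(s.recv(1024)))

b_app.sendall(b"second request")    # only the second connection has data
handled = events.run_once(timeout=1.0)

for s in (a_app, a_net, b_app, b_net):
    s.close()
```

Only the connection that has data ready is serviced; the idle one costs nothing and does not block the loop. An external event manager could replace run_once with a different decision model while keeping the same registration interface.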
Henrik Frystyk, libwww@w3.org, November 1995