PHHTTPD
Zach Brown
Copyright © 2000 by Zach Brown
_________________________________________________________________
Table of Contents
1. [1]Introduction
[2]Architectural Overview
[3]Supported Systems
2. [4]Configuration File
[5]Overview
[6]Global config section
[7]Virtual Servers
3. [8]Logging
[9]Overview
[10]Configuration
[11]Format and Strange Behaviour
4. [12]Run Time Facilities
[13]Overview
[14]Log Rotating
[15]Status Reporting
_________________________________________________________________
Chapter 1. Introduction
phhttpd is an HTTP accelerator. It serves fast static HTTP fetches
from a local file-system and passes slower dynamic requests back to a
waiting server. It features a lean networking I/O core and an
aggressive content cache that help it perform its job efficiently.
_________________________________________________________________
Architectural Overview
phhttpd features a very slim I/O core. It does all its networking work
using non-blocking system calls driven by whatever event model is most
appropriate for the host operating system. This allows a single
execution context to handle as many client connections as the event
model dictates.
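As a rough illustration of the idea, and not phhttpd's actual code, here is a minimal sketch of one execution context driving several non-blocking sockets from a single event loop. It assumes Python's selectors module, which picks an event mechanism suited to the host system. The handler names and the fixed response are made up for the example.

```python
import selectors
import socket


def serve_once(sel, timeout=1.0):
    """Run one pass of the event loop; return how many events were handled."""
    events = sel.select(timeout)
    for key, _mask in events:
        handler = key.data            # callback registered alongside the socket
        handler(sel, key.fileobj)
    return len(events)


def on_accept(sel, listener):
    """Accept a new client and watch it for readable data."""
    conn, _addr = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, on_readable)


def on_readable(sel, conn):
    """Answer any request bytes with a fixed response, then close."""
    data = conn.recv(4096)
    if data:
        # A real server would queue unsent bytes and wait for writability;
        # this tiny response is assumed to fit in the socket buffer.
        conn.send(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")
    sel.unregister(conn)
    conn.close()


def make_listener(host="127.0.0.1", port=0):
    """Create a non-blocking listening socket (port 0 picks a free port)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    listener.setblocking(False)
    return listener
```

A caller would register the listener with on_accept as its handler and then call serve_once() in a loop. No call in the loop blocks on a single client, so one thread can carry every connection the selector can watch.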
phhttpd's job is to serve static content as quickly as it possibly
can. To do this it maintains a cache of content in memory. When a
request is serviced, phhttpd saves a reference to the on disk content
and whatever HTTP headers are dependent on the content. Next time a
request for this content is received, phhttpd can service it very
quickly. This cache can be prepopulated at run time, or can
be built dynamically as requests come in. Its size may also be capped
by the administrator so that it doesn't overwhelm a system.
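The capped cache can be pictured with a small sketch. This is only an illustration under stated assumptions: the text does not say which entries phhttpd prunes first, so least-recently-used eviction is assumed here, and the class and method names are invented.

```python
from collections import OrderedDict


class ResponseCache:
    """Cache of prepared responses with an administrator-set entry cap."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()      # path -> (headers, body)

    def get(self, path):
        """Return the cached (headers, body) pair, or None on a miss."""
        entry = self._entries.get(path)
        if entry is not None:
            self._entries.move_to_end(path)    # mark as recently used
        return entry

    def put(self, path, headers, body):
        """Store a response, evicting old entries once the cap is exceeded."""
        self._entries[path] = (headers, body)
        self._entries.move_to_end(path)
        while len(self._entries) > self.max_entries:
            # Assumed policy: drop the least recently used entry.
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)
```

Prepopulating would amount to calling put() for every file found before requests arrive. Building the cache dynamically means calling put() on each miss.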
phhttpd is a threaded standalone daemon. The number of threads is
currently statically defined at run time. Incoming connections are
evenly balanced among the running threads, regardless of what content
they may be serving. Connections are served by the thread that
accepted them until the transfer is done.
_________________________________________________________________
Supported Systems
phhttpd is currently only expected to build and run on Linux systems
using glibc2.1 under a kernel that supports passing POLL* information
over real-time SIGIO signals. This means later 2.3.x kernels or a
2.2.x kernel that has been patched.
I badly want this to change. If you're interested in doing porting
work to other Operating Systems, please do let me know.
_________________________________________________________________
Chapter 2. Configuration File
Overview
phhttpd uses an XML config file format to express how it should behave
while running. More information on XML may be found at
[16]http://www.w3.org/XML/
phhttpd's configuration centers around the concept of virtual servers.
For us, a virtual server may be thought of as the merging of a
document tree and the actions phhttpd takes while serving that
content.
phhttpd.conf may be thought of as having two main parts: the global
section, which defines properties that are consistent across the
entire running phhttpd server, and the virtual sections, each of which
describes properties that apply only to one virtual server. There is
only one global section, while multiple virtual sections are
allowed.
_________________________________________________________________
Global config section
The global section defines properties of the running server that don't
apply to a single virtual server. It should be enclosed in
Global config entities
cache max=NUM
Sets the maximum number of cached responses that will be held
in memory. Each cached response holds a minimal amount of
memory. More importantly, each cached response holds an open
file descriptor to the file with real content and an mmap()ed
region of that content. phhttpd will start pruning the cache
when it notices either of these two resources coming under
pressure, but has no way to easily deduce that it is running low
on memory. The administrator may set this value to set an upper
bound on the number of responses to keep in memory.
control file=PATH
This specifies the file that will be used to talk with
phhttpd_ctl.
globallog file=PATH
This specifies the file to which global messages will be
logged.
mime file=PATH
This specifies the file that contains the mapping of file
extensions to MIME types. It should be of the form:
text/sgml sgml sgm
video/mpeg mpeg mpg mpe
timeout inactivity=NUM
Controls various network connection timeouts. 'inactivity' sets
the amount of time that a connection can be idle before phhttpd
will forcibly disconnect it. inactivity defaults to 0, which
lets the connections idle until TCP timeouts take effect.
sendfile
Enabling this option tells phhttpd to use sendfile() rather
than write()ing from an mmap()ed region. Avoiding calling
mmap() will shorten the amount of time it takes to build cached
responses.
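The mime file layout described above (a MIME type followed by the extensions that map to it) is simple enough to parse by splitting on whitespace. The following is a minimal sketch of such a reader, not phhttpd's parser. The function name is invented, and skipping lines with fewer than two fields is an assumption.

```python
def parse_mime_map(text):
    """Map each file extension to the MIME type listed first on its line."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue                  # blank line, or a type with no extensions
        mime_type, extensions = fields[0], fields[1:]
        for ext in extensions:
            mapping[ext] = mime_type
    return mapping


# The two sample lines from the text.
SAMPLE = """\
text/sgml sgml sgm
video/mpeg mpeg mpg mpe
"""

if __name__ == "__main__":
    for ext, mime_type in sorted(parse_mime_map(SAMPLE).items()):
        print(ext, mime_type)
```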
_________________________________________________________________
Virtual Servers
A Virtual Server can be thought of as the abstraction of serving up a
content tree ("docroot" in Apache speak). There are a set of
attributes that are used to define a virtual server. These attributes
are used to decide which virtual server will process a client's
request. Then there are attributes which define how the content is
served.
A virtual server must have a docroot. The virtual tag in the config
file has a docroot attribute that must be set.
...
There can be as many virtual sections in the configuration file as one
likes.
Virtual config entities
md5
This enables the generation of the Content-MD5: header. This
greatly increases the cost of creating a cached response for
this virtual, because the MD5 function must be applied to the
entire content of the response. Once the response is created,
though, there is no per-request overhead.
prepop
This will cause phhttpd to traverse the entire docroot at
initialization time and prepare cached responses for all the
files it finds. This happens in the background during normal
operation, so there is no dramatic increase in the time it
takes for phhttpd to start serving connections.
name
This tag surrounds the string that will be used to identify the
server. This string will be compared to the Host: header given
in the request from the client, or will be compared to the
'host part' of the full URL if that was given. This will be
used in combination with the network address and port pair to
determine if a request should be served by a virtual server.
listen v4=DOT.TED.QU.AD port=PORT
This virtual server will be chosen to serve an incoming request
if that request was made to the network address specified in
this entity. There can be as many of these as one likes in a
given virtual server, and '*' may be specified for either
parameter to indicate that all addresses or ports should match.
logs
The logs section of the virtual server defines the per-virtual
log files that should be written to during operation. See the
following section on logging.
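The way name and listen combine to select a virtual server can be sketched as follows. This is an illustration only. The class names, host names, and docroot paths are hypothetical, and exact string comparison of the host name is a simplification (a real Host: header may carry a port or differ in case).

```python
from dataclasses import dataclass, field


@dataclass
class Listen:
    v4: str       # dotted quad, or "*" for any address
    port: str     # port number as text, or "*" for any port

    def matches(self, addr, port):
        """True if the local address and port fit this listen entry."""
        return self.v4 in ("*", addr) and self.port in ("*", str(port))


@dataclass
class Virtual:
    name: str
    docroot: str
    listens: list = field(default_factory=list)


def pick_virtual(virtuals, host, addr, port):
    """Return the first virtual whose name and some listen entry both match."""
    for virt in virtuals:
        if virt.name != host:
            continue
        if any(entry.matches(addr, port) for entry in virt.listens):
            return virt
    return None


if __name__ == "__main__":
    virtuals = [
        Virtual("www.example.com", "/srv/www", [Listen("*", "80")]),
        Virtual("dev.example.com", "/srv/dev", [Listen("10.0.0.5", "*")]),
    ]
    print(pick_virtual(virtuals, "dev.example.com", "10.0.0.5", 8080))
```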
_________________________________________________________________
Chapter 3. Logging
"All kids love log!"
_________________________________________________________________
Overview
phhttpd maintains log buffers for each log it writes to. Logged
events are put in these buffers at reporting time rather than being
immediately written to disk. These logs are written as they are filled
during normal operation, or at regular intervals. This greatly reduces
the performance impact of keeping detailed logs.
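The buffering scheme described above can be sketched in a few lines. This is a minimal illustration, not phhttpd's implementation. The class name, the buffer size, and the flush interval are all assumptions chosen for the example.

```python
import time


class BufferedLog:
    """Collect entries in memory; write them when full or after an interval."""

    def __init__(self, sink, max_bytes=4096, interval=5.0, clock=time.monotonic):
        self.sink = sink              # any object with a write(str) method
        self.max_bytes = max_bytes
        self.interval = interval
        self.clock = clock
        self._buf = []
        self._size = 0
        self._last_flush = clock()

    def log(self, entry):
        """Queue one entry; flush only if the buffer has filled up."""
        line = entry + "\n"
        self._buf.append(line)
        self._size += len(line)
        if self._size >= self.max_bytes:
            self.flush()

    def tick(self):
        """Call periodically; flushes pending entries once the interval passes."""
        if self._buf and self.clock() - self._last_flush >= self.interval:
            self.flush()

    def flush(self):
        """Write everything queued so far in a single call to the sink."""
        self.sink.write("".join(self._buf))
        self._buf.clear()
        self._size = 0
        self._last_flush = self.clock()
```

The saving comes from turning many small writes into one larger write per flush. The price is that entries still in the buffer are lost if the process dies.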
_________________________________________________________________
Configuration
phhttpd keeps interesting logs at virtual server granularity. The
action of recording logs is specified by including an entity in the
log section of a virtual for the log source that is to be kept.
There is an entity for each source of logging, and attributes to that
entity define where it is logged to. It looks something like this:
...
mode is the octal permissions mode of the file that is to be opened.
As it is parsed by dumb routines, a leading 0 is highly recommended.
file is the file the logged events will be written to. The LOG_SOURCE
is one of:
access   Successfully answered requests
agent    The value given in the 'User-Agent' HTTP request header
referer  The string given in the 'Referer' HTTP request header
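Why a leading 0 matters for mode can be shown with a small sketch. It assumes, without confirmation from the source, that the value is read the way C's strtol() reads digits with base 0: a leading "0" selects octal and anything else is taken as decimal. The function name is invented.

```python
def parse_mode(text):
    """Parse a digits-only mode string under the assumed strtol-like rule.

    A leading "0" means octal; otherwise the digits are read as decimal.
    """
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


if __name__ == "__main__":
    for text in ("0644", "644"):
        value = parse_mode(text)
        # Show the permission bits each spelling would actually produce.
        print(text, "->", oct(value))
```

Under that assumption "0644" yields the usual rw-r--r-- bits, while "644" is read as decimal 644, which is octal 1204 and almost certainly not what was intended.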
_________________________________________________________________
Format and Strange Behaviour
phhttpd log entries are contained within a single line in a text file.
They contain the time the log entry was written, an opaque token that
is associated with the connection that caused the log entry, followed
by the actual entry.
The contents of the 'referer' and 'agent' log entries are simply the
string that was given with the header. The contents of the 'access'
log are a little more interesting. It has the decoded relative URL that
was asked for, followed by the total bytes that were transferred, and
the time in seconds that it took to transfer.
387f7a45 387f7a45800210ac8910500 /index.html - 2132 0
is an entry from an 'access' log.
The first field is the time in seconds since the Unix epoch, a.k.a.
time_t. The second field is associated with the client connection that
caused the log entry. It is constant for the duration of the
connection, and is written to all the log entries, of whatever type,
that are generated. This allows a log parser to do more complete
connection granularity analysis. As it happens, this opaque token is
currently built up of the time the client was connected, its remote
and local network address, etc, but these values must _not_ be parsed
as they may change in the future.
Entries generated by a thread will be written in chronological order.
If, however, multiple threads are sharing an output file the resulting
entries may not be written in chronological order. It is up to the
parsing programs to use the 'time' field to sort by, if they care
about chronological order.
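A log parser following the rules above might look like the sketch below. It is an illustration under assumptions: the time field in the sample entry looks hexadecimal and is read as such, the opaque token is carried along untouched, and the remaining fields are left as plain strings because their exact layout is not fully specified. Splitting on whitespace also assumes the URL contains no spaces. All names are invented.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class LogEntry(NamedTuple):
    when: datetime     # time the entry was written
    token: str         # opaque connection token; never parsed
    fields: list       # everything after the token, as written


def parse_entry(line):
    """Split one log line into time, connection token, and remaining fields."""
    stamp, token, *rest = line.split()
    # Assumption: the time field is seconds since the epoch in hexadecimal.
    when = datetime.fromtimestamp(int(stamp, 16), tz=timezone.utc)
    return LogEntry(when, token, rest)


def in_time_order(lines):
    """Sort entries written by several threads by their time field."""
    entries = (parse_entry(line) for line in lines if line.strip())
    return sorted(entries, key=lambda entry: entry.when)


def by_connection(entries):
    """Group entries by connection token, using it only as an equality key."""
    groups = {}
    for entry in entries:
        groups.setdefault(entry.token, []).append(entry)
    return groups
```

Because every log type carries the same token for a connection, by_connection() can be fed entries from the access, agent, and referer logs together to rebuild a per-connection view.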
_________________________________________________________________
Chapter 4. Run Time Facilities
Overview
While phhttpd is running it listens to a 'control' socket for messages
from the administrator. The currently provided phhttpd_ctl program
allows the administrator to minimally interact with phhttpd. This
provides both control and status reporting.
phhttpd_ctl always wants a --control argument that specifies the
control socket of the running phhttpd daemon. This should match the
control file specified in the config file.
_________________________________________________________________
Log Rotating
phhttpd can be told to rotate its logs so that existing logs may be
processed.
The --rotate argument to phhttpd_ctl tells phhttpd to rename the
existing files to a unique name, open new files with the previously
used names, then close the renamed logs and start using the newly
created files. phhttpd_ctl will output the names of the newly created
files which will be safe to use once the command exits.
The --reopen argument to phhttpd_ctl tells phhttpd to close the
existing file logs and reopen the files with the filenames that were
configured. This implies that an external entity has moved the files
to new names and wants phhttpd to stop using them.
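The two sequences can be sketched on a POSIX system as follows. This only illustrates the order of operations described above and is not phhttpd's code. The function names and the scheme for building the unique name are hypothetical.

```python
import os
import time


def rotate(path, current):
    """Rename the live log, open a fresh file under the old name, close the old.

    Returns (new_file_object, renamed_path).
    """
    # Hypothetical unique-name scheme: timestamp plus process id.
    renamed = "%s.%d.%d" % (path, int(time.time()), os.getpid())
    os.rename(path, renamed)      # the open handle now refers to the renamed file
    fresh = open(path, "a")       # new file under the configured name
    current.close()               # pending writes land in the renamed file
    return fresh, renamed


def reopen(path, current):
    """Close the current handle and open the configured name again.

    Meant for the case where something else has already moved the old file.
    """
    current.close()
    return open(path, "a")
```

In both cases the old file is complete once its handle is closed, which is why a renamed log is only safe to process after the command has finished.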
_________________________________________________________________
Status Reporting
The --status argument to phhttpd_ctl tells phhttpd to return a quick
status blurb about the server. It contains miscellaneous information
about the running state of the server.
References
1. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN10
2. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN13
3. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN18
4. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN22
5. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN24
6. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN30
7. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN74
8. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN110
9. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN114
10. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN117
11. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN136
12. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN144
13. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN146
14. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN153
15. file://localhost/export/sunsite/users/gferg/work/00_phhttpd-HOWTO.html#AEN163
16. http://www.w3.org/XML/