

1. Introduction

USENET is a popular world-wide network consisting of thousands of discussion and informational ``news groups.'' Many of these are very popular and receive thousands of articles each day. In addition, many control messages are exchanged between news servers around the world, a large portion of which are article cancellation messages generated by anti-spam detection software. All of these articles and messages must be processed fast. If they are not, new incoming articles may be dropped.

Traditional Unix file system directories are structured as a flat, linear sequence of entries representing files. When the operating system needs to look up an entry in a directory with N entries, it may have to search all N entries to find the file in question. Portions of directories are often cached by the file system, so that subsequent lookups do not have to retrieve the data from disk. Table 1 shows the frequency of all file system operations that use a pathname on our news spool over a period of 24 hours.1 The table shows that the bulk of all operations are file lookups, so these should run very fast regardless of the directory size.

Table 1: Frequency of File System Operations on a News Spool

Operation    Frequency    % Total
Readdir          38371       0.48
Lookup         7068838      88.41
Unlink          432269       5.41
Create          345647       4.32
All other       110473       1.38
Total          7995598     100.00
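
The percentages in Table 1 follow directly from the raw counts. The short sketch below recomputes them; the counts are copied from the table, while the names `counts` and `share` are illustrative only and do not come from the original measurement tools.

```python
# Operation counts copied from Table 1 (24-hour sample of the news spool).
counts = {
    "Readdir": 38371,
    "Lookup": 7068838,
    "Unlink": 432269,
    "Create": 345647,
    "All other": 110473,
}

total = sum(counts.values())


def share(op):
    """Return the percentage of all operations taken by `op`, to 2 places."""
    return round(100.0 * counts[op] / total, 2)


# Print the table in the same order as Table 1.
for op in counts:
    print(f"{op:<10} {counts[op]:>8} {share(op):6.2f}")
print(f"{'Total':<10} {total:>8}")
```

Summing the Unlink and Create counts in the same way gives roughly 9.7% of all operations.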

Operations such as creating new news files and deleting old ones usually run synchronously and account for about 10% of news spool activity; these operations should also perform well on large newsgroups.

These requirements necessitate a powerful news server that can copy memory quickly and has fast disks and I/O. As demands grow, the ability of the news server to process articles diminishes to the point where it starts rejecting or ``dropping'' articles. The effort to upgrade a site's news server is significant: large amounts of data need to be copied to a new server as fast as possible, because while an upgrade is in progress, new articles are not processed and can be lost.

In practice, many sites have resorted to reducing the number of articles in use by removing large newsgroups from their distribution and expiring articles more often, sometimes as often as several times a day. Most site administrators have accepted the fact that their news servers will lose articles on occasion.

For example, our department runs an average-size news server. We have several hundred users and three feeds from neighboring sites. Our server has had two major upgrades in the past 5 years, and several smaller upgrades in between. The major upgrades were from SunOS 4.1.3 to Solaris 2.x, and then from Solaris 2.x to Linux 2.0. Each major upgrade included an upgrade of the news server (INN) software, a faster CPU, more memory, and more and faster disk space. Our previous news server ran on a Sun SparcStation 5 with 8GB of striped disk space, 196MB of RAM, and Fast Ethernet. However, the CPU and I/O bus could not keep up with traffic, and for the last two years of that server's life it lost more and more articles. Just before it was replaced, our old news server was dropping 50% of all articles.

A few months ago we upgraded our news server to an AMD K6/200MHz with faster disks and tripled the overall disk space available. We used top-of-the-line SCSI cards and Fast Ethernet adapters. We also upgraded the operating system to Linux 2.0.34, because the Linux operating system is small, fast, and highly optimized for the x86 platform. In addition, Linux's disk-based file system (ext2fs) has two features useful for optimizing disk performance:

It can turn off the updating of access times of files in the inodes; access times are not useful for news systems.

While ext2fs' on-disk directory structure is linear, it hashes cached entries in kernel memory for faster access.
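
The difference between a linear directory scan and a hashed cache of entries can be illustrated with a small toy model. This is only a sketch of the general idea, not ext2fs code: the `Directory` class, its fields, and the inode numbers are all hypothetical.

```python
class Directory:
    """Toy directory: a linear list of (name, inode) entries plus a name cache."""

    def __init__(self, entries):
        self.entries = list(entries)   # stands in for the linear on-disk layout
        self.cache = {}                # stands in for hashed in-memory entries
        self.comparisons = 0           # name comparisons made by linear scans

    def lookup(self, name):
        # Fast path: a cached name is found with one hash probe.
        if name in self.cache:
            return self.cache[name]
        # Slow path: scan entries in order, as in a flat linear directory.
        for entry_name, inode in self.entries:
            self.comparisons += 1
            if entry_name == name:
                self.cache[name] = inode
                return inode
        return None


# A 10000-entry "newsgroup" whose article names are the numbers 1..10000.
spool = Directory((str(n), 1000 + n) for n in range(1, 10001))
spool.lookup("10000")               # cold: scans all 10000 entries
cold = spool.comparisons
spool.lookup("10000")               # warm: served from the hashed cache
warm = spool.comparisons - cold
print(cold, warm)
```

In this model the first lookup of the last entry costs one comparison per directory entry, while the repeated lookup costs none. The cache only helps for names that have already been looked up; a cold lookup still pays the full linear cost.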

Since the upgrade, our new news server has dropped no articles and has kept up with traffic. However, we have noticed that its network utilization is over 80% and that more disk space is constantly being added. At the current growth rate, we expect it to reach the limits of its capabilities in a couple of years.

1.1 Current Solutions

Several current solutions are available to the problem of slow performance of large directories used with news servers. They fall into one of two categories:

Modified news servers that store articles in an alternate fashion [Fritchie97].

New native file systems that arrange directory entries so that they can be searched in less than linear time [Reiser98,Sweeney96].

These solutions suffer from several problems.

Our approach modifies neither the news server/client software nor the native file systems.

1.2 The Stackable Vnode Interface

Usenetfs is a small file system based on the loopback file system (lofs) [SMCC92]. Usenetfs mounts (``stacks'') itself on top of a news spool hierarchy and interfaces between existing news software and disk-based file systems, as seen in Figure 1. It makes a hierarchy of many small directories appear to be a single large flat directory.
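
As a rough illustration of presenting many small directories as one flat directory, the sketch below maps a flat article name to a path inside a small subdirectory and back. The bucketing rule (article number modulo 1000), the constant `BUCKETS`, and the function names are assumptions made for this example; they are not taken from the Usenetfs implementation.

```python
import posixpath

BUCKETS = 1000  # assumed number of subdirectories per newsgroup (hypothetical)


def lower_path(group_dir, article):
    """Map a flat article name to a path inside a small subdirectory.

    Hypothetical scheme: bucket = article number modulo BUCKETS.
    """
    bucket = int(article) % BUCKETS
    return posixpath.join(group_dir, f"{bucket:03d}", article)


def flat_listing(lower_paths):
    """Present lower-level paths as one flat directory of article names."""
    return sorted((posixpath.basename(p) for p in lower_paths), key=int)


# The group directory and article numbers are made up for the example.
paths = [lower_path("control/cancel", a) for a in ("123456", "123457", "7")]
print(paths)
print(flat_listing(paths))
```

News software would see only the flat names returned by `flat_listing`, while each lookup on the lower file system searches a subdirectory holding a small fraction of the articles.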

``Vnode Stacking'' [Heidemann94,Rosenthal92,Skinner93] is a technique for modularizing file system functions by allowing one vnode interface to call another. Before stacking existed, there was only a single vnode interface; higher-level operating system code called the vnode interface, which in turn called code for a specific file system. With vnode stacking, several vnode interfaces may exist and may call each other in order.

Figure 1: Stacked Vnode File System

The Usenetfs and vnode layers in Figure 1 are really at the same abstraction level; each one may call the other interchangeably.
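
The idea of layers that share one interface and call each other can be sketched in a few lines. This is a simplified user-level model rather than kernel vnode code; the classes `DiskFS` and `StackedFS` and the single `lookup` operation are hypothetical stand-ins for a full vnode interface.

```python
class DiskFS:
    """Stand-in for a disk-based file system at the bottom of the stack."""

    def __init__(self, files):
        self.files = dict(files)

    def lookup(self, path):
        return self.files.get(path)


class StackedFS:
    """Pass-through layer: same interface as the layer below, which it calls."""

    def __init__(self, lower):
        self.lower = lower
        self.calls = 0   # number of lookups that passed through this layer

    def translate(self, path):
        # A real layer would rewrite the path or data here; this one does not.
        return path

    def lookup(self, path):
        self.calls += 1
        return self.lower.lookup(self.translate(path))


disk = DiskFS({"news/123": "article 123"})
stack = StackedFS(StackedFS(disk))   # layers can be stacked on each other
print(stack.lookup("news/123"))
```

Because every layer exposes the same `lookup` operation, the caller cannot tell whether it is talking to the disk-based file system directly or to one or more layers stacked above it.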

Erez Zadok