Caching

This chapter describes how the Netscape Proxy Server caches documents. It also describes how you can configure the cache by using the online forms and how the cache directory structure is maintained automatically by the cache monitor and cache manager.

How caching works

Caching reduces network traffic and offers faster response time for clients who are using the proxy server instead of going directly to remote servers.

When a client requests a web page or document from the proxy server, the proxy server copies the document from the remote server to its local cache directory structure while sending the document to the client.

When any client requests a document that was previously requested and copied into the proxy cache, the proxy returns the document from the cache instead of retrieving the document from the remote server again (see Figure 7.1). If the proxy determines the file is not up to date, it refreshes the document from the remote server before sending it to the client.

Figure 7.1: Proxy document retrieval

Files in the cache are automatically maintained by the Netscape Proxy Server cache manager. The cache manager is a completely automated utility that performs cache clean-up on a regular basis to ensure that the cache doesn't get cluttered with out-of-date documents.

Dispersing files in the cache

The proxy server uses a specific algorithm to determine the directory where a document should be stored. This algorithm ensures equal dispersion of documents in the base directories, so the directories contain a small and nearly equal number of documents. Equal dispersion is important for two reasons:

The proxy uses the RSA MD5 algorithm (Message Digest 5) to reduce a URL to 8 characters, which it then uses for the file name of the document it stores in the cache.

The MD5 algorithm reduces the URL to 128 bits (16 bytes) of binary data. The proxy uses 48 bits (6 bytes) of this data to calculate an 8-character file name and determine the storage directory. This method would allow the proxy to cache over 70 million URLs.
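The mapping from URL to file name can be sketched as follows. This is only an illustration of the idea: the text does not say which 6 of the 16 digest bytes are used or how they become 8 characters, so taking the first 6 bytes, encoding them as URL-safe base64, and the function name cache_file_name are all assumptions.

```python
import base64
import hashlib


def cache_file_name(url: str) -> str:
    """Reduce a URL to an 8-character name (illustrative only)."""
    digest = hashlib.md5(url.encode("utf-8")).digest()  # 128 bits = 16 bytes
    six_bytes = digest[:6]  # assumption: the first 48 bits are the ones used
    # 6 bytes encode to exactly 8 base64 characters, with no padding.
    return base64.urlsafe_b64encode(six_bytes).decode("ascii")


name = cache_file_name("http://home.netscape.com/index.html")
print(len(name))  # 8
```

Because the name depends only on the URL, a later request for the same URL leads to the same file.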

Setting cache specifics

You can enable caching and control which types of protocols your proxy server will cache by setting the cache specifics. Cache specifics include the following items:

Note
Setting the specifics for a large cache is time-consuming and may cause the administration interface to time out. Therefore, if you are creating a large cache, use the command-line utilities to set cache specifics.
To set cache specifics:

  1. In the Server Manager, choose Caching|Specifics. The Cache Specifics form appears.
  2. Change the information.
  3. Click OK.
The following sections describe the items listed on the Cache Specifics form. These sections include information that will help you to determine which settings will best suit your needs.

Enabling the cache

Caching is an effective way to reduce network traffic for users of the proxy server. It also offers a faster response time for clients by eliminating the need to always retrieve a document from a remote server. Your proxy server will function most effectively whenever caching is enabled.

You can enable the cache on the Cache Specifics form.

Creating a cache working directory

If you set up caching during installation, you specified a directory for the proxy's cache structure. This directory is also used as a "working directory" for the cache manager. The working directory is where the proxy puts the temporary files that are related to caching. The actual cache files are under cache partitions. The working directory you specify on the Cache Specifics form is often the parent directory for the cache (though it does not need to be). All cached files appear in an organized directory structure under the caching directory. If you change the cache directory name or move it to another location, you have to tell the proxy the new location.

You can extend the cache directory structure to multiple file systems so that you can have a large cache structure divided on multiple smaller disks instead of keeping it all on one large disk. Each proxy server must have its own cache directory structure--that is, cache directories can't be shared by multiple proxy servers.

You can create the working directory on the Cache Specifics form.

Recording URLs

Your proxy server allows you to record all cached URLs in a URL list. You can identify which directory will hold all cached URL information and enable URL recording on the Cache Specifics form. For information on viewing and editing the URL list, see "Accessing cache manager information".

Note
The proxy does not have to record URLs to function properly. This feature exists so that the proxy administrator can view which URLs are in the cache. Continually recording URLs into a list may have an impact on the proxy's performance. To avoid this negative effect on performance, you can disable URL recording on the Cache Specifics form and view or manage URLs in the cache by using the command-line program extras/proxy/urldbgen. This program generates the URL list on command and does not affect the proxy's performance. See "Repairing the cache URL list" for more information about urldbgen.

Setting the cache size

Cache size is the maximum size to which the cache is allowed to grow. The maximum cache size is 64 GB. The amount of disk space available for the proxy cache has a considerable effect on cache performance. If the cache is too small, the cache manager program must remove cached documents more often to make room on the disk, and documents must be retrieved from content servers more often; both effects slow performance.

Large cache sizes are best because the more cached documents, the less the network traffic load and the faster the response time the proxy provides. Also, the cache manager removes cached documents if users no longer need them. Barring any file system limitations, cache size can never be too large; the excess space simply remains unused.

Netscape's proxy caching is designed to work efficiently at any size up to 64 GB. The exact cache size you choose depends on the number of people using your proxy server. For a single-user cache, 20 to 50 MB is usually enough. For a proxy that caches a multitude of documents, you might need to allocate an entire 2 GB to 4 GB disk partition for the cache. You can also have the cache split on multiple disk partitions. For more information on partitions, see "Adding and modifying cache partitions".

You can set the cache size on the Cache Specifics form.

Note
You might encounter problems with caching if the file system where the cache root resides has less disk space than the cache size you specify. Also, note that expanding the cache size requires a hard restart (shutdown/restart) for the changes to take effect.
Warning!
Changing the cache structure after installation requires that you reformat the structure and relocate existing files, which makes any alteration time-consuming. If you aren't sure what cache size to use, use 2 GB as the default value in the installation forms (this default can hold more than 2 GB of data and can be used with 3 to 5 GB caches).

Editing the cache capacity

You can edit the cache capacity through the Cache Specifics form as well as on the Cache Administration Operations form. For more information on editing the cache capacity, see "Setting the cache capacity".

Caching HTTP documents

Internally, caching HTTP documents differs from caching FTP and Gopher documents. HTTP documents offer caching functionality that documents of the other protocols do not. However, by setting up and configuring the cache properly, you can ensure that your proxy server will cache HTTP, FTP, and Gopher documents effectively.

All HTTP documents have a descriptive header section that the proxy server uses to compare and evaluate the document in the proxy cache and the document on the remote server. When the proxy does an up-to-date check on an HTTP document, it sends one request to the server that tells the server to return the document if the version in the cache is out of date. Often, the document hasn't changed since the last request and therefore is not transferred. This method of checking to see if an HTTP document is up-to-date saves bandwidth and decreases latency.
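The single-request up-to-date check described above corresponds to an HTTP conditional GET. The sketch below shows the idea without any network access; the function names are hypothetical, and the proxy's internal handling is not documented here.

```python
from email.utils import formatdate


def up_to_date_check_headers(cached_last_modified: float) -> dict:
    """Headers for a conditional GET; the argument is seconds since the epoch."""
    return {"If-Modified-Since": formatdate(cached_last_modified, usegmt=True)}


def body_to_serve(status: int, cached_body: bytes, response_body: bytes) -> bytes:
    """Choose what to send to the client after the up-to-date check."""
    if status == 304:  # Not Modified: the server sent no document body
        return cached_body
    return response_body  # the document changed; a new copy was transferred
```

When the remote server answers 304, only headers cross the network, which is where the bandwidth and latency savings come from.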

To reduce transactions with remote servers, the proxy server allows you to set a Cache Expiration setting for HTTP documents. The Cache Expiration setting tells the proxy to estimate if the HTTP document needs an up-to-date check before sending the request to the server. The proxy makes this estimate based on the HTTP document's Last-Modified date found in the header.

With HTTP documents, you can also use a Cache Refresh setting. This option specifies whether the proxy always does an up-to-date check (which would override an Expiration setting) or if the proxy waits a specific period of time before doing a check. Figure 7.2 shows what the proxy does if both an Expiration setting and a Refresh setting are specified. Using the Refresh setting decreases latency and saves bandwidth considerably.

Figure 7.2: Using the Cache Expiration and Cache Refresh settings with HTTP

Refresh setting                 Expiration                                      Result
Always do an up-to-date check   (Not applicable)                                Always do an up-to-date check
User-specified interval         Use document's "expires" header                 Do an up-to-date check if interval expired
User-specified interval         Estimate with document's Last-Modified header   Smaller value of estimate and expires header

Using the smaller value guards against getting stale data from the cache for documents that change frequently.

Setting the HTTP cache refresh interval

If you decide that you want your proxy server to cache HTTP documents, you need to determine whether it should always do an up-to-date check for documents in the cache or if it should check based on a Cache Refresh setting (up-to-date check interval). For HTTP documents, a reasonable refresh interval would be 4 to 8 hours, for example. The longer the refresh interval, the fewer the number of times the proxy connects with remote servers. Even though the proxy doesn't do up-to-date checking during the refresh interval, users can force a refresh by clicking the Reload button in the client (such as the Netscape Navigator); this makes the proxy force an up-to-date check with the remote server.

You can set the refresh interval for HTTP documents on either the Cache Specifics form or the Cache Configuration form. For more information on using the Cache Specifics form, see "Setting cache specifics", and for more information on using the Cache Configuration form, see "Configuring the cache".

Setting the HTTP cache expiration policy

You can also set up your server to check if the cached document is up-to-date by using a last-modified factor or explicit expiration information only.

Explicit expiration information is a header found in some HTTP documents that specifies the date and time when that file will become outdated. Not many HTTP documents use explicit Expires headers, so it's better to estimate based on the Last-modified header.

If you decide to have your HTTP documents cached based upon the Last-modified header, you need to select a factor to use in the expiration estimation. The factor is multiplied by the time between the last modification and the time that the document last had an up-to-date check. Smaller values make the proxy check documents more often. For example, suppose you have a document that was last changed ten days ago. If you set the last-modified factor to 0.1, the proxy interprets the factor to mean that the document is probably going to remain unchanged for one day (10 x 0.1 = 1). The proxy would, in that case, return the document from the cache if the document was checked less than a day ago.

On the other hand, in this same example, if the cache refresh setting for HTTP documents is set to less than one day, the proxy does the up-to-date check after that time has elapsed. The proxy always uses the value (cache refresh or cache expiration) that requires that it update the files most frequently.
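The interaction between the last-modified factor and the refresh interval can be sketched as a small calculation. The function and parameter names are hypothetical, times are in seconds, and treating the refresh interval as a simple upper bound on the estimate is our reading of the text rather than a documented rule.

```python
DAY = 24 * 60 * 60  # seconds


def estimated_lifetime(last_modified, last_checked, lm_factor):
    """Seconds the document is assumed to stay unchanged after last_checked."""
    return lm_factor * (last_checked - last_modified)


def needs_up_to_date_check(now, last_modified, last_checked, lm_factor,
                           refresh_interval=None):
    """True if the proxy should contact the remote server before answering."""
    limit = estimated_lifetime(last_modified, last_checked, lm_factor)
    if refresh_interval is not None:
        # The setting that forces the more frequent check wins.
        limit = min(limit, refresh_interval)
    return now - last_checked >= limit


# The example from the text: changed ten days before the last check, factor 0.1.
last_modified, last_checked = 0, 10 * DAY
half_day_later = last_checked + DAY // 2
print(needs_up_to_date_check(half_day_later, last_modified, last_checked, 0.1))  # False
print(needs_up_to_date_check(half_day_later, last_modified, last_checked, 0.1,
                             refresh_interval=6 * 60 * 60))  # True
```

Half a day after the last check, the one-day estimate alone lets the proxy answer from the cache, while a six-hour refresh interval forces a check.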

You can set the expiration setting for HTTP documents on both the Cache Specifics form and the Cache Configuration form. For more information on using the Cache Specifics form, see "Setting cache specifics", and for more information on using the Cache Configuration form, see "Configuring the cache".

Reporting HTTP accesses to the remote server

When a document is cached by the Netscape Proxy Server, it can be accessed many times before it is refreshed again. For the remote server, sending one copy to the proxy that will cache it still only represents one access, or "hit." The Netscape Proxy Server can count how many times a given document was accessed from the proxy cache between up-to-date checks, and then send that hit count back to the remote server in an additional HTTP request header (Cache-Info) the next time the document is refreshed. This way, if the remote server is configured to recognize this type of header, it receives a more accurate account of how many times a document was accessed.

You can enable HTTP access reporting on the Cache Specifics form. For more information on using the Cache Specifics form, see "Setting cache specifics".

Caching FTP and Gopher documents

FTP and Gopher protocols do not include a method for checking to see if a document is up-to-date. Therefore, the only way to optimize caching for FTP and Gopher protocols is to set a Cache Refresh interval. The Cache Refresh interval is the amount of time the proxy server will wait before retrieving the latest version of the document from the remote server. If you do not set a Cache Refresh time, the proxy will retrieve these documents even if the versions in the cache are up-to-date.

Setting FTP and Gopher cache refresh intervals

If you are setting a cache refresh interval for FTP and Gopher protocols, choose one that you consider safe for the documents the proxy gets. For example, if you store information that rarely changes, use a high number (several days). If the data changes constantly, you'll want the files to be retrieved at least every few hours. During the refresh time, you risk sending an out-of-date file to the client. If the interval is short enough (a few hours), you eliminate most of this risk while getting noticeably faster response time.

You can set the cache refresh interval for FTP and Gopher documents on either the Cache Specifics form or the Cache Configuration form. For more information on using the Cache Specifics form, see "Setting cache specifics", and for more information on using the Cache Configuration form, see "Configuring the cache".

Note
If your FTP and Gopher documents vary widely (some change often, others rarely), use the Cache Configuration form to create a separate template for each kind of document (for example, create a template with resources ftp://.*.gif) and then set a refresh interval that is appropriate for that resource.

Configuring the cache

You can configure the kind of caching you want for specific resources, using the Caching Configuration form. You can specify several configuration parameter values for URLs matching the regular expression pattern that you specify. This feature gives you fine-grain control of the proxy cache, based on the type of document cached. Configuring the cache can include identifying the following items:

To configure the cache:

  1. In the Server Manager, choose Caching|Configuration. The Caching Configuration form appears.
  2. Select the resource you are editing by either choosing it from the Editing pulldown or by clicking the Regular Expression button, entering a regular expression, and clicking OK.
  3. Change the configuration information.
  4. Click OK.
The following sections describe the items listed on the Caching Configuration form. These sections include information that will help you to determine which configuration will best suit your needs.

Setting the cache default

The proxy server allows you to identify a cache default for specific resources. A resource is a type of file that matches certain criteria that you specify. For instance, you may want your server to automatically cache all documents from the domain company.com. If so, you would click the Regular Expression button on the top of the Configuration form and in the field that appears, enter:
[a-z]*://[^/:].company.com.*
Then click the Cache radio button. Your server would automatically cache all cacheable documents from that domain.
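Before entering a resource pattern, it can help to try it against a few sample URLs. The sketch below uses Python's re module as a stand-in; the proxy's own regular expression dialect may differ. The pattern shown is a tightened variant of the one printed above (a "*" after the host character class, and escaped dots), and that tightening is an assumption on our part, not the manual's text.

```python
import re

# Assumed variant of the manual's pattern; see the note above.
COMPANY_PATTERN = re.compile(r"[a-z]*://[^/:]*\.company\.com.*")


def matches_resource(url: str) -> bool:
    """True if the whole URL matches the resource pattern."""
    return COMPANY_PATTERN.fullmatch(url) is not None


print(matches_resource("http://www.company.com/index.html"))  # True
print(matches_resource("http://www.example.com/index.html"))  # False
```

Note that this variant requires a dot before "company", so the bare host in http://company.com/ is not matched.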

Note
If you set the cache default for a particular resource to either "Derived configuration" or "Don't cache", it is not necessary to configure the cache for that resource. However, if you select a cache default of "Cache" for a resource, you can specify several other configuration items. For a list of these items, see "Configuring the cache".

You can set the cache default for any resource on the Cache Configuration form. The cache default for the protocols HTTP, FTP, and Gopher can also be set on the Cache Specifics form.

Caching pages retrieved using HTTPS

You can choose to have your server cache files that are retrieved using HTTPS. Because documents that are retrieved using HTTPS are secure, they have to be encrypted by the remote server and then decrypted by the proxy before they are viewed by the client. This process can sometimes slow document retrieval. If clients frequently request a secure document through your proxy, you may want to store it in the cache. By storing the document in the cache, you avoid the encryption and decryption process and so minimize the time it takes to retrieve the document.

If you do not enable the caching of HTTPS documents, the proxy will assume the default, which is to not cache them.

You can set the policy for caching pages retrieved using HTTPS on the Cache Configuration form.

Caching pages that require authentication

You can choose to have your server cache files that require user authentication. If you choose to have your proxy server cache these files, the server will tag the files in the cache so that if a user asks for them, the server knows that the files require authentication from the remote server.

Because the proxy server does not know how remote servers authenticate and it does not know users' ids or passwords, it will simply force an up-to-date check with the remote server each time a request is made for a document that requires authentication. The user will therefore have to enter his or her id and password to gain access to the file. If the user has already accessed that server earlier in the Navigator session, the Navigator will automatically send the authentication information without prompting the user for it.

If you do not enable the caching of pages that require authentication, the proxy will assume the default, which is to not cache them.

You can set the policy for caching pages that require authentication on the Cache Configuration form.

Caching queries

Cached queries only work with HTTP documents. You can limit the length of queries that are cached, or you can completely inhibit caching of queries. The longer the query, the less likely it is to be repeated, and the less useful it is to cache.

These caching restrictions apply: the access method has to be GET, the document must not be protected (unless caching of authenticated pages is enabled), and the response must have at least a Last-modified header. This requires the query engine to indicate that the query result document can be cached. If the Last-modified header is present, the query engine should support conditional GET method (with an If-modified-since header) in order to make caching effective; otherwise it should return an Expires header.
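The restrictions above can be collected into a single check. This is a sketch of a strict reading of the text: the function name and parameters are hypothetical, and a response that carries only an Expires header is treated as not cacheable, which may be stricter than the proxy's actual behavior.

```python
def query_is_cacheable(method, protected, response_headers, query,
                       cache_authenticated=False, max_query_length=None):
    """Apply the query-caching restrictions listed in the text."""
    if method.upper() != "GET":
        return False  # only GET requests qualify
    if protected and not cache_authenticated:
        return False  # protected documents need authenticated caching enabled
    if max_query_length is not None and len(query) > max_query_length:
        return False  # administrator-imposed limit on query length
    header_names = {name.lower() for name in response_headers}
    return "last-modified" in header_names


headers = {"Last-Modified": "Tue, 02 Jan 1996 10:00:00 GMT"}
print(query_is_cacheable("GET", False, headers, "q=proxy"))   # True
print(query_is_cacheable("POST", False, headers, "q=proxy"))  # False
```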

If you do not enable the caching of queries, the proxy will assume the default, which is to not cache them.

You can set the query cache policy on the Cache Configuration form.

Setting the minimum and maximum cache file sizes

You can set the minimum and maximum sizes for files that will be cached by your proxy server. You may want to set a minimum size if you have a fast network connection. If your connection is fast, small files may be retrieved so quickly that it is not necessary for the server to cache these files. In this instance, you would want to cache only larger files. You may want to set a maximum file size to make sure that large files do not occupy too much of your proxy's disk space.

You can set the minimum and maximum cache file sizes on the Cache Configuration form.

Setting the cache behavior for client interruptions

If a document is only partly retrieved and the client interrupts the data transfer, the proxy has the ability to finish retrieving the document for the purpose of caching it. The proxy's default is to finish retrieving a document if 25% of it has already been retrieved. Otherwise, the proxy will terminate the remote server connection and remove the partial file. You can raise or lower the client interruption percentage on the Cache Configuration form.
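The interruption rule amounts to a percentage comparison, sketched below. The function name is hypothetical, and treating exactly 25% as enough to continue is an assumption, since the text does not say whether the threshold is inclusive.

```python
def should_finish_retrieval(bytes_received, total_bytes, threshold_percent=25.0):
    """Decide whether to keep fetching after the client disconnects."""
    if total_bytes <= 0:
        return False  # unknown or empty length: nothing to base the decision on
    return 100.0 * bytes_received / total_bytes >= threshold_percent


print(should_finish_retrieval(300, 1000))  # True: 30% already retrieved
print(should_finish_retrieval(100, 1000))  # False: only 10% retrieved
```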

Adding and modifying cache partitions

Cache partitions are reserved parts of disks or memory that are set aside for caching purposes. The largest cache capacity is 64 GB with 256 cache sections. If your caching capacity changes, you may want to change or add partitions using the Cache Partition Configuration form. From this form, you can edit a partition's location, mnemonic name, and maximum and minimum sizes. You can also view the cache section table for that partition.

To add cache partitions:

  1. In the Server Manager, click Caching|Partitions. The Cache Partition Table appears.
  2. Click Add Cache Partition.
  3. Enter the appropriate values for the new partition.
  4. Restart the proxy from the command line by going to the proxy directory and typing ./restart.
To modify cache partitions:

  1. In the Server Manager, click Caching|Partitions. The Cache Partition Table appears.
  2. Click on the name of the partition that you would like to change.
  3. Edit the information.
  4. Click Change.
  5. Restart the proxy from the command line by going to the proxy directory and typing ./restart.

Adding and modifying cache sections

The proxy cache is separated into one or more cache sections. You can have up to 256 sections. The number of cache sections must be a power of two (for example, 1, 2, 4, 8, 16, ..., 256).

Each cache section can hold 100 to 250 MB of data; the optimum size is around 125 MB per section. This means that if you pick a cache capacity of 500 MB, the installer will create 4 cache sections (500 ÷ 125 = 4); if you choose a cache capacity of 2 GB, the installer creates 16 sections (2000 ÷ 125 = 16). The smallest available capacity is 125 MB with a single cache section. The largest capacity is 32 GB (optimum) with 256 cache sections which can hold up to 64 GB of data.
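The relationship between capacity and section count can be sketched as below. The 125 MB optimum and the 256-section limit come from the text; rounding down to a power of two for capacities that do not divide evenly, and the function name, are assumptions.

```python
def section_count(capacity_mb, optimum_mb=125, max_sections=256):
    """Number of cache sections for a capacity: a power of two, at most 256."""
    target = max(1, capacity_mb / optimum_mb)
    count = 1
    while count * 2 <= target and count < max_sections:
        count *= 2
    return count


print(section_count(500))    # 4
print(section_count(2000))   # 16
print(section_count(32000))  # 256
```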

Figure 7.3 shows the distribution of a cache capacity of 1 GB. Each cache section is noted by s for section, and then a section number. For example, with s3.4, the 3 indicates the power of 2 for the number of cache sections (2^3 = 8), and the 4 is the number of the section (the 8 sections are labeled 0 through 7). Therefore, s3.4 is the fifth of the 8 sections.
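The section naming scheme can be decoded mechanically. The helper below is hypothetical and only illustrates the convention described above.

```python
def parse_section(name):
    """Split a name such as 's3.4' into (total_sections, section_number)."""
    if not name.startswith("s"):
        raise ValueError("section names start with 's'")
    power, number = name[1:].split(".")
    total = 2 ** int(power)
    index = int(number)
    if not 0 <= index < total:
        raise ValueError("section number out of range")
    return total, index


print(parse_section("s3.4"))  # (8, 4): section number 4 of sections 0 through 7
```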

Figure 7.3: The cache root directory hierarchy

To add or modify cache sections:

  1. In the Server Manager, click Caching|Sections. The Cache Section Table appears.
  2. Change the information in the table.
  3. Click Make These Changes.
  4. Restart the proxy from the command line by going to the proxy directory and typing ./restart.

Setting the cache capacity

Cache capacity is directly related to the cache hierarchy in the cache directories. The larger the hierarchy, the bigger the capacity. The cache capacity should be equal to or greater than the cache size. Setting the capacity larger than the cache size can be helpful later; if you know that you plan to increase cache size later (such as by adding an external disk), you can set up the capacity to accommodate the larger size.

Expanding the cache capacity requires a hard restart (shutdown/restart) for the changes to take effect.

To set the cache capacity:

  1. In the Server Manager, click Caching|Capacity. The Cache Administrative Operations form appears.
  2. Choose a capacity from the capacity pulldown.
  3. Click Change Capacity.
Or

  1. In the Server Manager, click Caching|Specifics. The Cache Specifics form appears.
  2. Click the word "edit" that appears next to Cache capacity.
  3. Choose a capacity from the capacity pulldown.
  4. Click Change Capacity.

Enabling the cache monitor and manager

The proxy program spawns two extra copies of itself to perform cache management. These two processes are the cache monitor and cache manager. The cache monitor receives data from the server process pool about cache activity and maintains information about its size and other aspects. It occasionally triggers the cache manager to do the actual cache clean-up tasks.

If the cache manager process is accidentally killed, it will be restarted automatically. The cache manager daemon uses the same configuration file as the proxy server.

You can disable the cache manager and monitor if you plan to perform cache maintenance with an external program. Otherwise, the cache manager and monitor should be enabled.

By accessing cache manager information, you can view all cached URLs, control caching for specific documents, and see an estimated size of the current cache structure. You can explicitly expire documents in the cache (so that the next time they are accessed, the proxy does an up-to-date check to determine if the document in the cache needs to be refreshed) and you can remove documents from the cache. For more information, see "Accessing cache manager information".

To enable or disable the cache monitor and manager:

  1. In the Server Manager, click Caching|Special. The Special Cache Configuration form appears.
  2. Click the appropriate button to either enable or disable the cache manager and monitor.
  3. Click OK.

Accessing cache manager information

You can view the names and attributes of all cached URLs through the cache management information. Cache management information is a list of all cached documents grouped by access protocol and site name. This list is stored in the directory that you specify on the Cache Specifics form. You can limit the URLs you view in the list by typing a domain name into the Search field. By accessing this information, you can perform various cache management functions, including expiring and removing documents from the cache.

To access cache manager information:

  1. In the Server Manager, click Caching|Cache Management.
  2. Enter a DNS domain name in the Search field and click the Search button, or select a domain name from the list. A list of subdomains in that domain appears.
  3. Click on the name of a subdomain. A list of the hosts in that subdomain appears.
  4. Click on the name of a host. A list of all of its URLs appears.
  5. Click on the name of a URL. Detailed cache information about that URL appears.
Note
Because continually recording URLs slows the proxy's performance, you do not have to enable URL recording to access cache management information. To access this information without affecting performance, you can run the command-line program extras/proxy/urldbgen. This program generates a list of cached URLs on command. Once you have generated this list, you can use the Cache Management form to access and manage the cache.

Caching local hosts

If a URL requested from a local host lacks a domain name, the proxy server will not cache it in order to avoid duplicate caching. For example, if a user requests http://machine/filename.html and http://machine.netscape.com/filename.html from a local server, both URLs might appear in the cache. Because these files are from a local server, they may be retrieved so quickly that it is not necessary to cache them anyway.
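A URL whose host lacks a domain name can be recognized with a simple check. The sketch below treats "no dot in the host name" as "lacks a domain name", which is an assumption about how the proxy decides; the function name is hypothetical.

```python
from urllib.parse import urlsplit


def lacks_domain_name(url):
    """True if the host part of the URL has no domain component."""
    host = urlsplit(url).hostname or ""
    return "." not in host


print(lacks_domain_name("http://machine/filename.html"))               # True
print(lacks_domain_name("http://machine.netscape.com/filename.html"))  # False
```

Only the second form would be cached by default, so the same document is not stored under two different URLs.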

However, if your company has servers in many remote locations, you may want to cache documents from all hosts to reduce network traffic and decrease the time needed to access the files.

To enable the caching of local hosts:

  1. In the Server Manager, click Caching|Cache Local Hosts.
  2. Select the resource you are editing by either choosing it from the Editing pulldown or by clicking the Regular Expression button and entering the name of the resource to edit.
  3. Click the enabled button.
  4. Click OK.

Cache batch updates

Cache Batch Update allows you to cache files in a specified web site or do an up-to-date check on documents in the cache whenever the proxy server is not busy. From the Cache Batch Updates form, you can create, edit and delete batches and you can enable and disable batch updating.

Creating a batch update

You can cache multiple files at once by creating a batch update. The proxy server allows you to perform an up-to-date check on several files currently in the cache or cache multiple files in a particular web site.

To create a batch update:

  1. In the Server Manager, click Caching|Batch Updates. The Cache Batch Updates form appears.
  2. Select New and Create from the pulldowns next to "Select a configuration to edit".
  3. Click OK. A new Cache Batch Update form appears.
  4. In the Name section of the form, enter a name for the new batch update entry.
  5. In the Source section of the form, click the radio button for the type of batch update that you want to create. Click the first radio button if you want to perform an up-to-date check on all documents in the cache. Click the second radio button if you want to recursively cache URLs starting from the given source URL.
  6. In the Source section fields, identify the documents that you want to use in the batch update.
  7. In the Exceptions section, identify any files that you would like to exclude from the batch update.
  8. In the Resources section, enter the maximum number of simultaneous connections, and the maximum number of documents to traverse.
  9. In the Timing section, enter the start and end times for the generation of the batch update. Only one batch update can be active at any time, so it is best to not overlap other batch update configurations.
  10. Click OK.
Note
You can create, edit, and delete batch update configurations without having batch updates turned on. However, if you want your batch updates to run according to the times you set on the Cache Batch Updates form, you must turn updates on.

Editing or deleting a batch update configuration

You can edit or delete batch updates using the Cache Batch Updates form. You may want to edit a batch update if you need to exclude certain files or if you want to update the batch more frequently. You may also want to delete a batch update configuration completely.

To edit or delete a batch update configuration:

  1. In the Server Manager, click Caching|Batch Updates. The Cache Batch Updates form appears.
  2. If you want to edit a batch, select the name of that batch and the word Edit from the pulldowns next to the text "Select a configuration to edit." If you want to delete a batch, select the name of that batch and the word Delete from the pulldowns.
  3. Click OK. The Cache Batch Updates form appears.
  4. Modify the information as you wish.
  5. Click OK.

Using the cache command-line utilities

The proxy server comes with several command-line utilities that let you configure, change, generate, and repair your cache directory structure. Most of these utilities are duplications of the Server Manager forms, but you might want to use the utilities if you need to schedule the maintenance (for example, as a cron job). All of the utilities are located in the extras/proxy directory. The following sections describe the various utilities.

Building the cache directory structure

The utility cbuild creates a single directory structure for the proxy's cache. After creating the directory structure, you can use the Server Manager forms to enable the proxy to use the newly created cache.

cbuild -d <conf-dir> -s <user>
where <conf-dir> is the directory where the proxy server instance is installed (for example, /usr/ns-home/proxy-id) and <user> is the user account that should own the created files and directories if you run cbuild as root. This user id should be the same user id that the proxy runs as. The utility determines the cache directory and the location of the cache database from the directory you enter.

cbuild -c <cache-dir> -u <urldb-dir> -s <user>
where <cache-dir> is the directory for your cache structure, <urldb-dir> is the directory where the cache management information is kept, and <user> is the user account that should own the created files and directories if you run cbuild as root. This user id should be the same user id that the proxy runs as.

cbuild is located in the extras/proxy directory.

Upgrading the cache structure

If you have upgraded your existing 1.1 or 2.0 proxy server, you should upgrade the cache separately. Depending on the size of your cache, a cache upgrade can be a time-consuming process. You can upgrade a version 1.1 or 2.0 cache directory structure and all of its files. When upgrading a 1.1 structure, the cupgrade utility moves all of the files from the old directories to the new 2.5 directory structure. When upgrading a 2.0 cache, the cupgrade utility works in place and simply modifies the existing 2.0 cache so that it is in the 2.5 format.

Before you can upgrade a 1.1 cache structure, you must first make sure you have a version 2.5 structure. If you installed the proxy server by using the upgrade utility and enabled caching, then you already have a cache structure. If you don't have a cache directory structure, use the cbuild utility before running the cupgrade utility. If you are upgrading a 2.0 cache structure, you should not have a separate 2.5 cache, because the upgrade modifies the 2.0 cache in place. After the upgrade, replace the 2.5 cache, if one exists, with the upgraded 2.0 cache.

cupgrade is located in the extras/proxy directory.

Upgrading a 1.1 cache structure

If you are upgrading a 1.1 cache structure, the cupgrade utility has the following syntax:

cupgrade -d <conf-dir> -o <1.1-cache-root> -s <user>

The <conf-dir> directory is where the proxy server is installed. For example, the directory could be /usr/ns-home/proxy-id. The utility determines the new cache directory and location of the cache database based on the configuration files found in the directory you enter. The <1.1-cache-root> is the directory of the version 1.1 cache structure. The <user> is the Unix user id that should own the files in the cache. It is optional and should be included only if you run the cupgrade utility as "root" and your proxy as another user. For example, you could run cupgrade as "root" and your proxy as "nobody". In this case you would replace <user> with "nobody".

Note
Specifying user as nobody will not work on some systems, such as HP-UX. When using these systems, you must specify a user other than nobody for both the proxy and for cupgrade.
The cache upgrade can take anywhere from a few minutes to several hours depending on the size of the old cache structure.
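
The rule for the optional <user> argument can be sketched as a small shell function. This is only an illustration: the function name cupgrade_command and its fourth parameter (a numeric user id, supplied so the logic can be tried without being root) are assumptions, not part of the product. The function prints the command line instead of running it.

```shell
# Hypothetical helper (not shipped with the proxy): print the cupgrade command
# line for a 1.1 cache, adding -s only when the caller is root and the proxy
# runs as a different user.
cupgrade_command() {
  conf_dir=$1                 # proxy instance directory, e.g. /usr/ns-home/proxy-id
  old_root=$2                 # root of the version 1.1 cache structure
  proxy_user=$3               # user id the proxy runs as
  uid=${4:-$(id -u)}          # numeric uid of the caller; defaults to the current user

  cmd="cupgrade -d $conf_dir -o $old_root"
  if [ "$uid" -eq 0 ] && [ "$proxy_user" != "root" ]; then
    cmd="$cmd -s $proxy_user"
  fi
  printf '%s\n' "$cmd"
}
```

For example, cupgrade_command /usr/ns-home/proxy-id /old/cache nobody 0 prints the command with -s nobody appended, where /old/cache is a placeholder path. On systems where nobody does not work, pass the alternative user instead.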

Upgrading a 2.0 cache structure

If you are upgrading a 2.0 cache structure, the cupgrade utility has the following syntax:

cupgrade <sect> <sect> ... <sect>

The 2.0 upgrade should be run in the cache directory where all of the cache sections reside. Each <sect> is a section in the cache that you want to upgrade. The number of <sect> arguments depends on how many sections are in the cache. For example, if your cache directory is /usr/ns-home/cache and you have a 1GB cache, you have 8 sections in your cache directory, and you would type the following at the command line:

cd /usr/ns-home/cache
cupgrade s3.0 s3.1 s3.2 s3.3 s3.4 s3.5 s3.6 s3.7

Instead of typing each section, you could simply use s* to pass all of the section directory names. In this instance, you would type the following:

cd /usr/ns-home/cache
cupgrade s*
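
If you prefer an explicit list to the s* wildcard, the section names can be generated. The sketch below assumes a naming pattern inferred only from the examples in this section: 8 sections are named s3.0 through s3.7 and 16 sections are named s4.00 through s4.15, which suggests that the first number is the power of two giving the section count and that the second number is zero-padded. The function name section_names is hypothetical. Check the output against your actual cache directory before relying on it.

```shell
# Hypothetical helper: list section names for a cache with 2^dim sections.
# The naming pattern is an assumption drawn from the s3.N and s4.NN examples.
section_names() {
  dim=$1
  count=$((1 << dim))         # number of sections, e.g. 8 for dim=3
  last=$((count - 1))
  width=${#last}              # digits needed for the highest section number
  i=0
  while [ "$i" -lt "$count" ]; do
    printf "s%d.%0${width}d\n" "$dim" "$i"
    i=$((i + 1))
  done
}
```

Running section_names 3 lists the eight names used in the 1GB example above, one per line.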

If you have multiple cache partitions, you need to run the upgrade utility for each partition. For example, suppose your cache directory is /usr/ns-home/cache and you have a 2GB cache with 16 sections on 2 partitions (8 sections on each partition). The partitions are /disk1/cache-1 and /disk2/cache-2. The commands for the cupgrade utility would then be:

cd /usr/ns-home/cache/disk1/cache-1
cupgrade s4.00 s4.01 s4.02 s4.03 s4.04 s4.05 s4.06 s4.07

cd /usr/ns-home/cache/disk2/cache-2
cupgrade s4.08 s4.09 s4.10 s4.11 s4.12 s4.13 s4.14 s4.15

You could also upgrade all sections on both partitions by typing the following at the command line:

cupgrade /disk1/cache-1/s* /disk2/cache-2/s*

The cache upgrade can take anywhere from a few minutes to several hours depending on the size of the old cache structure.
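
Because an upgrade can run for hours, it may be worth confirming which section directories each partition actually contains before starting. The following sketch is a dry run under stated assumptions: the function name print_upgrade_commands is hypothetical, and it only prints the cd and cupgrade commands for the partition directories you pass in. It does not run cupgrade.

```shell
# Hypothetical dry-run helper: for each partition directory given, print the
# cd and cupgrade commands covering the s* section directories found there.
print_upgrade_commands() {
  for partition in "$@"; do
    sections=""
    for dir in "$partition"/s*; do
      [ -d "$dir" ] || continue          # skip plain files and an unmatched glob
      sections="$sections ${dir##*/}"    # keep only the directory name
    done
    if [ -n "$sections" ]; then
      printf 'cd %s\n' "$partition"
      printf 'cupgrade%s\n' "$sections"
    else
      printf '# no section directories in %s\n' "$partition"
    fi
  done
}
```

For the two-partition example, you would call print_upgrade_commands /disk1/cache-1 /disk2/cache-2 and review the printed commands before typing them.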

Repairing the cache URL list

The proxy has a utility called urldbgen that goes through the entire cache directory structure and repairs the cache manager's URL list. Use this utility if your cache manager's URL list appears damaged when viewed through the Cache Management form (for example, if the URL list doesn't seem to contain all of the URLs that you know are cached or if the cache manager claims that the cache is empty or corrupt). You may also want to run this utility if you have disabled URL recording for the sake of performance, but want to generate a URL list on command.

To repair the cache URL list, run the utility called urldbgen located in the extras/proxy directory. This utility has two types of arguments you can use:

urldbgen -d <conf-dir> -s <user>
where <conf-dir> is the directory where the proxy server is installed (for example, /usr/ns-home/proxy-id) and <user> is the user account that should own the created files and directories if you run urldbgen as root. This user id should be the same user id that the proxy runs as. The utility determines the cache directory and the location of the cache database from the directory you enter.

urldbgen -c <cache-dir> -u <urldb-dir> -s <user>
where <cache-dir> is the directory for your cache structure, <urldb-dir> is the directory where the cache URLs are recorded, and <user> is the user account that should own the created files and directories if you run urldbgen as root. This user id should be the same user id that the proxy runs as.

Note
Running the URL list repair utility can take anywhere from a few seconds to a couple of hours to complete depending on the size of the cache and the speed and load of your machine and its disks.

You will rarely need this utility. The URL list can be corrupted only if something prevents the proxy from updating the list after it has finished writing a file to the cache. This could happen if the disk is full, if the proxy user's permissions prevent the proxy from writing to the list file, or if the system suddenly goes down. The URL list is located in the hosts subdirectory under the cache root directory. This utility can recreate the entire URL list from scratch if it is accidentally deleted.
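
Two of the causes listed above, a full disk and insufficient permissions, can be checked from the shell before or after a repair. The sketch below is an illustration only: the function name check_url_list_dir is hypothetical, and it tests write access for the user who runs it, so run it as the proxy user to get a meaningful answer.

```shell
# Hypothetical pre-check: verify that the hosts subdirectory under the cache
# root exists and is writable by the current user, then report free space.
check_url_list_dir() {
  cache_root=$1
  list_dir="$cache_root/hosts"
  if [ ! -d "$list_dir" ]; then
    echo "missing: $list_dir" >&2
    return 1
  fi
  if [ ! -w "$list_dir" ]; then
    echo "not writable by $(id -un): $list_dir" >&2
    return 2
  fi
  # With df -Pk, the fourth column of the second line is the free space in KB.
  free_kb=$(df -Pk "$list_dir" | awk 'NR == 2 { print $4 }')
  echo "ok: $list_dir writable, ${free_kb} KB free"
}
```

A nonzero return status means the directory is missing (1) or not writable (2).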

Cleaning the cache directories

The proxy server has a command-line utility called urldbgc that goes through the URL database and purges any old files. Run this utility if, for some reason, the database is out of sync with the actual files in the cache. You may also want to run this utility if you have disabled URL recording for the sake of performance, but want to generate a URL list on command. You can run this utility as a cron job and schedule it for the least busy time for your proxy server. The urldbgc utility has the following syntax:

urldbgc -d <conf-dir> -s <user>
where <conf-dir> is the directory where the proxy server is installed (for example, /usr/ns-home/proxy-id) and <user> is the user account that should own the created files and directories if you run urldbgc as root. This user id should be the same user id that the proxy runs as. The utility determines the cache directory and the location of the cache database from the directory you enter.

urldbgc -c <cache-dir> -u <urldb-dir> -s <user>
where <cache-dir> is the directory for your cache structure, <urldb-dir> is the directory where the cache URL database is kept, and <user> is the user account that should own the created files and directories if you run urldbgc as root. This user id should be the same user id that the proxy runs as.
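
When urldbgc runs from cron, nobody is watching its output, so it can help to record when each run started, when it finished, and how it exited. The following is a minimal sketch of such a wrapper. The function name run_logged, the log file location, and the installation paths in the example call are all assumptions for illustration.

```shell
# Hypothetical wrapper: run a command and append timestamped start and finish
# lines, the command's output, and its exit status to a log file.
run_logged() {
  log_file=$1
  shift
  printf '%s start: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$log_file"
  "$@" >> "$log_file" 2>&1 && status=0 || status=$?
  printf '%s done (exit %d): %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" "$*" >> "$log_file"
  return "$status"
}

# Example call from a script that cron runs; every path here is a placeholder:
# run_logged /var/tmp/urldbgc.log /usr/ns-home/extras/proxy/urldbgc -d /usr/ns-home/proxy-id -s nobody
```

A crontab entry beginning with 0 3 * * * would run such a script at 03:00 every day. Choose an hour that matches the quiet period of your own proxy.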

Note
If you do not wish to garbage collect, but you want to fully delete all of the files in your cache, type the following at the command line:

cd <proxy directory>/cache
find s* -type f -exec rm {} \;

where <proxy directory> is the directory where your proxy is kept.
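
Because this deletion cannot be undone, you may want to wrap it in a function that checks the directory first and reports how many files it removed. The sketch below is one possible form, not a shipped tool: the function name purge_cache_files is hypothetical, and it uses rm -f so that the count is still printed if a file disappears during the run.

```shell
# Hypothetical helper: delete every regular file under the s* entries of a
# cache root, keep the directory structure, and print the number removed.
purge_cache_files() {
  cache_root=$1
  [ -d "$cache_root" ] || { echo "not a directory: $cache_root" >&2; return 1; }
  (
    cd "$cache_root" || exit 1
    set -- s*                            # expand the section names once
    [ -e "$1" ] || { echo 0; exit 0; }   # nothing matched: leave s* untouched
    count=$(find "$@" -type f | wc -l)
    find "$@" -type f -exec rm -f {} \;
    echo $((count))                      # arithmetic strips any padding from wc
  )
}
```

Call it with the cache directory as its only argument, and stop the proxy first if files may still be written during the purge.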