--- title: HTTP 캐시 slug: Mozilla/HTTP_cache translation_of: Mozilla/HTTP_cache ---

이 문서는 새로운 HTTP 캐시 version 2를 기술하고 있습니다.

코드는 /network/cache2 에 존재합니다.

API

다음은 HTTP 캐시 v2 API에 대한 자세한 설명으로, 예제들을 포함하고 있습니다. 이 문서에는 IDL files 주석에서 찾을 수 없거나 명시되지 않은 내용만 들어있습니다.

캐시 API는 완벽하게 스레드로부터 안전하며 블록킹이 없습니다.
IPC 지원은 없습니다. 기본 크롬 프로세스에서만 액세스할 수 있습니다.
프로파일이 없으면 새로운 HTTP 캐시가 작동하지만, 모든 것은 어떤 특정 제한도 따르지 않는 메모리에만 저장됩니다.

nsICacheService 등의 오래된 캐시 API를 더 이상 사용하지 말 것을 적극 권장합니다. 이는 곧 완전히 폐기되고 제거될 것입니다. (bug 913828).

nsICacheStorageService

HTTP 캐시 엔트리 포인트입니다. 서비스로만 액세스 가능하며, 스레드로부터 안전하고 스크립팅 가능합니다.
https://dxr.mozilla.org/mozilla-central/source/netwerk/cache2/nsICacheStorageService.idl
"@mozilla.org/netwerk/cache-storage-service;1"
특정 URL마다 캐시 항목에 대한 추가 액세스 – 아래 nsICacheEntry 참조 – 를 제공하는 “storage”(“저장소”) 객체 – 아래 nsICacheStorage 참조 – 에 액세스하는 메소드를 제공합니다.
현재 저장소는 3가지 유형이 있으며, 모든 액세스 메소드는 nsICacheStorage를 반환(return)합니다. :
- 메모리-온리 (memoryCacheStorage): 데이터를 메모리 캐시에만 저장하며, 이 저장소의 데이터는 절대로 디스크에 저장되지 않습니다.
- 디스크 (diskCacheStorage): 디스크에 데이터를 저장하지만, 기존 항목의 경우 memory-only storage도 검색합니다; 특수한 인수(argument)를 통해 지시를 받으면 애플리케이션 캐시 또한 주로 검색합니다.
- 애플리케이션 캐시 (appCacheStorage): 컨슈머가 특정 nsIApplicationCache (즉, 한 그룹의 특정 앱 캐시 버전)를 가지고 있을 때, 이 저장소는 애플리케이션 캐시 항목에 대한 읽기 및 쓰기 권한을 제공합니다.; 앱 캐시가 특정되지 않았다면, 저장소는 기존의 모든 앱 캐시에서 작동합니다.
The service also provides methods to clear the whole disk and memory cache content or purge any intermediate memory structures:
- clear – after it returns, all entries are no longer accessible through the cache APIs; the method is fast to execute and non-blocking in any way; the actual erase happens in background
- purgeFromMemory – removes (schedules to remove) any intermediate cache data held in memory for faster access (more about the intermediate cache below)

nsILoadContextInfo

Distinguishes the scope of the storage demanded to open.
Mandatory argument to *Storage methods of nsICacheStorageService.
https://dxr.mozilla.org/mozilla-central/source/netwerk/base/public/nsILoadContextInfo.idl
It is a helper interface wrapping following four arguments into a single one:
- private-browsing boolean flag
- anonymous load boolean flag
- app ID number (0 for no app)
- is-in-browser boolean flag
Helper functions to create nsILoadContextInfo objects:
- C++ consumers: functions at LoadContextInfo.h exported header
- JS consumers: resource://gre/modules/LoadContextInfo.jsm module methods
Two storage objects created with the same set of nsILoadContextInfo arguments are identical, containing the same cache entries.
Two storage objects created with in any way different nsILoadContextInfo arguments are strictly and completely distinct and cache entries in them do not overlap even when having the same URIs.

nsICacheStorage

https://dxr.mozilla.org/mozilla-central/source/netwerk/cache2/nsICacheStorage.idl
Obtained from call to one of the *Storage methods on nsICacheStorageService.
Represents a distinct storage area (or scope) to put and get cache entries mapped by URLs into and from it.
Similarity with the old cache: this interface may be with some limitations considered as a mirror to nsICacheSession, but less generic and not inclining to abuse.
Unimplemented or underimplemented functionality:

asyncEvictStorage (bug 977766), asyncVisitStorage (bug 916052)

nsICacheEntryOpenCallback

https://dxr.mozilla.org/mozilla-central/source/netwerk/cache2/nsICacheEntryOpenCallback.idl
The result of nsICacheStorage.asyncOpenURI is always and only sent to callbacks on this interface.
These callbacks are ensured to be invoked when asyncOpenURI returns NS_OK.
Important difference in behavior from the old cache: when the cache entry object is already present in memory or open as “force-new” (a.k.a “open-truncate”) this callback is invoked sooner then the asyncOpenURI method returns (i.e. immediately); there is currently no way to opt out of this feature (watch bug 938186).

nsICacheEntry

https://dxr.mozilla.org/mozilla-central/source/netwerk/cache2/nsICacheEntry.idl
Obtained asynchronously or pseudo-asynchronously by a call to nsICacheStorage.asyncOpenURI.
Provides access to a cached entry data and meta data for reading or writing or in some cases both, see below.

Lifetime of a new entry

Such entry is initially empty (no data or meta data is stored in it).
The aNew argument in onCacheEntryAvailable is true for and only for new entries.
Only one consumer (the so called "writer") may have such an entry available (obtained via onCacheEntryAvailable).
Other parallel openers of the same cache entry are blocked (wait) for invocation of their onCacheEntryAvailable until one of the following occurs:
- The writer simply throws the entry away: other waiting opener in line gets the entry again as "new", the cycle repeats.
  
  This applies in general, writers throwing away the cache entry means a failure to write the cache entry and a new writer is being looked for again, the cache entry remains empty (a.k.a. "new").
- The writer stored all necessary meta data in the cache entry and called metaDataReady on it: other consumers now get the entry and may examine and potentially modify the meta data and read the data (if any) of the cache entry.
- When the writer has data (i.e. the response payload) to write to the cache entry, it must open the output stream on it before it calls metaDataReady.
When the writer still keeps the cache entry and has open and keeps open the output stream on it, other consumers may open input streams on the entry. The data will be available as the writer writes data to the cache entry's output stream immediately, even before the output stream is closed. This is called concurrent read/write.

Concurrent read and write

Important difference in behavior from the old cache: the cache now supports reading a cache entry data while it is still being written by the first consumer - the writer.

This can only be engaged for resumable responses that (bug 960902) don't need revalidation. Reason is that when the writer is interrupted (by e.g. external canceling of the loading channel) concurrent readers would not be able to reach the remaning unread content.

This could be improved by keeping the network load running and being stored to the cache entry even after the writing channel has been canceled.

When the writer is interrupted, the first concurrent reader in line does a range request for the rest of the data - and becomes that way a new writer. The rest of the readers are still concurrently reading the content since output stream for the cache entry is again open and kept by the current writer.

Lifetime of an existing entry with only a partial content

Such a cache entry is first examined in the nsICacheEntryOpenCallback.onCacheEntryCheck callback, where it has to be checked for completeness.
In this case, the Content-Length (or different indicator) header doesn't equal to the data size reported by the cache entry.
The consumer then indicates the cache entry needs to be revalidated by returning ENTRY_NEEDS_REVALIDATION from onCacheEntryCheck.
This consumer, from the point of view the cache, takes a role of the writer.
Other parallel consumers, if any, are blocked until the writer calls setValid on the cache entry.
The consumer is then responsible to validate the partial content cache entry with the network server and attempt to load the rest of the data.
When the server responds positively (in case of an HTTP server with a 206 response code) the writer (in this order) opens the output stream on the cache entry and calls setValid to unblock other pending openers.
Concurrent read/write is engaged.

Lifetime of an existing entry that doesn't pass server revalidation

Such a cache entry is first examined in the nsICacheEntryOpenCallback.onCacheEntryCheck callback, where the consumer finds out it must be revalidated with the server before use.
The consumer then indicates the cache entry needs to be revalidated by returning ENTRY_NEEDS_REVALIDATION from onCacheEntryCheck.
This consumer, from the point of view the cache, takes a role of the writer.
Other parallel consumers, if any, are blocked until the writer calls setValid on the cache entry.
The consumer is then responsible to validate the partial content cache entry with the network server.
The server responses with a 200 response which means the cached content is no longer valid and a new version must be loaded from the network.
The writer then calls recreate on the cache entry. This returns a new empty entry to write the meta data and data to, the writer exchanges its cache entry by this new one and handles it as a new one.
The writer then (in this order) fills the necessary meta data of the cache entry, opens the output stream on it and calls metaDataReady on it.
Any other pending openers, if any, are now given this new entry to examine and read as an existing entry.

Adding a new storage

Should there be a need to add a new distinct storage for which the current scoping model would not be sufficient - use one of the two following ways:

[preffered] Add a new <Your>Storage method on nsICacheStorageService and if needed give it any arguments to specify the storage scope even more. Implementation only should need to enhance the context key generation and parsing code and enhance current - or create new when needed - nsICacheStorage implementations to carry any additional information down to the cache service.
[not preferred] Add a new argument to nsILoadContextInfo; be careful here, since some arguments on the context may not be known during the load time, what may lead to inter-context data leaking or implementation problems. Adding more distinction to nsILoadContextInfo also affects all existing storages which may not be always desirable.

See context keying details for more information.

Code examples

TBD

Opening an entry

Creating a new entry

Recreating an already open entry

Implementation

Threading

The cache API is fully thread-safe.

The cache is using a single background thread where any IO operations like opening, reading, writing and erasing happen. Also memory pool management, eviction, visiting loops happen on this thread.

The thread supports several priority levels. Dispatching to a level with a lower number is executed sooner then dispatching to higher number layers; also any loop on lower levels yields to higher levels so that scheduled deletion of 1000 files will not block opening cache entries.

OPEN_PRIORITY: except opening priority cache files also file dooming happens here to prevent races
READ_PRIORITY: top level documents and head blocking script cache files are open and read as the first
OPEN
READ: any normal priority content, such as images are open and read here
WRITE: writes are processed as last, we cache data in memory in the mean time
MANAGEMENT: level for the memory pool and CacheEntry background operations
CLOSE: file closing level
INDEX: index is being rebuild here
EVICT: files overreaching the disk space consumption limit are being evicted here

NOTE: Special case for eviction - when an eviction is scheduled on the IO thread, all operations pending on the OPEN level are first merged to the OPEN_PRIORITY level. The eviction preparation operation - i.e. clearing of the internal IO state - is then put to the end of the OPEN_PRIORITY level. All this happens atomically. This functionality is currently pending in bug 976866.

Storage and entries scopes

A scope key string used to map the storage scope is based on the arguments of nsILoadContextInfo. The form is following (currently pending in bug 968593):

a,b,i1009,p,

Regular expression: (.([^,]+)?,)*
The first letter is an identifier, identifiers are to be alphabetically sorted and always terminate with ','
a - when present the scope is belonging to an anonymous load
b - when present the scope is in browser element load
i - when present must have a decimal integer value that represents an app ID the scope belongs to, otherwise there is no app (app ID is considered 0)
p - when present the scope is of a private browsing load, this never persists

CacheStorageService keeps a global hashtable mapped by the scope key. Elements in this global hashtable are hashtables of cache entries. The cache entries are mapped by concantation of Enhance ID and URI passed to nsICacheStorage.asyncOpenURI. So that when an entry is beeing looked up, first the global hashtable is searched using the scope key. An entries hashtable is found. Then this entries hashtable is searched using <enhance-id:><uri> string. The elemets in this hashtable are CacheEntry classes, see below.

The hash tables keep a strong reference to CacheEntry objects. The only way to remove CacheEntry objects from memory is by exhausting a memory limit for intermediate memory caching, what triggers a background process of purging expired and then least used entries from memory. Another way is to directly call the nsICacheStorageService.purge method. That method is also called automatically on the "memory-pressure" indication.

Access to the hashtables is protected by a global lock. We also - in a thread-safe manner - count the number of consumers keeping a reference on each entry. The open callback actually doesn't give the consumer directly the CacheEntry object but a small wrapper class that manages the 'consumer reference counter' on its cache entry. This both mechanisms ensure thread-safe access and also inability to have more then a single instance of a CacheEntry for a single <scope+enhanceID+URL> key.

CacheStorage, implementing the nsICacheStorage interface, is forwarding all calls to internal methods of CacheStorageService passing itself as an argument. CacheStorageService then generates the scope key using the nsILoadContextInfo of the storage. Note: CacheStorage keeps a thread-safe copy of nsILoadContextInfo passed to a *Storage method on nsICacheStorageService.

Invoking open callbacks

CacheEntry, implementing the nsICacheEntry interface, is responsible for managing the cache entry internal state and to properly invoke onCacheEntryCheck and onCacheEntryAvaiable callbacks to all callers of nsICacheStorage.asyncOpenURI.

Keeps a FIFO of all openers.
Keeps its internal state like NOTLOADED, LOADING, EMPTY, WRITING, READY, REVALIDATING.
Keeps the number of consumers keeping a reference to it.
Refers a CacheFile object that holds actual data and meta data and, when told to, persists it to the disk.

The openers FIFO is an array of CacheEntry::Callback objects. CacheEntry::Callback keeps a strong reference to the opener plus the opening flags. nsICacheStorage.asyncOpenURI forwards to CacheEntry::AsyncOpen and triggers the following pseudo-code:

CacheStorage::AsyncOpenURI - the API entry point:

globally atomic:
- look a given CacheEntry in CacheStorageService hash tables up
- if not found: create a new one, add it to the proper hash table and set its state to NOTLOADED
- consumer reference ++
call to CacheEntry::AsyncOpen
consumer reference --

CacheEntry::AsyncOpen (entry atomic):

the opener is added to FIFO, consumer reference ++ (dropped back after an opener is removed from the FIFO)
state == NOTLOADED:
- state = LOADING
- when OPEN_TRUNCATE flag was used:
  - CacheFile is created as 'new', state = EMPTY
- otherwise:
  - CacheFile is created and load on it started
  - CacheEntry::OnFileReady notification is now expected
state == LOADING: just do nothing and exit
call to CacheEntry::InvokeCallbacks

CacheEntry::InvokeCallbacks (entry atomic):

called on:
- a new opener has been added to the FIFO via an AsyncOpen call
- asynchronous result of CacheFile open
- the writer throws the entry away
- the output stream of the entry has been opened or closed
- metaDataReady or setValid on the entry has been called
- the entry has been doomed
state == EMPTY:
- on OPER_READONLY flag use: onCacheEntryAvailable with null for the cache entry
- otherwise:
  - state = WRITING
  - opener is removed from the FIFO and remembered as the current 'writer'
  - onCacheEntryAvailable with aNew = true and this entry is invoked (on the caller thread) for the writer
state == READY:
- onCacheEntryCheck with the entry is invoked on the first opener in FIFO - on the caller thread if demanded
- result == RECHECK_AFTER_WRITE_FINISHED:
  - opener is left in the FIFO with a flag RecheckAfterWrite
  - such openers are skipped until the output stream on the entry is closed, then onCacheEntryCheck is re-invoked on them
  - Note: here is a potential for endless looping when RECHECK_AFTER_WRITE_FINISHED is abused
- result == ENTRY_NEEDS_REVALIDATION:
  - state = REVALIDATING, this prevents invocation of any callback until CacheEntry::SetValid is called
  - continue as in state ENTRY_WANTED (just bellow)
- result == ENTRY_WANTED:
  - consumer reference ++ (dropped back when the consumer releases the entry)
  - onCacheEntryAvailable is invoked on the opener with aNew = false and the entry
  - opener is removed from the FIFO
- result == ENTRY_NOT_WANTED:
  - onCacheEntryAvailable is invoked on the opener with null for the entry
  - opener is removed from the FIFO
state == WRITING or REVALIDATING:
- do nothing and exit
any other value of state is unexpected here (assertion failure)
loop this process while there are openers in the FIFO

CacheEntry::OnFileReady (entry atomic):

load result == failure or the file has not been found on disk (is new): state = EMPTY
otherwise: state = READY since the cache file has been found and is usable containing meta data and data of the entry
call to CacheEntry::InvokeCallbacks

CacheEntry::OnHandleClosed (entry atomic):

Called when any consumer throws the cache entry away
If the handle is not the handle given to the current writer, then exit
state == WRITING: the writer failed to call metaDataReady on the entry - state = EMPTY
state == REVALIDATING: the writer failed the re-validation process and failed to call setValid on the entry - state = READY
call to CacheEntry::InvokeCallbacks

All consumers release the reference:

the entry may now be purged (removed) from memory when found expired or least used on overrun of the memory pool limit
when this is a disk cache entry, its cached data chunks are released from memory and only meta data is kept

Intermediate memory caching of frequently used metadata (a.k.a. disk cache memory pool)

This is a description of this feature status that is currently only a patch in bug 986179. Current behavior is simpler and causes a serious memory consumption regression (bug 975367).

For the disk cache entries we keep some of the most recent and most used cache entries' meta data in memory for immediate zero-thread-loop opening. The default size of this meta data memory pool is only 250kB and is controlled by a new browser.cache.disk.metadata_memory_limit preference. When the limit is exceeded, we purge (throw away) first expired and then least used entries to free up memory again.

Only CacheEntry objects that are already loaded and filled with data and having the 'consumer reference == 0' (bug 942835) can be purged.

The 'least used' entries are recognized by the lowest value of frecency we re-compute for each entry on its every access. The decay time is controlled by the browser.cache.frecency_half_life_hours preference and defaults to 6 hours. The best decay time will be based on results of an experiment.

The memory pool is represented by two lists (strong refering ordered arrays) of CacheEntry objects:

Sorted by expiration time (that default to 0xFFFFFFFF)
Sorted by frecency (defaults to 0)

We have two such pools, one for memory-only entries actually representing the memory-only cache and one for disk cache entries for which we only keep the meta data. Each pool has a different limit checking - the memory cache pool is controlled by browser.cache.memory.capacity, the disk entries pool is already described above. The pool can be accessed and modified only on the cache background thread.

"@mozilla.org/netwerk/cache-storage-service;1"