Sun's Network File System (NFS)
@OSTEP
Setup: one server that stores the data on its disks, and a number of clients that request data through well-formed protocol messages
Client-side file system
File server
Benefits
easy sharing of data across clients
centralized administration
backing up files can be done from the few server machines instead of from multiple clients
security
Note: reasons a server may crash (or appear to have crashed)
power outages, bugs, memory leaks, or a network that acts strangely (e.g., becomes partitioned, so the server looks dead to its clients)
Key: instead of building a proprietary and closed system, Sun developed an open protocol that simply specified the exact message formats clients and servers would use to communicate.
Goal of the protocol: simple and fast server crash recovery
the protocol is designed to deliver in each protocol request all the information that is needed in order to complete the request.
Stateful protocol
fd: shared (or distributed) state between the client and the server
This information is ephemeral (i.e., held in memory) and can be lost when the server crashes
Recovery protocol
Client keeps enough information around in its memory to be able to tell the server what it needs to know
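What if the client crashes?
E.g., a client opens a file and then crashes, so close() is never called; the server cannot know when it is safe to discard the state for that file.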
NFS: Stateless approach
each client operation contains all the information needed to complete the request. Client, at worst, may have to retry a request.
File Handle
E.g., the client-side FS sends a LOOKUP; if it succeeds, a file handle is returned. The client can then issue READ and WRITE protocol messages on the file. A READ passes the file handle along with the offset and the number of bytes to read. A WRITE passes the file handle, the offset, and the data, and gets back a success code.
The client-side file system tracks open files, and generally translates application requests into the relevant set of protocol messages. The server simply responds to protocol messages, each of which contains all information needed to complete request.
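The request flow above can be sketched as a toy in-memory server. This is a minimal illustration, not the NFS wire protocol: the class and method names are made up, the dictionaries stand in for the server's disk, and the three handle fields (volume, inode, generation) follow the OSTEP description of a file handle. Because every request carries the handle and offset, a "rebooted" server can serve the client's next READ without rebuilding any per-client state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileHandle:
    # Fields follow the OSTEP description of an NFS file handle.
    volume_id: int
    inode: int
    generation: int


class StatelessServer:
    """Hypothetical toy server: it keeps no record of which clients have what open."""

    def __init__(self, disk, names):
        self.disk = disk    # FileHandle -> bytearray, stands in for persistent storage
        self.names = names  # (directory FileHandle, name) -> FileHandle

    def lookup(self, dir_handle, name):
        return self.names[(dir_handle, name)]

    def read(self, handle, offset, count):
        # Everything needed to serve the request is in the arguments.
        return bytes(self.disk[handle][offset:offset + count])

    def write(self, handle, offset, data):
        buf = self.disk[handle]
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # zero-fill any gap
        buf[offset:offset + len(data)] = data
        return True


ROOT = FileHandle(1, 2, 0)
FOO = FileHandle(1, 17, 0)
disk = {FOO: bytearray(b"hello world")}
names = {(ROOT, "foo.txt"): FOO}

server = StatelessServer(disk, names)
fh = server.lookup(ROOT, "foo.txt")
first = server.read(fh, 0, 5)

# Simulated crash and reboot: a fresh server object over the same "disk".
server = StatelessServer(disk, names)
second = server.read(fh, 6, 5)  # the old handle still works
```

The client-side file system is the one that remembers the current offset for each open file; the server never does.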
Idempotency: a request is idempotent when the effect of performing the operation multiple times is equivalent to the effect of performing it a single time.
Requests in NFS that are idempotent
LOOKUP, READ
WRITE: which contains the data, the count, and (importantly) the exact offset to write data to
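Some requests are hard to make idempotent
MKDIR: if the first request succeeds but its reply is lost, the retry fails because the directory already exists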
But PERFECT IS THE ENEMY OF THE GOOD (VOLTAIRE'S LAW). Accepting that life isn't perfect and still building the system is a sign of good engineering.
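To make the distinction concrete, here is a small sketch with three hypothetical helper functions (none of them are NFS calls). Writing at an explicit offset gives the same result no matter how many times it is repeated, appending does not, and a repeated mkdir fails the second time.

```python
def write_at(buf, offset, data):
    """Write data at an explicit offset (like NFS WRITE); returns the new contents."""
    out = bytearray(buf)
    out[offset:offset + len(data)] = data
    return bytes(out)


def append(buf, data):
    """Append at the current end of file; the result depends on how often it runs."""
    return buf + data


def mkdir(dirs, name):
    """Create a directory entry; fails if it already exists."""
    if name in dirs:
        raise FileExistsError(name)
    return dirs | {name}


# Idempotent: doing the write twice equals doing it once.
once = write_at(b"aaaaaa", 2, b"XY")
twice = write_at(once, 2, b"XY")

# Not idempotent: the second append changes the result again.
appended_once = append(b"aa", b"b")
appended_twice = append(appended_once, b"b")

# Not idempotent: a retried mkdir reports failure even though the first try worked.
dirs = mkdir(set(), "d")
try:
    mkdir(dirs, "d")
    retry_failed = False
except FileExistsError:
    retry_failed = True
```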
Three types of loss
Request loss
Server down
Reply lost on way back from server
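After sending a request, the client sets a timer to go off after a specified time period. If the timer goes off before a reply arrives, the client assumes the request was not processed and sends it again. Because most requests are idempotent, the same retry handles all three types of loss.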
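The retry loop can be sketched as follows. This is a simulation under stated assumptions: `FlakyServer`, `Timeout`, and `call_with_retry` are hypothetical names, and a raised `Timeout` stands in for the client's timer expiring (no real network or clock is involved). From the client's point of view a server that is down looks the same as a lost request, so only two failure kinds are modelled.

```python
class Timeout(Exception):
    """Stands in for the client's timer going off before a reply arrives."""


class FlakyServer:
    def __init__(self, failures):
        self.failures = list(failures)  # failures to inject, one per incoming call
        self.data = bytearray(8)
        self.writes_applied = 0

    def write(self, offset, data):
        failure = self.failures.pop(0) if self.failures else None
        if failure == "request_lost":
            raise Timeout()  # the request never reached the server
        self.data[offset:offset + len(data)] = data
        self.writes_applied += 1
        if failure == "reply_lost":
            raise Timeout()  # the write happened, but the client never hears back
        return "OK"


def call_with_retry(op, max_attempts=5):
    """Resend the same request until a reply arrives or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return op(), attempt
        except Timeout:
            continue
    raise Timeout(f"gave up after {max_attempts} attempts")


server = FlakyServer(["request_lost", "reply_lost"])
status, attempts = call_with_retry(lambda: server.write(2, b"hi"))
```

In this run the write is applied twice on the server (once before the lost reply, once on the final retry), yet the file contents are the same as if it had been applied once, which is why the idempotent WRITE makes blind retry safe.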
In-memory caching: first access expensive (network communication), and subsequent ones are quick (from client memory)
Temporary buffer for writes
Update Visibility
C2 might buffer its writes in its cache for a time before propagating them to the server. C3 (or other client) might get stale versions of the file
Stale Cache
A client's cache might still contain a stale version of the data, even after the server-side data has changed
NFS handles these as follows:
Flush-on-close (close-to-open) consistency semantics
When a file is written to and subsequently closed by a client application, the client flushes all updates to the server
Performance problem
If a short-lived (temporary) file is created and then soon deleted, it is still forced to the server on close
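A minimal sketch of flush-on-close, assuming a toy in-memory server and a hypothetical `ClientFile` wrapper (neither is a real NFS client API). Writes sit in the client's buffer and become visible on the server only once `close()` is called.

```python
class ToyServer:
    def __init__(self):
        self.files = {}  # name -> bytearray

    def write(self, name, offset, data):
        buf = self.files.setdefault(name, bytearray())
        if len(buf) < offset:
            buf.extend(bytes(offset - len(buf)))  # zero-fill any gap
        buf[offset:offset + len(data)] = data

    def read(self, name):
        return bytes(self.files.get(name, b""))


class ClientFile:
    """Hypothetical client-side open file that buffers writes until close()."""

    def __init__(self, server, name):
        self.server = server
        self.name = name
        self.pending = []  # buffered (offset, data) pairs

    def write(self, offset, data):
        self.pending.append((offset, data))  # stays in client memory for now

    def close(self):
        # Flush-on-close: push every buffered update to the server.
        for offset, data in self.pending:
            self.server.write(self.name, offset, data)
        self.pending.clear()


server = ToyServer()
f = ClientFile(server, "notes.txt")
f.write(0, b"new data")
before_close = server.read("notes.txt")  # other clients would still see nothing
f.close()
after_close = server.read("notes.txt")
```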
Attribute cache: hard to reason about exactly which version of a file the client is getting
Before using a cached block, send a GETATTR request to the server to fetch the file's attributes (last-modified time)
If the time-of-modification is more recent than the time the file was fetched, then client invalidates the file and removes it from the cache
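Problem: the server is then flooded with GETATTR requests!
Fix: introduce an attribute cache on each client. The client still validates a file before using it, but usually just looks up the locally cached attributes, which time out after a short interval (e.g., a few seconds)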
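The validation logic and the attribute cache can be sketched together. Everything here is an illustrative assumption: the class names, the single-file server, the integer modification times, the 3-second timeout, and the `now` argument that replaces a real clock so the example is deterministic.

```python
class AttrServer:
    """Toy server holding one file and counting GETATTR calls."""

    def __init__(self):
        self.data = b"v1"
        self.mtime = 1
        self.getattr_calls = 0

    def getattr(self):
        self.getattr_calls += 1
        return self.mtime

    def read(self):
        return self.data, self.mtime

    def update(self, data):
        self.data = data
        self.mtime += 1


class CachingClient:
    def __init__(self, server, attr_ttl=3.0):
        self.server = server
        self.attr_ttl = attr_ttl  # assumed timeout for cached attributes
        self.cached = None        # (data, mtime when fetched)
        self.attr = None          # (mtime, time the attribute was cached)

    def _mtime(self, now):
        # Only send GETATTR when the cached attribute is missing or expired.
        if self.attr is None or now - self.attr[1] >= self.attr_ttl:
            self.attr = (self.server.getattr(), now)
        return self.attr[0]

    def read(self, now):
        mtime = self._mtime(now)
        # Invalidate the cached data if the file changed after it was fetched.
        if self.cached is None or mtime > self.cached[1]:
            self.cached = self.server.read()
        return self.cached[0]


server = AttrServer()
client = CachingClient(server)
r1 = client.read(now=0.0)  # cold cache: GETATTR, then fetch the data
r2 = client.read(now=1.0)  # served from cache, no GETATTR
server.update(b"v2")
r3 = client.read(now=2.0)  # attribute cache still fresh: stale data returned
r4 = client.read(now=3.5)  # attribute expired: GETATTR sees the change, refetch
```

The third read shows the cost of the optimization: within the timeout window the client can return an old version without noticing, which is why it becomes hard to say exactly which version of a file a client sees.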
Server-side write buffering: the server must NOT return success on a WRITE protocol request until the write has been forced to stable storage (disk or another persistent device)
This allows the client to detect a server failure during a write and retry until the write finally succeeds
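But: forcing every write to stable storage can make writes a major performance bottleneck
Tricks:
Battery-backed memory: the server can reply to a WRITE quickly without fear of losing the data and without paying the cost of writing to disk right away
A file system designed to write to disk quickly when it finally needs to do so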
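The rule of committing before replying can be sketched with ordinary file calls. This is only an illustration of the ordering, with a hypothetical `handle_write` function acting on a local temporary file; a real NFS server is far more involved. The point is that the success code is returned only after `os.fsync` has asked the operating system to push the data to the storage device.

```python
import os
import tempfile


def handle_write(path, offset, data):
    """Hypothetical server-side WRITE handler: commit first, reply second."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
        f.flush()             # move data from Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to force it to the storage device
    return "OK"               # the success reply is sent only after the fsync


# Demo on a temporary file standing in for a file on the server's disk.
fd, path = tempfile.mkstemp()
os.write(fd, b"aaaaaa")
os.close(fd)

status = handle_write(path, 2, b"XY")
with open(path, "rb") as f:
    contents = f.read()
os.remove(path)
```

If the handler replied before the `fsync` and the server crashed right afterwards, the client would believe the write had succeeded even though the data was gone, and it would never retry.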