


Date: Sun 25-Dec-1989 04:52:46
From: Unknown
Subject: Re: nfs failure between Gould NP1 and Personal Iris

In article <MIKE.89Nov27094420@cfdl.larc.nasa.gov>, mike@cfdl.larc.nasa.gov (Mike Walker) writes:
> I am having a strange error occur on a NFS mounted partition on our PI.
> First a little info about the machines involved:
>
>   1) Personal Iris 4D-20 w/ Irix 3.2
>   2) Gould NP1 w/ UTX/32 3.1 (BSD w/ SVR3 extensions)
>   3) Sun 3/280 w/ Sun Unix 3.4
>
> I have one file system from machines 2 and 3 above mounted on the PI.
> Everything seems to work fine with the Sun based fs, but certain
> operations fail on the fs mounted off of the Gould.  Symptoms:
>
>   - ls works everywhere
>   - cat, grep, etc. (normal file access) works everywhere
>   - echo * fails only on the Gould file-system (error: ``no match'')
>   - find fails only on the Gould file-system
>     (error: ``getwd: read error in ..'')
>   - None of these problems show up on the Sun or the Gould using local,
>     Sun NFS, or Irix NFS file-systems.

Mike informed me via private communication that only the C-shell's echo
failed to match * against visible filenames; 'echo *' in the Bourne shell
worked as expected.  This clue, plus Ethernet packet traces captured by
Mike (thanks!), exposed a server bug seen at previous Connectathons (a
Connectathon is an annual NFS interoperation conference thrown by Sun,
attended by most NFS vendors).

Clients may call the NFS readdir remote procedure with an arbitrary byte
count indicating the number of bytes allocated for filesystem-independent
directory entries.  The reference NFS server code uses this byte count to
allocate space for server-dependent directory entries, and calls the local
filesystem to read the directory.  Older reference NFS ports contained BSD
Fast File System (FFS) readdir code that failed with EINVAL if the
requested byte count was less than, or not congruent with, DIRBLKSIZ.
DIRBLKSIZ is typically 512.

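The byte-count check described above can be modelled in a few lines. This is only a sketch, not the reference port's code: the function name `ffs_readdir_check` is hypothetical, "not congruent with" is read here as "not a multiple of", and the server-side DIRBLKSIZ of 1024 is an assumed value chosen for illustration.

```python
EINVAL = 22        # error number quoted in the post
DIRBLKSIZ = 1024   # assumed server value; the post says 512 is typical


def ffs_readdir_check(count, dirblksiz=DIRBLKSIZ):
    """Model of the older FFS readdir argument check (hypothetical name).

    Returns 0 if the requested byte count would be accepted, or EINVAL
    if it is smaller than, or not a multiple of, the directory block size.
    """
    if count < dirblksiz or count % dirblksiz != 0:
        return EINVAL
    return 0


if __name__ == "__main__":
    for count in (512, 1024, 1536, 4096):
        print(count, ffs_readdir_check(count))
```

Under these assumptions a 512-byte request is rejected by a server whose DIRBLKSIZ is 1024, while the same request passes when DIRBLKSIZ is 512, and a 4096-byte request passes in both cases.
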
SGI's C-shell, and several other BSD-derived programs that SGI ships, use
a byte count of 512 when they call the BSD version of readdir(3B).  If the
directory is remote, and if its NFS server is based on an older NFS
reference port and has a DIRBLKSIZ of, say, 1024, the server will reject
the client's readdir call with a status code equal to EINVAL (22).  This
is exactly what Mike's Gould server does, so it is likely that Gould has
defined their DIRBLKSIZ to be 1024 (perhaps because their disks use
1024-byte sectors).  Our C-shell, a straight port of 4.3BSD csh, doesn't
check for readdir errors, so the EINVAL causes 'echo *' to silently
complete, apparently successfully, but with "No match".  The Bourne shell
uses the AT&T-based readdir(3C) routine, which asks for 4096 bytes worth
of directory entries, thus avoiding the bug.

Note that the NFS protocol doesn't define EINVAL as a well-known status
code -- however, the protocol's status codes are defined by enumerating
certain 4.2BSD/SunOS intro(2) error numbers, and all NFS implementations
that I've seen from Sun fail to check for error numbers not in the status
enumeration, in order to avoid sending them.  Almost any server error code
could leak through the protocol.  Our NFS maps unspecified error numbers
such as EINVAL onto the NFSERR_IO status code.  Gould's NFS does not.

NFS implementors have always relied on the Sun reference ports of NFS to
4.3BSD for standardization, lacking a complete spec (the NFS version 2
protocol has an RFC, but it doesn't place any restrictions on readdir's
byte count argument; it doesn't even distinguish between client and server
uses of this number).  The latest reference port (NFSSRC4.0.x) that Sun
has shipped to licensed NFS vendors has fixed BSD FFS readdir to accept
any byte count.  Perhaps Gould has, or will soon have, a version of NFS
based on this release.

Brendan Eich
Silicon Graphics, Inc.
brendan@sgi.com

======================================================================
Alex Woo, MS 227-2            |  woo@pioneer.arc.nasa.gov
NASA Ames Research Center     |  woo@ames-nas.arpa
Moffett Field, CA 94035       |  {seismo,topaz,lll-crg,ucbvax}!
======================================================================
{hplabs,hao,ihnp4,decwrl,allegra,tektronix,menlo70}!ames!pioneer!woo
======================================================================
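
The post contrasts two ways a server can turn a local error number into an NFS status: passing it through unchecked, or mapping anything outside the protocol's status enumeration onto NFSERR_IO. A minimal sketch of the difference follows. The function names are hypothetical, and the set of status values is taken from the NFS version 2 status enumeration as I understand it from the RFC, so treat the list as an assumption to verify rather than as either vendor's code.

```python
NFS_OK = 0
NFSERR_IO = 5
EINVAL = 22  # error number quoted in the post

# Values assumed to be in the NFS version 2 status enumeration.
NFS2_STATUS_VALUES = frozenset(
    {0, 1, 2, 5, 6, 13, 17, 19, 20, 21, 27, 28, 30, 63, 66, 69, 70, 99}
)


def leaky_status(err):
    """Pass the local error number through unchecked (the leak)."""
    return err


def mapped_status(err):
    """Map error numbers outside the enumeration onto NFSERR_IO."""
    return err if err in NFS2_STATUS_VALUES else NFSERR_IO


if __name__ == "__main__":
    print(leaky_status(EINVAL), mapped_status(EINVAL))
```

With the mapping in place a client never sees a status outside the enumeration; without it, a value such as 22 reaches the client as is.
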
