| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | 1 | This is the implementation of the SystemV/Coherent filesystem for Linux. |
|  | 2 | It grew out of separate filesystem implementations | 
|  | 3 |  | 
|  | 4 | Xenix FS      Doug Evans <dje@cygnus.com>  June 1992 | 
|  | 5 | SystemV FS    Paul B. Monday <pmonday@eecs.wsu.edu> March-June 1993 | 
|  | 6 | Coherent FS   B. Haible <haible@ma2s2.mathematik.uni-karlsruhe.de> June 1993 | 
|  | 7 |  | 
|  | 8 | and was merged together in July 1993. | 
|  | 9 |  | 
|  | 10 | These filesystems are rather similar. Here is a comparison with Minix FS: | 
|  | 11 |  | 
|  | 12 | * Linux fdisk reports on partitions | 
|  | 13 | - Minix FS     0x81 Linux/Minix | 
|  | 14 | - Xenix FS     ?? | 
|  | 15 | - SystemV FS   ?? | 
|  | 16 | - Coherent FS  0x08 AIX bootable | 
|  | 17 |  | 
|  | 18 | * Size of a block or zone (data allocation unit on disk) | 
|  | 19 | - Minix FS     1024 | 
|  | 20 | - Xenix FS     1024 (also 512 ??) | 
|  | 21 | - SystemV FS   1024 (also 512 and 2048) | 
|  | 22 | - Coherent FS   512 | 
|  | 23 |  | 
|  | 24 | * General layout: all have one boot block, one super block and | 
|  | 25 | separate areas for inodes and for directories/data. | 
|  | 26 | On SystemV Release 2 FS (e.g. Microport) the first track is reserved and | 
|  | 27 | all the block numbers (including the super block) are offset by one track. | 
|  | 28 |  | 
|  | 29 | * Byte ordering of "short" (16 bit entities) on disk: | 
|  | 30 | - Minix FS     little endian  0 1 | 
|  | 31 | - Xenix FS     little endian  0 1 | 
|  | 32 | - SystemV FS   little endian  0 1 | 
|  | 33 | - Coherent FS  little endian  0 1 | 
|  | 34 | Of course, this affects only the file system, not the data of files on it! | 
|  | 35 |  | 
|  | 36 | * Byte ordering of "long" (32 bit entities) on disk: | 
|  | 37 | - Minix FS     little endian  0 1 2 3 | 
|  | 38 | - Xenix FS     little endian  0 1 2 3 | 
|  | 39 | - SystemV FS   little endian  0 1 2 3 | 
|  | 40 | - Coherent FS  PDP-11         2 3 0 1 | 
|  | 41 | Of course, this affects only the file system, not the data of files on it! | 
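The two 32-bit orderings above can be sketched in C as follows. This is a minimal illustration, not the driver's code: it reads the digits in the table as byte significance (0 = least significant), listed in on-disk order. The function names le_decode32 and pdp11_decode32 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little endian, on-disk order "0 1 2 3": least significant byte first. */
static uint32_t le_decode32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/*
 * PDP-11, on-disk order "2 3 0 1": the high 16-bit word comes first,
 * and each 16-bit word is itself little endian.
 */
static uint32_t pdp11_decode32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[2]
         | (uint32_t)p[3] << 8
         | (uint32_t)p[0] << 16
         | (uint32_t)p[1] << 24;
}
```

Under that reading, the value 0x0A0B0C0D is stored as the bytes 0D 0C 0B 0A on a little-endian filesystem and as 0B 0A 0D 0C on Coherent FS.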
|  | 42 |  | 
|  | 43 | * Inode on disk: "short", 0 means non-existent, the root dir ino is: | 
|  | 44 | - Minix FS                            1 | 
|  | 45 | - Xenix FS, SystemV FS, Coherent FS   2 | 
|  | 46 |  | 
|  | 47 | * Maximum number of hard links to a file: | 
|  | 48 | - Minix FS     250 | 
|  | 49 | - Xenix FS     ?? | 
|  | 50 | - SystemV FS   ?? | 
|  | 51 | - Coherent FS  >=10000 | 
|  | 52 |  | 
|  | 53 | * Free inode management: | 
|  | 54 | - Minix FS                             a bitmap | 
|  | 55 | - Xenix FS, SystemV FS, Coherent FS | 
|  | 56 | There is a cache of a certain number of free inodes in the super-block. | 
|  | 57 | When it is exhausted, new free inodes are found using a linear search. | 
|  | 58 |  | 
|  | 59 | * Free block management: | 
|  | 60 | - Minix FS                             a bitmap | 
|  | 61 | - Xenix FS, SystemV FS, Coherent FS | 
|  | 62 | Free blocks are organized in a "free list". Maybe a misleading term, | 
|  | 63 | since it is not true that every free block contains a pointer to | 
|  | 64 | the next free block. Rather, the free blocks are organized in chunks | 
|  | 65 | of limited size, and every now and then a free block contains pointers | 
|  | 66 | to the free blocks pertaining to the next chunk; the first of these | 
|  | 67 | contains pointers and so on. The list terminates with a "block number" | 
|  | 68 | 0 on Xenix FS and SystemV FS, with a block zeroed out on Coherent FS. | 
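As a rough sketch of the chunked free list described above, the following C code counts free zones by walking the chain of chunks. Several things are assumptions made for illustration only: the struct free_chunk layout (a count followed by an array of zone numbers), the convention that entry 0 of each chunk links to the next chunk, the tiny NICFREE value, and the in-memory disk[] array that stands in for reading a block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NICFREE 4   /* illustrative only; the real value is much larger */

/* Assumed chunk layout: a count followed by zone numbers. */
struct free_chunk {
    uint16_t count;            /* valid entries in zones[] */
    uint32_t zones[NICFREE];   /* zones[0] links to the next chunk */
};

/*
 * Count free zones starting from the super-block cache. disk[] is a
 * hypothetical stand-in for the device: disk[z] holds the chunk that
 * is stored in free zone z.
 */
static unsigned long count_free_zones(const struct free_chunk *cache,
                                      const struct free_chunk *disk,
                                      size_t nzones)
{
    unsigned long total = 0;
    const struct free_chunk *c = cache;
    size_t hops = 0;

    for (;;) {
        uint32_t zone = 0;
        unsigned n = c->count;

        assert(n <= NICFREE);
        /* Walk the chunk from the top; a zero entry ends the list. */
        while (n > 0 && (zone = c->zones[--n]) != 0)
            total++;
        if (zone == 0)
            break;              /* "block number" 0 or a zeroed chunk */
        if (zone >= nzones || ++hops > nzones)
            break;              /* corrupt list: stop instead of looping */
        c = &disk[zone];        /* the last entry read is the link block */
    }
    return total;
}
```

A chunk read from a zeroed-out block has a count of 0, so the same loop stops on both kinds of terminator mentioned above.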
|  | 69 |  | 
|  | 70 | * Super-block location: | 
|  | 71 | - Minix FS     block 1 = bytes 1024..2047 | 
|  | 72 | - Xenix FS     block 1 = bytes 1024..2047 | 
|  | 73 | - SystemV FS   bytes 512..1023 | 
|  | 74 | - Coherent FS  block 1 = bytes 512..1023 | 
|  | 75 |  | 
|  | 76 | * Super-block layout: | 
|  | 77 | - Minix FS | 
|  | 78 | unsigned short s_ninodes; | 
|  | 79 | unsigned short s_nzones; | 
|  | 80 | unsigned short s_imap_blocks; | 
|  | 81 | unsigned short s_zmap_blocks; | 
|  | 82 | unsigned short s_firstdatazone; | 
|  | 83 | unsigned short s_log_zone_size; | 
|  | 84 | unsigned long s_max_size; | 
|  | 85 | unsigned short s_magic; | 
|  | 86 | - Xenix FS, SystemV FS, Coherent FS | 
|  | 87 | unsigned short s_firstdatazone; | 
|  | 88 | unsigned long  s_nzones; | 
|  | 89 | unsigned short s_fzone_count; | 
|  | 90 | unsigned long  s_fzones[NICFREE]; | 
|  | 91 | unsigned short s_finode_count; | 
|  | 92 | unsigned short s_finodes[NICINOD]; | 
|  | 93 | char           s_flock; | 
|  | 94 | char           s_ilock; | 
|  | 95 | char           s_modified; | 
|  | 96 | char           s_rdonly; | 
|  | 97 | unsigned long  s_time; | 
|  | 98 | short          s_dinfo[4]; -- SystemV FS only | 
|  | 99 | unsigned long  s_free_zones; | 
|  | 100 | unsigned short s_free_inodes; | 
|  | 101 | short          s_dinfo[4]; -- Xenix FS only | 
|  | 102 | unsigned short s_interleave_m,s_interleave_n; -- Coherent FS only | 
|  | 103 | char           s_fname[6]; | 
|  | 104 | char           s_fpack[6]; | 
|  | 105 | then they differ considerably: | 
|  | 106 | Xenix FS | 
|  | 107 | char           s_clean; | 
|  | 108 | char           s_fill[371]; | 
|  | 109 | long           s_magic; | 
|  | 110 | long           s_type; | 
|  | 111 | SystemV FS | 
|  | 112 | long           s_fill[12 or 14]; | 
|  | 113 | long           s_state; | 
|  | 114 | long           s_magic; | 
|  | 115 | long           s_type; | 
|  | 116 | Coherent FS | 
|  | 117 | unsigned long  s_unique; | 
|  | 118 | Note that Coherent FS has no magic number. |
|  | 119 |  | 
|  | 120 | * Inode layout: | 
|  | 121 | - Minix FS | 
|  | 122 | unsigned short i_mode; | 
|  | 123 | unsigned short i_uid; | 
|  | 124 | unsigned long  i_size; | 
|  | 125 | unsigned long  i_time; | 
|  | 126 | unsigned char  i_gid; | 
|  | 127 | unsigned char  i_nlinks; | 
|  | 128 | unsigned short i_zone[7+1+1]; | 
|  | 129 | - Xenix FS, SystemV FS, Coherent FS | 
|  | 130 | unsigned short i_mode; | 
|  | 131 | unsigned short i_nlink; | 
|  | 132 | unsigned short i_uid; | 
|  | 133 | unsigned short i_gid; | 
|  | 134 | unsigned long  i_size; | 
|  | 135 | unsigned char  i_zone[3*(10+1+1+1)]; | 
|  | 136 | unsigned long  i_atime; | 
|  | 137 | unsigned long  i_mtime; | 
|  | 138 | unsigned long  i_ctime; | 
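The i_zone field of the second layout packs 13 zone numbers (10+1+1+1) into 3 bytes each. The sketch below unpacks one of them. The byte order inside each 3-byte group is not stated above, so least-significant-byte-first is an assumption that would only fit the little-endian variants; the names SYSV_NADDR and sysv_zone_le are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYSV_NADDR 13   /* 10 direct + 1 + 1 + 1 indirect entries */

/*
 * Return zone number `index` from a packed i_zone array of
 * 3 * SYSV_NADDR bytes. Assumes the 3 bytes are stored least
 * significant byte first (an assumption, see the text above).
 */
static uint32_t sysv_zone_le(const unsigned char *i_zone, size_t index)
{
    assert(i_zone != NULL);
    assert(index < SYSV_NADDR);

    const unsigned char *p = i_zone + 3 * index;
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16;
}
```

Three bytes per entry limit a zone number to 24 bits, which is why the on-disk inode can hold 13 addresses in 39 bytes.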
|  | 139 |  | 
|  | 140 | * Regular file data blocks are organized as | 
|  | 141 | - Minix FS | 
|  | 142 | 7 direct blocks | 
|  | 143 | 1 indirect block (pointers to blocks) | 
|  | 144 | 1 double-indirect block (pointer to pointers to blocks) | 
|  | 145 | - Xenix FS, SystemV FS, Coherent FS | 
|  | 146 | 10 direct blocks | 
|  | 147 | 1 indirect block (pointers to blocks) | 
|  | 148 | 1 double-indirect block (pointer to pointers to blocks) | 
|  | 149 | 1 triple-indirect block (pointer to pointers to pointers to blocks) | 
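To get a feel for what these block trees can address, the sketch below counts the data blocks reachable through the direct and indirect entries. The pointer width inside an indirect block is an assumption: 2 bytes for Minix FS (matching its unsigned short zone numbers) and 4 bytes for the other three. The function name max_file_blocks is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of data blocks addressable with `direct` direct entries plus
 * one indirect tree of each depth from 1 to `levels`.
 */
static uint64_t max_file_blocks(unsigned direct, unsigned levels,
                                unsigned block_size, unsigned ptr_size)
{
    assert(ptr_size > 0 && block_size >= ptr_size);

    uint64_t per_block = block_size / ptr_size;  /* pointers per block */
    uint64_t total = direct;
    uint64_t span = 1;

    for (unsigned level = 0; level < levels; level++) {
        span *= per_block;   /* blocks reachable at this depth */
        total += span;
    }
    return total;
}
```

With these assumptions, max_file_blocks(7, 2, 1024, 2) gives 262663 blocks for Minix FS and max_file_blocks(10, 3, 1024, 4) gives 16843018 blocks for a 1024-byte SystemV FS. This is only the addressing limit; other fields, such as the 32-bit i_size or the Minix s_max_size, may impose a smaller bound.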
|  | 150 |  | 
|  | 151 | * Inode size, inodes per block | 
|  | 152 | - Minix FS        32   32 | 
|  | 153 | - Xenix FS        64   16 | 
|  | 154 | - SystemV FS      64   16 | 
|  | 155 | - Coherent FS     64    8 | 
|  | 156 |  | 
|  | 157 | * Directory entry on disk | 
|  | 158 | - Minix FS | 
|  | 159 | unsigned short inode; | 
|  | 160 | char name[14/30]; | 
|  | 161 | - Xenix FS, SystemV FS, Coherent FS | 
|  | 162 | unsigned short inode; | 
|  | 163 | char name[14]; | 
|  | 164 |  | 
|  | 165 | * Dir entry size, dir entries per block | 
|  | 166 | - Minix FS     16/32    64/32 | 
|  | 167 | - Xenix FS     16       64 | 
|  | 168 | - SystemV FS   16       64 | 
|  | 169 | - Coherent FS  16       32 | 
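The 16-byte directory entry shared by Xenix FS, SystemV FS and Coherent FS can be read with a few lines of C. This is a minimal sketch: it relies on the little-endian "short" ordering and the "inode 0 means non-existent" rule given earlier, and the names sysv_dirent and sysv_read_dirent are hypothetical. The in-memory struct adds one byte so that a full 14-character name can be NUL-terminated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SYSV_NAMELEN     14
#define SYSV_DIRENT_SIZE 16   /* 2-byte inode number + 14-byte name */

/* In-memory copy of one entry, with room for a terminating NUL. */
struct sysv_dirent {
    uint16_t inode;
    char     name[SYSV_NAMELEN + 1];
};

/*
 * Read entry `slot` from a directory block.
 * Returns 1 for a used entry, 0 for an unused one (inode 0),
 * and -1 if `slot` lies outside the block.
 */
static int sysv_read_dirent(const unsigned char *block, size_t block_size,
                            size_t slot, struct sysv_dirent *out)
{
    assert(block_size % SYSV_DIRENT_SIZE == 0);
    if (slot >= block_size / SYSV_DIRENT_SIZE)
        return -1;

    const unsigned char *p = block + slot * SYSV_DIRENT_SIZE;
    out->inode = (uint16_t)(p[0] | p[1] << 8);   /* little-endian short */
    memcpy(out->name, p + 2, SYSV_NAMELEN);
    out->name[SYSV_NAMELEN] = '\0';
    return out->inode != 0;
}
```

The bounds check reflects the table above: a 1024-byte block holds 64 such entries and a 512-byte Coherent FS block holds 32.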
|  | 170 |  | 
|  | 171 | * How to implement symbolic links such that the host fsck doesn't scream: | 
|  | 172 | - Minix FS     normal | 
|  | 173 | - Xenix FS     kludge: as regular files with  chmod 1000 | 
|  | 174 | - SystemV FS   ?? | 
|  | 175 | - Coherent FS  kludge: as regular files with  chmod 1000 | 
|  | 176 |  | 
|  | 177 |  | 
|  | 178 | Notation: We often speak of a "block" but mean a zone (the allocation unit) | 
|  | 179 | and not the disk driver's notion of "block". | 
|  | 180 |  | 
|  | 181 |  | 
|  | 182 | Bruno Haible  <haible@ma2s2.mathematik.uni-karlsruhe.de> |