dlmfs
==================
A minimal DLM userspace interface implemented via a virtual file
system.

dlmfs is built with OCFS2 as it requires most of its infrastructure.

Project web page:    http://oss.oracle.com/projects/ocfs2
Tools web page:      http://oss.oracle.com/projects/ocfs2-tools
OCFS2 mailing lists: http://oss.oracle.com/projects/ocfs2/mailman/

All code copyright 2005 Oracle except when otherwise noted.

CREDITS
=======

Some code taken from ramfs which is Copyright (C) 2000 Linus Torvalds
and Transmeta Corp.

Mark Fasheh <mark.fasheh@oracle.com>

Caveats
=======
- Right now it only works with the OCFS2 DLM, though support for other
  DLM implementations should not be a major issue.

Mount options
=============
None

Usage
=====

If you're just interested in OCFS2, then please see ocfs2.txt. The
rest of this document is geared towards those who want to use dlmfs
for clustered locking in userspace that is easy to set up and easy to
use.

Setup
=====

dlmfs requires that the OCFS2 cluster infrastructure be in
place. Please download ocfs2-tools from the Tools web page listed
above and configure a cluster.

You'll want to start heartbeating on a volume which all the nodes in
your lockspace can access. The easiest way to do this is via
ocfs2_hb_ctl (distributed with ocfs2-tools). Right now it requires
that an OCFS2 file system be in place so that it can automatically
find its heartbeat area, though it will eventually support heartbeat
against raw disks.

Please see the ocfs2_hb_ctl and mkfs.ocfs2 manual pages distributed
with ocfs2-tools.

Once you're heartbeating, DLM lock 'domains' can be easily created /
destroyed and locks within them accessed.

Locking
=======

Users may access dlmfs via standard file system calls, or they can use
'libo2dlm' (distributed with ocfs2-tools) which abstracts the file
system calls and presents a more traditional locking API.

dlmfs handles lock caching automatically for the user, so a lock
request for an already acquired lock will not generate another DLM
call. Userspace programs are assumed to handle their own local
locking.

Two levels of locks are supported: Shared Read and Exclusive.
Also supported is a Trylock operation.

For information on the libo2dlm interface, please see o2dlm.h,
distributed with ocfs2-tools.

Lock value blocks can be read and written to a resource via read(2)
and write(2) against the fd obtained via your open(2) call. The
maximum currently supported LVB length is 64 bytes (though that is an
OCFS2 DLM limitation). Through this mechanism, users of dlmfs can share
small amounts of data amongst their nodes.

mkdir(2) signals dlmfs to join a domain (which will have the same name
as the resulting directory).

rmdir(2) signals dlmfs to leave the domain.

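The two calls above can be sketched in Python as follows. This is only
an illustration: the helper names are hypothetical, and the mount point
"/dlm" is an assumption, since this document does not say where dlmfs
is mounted.

```python
import os

# Assumption: dlmfs is mounted here. Adjust to your own mount point.
DLMFS_ROOT = "/dlm"


def join_domain(name, root=DLMFS_ROOT, mode=0o755):
    """Join the lock domain `name` by creating its directory with mkdir(2)."""
    path = os.path.join(root, name)
    os.mkdir(path, mode)
    return path


def leave_domain(name, root=DLMFS_ROOT):
    """Leave the lock domain `name` by removing its directory with rmdir(2)."""
    os.rmdir(os.path.join(root, name))
```

A caller would typically run join_domain("mydomain") once at startup
and leave_domain("mydomain") on shutdown.
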
Locks for a given domain are represented by regular inodes inside the
domain directory.  Locking against them is done via the open(2) system
call.

The open(2) call will not return until your lock has been granted or
an error has occurred, unless it has been instructed to do a trylock
operation. If the lock succeeds, you'll get an fd.

Call open(2) with O_CREAT to ensure the resource inode is created.
dlmfs does not automatically create inodes for existing lock resources.

Open Flag     Lock Request Type
---------     -----------------
O_RDONLY      Shared Read
O_RDWR        Exclusive

Open Flag     Resulting Locking Behavior
---------     --------------------------
O_NONBLOCK    Trylock operation

You must provide exactly one of O_RDONLY or O_RDWR.

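As a rough sketch of how the flag tables above translate into code, the
Python helper below maps a lock level to the matching open(2) access
flag and adds O_CREAT so the resource inode exists. The names
LOCK_FLAGS and acquire are hypothetical, and the file mode 0644 is an
arbitrary choice.

```python
import os

# Lock levels from the table above mapped to open(2) access flags.
LOCK_FLAGS = {
    "shared": os.O_RDONLY,     # Shared Read
    "exclusive": os.O_RDWR,    # Exclusive
}


def acquire(lock_path, level="shared", mode=0o644):
    """Open the lock resource at the given level and return its fd.

    On dlmfs this call blocks until the lock is granted or an error
    occurs. O_CREAT ensures the resource inode is created.
    """
    flags = LOCK_FLAGS[level] | os.O_CREAT
    return os.open(lock_path, flags, mode)
```

For example, acquire("/dlm/mydomain/mylock", "exclusive") would request
an Exclusive lock, assuming a domain directory at that (hypothetical)
path.
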
If O_NONBLOCK is also provided and the trylock operation was valid but
could not lock the resource, then open(2) will fail with ETXTBSY.

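A trylock caller has to tell "resource is busy" apart from real
failures. The sketch below does so by checking the errno of the failed
open(2). The function name try_acquire is hypothetical, and the errno
is assumed to be the standard ETXTBSY ("Text file busy") value.

```python
import errno
import os


def try_acquire(lock_path, exclusive=False, mode=0o644):
    """Trylock: return an fd holding the lock, or None if it is busy."""
    access = os.O_RDWR if exclusive else os.O_RDONLY
    try:
        return os.open(lock_path, access | os.O_CREAT | os.O_NONBLOCK, mode)
    except OSError as err:
        if err.errno == errno.ETXTBSY:
            # The trylock was valid but the resource could not be locked.
            return None
        raise  # anything else is a genuine error
```

Returning None for the busy case lets the caller retry later or fall
back to a blocking open(2) without treating contention as an error.
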
close(2) drops the lock associated with your fd.

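Because the lock lives exactly as long as the fd, it can be convenient
to tie both to a scope. The following Python context manager is one
possible way to do that. The name held_lock and the file mode are
illustrative choices, not part of dlmfs or libo2dlm.

```python
import contextlib
import os


@contextlib.contextmanager
def held_lock(lock_path, exclusive=False):
    """Hold the lock for the duration of a `with` block."""
    access = os.O_RDWR if exclusive else os.O_RDONLY
    fd = os.open(lock_path, access | os.O_CREAT, 0o644)
    try:
        yield fd
    finally:
        os.close(fd)  # close(2) drops the lock associated with the fd
```

The lock is released even if the body of the `with` block raises an
exception.
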
Modes passed to mkdir(2) or open(2) are adhered to locally. Chown is
supported locally as well. This means you can use them to restrict
access to the resources via dlmfs on your local node only.

The resource LVB may be read from the fd in either Shared Read or
Exclusive modes via the read(2) system call. It can be written via
write(2) only when open in Exclusive mode.

Once written, an LVB will be visible to other nodes who obtain Shared
Read or higher level locks on the resource.

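The LVB rules above might be wrapped as shown below. This is a sketch
under stated assumptions: read_lvb and write_lvb are hypothetical
names, the 64 byte limit is the one quoted earlier in this document,
and both helpers act at the fd's current offset, so they assume a
freshly opened fd.

```python
import os

LVB_MAX = 64  # maximum LVB length stated in this document


def read_lvb(fd):
    """Read the LVB via read(2); valid for Shared Read or Exclusive fds."""
    return os.read(fd, LVB_MAX)


def write_lvb(fd, data):
    """Write the LVB via write(2); the fd must be open O_RDWR (Exclusive)."""
    if len(data) > LVB_MAX:
        raise ValueError("LVB is limited to %d bytes" % LVB_MAX)
    return os.write(fd, data)
```

A node would open the resource with O_RDWR, call write_lvb(fd, b"...")
and close the fd. Another node can then open the same resource with
O_RDONLY and call read_lvb(fd) to see the value.
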
See Also
========
http://opendlm.sourceforge.net/cvsmirror/opendlm/docs/dlmbook_final.pdf

See the document above for more information on the VMS distributed
locking API.