/*****************************************/
Kernel Connector.
/*****************************************/

Kernel connector - a new, easy to use, netlink based userspace <-> kernel
space communication module.

The Connector driver makes it easy to connect various agents using a
netlink based network.  One must register a callback and an identifier.
When the driver receives a special netlink message with the appropriate
identifier, the appropriate callback will be called.

From the userspace point of view it's quite straightforward:

	socket();
	bind();
	send();
	recv();

But if kernelspace wants to use the full power of such connections, the
driver writer must create special sockets, must know about struct sk_buff
handling, etc.  The Connector driver allows any kernelspace agent to use
netlink based networking for inter-process communication in a significantly
easier way:

int cn_add_callback(struct cb_id *id, char *name, void (*callback) (struct cn_msg *, struct netlink_skb_parms *));
int cn_netlink_send(struct cn_msg *msg, u32 __group, int gfp_mask);

struct cb_id
{
	__u32			idx;
	__u32			val;
};

idx and val are unique identifiers which must be registered in the
connector.h header for in-kernel usage.  void (*callback) (struct cn_msg *,
struct netlink_skb_parms *) is a callback function which will be called
when a message with the above idx.val is received by the connector core.
Its first argument is the received struct cn_msg; the second carries the
sender's credentials.

struct cn_msg
{
	struct cb_id		id;

	__u32			seq;
	__u32			ack;

	__u32			len;		/* Length of the following data */
	__u8			data[0];
};

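To make the layout concrete, here is a minimal userspace sketch that packs
a struct cn_msg and its payload into a netlink message buffer.  It assumes
the definitions exported by <linux/netlink.h> and <linux/connector.h>
(whose field widths may differ from the listing above), and it assumes
NLMSG_DONE as the netlink message type.  build_cn_packet() and its
parameters are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <string.h>
#include <linux/netlink.h>
#include <linux/connector.h>

/*
 * Pack one connector message into buf as
 *   [struct nlmsghdr][struct cn_msg][payload]
 * Returns the total netlink message length, or 0 if buf is too small.
 * buf must be suitably aligned for struct nlmsghdr.
 * build_cn_packet() is a hypothetical helper, not a library function.
 */
size_t build_cn_packet(void *buf, size_t buflen,
		       __u32 idx, __u32 val, __u32 seq,
		       const void *payload, __u16 len)
{
	size_t total = NLMSG_LENGTH(sizeof(struct cn_msg) + len);
	struct nlmsghdr *nlh = buf;
	struct cn_msg *msg;

	if (total > buflen)
		return 0;

	memset(buf, 0, total);
	nlh->nlmsg_len = total;
	nlh->nlmsg_type = NLMSG_DONE;	/* assumed type for connector messages */
	nlh->nlmsg_seq = seq;		/* seq may be mirrored in the netlink header */

	msg = NLMSG_DATA(nlh);		/* cn_msg starts right after nlmsghdr */
	msg->id.idx = idx;
	msg->id.val = val;
	msg->seq = seq;
	msg->ack = 0;
	msg->len = len;			/* length of the data that follows */
	if (len)
		memcpy(msg + 1, payload, len);	/* payload follows the header */

	return total;
}
```

The returned length is what a caller would pass to send() on a
NETLINK_CONNECTOR socket.
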
/*****************************************/
Connector interfaces.
/*****************************************/

int cn_add_callback(struct cb_id *id, char *name, void (*callback) (struct cn_msg *, struct netlink_skb_parms *));

 Registers a new callback with the connector core.

 struct cb_id *id		- unique connector's user identifier.
				  It must be registered in connector.h for legal in-kernel users.
 char *name			- connector's callback symbolic name.
 void (*callback) (struct cn..)	- connector's callback.
				  It receives the cn_msg and the sender's credentials.


void cn_del_callback(struct cb_id *id);

 Unregisters a callback from the connector core.

 struct cb_id *id		- unique connector's user identifier.


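The register/dispatch/unregister cycle described above can be illustrated
with a small userspace model.  This is not the kernel implementation: every
model_* name is hypothetical, the table size is arbitrary, and rejecting a
duplicate id is an assumption of the model.  It only shows how a message is
routed to the callback whose idx.val matches.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for struct cb_id and the cn_msg header. */
struct model_cb_id {
	uint32_t idx;
	uint32_t val;
};

struct model_msg {
	struct model_cb_id id;
	uint32_t seq;
	uint32_t ack;
};

typedef void (*model_cb_t)(const struct model_msg *msg);

#define MODEL_MAX_USERS 8

static struct {
	struct model_cb_id id;
	model_cb_t cb;			/* NULL marks a free slot */
} model_users[MODEL_MAX_USERS];

/* Return the slot registered for id, or -1 if there is none. */
static int model_find(const struct model_cb_id *id)
{
	int i;

	for (i = 0; i < MODEL_MAX_USERS; i++)
		if (model_users[i].cb &&
		    model_users[i].id.idx == id->idx &&
		    model_users[i].id.val == id->val)
			return i;
	return -1;
}

/* Register cb for id.  Returns 0, or -1 if id is taken or the table is full. */
int model_add_callback(const struct model_cb_id *id, model_cb_t cb)
{
	int i;

	if (!cb || model_find(id) >= 0)
		return -1;
	for (i = 0; i < MODEL_MAX_USERS; i++) {
		if (!model_users[i].cb) {
			model_users[i].id = *id;
			model_users[i].cb = cb;
			return 0;
		}
	}
	return -1;
}

/* Forget the callback registered for id, if any. */
void model_del_callback(const struct model_cb_id *id)
{
	int i = model_find(id);

	if (i >= 0)
		model_users[i].cb = NULL;
}

/* Hand msg to the callback registered for msg->id.  Returns 1 if delivered. */
int model_dispatch(const struct model_msg *msg)
{
	int i = model_find(&msg->id);

	if (i < 0)
		return 0;
	model_users[i].cb(msg);
	return 1;
}

/* Example callback: counts the messages it receives. */
static unsigned int model_hits;

void model_count_cb(const struct model_msg *msg)
{
	(void)msg;
	model_hits++;
}
```

A message whose id has no registered callback is simply not delivered, and
after model_del_callback() the same id stops receiving messages.
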
int cn_netlink_send(struct cn_msg *msg, u32 __group, int gfp_mask);

 Sends a message to the specified group.  It can be safely called from
 softirq context, but may silently fail under strong memory pressure.
 If there are no listeners for the given group, -ESRCH can be returned.

 struct cn_msg *		- message header (with attached data).
 u32 __group			- destination group.
				  If __group is zero, then the appropriate group will
				  be searched for among all registered connector users,
				  and the message will be delivered to the group which was
				  created for the user with the same ID as in msg.
				  If __group is not zero, then the message will be
				  delivered to the specified group.
 int gfp_mask			- GFP mask.

 Note: When registering a new callback user, the connector core assigns
 a netlink group to the user which is equal to its id.idx.

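The group selection rule can be sketched in plain C as follows.  This is a
userspace model, not the kernel code: the sketch_* names are hypothetical,
and returning 0 to mean "no matching user" is a convention of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct cb_id. */
struct sketch_cb_id {
	uint32_t idx;
	uint32_t val;
};

/*
 * Pick the destination group for a message.
 * A non-zero group is used as is.  For group == 0, look up the registered
 * user with the same id as the message and use its group, which equals
 * its id.idx.  Returns 0 if no registered user matches.
 */
uint32_t sketch_pick_group(const struct sketch_cb_id *users, size_t n_users,
			   const struct sketch_cb_id *msg_id, uint32_t group)
{
	size_t i;

	if (group != 0)
		return group;

	for (i = 0; i < n_users; i++)
		if (users[i].idx == msg_id->idx && users[i].val == msg_id->val)
			return users[i].idx;

	return 0;
}
```
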
/*****************************************/
Protocol description.
/*****************************************/

The current framework offers a transport layer with fixed headers.  The
recommended protocol which uses such a header is as follows:

msg->seq and msg->ack are used to determine message genealogy.  When
someone sends a message, they use a locally unique sequence and random
acknowledge number.  The sequence number may be copied into
nlmsghdr->nlmsg_seq too.

The sequence number is incremented with each message sent.

If you expect a reply to the message, then the sequence number in the
received message MUST be the same as in the original message, and the
acknowledge number MUST be the original acknowledge number + 1.

If we receive a message and its sequence number is not equal to the one we
are expecting, then it is a new message.  If we receive a message and
its sequence number is the same as the one we are expecting, but its
acknowledge number is not equal to the acknowledge number in the original
message + 1, then it is a new message.

Obviously, the protocol header contains the above id.

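The reply rule above can be sketched with two small helpers.  They assume
struct cn_msg as exported by <linux/connector.h>; the cn_sketch_* names are
hypothetical and not part of the connector API.

```c
#include <linux/connector.h>

/*
 * Fill in the header of a reply to req: same id, same sequence number,
 * acknowledge number incremented by one.  No payload is attached.
 */
void cn_sketch_make_reply(const struct cn_msg *req, struct cn_msg *reply)
{
	reply->id = req->id;
	reply->seq = req->seq;
	reply->ack = req->ack + 1;	/* unsigned, wraps around naturally */
	reply->len = 0;
}

/*
 * Returns 1 if in is a reply to sent, or 0 if it must be treated as a
 * new message (different seq, or ack != original ack + 1).
 */
int cn_sketch_is_reply(const struct cn_msg *sent, const struct cn_msg *in)
{
	return in->seq == sent->seq && in->ack == sent->ack + 1;
}
```
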
The connector allows event notification in the following form: a kernel
driver or userspace process can ask the connector to notify it when
selected ids are turned on or off (that is, when their callbacks are
registered or unregistered).  It is done by sending a special command to
the connector driver (it also registers itself with id={-1, -1}).

An example of this usage can be found in the cn_test.c module which
uses the connector to request notification and to send messages.

/*****************************************/
Reliability.
/*****************************************/

Netlink itself is not a reliable protocol.  That means that messages can
be lost due to memory pressure or because a process' receive queue
overflowed, so the caller is warned that it must be prepared for that.
That is why struct cn_msg [the main connector message header] contains
the u32 seq and u32 ack fields.

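One way a receiver might use the seq field to notice lost messages is to
look for gaps in the sequence.  The sketch below assumes the sender
increments seq by exactly one per message and that messages are neither
reordered nor duplicated; sketch_seq_gap() is a hypothetical helper.

```c
#include <stdint.h>

/*
 * Number of messages missing between the last sequence number seen and
 * the one just received.  Unsigned arithmetic keeps the result correct
 * when the 32-bit sequence number wraps around.
 */
uint32_t sketch_seq_gap(uint32_t last_seq, uint32_t new_seq)
{
	return new_seq - last_seq - 1;
}
```

A result of zero means the new message directly follows the previous one.
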
/*****************************************/
Userspace usage.
/*****************************************/

2.6.14 has a new netlink socket implementation, which by default does not
allow people to send data to netlink groups other than 1.
So, if you wish to use a netlink socket (for example using connector)
with a different group number, the userspace application must subscribe to
that group first.  It can be achieved by the following pseudocode:

struct sockaddr_nl l_local;
int s, group = 12345;

s = socket(PF_NETLINK, SOCK_DGRAM, NETLINK_CONNECTOR);
if (s == -1) {
	perror("socket");
	return -1;
}

memset(&l_local, 0, sizeof(l_local));
l_local.nl_family = AF_NETLINK;
l_local.nl_groups = 0;	/* bitmask, covers groups 1..32 only */
l_local.nl_pid = 0;

if (bind(s, (struct sockaddr *)&l_local, sizeof(struct sockaddr_nl)) == -1) {
	perror("bind");
	close(s);
	return -1;
}

/* Subscribe to the group by its number. */
if (setsockopt(s, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP, &group, sizeof(group)) == -1) {
	perror("setsockopt");
	close(s);
	return -1;
}

Here SOL_NETLINK is 270 and NETLINK_ADD_MEMBERSHIP is socket option 1.
To drop a multicast subscription, one should call the above socket
option with the NETLINK_DROP_MEMBERSHIP parameter, which is defined as 2.

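Joining and leaving a group differ only in the option name, so both can be
wrapped in one helper.  This is a minimal sketch: sketch_set_membership()
is a hypothetical name, and the fallback definition of SOL_NETLINK is only
there for C libraries that do not provide it.

```c
#include <sys/socket.h>
#include <linux/netlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif

/*
 * Join (join != 0) or leave (join == 0) netlink multicast group `group`
 * on socket s.  Returns 0 on success, -1 with errno set on failure.
 */
int sketch_set_membership(int s, unsigned int group, int join)
{
	int opt = join ? NETLINK_ADD_MEMBERSHIP : NETLINK_DROP_MEMBERSHIP;

	return setsockopt(s, SOL_NETLINK, opt, &group, sizeof(group));
}
```
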
The 2.6.14 netlink code only allows selecting a group which is less than or
equal to the maximum group number, which is set at netlink_kernel_create()
time.  In the case of connector it is CN_NETLINK_USERS + 0xf, so if you want
to use group number 12345, you must increase CN_NETLINK_USERS to that number.
The additional 0xf numbers are allocated for use by non-in-kernel users.

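The limit can be expressed as a simple check.  The sketch assumes
CN_NETLINK_USERS as exported by <linux/connector.h>, whose value depends on
the kernel headers installed; sketch_cn_group_allowed() is a hypothetical
helper.

```c
#include <linux/connector.h>

/*
 * Returns 1 if group is within the range the connector's netlink socket
 * accepts (1 .. CN_NETLINK_USERS + 0xf), 0 otherwise.
 */
int sketch_cn_group_allowed(unsigned int group)
{
	unsigned int max_group = (unsigned int)(CN_NETLINK_USERS + 0xf);

	return group >= 1 && group <= max_group;
}
```

With unmodified headers, group 12345 falls outside this range.
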
Due to this limitation, group 0xffffffff does not work now, so one cannot
use the add/remove connector group notifications, but as far as I know,
only the cn_test.c test module used them.

Some work in the netlink area is still being done, so things may change in
the 2.6.15 timeframe.  If that happens, the documentation will be updated
for that kernel.