|  | ====================== | 
|  | RxRPC NETWORK PROTOCOL | 
|  | ====================== | 
|  |  | 
|  | The RxRPC protocol driver provides a reliable two-phase transport on top of UDP | 
|  | that can be used to perform RxRPC remote operations.  This is done over sockets | 
|  | of AF_RXRPC family, using sendmsg() and recvmsg() with control data to send and | 
|  | receive data, aborts and errors. | 
|  |  | 
|  | Contents of this document: | 
|  |  | 
|  | (*) Overview. | 
|  |  | 
|  | (*) RxRPC protocol summary. | 
|  |  | 
|  | (*) AF_RXRPC driver model. | 
|  |  | 
|  | (*) Control messages. | 
|  |  | 
|  | (*) Socket options. | 
|  |  | 
|  | (*) Security. | 
|  |  | 
|  | (*) Example client usage. | 
|  |  | 
|  | (*) Example server usage. | 
|  |  | 
|  | (*) AF_RXRPC kernel interface. | 
|  |  | 
|  |  | 
|  | ======== | 
|  | OVERVIEW | 
|  | ======== | 
|  |  | 
|  | RxRPC is a two-layer protocol.  There is a session layer which provides | 
|  | reliable virtual connections using UDP over IPv4 (or IPv6) as the transport | 
|  | layer, but implements a real network protocol; and there's the presentation | 
|  | layer which renders structured data to binary blobs and back again using XDR | 
|  | (as does SunRPC): | 
|  |  | 
|  | +-------------+ | 
|  | | Application | | 
|  | +-------------+ | 
|  | |     XDR     |		Presentation | 
|  | +-------------+ | 
|  | |    RxRPC    |		Session | 
|  | +-------------+ | 
|  | |     UDP     |		Transport | 
|  | +-------------+ | 
|  |  | 
|  |  | 
|  | AF_RXRPC provides: | 
|  |  | 
|  | (1) Part of an RxRPC facility for both kernel and userspace applications by | 
|  | making the session part of it a Linux network protocol (AF_RXRPC). | 
|  |  | 
|  | (2) A two-phase protocol.  The client transmits a blob (the request) and then | 
|  | receives a blob (the reply), and the server receives the request and then | 
|  | transmits the reply. | 
|  |  | 
|  | (3) Retention of the reusable bits of the transport system set up for one call | 
|  | to speed up subsequent calls. | 
|  |  | 
|  | (4) A secure protocol, using the Linux kernel's key retention facility to | 
|  | manage security on the client end.  The server end must of necessity be | 
|  | more active in security negotiations. | 
|  |  | 
|  | AF_RXRPC does not provide XDR marshalling/presentation facilities.  That is | 
|  | left to the application.  AF_RXRPC only deals in blobs.  Even the operation ID | 
|  | is just the first four bytes of the request blob, and as such is beyond the | 
|  | kernel's interest. | 
|  |  | 
|  |  | 
|  | Sockets of AF_RXRPC family are: | 
|  |  | 
|  | (1) created as type SOCK_DGRAM; | 
|  |  | 
|  | (2) provided with a protocol of the type of underlying transport they're going | 
|  | to use - currently only PF_INET is supported. | 
|  |  | 
|  |  | 
|  | The Andrew File System (AFS) is an example of an application that uses this and | 
|  | that has both kernel (filesystem) and userspace (utility) components. | 
|  |  | 
|  |  | 
|  | ====================== | 
|  | RXRPC PROTOCOL SUMMARY | 
|  | ====================== | 
|  |  | 
|  | An overview of the RxRPC protocol: | 
|  |  | 
|  | (*) RxRPC sits on top of another networking protocol (UDP is the only option | 
|  | currently), and uses this to provide network transport.  UDP ports, for | 
|  | example, provide transport endpoints. | 
|  |  | 
|  | (*) RxRPC supports multiple virtual "connections" from any given transport | 
|  | endpoint, thus allowing the endpoints to be shared, even to the same | 
|  | remote endpoint. | 
|  |  | 
|  | (*) Each connection goes to a particular "service".  A connection may not go | 
|  | to multiple services.  A service may be considered the RxRPC equivalent of | 
|  | a port number.  AF_RXRPC permits multiple services to share an endpoint. | 
|  |  | 
|  | (*) Client-originating packets are marked, thus a transport endpoint can be | 
|  | shared between client and server connections (connections have a | 
|  | direction). | 
|  |  | 
|  | (*) Up to a billion connections may be supported concurrently between one | 
|  | local transport endpoint and one service on one remote endpoint.  An RxRPC | 
|  | connection is described by seven numbers: | 
|  |  | 
|  | Local address	} | 
|  | Local port	} Transport (UDP) address | 
|  | Remote address	} | 
|  | Remote port	} | 
|  | Direction | 
|  | Connection ID | 
|  | Service ID | 
|  |  | 
|  | (*) Each RxRPC operation is a "call".  A connection may make up to four | 
|  | billion calls, but only up to four calls may be in progress on a | 
|  | connection at any one time. | 
|  |  | 
|  | (*) Calls are two-phase and asymmetric: the client sends its request data, | 
|  | which the service receives; then the service transmits the reply data | 
|  | which the client receives. | 
|  |  | 
|  | (*) The data blobs are of indefinite size; the end of a phase is marked with a | 
|  | flag in the packet.  The number of packets of data making up one blob may | 
|  | not exceed 4 billion, however, as this would cause the sequence number to | 
|  | wrap. | 
|  |  | 
|  | (*) The first four bytes of the request data are the service operation ID. | 
|  |  | 
|  | (*) Security is negotiated on a per-connection basis.  The connection is | 
|  | initiated by the first data packet on it arriving.  If security is | 
|  | requested, the server then issues a "challenge" and then the client | 
|  | replies with a "response".  If the response is successful, the security is | 
|  | set for the lifetime of that connection, and all subsequent calls made | 
|  | upon it use that same security.  In the event that the server lets a | 
|  | connection lapse before the client, the security will be renegotiated if | 
|  | the client uses the connection again. | 
|  |  | 
|  | (*) Calls use ACK packets to handle reliability.  Data packets are also | 
|  | explicitly sequenced per call. | 
|  |  | 
|  | (*) There are two types of positive acknowledgement: hard-ACKs and soft-ACKs. | 
|  | A hard-ACK indicates to the far side that all the data received to a point | 
|  | has been received and processed; a soft-ACK indicates that the data has | 
|  | been received but may yet be discarded and re-requested.  The sender may | 
|  | not discard any transmittable packets until they've been hard-ACK'd. | 
|  |  | 
|  | (*) Reception of a reply data packet implicitly hard-ACK's all the data | 
|  | packets that make up the request. | 
|  |  | 
|  | (*) A call is complete when the request has been sent, the reply has been | 
|  | received and the final hard-ACK on the last packet of the reply has | 
|  | reached the server. | 
|  |  | 
|  | (*) A call may be aborted by either end at any time up to its completion. | 
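
The seven numbers that describe a connection can be pictured as a lookup key.
The following is an illustrative sketch only: struct conn_key and
conn_key_equal() are hypothetical names that do not correspond to the kernel's
internal data structures, and only IPv4 addresses are modelled for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical illustration of the seven numbers identifying a connection. */
struct conn_key {
	uint32_t local_addr;		/* local IPv4 address */
	uint16_t local_port;		/* local UDP port */
	uint32_t remote_addr;		/* remote IPv4 address */
	uint16_t remote_port;		/* remote UDP port */
	bool	 client_initiated;	/* direction */
	uint32_t conn_id;		/* connection ID */
	uint16_t service_id;		/* service ID */
};

/* Two packets belong to the same connection only if all seven match. */
static bool conn_key_equal(const struct conn_key *a, const struct conn_key *b)
{
	return a->local_addr == b->local_addr &&
	       a->local_port == b->local_port &&
	       a->remote_addr == b->remote_addr &&
	       a->remote_port == b->remote_port &&
	       a->client_initiated == b->client_initiated &&
	       a->conn_id == b->conn_id &&
	       a->service_id == b->service_id;
}
```

Because the service ID is part of the key, two connections that share both
transport endpoints but go to different services remain distinct.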
|  |  | 
|  |  | 
|  | ===================== | 
|  | AF_RXRPC DRIVER MODEL | 
|  | ===================== | 
|  |  | 
|  | About the AF_RXRPC driver: | 
|  |  | 
|  | (*) The AF_RXRPC protocol transparently uses internal sockets of the transport | 
|  | protocol to represent transport endpoints. | 
|  |  | 
|  | (*) AF_RXRPC sockets map onto RxRPC connection bundles.  Actual RxRPC | 
|  | connections are handled transparently.  One client socket may be used to | 
|  | make multiple simultaneous calls to the same service.  One server socket | 
|  | may handle calls from many clients. | 
|  |  | 
|  | (*) Additional parallel client connections will be initiated to support extra | 
|  | concurrent calls, up to a tunable limit. | 
|  |  | 
|  | (*) Each connection is retained for a certain amount of time [tunable] after | 
|  | the last call currently using it has completed in case a new call is made | 
|  | that could reuse it. | 
|  |  | 
|  | (*) Each internal UDP socket is retained for a certain amount of time | 
|  | [tunable] after the last connection using it is discarded, in case a new | 
|  | connection is made that could use it. | 
|  |  | 
|  | (*) A client-side connection is only shared between calls if they have | 
|  | the same key struct describing their security (and assuming the calls | 
|  | would otherwise share the connection).  Non-secured calls would also be | 
|  | able to share connections with each other. | 
|  |  | 
|  | (*) A server-side connection is shared if the client says it is. | 
|  |  | 
|  | (*) ACK'ing is handled by the protocol driver automatically, including ping | 
|  | replying. | 
|  |  | 
|  | (*) SO_KEEPALIVE automatically pings the other side to keep the connection | 
|  | alive [TODO]. | 
|  |  | 
|  | (*) If an ICMP error is received, all calls affected by that error will be | 
|  | aborted with an appropriate network error passed through recvmsg(). | 
|  |  | 
|  |  | 
|  | Interaction with the user of the RxRPC socket: | 
|  |  | 
|  | (*) A socket is made into a server socket by binding an address with a | 
|  | non-zero service ID. | 
|  |  | 
|  | (*) In the client, sending a request is achieved with one or more sendmsgs, | 
|  | followed by the reply being received with one or more recvmsgs. | 
|  |  | 
|  | (*) The first sendmsg for a request to be sent from a client contains a tag to | 
|  | be used in all other sendmsgs or recvmsgs associated with that call.  The | 
|  | tag is carried in the control data. | 
|  |  | 
|  | (*) connect() is used to supply a default destination address for a client | 
|  | socket.  This may be overridden by supplying an alternate address to the | 
|  | first sendmsg() of a call (struct msghdr::msg_name). | 
|  |  | 
|  | (*) If connect() is called on an unbound client, a random local port will be | 
|  | bound before the operation takes place. | 
|  |  | 
|  | (*) A server socket may also be used to make client calls.  To do this, the | 
|  | first sendmsg() of the call must specify the target address.  The server's | 
|  | transport endpoint is used to send the packets. | 
|  |  | 
|  | (*) Once the application has received the last message associated with a call, | 
|  | the tag is guaranteed not to be seen again, and so it can be used to pin | 
|  | client resources.  A new call can then be initiated with the same tag | 
|  | without fear of interference. | 
|  |  | 
|  | (*) In the server, a request is received with one or more recvmsgs, then the | 
|  | reply is transmitted with one or more sendmsgs, and then the final ACK | 
|  | is received with a last recvmsg. | 
|  |  | 
|  | (*) When sending data for a call, sendmsg is given MSG_MORE if there's more | 
|  | data to come on that call. | 
|  |  | 
|  | (*) When receiving data for a call, recvmsg flags MSG_MORE if there's more | 
|  | data to come for that call. | 
|  |  | 
|  | (*) When receiving data or messages for a call, MSG_EOR is flagged by recvmsg | 
|  | to indicate the terminal message for that call. | 
|  |  | 
|  | (*) A call may be aborted by adding an abort control message to the control | 
|  | data.  Issuing an abort terminates the kernel's use of that call's tag. | 
|  | Any messages waiting in the receive queue for that call will be discarded. | 
|  |  | 
|  | (*) Aborts, busy notifications and challenge packets are delivered by recvmsg, | 
|  | and control data messages will be set to indicate the context.  Receiving | 
|  | an abort or a busy message terminates the kernel's use of that call's tag. | 
|  |  | 
|  | (*) The control data part of the msghdr struct is used for a number of things: | 
|  |  | 
|  | (*) The tag of the intended or affected call. | 
|  |  | 
|  | (*) Sending or receiving errors, aborts and busy notifications. | 
|  |  | 
|  | (*) Notifications of incoming calls. | 
|  |  | 
|  | (*) Sending debug requests and receiving debug replies [TODO]. | 
|  |  | 
|  | (*) When the kernel has received and set up an incoming call, it sends a | 
|  | message to the server application to let it know there's a new call awaiting | 
|  | its acceptance [recvmsg reports a special control message].  The server | 
|  | application then uses sendmsg to assign a tag to the new call.  Once that | 
|  | is done, the first part of the request data will be delivered by recvmsg. | 
|  |  | 
|  | (*) The server application has to provide the server socket with a keyring of | 
|  | secret keys corresponding to the security types it permits.  When a secure | 
|  | connection is being set up, the kernel looks up the appropriate secret key | 
|  | in the keyring and then sends a challenge packet to the client and | 
|  | receives a response packet.  The kernel then checks the authorisation of | 
|  | the packet and either aborts the connection or sets up the security. | 
|  |  | 
|  | (*) The name of the key a client will use to secure its communications is | 
|  | nominated by a socket option. | 
|  |  | 
|  |  | 
|  | Notes on recvmsg: | 
|  |  | 
|  | (1) If there's a sequence of data messages belonging to a particular call on | 
|  | the receive queue, then recvmsg will keep working through them until: | 
|  |  | 
|  | (a) it meets the end of that call's received data, | 
|  |  | 
|  | (b) it meets a non-data message, | 
|  |  | 
|  | (c) it meets a message belonging to a different call, or | 
|  |  | 
|  | (d) it fills the user buffer. | 
|  |  | 
|  | If recvmsg is called in blocking mode, it will keep sleeping, awaiting the | 
|  | reception of further data, until one of the above four conditions is met. | 
|  |  | 
|  | (2) MSG_PEEK operates similarly, but will return immediately if it has put any | 
|  | data in the buffer rather than sleeping until it can fill the buffer. | 
|  |  | 
|  | (3) If a data message is only partially consumed in filling a user buffer, | 
|  | then the remainder of that message will be left on the front of the queue | 
|  | for the next taker.  MSG_TRUNC will never be flagged. | 
|  |  | 
|  | (4) If there is more data to be had on a call (it hasn't copied the last byte | 
|  | of the last data message in that phase yet), then MSG_MORE will be | 
|  | flagged. | 
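
The flag handling described above can be sketched as a small helper that an
application might call on msghdr::msg_flags after each recvmsg().  The enum
and the function name are hypothetical, and checking MSG_EOR before MSG_MORE
is an assumption made for the sketch rather than documented behaviour.

```c
#include <sys/socket.h>

/* Hypothetical summary of where a call stands after one recvmsg(). */
enum call_progress {
	CALL_MORE_DATA,		/* MSG_MORE: more data to come in this phase */
	CALL_PHASE_DONE,	/* neither flag: last data of this phase read */
	CALL_TERMINATED,	/* MSG_EOR: terminal message for the call */
};

static enum call_progress classify_recv_flags(int msg_flags)
{
	if (msg_flags & MSG_EOR)
		return CALL_TERMINATED;
	if (msg_flags & MSG_MORE)
		return CALL_MORE_DATA;
	return CALL_PHASE_DONE;
}
```

A receive loop would keep calling recvmsg() for the same call while this
returns CALL_MORE_DATA, and release any resources pinned by the call's tag
once it returns CALL_TERMINATED.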
|  |  | 
|  |  | 
|  | ================ | 
|  | CONTROL MESSAGES | 
|  | ================ | 
|  |  | 
|  | AF_RXRPC makes use of control messages in sendmsg() and recvmsg() to multiplex | 
|  | calls, to invoke certain actions and to report certain conditions.  These are: | 
|  |  | 
|  | MESSAGE ID              SRT DATA        MEANING | 
|  | ======================= === =========== =============================== | 
|  | RXRPC_USER_CALL_ID      sr- User ID     App's call specifier | 
|  | RXRPC_ABORT             srt Abort code  Abort code to issue/received | 
|  | RXRPC_ACK               -rt n/a         Final ACK received | 
|  | RXRPC_NET_ERROR         -rt error num   Network error on call | 
|  | RXRPC_BUSY              -rt n/a         Call rejected (server busy) | 
|  | RXRPC_LOCAL_ERROR       -rt error num   Local error encountered | 
|  | RXRPC_NEW_CALL          -r- n/a         New call received | 
|  | RXRPC_ACCEPT            s-- n/a         Accept new call | 
|  |  | 
|  | (SRT = usable in Sendmsg / delivered by Recvmsg / Terminal message) | 
|  |  | 
|  | (*) RXRPC_USER_CALL_ID | 
|  |  | 
|  | This is used to indicate the application's call ID.  It's an unsigned long | 
|  | that the app specifies in the client by attaching it to the first data | 
|  | message or in the server by passing it in association with an RXRPC_ACCEPT | 
|  | message.  recvmsg() passes it in conjunction with all messages except | 
|  | RXRPC_NEW_CALL messages. | 
|  |  | 
|  | (*) RXRPC_ABORT | 
|  |  | 
|  | This can be used by an application to abort a call by passing it to | 
|  | sendmsg, or it can be delivered by recvmsg to indicate a remote abort was | 
|  | received.  Either way, it must be associated with an RXRPC_USER_CALL_ID to | 
|  | specify the call affected.  If an abort is being sent, then error EBADSLT | 
|  | will be returned if there is no call with that user ID. | 
|  |  | 
|  | (*) RXRPC_ACK | 
|  |  | 
|  | This is delivered to a server application to indicate that the final ACK | 
|  | of a call was received from the client.  It will be associated with an | 
|  | RXRPC_USER_CALL_ID to indicate the call that's now complete. | 
|  |  | 
|  | (*) RXRPC_NET_ERROR | 
|  |  | 
|  | This is delivered to an application to indicate that an ICMP error message | 
|  | was encountered in the process of trying to talk to the peer.  An | 
|  | errno-class integer value will be included in the control message data | 
|  | indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call | 
|  | affected. | 
|  |  | 
|  | (*) RXRPC_BUSY | 
|  |  | 
|  | This is delivered to a client application to indicate that a call was | 
|  | rejected by the server due to the server being busy.  It will be | 
|  | associated with an RXRPC_USER_CALL_ID to indicate the rejected call. | 
|  |  | 
|  | (*) RXRPC_LOCAL_ERROR | 
|  |  | 
|  | This is delivered to an application to indicate that a local error was | 
|  | encountered and that a call has been aborted because of it.  An | 
|  | errno-class integer value will be included in the control message data | 
|  | indicating the problem, and an RXRPC_USER_CALL_ID will indicate the call | 
|  | affected. | 
|  |  | 
|  | (*) RXRPC_NEW_CALL | 
|  |  | 
|  | This is delivered to indicate to a server application that a new call has | 
|  | arrived and is awaiting acceptance.  No user ID is associated with this, | 
|  | as a user ID must subsequently be assigned by doing an RXRPC_ACCEPT. | 
|  |  | 
|  | (*) RXRPC_ACCEPT | 
|  |  | 
|  | This is used by a server application to attempt to accept a call and | 
|  | assign it a user ID.  It should be associated with an RXRPC_USER_CALL_ID | 
|  | to indicate the user ID to be assigned.  If there is no call to be | 
|  | accepted (it may have timed out, been aborted, etc.), then sendmsg will | 
|  | return error ENODATA.  If the user ID is already in use by another call, | 
|  | then error EBADSLT will be returned. | 
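
As a rough illustration of how an application might pick these control
messages apart after recvmsg(), the sketch below walks the control buffer
with the standard CMSG macros.  struct call_event and parse_rxrpc_cmsgs() are
hypothetical names, and the numeric constants are assumptions written out
only to keep the example self-contained; real code should take them from
<linux/rxrpc.h>.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed values; real code should use the ones in <linux/rxrpc.h>. */
#ifndef SOL_RXRPC
#define SOL_RXRPC		272
#endif
#define RXRPC_USER_CALL_ID	1
#define RXRPC_ABORT		2
#define RXRPC_ACK		3
#define RXRPC_NET_ERROR		5
#define RXRPC_BUSY		6
#define RXRPC_LOCAL_ERROR	7
#define RXRPC_NEW_CALL		8

/* Hypothetical summary of what one recvmsg() reported. */
struct call_event {
	unsigned long	user_call_id;	/* valid if have_call_id is set */
	int		have_call_id;
	int		type;		/* 0 = plain data, else message ID */
	uint32_t	abort_code;	/* valid for RXRPC_ABORT */
	int		error;		/* valid for RXRPC_*_ERROR */
};

static void parse_rxrpc_cmsgs(struct msghdr *msg, struct call_event *ev)
{
	struct cmsghdr *cmsg;

	memset(ev, 0, sizeof(*ev));
	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level != SOL_RXRPC)
			continue;
		switch (cmsg->cmsg_type) {
		case RXRPC_USER_CALL_ID:
			if (cmsg->cmsg_len < CMSG_LEN(sizeof(ev->user_call_id)))
				break;
			memcpy(&ev->user_call_id, CMSG_DATA(cmsg),
			       sizeof(ev->user_call_id));
			ev->have_call_id = 1;
			break;
		case RXRPC_ABORT:
			if (cmsg->cmsg_len < CMSG_LEN(sizeof(ev->abort_code)))
				break;
			memcpy(&ev->abort_code, CMSG_DATA(cmsg),
			       sizeof(ev->abort_code));
			ev->type = RXRPC_ABORT;
			break;
		case RXRPC_NET_ERROR:
		case RXRPC_LOCAL_ERROR:
			if (cmsg->cmsg_len < CMSG_LEN(sizeof(ev->error)))
				break;
			memcpy(&ev->error, CMSG_DATA(cmsg), sizeof(ev->error));
			ev->type = cmsg->cmsg_type;
			break;
		case RXRPC_ACK:
		case RXRPC_BUSY:
		case RXRPC_NEW_CALL:
			ev->type = cmsg->cmsg_type;	/* dataless messages */
			break;
		}
	}
}
```

An event with type 0 and have_call_id set would then be ordinary call data;
an RXRPC_NEW_CALL event arrives without a user call ID, as described above.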
|  |  | 
|  |  | 
|  | ============== | 
|  | SOCKET OPTIONS | 
|  | ============== | 
|  |  | 
|  | AF_RXRPC sockets support a few socket options at the SOL_RXRPC level: | 
|  |  | 
|  | (*) RXRPC_SECURITY_KEY | 
|  |  | 
|  | This is used to specify the description of the key to be used.  The key is | 
|  | extracted from the calling process's keyrings with request_key() and | 
|  | should be of "rxrpc" type. | 
|  |  | 
|  | The optval pointer points to the description string, and optlen indicates | 
|  | how long the string is, without the NUL terminator. | 
|  |  | 
|  | (*) RXRPC_SECURITY_KEYRING | 
|  |  | 
|  | Similar to above but specifies a keyring of server secret keys to use (key | 
|  | type "keyring").  See the "Security" section. | 
|  |  | 
|  | (*) RXRPC_EXCLUSIVE_CONNECTION | 
|  |  | 
|  | This is used to request that new connections should be used for each call | 
|  | made subsequently on this socket.  optval should be NULL and optlen 0. | 
|  |  | 
|  | (*) RXRPC_MIN_SECURITY_LEVEL | 
|  |  | 
|  | This is used to specify the minimum security level required for calls on | 
|  | this socket.  optval must point to an int containing one of the following | 
|  | values: | 
|  |  | 
|  | (a) RXRPC_SECURITY_PLAIN | 
|  |  | 
|  | Encrypted checksum only. | 
|  |  | 
|  | (b) RXRPC_SECURITY_AUTH | 
|  |  | 
|  | Encrypted checksum plus packet padded and first eight bytes of packet | 
|  | encrypted - which includes the actual packet length. | 
|  |  | 
|  | (c) RXRPC_SECURITY_ENCRYPTED | 
|  |  | 
|  | Encrypted checksum plus entire packet padded and encrypted, including | 
|  | actual packet length. | 
|  |  | 
|  |  | 
|  | ======== | 
|  | SECURITY | 
|  | ======== | 
|  |  | 
|  | Currently, only the Kerberos 4 equivalent protocol has been implemented | 
|  | (security index 2 - rxkad).  This requires the rxkad module to be loaded and, | 
|  | on the client, tickets of the appropriate type to be obtained from the AFS | 
|  | kaserver or the Kerberos server and installed as "rxrpc" type keys.  This is | 
|  | normally done using the klog program.  An example simple klog program can be | 
|  | found at: | 
|  |  | 
|  | http://people.redhat.com/~dhowells/rxrpc/klog.c | 
|  |  | 
|  | The payload provided to add_key() on the client should be of the following | 
|  | form: | 
|  |  | 
|  | struct rxrpc_key_sec2_v1 { | 
|  | uint16_t	security_index;	/* 2 */ | 
|  | uint16_t	ticket_length;	/* length of ticket[] */ | 
|  | uint32_t	expiry;		/* time at which expires */ | 
|  | uint8_t		kvno;		/* key version number */ | 
|  | uint8_t		__pad[3]; | 
|  | uint8_t		session_key[8];	/* DES session key */ | 
|  | uint8_t		ticket[0];	/* the encrypted ticket */ | 
|  | }; | 
|  |  | 
|  | Where the ticket blob is just appended to the above structure. | 
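
A minimal sketch of assembling such a payload is given below.  The helper
build_rxkad_payload() is a hypothetical name, the structure is repeated so
the example is self-contained (with a C99 flexible array member in place of
ticket[0]), and the ticket and session key are assumed to have been obtained
elsewhere.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct rxrpc_key_sec2_v1 {
	uint16_t	security_index;	/* 2 */
	uint16_t	ticket_length;	/* length of ticket[] */
	uint32_t	expiry;		/* time at which expires */
	uint8_t		kvno;		/* key version number */
	uint8_t		__pad[3];
	uint8_t		session_key[8];	/* DES session key */
	uint8_t		ticket[];	/* the encrypted ticket */
};

/*
 * Allocate a payload with the ticket appended to the fixed header.
 * The caller frees the result; *payload_len receives the total size.
 * ticket must point to at least ticket_len bytes.
 */
static struct rxrpc_key_sec2_v1 *
build_rxkad_payload(const uint8_t session_key[8], uint8_t kvno,
		    uint32_t expiry, const void *ticket,
		    uint16_t ticket_len, size_t *payload_len)
{
	size_t len = sizeof(struct rxrpc_key_sec2_v1) + ticket_len;
	struct rxrpc_key_sec2_v1 *p = calloc(1, len);

	if (!p)
		return NULL;
	p->security_index = 2;
	p->ticket_length = ticket_len;
	p->expiry = expiry;
	p->kvno = kvno;
	memcpy(p->session_key, session_key, 8);
	memcpy(p->ticket, ticket, ticket_len);
	*payload_len = len;
	return p;
}
```

The returned buffer and length would then be handed to add_key() as the
payload of an "rxrpc" type key.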
|  |  | 
|  |  | 
|  | On the server, keys of type "rxrpc_s" must be made available. | 
|  | They have a description of "<serviceID>:<securityIndex>" (eg: "52:2" for an | 
|  | rxkad key for the AFS VL service).  When such a key is created, it should be | 
|  | given the server's secret key as the instantiation data (see the example | 
|  | below). | 
|  |  | 
|  | add_key("rxrpc_s", "52:2", secret_key, 8, keyring); | 
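
The description string can be generated rather than hard-coded.  The helper
below is a small sketch with a hypothetical name; it only formats the
"<serviceID>:<securityIndex>" description and does not touch the key
facility itself.

```c
#include <stdio.h>

/*
 * Format "<serviceID>:<securityIndex>" into buf.
 * Returns the string length, or -1 if buf is too small.
 */
static int format_server_key_desc(char *buf, size_t size,
				  unsigned int service_id,
				  unsigned int security_index)
{
	int n = snprintf(buf, size, "%u:%u", service_id, security_index);

	return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

For service 52 and security index 2 this yields the "52:2" description used
in the add_key() call above.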
|  |  | 
|  | A keyring is passed to the server socket by naming it in a sockopt.  The server | 
|  | socket then looks the server secret keys up in this keyring when secure | 
|  | incoming connections are made.  This can be seen in an example program that can | 
|  | be found at: | 
|  |  | 
|  | http://people.redhat.com/~dhowells/rxrpc/listen.c | 
|  |  | 
|  |  | 
|  | ==================== | 
|  | EXAMPLE CLIENT USAGE | 
|  | ==================== | 
|  |  | 
|  | A client would issue an operation by: | 
|  |  | 
|  | (1) An RxRPC socket is set up by: | 
|  |  | 
|  | client = socket(AF_RXRPC, SOCK_DGRAM, PF_INET); | 
|  |  | 
|  | Where the third parameter indicates the protocol family of the transport | 
|  | socket used - usually IPv4 but it can also be IPv6 [TODO]. | 
|  |  | 
|  | (2) A local address can optionally be bound: | 
|  |  | 
|  | struct sockaddr_rxrpc srx = { | 
|  | .srx_family	= AF_RXRPC, | 
|  | .srx_service	= 0,  /* we're a client */ | 
|  | .transport_type	= SOCK_DGRAM,	/* type of transport socket */ | 
|  | .transport_len	= sizeof(struct sockaddr_in), | 
|  | .transport.sin.sin_family	= AF_INET, | 
|  | .transport.sin.sin_port	= htons(7000), /* AFS callback */ | 
|  | .transport.sin.sin_addr.s_addr	= htonl(INADDR_ANY), /* all local interfaces */ | 
|  | }; | 
|  | bind(client, (struct sockaddr *)&srx, sizeof(srx)); | 
|  |  | 
|  | This specifies the local UDP port to be used.  If not given, a random | 
|  | non-privileged port will be used.  A UDP port may be shared between | 
|  | several unrelated RxRPC sockets.  Security is handled on a basis of | 
|  | per-RxRPC virtual connection. | 
|  |  | 
|  | (3) The security is set: | 
|  |  | 
|  | const char *key = "AFS:cambridge.redhat.com"; | 
|  | setsockopt(client, SOL_RXRPC, RXRPC_SECURITY_KEY, key, strlen(key)); | 
|  |  | 
|  | This issues a request_key() to get the key representing the security | 
|  | context.  The minimum security level can be set: | 
|  |  | 
|  | unsigned int sec = RXRPC_SECURITY_ENCRYPTED; | 
|  | setsockopt(client, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL, | 
|  | &sec, sizeof(sec)); | 
|  |  | 
|  | (4) The server to be contacted can then be specified (alternatively this can | 
|  | be done through sendmsg): | 
|  |  | 
|  | struct sockaddr_rxrpc srx = { | 
|  | .srx_family	= AF_RXRPC, | 
|  | .srx_service	= VL_SERVICE_ID, | 
|  | .transport_type	= SOCK_DGRAM,	/* type of transport socket */ | 
|  | .transport_len	= sizeof(struct sockaddr_in), | 
|  | .transport.sin.sin_family	= AF_INET, | 
|  | .transport.sin.sin_port	= htons(7005), /* AFS volume manager */ | 
|  | .transport.sin.sin_addr	= ..., | 
|  | }; | 
|  | connect(client, (struct sockaddr *)&srx, sizeof(srx)); | 
|  |  | 
|  | (5) The request data should then be posted to the client socket using a series | 
|  | of sendmsg() calls, each with the following control message attached: | 
|  |  | 
|  | RXRPC_USER_CALL_ID	- specifies the user ID for this call | 
|  |  | 
|  | MSG_MORE should be set in msghdr::msg_flags on all but the last part of | 
|  | the request.  Multiple requests may be made simultaneously. | 
|  |  | 
|  | If a call is intended to go to a destination other than the default | 
|  | specified through connect(), then msghdr::msg_name should be set on the | 
|  | first request message of that call. | 
|  |  | 
|  | (6) The reply data will then be posted to the client socket for recvmsg() to | 
|  | pick up.  MSG_MORE will be flagged by recvmsg() if there's more reply data | 
|  | for a particular call to be read.  MSG_EOR will be set on the terminal | 
|  | read for a call. | 
|  |  | 
|  | All data will be delivered with the following control message attached: | 
|  |  | 
|  | RXRPC_USER_CALL_ID	- specifies the user ID for this call | 
|  |  | 
|  | If an abort or error occurred, this will be returned in the control data | 
|  | buffer instead, and MSG_EOR will be flagged to indicate the end of that | 
|  | call. | 
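
Step (5) can be sketched as follows.  struct request_msg, prepare_request()
and send_request_part() are hypothetical helpers, and the numeric constants
are assumptions included only so the example is self-contained (real code
should use <linux/rxrpc.h>).  From userspace the MSG_MORE flag is passed in
the flags argument of sendmsg(), since sendmsg(2) ignores the msg_flags field
of the msghdr supplied by the caller.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Assumed values; real code should use the ones in <linux/rxrpc.h>. */
#ifndef SOL_RXRPC
#define SOL_RXRPC		272
#endif
#define RXRPC_USER_CALL_ID	1

/* One part of a request: data buffer plus the call's tag. */
struct request_msg {
	struct msghdr	msg;
	struct iovec	iov;
	union {
		char		buf[CMSG_SPACE(sizeof(unsigned long))];
		struct cmsghdr	align;	/* forces cmsghdr alignment */
	} control;
};

/* Fill in rq so that it carries data[0..len) tagged with call_id. */
static void prepare_request(struct request_msg *rq, unsigned long call_id,
			    void *data, size_t len)
{
	struct cmsghdr *cmsg;

	memset(rq, 0, sizeof(*rq));
	rq->iov.iov_base = data;
	rq->iov.iov_len = len;
	rq->msg.msg_iov = &rq->iov;
	rq->msg.msg_iovlen = 1;
	rq->msg.msg_control = rq->control.buf;
	rq->msg.msg_controllen = sizeof(rq->control.buf);

	cmsg = CMSG_FIRSTHDR(&rq->msg);
	cmsg->cmsg_level = SOL_RXRPC;
	cmsg->cmsg_type = RXRPC_USER_CALL_ID;
	cmsg->cmsg_len = CMSG_LEN(sizeof(call_id));
	memcpy(CMSG_DATA(cmsg), &call_id, sizeof(call_id));
}

/* Send one part; a non-zero "more" means further parts will follow. */
static ssize_t send_request_part(int fd, struct request_msg *rq, int more)
{
	return sendmsg(fd, &rq->msg, more ? MSG_MORE : 0);
}
```

To direct a call somewhere other than the connect() default, msg_name and
msg_namelen would additionally be set on the first part of that call.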
|  |  | 
|  |  | 
|  | ==================== | 
|  | EXAMPLE SERVER USAGE | 
|  | ==================== | 
|  |  | 
|  | A server would be set up to accept operations in the following manner: | 
|  |  | 
|  | (1) An RxRPC socket is created by: | 
|  |  | 
|  | server = socket(AF_RXRPC, SOCK_DGRAM, PF_INET); | 
|  |  | 
|  | Where the third parameter indicates the address type of the transport | 
|  | socket used - usually IPv4. | 
|  |  | 
|  | (2) Security is set up if desired by giving the socket a keyring with server | 
|  | secret keys in it: | 
|  |  | 
|  | keyring = add_key("keyring", "AFSkeys", NULL, 0, | 
|  | KEY_SPEC_PROCESS_KEYRING); | 
|  |  | 
|  | const unsigned char secret_key[8] = { | 
|  | 0xa7, 0x83, 0x8a, 0xcb, 0xc7, 0x83, 0xec, 0x94 }; | 
|  | add_key("rxrpc_s", "52:2", secret_key, 8, keyring); | 
|  |  | 
|  | setsockopt(server, SOL_RXRPC, RXRPC_SECURITY_KEYRING, "AFSkeys", 7); | 
|  |  | 
|  | The keyring can be manipulated after it has been given to the socket. This | 
|  | permits the server to add more keys, replace keys, etc. whilst it is live. | 
|  |  | 
|  | (3) A local address must then be bound: | 
|  |  | 
|  | struct sockaddr_rxrpc srx = { | 
|  | .srx_family	= AF_RXRPC, | 
|  | .srx_service	= VL_SERVICE_ID, /* RxRPC service ID */ | 
|  | .transport_type	= SOCK_DGRAM,	/* type of transport socket */ | 
|  | .transport_len	= sizeof(struct sockaddr_in), | 
|  | .transport.sin.sin_family	= AF_INET, | 
|  | .transport.sin.sin_port	= htons(7000), /* AFS callback */ | 
|  | .transport.sin.sin_addr.s_addr	= htonl(INADDR_ANY), /* all local interfaces */ | 
|  | }; | 
|  | bind(server, (struct sockaddr *)&srx, sizeof(srx)); | 
|  |  | 
|  | (4) The server is then set to listen out for incoming calls: | 
|  |  | 
|  | listen(server, 100); | 
|  |  | 
|  | (5) The kernel notifies the server of pending incoming connections by sending | 
|  | it a message for each.  This is received with recvmsg() on the server | 
|  | socket.  It has no data, and has a single dataless control message | 
|  | attached: | 
|  |  | 
|  | RXRPC_NEW_CALL | 
|  |  | 
|  | The address that can be passed back by recvmsg() at this point should be | 
|  | ignored since the call for which the message was posted may have gone by | 
|  | the time it is accepted - in which case the first call still on the queue | 
|  | will be accepted. | 
|  |  | 
|  | (6) The server then accepts the new call by issuing a sendmsg() with two | 
|  | pieces of control data and no actual data: | 
|  |  | 
|  | RXRPC_ACCEPT		- indicate connection acceptance | 
|  | RXRPC_USER_CALL_ID	- specify user ID for this call | 
|  |  | 
|  | (7) The first request data packet will then be posted to the server socket for | 
|  | recvmsg() to pick up.  At that point, the RxRPC address for the call can | 
|  | be read from the address fields in the msghdr struct. | 
|  |  | 
|  | Subsequent request data will be posted to the server socket for recvmsg() | 
|  | to collect as it arrives.  All but the last piece of the request data will | 
|  | be delivered with MSG_MORE flagged. | 
|  |  | 
|  | All data will be delivered with the following control message attached: | 
|  |  | 
|  | RXRPC_USER_CALL_ID	- specifies the user ID for this call | 
|  |  | 
|  | (8) The reply data should then be posted to the server socket using a series | 
|  | of sendmsg() calls, each with the following control messages attached: | 
|  |  | 
|  | RXRPC_USER_CALL_ID	- specifies the user ID for this call | 
|  |  | 
|  | MSG_MORE should be set in msghdr::msg_flags on all but the last message | 
|  | for a particular call. | 
|  |  | 
|  | (9) The final ACK from the client will be posted for retrieval by recvmsg() | 
|  | when it is received.  It will take the form of a dataless message with two | 
|  | control messages attached: | 
|  |  | 
|  | RXRPC_USER_CALL_ID	- specifies the user ID for this call | 
|  | RXRPC_ACK		- indicates final ACK (no data) | 
|  |  | 
|  | MSG_EOR will be flagged to indicate that this is the final message for | 
|  | this call. | 
|  |  | 
|  | (10) Up to the point the final packet of reply data is sent, the call can be | 
|  | aborted by calling sendmsg() with a dataless message with the following | 
|  | control messages attached: | 
|  |  | 
|  | RXRPC_USER_CALL_ID	- specifies the user ID for this call | 
|  | RXRPC_ABORT		- indicates abort code (4 byte data) | 
|  |  | 
|  | Any packets waiting in the socket's receive queue will be discarded if | 
|  | this is issued. | 
|  |  | 
|  | Note that all the communications for a particular service take place through | 
|  | the one server socket, using control messages on sendmsg() and recvmsg() to | 
|  | determine the call affected. | 
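
The dataless accept and abort messages described in the steps above might be
built along the following lines.  struct ctl_msg, append_cmsg(),
build_accept() and build_abort() are hypothetical helpers, and the numeric
constants are assumptions included so the sketch is self-contained (real
code should use <linux/rxrpc.h>).

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumed values; real code should use the ones in <linux/rxrpc.h>. */
#ifndef SOL_RXRPC
#define SOL_RXRPC		272
#endif
#define RXRPC_USER_CALL_ID	1
#define RXRPC_ABORT		2
#define RXRPC_ACCEPT		9

/* A dataless message carrying up to two control messages. */
struct ctl_msg {
	struct msghdr	msg;
	union {
		char		buf[CMSG_SPACE(sizeof(unsigned long)) +
				    CMSG_SPACE(sizeof(uint32_t))];
		struct cmsghdr	align;	/* forces cmsghdr alignment */
	} control;
};

/* Append one control message; returns -1 if the buffer would overflow. */
static int append_cmsg(struct ctl_msg *cm, int type,
		       const void *data, size_t len)
{
	struct cmsghdr *cmsg;

	if (cm->msg.msg_controllen + CMSG_SPACE(len) > sizeof(cm->control.buf))
		return -1;
	cmsg = (struct cmsghdr *)(cm->control.buf + cm->msg.msg_controllen);
	cmsg->cmsg_level = SOL_RXRPC;
	cmsg->cmsg_type = type;
	cmsg->cmsg_len = CMSG_LEN(len);
	if (len)
		memcpy(CMSG_DATA(cmsg), data, len);
	cm->msg.msg_controllen += CMSG_SPACE(len);
	return 0;
}

/* RXRPC_ACCEPT (no data) plus the user ID to assign to the new call. */
static int build_accept(struct ctl_msg *cm, unsigned long call_id)
{
	memset(cm, 0, sizeof(*cm));
	cm->msg.msg_control = cm->control.buf;
	if (append_cmsg(cm, RXRPC_ACCEPT, NULL, 0) < 0)
		return -1;
	return append_cmsg(cm, RXRPC_USER_CALL_ID, &call_id, sizeof(call_id));
}

/* User ID of the call to abort plus the 4-byte abort code. */
static int build_abort(struct ctl_msg *cm, unsigned long call_id,
		       uint32_t abort_code)
{
	memset(cm, 0, sizeof(*cm));
	cm->msg.msg_control = cm->control.buf;
	if (append_cmsg(cm, RXRPC_USER_CALL_ID, &call_id, sizeof(call_id)) < 0)
		return -1;
	return append_cmsg(cm, RXRPC_ABORT, &abort_code, sizeof(abort_code));
}
```

Either message would then be issued with sendmsg(server, &cm.msg, 0); as
noted earlier, sendmsg reports ENODATA if there is no call left to accept and
EBADSLT if the user ID is already in use or, for an abort, unknown.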
|  |  | 
|  |  | 
|  | ========================= | 
|  | AF_RXRPC KERNEL INTERFACE | 
|  | ========================= | 
|  |  | 
|  | The AF_RXRPC module also provides an interface for use by in-kernel utilities | 
|  | such as the AFS filesystem.  This permits such a utility to: | 
|  |  | 
|  | (1) Use different keys directly on individual client calls on one socket | 
|  | rather than having to open a whole slew of sockets, one for each key it | 
|  | might want to use. | 
|  |  | 
|  | (2) Avoid having RxRPC call request_key() at the point of issue of a call or | 
|  | opening of a socket.  Instead the utility is responsible for requesting a | 
|  | key at the appropriate point.  AFS, for instance, would do this during VFS | 
|  | operations such as open() or unlink().  The key is then handed through | 
|  | when the call is initiated. | 
|  |  | 
|  | (3) Request the use of something other than GFP_KERNEL to allocate memory. | 
|  |  | 
|  | (4) Avoid the overhead of using the recvmsg() call.  RxRPC messages can be | 
|  | intercepted before they get put into the socket Rx queue and the socket | 
|  | buffers manipulated directly. | 
|  |  | 
|  | To use the RxRPC facility, a kernel utility must still open an AF_RXRPC socket, | 
|  | bind an address as appropriate and listen if it's to be a server socket, but | 
|  | then it passes this to the kernel interface functions. | 
|  |  | 
|  | The kernel interface functions are as follows: | 
|  |  | 
|  | (*) Begin a new client call. | 
|  |  | 
|  | struct rxrpc_call * | 
|  | rxrpc_kernel_begin_call(struct socket *sock, | 
|  | struct sockaddr_rxrpc *srx, | 
|  | struct key *key, | 
|  | unsigned long user_call_ID, | 
|  | gfp_t gfp); | 
|  |  | 
|  | This allocates the infrastructure to make a new RxRPC call and assigns | 
|  | call and connection numbers.  The call will be made on the UDP port that | 
|  | the socket is bound to.  The call will go to the destination address of a | 
|  | connected client socket unless an alternative is supplied (srx is | 
|  | non-NULL). | 
|  |  | 
|  | If a key is supplied then this will be used to secure the call instead of | 
|  | the key bound to the socket with the RXRPC_SECURITY_KEY sockopt.  Calls | 
|  | secured in this way will still share connections if at all possible. | 
|  |  | 
|  | The user_call_ID is equivalent to that supplied to sendmsg() in the | 
|  | control data buffer.  It is entirely feasible to use this to point to a | 
|  | kernel data structure. | 
|  |  | 
|  | If this function is successful, an opaque reference to the RxRPC call is | 
|  | returned.  The caller now holds a reference on this and it must be | 
|  | properly ended. | 
|  |  | 
|  | (*) End a client call. | 
|  |  | 
|  | void rxrpc_kernel_end_call(struct rxrpc_call *call); | 
|  |  | 
|  | This is used to end a previously begun call.  The user_call_ID is expunged | 
|  | from AF_RXRPC's knowledge and will not be seen again in association with | 
|  | the specified call. | 
|  |  | 
|  | (*) Send data through a call. | 
|  |  | 
|  | int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg, | 
|  | size_t len); | 
|  |  | 
|  | This is used to supply either the request part of a client call or the | 
|  | reply part of a server call.  msg.msg_iovlen and msg.msg_iov specify the | 
|  | data buffers to be used.  msg_iov may not be NULL and must point | 
|  | exclusively to in-kernel virtual addresses.  msg.msg_flags may be given | 
|  | MSG_MORE if there will be subsequent data sends for this call. | 
|  |  | 
|  | The msg must not specify a destination address, control data or any flags | 
|  | other than MSG_MORE.  len is the total amount of data to transmit. | 
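|  |  | 
|  | The constraints above can be sketched in a form that compiles outside the | 
|  | kernel.  This is only an illustration: struct rxrpc_call and the body of | 
|  | rxrpc_kernel_send_data() below are mock stand-ins, send_part() is a | 
|  | hypothetical helper, and both the 0-on-success return value and the check | 
|  | that len equals the sum of the buffer lengths are assumptions of the sketch. | 
|  |  | 
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>		/* struct msghdr, MSG_MORE */
#include <sys/uio.h>		/* struct iovec */

/* Hypothetical stand-in: the real struct rxrpc_call is opaque kernel state. */
struct rxrpc_call {
	size_t bytes_queued;	/* mock only: running total of data accepted */
	int more_expected;	/* mock only: last send carried MSG_MORE */
};

/*
 * Mock of rxrpc_kernel_send_data() that only enforces the rules stated in
 * the text.  Returning 0 on success is an assumption of this sketch.
 */
static int rxrpc_kernel_send_data(struct rxrpc_call *call, struct msghdr *msg,
				  size_t len)
{
	size_t total = 0;
	size_t i;

	if (!msg->msg_iov)
		return -EINVAL;		/* msg_iov may not be NULL */
	if (msg->msg_name || msg->msg_control)
		return -EINVAL;		/* no address or control data */
	if (msg->msg_flags & ~MSG_MORE)
		return -EINVAL;		/* MSG_MORE is the only flag allowed */

	for (i = 0; i < (size_t)msg->msg_iovlen; i++)
		total += msg->msg_iov[i].iov_len;
	if (total != len)
		return -EINVAL;		/* assumed: len covers all buffers */

	call->bytes_queued += len;
	call->more_expected = (msg->msg_flags & MSG_MORE) != 0;
	return 0;
}

/* Hypothetical helper: send one buffer, optionally flagging more to come. */
static int send_part(struct rxrpc_call *call, void *buf, size_t len, int more)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg;

	assert(call != NULL && buf != NULL);

	memset(&msg, 0, sizeof(msg));	/* no address, no control data */
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_flags = more ? MSG_MORE : 0;

	return rxrpc_kernel_send_data(call, &msg, len);
}
```
|  |  | 
|  | A request sent in two pieces would call send_part() with more set for the | 
|  | first piece and clear for the last one. | 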
|  |  | 
|  | (*) Abort a call. | 
|  |  | 
|  | void rxrpc_kernel_abort_call(struct rxrpc_call *call, u32 abort_code); | 
|  |  | 
|  | This is used to abort a call if it's still in an abortable state.  The | 
|  | abort code specified will be placed in the ABORT message sent. | 
|  |  | 
|  | (*) Intercept received RxRPC messages. | 
|  |  | 
|  | typedef void (*rxrpc_interceptor_t)(struct sock *sk, | 
|  | unsigned long user_call_ID, | 
|  | struct sk_buff *skb); | 
|  |  | 
|  | void | 
|  | rxrpc_kernel_intercept_rx_messages(struct socket *sock, | 
|  | rxrpc_interceptor_t interceptor); | 
|  |  | 
|  | This installs an interceptor function on the specified AF_RXRPC socket. | 
|  | All messages that would otherwise wind up in the socket's Rx queue are | 
|  | then diverted to this function.  Note that care must be taken to process | 
|  | the messages in the right order to maintain DATA message sequentiality. | 
|  |  | 
|  | The interceptor function itself is provided with the address of the socket | 
|  | handling the incoming message, the ID assigned by the kernel utility to | 
|  | the call and the socket buffer containing the message. | 
|  |  | 
|  | The skb->mark field indicates the type of message: | 
|  |  | 
|  | MARK				MEANING | 
|  | ===============================	======================================= | 
|  | RXRPC_SKB_MARK_DATA		Data message | 
|  | RXRPC_SKB_MARK_FINAL_ACK	Final ACK received for an incoming call | 
|  | RXRPC_SKB_MARK_BUSY		Client call rejected as server busy | 
|  | RXRPC_SKB_MARK_REMOTE_ABORT	Call aborted by peer | 
|  | RXRPC_SKB_MARK_NET_ERROR	Network error detected | 
|  | RXRPC_SKB_MARK_LOCAL_ERROR	Local error encountered | 
|  | RXRPC_SKB_MARK_NEW_CALL		New incoming call awaiting acceptance | 
|  |  | 
|  | The remote abort message can be probed with rxrpc_kernel_get_abort_code(). | 
|  | The two error messages can be probed with rxrpc_kernel_get_error_number(). | 
|  | A new call can be accepted with rxrpc_kernel_accept_call(). | 
|  |  | 
|  | Data messages can have their contents extracted with the usual bunch of | 
|  | socket buffer manipulation functions.  A data message can be determined to | 
|  | be the last one in a sequence with rxrpc_kernel_is_data_last().  When a | 
|  | data message has been used up, rxrpc_kernel_data_delivered() should be | 
|  | called on it. | 
|  |  | 
|  | Non-data messages should be handed to rxrpc_kernel_free_skb() for | 
|  | disposal.  It is possible to get extra refs on all types of message for later | 
|  | freeing, but this may pin the state of a call until the message is finally | 
|  | freed. | 
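|  |  | 
|  | The dispatch on skb->mark described above might look roughly like the | 
|  | following.  It is a sketch that compiles outside the kernel, not the | 
|  | driver's own code: the enum values are placeholders, struct sock and | 
|  | struct sk_buff are mocks whose fields (other than mark) are invented, the | 
|  | rxrpc_kernel_*() bodies are stubs, and struct my_call and my_interceptor() | 
|  | are hypothetical names.  It also assumes that user_call_ID holds a pointer | 
|  | to the utility's own per-call record. | 
|  |  | 
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder values only; the real constants come from the kernel headers. */
enum {
	RXRPC_SKB_MARK_DATA,
	RXRPC_SKB_MARK_FINAL_ACK,
	RXRPC_SKB_MARK_BUSY,
	RXRPC_SKB_MARK_REMOTE_ABORT,
	RXRPC_SKB_MARK_NET_ERROR,
	RXRPC_SKB_MARK_LOCAL_ERROR,
	RXRPC_SKB_MARK_NEW_CALL,
};

/* Mock types: every field except mark is invented for this sketch. */
struct sock { int unused; };
struct sk_buff {
	unsigned int mark;
	bool last;		/* mock: final data message of the sequence */
	uint32_t abort_code;	/* mock: code carried by a remote abort */
	int error;		/* mock: number carried by an error message */
	bool delivered;		/* mock: set by rxrpc_kernel_data_delivered() */
	bool freed;		/* mock: set by rxrpc_kernel_free_skb() */
};

/* Stub versions of the accessors named in the text (u32 shown as uint32_t). */
static bool rxrpc_kernel_is_data_last(struct sk_buff *skb) { return skb->last; }
static void rxrpc_kernel_data_delivered(struct sk_buff *skb) { skb->delivered = true; }
static void rxrpc_kernel_free_skb(struct sk_buff *skb) { skb->freed = true; }
static uint32_t rxrpc_kernel_get_abort_code(struct sk_buff *skb) { return skb->abort_code; }
static int rxrpc_kernel_get_error_number(struct sk_buff *skb) { return skb->error; }

/* Hypothetical per-call record that the utility passes as user_call_ID. */
struct my_call {
	bool data_complete;
	bool busy;
	uint32_t abort_code;
	int error;
};

static void my_interceptor(struct sock *sk, unsigned long user_call_ID,
			   struct sk_buff *skb)
{
	struct my_call *call = (struct my_call *)user_call_ID;

	(void)sk;
	assert(skb != NULL);

	switch (skb->mark) {
	case RXRPC_SKB_MARK_DATA:
		/* A real handler would copy the payload out of skb here. */
		if (rxrpc_kernel_is_data_last(skb))
			call->data_complete = true;
		/* Per the text, this records delivery and frees the buffer. */
		rxrpc_kernel_data_delivered(skb);
		return;
	case RXRPC_SKB_MARK_REMOTE_ABORT:
		call->abort_code = rxrpc_kernel_get_abort_code(skb);
		break;
	case RXRPC_SKB_MARK_NET_ERROR:
	case RXRPC_SKB_MARK_LOCAL_ERROR:
		call->error = rxrpc_kernel_get_error_number(skb);
		break;
	case RXRPC_SKB_MARK_BUSY:
		call->busy = true;
		break;
	default:
		/* FINAL_ACK and NEW_CALL: no per-call state is touched here. */
		break;
	}
	rxrpc_kernel_free_skb(skb);	/* non-data messages are freed this way */
}
```
|  |  | 
|  | Data messages leave through rxrpc_kernel_data_delivered() and everything | 
|  | else through rxrpc_kernel_free_skb(), matching the split described above.  A | 
|  | real handler would additionally arrange for rxrpc_kernel_accept_call() or | 
|  | rxrpc_kernel_reject_call() to be called in response to a new-call message. | 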
|  |  | 
|  | (*) Accept an incoming call. | 
|  |  | 
|  | struct rxrpc_call * | 
|  | rxrpc_kernel_accept_call(struct socket *sock, | 
|  | unsigned long user_call_ID); | 
|  |  | 
|  | This is used to accept an incoming call and to assign it a call ID.  This | 
|  | function is similar to rxrpc_kernel_begin_call() and calls accepted must | 
|  | be ended in the same way. | 
|  |  | 
|  | If this function is successful, an opaque reference to the RxRPC call is | 
|  | returned.  The caller now holds a reference on this and it must be | 
|  | properly ended. | 
|  |  | 
|  | (*) Reject an incoming call. | 
|  |  | 
|  | int rxrpc_kernel_reject_call(struct socket *sock); | 
|  |  | 
|  | This is used to reject the first incoming call on the socket's queue with | 
|  | a BUSY message.  -ENODATA is returned if there were no incoming calls. | 
|  | Other errors may be returned if the call had been aborted (-ECONNABORTED) | 
|  | or had timed out (-ETIME). | 
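|  |  | 
|  | One way a server might use these return values is to drain its queue of | 
|  | incoming calls.  The sketch below compiles outside the kernel: struct socket | 
|  | and the body of rxrpc_kernel_reject_call() are mocks driven by a scripted | 
|  | list of results, and reject_all_pending() is a hypothetical helper.  It | 
|  | assumes that each invocation removes the call at the head of the queue | 
|  | whatever the outcome; the text above does not state this. | 
|  |  | 
```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mock socket: scripted results stand in for the incoming-call queue. */
struct socket {
	const int *results;	/* outcome of rejecting each queued call */
	size_t nr_queued;	/* number of scripted calls */
	size_t next;		/* index of the call at the head of the queue */
};

/* Mock of rxrpc_kernel_reject_call(): consumes one scripted call per use. */
static int rxrpc_kernel_reject_call(struct socket *sock)
{
	if (sock->next >= sock->nr_queued)
		return -ENODATA;	/* no incoming calls left */
	return sock->results[sock->next++];
}

/*
 * Hypothetical helper: reject every queued call.  Returns the number of
 * calls that were sent a BUSY message, or a negative error number if an
 * unexpected error was seen.
 */
static int reject_all_pending(struct socket *sock)
{
	int nr_busy = 0;

	assert(sock != NULL);

	for (;;) {
		int ret = rxrpc_kernel_reject_call(sock);

		switch (ret) {
		case 0:
			nr_busy++;	/* BUSY sent to this caller */
			break;
		case -ECONNABORTED:	/* call had already been aborted */
		case -ETIME:		/* call had already timed out */
			break;		/* nothing to reject; try the next one */
		case -ENODATA:
			return nr_busy;	/* queue is empty */
		default:
			return ret;	/* unexpected error: give up */
		}
	}
}
```
|  |  | 
|  | Treating -ECONNABORTED and -ETIME as "skip and carry on" is a design choice | 
|  | of this sketch; a caller that wants to log such calls would handle them | 
|  | separately. | 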
|  |  | 
|  | (*) Record the delivery of a data message and free it. | 
|  |  | 
|  | void rxrpc_kernel_data_delivered(struct sk_buff *skb); | 
|  |  | 
|  | This is used to record a data message as having been delivered and to | 
|  | update the ACK state for the call.  The socket buffer will be freed. | 
|  |  | 
|  | (*) Free a message. | 
|  |  | 
|  | void rxrpc_kernel_free_skb(struct sk_buff *skb); | 
|  |  | 
|  | This is used to free a non-DATA socket buffer intercepted from an AF_RXRPC | 
|  | socket. | 
|  |  | 
|  | (*) Determine if a data message is the last one on a call. | 
|  |  | 
|  | bool rxrpc_kernel_is_data_last(struct sk_buff *skb); | 
|  |  | 
|  | This is used to determine if a socket buffer holds the last data message | 
|  | to be received for a call (true will be returned if it does, false | 
|  | if not). | 
|  |  | 
|  | The data message will be part of the reply on a client call and part of | 
|  | the request on an incoming call.  On an incoming call further messages | 
|  | will follow the last data message, but on a client call none will. | 
|  |  | 
|  | (*) Get the abort code from an abort message. | 
|  |  | 
|  | u32 rxrpc_kernel_get_abort_code(struct sk_buff *skb); | 
|  |  | 
|  | This is used to extract the abort code from a remote abort message. | 
|  |  | 
|  | (*) Get the error number from a local or network error message. | 
|  |  | 
|  | int rxrpc_kernel_get_error_number(struct sk_buff *skb); | 
|  |  | 
|  | This is used to extract the error number from a message indicating that | 
|  | either a local error or a network error occurred. | 
|  |  | 
|  | (*) Allocate a null key for doing anonymous security. | 
|  |  | 
|  | struct key *rxrpc_get_null_key(const char *keyname); | 
|  |  | 
|  | This is used to allocate a null RxRPC key that can be used to indicate | 
|  | anonymous security for a particular domain. |