| Jonathan Corbet | 6c19efb | 2009-09-08 17:49:37 -0600 | [diff] [blame] | 1 | Using flexible arrays in the kernel | 
|  | 2 | Last updated for 2.6.31 | 
|  | 3 | Jonathan Corbet <corbet@lwn.net> | 
|  | 4 |  | 
|  | 5 | Large contiguous memory allocations can be unreliable in the Linux kernel. | 
|  | 6 | Kernel programmers will sometimes respond to this problem by allocating | 
|  | 7 | pages with vmalloc().  This solution not ideal, though.  On 32-bit systems, | 
|  | 8 | memory from vmalloc() must be mapped into a relatively small address space; | 
|  | 9 | it's easy to run out.  On SMP systems, the page table changes required by | 
|  | 10 | vmalloc() allocations can require expensive cross-processor interrupts on | 
|  | 11 | all CPUs.  And, on all systems, use of space in the vmalloc() range | 
|  | 12 | increases pressure on the translation lookaside buffer (TLB), reducing the | 
|  | 13 | performance of the system. | 
|  | 14 |  | 
|  | 15 | In many cases, the need for memory from vmalloc() can be eliminated by | 
|  | 16 | piecing together an array from smaller parts; the flexible array library | 
|  | 17 | exists to make this task easier. | 
|  | 18 |  | 
|  | 19 | A flexible array holds an arbitrary (within limits) number of fixed-sized | 
|  | 20 | objects, accessed via an integer index.  Sparse arrays are handled | 
|  | 21 | reasonably well.  Only single-page allocations are made, so memory | 
|  | 22 | allocation failures should be relatively rare.  The down sides are that the | 
|  | 23 | arrays cannot be indexed directly, individual object size cannot exceed the | 
|  | 24 | system page size, and putting data into a flexible array requires a copy | 
|  | 25 | operation.  It's also worth noting that flexible arrays do no internal | 
|  | 26 | locking at all; if concurrent access to an array is possible, then the | 
|  | 27 | caller must arrange for appropriate mutual exclusion. | 
|  | 28 |  | 
|  | 29 | The creation of a flexible array is done with: | 
|  | 30 |  | 
|  | 31 | #include <linux/flex_array.h> | 
|  | 32 |  | 
|  | 33 | struct flex_array *flex_array_alloc(int element_size, | 
|  | 34 | unsigned int total, | 
|  | 35 | gfp_t flags); | 
|  | 36 |  | 
|  | 37 | The individual object size is provided by element_size, while total is the | 
|  | 38 | maximum number of objects which can be stored in the array.  The flags | 
|  | 39 | argument is passed directly to the internal memory allocation calls.  With | 
|  | 40 | the current code, using flags to ask for high memory is likely to lead to | 
|  | 41 | notably unpleasant side effects. | 
|  | 42 |  | 
|  | 43 | Storing data into a flexible array is accomplished with a call to: | 
|  | 44 |  | 
|  | 45 | int flex_array_put(struct flex_array *array, unsigned int element_nr, | 
|  | 46 | void *src, gfp_t flags); | 
|  | 47 |  | 
|  | 48 | This call will copy the data from src into the array, in the position | 
|  | 49 | indicated by element_nr (which must be less than the maximum specified when | 
|  | 50 | the array was created).  If any memory allocations must be performed, flags | 
|  | 51 | will be used.  The return value is zero on success, a negative error code | 
|  | 52 | otherwise. | 
|  | 53 |  | 
|  | 54 | There might possibly be a need to store data into a flexible array while | 
|  | 55 | running in some sort of atomic context; in this situation, sleeping in the | 
|  | 56 | memory allocator would be a bad thing.  That can be avoided by using | 
|  | 57 | GFP_ATOMIC for the flags value, but, often, there is a better way.  The | 
|  | 58 | trick is to ensure that any needed memory allocations are done before | 
|  | 59 | entering atomic context, using: | 
|  | 60 |  | 
|  | 61 | int flex_array_prealloc(struct flex_array *array, unsigned int start, | 
|  | 62 | unsigned int end, gfp_t flags); | 
|  | 63 |  | 
|  | 64 | This function will ensure that memory for the elements indexed in the range | 
|  | 65 | defined by start and end has been allocated.  Thereafter, a | 
|  | 66 | flex_array_put() call on an element in that range is guaranteed not to | 
|  | 67 | block. | 
|  | 68 |  | 
|  | 69 | Getting data back out of the array is done with: | 
|  | 70 |  | 
|  | 71 | void *flex_array_get(struct flex_array *fa, unsigned int element_nr); | 
|  | 72 |  | 
|  | 73 | The return value is a pointer to the data element, or NULL if that | 
|  | 74 | particular element has never been allocated. | 
|  | 75 |  | 
|  | 76 | Note that it is possible to get back a valid pointer for an element which | 
|  | 77 | has never been stored in the array.  Memory for array elements is allocated | 
|  | 78 | one page at a time; a single allocation could provide memory for several | 
|  | 79 | adjacent elements.  The flexible array code does not know if a specific | 
|  | 80 | element has been written; it only knows if the associated memory is | 
|  | 81 | present.  So a flex_array_get() call on an element which was never stored | 
|  | 82 | in the array has the potential to return a pointer to random data.  If the | 
|  | 83 | caller does not have a separate way to know which elements were actually | 
|  | 84 | stored, it might be wise, at least, to add GFP_ZERO to the flags argument | 
|  | 85 | to ensure that all elements are zeroed. | 
|  | 86 |  | 
|  | 87 | There is no way to remove a single element from the array.  It is possible, | 
|  | 88 | though, to remove all elements with a call to: | 
|  | 89 |  | 
|  | 90 | void flex_array_free_parts(struct flex_array *array); | 
|  | 91 |  | 
|  | 92 | This call frees all elements, but leaves the array itself in place. | 
|  | 93 | Freeing the entire array is done with: | 
|  | 94 |  | 
|  | 95 | void flex_array_free(struct flex_array *array); | 
|  | 96 |  | 
|  | 97 | As of this writing, there are no users of flexible arrays in the mainline | 
|  | 98 | kernel.  The functions described here are also not exported to modules; | 
|  | 99 | that will probably be fixed when somebody comes up with a need for it. |