|  | [Sat Mar  2 10:32:33 PST 1996 KERNEL_BUG-HOWTO lm@sgi.com (Larry McVoy)] | 
|  |  | 
|  | This is how to track down a bug if you know nothing about kernel hacking. | 
|  | It's a brute force approach but it works pretty well. | 
|  |  | 
|  | You need: | 
|  |  | 
|  | . A reproducible bug - it has to happen predictably (sorry) | 
|  | . All the kernel tar files from a revision that works to the | 
|  | revision that doesn't | 
|  |  | 
|  | You will then do: | 
|  |  | 
|  | . Rebuild a revision that you believe works, install it, and verify that it does. | 
|  | . Do a binary search over the kernels to figure out which one | 
|  | introduced the bug.  I.e., suppose 1.3.28 didn't have the bug, but | 
|  | you know that 1.3.69 does.  Pick a kernel in the middle and build | 
|  | that, like 1.3.50.  Build & test; if it works, pick the mid point | 
|  | between .50 and .69, else the mid point between .28 and .50. | 
|  | . You'll narrow it down to the kernel that introduced the bug.  You | 
|  | can probably do better than this but it gets tricky. | 
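
The binary search above can be sketched as a pair of shell functions.
This is only an illustration: next_rev, bisect and works_ok are
hypothetical names, not part of any kernel tooling, and works_ok stands
in for the build/install/test cycle that you would normally do by hand.
It uses integer midpoints, so 28 and 69 give 48 rather than the rounder
50 picked in the text.

```shell
#!/bin/sh
# Sketch only: patch levels are the N in 1.3.N.

# next_rev GOOD BAD: print the patch level to try next.
# Returns non-zero (printing nothing) once GOOD and BAD are adjacent,
# at which point BAD is the first revision with the bug.
next_rev() {
    lo=$1
    hi=$2
    if [ $((hi - lo)) -le 1 ]; then
        return 1
    fi
    echo $((lo + (hi - lo) / 2))
}

# bisect GOOD BAD: narrow the range using works_ok, a hypothetical
# function you supply that returns 0 if 1.3.N works and non-zero if not.
# Prints the first patch level that shows the bug.
bisect() {
    good=$1
    bad=$2
    while mid=$(next_rev "$good" "$bad"); do
        if works_ok "$mid"; then
            good=$mid
        else
            bad=$mid
        fi
    done
    echo "$bad"
}
```

If you are testing by hand, "next_rev 28 69" tells you which revision
to build next; feed the result back in as the new good or bad end
depending on what you see.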
|  |  | 
|  | . Narrow it down to a subdirectory | 
|  |  | 
|  | - Copy the kernel that works into "test".  Let's say that 1.3.62 works, | 
|  | but 1.3.63 doesn't.  So you diff -r those two kernels and come | 
|  | up with a list of directories that changed.  For each of those | 
|  | directories: | 
|  |  | 
|  | Copy the non-working directory next to the working directory | 
|  | as "dir.63". | 
|  | One directory at a time, try moving the working directory to | 
|  | "dir.62" and moving dir.63 into its place: | 
|  |  | 
|  | mv dir dir.62 | 
|  | mv dir.63 dir | 
|  | find dir -name '*.[oa]' -print | xargs rm -f | 
|  |  | 
|  | And then rebuild and retest.  Assuming that all related | 
|  | changes were contained in the sub directory, this should | 
|  | isolate the change to a directory. | 
|  |  | 
|  | Problems: changes in header files may have occurred; I've | 
|  | found in my case that they were self explanatory - you may | 
|  | or may not want to give up when that happens. | 
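
The directory step above could be scripted roughly as follows.  This is
a sketch under assumptions: changed_dirs, swap_in and swap_back are
hypothetical helper names, the ".62"/".63" suffixes follow the example
in the text, and cmp is used per file instead of parsing diff -r output.
Files that exist only in the old tree are not reported, and file names
with leading blanks or newlines are not handled.

```shell
#!/bin/sh
# Sketch only: OLD and NEW are two unpacked kernel trees.

# changed_dirs OLD NEW: print each directory in NEW that holds a file
# which is missing from, or differs from, the same path in OLD.
changed_dirs() {
    old=$1
    new=$2
    (cd "$new" && find . -type f) | while read -r f; do
        cmp -s "$old/$f" "$new/$f" 2>/dev/null || dirname "$f"
    done | sort -u
}

# swap_in DIR: keep the working copy as DIR.62, move the non-working
# copy DIR.63 into place, and remove stale objects so they get rebuilt.
swap_in() {
    dir=$1
    [ -d "$dir" ] && [ -d "$dir.63" ] || return 1
    mv "$dir" "$dir.62" &&
    mv "$dir.63" "$dir" &&
    find "$dir" -name '*.[oa]' -exec rm -f {} +
}

# swap_back DIR: undo swap_in before trying the next directory.
swap_back() {
    dir=$1
    [ -d "$dir.62" ] || return 1
    mv "$dir" "$dir.63" &&
    mv "$dir.62" "$dir"
}
```

Run swap_in on one directory, rebuild and retest, then swap_back before
moving on, so that only one directory at a time comes from the
non-working revision.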
|  |  | 
|  | . Narrow it down to a file | 
|  |  | 
|  | - You can apply the same technique to each file in the directory, | 
|  | hoping that the changes in that file are self contained. | 
|  |  | 
|  | . Narrow it down to a routine | 
|  |  | 
|  | - You can take the old file and the new file and manually create | 
|  | a merged file that has | 
|  |  | 
|  | #ifdef VER62 | 
|  | routine() | 
|  | { | 
|  | ... | 
|  | } | 
|  | #else | 
|  | routine() | 
|  | { | 
|  | ... | 
|  | } | 
|  | #endif | 
|  |  | 
|  | And then walk through that file, one routine at a time, and | 
|  | wrap each pair of routines in | 
|  |  | 
|  | #define VER62 | 
|  | /* both routines here */ | 
|  | #undef VER62 | 
|  |  | 
|  | Then recompile, retest, move the ifdefs until you find the one | 
|  | that makes the difference. | 
|  |  | 
|  | Finally, you take all the info that you have, kernel revisions, bug | 
|  | description, the extent to which you have narrowed it down, and pass | 
|  | that off to whomever you believe is the maintainer of that section. | 
|  | A post to linux.dev.kernel isn't such a bad idea if you've done some | 
|  | work to narrow it down. | 
|  |  | 
|  | If you get it down to a routine, you'll probably get a fix in 24 hours. | 
|  |  | 
|  | My apologies to Linus and the other kernel hackers for describing this | 
|  | brute force approach; it's hardly what a kernel hacker would do.  However, | 
|  | it does work and it lets non-hackers help fix bugs.  And it is cool | 
|  | because Linux snapshots will let you do this - something that you can't | 
|  | do with vendor supplied releases. | 
|  |  |