Debugging suspend and resume
(C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL

1. Testing suspend to disk (STD)

To verify that the STD works, you can try to suspend in the "reboot" mode:

# echo reboot > /sys/power/disk
# echo disk > /sys/power/state

and the system should suspend, reboot, resume and get back to the command prompt
where you started the transition.  If that happens, the STD most likely works
correctly, but you need to repeat the test at least a couple of times in a row
for confidence, because some problems only show up on a second attempt at
suspending and resuming the system.  You should also test the "platform" and
"shutdown" modes of suspend:

# echo platform > /sys/power/disk
# echo disk > /sys/power/state

or

# echo shutdown > /sys/power/disk
# echo disk > /sys/power/state

in which case you will have to press the power button to make the system
resume.  If that does not work, you will need to identify what goes wrong.
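
Before repeating these tests it can help to confirm that the kernel offers the
mode you are about to select.  Reading /sys/power/disk lists the available
modes, with the currently selected one in square brackets.  The following is
only a sketch: the helper names mode_listed and std_cycle are made up for this
example, and std_cycle has to be run as root.

```shell
# Sketch only: mode_listed and std_cycle are hypothetical helper names.

# mode_listed MODE "CONTENTS": succeed if MODE is one of the words in
# CONTENTS, where CONTENTS is the text read from /sys/power/disk, for
# example "[platform] shutdown reboot".
mode_listed() {
    for word in $2; do
        # Strip the brackets that mark the currently selected mode.
        word=${word#\[}
        word=${word%\]}
        [ "$word" = "$1" ] && return 0
    done
    return 1
}

# std_cycle MODE: select MODE and start the STD transition (needs root).
std_cycle() {
    if ! mode_listed "$1" "$(cat /sys/power/disk)"; then
        echo "mode '$1' is not offered by this kernel" >&2
        return 1
    fi
    echo "$1" > /sys/power/disk && echo disk > /sys/power/state
}
```

Calling "std_cycle reboot" a few times in a row, then "std_cycle platform" and
"std_cycle shutdown", corresponds to the sequence of tests described above.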

a) Test mode of STD

To check whether any drivers cause problems, you can run the STD in the test
mode:

# echo test > /sys/power/disk
# echo disk > /sys/power/state

in which case the system should freeze tasks, suspend devices, disable nonboot
CPUs (if any), wait for 5 seconds, enable nonboot CPUs, resume devices, thaw
tasks and return to your command prompt.  If that fails, most likely there is
a driver that fails to either suspend or resume (in the latter case the system
may hang or be unstable after the test, so please take that into consideration).
To find this driver, you can carry out a binary search according to the rules:
- if the test fails, unload half of the drivers currently loaded and repeat
(that would probably involve rebooting the system, so always note what drivers
have been loaded before the test),
- if the test succeeds, load half of the drivers you have unloaded most
recently and repeat.
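
Noting which drivers were loaded before a test can be done by saving the first
column of /proc/modules.  The sketch below does that and also prints one half
of a saved list as a candidate set to unload.  The helper names are invented
for this example, and module dependencies are ignored, so a module that is
still in use by another one may refuse to unload.

```shell
# Sketch only: module_names and first_half are hypothetical helper names.

# module_names: read /proc/modules-style text on stdin and print the
# module names (first column), sorted, one per line.
module_names() {
    awk '{ print $1 }' | sort
}

# first_half: read lines on stdin and print the first half of them
# (rounded up), i.e. one candidate set to unload in the next round.
first_half() {
    awk '{ line[NR] = $0 }
         END { n = int((NR + 1) / 2); for (i = 1; i <= n; i++) print line[i] }'
}
```

For example, "module_names < /proc/modules > modules.before" records the
current state, and "first_half < modules.before" lists the modules to try
unloading first.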

Once you have found the failing driver (there can be more than one), you have
to unload it every time before the STD transition.  In that case please make
sure to report the problem with the driver.

It is also possible that a cycle still fails after you have unloaded all
modules.  In that case, look in your kernel configuration for the drivers that
can be compiled as modules (and test again with them built as modules), and
possibly also try boot time options such as "noapic" or "noacpi".
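
One way to see which built-in drivers could have been built as modules is the
modules.builtin file that the kernel build installs next to the modules.  This
file is newer than this document, so the sketch assumes a kernel recent enough
to provide it.  The helper name builtin_module_names is made up here.

```shell
# Sketch only: builtin_module_names is a hypothetical helper name.

# builtin_module_names: read modules.builtin-style paths on stdin, such as
# "kernel/drivers/ata/ahci.ko", and print the bare module names, sorted.
builtin_module_names() {
    sed -e 's|.*/||' -e 's|\.ko$||' | sort
}
```

For example:

# builtin_module_names < "/lib/modules/$(uname -r)/modules.builtin"

Each name printed is a driver that is currently compiled in and is a candidate
for rebuilding as a module.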

b) Testing minimal configuration

If the test mode of STD works, you can boot the system with "init=/bin/bash"
and attempt to suspend in the "reboot", "shutdown" and "platform" modes.  If
that does not work, there probably is a problem with a driver statically
compiled into the kernel and you can try to compile more drivers as modules,
so that they can be tested individually.  Otherwise, there is a problem with a
modular driver and you can find it by loading half of the modules you normally
use and binary searching in accordance with the algorithm:
- if there are n modules loaded and the attempt to suspend and resume fails,
unload n/2 of the modules and try again (that would probably involve rebooting
the system),
- if there are n modules loaded and the attempt to suspend and resume succeeds,
load n/2 modules more and try again.
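
The bookkeeping for such a search can be sketched as follows.  This is a plain
bisection over a list of suspects, a slight variation on the rules above, and
it assumes that exactly one module is at fault and that modules can be loaded
independently of each other.  The helper name narrow_suspects is made up here.

```shell
# Sketch only: narrow_suspects is a hypothetical helper name.

# narrow_suspects RESULT MODULE...
# RESULT is "fail" or "pass" for a suspend/resume attempt made with only
# the first half (rounded up) of MODULE... loaded.  Prints the modules
# that are still under suspicion, one per line.
narrow_suspects() {
    result=$1
    shift
    half=$(( ($# + 1) / 2 ))
    i=0
    for m in "$@"; do
        i=$((i + 1))
        if [ "$result" = fail ] && [ "$i" -le "$half" ]; then
            echo "$m"        # failure seen: culprit is in the loaded half
        elif [ "$result" = pass ] && [ "$i" -gt "$half" ]; then
            echo "$m"        # no failure: culprit is in the other half
        fi
    done
}
```

Feed the printed list back in after each attempt until a single module
remains.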

Again, if you find the offending module or modules, they must be unloaded every
time before the STD transition, and please report the problem with them.

c) Advanced debugging

In case the STD does not work on your system even in the minimal configuration
and compiling more drivers as modules is not practical or some modules cannot
be unloaded, you can use one of the more advanced debugging techniques to find
the problem.  First, if there is a serial port in your box, you can boot the
kernel with the 'no_console_suspend' parameter and try to log kernel
messages using the serial console.  This may provide you with some information
about the reasons for the suspend (or resume) failure.  Alternatively, it may be
possible to use a FireWire port for debugging with firescope
(ftp://ftp.firstfloor.org/pub/ak/firescope/).  On i386 it is also possible to
use the PM_TRACE mechanism documented in Documentation/s2ram.txt .
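
For the serial console approach, it is worth checking that the running kernel
was really booted with the intended parameters.  The sketch below inspects a
command line string for a serial console and for 'no_console_suspend'.  The
helper name is invented, and the "console=ttyS" prefix is an assumption that
fits PC-style serial ports; other platforms use different device names.

```shell
# Sketch only: check_debug_cmdline is a hypothetical helper name.

# check_debug_cmdline "CMDLINE": succeed if CMDLINE (for example the
# contents of /proc/cmdline) names a ttyS serial console and also
# contains the no_console_suspend parameter.
check_debug_cmdline() {
    serial=no
    keep=no
    for word in $1; do
        case $word in
            console=ttyS*)      serial=yes ;;
            no_console_suspend) keep=yes ;;
        esac
    done
    [ "$serial" = yes ] && [ "$keep" = yes ]
}
```

For example:

# check_debug_cmdline "$(cat /proc/cmdline)" || echo "console setup incomplete"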

2. Testing suspend to RAM (STR)

To verify that the STR works, it is generally more convenient to use the s2ram
tool available from http://suspend.sf.net and documented at
http://en.opensuse.org/s2ram .  However, before doing that it is recommended to
carry out the procedure described in section 1.

Assume you have resolved the problems with the STD and you have found some
failing drivers.  These drivers are also likely to fail during the STR or
during the resume, so it is better to unload them every time before the STR
transition.  Now, you can follow the instructions at
http://en.opensuse.org/s2ram to test the system, but if it does not work
"out of the box", you may need to boot it with "init=/bin/bash" and test
s2ram in the minimal configuration.  In that case, you may be able to search
for failing drivers by following a procedure analogous to the one described in
1b).  If you find some failing drivers, you will have to unload them every time
before the STR transition (i.e. before you run s2ram), and please report the
problems with them.
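
Unloading the failing drivers before every transition is easy to forget, so it
may be convenient to generate the command sequence from a list.  The sketch
below only prints the commands and does not run them.  The helper name
unload_plan and the module names foo and bar are placeholders.

```shell
# Sketch only: unload_plan is a hypothetical helper name.

# unload_plan "COMMAND" MODULE...
# Print, in order, the commands that unload each MODULE, run COMMAND
# (the suspend step), and load the modules again after resume.
unload_plan() {
    cmd=$1
    shift
    for m in "$@"; do
        echo "modprobe -r $m"
    done
    echo "$cmd"
    for m in "$@"; do
        echo "modprobe $m"
    done
}
```

For example, "unload_plan s2ram foo bar" prints the sequence for an STR test.
Once it looks right, it can be piped to a root shell with
"unload_plan s2ram foo bar | sh".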