| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1 |  | 
|  | 2 | Debugging on Linux for s/390 & z/Architecture | 
|  | 3 | by | 
|  | 4 | Denis Joseph Barrow (djbarrow@de.ibm.com,barrow_dj@yahoo.com) | 
|  | 5 | Copyright (C) 2000-2001 IBM Deutschland Entwicklung GmbH, IBM Corporation | 
|  | 6 | Best viewed with fixed width fonts | 
|  | 7 |  | 
|  | 8 | Overview of Document: | 
|  | 9 | ===================== | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 10 | This document is intended to give a good overview of how to debug | 
| Matt LaPlante | fff9289 | 2006-10-03 22:47:42 +0200 | [diff] [blame] | 11 | Linux for s/390 & z/Architecture. It isn't intended as a complete reference or as a | 
|  | 12 | tutorial on the fundamentals of C & assembly. It doesn't go into | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 13 | 390 IO in any detail. It is intended to complement the documents in the | 
|  | 14 | reference section below & any other worthwhile references you get. | 
|  | 15 |  | 
|  | 16 | It is intended, like the Enterprise Systems Architecture/390 Reference Summary, | 
|  | 17 | to be printed out & used as a quick cheat sheet, self help style reference when | 
|  | 18 | problems occur. | 
|  | 19 |  | 
|  | 20 | Contents | 
|  | 21 | ======== | 
|  | 22 | Register Set | 
|  | 23 | Address Spaces on Intel Linux | 
|  | 24 | Address Spaces on Linux for s/390 & z/Architecture | 
|  | 25 | The Linux for s/390 & z/Architecture Kernel Task Structure | 
|  | 26 | Register Usage & Stackframes on Linux for s/390 & z/Architecture | 
|  | 27 | A sample program with comments | 
|  | 28 | Compiling programs for debugging on Linux for s/390 & z/Architecture | 
|  | 29 | Figuring out gcc compile errors | 
|  | 30 | Debugging Tools | 
|  | 31 | objdump | 
|  | 32 | strace | 
|  | 33 | Performance Debugging | 
|  | 34 | Debugging under VM | 
|  | 35 | s/390 & z/Architecture IO Overview | 
|  | 36 | Debugging IO on s/390 & z/Architecture under VM | 
|  | 37 | GDB on s/390 & z/Architecture | 
|  | 38 | Stack chaining in gdb by hand | 
|  | 39 | Examining core dumps | 
|  | 40 | ldd | 
|  | 41 | Debugging modules | 
|  | 42 | The proc file system | 
|  | 43 | Starting points for debugging scripting languages etc. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 44 | SysRq | 
|  | 45 | References | 
|  | 46 | Special Thanks | 
|  | 47 |  | 
|  | 48 | Register Set | 
|  | 49 | ============ | 
|  | 50 | The current architectures have the following registers. | 
|  | 51 |  | 
|  | 52 | 16 General purpose registers, 32 bit on s/390, 64 bit on z/Architecture, r0-r15 or gpr0-gpr15, used for arithmetic & addressing. | 
|  | 53 |  | 
|  | 54 | 16 Control registers, 32 bit on s/390, 64 bit on z/Architecture ( cr0-cr15, kernel usage only ), used for memory management, | 
|  | 55 | interrupt control, debugging control etc. | 
|  | 56 |  | 
|  | 57 | 16 Access registers ( ar0-ar15 ), 32 bit on s/390 & z/Architecture, | 
|  | 58 | not used by normal programs but potentially could | 
|  | 59 | be used as temporary storage. Their main purpose is their 1 to 1 | 
|  | 60 | association with general purpose registers & they are used in | 
|  | 61 | the kernel for copying data between kernel & user address spaces. | 
|  | 62 | Access register 0 ( & access register 1 on z/Architecture, which needs a 64 bit | 
|  | 63 | pointer ) is currently used by the pthread library as a pointer to | 
|  | 64 | the currently running thread's private area. | 
|  | 65 |  | 
|  | 66 | 16 64 bit floating point registers (fp0-fp15 ) IEEE & HFP floating | 
|  | 67 | point format compliant on G5 upwards & a Floating point control reg (FPC) | 
|  | 68 | 4  64 bit registers (fp0,fp2,fp4 & fp6) HFP only on older machines. | 
|  | 69 | Note: | 
|  | 70 | Linux (currently) always uses IEEE & emulates G5 IEEE format on older machines, | 
|  | 71 | ( provided the kernel is configured for this ). | 
|  | 72 |  | 
|  | 73 |  | 
|  | 74 | The PSW is the most important register on the machine. It | 
|  | 75 | is 64 bit on s/390 & 128 bit on z/Architecture & serves the roles of | 
|  | 76 | a program counter (pc), condition code register & memory space designator. | 
|  | 77 | In IBM standard notation I am counting bit 0 as the MSB. | 
|  | 78 | It has several advantages over a normal program counter | 
|  | 79 | in that you can change address translation & program counter | 
|  | 80 | in a single instruction. Changing address translation, | 
|  | 81 | e.g. switching address translation off, requires that you | 
|  | 82 | have a logical=physical mapping for the address you are | 
|  | 83 | currently running at. | 
|  | 84 |  | 
|  | 85 | Bit           Value | 
|  | 86 | s/390 z/Architecture | 
|  | 87 | 0       0     Reserved ( must be 0 ) otherwise specification exception occurs. | 
|  | 88 |  | 
|  | 89 | 1       1     Program Event Recording 1 PER enabled, | 
| Matt LaPlante | a2ffd27 | 2006-10-03 22:49:15 +0200 | [diff] [blame] | 90 | PER is used to facilitate debugging e.g. single stepping. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 91 |  | 
|  | 92 | 2-4    2-4    Reserved ( must be 0 ). | 
|  | 93 |  | 
|  | 94 | 5       5     Dynamic address translation 1=DAT on. | 
|  | 95 |  | 
|  | 96 | 6       6     Input/Output interrupt Mask | 
|  | 97 |  | 
|  | 98 | 7       7     External interrupt Mask used primarily for interprocessor signalling & | 
|  | 99 | clock interrupts. | 
|  | 100 |  | 
|  | 101 | 8-11  8-11    PSW Key, used for a complex memory protection mechanism, not used under linux | 
|  | 102 |  | 
|  | 103 | 12      12    1 on s/390 0 on z/Architecture | 
|  | 104 |  | 
|  | 105 | 13      13    Machine Check Mask 1=enable machine check interrupts | 
|  | 106 |  | 
|  | 107 | 14      14    Wait State: set this to 1 to stop the processor except for interrupts & give | 
|  | 108 | time to other LPARS; used in CPU idle in the kernel to increase overall | 
|  | 109 | usage of processor resources. | 
|  | 110 |  | 
|  | 111 | 15      15    Problem state ( if set to 1 certain instructions are disabled ) | 
|  | 112 | all linux user programs run with this bit 1 | 
|  | 113 | ( useful info for debugging under VM ). | 
|  | 114 |  | 
|  | 115 | 16-17 16-17   Address Space Control | 
|  | 116 |  | 
|  | 117 | 00 Primary Space Mode when DAT on | 
|  | 118 | The linux kernel currently runs in this mode, CR1 is affiliated with | 
|  | 119 | this mode & points to the primary segment table origin etc. | 
|  | 120 |  | 
|  | 121 | 01 Access register mode this mode is used in functions to | 
|  | 122 | copy data between kernel & user space. | 
|  | 123 |  | 
|  | 124 | 10 Secondary space mode: not used in linux, however CR7, the | 
|  | 125 | register affiliated with this mode, is used, & normally | 
|  | 126 | CR13=CR7 to allow us to copy data between kernel & user space. | 
|  | 127 | We do this as follows: | 
|  | 128 | We set ar2 to 0 to designate its | 
|  | 129 | affiliated gpr ( gpr2 ) to point to primary=kernel space. | 
|  | 130 | We set ar4 to 1 to designate its | 
|  | 131 | affiliated gpr ( gpr4 ) to point to secondary=home=user space | 
|  | 132 | & then essentially do a memcopy(gpr2,gpr4,size) to | 
|  | 133 | copy data between the address spaces. The reason we use home space for the | 
|  | 134 | kernel & don't keep secondary space free is that code will not run in | 
|  | 135 | secondary space. | 
|  | 136 |  | 
|  | 137 | 11 Home Space Mode all user programs run in this mode. | 
|  | 138 | it is affiliated with CR13. | 
|  | 139 |  | 
|  | 140 | 18-19 18-19   Condition codes (CC) | 
|  | 141 |  | 
|  | 142 | 20    20      Fixed point overflow mask if 1=FPU exceptions for this event | 
|  | 143 | occur ( normally 0 ) | 
|  | 144 |  | 
|  | 145 | 21    21      Decimal overflow mask if 1=FPU exceptions for this event occur | 
|  | 146 | ( normally 0 ) | 
|  | 147 |  | 
|  | 148 | 22    22      Exponent underflow mask if 1=FPU exceptions for this event occur | 
|  | 149 | ( normally 0 ) | 
|  | 150 |  | 
|  | 151 | 23    23      Significance Mask if 1=FPU exceptions for this event occur | 
|  | 152 | ( normally 0 ) | 
|  | 153 |  | 
|  | 154 | 24-31 24-30   Reserved Must be 0. | 
|  | 155 |  | 
|  | 156 | -       31    Extended Addressing Mode ( z/Architecture only ) | 
|  | 157 | -       32    Basic Addressing Mode ( z/Architecture only ) | 
|  | 158 | Used to set addressing mode | 
|  | 159 | PSW 31   PSW 32 | 
|  | 160 | 0         0        24 bit | 
|  | 161 | 0         1        31 bit | 
|  | 162 | 1         1        64 bit | 
|  | 163 |  | 
|  | 164 | 32      -     ( s/390 only ) 1=31 bit addressing mode 0=24 bit addressing mode (for backward | 
| Matt LaPlante | 6c28f2c | 2006-10-03 22:46:31 +0200 | [diff] [blame] | 165 | compatibility), linux always runs with this bit set to 1 | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 166 |  | 
|  | 167 | 33-63   -     Instruction address ( s/390 ). | 
|  | 168 | -     33-63   Reserved must be 0 ( z/Architecture ) | 
|  | 169 | -     64-127  Address ( z/Architecture ) | 
|  | 170 | In 24 bits mode bits 64-103=0 bits 104-127 Address | 
|  | 171 | In 31 bits mode bits 64-96=0 bits 97-127 Address | 
|  | 172 | Note: unlike 31 bit mode on s/390, bit 96 must be zero | 
|  | 173 | when loading the address with LPSWE, otherwise a | 
|  | 174 | specification exception occurs. LPSW is fully backward | 
|  | 175 | compatible. | 
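
The s/390 column of the table above can be turned into a small decoder, which is
handy when staring at a PSW in a dump. This is only a sketch: the struct & function
names are invented for illustration, only the 64 bit s/390 PSW is handled & the
example value in the usage note is made up to exercise the bits.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of a 64 bit s/390 PSW. The names are hypothetical;
 * the bit positions are the ones listed in the table above. */
struct psw390 {
	unsigned per, dat, io_mask, ext_mask, mcheck_mask, wait, problem_state;
	unsigned asc;		/* address space control, bits 16-17 */
	unsigned cc;		/* condition code, bits 18-19 */
	unsigned amode31;	/* bit 32: 1=31 bit, 0=24 bit addressing */
	uint32_t address;	/* instruction address, bits 33-63 */
};

/* Extract bits first..last using IBM numbering ( bit 0 is the MSB ). */
static uint64_t psw_bits(uint64_t psw, unsigned first, unsigned last)
{
	assert(first <= last && last <= 63);
	unsigned width = last - first + 1;
	uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);
	return (psw >> (63 - last)) & mask;
}

static struct psw390 decode_psw390(uint64_t psw)
{
	struct psw390 p;

	p.per           = (unsigned)psw_bits(psw, 1, 1);
	p.dat           = (unsigned)psw_bits(psw, 5, 5);
	p.io_mask       = (unsigned)psw_bits(psw, 6, 6);
	p.ext_mask      = (unsigned)psw_bits(psw, 7, 7);
	p.mcheck_mask   = (unsigned)psw_bits(psw, 13, 13);
	p.wait          = (unsigned)psw_bits(psw, 14, 14);
	p.problem_state = (unsigned)psw_bits(psw, 15, 15);
	p.asc           = (unsigned)psw_bits(psw, 16, 17);
	p.cc            = (unsigned)psw_bits(psw, 18, 19);
	p.amode31       = (unsigned)psw_bits(psw, 32, 32);
	p.address       = (uint32_t)psw_bits(psw, 33, 63);
	return p;
}
```

For example decode_psw390(0x070dc00080400386ULL) reports DAT on, problem state 1,
address space control 3 ( home space ) & an instruction address of 0x400386, i.e.
what the table says a user program should look like.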
|  | 176 |  | 
|  | 177 |  | 
|  | 178 | Prefix Page(s) | 
|  | 179 | -------------- | 
|  | 180 | This per cpu memory area is too intimately tied to the processor not to mention. | 
|  | 181 | It exists between the real addresses 0-4095 on s/390 & 0-8191 on z/Architecture & is exchanged | 
|  | 182 | with 1 page on s/390 or 2 pages on z/Architecture in absolute storage by the set | 
|  | 183 | prefix instruction in Linux's startup. | 
|  | 184 | This page is mapped to a different prefix for each processor in an SMP configuration | 
|  | 185 | ( assuming the os designer is sane of course :-) ). | 
|  | 186 | Bytes 0-512 ( 200 hex ) on s/390 & 0-512,4096-4544,4604-5119 currently on z/Architecture | 
|  | 187 | are used by the processor itself for holding such information as exception indications & | 
|  | 188 | entry points for exceptions. | 
|  | 189 | Bytes after 0xc00 are used by linux for per processor globals on s/390 & z/Architecture | 
| Matt LaPlante | 3f6dee9 | 2006-10-03 22:45:33 +0200 | [diff] [blame] | 190 | ( there is currently a gap on z/Architecture too, between 0xc00 & 0x1000, which linux uses ). | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 191 | The closest thing to this on traditional architectures is the interrupt | 
|  | 192 | vector table. This is a good thing & does simplify some of the kernel coding, | 
|  | 193 | however it means that we now cannot catch stray NULL pointers in the | 
|  | 194 | kernel without hard coded checks. | 
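
The exchange between the low real addresses & the per cpu prefix area described
above can be sketched as a real to absolute address conversion. This is an
illustration only: real_to_absolute() & its parameters are invented names, the
prefix value is assumed to be aligned to the area size & area is 4096 on s/390
or 8192 on z/Architecture.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: apply prefixing to a real address.
 * Addresses in the first "area" bytes are swapped with the
 * area starting at "prefix"; everything else is unchanged. */
static uint64_t real_to_absolute(uint64_t real, uint64_t prefix, uint64_t area)
{
	assert(area == 4096 || area == 8192);
	assert(prefix % area == 0);

	if (real < area)
		return prefix + real;		/* low addresses land in the prefix area */
	if (real >= prefix && real < prefix + area)
		return real - prefix;		/* the prefix area lands at absolute 0 */
	return real;
}
```

With a prefix of 0x10000 on s/390, real address 0x200 ends up at absolute
0x10200 for that cpu, which is why each processor in an SMP box sees its own
copy of the low storage.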
|  | 195 |  | 
|  | 196 |  | 
|  | 197 |  | 
|  | 198 | Address Spaces on Intel Linux | 
|  | 199 | ============================= | 
|  | 200 |  | 
|  | 201 | The traditional Intel Linux is approximately mapped as follows ( forgive | 
|  | 202 | the ascii art ). | 
|  | 203 | 0xFFFFFFFF 4GB Himem                        ***************** | 
|  | 204 |                                             *               * | 
|  | 205 |                                             * Kernel Space  * | 
|  | 206 |                                             *               * | 
|  | 207 |                                             *****************          **************** | 
|  | 208 | User Space Himem (typically 0xC0000000 3GB )*  User Stack   *          *              * | 
|  | 209 |                                             *****************          *              * | 
|  | 210 |                                             *  Shared Libs  *          * Next Process * | 
|  | 211 |                                             *****************          *     to       * | 
|  | 212 |                                             *               *    <==   *     Run      *  <== | 
|  | 213 |                                             *  User Program *          *              * | 
|  | 214 |                                             *   Data BSS    *          *              * | 
|  | 215 |                                             *    Text       *          *              * | 
|  | 216 |                                             *   Sections    *          *              * | 
|  | 217 | 0x00000000                                  *****************          **************** | 
|  | 218 |  | 
|  | 219 | Now it is easy to see that on Intel it is quite easy to recognise a kernel address | 
|  | 220 | as being one greater than user space himem ( in this case 0xC0000000 ), | 
|  | 221 | & addresses of less than this are the ones in the current running program on this | 
|  | 222 | processor ( if an smp box ). | 
|  | 223 | If using the virtual machine ( VM ) as a debugger it is quite difficult to | 
|  | 224 | know which user process is running as the address space you are looking at | 
|  | 225 | could be from any process in the run queue. | 
|  | 226 |  | 
|  | 227 | The limitation of Intel's addressing technique is that the linux | 
|  | 228 | kernel uses a very simple real address to virtual address mapping | 
|  | 229 | of Real Address=Virtual Address-User Space Himem. | 
|  | 230 | This means that on Intel the linux kernel can typically only address | 
|  | 231 | Himem=0xFFFFFFFF-0xC0000000=1GB & this is all the RAM these machines | 
|  | 232 | can typically use. | 
|  | 233 | They can lower User Himem to 2GB or lower & thus be | 
|  | 234 | able to use 2GB of RAM, however this shrinks the maximum size | 
|  | 235 | of User Space from 3GB to 2GB. They have a no win limit of 4GB unless | 
|  | 236 | they go to 64 Bit. | 
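
The mapping & the resulting RAM limit above can be written down in a few lines
of C. This is a sketch of the arithmetic only, not kernel code: the function
names are invented & User Space Himem is passed in as a parameter so that the
3GB & 2GB cases can both be tried.

```c
#include <assert.h>
#include <stdint.h>

/* An address at or above User Space Himem belongs to the kernel. */
static int is_kernel_address(uint32_t virt, uint32_t user_himem)
{
	return virt >= user_himem;
}

/* Real Address = Virtual Address - User Space Himem */
static uint32_t kernel_virt_to_real(uint32_t virt, uint32_t user_himem)
{
	assert(is_kernel_address(virt, user_himem));
	return virt - user_himem;
}

/* Bytes of RAM the kernel can reach through this simple mapping. */
static uint64_t directly_mappable_bytes(uint32_t user_himem)
{
	return (uint64_t)(0xFFFFFFFFu - user_himem) + 1;
}
```

With User Space Himem at 0xC0000000 this gives 0x40000000 bytes ( 1GB ) of
directly addressable RAM, with it lowered to 0x80000000 it gives 2GB.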
|  | 237 |  | 
|  | 238 |  | 
|  | 239 | On 390 our limitations & strengths make us slightly different. | 
|  | 240 | For backward compatibility we are only allowed to use 31 bits (2GB) | 
| Matt LaPlante | 6c28f2c | 2006-10-03 22:46:31 +0200 | [diff] [blame] | 241 | of our 32 bit addresses, however, we use entirely separate address | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 242 | spaces for the user & kernel. | 
|  | 243 |  | 
|  | 244 | This means we can support 2GB of non Extended RAM on s/390, & more | 
|  | 245 | with the Extended memory management swap device & | 
|  | 246 | currently 4TB of physical memory on z/Architecture. | 
|  | 247 |  | 
|  | 248 |  | 
|  | 249 | Address Spaces on Linux for s/390 & z/Architecture | 
|  | 250 | ================================================== | 
|  | 251 |  | 
|  | 252 | Our addressing scheme is as follows | 
|  | 253 |  | 
|  | 254 |  | 
|  | 255 | Himem 0x7fffffff 2GB on s/390    *****************          **************** | 
|  | 256 | currently 0x3ffffffffff (2^42)-1 *  User Stack   *          *              * | 
|  | 257 | on z/Architecture.               *****************          *              * | 
|  | 258 |                                  *  Shared Libs  *          *              * | 
|  | 259 |                                  *****************          *              * | 
|  | 260 |                                  *               *          *    Kernel    * | 
|  | 261 |                                  *  User Program *          *              * | 
|  | 262 |                                  *   Data BSS    *          *              * | 
|  | 263 |                                  *    Text       *          *              * | 
|  | 264 |                                  *   Sections    *          *              * | 
|  | 265 | 0x00000000                       *****************          **************** | 
|  | 266 |  | 
|  | 267 | This also means that we need to look at the PSW problem state bit | 
|  | 268 | or the addressing mode to decide whether we are looking at | 
|  | 269 | user or kernel space. | 
|  | 270 |  | 
|  | 271 | Virtual Addresses on s/390 & z/Architecture | 
|  | 272 | =========================================== | 
|  | 273 |  | 
|  | 274 | A virtual address on s/390 is made up of 3 parts | 
|  | 275 | The SX ( segment index, roughly corresponding to the PGD & PMD in linux terminology ) | 
|  | 276 | being bits 1-11. | 
|  | 277 | The PX ( page index, corresponding to the page table entry (pte) in linux terminology ) | 
|  | 278 | being bits 12-19. | 
|  | 279 | The remaining bits, BX ( the byte index ), are the offset in the page, | 
|  | 280 | i.e. bits 20 to 31. | 
|  | 281 |  | 
|  | 282 | On z/Architecture in linux we currently make up an address from 4 parts. | 
|  | 283 | The region index (RX) being bits 0-32, of which we currently use bits 22-32 | 
|  | 284 | The segment index (SX) being bits 33-43 | 
|  | 285 | The page index (PX) being bits  44-51 | 
|  | 286 | The byte index (BX) being bits  52-63 | 
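
The splits above are just shifts & masks once the IBM bit numbers ( bit 0 is the
MSB ) are converted to shift counts. The sketch below shows this; the struct &
function names are invented & the z/Architecture version only handles the
bits 22-63 that linux currently uses.

```c
#include <assert.h>
#include <stdint.h>

struct va390 { unsigned sx, px, bx; };		/* s/390: 11 + 8 + 12 bits */
struct va64  { unsigned rx, sx, px, bx; };	/* z/Architecture: 11 + 11 + 8 + 12 bits */

/* s/390: SX = bits 1-11, PX = bits 12-19, BX = bits 20-31 of a 32 bit word. */
static struct va390 split_va390(uint32_t addr)
{
	struct va390 v;

	v.sx = (addr >> 20) & 0x7ff;
	v.px = (addr >> 12) & 0xff;
	v.bx = addr & 0xfff;
	return v;
}

/* z/Architecture: RX = bits 22-32, SX = bits 33-43, PX = bits 44-51,
 * BX = bits 52-63 of a 64 bit word. */
static struct va64 split_va64(uint64_t addr)
{
	struct va64 v;

	assert(addr < (1ULL << 42));	/* only the currently used 42 bits */
	v.rx = (unsigned)((addr >> 31) & 0x7ff);
	v.sx = (unsigned)((addr >> 20) & 0x7ff);
	v.px = (unsigned)((addr >> 12) & 0xff);
	v.bx = (unsigned)(addr & 0xfff);
	return v;
}
```

For example the s/390 address 0x00400386 splits into SX 4, PX 0 & BX 0x386.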
|  | 287 |  | 
|  | 288 | Notes: | 
|  | 289 | 1) s/390 has no PMD so the PMD is really the PGD also. | 
|  | 290 | A lot of this stuff is defined in pgtable.h. | 
|  | 291 |  | 
|  | 292 | 2) Also seeing as s/390's page tables are only 1k in size | 
|  | 293 | ( bits 12-19 x 4 bytes per pte ) we use 1 page ( 4k ) | 
|  | 294 | to make the best use of memory by updating 4 segment index | 
|  | 295 | entries each time we mess with a PMD & use offsets | 
|  | 296 | 0,1024,2048 & 3072 in this page for our 4 segment index entries. | 
|  | 297 | On z/Architecture our page tables are now 2k in size | 
|  | 298 | ( bits 44-51 x 8 bytes per pte ) so we do a similar trick | 
|  | 299 | but only mess with 2 segment indices each time we mess with | 
|  | 300 | a PMD. | 
|  | 301 |  | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 302 | 3) Although z/Architecture supports up to a massive 5-level page table lookup we | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 303 | can only use 3 currently on Linux ( as this is all the generic kernel | 
|  | 304 | currently supports ), however this may change in future. | 
|  | 305 | This allows us to access ( according to my sums ) | 
|  | 306 | 4TB of virtual storage per process i.e. | 
|  | 307 | 4096*512(PTES)*1024(PMDS)*2048(PGD) = 4398046511104 bytes, | 
|  | 308 | enough for another 2 or 3 years I think :-). | 
|  | 309 | To do this we use a region-third-table designation type in | 
|  | 310 | our address space control registers. | 
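
The sums in notes 2 & 3 can be checked with a few lines of C. The helper names
below are made up for this sketch; the numbers ( 8 page index bits, 4 or 8 bytes
per pte, 4k pages & the 512/1024/2048 entry counts ) are the ones quoted in the
notes above.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES 4096u

/* Size in bytes of one hardware page table. */
static unsigned page_table_bytes(unsigned px_bits, unsigned pte_bytes)
{
	assert(px_bits < 12 && pte_bytes > 0);
	return (1u << px_bits) * pte_bytes;
}

/* How many hardware page tables fit in one 4k page. */
static unsigned page_tables_per_page(unsigned px_bits, unsigned pte_bytes)
{
	return PAGE_BYTES / page_table_bytes(px_bits, pte_bytes);
}

/* Offset of the n'th hardware page table inside its 4k page. */
static unsigned page_table_offset(unsigned n, unsigned px_bits, unsigned pte_bytes)
{
	assert(n < page_tables_per_page(px_bits, pte_bytes));
	return n * page_table_bytes(px_bits, pte_bytes);
}

/* Virtual storage reachable through a 3 level lookup. */
static uint64_t addressable_bytes(uint64_t page, uint64_t ptes,
				  uint64_t pmds, uint64_t pgds)
{
	return page * ptes * pmds * pgds;
}
```

page_tables_per_page(8, 4) gives the 4 tables per page of s/390 &
page_tables_per_page(8, 8) the 2 of z/Architecture, while
addressable_bytes(4096, 512, 1024, 2048) comes out as 2^42 bytes, i.e. the 4TB
quoted above.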
|  | 311 |  | 
|  | 312 |  | 
|  | 313 | The Linux for s/390 & z/Architecture Kernel Task Structure | 
|  | 314 | ========================================================== | 
|  | 315 | Each process/thread under Linux for S390 has its own kernel task_struct | 
|  | 316 | defined in linux/include/linux/sched.h | 
|  | 317 | The S390 on initialisation & resuming of a process on a cpu sets | 
|  | 318 | the __LC_KERNEL_STACK variable in the spare prefix area for this cpu | 
| Matt LaPlante | 53cb472 | 2006-10-03 22:55:17 +0200 | [diff] [blame] | 319 | (which we use for per-processor globals). | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 320 |  | 
| Matt LaPlante | 53cb472 | 2006-10-03 22:55:17 +0200 | [diff] [blame] | 321 | The kernel stack pointer is intimately tied with the task structure for | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 322 | each processor as follows. | 
|  | 323 |  | 
|  | 324 | s/390 | 
|  | 325 |             ************************ | 
|  | 326 |             *  1 page kernel stack * | 
|  | 327 |             *        ( 4K )        * | 
|  | 328 |             ************************ | 
|  | 329 |             *   1 page task_struct * | 
|  | 330 |             *        ( 4K )        * | 
|  | 331 | 8K aligned  ************************ | 
|  | 332 |  | 
|  | 333 | z/Architecture | 
|  | 334 |             ************************ | 
|  | 335 |             *  2 page kernel stack * | 
|  | 336 |             *        ( 8K )        * | 
|  | 337 |             ************************ | 
|  | 338 |             *  2 page task_struct  * | 
|  | 339 |             *        ( 8K )        * | 
|  | 340 | 16K aligned ************************ | 
|  | 341 |  | 
|  | 342 | What this means is that we don't need to dedicate any register or global variable | 
|  | 343 | to point to the current running process & can retrieve it with the following | 
|  | 344 | very simple construct for s/390 & one very similar for z/Architecture. | 
|  | 345 |  | 
|  | 346 | static inline struct task_struct * get_current(void) | 
|  | 347 | { | 
|  | 348 |         struct task_struct *current; | 
|  | 349 |         __asm__("lhi   %0,-8192\n\t" | 
|  | 350 |                 "nr    %0,15" | 
|  | 351 |                 : "=r" (current) ); | 
|  | 352 |         return current; | 
|  | 353 | } | 
|  | 354 |  | 
|  | 355 | i.e. just anding the current kernel stack pointer with the mask -8192. | 
| Matt LaPlante | fff9289 | 2006-10-03 22:47:42 +0200 | [diff] [blame] | 356 | Thankfully, because Linux doesn't have support for nested IO interrupts | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 357 | & our devices have large buffers & can survive interrupts being shut for | 
|  | 358 | short amounts of time, we don't need a separate stack for interrupts. | 
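
The masking trick can also be tried out on any machine with plain C, which may
help when working out by hand which task_struct a stack pointer in a dump
belongs to. This is a portable sketch, not the kernel's code: the function name
is invented & the combined size of task_struct plus kernel stack ( 8192 on
s/390, 16384 on z/Architecture ) is passed in.

```c
#include <assert.h>
#include <stdint.h>

/* Round a kernel stack pointer down to the start of the aligned
 * task_struct + kernel stack area it lives in. */
static uintptr_t task_struct_from_sp(uintptr_t sp, uintptr_t area_size)
{
	/* the area size must be a power of 2 for the mask to work */
	assert(area_size != 0 && (area_size & (area_size - 1)) == 0);
	return sp & ~(area_size - 1);
}
```

For an area size of 8192 the mask ~(8192 - 1) is the same bit pattern as the
-8192 loaded by lhi above, so a stack pointer of 0x00a7bf50 gives a task_struct
at 0x00a7a000.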
|  | 359 |  | 
|  | 360 |  | 
|  | 361 |  | 
|  | 362 |  | 
|  | 363 | Register Usage & Stackframes on Linux for s/390 & z/Architecture | 
|  | 364 | ================================================================= | 
|  | 365 | Overview: | 
|  | 366 | --------- | 
|  | 367 | This is the code that gcc produces at the top & the bottom of | 
| Matt LaPlante | 992caac | 2006-10-03 22:52:05 +0200 | [diff] [blame] | 368 | each function. It usually is fairly consistent & similar from | 
|  | 369 | function to function & if you know its layout you can probably | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 370 | make some headway in finding the ultimate cause of a problem | 
|  | 371 | after a crash without a source level debugger. | 
|  | 372 |  | 
|  | 373 | Note: To follow stackframes requires a knowledge of C or Pascal & | 
|  | 374 | limited knowledge of one assembly language. | 
|  | 375 |  | 
|  | 376 | It should be noted that there are some differences between the | 
|  | 377 | s/390 & z/Architecture stack layouts as the z/Architecture stack layout didn't have | 
|  | 378 | to maintain compatibility with older linkage formats. | 
|  | 379 |  | 
|  | 380 | Glossary: | 
|  | 381 | --------- | 
|  | 382 | alloca: | 
|  | 383 | This is a built in compiler function for runtime allocation | 
|  | 384 | of extra space on the caller's stack which is obviously freed | 
|  | 385 | up on function exit ( e.g. the caller may choose to allocate nothing | 
|  | 386 | or a buffer of 4k if required for temporary purposes ), it generates | 
|  | 387 | very efficient code ( a few cycles ) when compared to alternatives | 
|  | 388 | like malloc. | 
|  | 389 |  | 
|  | 390 | automatics: These are local variables on the stack, | 
|  | 391 | i.e they aren't in registers & they aren't static. | 
|  | 392 |  | 
|  | 393 | back-chain: | 
|  | 394 | This is a pointer to the stack pointer before entering a | 
|  | 395 | framed function's ( see frameless function ) prologue, got by | 
| Matt LaPlante | fff9289 | 2006-10-03 22:47:42 +0200 | [diff] [blame] | 396 | dereferencing the address of the current stack pointer, | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 397 | i.e. got by accessing the 32 bit value at the stack pointer's | 
|  | 398 | current location. | 
|  | 399 |  | 
|  | 400 | base-pointer: | 
|  | 401 | This is a pointer to the back of the literal pool which | 
|  | 402 | is an area just behind each procedure used to store constants | 
|  | 403 | in each function. | 
|  | 404 |  | 
|  | 405 | call-clobbered: The caller probably needs to save these registers if there | 
|  | 406 | is something of value in them, on the stack or elsewhere before making a | 
|  | 407 | call to another procedure so that it can restore it later. | 
|  | 408 |  | 
|  | 409 | epilogue: | 
|  | 410 | The code generated by the compiler to return to the caller. | 
|  | 411 |  | 
|  | 412 | frameless-function | 
|  | 413 | A frameless function in Linux for s390 & z/Architecture is one which doesn't | 
|  | 414 | need more than the register save area ( 96 bytes on s/390, 160 on z/Architecture ) | 
|  | 415 | given to it by the caller. | 
|  | 416 | A frameless function never: | 
|  | 417 | 1) Sets up a back chain. | 
|  | 418 | 2) Calls alloca. | 
|  | 419 | 3) Calls other normal functions | 
|  | 420 | 4) Has automatics. | 
|  | 421 |  | 
|  | 422 | GOT-pointer: | 
|  | 423 | This is a pointer to the global-offset-table in ELF | 
|  | 424 | ( Executable & Linkable Format, Linux's most common executable format ), | 
|  | 425 | all globals & shared library objects are found using this pointer. | 
|  | 426 |  | 
|  | 427 | lazy-binding | 
|  | 428 | Routines in ELF shared libraries are typically only resolved when they are | 
|  | 429 | actually first called at runtime. This is lazy binding. | 
|  | 430 |  | 
|  | 431 | procedure-linkage-table | 
|  | 432 | This is a table found from the GOT which contains pointers to routines | 
|  | 433 | in other shared libraries which can't be called by easier means. | 
|  | 434 |  | 
|  | 435 | prologue: | 
|  | 436 | The code generated by the compiler to set up the stack frame. | 
|  | 437 |  | 
|  | 438 | outgoing-args: | 
|  | 439 | This is extra area allocated on the stack of the calling function if the | 
|  | 440 | parameters for the callee cannot all be put in registers, the same | 
|  | 441 | area can be reused by each function the caller calls. | 
|  | 442 |  | 
|  | 443 | routine-descriptor: | 
|  | 444 | A COFF  executable format based concept of a procedure reference | 
|  | 445 | actually being 8 bytes or more as opposed to a simple pointer to the routine. | 
|  | 446 | This is typically defined as follows | 
|  | 447 | Routine Descriptor offset 0=Pointer to Function | 
|  | 448 | Routine Descriptor offset 4=Pointer to Table of Contents | 
|  | 449 | The table of contents/TOC is roughly equivalent to a GOT pointer. | 
|  | 450 | & it means that shared libraries etc. can be shared between several | 
|  | 451 | environments each with their own TOC. | 
|  | 452 |  | 
|  | 453 |  | 
|  | 454 | static-chain: This is used in nested functions, a concept adopted from pascal | 
|  | 455 | by gcc & not used in ansi C or C++ ( although quite useful ). Basically it | 
|  | 456 | is a pointer used to reference local variables of enclosing functions. | 
|  | 457 | You might come across this stuff once or twice in your lifetime. | 
|  | 458 |  | 
|  | 459 | e.g. | 
|  | 460 | The function below should return 11 though gcc may get upset & toss warnings | 
|  | 461 | about unused variables. | 
|  | 462 | int FunctionA(int a) | 
|  | 463 | { | 
|  | 464 |         int b; | 
|  | 465 |         void FunctionC(int c) | 
|  | 466 |         { | 
|  | 467 |                 b=c+1; | 
|  | 468 |         } | 
|  | 469 |         FunctionC(10); | 
|  | 470 |         return(b); | 
|  | 471 | } | 
|  | 472 |  | 
|  | 473 |  | 
|  | 474 | s/390 & z/Architecture Register usage | 
|  | 475 | ===================================== | 
|  | 476 | r0       used by syscalls/assembly                  call-clobbered | 
|  | 477 | r1       used by syscalls/assembly                  call-clobbered | 
|  | 478 | r2       argument 0 / return value 0                call-clobbered | 
|  | 479 | r3       argument 1 / return value 1 (if long long) call-clobbered | 
|  | 480 | r4       argument 2                                 call-clobbered | 
|  | 481 | r5       argument 3                                 call-clobbered | 
| Heiko Carstens | d8c351a | 2007-02-05 21:17:34 +0100 | [diff] [blame] | 482 | r6       argument 4                                 saved | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 483 | r7       pointer-to arguments 5 to ...              saved | 
|  | 484 | r8       this & that                                saved | 
|  | 485 | r9       this & that                                saved | 
|  | 486 | r10      static-chain ( if nested function )        saved | 
|  | 487 | r11      frame-pointer ( if function used alloca )  saved | 
|  | 488 | r12      got-pointer                                saved | 
|  | 489 | r13      base-pointer                               saved | 
|  | 490 | r14      return-address                             saved | 
|  | 491 | r15      stack-pointer                              saved | 
|  | 492 |  | 
|  | 493 | f0       argument 0 / return value ( float/double ) call-clobbered | 
|  | 494 | f2       argument 1                                 call-clobbered | 
|  | 495 | f4       z/Architecture argument 2                  saved | 
|  | 496 | f6       z/Architecture argument 3                  saved | 
|  | 497 | The remaining floating points | 
|  | 498 | f1,f3,f5 f7-f15 are call-clobbered. | 
|  | 499 |  | 
|  | 500 | Notes: | 
|  | 501 | ------ | 
|  | 502 | 1) The only requirement is that registers which are used | 
|  | 503 | by the callee are saved, e.g. the compiler is perfectly | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 504 | capable of using r11 for purposes other than a | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 505 | frame pointer if a frame pointer is not needed. | 
|  | 506 | 2) In functions with variable arguments e.g. printf the calling procedure | 
|  | 507 | is identical to one without variable arguments & the same number of | 
|  | 508 | parameters. However, the prologue of this function is somewhat more | 
|  | 509 | hairy owing to it having to move these parameters to the stack to | 
|  | 510 | get va_start, va_arg & va_end to work. | 
|  | 511 | 3) Access registers are currently unused by gcc but are used in | 
|  | 512 | the kernel. Possibilities exist to use them at the moment for | 
|  | 513 | temporary storage but it isn't recommended. | 
|  | 514 | 4) Only 4 of the floating point registers are used for | 
|  | 515 | parameter passing as older machines such as G3 only have 4 | 
|  | 516 | & it keeps the stack frame compatible with other compilers. | 
|  | 517 | However with IEEE floating point emulation under linux on the | 
|  | 518 | older machines you are free to use the other 12. | 
|  | 519 | 5) A long long or double parameter cannot have the | 
|  | 520 | first 4 bytes in a register & the second four bytes in the | 
|  | 521 | outgoing args area. It must be purely in the outgoing args | 
|  | 522 | area if crossing this boundary. | 
|  | 523 | 6) Floating point parameters are mixed with outgoing args | 
|  | 524 | on the outgoing args area in the order they are passed in as parameters. | 
|  | 525 | 7) Floating point arguments 2 & 3 are saved in the outgoing args area for | 
|  | 526 | z/Architecture | 
|  | 527 |  | 
|  | 528 |  | 
|  | 529 | Stack Frame Layout | 
|  | 530 | ------------------ | 
|  | 531 | s/390     z/Architecture | 
|  | 532 | 0         0             back chain ( a 0 here signifies end of back chain ) | 
|  | 533 | 4         8             eos ( end of stack, not used on Linux for S390 used in other linkage formats ) | 
|  | 534 | 8         16            glue used in other s/390 linkage formats for saved routine descriptors etc. | 
|  | 535 | 12        24            glue used in other s/390 linkage formats for saved routine descriptors etc. | 
|  | 536 | 16        32            scratch area | 
|  | 537 | 20        40            scratch area | 
|  | 538 | 24        48            saved r6 of caller function | 
|  | 539 | 28        56            saved r7 of caller function | 
|  | 540 | 32        64            saved r8 of caller function | 
|  | 541 | 36        72            saved r9 of caller function | 
|  | 542 | 40        80            saved r10 of caller function | 
|  | 543 | 44        88            saved r11 of caller function | 
|  | 544 | 48        96            saved r12 of caller function | 
|  | 545 | 52        104           saved r13 of caller function | 
|  | 546 | 56        112           saved r14 of caller function | 
|  | 547 | 60        120           saved r15 of caller function | 
|  | 548 | 64        128           saved f4 of caller function | 
|  | 549 | 72        132           saved f6 of caller function | 
|  | 550 | 80                      undefined | 
|  | 551 | 96        160           outgoing args passed from caller to callee | 
|  | 552 | 96+x      160+x         possible stack alignment ( 8 bytes desirable ) | 
|  | 553 | 96+x+y    160+x+y       alloca space of caller ( if used ) | 
|  | 554 | 96+x+y+z  160+x+y+z     automatics of caller ( if used ) | 
|  | 555 | 0                       back-chain | 
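
The s/390 column of the layout above maps onto a C struct quite naturally &
following the back chain is then a short loop. This is a sketch under several
assumptions: the struct & function names are invented, memory is simulated as a
byte array holding big endian values ( as a dump from the machine would ) & only
the 31 bit layout is covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* s/390 register save area, offsets as in the table above. */
struct s390_stack_frame {
	uint32_t back_chain;	/*  0: 0 here signifies end of back chain */
	uint32_t eos;		/*  4 */
	uint32_t glue[2];	/*  8 */
	uint32_t scratch[2];	/* 16 */
	uint32_t gprs[10];	/* 24: saved r6-r15, r14 is at offset 56 */
	uint64_t f4;		/* 64 */
	uint64_t f6;		/* 72 */
	uint8_t  undefined[16];	/* 80, so outgoing args start at 96 */
};

/* Read a big endian 32 bit value from a dump. */
static uint32_t load_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Count the frames reachable from sp by following the back chain. */
static size_t count_frames(const uint8_t *mem, size_t mem_size,
			   uint32_t sp, size_t max_frames)
{
	size_t n = 0;

	while (sp != 0 && n < max_frames) {
		assert((size_t)sp + 4 <= mem_size);
		n++;
		sp = load_be32(mem + sp);	/* back chain is at offset 0 */
	}
	return n;
}
```

The max_frames limit is there because a corrupted stack can easily contain a
back chain that loops, which is exactly the situation you tend to be in when
chaining by hand.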
|  | 556 |  | 
|  | 557 | A sample program with comments. | 
|  | 558 | =============================== | 
|  | 559 |  | 
|  | 560 | Comments on the function test | 
|  | 561 | ----------------------------- | 
|  | 562 | 1) It didn't need to set up a pointer to the constant pool gpr13 as it isn't used | 
|  | 563 | ( :-( ). | 
|  | 564 | 2) This is a frameless function & no stack is bought. | 
|  | 565 | 3) The compiler was clever enough to recognise that it could return the | 
|  | 566 | value in r2 as well as use it for the passed in parameter ( :-) ). | 
|  | 567 | 4) The basr ( branch & save register ) trick works as follows: when the branch | 
|  | 568 | target operand is r0 the instruction does not branch ( with some instruction | 
|  | 569 | operands r0 is understood as the literal value 0, some risc architectures also do this ). | 
|  | 570 | So we fall through to the next instruction & its address ( the new program counter ) | 
|  | 571 | is saved in r13. We then subtract the size of the function prologue we have executed | 
|  | 572 | + the size of the literal pool to get to the top of the literal pool. | 
|  | 573 | 0040037c int test(int b) | 
|  | 574 | {                                                          # Function prologue below | 
|  | 575 | 40037c:	90 de f0 34 	stm	%r13,%r14,52(%r15) # Save registers r13 & r14 | 
|  | 576 | 400380:	0d d0       	basr	%r13,%r0           # Set up pointer to constant pool using | 
|  | 577 | 400382:	a7 da ff fa 	ahi	%r13,-6            # basr trick | 
|  | 578 | return(5+b); | 
|  | 579 | # Huge main program | 
|  | 580 | 400386:	a7 2a 00 05 	ahi	%r2,5              # add 5 to r2 | 
|  | 581 |  | 
|  | 582 | # Function epilogue below | 
|  | 583 | 40038a:	98 de f0 34 	lm	%r13,%r14,52(%r15) # restore registers r13 & r14 | 
|  | 584 | 40038e:	07 fe       	br	%r14               # return | 
|  | 585 | } | 
|  | 586 |  | 
|  | 587 | Comments on the function main | 
|  | 588 | ----------------------------- | 
|  | 589 | 1) The compiler did this function optimally ( 8-) ) | 
|  | 590 |  | 
|  | 591 | Literal pool for main. | 
|  | 592 | 400390:	ff ff ff ec 	.long 0xffffffec | 
|  | 593 | main(int argc,char *argv[]) | 
|  | 594 | {                                                          # Function prologue below | 
|  | 595 | 400394:	90 bf f0 2c 	stm	%r11,%r15,44(%r15) # Save necessary registers | 
|  | 596 | 400398:	18 0f       	lr	%r0,%r15           # copy stack pointer to r0 | 
|  | 597 | 40039a:	a7 fa ff a0 	ahi	%r15,-96           # Make area for callee saving | 
|  | 598 | 40039e:	0d d0       	basr	%r13,%r0           # Set up r13 to point to | 
|  | 599 | 4003a0:	a7 da ff f0 	ahi	%r13,-16           # literal pool | 
|  | 600 | 4003a4:	50 00 f0 00 	st	%r0,0(%r15)        # Save backchain | 
|  | 601 |  | 
|  | 602 | return(test(5));                                   # Main Program Below | 
|  | 603 | 4003a8:	58 e0 d0 00 	l	%r14,0(%r13)       # load relative address of test from | 
|  | 604 | # literal pool | 
|  | 605 | 4003ac:	a7 28 00 05 	lhi	%r2,5              # Set first parameter to 5 | 
|  | 606 | 4003b0:	4d ee d0 00 	bas	%r14,0(%r14,%r13)  # jump to test setting r14 as return | 
|  | 607 | # address using branch & save instruction. | 
|  | 608 |  | 
|  | 609 | # Function Epilogue below | 
|  | 610 | 4003b4:	98 bf f0 8c 	lm	%r11,%r15,140(%r15)# Restore necessary registers. | 
|  | 611 | 4003b8:	07 fe       	br	%r14               # return to do program exit | 
|  | 612 | } | 
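
The address arithmetic behind the basr trick & the literal pool lookup in the two
listings above can be checked with a few lines of C. This is a rough model only,
the function names are invented for this sketch & it assumes 32 bit addresses as
in the s/390 listing.

```c
#include <assert.h>
#include <stdint.h>

/* basr is a 2 byte instruction; with r0 as the branch target it does
 * not branch & just saves the address of the next instruction. */
uint32_t basr_saved_address(uint32_t basr_addr)
{
	assert(basr_addr % 2 == 0);	/* instructions are halfword aligned */
	return basr_addr + 2;
}

/* Models "basr %r13,%r0" followed by "ahi %r13,adjust":
 * returns the value that ends up in r13. */
uint32_t literal_pool_base(uint32_t basr_addr, int16_t adjust)
{
	return basr_saved_address(basr_addr) + (int32_t)adjust;
}

/* The pool entry in the listing holds an address relative to the pool
 * base, so the branch target is base + entry ( modulo 2^32 ). */
uint32_t pool_entry_target(uint32_t pool_base, uint32_t entry)
{
	return pool_base + entry;
}
```

With the values from the listing, literal_pool_base(0x40039e, -16) is 0x400390
( the literal pool for main ) & pool_entry_target(0x400390, 0xffffffec) is
0x40037c, the start of test.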
|  | 613 |  | 
|  | 614 |  | 
|  | 615 | Compiler updates | 
|  | 616 | ---------------- | 
|  | 617 |  | 
|  | 618 | main(int argc,char *argv[]) | 
|  | 619 | { | 
|  | 620 | 4004fc:	90 7f f0 1c       	stm	%r7,%r15,28(%r15) | 
|  | 621 | 400500:	a7 d5 00 04       	bras	%r13,400508 <main+0xc> | 
|  | 622 | 400504:	00 40 04 f4       	.long	0x004004f4 | 
|  | 623 | # compiler now puts the constant pool in the code too, so it saves an instruction | 
|  | 624 | 400508:	18 0f             	lr	%r0,%r15 | 
|  | 625 | 40050a:	a7 fa ff a0       	ahi	%r15,-96 | 
|  | 626 | 40050e:	50 00 f0 00       	st	%r0,0(%r15) | 
|  | 627 | return(test(5)); | 
|  | 628 | 400512:	58 10 d0 00       	l	%r1,0(%r13) | 
|  | 629 | 400516:	a7 28 00 05       	lhi	%r2,5 | 
|  | 630 | 40051a:	0d e1             	basr	%r14,%r1 | 
|  | 631 | # The compiler adds 1 extra instruction to the epilogue. This is done to | 
|  | 632 | # avoid processor pipeline stalls owing to data dependencies on g5 & | 
|  | 633 | # above, as register 14 in the old code was needed directly after being loaded | 
|  | 634 | # by the lm %r11,%r15,140(%r15) for the br %r14. | 
|  | 635 | 40051c:	58 40 f0 98       	l	%r4,152(%r15) | 
|  | 636 | 400520:	98 7f f0 7c       	lm	%r7,%r15,124(%r15) | 
|  | 637 | 400524:	07 f4             	br	%r4 | 
|  | 638 | } | 
|  | 639 |  | 
|  | 640 |  | 
|  | 641 | Hartmut ( our compiler developer ) also has been threatening to take out the | 
|  | 642 | stack backchain in optimised code as this also causes pipeline stalls, you | 
|  | 643 | have been warned. | 
|  | 644 |  | 
|  | 645 | 64 bit z/Architecture code disassembly | 
|  | 646 | -------------------------------------- | 
|  | 647 |  | 
|  | 648 | If you understand the stuff above you'll understand the stuff | 
|  | 649 | below too, so I'll avoid repeating myself & just say that | 
|  | 650 | some of the instructions have g's on the end of them to indicate | 
|  | 651 | they are 64 bit & the stack offsets are bigger. | 
|  | 652 | The only other difference you'll find between 32 & 64 bit is that | 
|  | 653 | we now use f4 & f6 for floating point arguments on 64 bit. | 
|  | 654 | 00000000800005b0 <test>: | 
|  | 655 | int test(int b) | 
|  | 656 | { | 
|  | 657 | return(5+b); | 
|  | 658 | 800005b0:	a7 2a 00 05       	ahi	%r2,5 | 
|  | 659 | 800005b4:	b9 14 00 22       	lgfr	%r2,%r2 # sign extend the 32 bit int result to 64 bits | 
|  | 660 | 800005b8:	07 fe             	br	%r14 | 
|  | 661 | 800005ba:	07 07             	bcr	0,%r7 | 
|  | 662 |  | 
|  | 663 |  | 
|  | 664 | } | 
|  | 665 |  | 
|  | 666 | 00000000800005bc <main>: | 
|  | 667 | main(int argc,char *argv[]) | 
|  | 668 | { | 
|  | 669 | 800005bc:	eb bf f0 58 00 24 	stmg	%r11,%r15,88(%r15) | 
|  | 670 | 800005c2:	b9 04 00 1f       	lgr	%r1,%r15 | 
|  | 671 | 800005c6:	a7 fb ff 60       	aghi	%r15,-160 | 
|  | 672 | 800005ca:	e3 10 f0 00 00 24 	stg	%r1,0(%r15) | 
|  | 673 | return(test(5)); | 
|  | 674 | 800005d0:	a7 29 00 05       	lghi	%r2,5 | 
|  | 675 | # brasl allows jumps > 64k & is overkill here, bras would do fine | 
|  | 676 | 800005d4:	c0 e5 ff ff ff ee 	brasl	%r14,800005b0 <test> | 
|  | 677 | 800005da:	e3 40 f1 10 00 04 	lg	%r4,272(%r15) | 
|  | 678 | 800005e0:	eb bf f0 f8 00 04 	lmg	%r11,%r15,248(%r15) | 
|  | 679 | 800005e6:	07 f4             	br	%r4 | 
|  | 680 | } | 
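
The targets objdump prints for bras & brasl in the listings above can be
recomputed from the instruction bytes: the offset field is a signed count of
halfwords relative to the address of the branch instruction itself. The sketch
below is illustrative only & the function names are invented for it.

```c
#include <assert.h>
#include <stdint.h>

/* bras: 4 byte instruction, signed 16 bit offset counted in halfwords */
uint64_t bras_target(uint64_t insn_addr, uint16_t raw_offset)
{
	int64_t off = raw_offset;

	assert(insn_addr % 2 == 0);
	if (off >= 0x8000)		/* sign extend by hand */
		off -= 0x10000;
	return insn_addr + 2 * off;
}

/* brasl: 6 byte instruction, signed 32 bit offset counted in halfwords */
uint64_t brasl_target(uint64_t insn_addr, uint32_t raw_offset)
{
	int64_t off = raw_offset;

	assert(insn_addr % 2 == 0);
	if (off >= 0x80000000LL)	/* sign extend by hand */
		off -= 0x100000000LL;
	return insn_addr + 2 * off;
}
```

The bytes a7 d5 00 04 at 400500 give bras_target(0x400500, 0x0004) = 0x400508
& the bytes c0 e5 ff ff ff ee at 800005d4 give
brasl_target(0x800005d4, 0xffffffee) = 0x800005b0, the address of test.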
|  | 681 |  | 
|  | 682 |  | 
|  | 683 |  | 
|  | 684 | Compiling programs for debugging on Linux for s/390 & z/Architecture | 
|  | 685 | ==================================================================== | 
|  | 686 | -gdwarf-2 now works & should be considered the default debugging | 
|  | 687 | format for s/390 & z/Architecture as it is more reliable for debugging | 
|  | 688 | shared libraries. Normal -g debugging also works much better now, | 
|  | 689 | thanks to bug reports from the IBM java compiler developers. | 
|  | 690 |  | 
|  | 691 | This is typically done by adding/appending the flags -g or -gdwarf-2 to the | 
|  | 692 | CFLAGS & LDFLAGS variables in the Makefile of the program concerned. | 
|  | 693 |  | 
|  | 694 | If using gdb & you would like accurate displays of registers & | 
|  | 695 | stack traces, compile without optimisation, i.e. make sure | 
|  | 696 | that there is no -O2 or similar on the CFLAGS line of the Makefile & | 
|  | 697 | the emitted gcc commands. Obviously this will produce worse code | 
|  | 698 | ( not advisable for shipment ) but it is an aid to the debugging process. | 
|  | 699 |  | 
|  | 700 | This aids debugging because the compiler will copy parameters passed | 
|  | 701 | in registers onto the stack, so backtracing & looking at passed in | 
|  | 702 | parameters will work. However, some larger programs which use inline functions | 
|  | 703 | will not compile without optimisation. | 
|  | 704 |  | 
|  | 705 | Debugging with optimisation has much improved since some bugs were | 
|  | 706 | fixed; please make sure you are using gdb-5.0 or later, developed | 
|  | 707 | after Nov'2000. | 
|  | 708 |  | 
|  | 709 | Figuring out gcc compile errors | 
|  | 710 | =============================== | 
|  | 711 | If you are getting a lot of syntax errors compiling a program & the problem | 
|  | 712 | isn't blatantly obvious from the source, | 
|  | 713 | it often helps to just preprocess the file. This is done with the -E | 
|  | 714 | option in gcc. | 
|  | 715 | This runs only the very first phase of compilation | 
|  | 716 | ( compilation in gcc is done in several stages & gcc calls many programs to | 
|  | 717 | achieve its end result ); with the -E option gcc just calls the gcc preprocessor (cpp). | 
|  | 718 | The c preprocessor does the following: it joins all the #included files together | 
|  | 719 | recursively ( #include files can #include other files ) with the c file you wish to compile. | 
|  | 720 | It puts the fully qualified path of each #included file in a line starting with # & | 
|  | 721 | does macro expansion. | 
|  | 722 | This is useful for debugging because | 
|  | 723 | 1) You can double check whether the files you expect to be included are the ones | 
|  | 724 | that are being included ( e.g. double check that you aren't going to the i386 asm directory ). | 
|  | 725 | 2) Check that macro definitions aren't clashing with typedefs. | 
| Matt LaPlante | fff9289 | 2006-10-03 22:47:42 +0200 | [diff] [blame] | 726 | 3) Check that definitions aren't being used before they are being included. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 727 | 4) Helps put the line emitting the error under the microscope if it contains macros. | 
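
As a small illustration of point 4, here is a made up file where the problem is
hidden inside a macro. The macro & function names are hypothetical, they only
exist to show the kind of thing that becomes obvious once you look at the -E output.

```c
/* Hypothetical macro with a classic precedence bug. */
#define DOUBLE_BAD(x)	x + x
/* Fully parenthesised version. */
#define DOUBLE_GOOD(x)	((x) + (x))

/* After preprocessing the body reads: return 3 + 3 * 2; */
int bad_result(void)
{
	return DOUBLE_BAD(3) * 2;
}

/* After preprocessing the body reads: return ((3) + (3)) * 2; */
int good_result(void)
{
	return DOUBLE_GOOD(3) * 2;
}
```

Running gcc -E on such a file shows the expanded expressions directly, so the
missing brackets in the first macro no longer hide behind its name.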
|  | 728 |  | 
|  | 729 | For convenience the Linux kernel's Makefile will do preprocessing automatically for you | 
|  | 730 | if you suffix the file you want built with .i ( instead of .o ). | 
|  | 731 |  | 
|  | 732 | e.g. | 
|  | 733 | from the linux directory type | 
|  | 734 | make arch/s390/kernel/signal.i | 
|  | 735 | this will build | 
|  | 736 |  | 
|  | 737 | s390-gcc -D__KERNEL__ -I/home1/barrow/linux/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer | 
|  | 738 | -fno-strict-aliasing -D__SMP__ -pipe -fno-strength-reduce   -E arch/s390/kernel/signal.c | 
|  | 739 | > arch/s390/kernel/signal.i | 
|  | 740 |  | 
|  | 741 | Now look at signal.i, you should see something like: | 
|  | 742 |  | 
|  | 743 |  | 
|  | 744 | # 1 "/home1/barrow/linux/include/asm/types.h" 1 | 
|  | 745 | typedef unsigned short umode_t; | 
|  | 746 | typedef __signed__ char __s8; | 
|  | 747 | typedef unsigned char __u8; | 
|  | 748 | typedef __signed__ short __s16; | 
|  | 749 | typedef unsigned short __u16; | 
|  | 750 |  | 
|  | 751 | If instead you are getting errors further down e.g. | 
|  | 752 | unknown instruction:2515 "move.l" or better still unknown instruction:2515 | 
|  | 753 | "Fixme not implemented yet, call Martin" you are probably attempting to compile some code | 
|  | 754 | meant for another architecture or code that is simply not implemented, with a fixme statement | 
|  | 755 | stuck into the inline assembly code so that the author of the file now knows he has work to do. | 
|  | 756 | To look at the assembly emitted by gcc just before it is about to call gas ( the gnu assembler ) | 
|  | 757 | use the -S option. | 
|  | 758 | Again for your convenience the Linux kernel's Makefile will hold your hand & | 
|  | 759 | do all this donkey work for you also by building the file with the .s suffix. | 
|  | 760 | e.g. | 
|  | 761 | from the Linux directory type | 
|  | 762 | make arch/s390/kernel/signal.s | 
|  | 763 |  | 
|  | 764 | s390-gcc -D__KERNEL__ -I/home1/barrow/linux/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer | 
|  | 765 | -fno-strict-aliasing -D__SMP__ -pipe -fno-strength-reduce  -S arch/s390/kernel/signal.c | 
|  | 766 | -o arch/s390/kernel/signal.s | 
|  | 767 |  | 
|  | 768 |  | 
|  | 769 | This will output something like, ( please note the constant pool & the useful comments | 
|  | 770 | in the prologue to give you a hand at interpreting it ). | 
|  | 771 |  | 
|  | 772 | .LC54: | 
|  | 773 | .string	"misaligned (__u16 *) in __xchg\n" | 
|  | 774 | .LC57: | 
|  | 775 | .string	"misaligned (__u32 *) in __xchg\n" | 
|  | 776 | .L$PG1: # Pool sys_sigsuspend | 
|  | 777 | .LC192: | 
|  | 778 | .long	-262401 | 
|  | 779 | .LC193: | 
|  | 780 | .long	-1 | 
|  | 781 | .LC194: | 
|  | 782 | .long	schedule-.L$PG1 | 
|  | 783 | .LC195: | 
|  | 784 | .long	do_signal-.L$PG1 | 
|  | 785 | .align 4 | 
|  | 786 | .globl sys_sigsuspend | 
|  | 787 | .type	 sys_sigsuspend,@function | 
|  | 788 | sys_sigsuspend: | 
|  | 789 | #	leaf function           0 | 
|  | 790 | #	automatics              16 | 
|  | 791 | #	outgoing args           0 | 
|  | 792 | #	need frame pointer      0 | 
|  | 793 | #	call alloca             0 | 
|  | 794 | #	has varargs             0 | 
|  | 795 | #	incoming args (stack)   0 | 
|  | 796 | #	function length         168 | 
|  | 797 | STM	8,15,32(15) | 
|  | 798 | LR	0,15 | 
|  | 799 | AHI	15,-112 | 
|  | 800 | BASR	13,0 | 
|  | 801 | .L$CO1:	AHI	13,.L$PG1-.L$CO1 | 
|  | 802 | ST	0,0(15) | 
|  | 803 | LR    8,2 | 
|  | 804 | N     5,.LC192-.L$PG1(13) | 
|  | 805 |  | 
|  | 806 | Adding -g to the above output makes the output even more useful | 
|  | 807 | e.g. typing | 
|  | 808 | make CC:="s390-gcc -g" kernel/sched.s | 
|  | 809 |  | 
|  | 810 | which compiles. | 
|  | 811 | s390-gcc -g -D__KERNEL__ -I/home/barrow/linux-2.3/include -Wall -Wstrict-prototypes -O2 -fomit-frame-pointer -fno-strict-aliasing -pipe -fno-strength-reduce   -S kernel/sched.c -o kernel/sched.s | 
|  | 812 |  | 
|  | 813 | also outputs stabs ( debugger ) info, from this info you can find out the | 
|  | 814 | offsets & sizes of various elements in structures. | 
|  | 815 | e.g. the stab for the structure | 
|  | 816 | struct rlimit { | 
|  | 817 | unsigned long	rlim_cur; | 
|  | 818 | unsigned long	rlim_max; | 
|  | 819 | }; | 
|  | 820 | is | 
|  | 821 | .stabs "rlimit:T(151,2)=s8rlim_cur:(0,5),0,32;rlim_max:(0,5),32,32;;",128,0,0,0 | 
|  | 822 | from this stab you can see that | 
|  | 823 | rlim_cur starts at bit offset 0 & is 32 bits in size | 
|  | 824 | rlim_max starts at bit offset 32 & is 32 bits in size. | 
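
If you want to cross check what a stab says, the compiler can be asked for the
same numbers with offsetof & sizeof. The sketch below uses a local copy of the
structure ( renamed so it cannot clash with the real struct rlimit ) & two
helper macros invented for this example; it assumes 8 bit bytes.

```c
#include <stddef.h>

/* Local copy of the structure from the text, renamed on purpose. */
struct rlimit_example {
	unsigned long rlim_cur;
	unsigned long rlim_max;
};

/* Bit offset & bit size of a member, in the units the stab uses. */
#define MEMBER_BIT_OFFSET(type, member)	(8 * offsetof(type, member))
#define MEMBER_BIT_SIZE(type, member)	(8 * sizeof(((type *)0)->member))
```

Built for 31 bit s/390 the values for rlim_max should be 32 & 32 as in the stab
above, on a 64 bit target unsigned long is 8 bytes so both become 64.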
|  | 825 |  | 
|  | 826 |  | 
|  | 827 | Debugging Tools: | 
|  | 828 | ================ | 
|  | 829 |  | 
|  | 830 | objdump | 
|  | 831 | ======= | 
|  | 832 | This is a tool with many options, the most useful being ( if compiled with -g ): | 
|  | 833 | objdump --source <victim program or object file> > <victims debug listing > | 
|  | 834 |  | 
|  | 835 |  | 
|  | 836 | The whole kernel can be compiled like this ( Doing this will make a 17MB kernel | 
|  | 837 | & a 200 MB listing ) however you have to strip it before building the image | 
|  | 838 | using the strip command to make it a more reasonable size to boot it. | 
|  | 839 |  | 
|  | 840 | A source/assembly mixed dump of the kernel can be done with the line | 
|  | 841 | objdump --source vmlinux > vmlinux.lst | 
| Matt LaPlante | fff9289 | 2006-10-03 22:47:42 +0200 | [diff] [blame] | 842 | Also, if the file isn't compiled -g, this will output as much debugging information | 
|  | 843 | as it can (e.g. function names). This is very slow as it spends lots | 
|  | 844 | of time searching for debugging info. The following self explanatory line should be used | 
|  | 845 | instead if the code isn't compiled -g, as it is much faster: | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 846 | objdump --disassemble-all --syms vmlinux > vmlinux.lst | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 847 |  | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 848 | As hard drive space is valuable most of us use the following approach. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 849 | 1) Look at the emitted psw on the console to find the crash address in the kernel. | 
|  | 850 | 2) Look at the file System.map ( in the linux directory ) produced when building | 
|  | 851 | the kernel to find the closest address less than the current PSW to find the | 
|  | 852 | offending function. | 
|  | 853 | 3) use grep or similar to search the source tree looking for the source file | 
|  | 854 | with this function if you don't know where it is. | 
|  | 855 | 4) Rebuild this object file with -g on; as an example suppose the file was | 
|  | 856 | ( arch/s390/kernel/signal.o ). | 
|  | 857 | 5) Assuming the file with the erroneous function is signal.c, move to the base of the | 
|  | 858 | Linux source tree. | 
|  | 859 | 6) rm arch/s390/kernel/signal.o | 
|  | 860 | 7) make arch/s390/kernel/signal.o | 
|  | 861 | 8) watch the gcc command line emitted | 
| Matt LaPlante | 3f6dee9 | 2006-10-03 22:45:33 +0200 | [diff] [blame] | 862 | 9) type it in again or alternatively cut & paste it on the console adding the -g option. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 863 | 10) objdump --source arch/s390/kernel/signal.o > signal.lst | 
|  | 864 | This will output the source & the assembly intermixed, as the snippet below shows. | 
|  | 865 | This will unfortunately output addresses which aren't the same | 
|  | 866 | as the kernel ones; you should be able to get around the mental arithmetic | 
|  | 867 | by playing with the --adjust-vma parameter to objdump. | 
|  | 868 |  | 
|  | 869 |  | 
|  | 870 |  | 
|  | 871 |  | 
| Adrian Bunk | 4448aaf | 2005-11-08 21:34:42 -0800 | [diff] [blame] | 872 | static inline void spin_lock(spinlock_t *lp) | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 873 | { | 
|  | 874 | a0:       18 34           lr      %r3,%r4 | 
|  | 875 | a2:       a7 3a 03 bc     ahi     %r3,956 | 
|  | 876 | __asm__ __volatile("    lhi   1,-1\n" | 
|  | 877 | a6:       a7 18 ff ff     lhi     %r1,-1 | 
|  | 878 | aa:       1f 00           slr     %r0,%r0 | 
|  | 879 | ac:       ba 01 30 00     cs      %r0,%r1,0(%r3) | 
|  | 880 | b0:       a7 44 ff fd     jm      aa <sys_sigsuspend+0x2e> | 
|  | 881 | saveset = current->blocked; | 
|  | 882 | b4:       d2 07 f0 68     mvc     104(8,%r15),972(%r4) | 
|  | 883 | b8:       43 cc | 
|  | 884 | return (set->sig[0] & mask) != 0; | 
|  | 885 | } | 
|  | 886 |  | 
|  | 887 | 11) If debugging under VM go down to that section in the document for more info. | 
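
Step 2 ( finding the closest System.map address below the PSW ) & the
--adjust-vma arithmetic can be sketched in C as follows. This is a minimal
illustration, not the code of any existing tool; the struct & function names
are invented here & the table is assumed to be sorted by ascending address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct map_entry {
	uint32_t    addr;	/* start address of the symbol */
	const char *name;
};

/* Return the entry with the closest address that is <= psw_addr, or
 * NULL if psw_addr lies below the first entry.
 * 'map' is assumed to be sorted by ascending address. */
const struct map_entry *find_symbol(const struct map_entry *map, size_t n,
				    uint32_t psw_addr)
{
	const struct map_entry *best = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		if (map[i].addr > psw_addr)
			break;
		best = &map[i];
	}
	return best;
}

/* Value to hand to objdump --adjust-vma so that a function listed at
 * 'offset_in_object' in the .o listing shows its kernel address. */
uint32_t adjust_vma(uint32_t kernel_addr, uint32_t offset_in_object)
{
	assert(kernel_addr >= offset_in_object);
	return kernel_addr - offset_in_object;
}
```

The crash offset inside the function is then simply the PSW address minus the
addr field of the entry that find_symbol returns.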
|  | 888 |  | 
|  | 889 |  | 
|  | 890 | I now have a tool which takes the pain out of --adjust-vma | 
|  | 891 | & you are able to do something like | 
|  | 892 | make arch/s390/kernel/traps.lst | 
|  | 893 | & it automatically generates the correctly relocated entries for | 
|  | 894 | the text segment in traps.lst. | 
|  | 895 | This tool is now standard in linux distros in scripts/makelst | 
|  | 896 |  | 
|  | 897 | strace: | 
|  | 898 | ------- | 
|  | 899 | Q. What is it ? | 
|  | 900 | A. It is a tool for intercepting calls to the kernel & logging them | 
|  | 901 | to a file & on the screen. | 
|  | 902 |  | 
|  | 903 | Q. What use is it ? | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 904 | A. You can use it to find out what files a particular program opens. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 905 |  | 
|  | 906 |  | 
|  | 907 |  | 
|  | 908 | Example 1 | 
|  | 909 | --------- | 
|  | 910 | If you wanted to know how ping works but didn't have the source, run | 
|  | 911 | strace ping -c 1 127.0.0.1 | 
|  | 912 | & then look at the man pages for each of the syscalls below, | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 913 | ( In fact this is sometimes easier than looking at some spaghetti | 
| Matt LaPlante | 2fe0ae7 | 2006-10-03 22:50:39 +0200 | [diff] [blame] | 914 | source which conditionally compiles for several architectures ). | 
|  | 915 | Not everything that it throws out needs to make sense immediately. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 916 |  | 
|  | 917 | Just looking quickly you can see that it is making up a RAW socket | 
|  | 918 | for the ICMP protocol, | 
|  | 919 | doing an alarm(10) for a 10 second timeout | 
|  | 920 | & doing a gettimeofday call before & after each read to see | 
|  | 921 | how long the replies took, & writing some text to stdout so the user | 
|  | 922 | has an idea what is going on. | 
|  | 923 |  | 
|  | 924 | socket(PF_INET, SOCK_RAW, IPPROTO_ICMP) = 3 | 
|  | 925 | getuid()                                = 0 | 
|  | 926 | setuid(0)                               = 0 | 
|  | 927 | stat("/usr/share/locale/C/libc.cat", 0xbffff134) = -1 ENOENT (No such file or directory) | 
|  | 928 | stat("/usr/share/locale/libc/C", 0xbffff134) = -1 ENOENT (No such file or directory) | 
|  | 929 | stat("/usr/local/share/locale/C/libc.cat", 0xbffff134) = -1 ENOENT (No such file or directory) | 
|  | 930 | getpid()                                = 353 | 
|  | 931 | setsockopt(3, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0 | 
|  | 932 | setsockopt(3, SOL_SOCKET, SO_RCVBUF, [49152], 4) = 0 | 
|  | 933 | fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(3, 1), ...}) = 0 | 
|  | 934 | mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40008000 | 
|  | 935 | ioctl(1, TCGETS, {B9600 opost isig icanon echo ...}) = 0 | 
|  | 936 | write(1, "PING 127.0.0.1 (127.0.0.1): 56 d"..., 42PING 127.0.0.1 (127.0.0.1): 56 data bytes | 
|  | 937 | ) = 42 | 
|  | 938 | sigaction(SIGINT, {0x8049ba0, [], SA_RESTART}, {SIG_DFL}) = 0 | 
|  | 939 | sigaction(SIGALRM, {0x8049600, [], SA_RESTART}, {SIG_DFL}) = 0 | 
|  | 940 | gettimeofday({948904719, 138951}, NULL) = 0 | 
|  | 941 | sendto(3, "\10\0D\201a\1\0\0\17#\2178\307\36"..., 64, 0, {sin_family=AF_INET, | 
|  | 942 | sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 64 | 
|  | 943 | sigaction(SIGALRM, {0x8049600, [], SA_RESTART}, {0x8049600, [], SA_RESTART}) = 0 | 
|  | 944 | sigaction(SIGALRM, {0x8049ba0, [], SA_RESTART}, {0x8049600, [], SA_RESTART}) = 0 | 
|  | 945 | alarm(10)                               = 0 | 
|  | 946 | recvfrom(3, "E\0\0T\0005\0\0@\1|r\177\0\0\1\177"..., 192, 0, | 
|  | 947 | {sin_family=AF_INET, sin_port=htons(50882), sin_addr=inet_addr("127.0.0.1")}, [16]) = 84 | 
|  | 948 | gettimeofday({948904719, 160224}, NULL) = 0 | 
|  | 949 | recvfrom(3, "E\0\0T\0006\0\0\377\1\275p\177\0"..., 192, 0, | 
|  | 950 | {sin_family=AF_INET, sin_port=htons(50882), sin_addr=inet_addr("127.0.0.1")}, [16]) = 84 | 
|  | 951 | gettimeofday({948904719, 166952}, NULL) = 0 | 
|  | 952 | write(1, "64 bytes from 127.0.0.1: icmp_se"..., | 
|  | 953 | 5764 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=28.0 ms | 
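
The gettimeofday values in the trace can be used to check the timing by hand.
The helper below is a made up example ( the function name is not from ping's
source ) which turns two ( seconds, microseconds ) pairs as printed by strace
into an elapsed time.

```c
#include <assert.h>

/* Microseconds between two gettimeofday() readings given as
 * ( seconds, microseconds ) pairs, as strace prints them. */
long long elapsed_usec(long long sec0, long usec0, long long sec1, long usec1)
{
	long long diff = (sec1 - sec0) * 1000000LL + (usec1 - usec0);

	assert(diff >= 0);	/* second reading must not be earlier */
	return diff;
}
```

The first & last readings in the trace above, {948904719, 138951} &
{948904719, 166952}, are 28001 microseconds apart, which agrees with the
time=28.0 ms that ping writes to stdout, so this is presumably how the figure
is derived.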
|  | 954 |  | 
|  | 955 | Example 2 | 
|  | 956 | --------- | 
|  | 957 | strace passwd 2>&1 | grep open | 
|  | 958 | produces the following output | 
|  | 959 | open("/etc/ld.so.cache", O_RDONLY)      = 3 | 
|  | 960 | open("/opt/kde/lib/libc.so.5", O_RDONLY) = -1 ENOENT (No such file or directory) | 
|  | 961 | open("/lib/libc.so.5", O_RDONLY)        = 3 | 
|  | 962 | open("/dev", O_RDONLY)                  = 3 | 
|  | 963 | open("/var/run/utmp", O_RDONLY)         = 3 | 
|  | 964 | open("/etc/passwd", O_RDONLY)           = 3 | 
|  | 965 | open("/etc/shadow", O_RDONLY)           = 3 | 
|  | 966 | open("/etc/login.defs", O_RDONLY)       = 4 | 
|  | 967 | open("/dev/tty", O_RDONLY)              = 4 | 
|  | 968 |  | 
|  | 969 | The 2>&1 is done to redirect stderr to stdout & grep is then filtering this input | 
|  | 970 | through the pipe for each line containing the string open. | 
|  | 971 |  | 
|  | 972 |  | 
|  | 973 | Example 3 | 
|  | 974 | --------- | 
| Matt LaPlante | 53cb472 | 2006-10-03 22:55:17 +0200 | [diff] [blame] | 975 | Getting sophisticated | 
|  | 976 | telnetd crashes & I don't know why | 
|  | 977 |  | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 978 | Steps | 
|  | 979 | ----- | 
|  | 980 | 1) Replace the following line in /etc/inetd.conf | 
|  | 981 | telnet  stream  tcp     nowait  root    /usr/sbin/in.telnetd -h | 
|  | 982 | with | 
|  | 983 | telnet  stream  tcp     nowait  root    /blah | 
|  | 984 |  | 
|  | 985 | 2) Create the file /blah with the following contents to start tracing telnetd | 
|  | 986 | #!/bin/bash | 
|  | 987 | /usr/bin/strace -o/t1 -f /usr/sbin/in.telnetd -h | 
|  | 988 | 3) chmod 700 /blah to make it executable only to root | 
|  | 989 | 4) Restart inetd with | 
|  | 990 | killall -HUP inetd | 
|  | 991 | or use ps aux | grep inetd | 
|  | 992 | to get inetd's process id | 
|  | 993 | & kill -HUP <pid> to restart it. | 
|  | 994 |  | 
|  | 995 | Important options | 
|  | 996 | ----------------- | 
|  | 997 | -o is used to tell strace to output to a file in our case t1 in the root directory | 
|  | 998 | -f is to follow children, | 
|  | 999 | e.g. in our case above telnetd will start the login process & subsequently a shell like bash. | 
|  | 1000 | You will be able to tell which is which from the process ID's listed on the left hand side | 
|  | 1001 | of the strace output. | 
|  | 1002 | -p<pid> will tell strace to attach to a running process, yup this can be done provided | 
|  | 1003 | it isn't being traced or debugged already & you have enough privileges, | 
|  | 1004 | the reason 2 processes cannot trace or debug the same program is that strace | 
|  | 1005 | becomes the parent process of the one being debugged & processes ( unlike people ) | 
|  | 1006 | can have only one parent. | 
|  | 1007 |  | 
|  | 1008 |  | 
|  | 1009 | However the file /t1 will get big quite quickly | 
|  | 1010 | to test it telnet 127.0.0.1 | 
|  | 1011 |  | 
|  | 1012 | now look at what files in.telnetd execve'd | 
|  | 1013 | 413   execve("/usr/sbin/in.telnetd", ["/usr/sbin/in.telnetd", "-h"], [/* 17 vars */]) = 0 | 
|  | 1014 | 414   execve("/bin/login", ["/bin/login", "-h", "localhost", "-p"], [/* 2 vars */]) = 0 | 
|  | 1015 |  | 
|  | 1016 | Whey, it worked! | 
|  | 1017 |  | 
|  | 1018 |  | 
|  | 1019 | Other hints: | 
|  | 1020 | ------------ | 
|  | 1021 | If the program is not very interactive ( i.e. not much keyboard input ) | 
|  | 1022 | & is crashing in one architecture but not in another you can do | 
|  | 1023 | an strace of both programs under as identical a scenario as you can | 
|  | 1024 | on both architectures, outputting to a file, then | 
|  | 1025 | do a diff of the two traces using the diff program | 
|  | 1026 | i.e. | 
|  | 1027 | diff output1 output2 | 
|  | 1028 | & maybe you'll be able to see where the call paths differed; this | 
|  | 1029 | is possibly near the cause of the crash. | 
|  | 1030 |  | 
|  | 1031 | More info | 
|  | 1032 | --------- | 
|  | 1033 | Look at man pages for strace & the various syscalls | 
|  | 1034 | e.g. man strace, man alarm, man socket. | 
|  | 1035 |  | 
|  | 1036 |  | 
|  | 1037 | Performance Debugging | 
|  | 1038 | ===================== | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 1039 | gcc is capable of compiling in profiling code just add the -p option | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1040 | to the CFLAGS, this obviously affects program size & performance. | 
|  | 1041 | This can be used by gprof, the gnu profiling tool, or | 
|  | 1042 | gcov, the gnu code coverage tool ( code coverage is a means of testing | 
|  | 1043 | code quality by checking if all the code in an executable is exercised by | 
|  | 1044 | a tester ). | 
|  | 1045 |  | 
|  | 1046 |  | 
|  | 1047 | Using top to find out where processes are sleeping in the kernel | 
|  | 1048 | ---------------------------------------------------------------- | 
|  | 1049 | To do this copy the System.map from the root directory where | 
|  | 1050 | the linux kernel was built to the /boot directory on your | 
|  | 1051 | linux machine. | 
|  | 1052 | Start top | 
|  | 1053 | Now type fU<return> | 
|  | 1054 | You should see a new field called WCHAN which | 
|  | 1055 | tells you where each process is sleeping; here is a typical output. | 
|  | 1056 |  | 
|  | 1057 | 6:59pm  up 41 min,  1 user,  load average: 0.00, 0.00, 0.00 | 
|  | 1058 | 28 processes: 27 sleeping, 1 running, 0 zombie, 0 stopped | 
|  | 1059 | CPU states:  0.0% user,  0.1% system,  0.0% nice, 99.8% idle | 
|  | 1060 | Mem:   254900K av,   45976K used,  208924K free,       0K shrd,   28636K buff | 
|  | 1061 | Swap:       0K av,       0K used,       0K free                    8620K cached | 
|  | 1062 |  | 
|  | 1063 | PID USER     PRI  NI  SIZE  RSS SHARE WCHAN     STAT  LIB %CPU %MEM   TIME COMMAND | 
|  | 1064 | 750 root      12   0   848  848   700 do_select S       0  0.1  0.3   0:00 in.telnetd | 
|  | 1065 | 767 root      16   0  1140 1140   964           R       0  0.1  0.4   0:00 top | 
|  | 1066 | 1 root       8   0   212  212   180 do_select S       0  0.0  0.0   0:00 init | 
|  | 1067 | 2 root       9   0     0    0     0 down_inte SW      0  0.0  0.0   0:00 kmcheck | 
|  | 1068 |  | 
|  | 1069 | The time command | 
|  | 1070 | ---------------- | 
|  | 1071 | Another related command is the time command which gives you an indication | 
|  | 1072 | of where a process is spending the majority of its time. | 
|  | 1073 | e.g. | 
|  | 1074 | time ping -c 5 nc | 
|  | 1075 | outputs | 
|  | 1076 | real	0m4.054s | 
|  | 1077 | user	0m0.010s | 
|  | 1078 | sys	0m0.010s | 
|  | 1079 |  | 
|  | 1080 | Debugging under VM | 
|  | 1081 | ================== | 
|  | 1082 |  | 
|  | 1083 | Notes | 
|  | 1084 | ----- | 
|  | 1085 | Addresses & values in the VM debugger are always hex, never decimal. | 
|  | 1086 | Address ranges are of the format <HexValue1>-<HexValue2> or <HexValue1>.<HexValue2> | 
| Paolo Ornati | 670e9f3 | 2006-10-03 22:57:56 +0200 | [diff] [blame] | 1087 | e.g. The address range  0x2000 to 0x3000 can be described as 2000-3000 or 2000.1000 | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1088 |  | 
|  | 1089 | The VM Debugger is case insensitive. | 
|  | 1090 |  | 
|  | 1091 | VM's strengths are usually other debuggers' weaknesses: you can get at any resource | 
|  | 1092 | no matter how sensitive, e.g. memory management resources, or change address translation | 
|  | 1093 | in the PSW. For kernel hacking you will reap dividends if you get good at it. | 
|  | 1094 |  | 
|  | 1095 | The VM Debugger displays operators but not operands, probably because some | 
|  | 1096 | of it was written when memory was expensive & the programmer was probably proud that | 
|  | 1097 | it fitted into 2k of memory & the programmers didn't want to shock hardcore VM'ers by | 
|  | 1098 | changing the interface :-). Also the debugger displays useful information on the same line & | 
|  | 1099 | the author of the code probably felt that it was a good idea not to go over | 
|  | 1100 | the 80 columns on the screen. | 
|  | 1101 |  | 
|  | 1102 | As some of you are probably in a panic now, this isn't as unintuitive as it may seem | 
|  | 1103 | as the 390 instructions are easy to decode mentally & you can make a good guess at a lot | 
|  | 1104 | of them as all the operands are nibble ( half byte ) aligned. If you have an objdump listing | 
|  | 1105 | it is also quite easy to follow; if you don't have an objdump listing keep a copy of | 
|  | 1106 | the s/390 Reference Summary & look at between pages 2 & 7 or alternatively the | 
|  | 1107 | s/390 principles of operation. | 
|  | 1108 | e.g. even I can guess that | 
|  | 1109 | 0001AFF8' LR    180F        CC 0 | 
|  | 1110 | is a ( load register ) lr r0,r15 | 
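
The nibble alignment means a 2 byte register-register instruction such as the
180F above can be pulled apart with shifts & masks. The struct & function
below are invented for this sketch & only cover this simple 2 byte format.

```c
#include <stdint.h>

struct rr_insn {
	uint8_t opcode;	/* first byte, 0x18 is lr */
	uint8_t r1;	/* first operand register */
	uint8_t r2;	/* second operand register */
};

/* Split a 2 byte register-register instruction into its fields. */
struct rr_insn decode_rr(uint16_t halfword)
{
	struct rr_insn insn;

	insn.opcode = halfword >> 8;
	insn.r1 = (halfword >> 4) & 0xf;
	insn.r2 = halfword & 0xf;
	return insn;
}
```

decode_rr(0x180F) yields opcode 0x18 with r1 = 0 & r2 = 15, i.e. lr r0,r15.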
|  | 1111 |  | 
|  | 1112 | Also it is very easy to tell the length of a 390 instruction from the 2 most significant | 
|  | 1113 | bits in the instruction ( not that this info is really useful except if you are trying to | 
|  | 1114 | make sense of a hexdump of code ). | 
|  | 1115 | Here is a table | 
|  | 1116 | Bits                    Instruction Length | 
|  | 1117 | ------------------------------------------ | 
|  | 1118 | 00                          2 Bytes | 
|  | 1119 | 01                          4 Bytes | 
|  | 1120 | 10                          4 Bytes | 
|  | 1121 | 11                          6 Bytes | 
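The table above boils down to a two bit lookup. A minimal sketch in C ( insn_length is just an illustrative name ):

```c
/*
 * Length in bytes of a 390 instruction, taken from the 2 most
 * significant bits of its first ( opcode ) byte, as in the table above.
 */
int insn_length(unsigned char first_byte)
{
	switch (first_byte >> 6) {
	case 0:			/* 00 */
		return 2;
	case 1:			/* 01 */
	case 2:			/* 10 */
		return 4;
	default:		/* 11 */
		return 6;
	}
}
```

e.g. the LR above ( 180F, first byte 0x18 ) has top bits 00 & is 2 bytes long, while an opcode byte of 0xA7 has top bits 10 & is 4 bytes long.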
|  | 1122 |  | 
|  | 1123 |  | 
|  | 1124 |  | 
|  | 1125 |  | 
|  | 1126 | The debugger also displays other useful info on the same line such as the | 
|  | 1127 | addresses being operated on, destination addresses of branches & condition codes. | 
|  | 1128 | e.g. | 
|  | 1129 | 00019736' AHI   A7DAFF0E    CC 1 | 
|  | 1130 | 000198BA' BRC   A7840004 -> 000198C2'   CC 0 | 
|  | 1131 | 000198CE' STM   900EF068 >> 0FA95E78    CC 2 | 
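The branch destination in the BRC line can be checked by hand. The sketch below assumes the usual relative branch encoding, where the low 16 bits of the instruction are a signed count of halfwords relative to the address of the branch itself; brc_target is a made-up name for illustration.

```c
#include <stdint.h>

/*
 * Destination of a relative branch such as BRC.
 * insn_addr is the address of the branch, insn the 4 byte instruction.
 * The low 16 bits are treated as a signed halfword count.
 */
uint32_t brc_target(uint32_t insn_addr, uint32_t insn)
{
	int32_t halfwords = (int32_t)(insn & 0xffff);

	if (halfwords & 0x8000)		/* sign extend the 16 bit field */
		halfwords -= 0x10000;
	return insn_addr + (uint32_t)(2 * halfwords);
}
```

For the trace line above, A7840004 at 000198BA gives 0x198BA + 2*4 = 0x198C2, which agrees with the address VM prints after the arrow.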
|  | 1132 |  | 
|  | 1133 |  | 
|  | 1134 |  | 
|  | 1135 | Useful VM debugger commands | 
|  | 1136 | --------------------------- | 
|  | 1137 |  | 
|  | 1138 | I suppose I'd better mention this before I start | 
|  | 1139 | to list the current active traces do | 
|  | 1140 | Q TR | 
|  | 1141 | there can be a maximum of 255 of these per set | 
|  | 1142 | ( more about trace sets later ). | 
|  | 1143 | To stop traces issue a | 
|  | 1144 | TR END. | 
|  | 1145 | To delete a particular breakpoint issue | 
|  | 1146 | TR DEL <breakpoint number> | 
|  | 1147 |  | 
|  | 1148 | The PA1 key drops to CP mode so you can issue debugger commands. | 
|  | 1149 | Doing alt c ( on my 3270 console at least ) clears the screen. | 
|  | 1150 | Hitting b <enter> comes back to the running operating system | 
|  | 1151 | from CP mode ( in our case linux ). | 
|  | 1152 | It is typically useful to add shortcuts to your profile.exec file | 
|  | 1153 | if you have one ( this is roughly equivalent to autoexec.bat in DOS ). | 
|  | 1154 | Here are a few from mine. | 
|  | 1155 | /* this gives me command history on issuing f12 */ | 
|  | 1156 | set pf12 retrieve | 
|  | 1157 | /* this continues */ | 
|  | 1158 | set pf8 imm b | 
|  | 1159 | /* goes to trace set a */ | 
|  | 1160 | set pf1 imm tr goto a | 
|  | 1161 | /* goes to trace set b */ | 
|  | 1162 | set pf2 imm tr goto b | 
|  | 1163 | /* goes to trace set c */ | 
|  | 1164 | set pf3 imm tr goto c | 
|  | 1165 |  | 
|  | 1166 |  | 
|  | 1167 |  | 
|  | 1168 | Instruction Tracing | 
|  | 1169 | ------------------- | 
|  | 1170 | Setting a simple breakpoint | 
|  | 1171 | TR I PSWA <address> | 
|  | 1172 | To debug a particular function try | 
|  | 1173 | TR I R <function address range> | 
|  | 1174 | TR I on its own will single step. | 
|  | 1175 | TR I DATA <MNEMONIC> <OPTIONAL RANGE> will trace for particular mnemonics | 
|  | 1176 | e.g. | 
|  | 1177 | TR I DATA 4D R 0197BC.4000 | 
|  | 1178 | will trace for BAS'es ( opcode 4D ) in the range 0197BC.4000 | 
|  | 1179 | If you were inclined you could add traces for all branch instructions & | 
|  | 1180 | suffix them with RUN so you would have a backtrace on screen | 
|  | 1181 | when a program crashes. | 
|  | 1182 | TR BR <INTO OR FROM> will trace branches into or out of an address. | 
|  | 1183 | e.g. | 
|  | 1184 | TR BR INTO 0 is often quite useful if a program is getting awkward & deciding | 
|  | 1185 | to branch to 0 & crashing, as this will stop at the address before it jumps to 0. | 
|  | 1186 | TR I R <address range> RUN cmd d g | 
|  | 1187 | single steps a range of addresses but stays running & | 
|  | 1188 | displays the gprs on each step. | 
|  | 1189 |  | 
|  | 1190 |  | 
|  | 1191 |  | 
|  | 1192 | Displaying & modifying Registers | 
|  | 1193 | -------------------------------- | 
|  | 1194 | D G will display all the gprs | 
|  | 1195 | Adding an extra G to all the commands is necessary to access the full 64 bit | 
|  | 1196 | content in VM on z/Architecture; obviously this isn't required for access registers | 
|  | 1197 | as these are still 32 bit. | 
|  | 1198 | e.g. DGG instead of DG | 
|  | 1199 | D X will display all the control registers | 
|  | 1200 | D AR will display all the access registers | 
|  | 1201 | D AR4-7 will display access registers 4 to 7 | 
|  | 1202 | CPU ALL D G will display the GPRs of all CPUs in the configuration | 
|  | 1203 | D PSW will display the current PSW | 
|  | 1204 | st PSW 2000 will put the value 2000 into the PSW & | 
|  | 1205 | crash your machine. | 
|  | 1206 | D PREFIX displays the prefix offset | 
|  | 1207 |  | 
|  | 1208 |  | 
|  | 1209 | Displaying Memory | 
|  | 1210 | ----------------- | 
|  | 1211 | To display memory mapped using the current PSW's mapping try | 
|  | 1212 | D <range> | 
|  | 1213 | To make VM display a message each time it hits a particular address & continue try | 
|  | 1214 | D I<range> will disassemble/display a range of instructions. | 
|  | 1215 | ST <addr> <32 bit word> will store a 32 bit word at a 32 bit aligned address | 
|  | 1216 | D T<range> will display the EBCDIC in an address ( if you are that way inclined ) | 
|  | 1217 | D R<range> will display real addresses ( without DAT ) but with prefixing. | 
|  | 1218 | There are other complex options to display. If you need to get at say home space | 
|  | 1219 | but are in primary space, the easiest thing to do is to temporarily | 
|  | 1220 | modify the PSW to the other addressing mode, display the stuff & then | 
|  | 1221 | restore it. | 
|  | 1222 |  | 
|  | 1223 |  | 
|  | 1224 |  | 
|  | 1225 | Hints | 
|  | 1226 | ----- | 
|  | 1227 | If you want to issue a debugger command without halting your virtual machine with the | 
|  | 1228 | PA1 key try prefixing the command with #CP e.g. | 
|  | 1229 | #cp tr i pswa 2000 | 
|  | 1230 | Also suffixing most debugger commands with RUN will cause them not | 
|  | 1231 | to stop but just display the mnemonic at the current instruction on the console. | 
|  | 1232 | If you have several breakpoints you want to put into your program & | 
|  | 1233 | you get fed up of cross referencing with System.map | 
|  | 1234 | you can do the following trick for several symbols. | 
|  | 1235 | grep do_signal System.map | 
|  | 1236 | which emits the following among other things | 
|  | 1237 | 0001f4e0 T do_signal | 
|  | 1238 | now you can do | 
|  | 1239 |  | 
|  | 1240 | TR I PSWA 0001f4e0 cmd msg * do_signal | 
|  | 1241 | This sends a message to your own console each time do_signal is entered. | 
|  | 1242 | ( As an aside I wrote a perl script once which automatically generated a REXX | 
|  | 1243 | script with breakpoints on every kernel procedure. This isn't a good idea | 
|  | 1244 | because there are thousands of these routines & VM can only set 255 breakpoints | 
|  | 1245 | at a time, so you nearly had to spend as long pruning the file down as you would | 
|  | 1246 | entering the msg's by hand; however, the trick might be useful for a single object file. ) | 
|  | 1247 | On linux'es 3270 emulator x3270 there is a very useful option under the File menu, | 
|  | 1248 | Save Screens In File; this is very good for keeping a copy of traces. | 
|  | 1249 |  | 
|  | 1250 | From CMS help <command name> will give you online help on a particular command. | 
|  | 1251 | e.g. | 
|  | 1252 | HELP DISPLAY | 
|  | 1253 |  | 
|  | 1254 | Also CP has a file called profile.exec which automatically gets called | 
|  | 1255 | on startup of CMS ( like autoexec.bat ). Keeping with the DOS analogy, | 
|  | 1256 | CP has a feature similar to doskey; it may be useful for you to | 
|  | 1257 | use profile.exec to define some keystrokes. | 
|  | 1258 | e.g. | 
|  | 1259 | SET PF9 IMM B | 
|  | 1260 | This does a single step in VM on pressing F9. | 
|  | 1261 | SET PF10  ^ | 
|  | 1262 | This sets up the ^ key, | 
|  | 1263 | which can be used for ^c (ctrl-c) & ^z (ctrl-z), which can't be typed directly into some 3270 consoles. | 
|  | 1264 | SET PF11 ^- | 
|  | 1265 | This types the starting keystrokes for a sysrq see SysRq below. | 
|  | 1266 | SET PF12 RETRIEVE | 
|  | 1267 | This retrieves command history on pressing F12. | 
|  | 1268 |  | 
|  | 1269 |  | 
|  | 1270 | Sometimes in VM the display is set up to scroll automatically this | 
|  | 1271 | can be very annoying if there are messages you wish to look at | 
|  | 1272 | to stop this do | 
|  | 1273 | TERM MORE 255 255 | 
|  | 1274 | This will nearly stop automatic screen updates, however it will | 
|  | 1275 | cause a denial of service if lots of messages go to the 3270 console, | 
|  | 1276 | so it would be foolish to use this as the default on a production machine. | 
|  | 1277 |  | 
|  | 1278 |  | 
|  | 1279 | Tracing particular processes | 
|  | 1280 | ---------------------------- | 
|  | 1281 | The kernel's text segment is intentionally at an address in memory that will | 
|  | 1282 | very seldom collide with text segments of user programs ( thanks Martin ); | 
|  | 1283 | this simplifies debugging the kernel. | 
|  | 1284 | However it is quite common for user processes to have addresses which collide; | 
|  | 1285 | this can make debugging a particular process under VM painful under normal | 
|  | 1286 | circumstances as the process may change when doing a | 
|  | 1287 | TR I R <address range>. | 
|  | 1288 | Thankfully after reading VM's online help I figured out how to debug | 
|  | 1289 | a particular process. | 
|  | 1290 |  | 
|  | 1291 | Your first problem is to find the STD ( segment table designation ) | 
|  | 1292 | of the program you wish to debug. | 
|  | 1293 | There are several ways you can do this here are a few | 
|  | 1294 | 1) objdump --syms <program to be debugged> | grep main | 
|  | 1295 | To get the address of main in the program. | 
|  | 1296 | tr i pswa <address of main> | 
|  | 1297 | Start the program, if VM drops to CP on what looks like the entry | 
|  | 1298 | point of the main function this is most likely the process you wish to debug. | 
|  | 1299 | Now do a D X13 or D XG13 on z/Architecture. | 
|  | 1300 | On 31 bit the STD is bits 1-19 ( the STO segment table origin ) | 
|  | 1301 | & 25-31 ( the STL segment table length ) of CR13. | 
|  | 1302 | now type | 
|  | 1303 | TR I R STD <CR13's value> 0.7fffffff | 
|  | 1304 | e.g. | 
|  | 1305 | TR I R STD 8F32E1FF 0.7fffffff | 
|  | 1306 | Another very useful variation is | 
|  | 1307 | TR STORE INTO STD <CR13's value> <address range> | 
|  | 1308 | for finding out when a particular variable changes. | 
|  | 1309 |  | 
|  | 1310 | An alternative way of finding the STD of a currently running process | 
|  | 1311 | is to do the following, ( this method is more complex but | 
| Matt LaPlante | 6c28f2c | 2006-10-03 22:46:31 +0200 | [diff] [blame] | 1312 | could be quite convenient if you aren't updating the kernel much & | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1313 | so your kernel structures will stay constant for a reasonable period of | 
|  | 1314 | time ). | 
|  | 1315 |  | 
|  | 1316 | grep task /proc/<pid>/status | 
|  | 1317 | from this you should see something like | 
|  | 1318 | task: 0f160000 ksp: 0f161de8 pt_regs: 0f161f68 | 
|  | 1319 | This now gives you a pointer to the task structure. | 
|  | 1320 | Now make CC:="s390-gcc -g" kernel/sched.s | 
|  | 1321 | To get the task_struct stabinfo. | 
|  | 1322 | ( task_struct is defined in include/linux/sched.h ). | 
|  | 1323 | Now we want to look at | 
|  | 1324 | task->active_mm->pgd | 
|  | 1325 | on my machine the active_mm in the task structure stab is | 
|  | 1326 | active_mm:(4,12),672,32 | 
|  | 1327 | its offset is 672/8=84=0x54 | 
|  | 1328 | the pgd member in the mm_struct stab is | 
|  | 1329 | pgd:(4,6)=*(29,5),96,32 | 
|  | 1330 | so its offset is 96/8=12=0xc | 
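The arithmetic above is simply "stab bit offset divided by 8, added to the structure's address". A minimal sketch in C, with made-up helper names:

```c
#include <stdint.h>

/* Stabs record member offsets in bits; the debugger wants bytes. */
uint32_t stab_byte_offset(uint32_t bit_offset)
{
	return bit_offset / 8;
}

/* Address of a structure member given the structure's base address. */
uint32_t member_address(uint32_t struct_base, uint32_t bit_offset)
{
	return struct_base + stab_byte_offset(bit_offset);
}
```

With the task pointer 0f160000 from /proc/<pid>/status & the active_mm stab offset of 672 bits this gives 0x0f160054. The numbers are only valid for the kernel build they were taken from.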
|  | 1331 |  | 
|  | 1332 | so we'll | 
|  | 1333 | hexdump -s 0xf160054 /dev/mem | more | 
|  | 1334 | i.e. task_struct+active_mm offset | 
|  | 1335 | to look at the active_mm member | 
|  | 1336 | f160054 0fee cc60 0019 e334 0000 0000 0000 0011 | 
|  | 1337 | hexdump -s 0x0feecc6c /dev/mem | more | 
|  | 1338 | i.e. active_mm+pgd offset | 
|  | 1339 | we get something like | 
|  | 1340 | feecc6c 0f2c 0000 0000 0001 0000 0001 0000 0010 | 
|  | 1341 | now do | 
|  | 1342 | TR I R STD <pgd|0x7f> 0.7fffffff | 
|  | 1343 | i.e. the 0x7f is added because the pgd only | 
|  | 1344 | gives the segment table origin & we need to set the low bits | 
|  | 1345 | to the maximum possible segment table length. | 
|  | 1346 | TR I R STD 0f2c007f 0.7fffffff | 
|  | 1347 | on z/Architecture you'll probably need to do | 
|  | 1348 | TR I R STD <pgd|0x7> 0.ffffffffffffffff | 
|  | 1349 | to set the TableType to 0x1 & the Table length to 3. | 
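The bit fiddling in this section can be sketched in C as below. The masks follow the bit numbers given above ( bit 0 being the most significant bit of CR13 ) & the function names are invented for illustration only.

```c
#include <stdint.h>

#define STD31_STO_MASK	0x7ffff000u	/* bits 1-19: segment table origin */
#define STD31_STL_MASK	0x0000007fu	/* bits 25-31: segment table length */

/* Pick the STO & STL fields out of a 31 bit CR13 value. */
uint32_t std31_sto(uint32_t cr13)
{
	return cr13 & STD31_STO_MASK;
}

uint32_t std31_stl(uint32_t cr13)
{
	return cr13 & STD31_STL_MASK;
}

/* 31 bit: build an STD from a pgd by using the maximum table length. */
uint32_t std31_from_pgd(uint32_t pgd)
{
	return pgd | STD31_STL_MASK;
}

/* z/Architecture: TableType 0x1 ( bits 2-3 ) & table length 3 ( bits 0-1 ). */
uint64_t std64_from_pgd(uint64_t pgd)
{
	return pgd | (0x1u << 2) | 0x3u;
}
```

e.g. the pgd 0f2c0000 found above becomes 0f2c007f on 31 bit, which is the value used in the TR I R STD example.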
|  | 1350 |  | 
|  | 1351 |  | 
|  | 1352 |  | 
|  | 1353 | Tracing Program Exceptions | 
|  | 1354 | -------------------------- | 
|  | 1355 | If you get a crash which says something like | 
|  | 1356 | illegal operation or specification exception followed by a register dump | 
|  | 1357 | You can restart linux & trace these using the tr prog <range or value> trace option. | 
|  | 1358 |  | 
|  | 1359 |  | 
|  | 1360 |  | 
|  | 1361 | The most common ones you will normally be tracing for are | 
|  | 1362 | 1=operation exception | 
|  | 1363 | 2=privileged operation exception | 
|  | 1364 | 4=protection exception | 
|  | 1365 | 5=addressing exception | 
|  | 1366 | 6=specification exception | 
|  | 1367 | 10=segment translation exception | 
|  | 1368 | 11=page translation exception | 
|  | 1369 |  | 
|  | 1370 | The full list of these is on page 22 of the current s/390 Reference Summary. | 
|  | 1371 | e.g. | 
|  | 1372 | tr prog 10 will trace segment translation exceptions. | 
|  | 1373 | tr prog on its own will trace all program interruption codes. | 
|  | 1374 |  | 
|  | 1375 | Trace Sets | 
|  | 1376 | ---------- | 
|  | 1377 | On starting VM you are initially in the INITIAL trace set. | 
|  | 1378 | You can do a Q TR to verify this. | 
|  | 1379 | Suppose you have a complex tracing situation where you wish to wait, for instance, | 
|  | 1380 | till a driver is open before you start tracing IO, but know in your | 
|  | 1381 | heart that you are going to have to make several runs through the code till you | 
|  | 1382 | have a clue what's going on. | 
|  | 1383 |  | 
|  | 1384 | What you can do is | 
|  | 1385 | TR I PSWA <Driver open address> | 
|  | 1386 | hit b to continue till breakpoint | 
|  | 1387 | reach the breakpoint | 
|  | 1388 | now do your | 
|  | 1389 | TR GOTO B | 
|  | 1390 | TR IO 7c08-7c09 inst int run | 
|  | 1391 | or whatever the IO channels you wish to trace are & hit b | 
|  | 1392 |  | 
|  | 1393 | To go back to the initial trace set do | 
|  | 1394 | TR GOTO INITIAL | 
|  | 1395 | & the TR I PSWA <Driver open address> will be the only active breakpoint again. | 
|  | 1396 |  | 
|  | 1397 |  | 
|  | 1398 | Tracing linux syscalls under VM | 
|  | 1399 | ------------------------------- | 
|  | 1400 | Syscalls are implemented on Linux for S390 by the Supervisor call instruction (SVC). There are 256 | 
|  | 1401 | possibilities of these as the instruction is made up of a 0x0A opcode & the second byte being | 
|  | 1402 | the syscall number. They are traced using the simple command | 
|  | 1403 | TR SVC  <Optional value or range> | 
| Randy Dunlap | 58cc855 | 2009-01-06 14:42:42 -0800 | [diff] [blame] | 1404 | the syscalls are defined in linux/arch/s390/include/asm/unistd.h | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1405 | e.g. to trace all file opens just do | 
|  | 1406 | TR SVC 5 ( as this is the syscall number of open ) | 
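Since an SVC is just the opcode byte 0x0A followed by the syscall number, decoding one from a hexdump is trivial. A minimal sketch ( svc_number is an illustrative name ):

```c
#include <stdint.h>

/*
 * Returns the syscall number ( 0-255 ) if the 2 byte instruction is an
 * SVC, or -1 if the opcode byte is not 0x0A.
 */
int svc_number(uint16_t insn)
{
	if ((insn >> 8) != 0x0A)
		return -1;
	return insn & 0xff;
}
```

So the halfword 0A05 is SVC 5, the open syscall traced above.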
|  | 1407 |  | 
|  | 1408 |  | 
|  | 1409 | SMP Specific commands | 
|  | 1410 | --------------------- | 
|  | 1411 | To find out how many cpus you have | 
|  | 1412 | Q CPUS displays all the CPU's available to your virtual machine | 
|  | 1413 | To find the cpu that VM debugger commands are currently being directed at do Q CPU. | 
| Paolo Ornati | 670e9f3 | 2006-10-03 22:57:56 +0200 | [diff] [blame] | 1414 | To change the cpu that VM debugger commands are being directed at do | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1415 | CPU <desired cpu no> | 
|  | 1416 |  | 
|  | 1417 | On an SMP guest, to issue a command to all CPUs try prefixing the command with cpu all. | 
|  | 1418 | To issue a command to a particular cpu try cpu <cpu number> e.g. | 
|  | 1419 | CPU 01 TR I R 2000.3000 | 
|  | 1420 | If you are running on a guest with several cpus & you have an IO related problem | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 1421 | & cannot follow the flow of code but you know it isn't smp related, | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1422 | from the bash prompt issue | 
|  | 1423 | shutdown -h now or halt. | 
|  | 1424 | do a Q CPUS to find out how many cpus you have | 
|  | 1425 | detach each one of them from cp except cpu 0 | 
|  | 1426 | by issuing a | 
|  | 1427 | DETACH CPU 01-(number of cpus in configuration) | 
|  | 1428 | & boot linux again. | 
|  | 1429 | TR SIGP will trace inter processor signal processor instructions. | 
|  | 1430 | DEFINE CPU 01-(number in configuration) | 
|  | 1431 | will get your guests cpus back. | 
|  | 1432 |  | 
|  | 1433 |  | 
|  | 1434 | Help for displaying ascii textstrings | 
|  | 1435 | ------------------------------------- | 
|  | 1436 | On the very latest VM Nucleus'es VM can now display ascii | 
|  | 1437 | ( thanks Neale for the hint ) by doing | 
|  | 1438 | D TX<lowaddr>.<len> | 
|  | 1439 | e.g. | 
|  | 1440 | D TX0.100 | 
|  | 1441 |  | 
|  | 1442 | Alternatively | 
|  | 1443 | ============= | 
|  | 1444 | Under older VM debuggers ( I love EBCDIC too ) you can use this little program I wrote, which | 
|  | 1445 | converts a command line of hex digits to ascii text. It can be compiled under linux & | 
|  | 1446 | you can copy the hex digits from your x3270 terminal to your xterm if you are debugging | 
|  | 1447 | from a linuxbox. | 
|  | 1448 |  | 
|  | 1449 | This is quite useful when looking at a parameter passed in as a text string | 
|  | 1450 | under VM ( unless you are good at decoding ASCII in your head ). | 
|  | 1451 |  | 
|  | 1452 | e.g. consider tracing an open syscall | 
|  | 1453 | TR SVC 5 | 
|  | 1454 | We have stopped at a breakpoint | 
|  | 1455 | 000151B0' SVC   0A05     -> 0001909A'   CC 0 | 
|  | 1456 |  | 
|  | 1457 | D 20.8 to check the SVC old psw in the prefix area & see whether it was from userspace | 
|  | 1458 | ( for the layout of the prefix area consult P18 of the s/390 Reference Summary | 
|  | 1459 | if you have it available ). | 
|  | 1460 | V00000020  070C2000 800151B2 | 
|  | 1461 | The problem state bit wasn't set & it's also too early in the boot sequence | 
|  | 1462 | for it to be a userspace SVC. If it was, we would have to temporarily switch the | 
|  | 1463 | psw to user space addressing so we could get at the first parameter of the open in | 
|  | 1464 | gpr2. | 
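The check on the old PSW can be written down as a couple of masks. This sketch assumes the 31 bit ESA/390 PSW layout, where the problem state bit is bit 15 of the first word & the top bit of the second word is the addressing mode bit; the function names are made up.

```c
#include <stdint.h>

/* 1 if the PSW was in problem ( user ) state, 0 for supervisor state. */
int psw_problem_state(uint32_t psw_word0)
{
	return (psw_word0 >> 16) & 1;
}

/* Instruction address with the addressing mode bit stripped. */
uint32_t psw_address(uint32_t psw_word1)
{
	return psw_word1 & 0x7fffffff;
}
```

For the old PSW 070C2000 800151B2 shown above, the problem state bit is 0 & the address is 0x151B2, i.e. 2 bytes past the SVC at 000151B0.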
|  | 1465 | Next do a | 
|  | 1466 | D G2 | 
|  | 1467 | GPR  2 =  00014CB4 | 
|  | 1468 | Now display what gpr2 is pointing to | 
|  | 1469 | D 00014CB4.20 | 
|  | 1470 | V00014CB4  2F646576 2F636F6E 736F6C65 00001BF5 | 
|  | 1471 | V00014CC4  FC00014C B4001001 E0001000 B8070707 | 
|  | 1472 | Now copy the text till the first 00 hex ( which is the end of the string ) | 
|  | 1473 | to an xterm & do hex2ascii on it. | 
|  | 1474 | hex2ascii 2F646576 2F636F6E 736F6C65 00 | 
|  | 1475 | outputs | 
|  | 1476 | Decoded Hex:=/ d e v / c o n s o l e 0x00 | 
|  | 1477 | We were opening the console device. | 
|  | 1478 |  | 
|  | 1479 | You can compile the code below yourself for practice :-), | 
|  | 1480 | /* | 
|  | 1481 | *    hex2ascii.c | 
|  | 1482 | *    a useful little tool for converting a hexadecimal command line to ascii | 
|  | 1483 | * | 
|  | 1484 | *    Author(s): Denis Joseph Barrow (djbarrow@de.ibm.com,barrow_dj@yahoo.com) | 
|  | 1485 | *    (C) 2000 IBM Deutschland Entwicklung GmbH, IBM Corporation. | 
|  | 1486 | */ | 
|  | 1487 | #include <stdio.h> | 
|  | 1488 | #include <string.h> | 
|  | 1489 |  | 
|  | 1490 | int main(int argc,char *argv[]) | 
|  | 1491 | { | 
|  | 1492 |     int cnt1,cnt2,len,toggle=0; | 
|  | 1493 |     int startcnt=1; | 
|  | 1494 |     unsigned char c,hex=0; | 
|  | 1495 |  | 
|  | 1496 |     if(argc>1&&(strcmp(argv[1],"-a")==0))   /* -a: plain text, '.' for unprintables */ | 
|  | 1497 |         startcnt=2; | 
|  | 1498 |     printf("Decoded Hex:="); | 
|  | 1499 |     for(cnt1=startcnt;cnt1<argc;cnt1++) | 
|  | 1500 |     { | 
|  | 1501 |         len=strlen(argv[cnt1]); | 
|  | 1502 |         for(cnt2=0;cnt2<len;cnt2++) | 
|  | 1503 |         { | 
|  | 1504 |             c=argv[cnt1][cnt2]; | 
|  | 1505 |             if(c>='0'&&c<='9') | 
|  | 1506 |                 c=c-'0'; | 
|  | 1507 |             if(c>='A'&&c<='F') | 
|  | 1508 |                 c=c-'A'+10; | 
|  | 1509 |             if(c>='a'&&c<='f') | 
|  | 1510 |                 c=c-'a'+10; | 
|  | 1511 |             switch(toggle) | 
|  | 1512 |             { | 
|  | 1513 |             case 0:     /* first digit: high nibble */ | 
|  | 1514 |                 hex=c<<4; | 
|  | 1515 |                 toggle=1; | 
|  | 1516 |                 break; | 
|  | 1517 |             case 1:     /* second digit: low nibble completes the byte */ | 
|  | 1518 |                 hex+=c; | 
|  | 1519 |                 if(hex<32||hex>126)     /* not printable ascii */ | 
|  | 1520 |                 { | 
|  | 1521 |                     if(startcnt==1) | 
|  | 1522 |                         printf("0x%02X ",(int)hex); | 
|  | 1523 |                     else | 
|  | 1524 |                         printf("."); | 
|  | 1525 |                 } | 
|  | 1526 |                 else | 
|  | 1527 |                 { | 
|  | 1528 |                     printf("%c",hex); | 
|  | 1529 |                     if(startcnt==1) | 
|  | 1530 |                         printf(" "); | 
|  | 1531 |                 } | 
|  | 1532 |                 toggle=0; | 
|  | 1533 |                 break; | 
|  | 1534 |             } | 
|  | 1535 |         } | 
|  | 1536 |     } | 
|  | 1537 |     printf("\n"); | 
|  | 1538 |     return 0; | 
|  | 1539 | } | 
|  | 1540 |  | 
|  | 1541 |  | 
|  | 1542 | Stack tracing under VM | 
|  | 1543 | ---------------------- | 
|  | 1544 | A basic backtrace | 
|  | 1545 | ----------------- | 
|  | 1546 |  | 
|  | 1547 | Here are the tricks I use 9 out of 10 times it works pretty well, | 
|  | 1548 |  | 
|  | 1549 | When your backchain reaches a dead end | 
|  | 1550 | -------------------------------------- | 
|  | 1551 | This can happen when an exception happens in the kernel & the kernel is entered twice. | 
|  | 1552 | If you reach the NULL pointer at the end of the back chain you should be | 
|  | 1553 | able to sniff further back if you follow the following tricks. | 
|  | 1554 | 1) A kernel address should be easy to recognise since it is in | 
|  | 1555 | primary space & the problem state bit isn't set & also | 
|  | 1556 | The Hi bit of the address is set. | 
|  | 1557 | 2) Another backchain should also be easy to recognise since it is an | 
|  | 1558 | address pointing to another address approximately 100 bytes or 0x70 hex | 
|  | 1559 | behind the current stackpointer. | 
|  | 1560 |  | 
|  | 1561 |  | 
|  | 1562 | Here is some practice. | 
|  | 1563 | boot the kernel & hit PA1 at some random time | 
|  | 1564 | d g to display the gprs, this should display something like | 
|  | 1565 | GPR  0 =  00000001  00156018  0014359C  00000000 | 
|  | 1566 | GPR  4 =  00000001  001B8888  000003E0  00000000 | 
|  | 1567 | GPR  8 =  00100080  00100084  00000000  000FE000 | 
|  | 1568 | GPR 12 =  00010400  8001B2DC  8001B36A  000FFED8 | 
|  | 1569 | Note that GPR14 is a return address but as we are real men we are going to | 
|  | 1570 | trace the stack. | 
|  | 1571 | display 0x40 bytes after the stack pointer. | 
|  | 1572 |  | 
|  | 1573 | V000FFED8  000FFF38 8001B838 80014C8E 000FFF38 | 
|  | 1574 | V000FFEE8  00000000 00000000 000003E0 00000000 | 
|  | 1575 | V000FFEF8  00100080 00100084 00000000 000FE000 | 
|  | 1576 | V000FFF08  00010400 8001B2DC 8001B36A 000FFED8 | 
|  | 1577 |  | 
|  | 1578 |  | 
|  | 1579 | Ah now look at what's in sp+56 (sp+0x38): this is 8001B36A, our saved r14 if | 
|  | 1580 | you look above at our stackframe, & it also agrees with GPR14. | 
|  | 1581 |  | 
|  | 1582 | now backchain | 
|  | 1583 | d 000FFF38.40 | 
|  | 1584 | we now are taking the contents of SP to get our first backchain. | 
|  | 1585 |  | 
|  | 1586 | V000FFF38  000FFFA0 00000000 00014995 00147094 | 
|  | 1587 | V000FFF48  00147090 001470A0 000003E0 00000000 | 
|  | 1588 | V000FFF58  00100080 00100084 00000000 001BF1D0 | 
|  | 1589 | V000FFF68  00010400 800149BA 80014CA6 000FFF38 | 
|  | 1590 |  | 
|  | 1591 | This displays a 2nd return address of 80014CA6 | 
|  | 1592 |  | 
|  | 1593 | now do d 000FFFA0.40 for our 3rd backchain | 
|  | 1594 |  | 
|  | 1595 | V000FFFA0  04B52002 0001107F 00000000 00000000 | 
|  | 1596 | V000FFFB0  00000000 00000000 FF000000 0001107F | 
|  | 1597 | V000FFFC0  00000000 00000000 00000000 00000000 | 
|  | 1598 | V000FFFD0  00010400 80010802 8001085A 000FFFA0 | 
|  | 1599 |  | 
|  | 1600 |  | 
|  | 1601 | our 3rd return address is 8001085A | 
|  | 1602 |  | 
|  | 1603 | as the 04B52002 looks suspiciously like rubbish it is fair to assume that the kernel entry routines | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 1604 | for the sake of optimisation don't set up a backchain. | 
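Each step of the walk above reads the same two words out of a 0x40 byte display: the backchain at sp+0 & the saved r14 at sp+56. A minimal sketch in C, with invented names, that takes the 16 words VM printed for one frame:

```c
#include <stdint.h>

#define FRAME_WORDS	16		/* 0x40 bytes as shown by d <sp>.40 */
#define R14_SLOT	(56 / 4)	/* saved r14 lives at sp+56 */

struct frame_info {
	uint32_t backchain;	/* sp of the caller, word at sp+0 */
	uint32_t return_addr;	/* saved r14, word at sp+56 */
};

/* Pull the backchain & return address out of one dumped stack frame. */
struct frame_info read_frame(const uint32_t words[FRAME_WORDS])
{
	struct frame_info f;

	f.backchain = words[0];
	f.return_addr = words[R14_SLOT];
	return f;
}
```

Fed the first dump above ( starting at 000FFED8 ) this yields backchain 000FFF38 & return address 8001B36A. It is up to you to decide when a backchain value looks like rubbish & stop.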
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1605 |  | 
|  | 1606 | now look at System.map to see if the addresses make any sense. | 
|  | 1607 |  | 
|  | 1608 | grep -i 0001b3 System.map | 
|  | 1609 | outputs among other things | 
|  | 1610 | 0001b304 T cpu_idle | 
|  | 1611 | so 8001B36A | 
|  | 1612 | is cpu_idle+0x66 ( quiet the cpu is asleep, don't wake it ) | 
|  | 1613 |  | 
|  | 1614 |  | 
|  | 1615 | grep -i 00014 System.map | 
|  | 1616 | produces among other things | 
|  | 1617 | 00014a78 T start_kernel | 
|  | 1618 | so 80014CA6 is start_kernel+0x22e ( 0x14CA6-0x14a78 ). | 
|  | 1619 |  | 
|  | 1620 | grep -i 00108 System.map | 
|  | 1621 | this produces | 
|  | 1622 | 00010800 T _stext | 
|  | 1623 | so   8001085A is _stext+0x5a | 
|  | 1624 |  | 
|  | 1625 | Congrats you've done your first backchain. | 
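If you get tired of the hex subtraction, the lookup can be sketched in C as below. It assumes the symbols are sorted by ascending address ( as System.map normally is ) & that the top bit of a 31 bit return address is the addressing mode bit; struct ksym & lookup_symbol are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

struct ksym {
	uint32_t addr;
	const char *name;
};

/*
 * Find the last symbol at or below ret_addr in a table sorted by
 * ascending address. Stores the distance from that symbol in *offset.
 * Returns NULL if ret_addr is below the first symbol.
 */
const struct ksym *lookup_symbol(const struct ksym *syms, size_t n,
				 uint32_t ret_addr, uint32_t *offset)
{
	uint32_t addr = ret_addr & 0x7fffffff;	/* drop the addressing mode bit */
	const struct ksym *best = NULL;
	size_t i;

	for (i = 0; i < n && syms[i].addr <= addr; i++)
		best = &syms[i];
	if (best)
		*offset = addr - best->addr;
	return best;
}
```

With only the three symbols grepped above in the table, 8001B36A resolves to cpu_idle+0x66, 80014CA6 to start_kernel+0x22e & 8001085A to _stext+0x5a. With a full System.map a closer symbol may of course turn up.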
|  | 1626 |  | 
|  | 1627 |  | 
|  | 1628 |  | 
|  | 1629 | s/390 & z/Architecture IO Overview | 
|  | 1630 | ================================== | 
|  | 1631 |  | 
|  | 1632 | I am not going to give a course in 390 IO architecture as this would take me quite a | 
|  | 1633 | while & I'm no expert. Instead I'll give a 390 IO architecture summary for Dummies; if you have | 
|  | 1634 | the s/390 principles of operation available read that instead. If nothing else you may find a few | 
|  | 1635 | useful keywords in here & be able to use them on a web search engine like altavista to find | 
|  | 1636 | more useful information. | 
|  | 1637 |  | 
|  | 1638 | Unlike other bus architectures modern 390 systems do their IO using mostly | 
|  | 1639 | fibre optics & devices such as tapes & disks can be shared between several mainframes, | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 1640 | also S390 can support up to 65536 devices while a high end PC based system might be choking | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1641 | with around 64. Here is some of the common IO terminology | 
|  | 1642 |  | 
|  | 1643 | Subchannel: | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 1644 | This is the logical number most IO commands use to talk to an IO device. There can be up to | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1645 | 0x10000 (65536) of these in a configuration; typically there are a few hundred. Under VM | 
|  | 1646 | for simplicity they are allocated contiguously, however on the native hardware they are not; | 
|  | 1647 | they typically stay consistent between boots provided no new hardware is inserted or removed. | 
|  | 1648 | Under Linux for 390 we use these as IRQ's & also when issuing an IO command (CLEAR SUBCHANNEL, | 
|  | 1649 | HALT SUBCHANNEL,MODIFY SUBCHANNEL,RESUME SUBCHANNEL,START SUBCHANNEL,STORE SUBCHANNEL & | 
|  | 1650 | TEST SUBCHANNEL ) we use this as the ID of the device we wish to talk to. The most | 
|  | 1651 | important of these instructions are START SUBCHANNEL ( to start IO ), TEST SUBCHANNEL ( to check | 
|  | 1652 | whether the IO completed successfully ), & HALT SUBCHANNEL ( to kill IO ). A subchannel | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 1653 | can have up to 8 channel paths to a device; this offers redundancy if one is not available. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1654 |  | 
|  | 1655 |  | 
|  | 1656 | Device Number: | 
|  | 1657 | This number remains static & is closely tied to the hardware. There are 65536 of these; | 
|  | 1658 | they are made up of a CHPID ( Channel Path ID, the most significant 8 bits ) | 
|  | 1659 | & another lsb 8 bits. These remain static even if more devices are inserted or removed | 
|  | 1660 | from the hardware. There is a 1 to 1 mapping between Subchannels & Device Numbers provided | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 1661 | devices aren't inserted or removed. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1662 |  | 
|  | 1663 | Channel Control Words: | 
|  | 1664 | CCWS are linked lists of instructions initially pointed to by an operation request block (ORB), | 
|  | 1665 | which is initially given to the Start Subchannel (SSCH) command along with the subchannel number | 
|  | 1666 | for the IO subsystem to process while the CPU continues executing normal code. | 
|  | 1667 | These come in two flavours, Format 0 ( 24 bit for backward | 
|  | 1668 | compatibility ) & Format 1 ( 31 bit ). These are typically used to issue read & write | 
|  | 1669 | ( & many other instructions ); they consist of a length field & an absolute address field. | 
|  | 1670 | For each IO you typically get 1 or 2 interrupts, one for channel end ( primary status ) when the | 
|  | 1671 | channel is idle & the second for device end ( secondary status ); sometimes you get both | 
|  | 1672 | concurrently. You check how the IO went by issuing a TEST SUBCHANNEL at each interrupt, | 
|  | 1673 | from which you receive an Interruption response block (IRB). If you get channel & device end | 
|  | 1674 | status in the IRB without channel checks etc. your IO probably went okay. If you didn't you | 
| Matt LaPlante | fff9289 | 2006-10-03 22:47:42 +0200 | [diff] [blame] | 1675 | probably need a doctor to examine the IRB & extended status word etc. | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 1676 | If an error occurs, more sophisticated control units have a facility known as | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1677 | concurrent sense; this means that if an error occurs Extended sense information will | 
|  | 1678 | be presented in the Extended status word in the IRB. If not you have to issue a | 
|  | 1679 | subsequent SENSE CCW command after the test subchannel. | 
|  | 1680 |  | 
|  | 1681 |  | 
|  | 1682 | TPI( Test pending interrupt) can also be used for polled IO but in multitasking multiprocessor | 
|  | 1683 | systems it isn't recommended except for checking special cases ( i.e. non looping checks for | 
|  | 1684 | pending IO etc. ). | 
|  | 1685 |  | 
|  | 1686 | Store Subchannel & Modify Subchannel can be used to examine & modify operating characteristics | 
|  | 1687 | of a subchannel ( e.g. channel paths ). | 
|  | 1688 |  | 
|  | 1689 | Other IO related Terms: | 
|  | 1690 | Sysplex: S390's Clustering Technology | 
|  | 1691 | QDIO: S390's new high speed IO architecture to support devices such as gigabit ethernet, | 
|  | 1692 | this architecture is also designed to be forward compatible with up & coming 64 bit machines. | 
|  | 1693 |  | 
|  | 1694 |  | 
|  | 1695 | General Concepts | 
|  | 1696 |  | 
|  | 1697 | Input Output Processors (IOP's) are responsible for communicating between | 
|  | 1698 | the mainframe CPU's & the channel & relieve the mainframe CPU's from the | 
|  | 1699 | burden of communicating with IO devices directly, this allows the CPU's to | 
|  | 1700 | concentrate on data processing. | 
|  | 1701 |  | 
|  | 1702 | IOP's can use one or more links ( known as channel paths ) to talk to each | 
|  | 1703 | IO device. It first checks for path availability & chooses an available one, | 
|  | 1704 | then starts ( & sometimes terminates IO ). | 
| Matt LaPlante | 992caac | 2006-10-03 22:52:05 +0200 | [diff] [blame] | 1705 | There are two types of channel path: ESCON & the Parallel IO interface. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1706 |  | 
|  | 1707 | IO devices are attached to control units, control units provide the | 
|  | 1708 | logic to interface the channel paths & channel path IO protocols to | 
|  | 1709 | the IO devices, they can be integrated with the devices or housed separately | 
|  | 1710 | & often talk to several similar devices ( typical examples would be raid | 
|  | 1711 | controllers or a control unit which connects to 1000 3270 terminals ). | 
|  | 1712 |  | 
|  | 1713 |  | 
|  | 1714 | +---------------------------------------------------------------+ | 
|  | 1715 | | +-----+ +-----+ +-----+ +-----+  +----------+  +----------+   | | 
|  | 1716 | | | CPU | | CPU | | CPU | | CPU |  |  Main    |  | Expanded |   | | 
|  | 1717 | | |     | |     | |     | |     |  |  Memory  |  |  Storage |   | | 
|  | 1718 | | +-----+ +-----+ +-----+ +-----+  +----------+  +----------+   | | 
|  | 1719 | |---------------------------------------------------------------+ | 
|  | 1720 | |   IOP        |      IOP      |       IOP                      | | 
|  | 1721 | |--------------------------------------------------------------- | 
|  | 1722 | | C | C | C | C | C | C | C | C | C | C | C | C | C | C | C | C | | 
|  | 1723 | ---------------------------------------------------------------- | 
|  | 1724 | ||                                              || | 
|  | 1725 | ||  Bus & Tag Channel Path                      || ESCON | 
|  | 1726 | ||  ======================                      || Channel | 
|  | 1727 | ||  ||                  ||                      || Path | 
|  | 1728 | +----------+               +----------+         +----------+ | 
|  | 1729 | |          |               |          |         |          | | 
|  | 1730 | |    CU    |               |    CU    |         |    CU    | | 
|  | 1731 | |          |               |          |         |          | | 
|  | 1732 | +----------+               +----------+         +----------+ | 
|  | 1733 | |      |                     |                |       | | 
|  | 1734 | +----------+ +----------+      +----------+   +----------+ +----------+ | 
|  | 1735 | |I/O Device| |I/O Device|      |I/O Device|   |I/O Device| |I/O Device| | 
|  | 1736 | +----------+ +----------+      +----------+   +----------+ +----------+ | 
|  | 1737 | CPU = Central Processing Unit | 
|  | 1738 | C = Channel | 
|  | 1739 | IOP = IO Processor | 
|  | 1740 | CU = Control Unit | 
|  | 1741 |  | 
|  | 1742 | The 390 IO systems come in 2 flavours; the current 390 machines support both. | 
|  | 1743 |  | 
| Matt LaPlante | 992caac | 2006-10-03 22:52:05 +0200 | [diff] [blame] | 1744 | The Older 360 & 370 Interface,sometimes called the Parallel I/O interface, | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1745 | sometimes called Bus-and Tag & sometimes Original Equipment Manufacturers | 
|  | 1746 | Interface (OEMI). | 
|  | 1747 |  | 
| Matt LaPlante | 992caac | 2006-10-03 22:52:05 +0200 | [diff] [blame] | 1748 | This byte wide Parallel channel path/bus has parity & data on the "Bus" cable | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1749 | & control lines on the "Tag" cable. These can operate in byte multiplex mode for | 
|  | 1750 | sharing between several slow devices or burst mode & monopolize the channel for the | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 1751 | whole burst. Up to 256 devices can be addressed  on one of these cables. These cables are | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1752 | about one inch in diameter. The maximum unextended length supported by these cables is | 
|  | 1753 | 125 Meters but this can be extended up to 2km with a fibre optic channel extender | 
|  | 1754 | such as a 3044. The maximum burst speed supported is 4.5 megabytes per second however | 
|  | 1755 | some really old processors support only transfer rates of 3.0, 2.0 & 1.0 MB/sec. | 
|  | 1756 | One of these paths can be daisy chained to up to 8 control units. | 
|  | 1757 |  | 
|  | 1758 |  | 
|  | 1759 | ESCON ( fibre optic; its Fibre Channel based successor is called FICON ) | 
|  | 1760 | Was introduced by IBM in 1990. Has 2 fibre optic cables & uses either leds or lasers | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 1761 | for communication at a signaling rate of up to 200 megabits/sec. As 10bits are transferred | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1762 | for every 8 bits info this drops to 160 megabits/sec & to 18.6 Megabytes/sec once | 
|  | 1763 | control info & CRC are added. ESCON only operates in burst mode. | 
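The arithmetic behind these ESCON figures can be checked in a few lines of Python. This is only a sketch: the function name is made up for illustration, and the 18.6 Megabytes/sec payload figure is taken from the text rather than derived.

```python
def escon_rates(signal_mbit=200.0, data_bits=8, line_bits=10):
    """Return (data rate in Mbit/s, data rate in MB/s) after line coding.

    10 bits go over the fibre for every 8 bits of information.
    """
    data_mbit = signal_mbit * data_bits / line_bits
    return data_mbit, data_mbit / 8.0


data_mbit, data_mbyte = escon_rates()
print(data_mbit, data_mbyte)          # 160.0 Mbit/s, 20.0 MB/s

# Share of the 20 MB/s lost to control info & CRC, using the quoted 18.6 MB/s.
quoted_payload = 18.6
overhead = 1.0 - quoted_payload / data_mbyte
print("overhead: about %.0f%%" % (overhead * 100))
```

So roughly 7% of the decoded bandwidth goes on control information & CRC.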
|  | 1764 |  | 
|  | 1765 | ESCONs typical max cable length is 3km for the led version & 20km for the laser version | 
|  | 1766 | known as XDF ( extended distance facility ). This can be further extended by using an | 
|  | 1767 | ESCON director which triples the above mentioned ranges. Unlike Bus & Tag, ESCON is | 
|  | 1768 | serial & uses a packet switching architecture; the standard Bus & Tag control protocol | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 1769 | is however present within the packets. Up to 256 devices can be attached to each control | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1770 | unit that uses one of these interfaces. | 
|  | 1771 |  | 
|  | 1772 | Common 390 Devices include: | 
|  | 1773 | Network adapters typically OSA2,3172's,2116's & OSA-E gigabit ethernet adapters, | 
|  | 1774 | Consoles 3270 & 3215 ( a teletype emulated under linux for a line mode console ). | 
|  | 1775 | DASD's direct access storage devices ( otherwise known as hard disks ). | 
|  | 1776 | Tape Drives. | 
|  | 1777 | CTC ( Channel to Channel Adapters ), | 
| Matt LaPlante | 992caac | 2006-10-03 22:52:05 +0200 | [diff] [blame] | 1778 | ESCON or Parallel Cables used as a very high speed serial link | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1779 | between 2 machines. We use 2 cables under linux to do a bi-directional serial link. | 
|  | 1780 |  | 
|  | 1781 |  | 
|  | 1782 | Debugging IO on s/390 & z/Architecture under VM | 
|  | 1783 | =============================================== | 
|  | 1784 |  | 
|  | 1785 | Now we are ready to go on with IO tracing commands under VM | 
|  | 1786 |  | 
|  | 1787 | A few self explanatory queries: | 
|  | 1788 | Q OSA | 
|  | 1789 | Q CTC | 
|  | 1790 | Q DISK ( This command is CMS specific ) | 
|  | 1791 | Q DASD | 
|  | 1792 |  | 
|  | 1793 |  | 
|  | 1794 |  | 
|  | 1795 |  | 
|  | 1796 |  | 
|  | 1797 |  | 
|  | 1798 | Q OSA on my machine returns | 
|  | 1799 | OSA  7C08 ON OSA   7C08 SUBCHANNEL = 0000 | 
|  | 1800 | OSA  7C09 ON OSA   7C09 SUBCHANNEL = 0001 | 
|  | 1801 | OSA  7C14 ON OSA   7C14 SUBCHANNEL = 0002 | 
|  | 1802 | OSA  7C15 ON OSA   7C15 SUBCHANNEL = 0003 | 
|  | 1803 |  | 
| Matt LaPlante | 992caac | 2006-10-03 22:52:05 +0200 | [diff] [blame] | 1804 | If you have a guest with certain privileges you may be able to see devices | 
|  | 1805 | which don't belong to you. To avoid this, add the option V. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1806 | e.g. | 
|  | 1807 | Q V OSA | 
|  | 1808 |  | 
|  | 1809 | Now using the device numbers returned by this command we will | 
|  | 1810 | Trace the io starting up on the first device 7c08 & 7c09 | 
|  | 1811 | In our simplest case we can trace the | 
|  | 1812 | start subchannels | 
|  | 1813 | like TR SSCH 7C08-7C09 | 
|  | 1814 | or the halt subchannels | 
|  | 1815 | or TR HSCH 7C08-7C09 | 
|  | 1816 | MSCH's ,STSCH's I think you can guess the rest | 
|  | 1817 |  | 
|  | 1818 | Ingo's favourite trick is tracing all the IO's & CCWS & spooling them into the reader of another | 
|  | 1819 | VM guest so he can ftp the logfile back to his own machine.I'll do a small bit of this & give you | 
|  | 1820 | a look at the output. | 
|  | 1821 |  | 
|  | 1822 | 1) Spool stdout to VM reader | 
|  | 1823 | SP PRT TO (another vm guest ) or * for the local vm guest | 
|  | 1824 | 2) Fill the reader with the trace | 
|  | 1825 | TR IO 7c08-7c09 INST INT CCW PRT RUN | 
|  | 1826 | 3) Start up linux | 
|  | 1827 | i 00c | 
|  | 1828 | 4) Finish the trace | 
|  | 1829 | TR END | 
|  | 1830 | 5) close the reader | 
|  | 1831 | C PRT | 
|  | 1832 | 6) list reader contents | 
|  | 1833 | RDRLIST | 
|  | 1834 | 7) copy it to linux4's minidisk | 
|  | 1835 | RECEIVE / LOG TXT A1 ( replace | 
|  | 1836 | 8) | 
|  | 1837 | filel & press F11 to look at it | 
| Matt LaPlante | 53cb472 | 2006-10-03 22:55:17 +0200 | [diff] [blame] | 1838 | You should see something like: | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1839 |  | 
|  | 1840 | 00020942' SSCH  B2334000    0048813C    CC 0    SCH 0000    DEV 7C08 | 
|  | 1841 | CPA 000FFDF0   PARM 00E2C9C4    KEY 0  FPI C0  LPM 80 | 
|  | 1842 | CCW    000FFDF0  E4200100 00487FE8   0000  E4240100 ........ | 
|  | 1843 | IDAL                                      43D8AFE8 | 
|  | 1844 | IDAL                                      0FB76000 | 
|  | 1845 | 00020B0A'   I/O DEV 7C08 -> 000197BC'   SCH 0000   PARM 00E2C9C4 | 
|  | 1846 | 00021628' TSCH  B2354000 >> 00488164    CC 0    SCH 0000    DEV 7C08 | 
|  | 1847 | CCWA 000FFDF8   DEV STS 0C  SCH STS 00  CNT 00EC | 
|  | 1848 | KEY 0   FPI C0  CC 0   CTLS 4007 | 
|  | 1849 | 00022238' STSCH B2344000 >> 00488108    CC 0    SCH 0000    DEV 7C08 | 
|  | 1850 |  | 
|  | 1851 | If you don't like messing up your reader ( because you possibly booted from it ) | 
|  | 1852 | you can alternatively spool it to another guest's reader. | 
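The DEV STS and SCH STS fields of the TSCH entry in the trace above are bit masks. A minimal sketch for decoding them follows. It assumes the usual channel subsystem device status bit assignments ( attention 0x80 down to unit exception 0x01 ), and the helper names are hypothetical, not part of any VM or Linux tool.

```python
# Device status bits as they appear in the DEV STS byte of the trace.
# Assumption: standard channel subsystem bit assignments, 0x80 down to 0x01.
DEVICE_STATUS_BITS = [
    (0x80, "attention"),
    (0x40, "status modifier"),
    (0x20, "control unit end"),
    (0x10, "busy"),
    (0x08, "channel end"),
    (0x04, "device end"),
    (0x02, "unit check"),
    (0x01, "unit exception"),
]


def decode_device_status(status):
    """Return the names of all bits set in a device status byte."""
    return [name for mask, name in DEVICE_STATUS_BITS if status & mask]


def io_probably_ok(dev_status, sch_status):
    """True for channel end & device end only, with no subchannel status bits."""
    return dev_status == 0x0C and sch_status == 0x00


print(decode_device_status(0x0C))   # DEV STS 0C from the TSCH line
print(io_probably_ok(0x0C, 0x00))   # SCH STS 00 from the same line
```

For the trace shown this reports channel end & device end with a clean subchannel status, i.e. the IO probably went okay. A unit check bit ( 0x02 ) would be the cue to look at the sense data.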
|  | 1853 |  | 
|  | 1854 |  | 
|  | 1855 | Other common VM device related commands | 
|  | 1856 | --------------------------------------------- | 
|  | 1857 | These commands are listed only because they have | 
|  | 1858 | been of use to me in the past & may be of use to | 
|  | 1859 | you too. For more complete info on each of the commands | 
|  | 1860 | type HELP <command> from CMS. | 
|  | 1861 | DET <devno range> | 
|  | 1862 | detaches devices. | 
|  | 1863 | ATT <devno range> <guest> | 
|  | 1864 | attaches devices to a guest ( use * for your own guest ). | 
|  | 1865 | READY <devno> causes VM to issue a fake interrupt. | 
|  | 1866 |  | 
|  | 1867 | The VARY command is normally only available to VM administrators. | 
|  | 1868 | VARY ON PATH <path> TO <devno range> | 
|  | 1869 | VARY OFF PATH <PATH> FROM <devno range> | 
|  | 1870 | This is used to switch on or off channel paths to devices. | 
|  | 1871 |  | 
|  | 1872 | Q CHPID <channel path ID> | 
|  | 1873 | This displays state of devices using this channel path | 
|  | 1874 | D SCHIB <subchannel> | 
|  | 1875 | This displays the subchannel information SCHIB block for the device. | 
|  | 1876 | this I believe is also only available to administrators. | 
|  | 1877 | DEFINE CTC <devno> | 
|  | 1878 | defines a virtual CTC channel to channel connection | 
|  | 1879 | 2 need to be defined on each guest for the CTC driver to use. | 
|  | 1880 | COUPLE  devno userid remote devno | 
|  | 1881 | Joins a local virtual device to a remote virtual device | 
|  | 1882 | ( commonly used for the CTC driver ). | 
|  | 1883 |  | 
|  | 1884 | Building a VM ramdisk under CMS which linux can use | 
|  | 1885 | def vfb-<blocksize> <subchannel> <number blocks> | 
|  | 1886 | blocksize is commonly 4096 for linux. | 
|  | 1887 | Formatting it | 
|  | 1888 | format <subchannel> <drive letter e.g. x> (blksize <blocksize> | 
|  | 1889 |  | 
|  | 1890 | Sharing a disk between multiple guests | 
|  | 1891 | LINK userid devno1 devno2 mode password | 
|  | 1892 |  | 
|  | 1893 |  | 
|  | 1894 |  | 
|  | 1895 | GDB on S390 | 
|  | 1896 | =========== | 
|  | 1897 | N.B. if compiling for debugging gdb works better without optimisation | 
|  | 1898 | ( see Compiling programs for debugging ) | 
|  | 1899 |  | 
|  | 1900 | invocation | 
|  | 1901 | ---------- | 
|  | 1902 | gdb <victim program> <optional corefile> | 
|  | 1903 |  | 
|  | 1904 | Online help | 
|  | 1905 | ----------- | 
|  | 1906 | help: gives help on commands | 
|  | 1907 | e.g. | 
|  | 1908 | help | 
|  | 1909 | help display | 
|  | 1910 | Note: gdb's online help is very good, use it. | 
|  | 1911 |  | 
|  | 1912 |  | 
|  | 1913 | Assembly | 
|  | 1914 | -------- | 
|  | 1915 | info registers: displays registers other than floating point. | 
|  | 1916 | info all-registers: displays floating points as well. | 
| Matt LaPlante | fff9289 | 2006-10-03 22:47:42 +0200 | [diff] [blame] | 1917 | disassemble: disassembles | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1918 | e.g. | 
|  | 1919 | disassemble without parameters will disassemble the current function | 
|  | 1920 | disassemble $pc $pc+10 | 
|  | 1921 |  | 
|  | 1922 | Viewing & modifying variables | 
|  | 1923 | ----------------------------- | 
|  | 1924 | print or p: displays variable or register | 
|  | 1925 | e.g. p/x $sp will display the stack pointer | 
|  | 1926 |  | 
|  | 1927 | display: prints variable or register each time program stops | 
|  | 1928 | e.g. | 
|  | 1929 | display/x $pc will display the program counter | 
|  | 1930 | display argc | 
|  | 1931 |  | 
|  | 1932 | undisplay: undoes displays | 
|  | 1933 |  | 
|  | 1934 | info breakpoints: shows all current breakpoints | 
|  | 1935 |  | 
| Matt LaPlante | fff9289 | 2006-10-03 22:47:42 +0200 | [diff] [blame] | 1936 | info stack: shows stack back trace ( if this doesn't work too well, I'll show you the | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1937 | stacktrace by hand below ). | 
|  | 1938 |  | 
|  | 1939 | info locals: displays local variables. | 
|  | 1940 |  | 
|  | 1941 | info args: display current procedure arguments. | 
|  | 1942 |  | 
|  | 1943 | set args: will set argc & argv each time the victim program is invoked. | 
|  | 1944 |  | 
|  | 1945 | set <variable>=value | 
|  | 1946 | set argc=100 | 
|  | 1947 | set $pc=0 | 
|  | 1948 |  | 
|  | 1949 |  | 
|  | 1950 |  | 
|  | 1951 | Modifying execution | 
|  | 1952 | ------------------- | 
|  | 1953 | step: steps n lines of sourcecode | 
|  | 1954 | step steps 1 line. | 
|  | 1955 | step 100 steps 100 lines of code. | 
|  | 1956 |  | 
|  | 1957 | next: like step except this will not step into subroutines | 
|  | 1958 |  | 
|  | 1959 | stepi: steps a single machine code instruction. | 
|  | 1960 | e.g. stepi 100 | 
|  | 1961 |  | 
|  | 1962 | nexti: steps a single machine code instruction but will not step into subroutines. | 
|  | 1963 |  | 
|  | 1964 | finish: will run until exit of the current routine | 
|  | 1965 |  | 
|  | 1966 | run: (re)starts a program | 
|  | 1967 |  | 
|  | 1968 | cont: continues a program | 
|  | 1969 |  | 
|  | 1970 | quit: exits gdb. | 
|  | 1971 |  | 
|  | 1972 |  | 
|  | 1973 | breakpoints | 
|  | 1974 | ------------ | 
|  | 1975 |  | 
|  | 1976 | break | 
|  | 1977 | sets a breakpoint | 
|  | 1978 | e.g. | 
|  | 1979 |  | 
|  | 1980 | break main | 
|  | 1981 |  | 
|  | 1982 | break *$pc | 
|  | 1983 |  | 
|  | 1984 | break *0x400618 | 
|  | 1985 |  | 
| Matt LaPlante | 19f5946 | 2009-04-27 15:06:31 +0200 | [diff] [blame] | 1986 | Here's a really useful one for large programs | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 1987 | rbr | 
|  | 1988 | Set a breakpoint for all functions matching REGEXP | 
|  | 1989 | e.g. | 
|  | 1990 | rbr 390 | 
|  | 1991 | will set a breakpoint with all functions with 390 in their name. | 
|  | 1992 |  | 
|  | 1993 | info breakpoints | 
|  | 1994 | lists all breakpoints | 
|  | 1995 |  | 
|  | 1996 | delete: delete breakpoint by number or delete them all | 
|  | 1997 | e.g. | 
|  | 1998 | delete 1 will delete the first breakpoint | 
|  | 1999 | delete will delete them all | 
|  | 2000 |  | 
|  | 2001 | watch: This will set a watchpoint ( usually hardware assisted ), | 
|  | 2002 | This will watch a variable till it changes | 
|  | 2003 | e.g. | 
|  | 2004 | watch cnt, will watch the variable cnt till it changes. | 
|  | 2005 | As an aside, unfortunately gdb's architecture independent watchpoint code | 
|  | 2006 | is inconsistent & not very good, watchpoints usually work but not always. | 
|  | 2007 |  | 
|  | 2008 | info watchpoints: Display currently active watchpoints | 
|  | 2009 |  | 
|  | 2010 | condition: ( another useful one ) | 
|  | 2011 | Specify breakpoint number N to break only if COND is true. | 
|  | 2012 | Usage is `condition N COND', where N is an integer and COND is an | 
|  | 2013 | expression to be evaluated whenever breakpoint N is reached. | 
|  | 2014 |  | 
|  | 2015 |  | 
|  | 2016 |  | 
|  | 2017 | User defined functions/macros | 
|  | 2018 | ----------------------------- | 
|  | 2019 | define: ( Note this is very very useful,simple & powerful ) | 
|  | 2020 | usage define <name> <list of commands> end | 
|  | 2021 |  | 
|  | 2022 | examples which you should consider putting into .gdbinit in your home directory | 
|  | 2023 | define d | 
|  | 2024 | stepi | 
|  | 2025 | disassemble $pc $pc+10 | 
|  | 2026 | end | 
|  | 2027 |  | 
|  | 2028 | define e | 
|  | 2029 | nexti | 
|  | 2030 | disassemble $pc $pc+10 | 
|  | 2031 | end | 
|  | 2032 |  | 
|  | 2033 |  | 
|  | 2034 | Other hard to classify stuff | 
|  | 2035 | ---------------------------- | 
|  | 2036 | signal n: | 
|  | 2037 | sends the victim program a signal. | 
|  | 2038 | e.g. signal 3 will send a SIGQUIT. | 
|  | 2039 |  | 
|  | 2040 | info signals: | 
|  | 2041 | what gdb does when the victim receives certain signals. | 
|  | 2042 |  | 
|  | 2043 | list: | 
|  | 2044 | e.g. | 
|  | 2045 | list lists current function source | 
| Matt LaPlante | 6c28f2c | 2006-10-03 22:46:31 +0200 | [diff] [blame] | 2046 | list 1,10 list first 10 lines of current file. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2047 | list test.c:1,10 | 
|  | 2048 |  | 
|  | 2049 |  | 
|  | 2050 | directory: | 
|  | 2051 | Adds directories to be searched for source if gdb cannot find the source. | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 2052 | (note it is a bit sensitive about slashes) | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2053 | e.g. To add the root of the filesystem to the searchpath do | 
|  | 2054 | directory // | 
|  | 2055 |  | 
|  | 2056 |  | 
|  | 2057 | call <function> | 
|  | 2058 | This calls a function in the victim program, this is pretty powerful | 
|  | 2059 | e.g. | 
|  | 2060 | (gdb) call printf("hello world") | 
|  | 2061 | outputs: | 
|  | 2062 | $1 = 11 | 
|  | 2063 |  | 
|  | 2064 | You might now be thinking that the line above didn't work, something extra had to be done. | 
|  | 2065 | (gdb) call fflush(stdout) | 
|  | 2066 | hello world$2 = 0 | 
|  | 2067 | As an aside the debugger also calls malloc & free under the hood | 
|  | 2068 | to make space for the "hello world" string. | 
|  | 2069 |  | 
|  | 2070 |  | 
|  | 2071 |  | 
|  | 2072 | hints | 
|  | 2073 | ----- | 
|  | 2074 | 1) command completion works just like bash | 
|  | 2075 | ( if you are a bad typist like me this really helps ) | 
|  | 2076 | e.g. hit br <TAB> & cursor up & down :-). | 
|  | 2077 |  | 
|  | 2078 | 2) if you have a debugging problem that takes a few steps to recreate | 
|  | 2079 | put the steps into a file called .gdbinit in your current working directory | 
|  | 2080 | if you have defined a few extra useful user defined commands put these in | 
|  | 2081 | your home directory & they will be read each time gdb is launched. | 
|  | 2082 |  | 
|  | 2083 | A typical .gdbinit file might be. | 
|  | 2084 | break main | 
|  | 2085 | run | 
|  | 2086 | break runtime_exception | 
|  | 2087 | cont | 
|  | 2088 |  | 
|  | 2089 |  | 
|  | 2090 | stack chaining in gdb by hand | 
|  | 2091 | ----------------------------- | 
|  | 2092 | This is done using the same trick described for VM | 
|  | 2093 | p/x (*($sp+56))&0x7fffffff get the first backchain. | 
|  | 2094 |  | 
|  | 2095 | For z/Architecture | 
|  | 2096 | Replace 56 with 112 & ignore the &0x7fffffff | 
|  | 2097 | in the macros below & do nasty casts to longs like the following | 
|  | 2098 | as gdb unfortunately deals with printed arguments as ints which | 
|  | 2099 | messes up everything. | 
|  | 2100 | i.e. here is a 3rd backchain dereference | 
|  | 2101 | p/x *(long *)(***(long ***)$sp+112) | 
|  | 2102 |  | 
|  | 2103 |  | 
|  | 2104 | this outputs | 
|  | 2105 | $5 = 0x528f18 | 
|  | 2106 | on my machine. | 
|  | 2107 | Now you can use | 
|  | 2108 | info symbol (*($sp+56))&0x7fffffff | 
|  | 2109 | you might see something like. | 
|  | 2110 | rl_getc + 36 in section .text  telling you what is located at address 0x528f18 | 
|  | 2111 | Now do. | 
|  | 2112 | p/x (*(*$sp+56))&0x7fffffff | 
|  | 2113 | This outputs | 
|  | 2114 | $6 = 0x528ed0 | 
|  | 2115 | Now do. | 
|  | 2116 | info symbol (*(*$sp+56))&0x7fffffff | 
|  | 2117 | rl_read_key + 180 in section .text | 
|  | 2118 | now do | 
|  | 2119 | p/x (*(**$sp+56))&0x7fffffff | 
|  | 2120 | & so on. | 
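The repeated dereferencing above is just a linked list walk: the word at offset 0 of each frame is the backchain to the caller's frame & the word at offset 56 is the saved return address. The sketch below simulates this in Python over a made-up memory image. The stack addresses are invented for illustration, the first two return addresses are the ones printed above, and walk_backchain is a hypothetical helper, not a gdb command.

```python
def walk_backchain(read_word, sp, ret_offset=56, addr_mask=0x7fffffff,
                   max_frames=32):
    """Follow the backchain from sp & collect the saved return addresses.

    read_word(addr) must return the word stored at addr.
    The walk stops at a zero backchain or after max_frames frames.
    """
    frames = []
    while sp and len(frames) < max_frames:
        # Same as p/x (*($sp+56))&0x7fffffff for the current frame.
        frames.append(read_word(sp + ret_offset) & addr_mask)
        # Same as replacing $sp with *$sp in the gdb expression.
        sp = read_word(sp)
    return frames


# Invented 31 bit stack image: address -> word stored there.
memory = {
    0x7ffff000: 0x7ffff060, 0x7ffff000 + 56: 0x80528f18,
    0x7ffff060: 0x7ffff0c0, 0x7ffff060 + 56: 0x80528ed0,
    0x7ffff0c0: 0x00000000, 0x7ffff0c0 + 56: 0x805167e6,
}

for address in walk_backchain(memory.__getitem__, 0x7ffff000):
    print(hex(address))
```

For z/Architecture the same walk would use ret_offset=112 & a mask of all ones, with 64 bit words in the memory image.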
|  | 2121 |  | 
|  | 2122 | Disassembling instructions without debug info | 
|  | 2123 | --------------------------------------------- | 
| Matt LaPlante | 6c28f2c | 2006-10-03 22:46:31 +0200 | [diff] [blame] | 2124 | gdb typically complains if there is a lack of debugging | 
|  | 2125 | symbols in the disassemble command with | 
|  | 2126 | "No function contains specified address." To get around | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2127 | this do | 
|  | 2128 | x/<number lines to disassemble>xi <address> | 
|  | 2129 | e.g. | 
|  | 2130 | x/20xi 0x400730 | 
|  | 2131 |  | 
|  | 2132 |  | 
|  | 2133 |  | 
|  | 2134 | Note: Remember gdb has history just like bash you don't need to retype the | 
|  | 2135 | whole line just use the up & down arrows. | 
|  | 2136 |  | 
|  | 2137 |  | 
|  | 2138 |  | 
|  | 2139 | For more info | 
|  | 2140 | ------------- | 
|  | 2141 | From your linuxbox do | 
|  | 2142 | man gdb or info gdb. | 
|  | 2143 |  | 
|  | 2144 | core dumps | 
|  | 2145 | ---------- | 
|  | 2146 | What is a core dump ? | 
|  | 2147 | A core dump is a file generated by the kernel ( if allowed ) which contains the registers, | 
|  | 2148 | & all active pages of the program which has crashed. | 
|  | 2149 | From this file gdb will allow you to look at the registers & stack trace & memory of the | 
|  | 2150 | program as if it just crashed on your system, it is usually called core & created in the | 
|  | 2151 | current working directory. | 
|  | 2152 | This is very useful in that a customer can mail a core dump to a technical support department | 
|  | 2153 | & the technical support department can reconstruct what happened, | 
| Nicolas Kaiser | 2254f5a | 2006-12-04 15:40:23 +0100 | [diff] [blame] | 2154 | provided they have an identical copy of this program with debugging symbols compiled in & | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2155 | the source base of this build is available. | 
|  | 2156 | In short it is far more useful than something like a crash log could ever hope to be. | 
|  | 2157 |  | 
|  | 2158 | In theory all that is missing to restart a core dumped program is a kernel patch which | 
|  | 2159 | will do the following. | 
|  | 2160 | 1) Make a new kernel task structure | 
|  | 2161 | 2) Reload all the dumped pages back into the kernel's memory management structures. | 
|  | 2162 | 3) Do the required clock fixups | 
|  | 2163 | 4) Get all files & network connections for the process back into an identical state ( really difficult ). | 
|  | 2164 | 5) A few more difficult things I haven't thought of. | 
|  | 2165 |  | 
|  | 2166 |  | 
|  | 2167 |  | 
|  | 2168 | Why have I never seen one ?. | 
|  | 2169 | Probably because you haven't used the command | 
|  | 2170 | ulimit -c unlimited in bash | 
|  | 2171 | to allow core dumps, now do | 
|  | 2172 | ulimit -a | 
|  | 2173 | to verify that the limit was accepted. | 
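The same limit can also be inspected or raised from inside a program. The following is a minimal sketch using Python's standard resource module ( Unix only ). The helper names are made up for illustration. Raising the soft limit as far as the hard limit needs no special privileges; going beyond the hard limit does.

```python
import resource


def core_limit():
    """Return the (soft, hard) core file size limits of this process."""
    return resource.getrlimit(resource.RLIMIT_CORE)


def allow_core_dumps():
    """Raise the soft core limit to the hard limit, like ulimit -c <hard>."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


print("before:", core_limit())
print("after: ", allow_core_dumps())
# resource.RLIM_INFINITY in the output corresponds to "unlimited" in ulimit -a.
```

The limit is inherited by child processes, so a wrapper script can set it before launching the program it wants a core dump from.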
|  | 2174 |  | 
|  | 2175 | A sample core dump | 
|  | 2176 | To create this I'm going to do | 
|  | 2177 | ulimit -c unlimited | 
|  | 2178 | gdb | 
|  | 2179 | to launch gdb (my victim app. ) now be bad & do the following from another | 
|  | 2180 | telnet/xterm session to the same machine | 
|  | 2181 | ps -aux | grep gdb | 
|  | 2182 | kill -SIGSEGV <gdb's pid> | 
|  | 2183 | or alternatively use killall -SIGSEGV gdb if you have the killall command. | 
|  | 2184 | Now look at the core dump. | 
| Paolo Ornati | 670e9f3 | 2006-10-03 22:57:56 +0200 | [diff] [blame] | 2185 | ./gdb core | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2186 | Displays the following | 
|  | 2187 | GNU gdb 4.18 | 
|  | 2188 | Copyright 1998 Free Software Foundation, Inc. | 
|  | 2189 | GDB is free software, covered by the GNU General Public License, and you are | 
|  | 2190 | welcome to change it and/or distribute copies of it under certain conditions. | 
|  | 2191 | Type "show copying" to see the conditions. | 
|  | 2192 | There is absolutely no warranty for GDB.  Type "show warranty" for details. | 
|  | 2193 | This GDB was configured as "s390-ibm-linux"... | 
|  | 2194 | Core was generated by `./gdb'. | 
|  | 2195 | Program terminated with signal 11, Segmentation fault. | 
|  | 2196 | Reading symbols from /usr/lib/libncurses.so.4...done. | 
|  | 2197 | Reading symbols from /lib/libm.so.6...done. | 
|  | 2198 | Reading symbols from /lib/libc.so.6...done. | 
|  | 2199 | Reading symbols from /lib/ld-linux.so.2...done. | 
|  | 2200 | #0  0x40126d1a in read () from /lib/libc.so.6 | 
|  | 2201 | Setting up the environment for debugging gdb. | 
|  | 2202 | Breakpoint 1 at 0x4dc6f8: file utils.c, line 471. | 
|  | 2203 | Breakpoint 2 at 0x4d87a4: file top.c, line 2609. | 
|  | 2204 | (top-gdb) info stack | 
|  | 2205 | #0  0x40126d1a in read () from /lib/libc.so.6 | 
|  | 2206 | #1  0x528f26 in rl_getc (stream=0x7ffffde8) at input.c:402 | 
|  | 2207 | #2  0x528ed0 in rl_read_key () at input.c:381 | 
|  | 2208 | #3  0x5167e6 in readline_internal_char () at readline.c:454 | 
|  | 2209 | #4  0x5168ee in readline_internal_charloop () at readline.c:507 | 
|  | 2210 | #5  0x51692c in readline_internal () at readline.c:521 | 
| John Anthony Kazos Jr | be2a608 | 2007-05-09 08:50:42 +0200 | [diff] [blame] | 2211 | #6  0x5164fe in readline (prompt=0x7ffff810 "\177ÿøx\177ÿ÷Ø\177ÿøxÀ") | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2212 | at readline.c:349 | 
| Matt LaPlante | 19f5946 | 2009-04-27 15:06:31 +0200 | [diff] [blame] | 2213 | #7  0x4d7a8a in command_line_input (prompt=0x564420 "(gdb) ", repeat=1, | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2214 | annotation_suffix=0x4d6b44 "prompt") at top.c:2091 | 
|  | 2215 | #8  0x4d6cf0 in command_loop () at top.c:1345 | 
|  | 2216 | #9  0x4e25bc in main (argc=1, argv=0x7ffffdf4) at main.c:635 | 
|  | 2217 |  | 
|  | 2218 |  | 
|  | 2219 | LDD | 
|  | 2220 | === | 
|  | 2221 | This is a program which lists the shared libraries which a program or library needs. | 
|  | 2222 | Note you also get the relocations of the shared library text segments which | 
|  | 2223 | help when using objdump --source. | 
|  | 2224 | e.g. | 
|  | 2225 | ldd ./gdb | 
|  | 2226 | outputs | 
|  | 2227 | libncurses.so.4 => /usr/lib/libncurses.so.4 (0x40018000) | 
|  | 2228 | libm.so.6 => /lib/libm.so.6 (0x4005e000) | 
|  | 2229 | libc.so.6 => /lib/libc.so.6 (0x40084000) | 
|  | 2230 | /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000) | 
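The load addresses are what let you map a runtime address back to an offset inside the library file. As a sketch, the Python below parses the ldd output shown above & converts the address of frame #0 from the core dump backtrace ( 0x40126d1a in read () from /lib/libc.so.6 ) into an offset within libc. parse_ldd is a hypothetical helper, and the result only matches the addresses printed by objdump if the library is linked with a base address of 0, which is assumed here.

```python
import re

# Matches lines like:  libc.so.6 => /lib/libc.so.6 (0x40084000)
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\((0x[0-9a-fA-F]+)\)")


def parse_ldd(output):
    """Return a dict mapping library path to load address."""
    bases = {}
    for line in output.splitlines():
        match = LDD_LINE.match(line)
        if match:
            bases[match.group(2)] = int(match.group(3), 16)
    return bases


sample = """\
libncurses.so.4 => /usr/lib/libncurses.so.4 (0x40018000)
libm.so.6 => /lib/libm.so.6 (0x4005e000)
libc.so.6 => /lib/libc.so.6 (0x40084000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
"""

bases = parse_ldd(sample)
offset = 0x40126d1a - bases["/lib/libc.so.6"]
print(hex(offset))   # 0xa2d1a
```

On systems which randomise load addresses the values reported by ldd differ from run to run, so take them from the same process you are debugging ( e.g. from /proc/<pid>/maps ).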
|  | 2231 |  | 
|  | 2232 |  | 
|  | 2233 | Debugging shared libraries | 
|  | 2234 | ========================== | 
|  | 2235 | Most programs use shared libraries, however it can be very painful | 
|  | 2236 | when you single step by instruction into a function like printf for the | 
|  | 2237 | first time & you end up in functions like _dl_runtime_resolve. This is | 
|  | 2238 | ld.so doing lazy binding. Lazy binding is a concept in ELF where | 
|  | 2239 | calls to shared library functions are not resolved until they are | 
|  | 2240 | first used, great for saving startup time but a pain to debug. | 
|  | 2241 | To get around this either relink the program -static or exit gdb, type | 
|  | 2242 | export LD_BIND_NOW=true to stop lazy binding & restart gdb on | 
|  | 2243 | the program in question. | 
|  | 2244 |  | 
|  | 2245 |  | 
|  | 2246 |  | 
|  | 2247 | Debugging modules | 
|  | 2248 | ================= | 
|  | 2249 | As modules are dynamically loaded into the kernel their address can be | 
|  | 2250 | anywhere. To get around this use the -m option with insmod to emit a load | 
|  | 2251 | map which can be piped into a file if required. | 
|  | 2252 |  | 
|  | 2253 | The proc file system | 
|  | 2254 | ==================== | 
|  | 2255 | What is it ?. | 
|  | 2256 | It is a filesystem created by the kernel with files which are created on demand | 
|  | 2257 | by the kernel if read, or can be used to modify kernel parameters, | 
|  | 2258 | it is a powerful concept. | 
|  | 2259 |  | 
|  | 2260 | e.g. | 
|  | 2261 |  | 
|  | 2262 | cat /proc/sys/net/ipv4/ip_forward | 
|  | 2263 | On my machine outputs | 
|  | 2264 | 0 | 
|  | 2265 | telling me ip_forwarding is not on. To switch it on I can do | 
|  | 2266 | echo 1 >  /proc/sys/net/ipv4/ip_forward | 
|  | 2267 | cat it again | 
|  | 2268 | cat /proc/sys/net/ipv4/ip_forward | 
|  | 2269 | On my machine now outputs | 
|  | 2270 | 1 | 
|  | 2271 | IP forwarding is on. | 
|  | 2272 | There is a lot of useful info in here best found by going in & having a look around, | 
|  | 2273 | so I'll take you through some entries I consider important. | 
|  | 2274 |  | 
| Sylvestre Ledru | f65e51d | 2011-04-04 15:04:46 -0700 | [diff] [blame] | 2275 | All the processes running on the machine have their own entry defined by | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2276 | /proc/<pid> | 
|  | 2277 | So let's have a look at the init process | 
|  | 2278 | cd /proc/1 | 
|  | 2279 |  | 
|  | 2280 | cat cmdline | 
|  | 2281 | emits | 
|  | 2282 | init [2] | 
|  | 2283 |  | 
|  | 2284 | cd /proc/1/fd | 
|  | 2285 | This contains numerical entries of all the open files, | 
|  | 2286 | some of these you can cat e.g. stdout (1) | 
|  | 2287 |  | 
|  | 2288 | cat /proc/29/maps | 
|  | 2289 | on my machine emits | 
|  | 2290 |  | 
|  | 2291 | 00400000-00478000 r-xp 00000000 5f:00 4103       /bin/bash | 
|  | 2292 | 00478000-0047e000 rw-p 00077000 5f:00 4103       /bin/bash | 
|  | 2293 | 0047e000-00492000 rwxp 00000000 00:00 0 | 
|  | 2294 | 40000000-40015000 r-xp 00000000 5f:00 14382      /lib/ld-2.1.2.so | 
|  | 2295 | 40015000-40016000 rw-p 00014000 5f:00 14382      /lib/ld-2.1.2.so | 
|  | 2296 | 40016000-40017000 rwxp 00000000 00:00 0 | 
|  | 2297 | 40017000-40018000 rw-p 00000000 00:00 0 | 
|  | 2298 | 40018000-4001b000 r-xp 00000000 5f:00 14435      /lib/libtermcap.so.2.0.8 | 
|  | 2299 | 4001b000-4001c000 rw-p 00002000 5f:00 14435      /lib/libtermcap.so.2.0.8 | 
|  | 2300 | 4001c000-4010d000 r-xp 00000000 5f:00 14387      /lib/libc-2.1.2.so | 
|  | 2301 | 4010d000-40111000 rw-p 000f0000 5f:00 14387      /lib/libc-2.1.2.so | 
|  | 2302 | 40111000-40114000 rw-p 00000000 00:00 0 | 
|  | 2303 | 40114000-4011e000 r-xp 00000000 5f:00 14408      /lib/libnss_files-2.1.2.so | 
|  | 2304 | 4011e000-4011f000 rw-p 00009000 5f:00 14408      /lib/libnss_files-2.1.2.so | 
|  | 2305 | 7fffd000-80000000 rwxp ffffe000 00:00 0 | 
|  | 2306 |  | 
|  | 2307 |  | 
|  | 2308 | Showing us the shared libraries the process uses ( here bash, pid 29 ), where they are in memory 
|  | 2309 | & the memory access permissions for each virtual memory area. 
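As a rough illustration of how the columns of a maps line break down, here is
a minimal Python sketch. The names MapEntry & parse_maps_line are made up for
this example, & the field meanings ( address range, permissions, file offset,
device, inode, optional path ) are assumed from the listing above.

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    start: int   # first address of the virtual memory area
    end: int     # first address past the end of the area
    perms: str   # e.g. "r-xp": read, no write, execute, private
    offset: int  # offset into the backing file
    dev: str     # major:minor of the backing device
    inode: int   # inode of the backing file, 0 for anonymous areas
    path: str    # backing file, empty for anonymous areas


def parse_maps_line(line: str) -> MapEntry:
    # Split on whitespace into at most six fields; the path may be absent.
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    path = fields[5].strip() if len(fields) > 5 else ""
    return MapEntry(start, end, fields[1], int(fields[2], 16),
                    fields[3], int(fields[4]), path)


# First line of the listing above.
entry = parse_maps_line(
    "00400000-00478000 r-xp 00000000 5f:00 4103       /bin/bash")
print(hex(entry.end - entry.start), entry.perms, entry.path)
# prints: 0x78000 r-xp /bin/bash
```

Subtracting start from end gives the size of each area, which is handy when
checking whether a faulting address falls inside a particular mapping.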
|  | 2310 |  | 
|  | 2311 | /proc/1/cwd is a softlink to the current working directory. | 
|  | 2312 | /proc/1/root is the root of the filesystem for this process. | 
|  | 2313 |  | 
|  | 2314 | /proc/1/mem is the process's memory, which you 
|  | 2315 | can read & write to like a file. 
|  | 2316 | strace uses this sometimes as it is a bit faster than the | 
| Matt LaPlante | 2fe0ae7 | 2006-10-03 22:50:39 +0200 | [diff] [blame] | 2317 | rather inefficient ptrace interface for peeking at DATA. | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2318 |  | 
|  | 2319 |  | 
|  | 2320 | cat status | 
|  | 2321 |  | 
|  | 2322 | Name:   init | 
|  | 2323 | State:  S (sleeping) | 
|  | 2324 | Pid:    1 | 
|  | 2325 | PPid:   0 | 
|  | 2326 | Uid:    0       0       0       0 | 
|  | 2327 | Gid:    0       0       0       0 | 
|  | 2328 | Groups: | 
|  | 2329 | VmSize:      408 kB | 
|  | 2330 | VmLck:         0 kB | 
|  | 2331 | VmRSS:       208 kB | 
|  | 2332 | VmData:       24 kB | 
|  | 2333 | VmStk:         8 kB | 
|  | 2334 | VmExe:       368 kB | 
|  | 2335 | VmLib:         0 kB | 
|  | 2336 | SigPnd: 0000000000000000 | 
|  | 2337 | SigBlk: 0000000000000000 | 
|  | 2338 | SigIgn: 7fffffffd7f0d8fc | 
|  | 2339 | SigCgt: 00000000280b2603 | 
|  | 2340 | CapInh: 00000000fffffeff | 
|  | 2341 | CapPrm: 00000000ffffffff | 
|  | 2342 | CapEff: 00000000fffffeff | 
|  | 2343 |  | 
|  | 2344 | User PSW:    070de000 80414146 | 
|  | 2345 | task: 004b6000 tss: 004b62d8 ksp: 004b7ca8 pt_regs: 004b7f68 | 
|  | 2346 | User GPRS: | 
|  | 2347 | 00000400  00000000  0000000b  7ffffa90 | 
|  | 2348 | 00000000  00000000  00000000  0045d9f4 | 
|  | 2349 | 0045cafc  7ffffa90  7fffff18  0045cb08 | 
|  | 2350 | 00010400  804039e8  80403af8  7ffff8b0 | 
|  | 2351 | User ACRS: | 
|  | 2352 | 00000000  00000000  00000000  00000000 | 
|  | 2353 | 00000001  00000000  00000000  00000000 | 
|  | 2354 | 00000000  00000000  00000000  00000000 | 
|  | 2355 | 00000000  00000000  00000000  00000000 | 
|  | 2356 | Kernel BackChain  CallChain    BackChain  CallChain | 
|  | 2357 | 004b7ca8   8002bd0c     004b7d18   8002b92c | 
|  | 2358 | 004b7db8   8005cd50     004b7e38   8005d12a | 
|  | 2359 | 004b7f08   80019114 | 
|  | 2360 | This shows, among other things, memory usage, the status of some signals & 
|  | 2361 | the process's registers from the kernel task structure, 
|  | 2362 | as well as a backchain which may be useful if a process crashes 
|  | 2363 | in the kernel for some unknown reason. 
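The SigPnd, SigBlk, SigIgn & SigCgt fields are hexadecimal bitmasks. The
Python sketch below decodes one into signal numbers, assuming the usual Linux
convention that bit n-1 of the mask stands for signal n. The helper name
decode_sigmask is made up for this example.

```python
from typing import List


def decode_sigmask(hex_mask: str) -> List[int]:
    # Convert the hex string to an integer & report every set bit
    # as a signal number ( bit 0 is signal 1, bit 1 is signal 2, ... ).
    mask = int(hex_mask, 16)
    return [bit + 1 for bit in range(mask.bit_length())
            if (mask >> bit) & 1]


# SigCgt value taken from the status output above.
print(decode_sigmask("00000000280b2603"))
# prints: [1, 2, 10, 11, 14, 17, 18, 20, 28, 30]
```

The numbers can be matched against the output of kill -l to see which
signals the process catches, blocks or ignores.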
|  | 2364 |  | 
|  | 2365 | Some driver debugging techniques | 
|  | 2366 | ================================ | 
|  | 2367 | debug feature | 
|  | 2368 | ------------- | 
|  | 2369 | Some of our drivers now support a "debug feature" in 
|  | 2370 | /proc/s390dbf; see s390dbf.txt in the linux/Documentation directory 
|  | 2371 | for more info. 
|  | 2372 | e.g. | 
|  | 2373 | to switch on the lcs "debug feature" | 
|  | 2374 | echo 5 > /proc/s390dbf/lcs/level | 
|  | 2375 | & then, after the error has occurred, 
|  | 2376 | cat /proc/s390dbf/lcs/sprintf >/logfile | 
|  | 2377 | the logfile now contains some information which may help | 
|  | 2378 | tech support resolve a problem in the field. | 
|  | 2379 |  | 
|  | 2380 |  | 
|  | 2381 |  | 
|  | 2382 | high level debugging network drivers | 
|  | 2383 | ------------------------------------ | 
|  | 2384 | ifconfig is quite a useful command; 
|  | 2385 | it gives the current state of network drivers. 
|  | 2386 |  | 
|  | 2387 | If you suspect your network device driver is dead 
|  | 2388 | one way to check is to type 
|  | 2389 | ifconfig <network device> | 
|  | 2390 | e.g. tr0 | 
|  | 2391 | You should see something like | 
|  | 2392 | tr0       Link encap:16/4 Mbps Token Ring (New)  HWaddr 00:04:AC:20:8E:48 | 
|  | 2393 | inet addr:9.164.185.132  Bcast:9.164.191.255  Mask:255.255.224.0 | 
|  | 2394 | UP BROADCAST RUNNING MULTICAST  MTU:2000  Metric:1 | 
|  | 2395 | RX packets:246134 errors:0 dropped:0 overruns:0 frame:0 | 
|  | 2396 | TX packets:5 errors:0 dropped:0 overruns:0 carrier:0 | 
|  | 2397 | collisions:0 txqueuelen:100 | 
|  | 2398 |  | 
|  | 2399 | If the device doesn't say UP, 
|  | 2400 | try 
|  | 2401 | /etc/rc.d/init.d/network start 
|  | 2402 | ( this starts the network stack & hopefully calls ifconfig tr0 up ). 
|  | 2403 | ifconfig reads /proc/net/dev & presents it in a more readable form. 
|  | 2404 | Now ping the device from a machine in the same subnet. 
|  | 2405 | If the RX packets & TX packets counts don't increment you probably 
|  | 2406 | have problems. 
|  | 2407 | Next 
|  | 2408 | cat /proc/net/arp 
|  | 2409 | Do you see any hardware addresses in the cache? If not you may have problems. 
|  | 2410 | Next try 
|  | 2411 | ping -c 5 <broadcast_addr> i.e. the Bcast field above in the output of 
|  | 2412 | ifconfig. Do you see any replies from machines other than the local machine? 
|  | 2413 | If not you may have problems. Also, if the TX packets count in ifconfig 
|  | 2414 | hasn't incremented either, you have serious problems in your driver 
|  | 2415 | (e.g. the txbusy field of the network device being stuck on ) 
|  | 2416 | or you may have multiple network devices connected. 
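The counter check described above can be automated. The Python sketch below
pulls the RX & TX packet counts out of two ifconfig snapshots & reports which
direction has not moved. It assumes the old style "RX packets:N" output
format shown earlier; the function names & the second snapshot are made up
for this example.

```python
import re
from typing import Dict, List


def packet_counts(ifconfig_output: str) -> Dict[str, int]:
    # Extract the "RX packets:N" & "TX packets:N" counters.
    counts = {}
    for direction in ("RX", "TX"):
        match = re.search(direction + r" packets:(\d+)", ifconfig_output)
        if match:
            counts[direction] = int(match.group(1))
    return counts


def stalled(before: Dict[str, int], after: Dict[str, int]) -> List[str]:
    # Return the directions whose packet count did not increase.
    return [d for d in ("RX", "TX") if after.get(d, 0) <= before.get(d, 0)]


# Snapshot taken from the ifconfig output above.
before = packet_counts(
    "RX packets:246134 errors:0 dropped:0 overruns:0 frame:0\n"
    "TX packets:5 errors:0 dropped:0 overruns:0 carrier:0\n")
# Hypothetical snapshot taken after pinging the device.
after = packet_counts(
    "RX packets:246140 errors:0 dropped:0 overruns:0 frame:0\n"
    "TX packets:5 errors:0 dropped:0 overruns:0 carrier:0\n")
print(stalled(before, after))
# prints: ['TX']
```

A TX count that stays put while RX keeps climbing points at the transmit
side of the driver, as in the txbusy case mentioned above.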
|  | 2417 |  | 
|  | 2418 |  | 
|  | 2419 | chandev | 
|  | 2420 | ------- | 
|  | 2421 | There is a new device layer for channel devices, some | 
|  | 2422 | drivers e.g. lcs are registered with this layer. | 
|  | 2423 | If the device uses the channel device layer you'll be | 
|  | 2424 | able to find what interrupts it uses & the current state | 
|  | 2425 | of the device. | 
|  | 2426 | See the manpage chandev.8 & type cat /proc/chandev for more info. 
|  | 2427 |  | 
|  | 2428 |  | 
|  | 2429 |  | 
|  | 2430 | Starting points for debugging scripting languages etc. | 
|  | 2431 | ====================================================== | 
|  | 2432 |  | 
|  | 2433 | bash/sh | 
|  | 2434 |  | 
|  | 2435 | bash -x <scriptname> | 
|  | 2436 | e.g. bash -x /usr/bin/bashbug | 
|  | 2437 | displays the following lines as it executes them. | 
|  | 2438 | + MACHINE=i586 | 
|  | 2439 | + OS=linux-gnu | 
|  | 2440 | + CC=gcc | 
|  | 2441 | + CFLAGS= -DPROGRAM='bash' -DHOSTTYPE='i586' -DOSTYPE='linux-gnu' -DMACHTYPE='i586-pc-linux-gnu' -DSHELL -DHAVE_CONFIG_H   -I. -I. -I./lib -O2 -pipe | 
|  | 2442 | + RELEASE=2.01 | 
|  | 2443 | + PATCHLEVEL=1 | 
|  | 2444 | + RELSTATUS=release | 
|  | 2445 | + MACHTYPE=i586-pc-linux-gnu | 
|  | 2446 |  | 
| Matt LaPlante | 2fe0ae7 | 2006-10-03 22:50:39 +0200 | [diff] [blame] | 2447 | perl -d <scriptname> runs the perlscript in a fully interactive debugger | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2448 | <like gdb>. | 
|  | 2449 | Type 'h' in the debugger for help. | 
|  | 2450 |  | 
|  | 2451 | For debugging java type 
|  | 2452 | jdb <filename>, another fully interactive gdb style debugger, 
|  | 2453 | & type ? in the debugger for help. 
|  | 2454 |  | 
|  | 2455 |  | 
|  | 2456 |  | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2457 | SysRq | 
|  | 2458 | ===== | 
|  | 2459 | This is now supported by linux for s/390 & z/Architecture. | 
|  | 2460 | To enable it, compile the kernel with 
|  | 2461 | Kernel Hacking -> Magic SysRq Key Enabled | 
|  | 2462 | echo "1" > /proc/sys/kernel/sysrq | 
|  | 2463 | Also type 
|  | 2464 | echo "8" >/proc/sys/kernel/printk 
|  | 2465 | to make printk output go to the console. 
|  | 2466 | On 390 all commands are prefixed with | 
|  | 2467 | ^- | 
|  | 2468 | e.g. | 
|  | 2469 | ^-t will show tasks. | 
|  | 2470 | ^-? or some unknown command will display help. | 
|  | 2471 | The sysrq key reading is very picky ( I have to type the keys in an 
|  | 2472 | xterm session & paste them into the x3270 console ) 
|  | 2473 | & it may be wise to predefine the keys as described in the VM hints above. 
|  | 2474 |  | 
|  | 2475 | This is particularly useful for syncing disks, unmounting & rebooting 
|  | 2476 | if the machine gets partially hung. 
|  | 2477 |  | 
|  | 2478 | Read Documentation/sysrq.txt for more info. 
|  | 2479 |  | 
|  | 2480 | References: | 
|  | 2481 | =========== | 
|  | 2482 | Enterprise Systems Architecture Reference Summary | 
|  | 2483 | Enterprise Systems Architecture Principles of Operation | 
|  | 2484 | Hartmut Penner's s390 stack frame sheet. 
|  | 2485 | IBM Mainframe Channel Attachment, a technology brief from a Cisco webpage. 
|  | 2486 | Various bits of man & info pages of Linux. | 
|  | 2487 | Linux & GDB source. | 
|  | 2488 | Various info & man pages. | 
|  | 2489 | CMS Help on tracing commands. | 
|  | 2490 | Linux for s/390 ELF Application Binary Interface 
|  | 2491 | Linux for z/Series ELF Application Binary Interface ( Both Highly Recommended ) 
|  | 2492 | z/Architecture Principles of Operation SA22-7832-00 | 
|  | 2493 | Enterprise Systems Architecture/390 Reference Summary SA22-7209-01 & the | 
|  | 2494 | Enterprise Systems Architecture/390 Principles of Operation SA22-7201-05 | 
|  | 2495 |  | 
|  | 2496 | Special Thanks | 
|  | 2497 | ============== | 
|  | 2498 | Special thanks to Neale Ferguson who maintains a much | 
|  | 2499 | prettier HTML version of this page at | 
| Justin P. Mattock | 0ea6e61 | 2010-07-23 20:51:24 -0700 | [diff] [blame] | 2500 | http://linuxvm.org/penguinvm/ | 
| Linus Torvalds | 1da177e | 2005-04-16 15:20:36 -0700 | [diff] [blame] | 2501 | & to Bob Grainger, Stefan Bader & others for reporting bugs. |