bionic: fix pthread_{create, exit}/signal race condition

(1) in pthread_create:
    If the one signal is received before esp is subtracted by 16 and
    __thread_entry( ) is called, the stack will be cleared by kernel
    when it tries to contruct the signal stack frame. That will cause
    that __thread_entry will get a wrong tls pointer from the stack
    which leads to the segment fault when trying to access tls content.

(2) in pthread_exit
    After pthread_exit called system call unmap(), its stack will be
    freed.  If one signal is received at that time, there is no stack
    available for it.

Fixed by subtracting the child's esp by 16 before the clone system
call and by blocking signal handling before pthread_exit is started.

Author: Jack Ren <jack.ren@intel.com>
Signed-off-by: Bruce Beare <bruce.j.beare@intel.com>
diff --git a/libc/arch-x86/bionic/clone.S b/libc/arch-x86/bionic/clone.S
index b9b0957..8abb7c8 100644
--- a/libc/arch-x86/bionic/clone.S
+++ b/libc/arch-x86/bionic/clone.S
@@ -22,6 +22,7 @@
         movl    %eax, -8(%ecx)
         movl    %ecx, -4(%ecx)
 
+        subl    $16, %ecx
         movl    $__NR_clone, %eax
         int     $0x80
         test    %eax, %eax
@@ -39,7 +40,6 @@
         # we're in the child thread now, call __thread_entry
         # with the appropriate arguments on the child stack
         # we already placed most of them
-        subl    $16, %esp
         jmp     __thread_entry
         hlt