ext4: fix ext4_flush_completed_IO wait semantics

BUG #1) All places where we call ext4_flush_completed_IO are broken
    because buffered io and DIO/AIO goes through three stages
    1) submitted io,
    2) completed io (in i_completed_io_list) conversion pended
    3) finished  io (conversion done)
    And by calling ext4_flush_completed_IO we will flush only
    requests which were in (2) stage, which is wrong because:
     1) punch_hole and truncate _must_ wait for all outstanding unwritten io
      regardless to it's state.
     2) fsync and nolock_dio_read should also wait because there is
        a time window between end_page_writeback() and ext4_add_complete_io()
        As result integrity fsync is broken in case of buffered write
        to fallocated region:
        fsync                                      blkdev_completion
	 ->filemap_write_and_wait_range
                                                   ->ext4_end_bio
                                                     ->end_page_writeback
          <-- filemap_write_and_wait_range return
	 ->ext4_flush_completed_IO
   	 sees empty i_completed_io_list but pended
   	 conversion still exist
                                                     ->ext4_add_complete_io

BUG #2) Race window becomes wider due to the 'ext4: completed_io
locking cleanup V4' patch series

This patch make following changes:
1) ext4_flush_completed_io() now first try to flush completed io and when
   wait for any outstanding unwritten io via ext4_unwritten_wait()
2) Rename function to more appropriate name.
3) Assert that all callers of ext4_flush_unwritten_io should hold i_mutex to
   prevent endless wait

Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 5b24c40..68e896e 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -189,8 +189,6 @@
 
 		list_add_tail(&io->list, &complete);
 	}
-	/* It is important to update all flags for all end_io in one shot w/o
-	 * dropping the lock.*/
 	spin_lock_irqsave(&ei->i_completed_io_lock, flags);
 	while (!list_empty(&complete)) {
 		io = list_entry(complete.next, ext4_io_end_t, list);
@@ -228,9 +226,14 @@
 	ext4_do_flush_completed_IO(io->inode, io);
 }
 
-int ext4_flush_completed_IO(struct inode *inode)
+int ext4_flush_unwritten_io(struct inode *inode)
 {
-	return ext4_do_flush_completed_IO(inode, NULL);
+	int ret;
+	WARN_ON_ONCE(!mutex_is_locked(&inode->i_mutex) &&
+		     !(inode->i_state & I_FREEING));
+	ret = ext4_do_flush_completed_IO(inode, NULL);
+	ext4_unwritten_wait(inode);
+	return ret;
 }
 
 ext4_io_end_t *ext4_init_io_end(struct inode *inode, gfp_t flags)