f2fs: Pull in from upstream 3.13 kernel
Merge tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull new F2FS filesystem from Jaegeuk Kim:
"Introduce a new file system, Flash-Friendly File System (F2FS), to
Linux 3.8.
Highlights:
- Add initial f2fs source codes
- Fix an endian conversion bug
- Fix build failures on random configs
- Fix the power-off-recovery routine
- Minor cleanup, coding style, and typos patches"
From the Kconfig help text:
F2FS is based on Log-structured File System (LFS), which supports
versatile "flash-friendly" features. The design has been focused on
addressing the fundamental issues in LFS, which are snowball effect
of wandering tree and high cleaning overhead.
Since flash-based storages show different characteristics according to
the internal geometry or flash memory management schemes aka FTL, F2FS
and tools support various parameters not only for configuring on-disk
layout, but also for selecting allocation and cleaning algorithms.
and there's an article by Neil Brown about it on lwn.net:
http://lwn.net/Articles/518988/
* tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (36 commits)
f2fs: fix tracking parent inode number
f2fs: cleanup the f2fs_bio_alloc routine
f2fs: introduce accessor to retrieve number of dentry slots
f2fs: remove redundant call to f2fs_put_page in delete entry
f2fs: make use of GFP_F2FS_ZERO for setting gfp_mask
f2fs: rewrite f2fs_bio_alloc to make it simpler
f2fs: fix a typo in f2fs documentation
f2fs: remove unused variable
f2fs: move error condition for mkdir at proper place
f2fs: remove unneeded initialization
f2fs: check read only condition before beginning write out
f2fs: remove unneeded memset from init_once
f2fs: show error in case of invalid mount arguments
f2fs: fix the compiler warning for uninitialized use of variable
f2fs: resolve build failures
f2fs: adjust kernel coding style
f2fs: fix endian conversion bugs reported by sparse
f2fs: remove unneeded version.h header file from f2fs.h
f2fs: update the f2fs document
f2fs: update Kconfig and Makefile
...
Conflicts:
include/uapi/linux/magic.h
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs bug fixes from Jaegeuk Kim:
"This patch-set includes two major bug fixes:
- incorrect IUsed provided by *df -i*, and
- lookup failure of parent inodes in corner cases.
[Other Bug Fixes]
- Fix error handling routines
- Trigger recovery process correctly
- Resolve build failures due to missing header files
[Etc]
- Add a MAINTAINERS entry for f2fs
- Fix and clean up variables, functions, and equations
- Avoid warnings during compilation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
f2fs: unify string length declarations and usage
f2fs: clean up unused variables and return values
f2fs: clean up the start_bidx_of_node function
f2fs: remove unneeded variable from f2fs_sync_fs
f2fs: fix fsync_inode list addition logic and avoid invalid access to memory
f2fs: remove unneeded initialization of nr_dirty in dirty_seglist_info
f2fs: handle error from f2fs_iget_nowait
f2fs: fix equation of has_not_enough_free_secs()
f2fs: add MAINTAINERS entry
f2fs: return a default value for non-void function
f2fs: invalidate the node page if allocation is failed
f2fs: add missing #include <linux/prefetch.h>
f2fs: do f2fs_balance_fs in front of dir operations
f2fs: should recover orphan and fsync data
f2fs: fix handling errors got by f2fs_write_inode
f2fs: fix up f2fs_get_parent issue to retrieve correct parent inode number
f2fs: fix wrong calculation on f_files in statfs
f2fs: remove set_page_dirty for atomic f2fs_end_io_write
Merge tag 'f2fs-for-3.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs fixes from Jaegeuk Kim:
o Support swap file and link generic_file_remap_pages
o Enhance the bio streaming flow and free section control
o Major bug fix on recovery routine
o Minor bug/warning fixes and code cleanups
* tag 'f2fs-for-3.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (22 commits)
f2fs: use _safe() version of list_for_each
f2fs: add comments of start_bidx_of_node
f2fs: avoid issuing small bios due to several dirty node pages
f2fs: support swapfile
f2fs: add remap_pages as generic_file_remap_pages
f2fs: add __init to functions in init_f2fs_fs
f2fs: fix the debugfs entry creation path
f2fs: add global mutex_lock to protect f2fs_stat_list
f2fs: remove the blk_plug usage in f2fs_write_data_pages
f2fs: avoid redundant time update for parent directory in f2fs_delete_entry
f2fs: remove redundant call to set_blocksize in f2fs_fill_super
f2fs: move f2fs_balance_fs to punch_hole
f2fs: add f2fs_balance_fs in several interfaces
f2fs: revisit the f2fs_gc flow
f2fs: check return value during recovery
f2fs: avoid null dereference in f2fs_acl_from_disk
f2fs: initialize newly allocated dnode structure
f2fs: update f2fs partition info about SIT/NAT layout
f2fs: update f2fs document to reflect SIT/NAT layout correctly
f2fs: remove unneeded INIT_LIST_HEAD at few places
...
Merge tag 'f2fs-for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs update from Jaegeuk Kim:
"[Major bug fixes]
o Store device file information correctly
o Fix -EIO handling with respect to power-off-recovery
o Allocate blocks with global locks
o Fix wrong calculation of the SSR cost
[Cleanups]
o Get rid of fake on-stack dentries
[Enhancement]
o Support (un)freeze_fs
o Enhance the f2fs_gc flow
o Support 32-bit binary execution on 64-bit kernel"
* tag 'f2fs-for-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (29 commits)
f2fs: avoid build warning
f2fs: add compat_ioctl to provide backward compatability
f2fs: fix calculation of max. gc cost in the SSR case
f2fs: clarify and enhance the f2fs_gc flow
f2fs: optimize the return condition for has_not_enough_free_secs
f2fs: make an accessor to get sections for particular block type
f2fs: mark gc_thread as NULL when thread creation is failed
f2fs: name gc task as per the block device
f2fs: remove unnecessary gc option check and balance_fs
f2fs: remove repeated F2FS_SET_SB_DIRT call
f2fs: when check superblock failed, try to check another superblock
f2fs: use F2FS_BLKSIZE to judge bloksize and page_cache_size
f2fs: add device name in debugfs
f2fs: stop repeated checking if cp is needed
f2fs: avoid balanc_fs during evict_inode
f2fs: remove the use of page_cache_release
f2fs: fix typo mistake for data_version description
f2fs: reorganize code for ra_node_page
f2fs: avoid redundant call to has_not_enough_free_secs in f2fs_gc
f2fs: add un/freeze_fs into super_operations
...
Merge tag 'f2fs-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"This patch-set includes the following major enhancement patches.
- introduce a new global lock scheme
- add tracepoints on several major functions
- fix the overall cleaning process focused on victim selection
- apply the block plugging to merge IOs as much as possible
- enhance management of free nids and its list
- enhance the readahead mode for node pages
- address several critical deadlock conditions
- reduce lock_page calls
The other minor bug fixes and enhancements are as follows.
- calculation mistakes: overflow
- bio types: READ, READA, and READ_SYNC
- fix the recovery flow, data races, and null pointer errors"
* tag 'f2fs-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (68 commits)
f2fs: cover free_nid management with spin_lock
f2fs: optimize scan_nat_page()
f2fs: code cleanup for scan_nat_page() and build_free_nids()
f2fs: bugfix for alloc_nid_failed()
f2fs: recover when journal contains deleted files
f2fs: continue to mount after failing recovery
f2fs: avoid deadlock during evict after f2fs_gc
f2fs: modify the number of issued pages to merge IOs
f2fs: remove useless #include <linux/proc_fs.h> as we're now using sysfs as debug entry.
f2fs: fix inconsistent using of NM_WOUT_THRESHOLD
f2fs: check truncation of mapping after lock_page
f2fs: enhance alloc_nid and build_free_nids flows
f2fs: add a tracepoint on f2fs_new_inode
f2fs: check nid == 0 in add_free_nid
f2fs: add REQ_META about metadata requests for submit
f2fs: give a chance to merge IOs by IO scheduler
f2fs: avoid frequent background GC
f2fs: add tracepoints to debug checkpoint request
f2fs: add tracepoints for write page operations
f2fs: add tracepoints to debug the block allocation
...
Merge tag 'for-f2fs-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"This patch-set includes the following major enhancement patches:
- remount_fs callback function
- restore parent inode number to enhance the fsync performance
- xattr security labels
- reduce the number of redundant lock/unlock data pages
- avoid frequent write_inode calls
The other minor bug fixes are as follows.
- endian conversion bugs
- various bugs in the roll-forward recovery routine"
* tag 'for-f2fs-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (56 commits)
f2fs: fix to recover i_size from roll-forward
f2fs: remove the unused argument "sbi" of func destroy_fsync_dnodes()
f2fs: remove reusing any prefree segments
f2fs: code cleanup and simplify in func {find/add}_gc_inode
f2fs: optimize the init_dirty_segmap function
f2fs: fix an endian conversion bug detected by sparse
f2fs: fix crc endian conversion
f2fs: add remount_fs callback support
f2fs: recover wrong pino after checkpoint during fsync
f2fs: optimize do_write_data_page()
f2fs: make locate_dirty_segment() as static
f2fs: remove unnecessary parameter "offset" from __add_sum_entry()
f2fs: avoid freqeunt write_inode calls
f2fs: optimise the truncate_data_blocks_range() range
f2fs: use the F2FS specific flags in f2fs_ioctl()
f2fs: sync dir->i_size with its block allocation
f2fs: fix i_blocks translation on various types of files
f2fs: set sb->s_fs_info before calling parse_options()
f2fs: support xattr security labels
f2fs: fix iget/iput of dir during recovery
...
Merge tag 'for-f2fs-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"This patch-set includes the following major enhancement patches:
- support inline xattrs
- add sysfs support to control GCs explicitly
- add proc entry to show the current segment usage information
- improve the GC/SSR performance
The other bug fixes are as follows:
- avoid the overflow on status calculation
- fix some error handling routines
- fix inconsistent xattr states after power-off-recovery
- fix incorrect xattr node offset definition
- fix deadlock condition in fsync
- fix the fdatasync routine for power-off-recovery"
* tag 'for-f2fs-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (40 commits)
f2fs: optimize gc for better performance
f2fs: merge more bios of node block writes
f2fs: avoid an overflow during utilization calculation
f2fs: trigger GC when there are prefree segments
f2fs: use strncasecmp() simplify the string comparison
f2fs: fix omitting to update inode page
f2fs: support the inline xattrs
f2fs: add the truncate_xattr_node function
f2fs: introduce __find_xattr for readability
f2fs: reserve the xattr space dynamically
f2fs: add flags for inline xattrs
f2fs: fix error return code in init_f2fs_fs()
f2fs: fix wrong BUG_ON condition
f2fs: fix memory leak when init f2fs filesystem fail
f2fs: fix a compound statement label error
f2fs: avoid writing inode redundantly when creating a file
f2fs: alloc_page() doesn't return an ERR_PTR
f2fs: should cover i_xattr_nid with its xattr node page lock
f2fs: check the free space first in new_node_page
f2fs: clean up the needless end 'return' of void function
...
Merge tag 'for-f2fs-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"This patch-set includes the following major enhancement patches.
- add a sysfs to control reclaiming free segments
- enhance the f2fs global lock procedures
- enhance the victim selection flow
- wait for selected node blocks during fsync
- add some tracepoints
- add a config to remove abundant BUG_ONs
The other bug fixes are as follows.
- fix deadlock on acl operations
- fix some bugs with respect to orphan inodes
And, there are a bunch of cleanups"
* tag 'for-f2fs-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (42 commits)
f2fs: issue more large discard command
f2fs: fix memory leak after kobject init failed in fill_super
f2fs: cleanup waiting routine for writeback pages in cp
f2fs: avoid to use a NULL point in destroy_segment_manager
f2fs: remove unnecessary TestClearPageError when wait pages writeback
f2fs: update f2fs document
f2fs: avoid to wait all the node blocks during fsync
f2fs: check all ones or zeros bitmap with bitops for better mount performance
f2fs: change the method of calculating the number summary blocks
f2fs: fix calculating incorrect free size when update xattr in __f2fs_setxattr
f2fs: add an option to avoid unnecessary BUG_ONs
f2fs: introduce CONFIG_F2FS_CHECK_FS for BUG_ON control
f2fs: fix a deadlock during init_acl procedure
f2fs: clean up acl flow for better readability
f2fs: remove unnecessary segment bitmap updates
f2fs: add tracepoint for vm_page_mkwrite
f2fs: add tracepoint for set_page_dirty
f2fs: remove redundant set_page_dirty from write_compacted_summaries
f2fs: add reclaiming control by sysfs
f2fs: introduce f2fs_balance_fs_bg for some background jobs
...
Change-Id: Ied5488471d49d64ce6abb4be19237c4e90829ff6
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
new file mode 100644
index 0000000..d16645a
--- /dev/null
+++ b/fs/f2fs/data.c
@@ -0,0 +1,785 @@
+/*
+ * fs/f2fs/data.c
+ *
+ * Copyright (c) 2012 Samsung Electronics Co., Ltd.
+ * http://www.samsung.com/
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include <linux/fs.h>
+#include <linux/f2fs_fs.h>
+#include <linux/buffer_head.h>
+#include <linux/mpage.h>
+#include <linux/writeback.h>
+#include <linux/backing-dev.h>
+#include <linux/blkdev.h>
+#include <linux/bio.h>
+#include <linux/prefetch.h>
+
+#include "f2fs.h"
+#include "node.h"
+#include "segment.h"
+#include <trace/events/f2fs.h>
+
+/*
+ * Lock ordering for the change of data block address:
+ * ->data_page
+ * ->node_page
+ * update block addresses in the node page
+ */
+static void __set_data_blkaddr(struct dnode_of_data *dn, block_t new_addr)
+{
+ struct f2fs_node *rn;
+ __le32 *addr_array;
+ struct page *node_page = dn->node_page;
+ unsigned int ofs_in_node = dn->ofs_in_node;
+
+ f2fs_wait_on_page_writeback(node_page, NODE, false);
+
+ rn = F2FS_NODE(node_page);
+
+ /* Get physical address of data block */
+ addr_array = blkaddr_in_node(rn);
+ addr_array[ofs_in_node] = cpu_to_le32(new_addr);
+ set_page_dirty(node_page);
+}
+
+int reserve_new_block(struct dnode_of_data *dn)
+{
+ struct f2fs_sb_info *sbi = F2FS_SB(dn->inode->i_sb);
+
+ if (is_inode_flag_set(F2FS_I(dn->inode), FI_NO_ALLOC))
+ return -EPERM;
+ if (!inc_valid_block_count(sbi, dn->inode, 1))
+ return -ENOSPC;
+
+ trace_f2fs_reserve_new_block(dn->inode, dn->nid, dn->ofs_in_node);
+
+ __set_data_blkaddr(dn, NEW_ADDR);
+ dn->data_blkaddr = NEW_ADDR;
+ sync_inode_page(dn);
+ return 0;
+}
+
+static int check_extent_cache(struct inode *inode, pgoff_t pgofs,
+ struct buffer_head *bh_result)
+{
+ struct f2fs_inode_info *fi = F2FS_I(inode);
+ pgoff_t start_fofs, end_fofs;
+ block_t start_blkaddr;
+
+ read_lock(&fi->ext.ext_lock);
+ if (fi->ext.len == 0) {
+ read_unlock(&fi->ext.ext_lock);
+ return 0;
+ }
+
+ stat_inc_total_hit(inode->i_sb);
+
+ start_fofs = fi->ext.fofs;
+ end_fofs = fi->ext.fofs + fi->ext.len - 1;
+ start_blkaddr = fi->ext.blk_addr;
+
+ if (pgofs >= start_fofs && pgofs <= end_fofs) {
+ unsigned int blkbits = inode->i_sb->s_blocksize_bits;
+ size_t count;
+
+ clear_buffer_new(bh_result);
+ map_bh(bh_result, inode->i_sb,
+ start_blkaddr + pgofs - start_fofs);
+ count = end_fofs - pgofs + 1;
+ if (count < (UINT_MAX >> blkbits))
+ bh_result->b_size = (count << blkbits);
+ else
+ bh_result->b_size = UINT_MAX;
+
+ stat_inc_read_hit(inode->i_sb);
+ read_unlock(&fi->ext.ext_lock);
+ return 1;
+ }
+ read_unlock(&fi->ext.ext_lock);
+ return 0;
+}
+
+void update_extent_cache(block_t blk_addr, struct dnode_of_data *dn)
+{
+ struct f2fs_inode_info *fi = F2FS_I(dn->inode);
+ pgoff_t fofs, start_fofs, end_fofs;
+ block_t start_blkaddr, end_blkaddr;
+
+ f2fs_bug_on(blk_addr == NEW_ADDR);
+ fofs = start_bidx_of_node(ofs_of_node(dn->node_page), fi) +
+ dn->ofs_in_node;
+
+ /* Update the page address in the parent node */
+ __set_data_blkaddr(dn, blk_addr);
+
+ write_lock(&fi->ext.ext_lock);
+
+ start_fofs = fi->ext.fofs;
+ end_fofs = fi->ext.fofs + fi->ext.len - 1;
+ start_blkaddr = fi->ext.blk_addr;
+ end_blkaddr = fi->ext.blk_addr + fi->ext.len - 1;
+
+ /* Drop and initialize the matched extent */
+ if (fi->ext.len == 1 && fofs == start_fofs)
+ fi->ext.len = 0;
+
+ /* Initial extent */
+ if (fi->ext.len == 0) {
+ if (blk_addr != NULL_ADDR) {
+ fi->ext.fofs = fofs;
+ fi->ext.blk_addr = blk_addr;
+ fi->ext.len = 1;
+ }
+ goto end_update;
+ }
+
+ /* Front merge */
+ if (fofs == start_fofs - 1 && blk_addr == start_blkaddr - 1) {
+ fi->ext.fofs--;
+ fi->ext.blk_addr--;
+ fi->ext.len++;
+ goto end_update;
+ }
+
+ /* Back merge */
+ if (fofs == end_fofs + 1 && blk_addr == end_blkaddr + 1) {
+ fi->ext.len++;
+ goto end_update;
+ }
+
+ /* Split the existing extent */
+ if (fi->ext.len > 1 &&
+ fofs >= start_fofs && fofs <= end_fofs) {
+ if ((end_fofs - fofs) < (fi->ext.len >> 1)) {
+ fi->ext.len = fofs - start_fofs;
+ } else {
+ fi->ext.fofs = fofs + 1;
+ fi->ext.blk_addr = start_blkaddr +
+ fofs - start_fofs + 1;
+ fi->ext.len -= fofs - start_fofs + 1;
+ }
+ goto end_update;
+ }
+ write_unlock(&fi->ext.ext_lock);
+ return;
+
+end_update:
+ write_unlock(&fi->ext.ext_lock);
+ sync_inode_page(dn);
+}
+
+struct page *find_data_page(struct inode *inode, pgoff_t index, bool sync)
+{
+ struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
+ struct address_space *mapping = inode->i_mapping;
+ struct dnode_of_data dn;
+ struct page *page;
+ int err;
+
+ page = find_get_page(mapping, index);
+ if (page && PageUptodate(page))
+ return page;
+ f2fs_put_page(page, 0);
+
+ set_new_dnode(&dn, inode, NULL, NULL, 0);
+ err = get_dnode_of_data(&dn, index, LOOKUP_NODE);
+ if (err)
+ return ERR_PTR(err);
+ f2fs_put_dnode(&dn);
+
+ if (dn.data_blkaddr == NULL_ADDR)
+ return ERR_PTR(-ENOENT);
+
+ /* After fallocate(), no page is cached but the block address is NEW_ADDR */
+ if (dn.data_blkaddr == NEW_ADDR)
+ return ERR_PTR(-EINVAL);
+
+ page = grab_cache_page_write_begin(mapping, index, AOP_FLAG_NOFS);
+ if (!page)
+ return ERR_PTR(-ENOMEM);
+
+ if (PageUptodate(page)) {
+ unlock_page(page);
+ return page;
+ }
+
+ err = f2fs_readpage(sbi, page, dn.data_blkaddr,
+ sync ? READ_SYNC : READA);
+ if (sync) {
+ wait_on_page_locked(page);
+ if (!PageUptodate(page)) {
+ f2fs_put_page(page, 0);
+ return ERR_PTR(-EIO);
+ }
+ }
+ return page;
+}
+
+/*
+ * If it tries to access a hole, return an error, because the callers
+ * (functions in dir.c and GC) must be able to tell whether this page
+ * exists or not.
+ */
+struct page *get_lock_data_page(struct inode *inode, pgoff_t index)
+{
+ struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
+ struct address_space *mapping = inode->i_mapping;
+ struct dnode_of_data dn;
+ struct page *page;
+ int err;
+
+repeat:
+ page = grab_cache_page_write_begin(mapping, index, AOP_FLAG_NOFS);
+ if (!page)
+ return ERR_PTR(-ENOMEM);
+
+ set_new_dnode(&dn, inode, NULL, NULL, 0);
+ err = get_dnode_of_data(&dn, index, LOOKUP_NODE);
+ if (err) {
+ f2fs_put_page(page, 1);
+ return ERR_PTR(err);
+ }
+ f2fs_put_dnode(&dn);
+
+ if (dn.data_blkaddr == NULL_ADDR) {
+ f2fs_put_page(page, 1);
+ return ERR_PTR(-ENOENT);
+ }
+
+ if (PageUptodate(page))
+ return page;
+
+ /*
+ * A new dentry page may be allocated but never written, because its
+ * new inode page could not be allocated due to -ENOSPC.
+ * In that case, its blkaddr can remain NEW_ADDR.
+ * See f2fs_add_link -> get_new_data_page -> init_inode_metadata.
+ */
+ if (dn.data_blkaddr == NEW_ADDR) {
+ zero_user_segment(page, 0, PAGE_CACHE_SIZE);
+ SetPageUptodate(page);
+ return page;
+ }
+
+ err = f2fs_readpage(sbi, page, dn.data_blkaddr, READ_SYNC);
+ if (err)
+ return ERR_PTR(err);
+
+ lock_page(page);
+ if (!PageUptodate(page)) {
+ f2fs_put_page(page, 1);
+ return ERR_PTR(-EIO);
+ }
+ if (page->mapping != mapping) {
+ f2fs_put_page(page, 1);
+ goto repeat;
+ }
+ return page;
+}
+
+/*
+ * Caller ensures that this data page is never allocated.
+ * A new zero-filled data page is allocated in the page cache.
+ *
+ * Also, the caller should take and release the operation lock by calling
+ * f2fs_lock_op() and f2fs_unlock_op().
+ * Note that, npage is set only by make_empty_dir.
+ */
+struct page *get_new_data_page(struct inode *inode,
+ struct page *npage, pgoff_t index, bool new_i_size)
+{
+ struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
+ struct address_space *mapping = inode->i_mapping;
+ struct page *page;
+ struct dnode_of_data dn;
+ int err;
+
+ set_new_dnode(&dn, inode, npage, npage, 0);
+ err = get_dnode_of_data(&dn, index, ALLOC_NODE);
+ if (err)
+ return ERR_PTR(err);
+
+ if (dn.data_blkaddr == NULL_ADDR) {
+ if (reserve_new_block(&dn)) {
+ if (!npage)
+ f2fs_put_dnode(&dn);
+ return ERR_PTR(-ENOSPC);
+ }
+ }
+ if (!npage)
+ f2fs_put_dnode(&dn);
+repeat:
+ page = grab_cache_page(mapping, index);
+ if (!page)
+ return ERR_PTR(-ENOMEM);
+
+ if (PageUptodate(page))
+ return page;
+
+ if (dn.data_blkaddr == NEW_ADDR) {
+ zero_user_segment(page, 0, PAGE_CACHE_SIZE);
+ SetPageUptodate(page);
+ } else {
+ err = f2fs_readpage(sbi, page, dn.data_blkaddr, READ_SYNC);
+ if (err)
+ return ERR_PTR(err);
+ lock_page(page);
+ if (!PageUptodate(page)) {
+ f2fs_put_page(page, 1);
+ return ERR_PTR(-EIO);
+ }
+ if (page->mapping != mapping) {
+ f2fs_put_page(page, 1);
+ goto repeat;
+ }
+ }
+
+ if (new_i_size &&
+ i_size_read(inode) < ((index + 1) << PAGE_CACHE_SHIFT)) {
+ i_size_write(inode, ((index + 1) << PAGE_CACHE_SHIFT));
+ /* Only the directory inode sets new_i_size */
+ set_inode_flag(F2FS_I(inode), FI_UPDATE_DIR);
+ mark_inode_dirty_sync(inode);
+ }
+ return page;
+}
+
+static void read_end_io(struct bio *bio, int err)
+{
+ const int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
+ struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1;
+
+ do {
+ struct page *page = bvec->bv_page;
+
+ if (--bvec >= bio->bi_io_vec)
+ prefetchw(&bvec->bv_page->flags);
+
+ if (uptodate) {
+ SetPageUptodate(page);
+ } else {
+ ClearPageUptodate(page);
+ SetPageError(page);
+ }
+ unlock_page(page);
+ } while (bvec >= bio->bi_io_vec);
+ bio_put(bio);
+}
+
+/*
+ * Fill the locked page with data located in the block address.
+ * Return unlocked page.
+ */
+int f2fs_readpage(struct f2fs_sb_info *sbi, struct page *page,
+ block_t blk_addr, int type)
+{
+ struct block_device *bdev = sbi->sb->s_bdev;
+ struct bio *bio;
+
+ trace_f2fs_readpage(page, blk_addr, type);
+
+ down_read(&sbi->bio_sem);
+
+ /* Allocate a new bio */
+ bio = f2fs_bio_alloc(bdev, 1);
+
+ /* Initialize the bio */
+ bio->bi_sector = SECTOR_FROM_BLOCK(sbi, blk_addr);
+ bio->bi_end_io = read_end_io;
+
+ if (bio_add_page(bio, page, PAGE_CACHE_SIZE, 0) < PAGE_CACHE_SIZE) {
+ bio_put(bio);
+ up_read(&sbi->bio_sem);
+ f2fs_put_page(page, 1);
+ return -EFAULT;
+ }
+
+ submit_bio(type, bio);
+ up_read(&sbi->bio_sem);
+ return 0;
+}
+
+/*
+ * This function should be used only by the data read flow, since it
+ * does not honor the "create" flag that requests block allocation.
+ * It exists as a separate read-only variant so that reads can exploit
+ * the VFS readahead mechanism.
+ */
+static int get_data_block_ro(struct inode *inode, sector_t iblock,
+ struct buffer_head *bh_result, int create)
+{
+ unsigned int blkbits = inode->i_sb->s_blocksize_bits;
+ unsigned maxblocks = bh_result->b_size >> blkbits;
+ struct dnode_of_data dn;
+ pgoff_t pgofs;
+ int err;
+
+ /* Get the page offset from the block offset(iblock) */
+ pgofs = (pgoff_t)(iblock >> (PAGE_CACHE_SHIFT - blkbits));
+
+ if (check_extent_cache(inode, pgofs, bh_result)) {
+ trace_f2fs_get_data_block(inode, iblock, bh_result, 0);
+ return 0;
+ }
+
+ /* When reading holes, we need its node page */
+ set_new_dnode(&dn, inode, NULL, NULL, 0);
+ err = get_dnode_of_data(&dn, pgofs, LOOKUP_NODE_RA);
+ if (err) {
+ trace_f2fs_get_data_block(inode, iblock, bh_result, err);
+ return (err == -ENOENT) ? 0 : err;
+ }
+
+ /* It does not support data allocation */
+ f2fs_bug_on(create);
+
+ if (dn.data_blkaddr != NEW_ADDR && dn.data_blkaddr != NULL_ADDR) {
+ int i;
+ unsigned int end_offset;
+
+ end_offset = IS_INODE(dn.node_page) ?
+ ADDRS_PER_INODE(F2FS_I(inode)) :
+ ADDRS_PER_BLOCK;
+
+ clear_buffer_new(bh_result);
+
+ /* Give more consecutive addresses for the read ahead */
+ for (i = 0; i < end_offset - dn.ofs_in_node; i++)
+ if (((datablock_addr(dn.node_page,
+ dn.ofs_in_node + i))
+ != (dn.data_blkaddr + i)) || maxblocks == i)
+ break;
+ map_bh(bh_result, inode->i_sb, dn.data_blkaddr);
+ bh_result->b_size = (i << blkbits);
+ }
+ f2fs_put_dnode(&dn);
+ trace_f2fs_get_data_block(inode, iblock, bh_result, 0);
+ return 0;
+}
+
+static int f2fs_read_data_page(struct file *file, struct page *page)
+{
+ return mpage_readpage(page, get_data_block_ro);
+}
+
+static int f2fs_read_data_pages(struct file *file,
+ struct address_space *mapping,
+ struct list_head *pages, unsigned nr_pages)
+{
+ return mpage_readpages(mapping, pages, nr_pages, get_data_block_ro);
+}
+
+int do_write_data_page(struct page *page)
+{
+ struct inode *inode = page->mapping->host;
+ block_t old_blk_addr, new_blk_addr;
+ struct dnode_of_data dn;
+ int err = 0;
+
+ set_new_dnode(&dn, inode, NULL, NULL, 0);
+ err = get_dnode_of_data(&dn, page->index, LOOKUP_NODE);
+ if (err)
+ return err;
+
+ old_blk_addr = dn.data_blkaddr;
+
+ /* This page is already truncated */
+ if (old_blk_addr == NULL_ADDR)
+ goto out_writepage;
+
+ set_page_writeback(page);
+
+ /*
+ * If the current allocation needs SSR,
+ * it is better to write updated data in place.
+ */
+ if (unlikely(old_blk_addr != NEW_ADDR &&
+ !is_cold_data(page) &&
+ need_inplace_update(inode))) {
+ rewrite_data_page(F2FS_SB(inode->i_sb), page,
+ old_blk_addr);
+ } else {
+ write_data_page(inode, page, &dn,
+ old_blk_addr, &new_blk_addr);
+ update_extent_cache(new_blk_addr, &dn);
+ }
+out_writepage:
+ f2fs_put_dnode(&dn);
+ return err;
+}
+
+static int f2fs_write_data_page(struct page *page,
+ struct writeback_control *wbc)
+{
+ struct inode *inode = page->mapping->host;
+ struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
+ loff_t i_size = i_size_read(inode);
+ const pgoff_t end_index = ((unsigned long long) i_size)
+ >> PAGE_CACHE_SHIFT;
+ unsigned offset;
+ bool need_balance_fs = false;
+ int err = 0;
+
+ if (page->index < end_index)
+ goto write;
+
+ /*
+ * If the offset is out-of-range of file size,
+ * this page does not have to be written to disk.
+ */
+ offset = i_size & (PAGE_CACHE_SIZE - 1);
+ if ((page->index >= end_index + 1) || !offset) {
+ if (S_ISDIR(inode->i_mode)) {
+ dec_page_count(sbi, F2FS_DIRTY_DENTS);
+ inode_dec_dirty_dents(inode);
+ }
+ goto out;
+ }
+
+ zero_user_segment(page, offset, PAGE_CACHE_SIZE);
+write:
+ if (sbi->por_doing) {
+ err = AOP_WRITEPAGE_ACTIVATE;
+ goto redirty_out;
+ }
+
+ /* Dentry blocks are controlled by checkpoint */
+ if (S_ISDIR(inode->i_mode)) {
+ dec_page_count(sbi, F2FS_DIRTY_DENTS);
+ inode_dec_dirty_dents(inode);
+ err = do_write_data_page(page);
+ } else {
+ f2fs_lock_op(sbi);
+ err = do_write_data_page(page);
+ f2fs_unlock_op(sbi);
+ need_balance_fs = true;
+ }
+ if (err == -ENOENT)
+ goto out;
+ else if (err)
+ goto redirty_out;
+
+ if (wbc->for_reclaim)
+ f2fs_submit_bio(sbi, DATA, true);
+
+ clear_cold_data(page);
+out:
+ unlock_page(page);
+ if (need_balance_fs)
+ f2fs_balance_fs(sbi);
+ return 0;
+
+redirty_out:
+ wbc->pages_skipped++;
+ set_page_dirty(page);
+ return err;
+}
+
+#define MAX_DESIRED_PAGES_WP 4096
+
+static int __f2fs_writepage(struct page *page, struct writeback_control *wbc,
+ void *data)
+{
+ struct address_space *mapping = data;
+ int ret = mapping->a_ops->writepage(page, wbc);
+ mapping_set_error(mapping, ret);
+ return ret;
+}
+
+static int f2fs_write_data_pages(struct address_space *mapping,
+ struct writeback_control *wbc)
+{
+ struct inode *inode = mapping->host;
+ struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
+ bool locked = false;
+ int ret;
+ long excess_nrtw = 0, desired_nrtw;
+
+ /* deal with chardevs and other special files */
+ if (!mapping->a_ops->writepage)
+ return 0;
+
+ if (wbc->nr_to_write < MAX_DESIRED_PAGES_WP) {
+ desired_nrtw = MAX_DESIRED_PAGES_WP;
+ excess_nrtw = desired_nrtw - wbc->nr_to_write;
+ wbc->nr_to_write = desired_nrtw;
+ }
+
+ if (!S_ISDIR(inode->i_mode)) {
+ mutex_lock(&sbi->writepages);
+ locked = true;
+ }
+ ret = write_cache_pages(mapping, wbc, __f2fs_writepage, mapping);
+ if (locked)
+ mutex_unlock(&sbi->writepages);
+ f2fs_submit_bio(sbi, DATA, (wbc->sync_mode == WB_SYNC_ALL));
+
+ remove_dirty_dir_inode(inode);
+
+ wbc->nr_to_write -= excess_nrtw;
+ return ret;
+}
+
+static int f2fs_write_begin(struct file *file, struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned flags,
+ struct page **pagep, void **fsdata)
+{
+ struct inode *inode = mapping->host;
+ struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
+ struct page *page;
+ pgoff_t index = ((unsigned long long) pos) >> PAGE_CACHE_SHIFT;
+ struct dnode_of_data dn;
+ int err = 0;
+
+ f2fs_balance_fs(sbi);
+repeat:
+ page = grab_cache_page_write_begin(mapping, index, flags);
+ if (!page)
+ return -ENOMEM;
+ *pagep = page;
+
+ f2fs_lock_op(sbi);
+
+ set_new_dnode(&dn, inode, NULL, NULL, 0);
+ err = get_dnode_of_data(&dn, index, ALLOC_NODE);
+ if (err)
+ goto err;
+
+ if (dn.data_blkaddr == NULL_ADDR)
+ err = reserve_new_block(&dn);
+
+ f2fs_put_dnode(&dn);
+ if (err)
+ goto err;
+
+ f2fs_unlock_op(sbi);
+
+ if ((len == PAGE_CACHE_SIZE) || PageUptodate(page))
+ return 0;
+
+ if ((pos & PAGE_CACHE_MASK) >= i_size_read(inode)) {
+ unsigned start = pos & (PAGE_CACHE_SIZE - 1);
+ unsigned end = start + len;
+
+ /* Reading beyond i_size is simple: memset to zero */
+ zero_user_segments(page, 0, start, end, PAGE_CACHE_SIZE);
+ goto out;
+ }
+
+ if (dn.data_blkaddr == NEW_ADDR) {
+ zero_user_segment(page, 0, PAGE_CACHE_SIZE);
+ } else {
+ err = f2fs_readpage(sbi, page, dn.data_blkaddr, READ_SYNC);
+ if (err)
+ return err;
+ lock_page(page);
+ if (!PageUptodate(page)) {
+ f2fs_put_page(page, 1);
+ return -EIO;
+ }
+ if (page->mapping != mapping) {
+ f2fs_put_page(page, 1);
+ goto repeat;
+ }
+ }
+out:
+ SetPageUptodate(page);
+ clear_cold_data(page);
+ return 0;
+
+err:
+ f2fs_unlock_op(sbi);
+ f2fs_put_page(page, 1);
+ return err;
+}
+
+static int f2fs_write_end(struct file *file,
+ struct address_space *mapping,
+ loff_t pos, unsigned len, unsigned copied,
+ struct page *page, void *fsdata)
+{
+ struct inode *inode = page->mapping->host;
+
+ SetPageUptodate(page);
+ set_page_dirty(page);
+
+ if (pos + copied > i_size_read(inode)) {
+ i_size_write(inode, pos + copied);
+ mark_inode_dirty(inode);
+ update_inode_page(inode);
+ }
+
+ unlock_page(page);
+ page_cache_release(page);
+ return copied;
+}
+
+static ssize_t f2fs_direct_IO(int rw, struct kiocb *iocb,
+ const struct iovec *iov, loff_t offset, unsigned long nr_segs)
+{
+ struct file *file = iocb->ki_filp;
+ struct inode *inode = file->f_mapping->host;
+
+ if (rw == WRITE)
+ return 0;
+
+ /* Needs synchronization with the cleaner */
+ return blockdev_direct_IO(rw, iocb, inode, iov, offset, nr_segs,
+ get_data_block_ro);
+}
+
+static void f2fs_invalidate_data_page(struct page *page, unsigned long offset)
+{
+ struct inode *inode = page->mapping->host;
+ struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
+ if (S_ISDIR(inode->i_mode) && PageDirty(page)) {
+ dec_page_count(sbi, F2FS_DIRTY_DENTS);
+ inode_dec_dirty_dents(inode);
+ }
+ ClearPagePrivate(page);
+}
+
+static int f2fs_release_data_page(struct page *page, gfp_t wait)
+{
+ ClearPagePrivate(page);
+ return 1;
+}
+
+static int f2fs_set_data_page_dirty(struct page *page)
+{
+ struct address_space *mapping = page->mapping;
+ struct inode *inode = mapping->host;
+
+ trace_f2fs_set_page_dirty(page, DATA);
+
+ SetPageUptodate(page);
+ if (!PageDirty(page)) {
+ __set_page_dirty_nobuffers(page);
+ set_dirty_dir_page(inode, page);
+ return 1;
+ }
+ return 0;
+}
+
+static sector_t f2fs_bmap(struct address_space *mapping, sector_t block)
+{
+ return generic_block_bmap(mapping, block, get_data_block_ro);
+}
+
+const struct address_space_operations f2fs_dblock_aops = {
+ .readpage = f2fs_read_data_page,
+ .readpages = f2fs_read_data_pages,
+ .writepage = f2fs_write_data_page,
+ .writepages = f2fs_write_data_pages,
+ .write_begin = f2fs_write_begin,
+ .write_end = f2fs_write_end,
+ .set_page_dirty = f2fs_set_data_page_dirty,
+ .invalidatepage = f2fs_invalidate_data_page,
+ .releasepage = f2fs_release_data_page,
+ .direct_IO = f2fs_direct_IO,
+ .bmap = f2fs_bmap,
+};
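
Note on update_extent_cache() in fs/f2fs/data.c: the single-extent merge
and split rules can be tried outside the kernel. The sketch below is a
hypothetical userspace rendering, not part of the patch. The names
struct extent, extent_update() and extent_selfcheck() are invented here,
the ext_lock locking and the sync_inode_page() call are left out, and
NULL_ADDR is assumed to be 0.

```c
#include <assert.h>
#include <stdint.h>

#define NULL_ADDR 0u	/* assumption: sentinel for "no block" */

/* Hypothetical stand-in for the one cached extent kept per inode. */
struct extent {
	uint32_t fofs;		/* first file offset covered, in blocks */
	uint32_t blk_addr;	/* block address that backs fofs */
	uint32_t len;		/* contiguous blocks covered, 0 = empty */
};

/*
 * Record that file offset fofs now lives at blk_addr.
 * Returns 1 when one of the cases below handled the update (the point
 * where the patch goes on to sync the inode page), 0 when the extent
 * was left untouched.
 */
static int extent_update(struct extent *ext, uint32_t fofs, uint32_t blk_addr)
{
	uint32_t start_fofs = ext->fofs;
	uint32_t end_fofs = ext->fofs + ext->len - 1;
	uint32_t start_blk = ext->blk_addr;
	uint32_t end_blk = ext->blk_addr + ext->len - 1;

	/* A one-block extent that is being overwritten is dropped. */
	if (ext->len == 1 && fofs == start_fofs)
		ext->len = 0;

	/* Empty cache: start a new one-block extent if the block is valid. */
	if (ext->len == 0) {
		if (blk_addr != NULL_ADDR) {
			ext->fofs = fofs;
			ext->blk_addr = blk_addr;
			ext->len = 1;
		}
		return 1;
	}

	/* Front merge: the new block sits directly before the extent. */
	if (fofs == start_fofs - 1 && blk_addr == start_blk - 1) {
		ext->fofs--;
		ext->blk_addr--;
		ext->len++;
		return 1;
	}

	/* Back merge: the new block sits directly after the extent. */
	if (fofs == end_fofs + 1 && blk_addr == end_blk + 1) {
		ext->len++;
		return 1;
	}

	/* Split: an offset inside the extent moved; keep one side only. */
	if (ext->len > 1 && fofs >= start_fofs && fofs <= end_fofs) {
		if ((end_fofs - fofs) < (ext->len >> 1)) {
			/* Keep the part in front of fofs. */
			ext->len = fofs - start_fofs;
		} else {
			/* Keep the part behind fofs. */
			ext->fofs = fofs + 1;
			ext->blk_addr = start_blk + fofs - start_fofs + 1;
			ext->len -= fofs - start_fofs + 1;
		}
		return 1;
	}
	return 0;
}

/* Small self-check that exercises both split directions. */
static void extent_selfcheck(void)
{
	struct extent e = { 10, 100, 8 };	/* file blocks 10..17 */

	/* Overwrite near the tail: the front part 10..15 survives. */
	assert(extent_update(&e, 16, 500) == 1);
	assert(e.fofs == 10 && e.blk_addr == 100 && e.len == 6);

	/* Overwrite near the head: the back part 12..15 survives. */
	assert(extent_update(&e, 11, 501) == 1);
	assert(e.fofs == 12 && e.blk_addr == 102 && e.len == 4);
}
```

Calling extent_selfcheck() from a test main() is enough to exercise the
split cases. Because only one extent is cached per inode, a split throws
away one side instead of tracking both, which keeps the lookup in
check_extent_cache() a single range comparison.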