summaryrefslogtreecommitdiff
path: root/fs/f2fs
Commit message (Collapse)AuthorAgeFilesLines
* f2fs: avoid to ra unneeded blocks in recover flowChao Yu2014-12-083-18/+23
| | | | | | | | | | | To improve recovery speed, f2fs try to readahead many contiguous blocks in warm node segment, but for most time, abnormal power-off do not occur frequently, so when mount a normal power-off f2fs image, by contrary ra so many blocks and then invalid them will hurt the performance of mount. It's better to just ra the first next-block for normal condition. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: introduce is_valid_blkaddr to cleanup codes in ra_meta_pagesChao Yu2014-12-081-27/+26
| | | | | | | | | This patch does cleanup work, it introduces is_valid_blkaddr() to include verification code for blkaddr with upper and down boundary value which were in ra_meta_pages previous. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to enable readahead for SSA/CP blocksChao Yu2014-12-081-2/+15
| | | | | | | | | | | | 1.We use zero as upper boundary value for ra SSA/CP blocks, we will skip readahead as verification failure with max number, it causes low performance. 2.Low boundary value is not accurate for SSA/CP/POR region verification, so these values need to be redefined. This patch fixes above issues. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: use atomic for counting inode with inline_{dir,inode} flagChao Yu2014-12-082-8/+11
| | | | | | | | As inline_{dir,inode} stat is increased/decreased concurrently by multi threads, so the value is not so accurate, let's use atomic type for counting accurately. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: cleanup path to need cp at fsyncChangman Lee2014-12-081-36/+43
| | | | | | | | Added some commentaries for code readability and cleaned up if-statement clearly. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: check if inode state is dirty at fsyncChangman Lee2014-12-081-6/+19
| | | | | | | | If inode state is dirty, go straight to write. Suggested-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: count the number of inmemory pagesJaegeuk Kim2014-12-083-1/+8
| | | | | | This patch adds counting # of inmemory pages in the page cache. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: release inmemory pages when the file was closedJaegeuk Kim2014-12-081-0/+9
| | | | | | If file is closed, let's drop inmemory pages. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: set page private for inmemory pages for truncationJaegeuk Kim2014-12-081-0/+2
| | | | | | | The inmemory pages should be handled by invalidate_page since it needs to be released int the truncation path. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: count inline_xx in do_read_inodeJaegeuk Kim2014-12-081-2/+4
| | | | | | | | In do_read_inode, if we failed __recover_inline_status, the inode has inline flag without increasing its count. Later, f2fs_evict_inode will decrease the count, which causes -1. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: do retry operations with cond_reschedJaegeuk Kim2014-12-084-38/+20
| | | | | | | | | | This patch revists retrial paths in f2fs. The basic idea is to use cond_resched instead of retrying from the very early stage. Suggested-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: call radix_tree_preload before radix_tree_insertJaegeuk Kim2014-12-053-6/+19
| | | | | | | | | | | | | | | | | | | | This patch tries to fix: BUG: using smp_processor_id() in preemptible [00000000] code: f2fs_gc-254:0/384 (radix_tree_node_alloc+0x14/0x74) from [<c033d8a0>] (radix_tree_insert+0x110/0x200) (radix_tree_insert+0x110/0x200) from [<c02e8264>] (gc_data_segment+0x340/0x52c) (gc_data_segment+0x340/0x52c) from [<c02e8658>] (f2fs_gc+0x208/0x400) (f2fs_gc+0x208/0x400) from [<c02e8a98>] (gc_thread_func+0x248/0x28c) (gc_thread_func+0x248/0x28c) from [<c0139944>] (kthread+0xa0/0xac) (kthread+0xa0/0xac) from [<c0105ef8>] (ret_from_fork+0x14/0x3c) The reason is that f2fs calls radix_tree_insert under enabled preemption. So, before calling it, we need to call radix_tree_preload. Otherwise, we should use _GFP_WAIT for the radix tree, and use mutex or semaphore to cover the radix tree operations. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: use rw_semaphore for nat entry lockJaegeuk Kim2014-12-032-27/+27
| | | | | | | | | | | Previoulsy, we used rwlock for nat_entry lock. But, now we have a lot of complex operations in set_node_addr. (e.g., allocating kernel memories, handling radix_trees, and so on) So, this patches tries to change spinlock to rw_semaphore to give CPUs to other threads. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix missing kmem_cache_freeJaegeuk Kim2014-12-031-1/+1
| | | | | | This patch fixes missing kmem_cache_free when handling errors. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: more fast lookup for gc_inode listChangman Lee2014-12-022-19/+34
| | | | | | | | | If there are many inodes that have data blocks in victim segment, it takes long time to find a inode in gc_inode list. Let's use radix_tree to reduce lookup time. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: cleanup redundant macroChangman Lee2014-12-011-3/+3
| | | | | | | We've already made fi and sbi for inode. Let's avoid duplicated work. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to return correct error number in f2fs_write_beginChao Yu2014-12-011-1/+3
| | | | | | | Fix the wrong error number in error path of f2fs_write_begin. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: cleanup if-statement of phase in gc_data_segmentChangman Lee2014-11-271-16/+16
| | | | | | | | Little cleanup to distinguish each phase easily Signed-off-by: Changman Lee <cm224.lee@samsung.com> [Jaegeuk Kim: modify indentation for code readability] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to recover converted inline_dataJaegeuk Kim2014-11-251-0/+3
| | | | | | | | If an inode has converted inline_data which was written to the disk, we should set its inode flag for further fsync so that this inline_data can be recovered from sudden power off. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: make clean the page before writingJaegeuk Kim2014-11-251-1/+6
| | | | | | If a page is set to be written to the disk, we can make clean the page. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: no more dirty_nat_entires when flushingChangman Lee2014-11-251-4/+4
| | | | | | | | After flushing dirty nat entries, it has to be no more dirty nat entries. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: check dirty_nat_cnt before flushing nat entries in journalChangman Lee2014-11-251-4/+3
| | | | | | | | | It's meaningless to check dirty_nat_cnt after re-dirtying nat entries in journal. And although there are rooms for dirty nat entires if dirty_nat_cnt is zero, it's also meaningless to check __has_cursum_space. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix deadlock during inline_data conversionJaegeuk Kim2014-11-251-14/+14
| | | | | | | | | | | | | | | | | | | | A deadlock can be occurred: Thread 1] Thread 2] - f2fs_write_data_pages - f2fs_write_begin - lock_page(page #0) - grab_cache_page(page #X) - get_node_page(inode_page) - grab_cache_page(page #0) : to convert inline_data - f2fs_write_data_page - f2fs_write_inline_data - get_node_page(inode_page) In this case, trying to lock inode_page and page #0 causes deadlock. In order to avoid this, this patch adds a rule for this locking policy, which is that page #0 should be locked followed by inode_page lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix typos for the word "destroy" in jump labelsMarkus Elfring2014-11-251-4/+4
| | | | | | | | | | Two jump labels were adjusted in the implementation of the create_node_manager_caches() function because these identifiers contained typos. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix livelock calling f2fs_iget during f2fs_evict_inodeJaegeuk Kim2014-11-231-1/+10
| | | | | | | | | | | | | | | | | | | | | In f2fs_evict_inode, commit_inmemory_pages f2fs_gc f2fs_iget iget_locked -> wait for inode free Here, if the inode is same as the one to be evicted, f2fs should wait forever. Actually, we should not call f2fs_balance_fs during f2fs_evict_inode to avoid this. But, the commit_inmem_pages calls f2fs_balance_fs by default, even if f2fs_evict_inode wants to free inmemory pages only. Hence, this patch adds to trigger f2fs_balance_fs only when there is something to write. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: introduce f2fs_dentry_kunmap to clean upJaegeuk Kim2014-11-234-24/+18
| | | | | | This patch introduces f2fs_dentry_kunmap to clean up dirty codes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix wrong data structure when create slabChangman Lee2014-11-231-1/+1
| | | | | | | | It used nat_entry_set when create slab for sit_entry_set. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: call flush_dcache_page when the page was updatedJaegeuk Kim2014-11-231-0/+1
| | | | | | Whenever f2fs updates mapped pages, it needs to call flush_dcache_page. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: write SSA pages under memory pressureJaegeuk Kim2014-11-191-1/+4
| | | | | | Under memory pressure, we don't need to skip SSA page writes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: submit bio for node blocks in the reclaim pathJaegeuk Kim2014-11-191-0/+4
| | | | | | | If a node page is request to be written during the reclaiming path, we should submit the bio to avoid pending to recliam it. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: introduce struct inode_management to wrap inner fieldsChao Yu2014-11-194-49/+66
| | | | | | | | | | | | | | | | | | | | | | | | | Now in f2fs, we have three inode cache: ORPHAN_INO, APPEND_INO, UPDATE_INO, and we manage fields related to inode cache separately in struct f2fs_sb_info for each inode cache type. This makes codes a bit messy, so that this patch intorduce a new struct inode_management to wrap inner fields as following which make codes more neat. /* for inner inode cache management */ struct inode_management { struct radix_tree_root ino_root; /* ino entry array */ spinlock_t ino_lock; /* for ino entry lock */ struct list_head ino_list; /* inode list head */ unsigned long ino_num; /* number of entries */ }; struct f2fs_sb_info { ... struct inode_management im[MAX_INO_ENTRY]; /* manage inode cache */ ... } Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: remove unneeded check code with option in f2fs_remountChao Yu2014-11-191-2/+2
| | | | | | | | | Because we have checked the contrary condition in case of "if" judgment, we do not need to check the condition again in case of "else" judgment. Let's remove it. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid unable to restart gc thread in remountChao Yu2014-11-192-3/+1
| | | | | | | | | | | | In f2fs_remount, we will stop gc thread and set need_restart_gc as true when new option is set without BG_GC, then if any error occurred in the following procedure, we can restore to start the gc thread. But after that, We will fail to restore gc thread in start_gc_thread as BG_GC is not set in new option, so we'd better move this condition judgment out of start_gc_thread to fix this issue. Signed-off-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: put the inode page when error was occurredJaegeuk Kim2014-11-181-4/+6
| | | | | | We should put the inode page when error was occurred. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix to call put_page at the error handling routineJaegeuk Kim2014-11-181-3/+3
| | | | | | | The locked page should be released before returning the function. Reviewed-by: Chao Yu <chao2.yu@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: convert inline_data when i_size becomes largeJaegeuk Kim2014-11-112-0/+9
| | | | | | | | | | If i_size becomes large outside of MAX_INLINE_DATA, we shoud convert the inode. Otherwise, we can make some dirty pages during the truncation, and those pages will be written through f2fs_write_data_page. At that moment, the inode has still inline_data, so that it tries to write non- zero pages into inline_data area. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: fix deadlock to grab 0'th data pageJaegeuk Kim2014-11-111-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | The scenario is like this. One trhead triggers: f2fs_write_data_pages lock_page f2fs_write_data_page f2fs_lock_op <- wait The other thread triggers: f2fs_truncate truncate_blocks f2fs_lock_op truncate_partial_data_page lock_page <- wait for locking the page This patch resolves this bug by relocating truncate_partial_data_page. This function is just to truncate user data page and not related to FS consistency as well. And, we don't need to call truncate_inline_data. Rather than that, f2fs_write_data_page will finally update inline_data later. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: reduce the number of inline_data inode before clearing itJaegeuk Kim2014-11-101-1/+1
| | | | | | | The # of inline_data inode is decreased only when it has inline_data. After clearing the flag, we can't decreased the number. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: implement -o dirsyncJaegeuk Kim2014-11-101-0/+24
| | | | | | | If a mount option has dirsync, we should call checkpoint for all the directory operations. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: do not skip any writes under memory pressureJaegeuk Kim2014-11-101-0/+3
| | | | | | Under memory pressure, let's avoid skipping data writes. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: write node pages if checkpoint is not doingJaegeuk Kim2014-11-101-4/+6
| | | | | | | | It needs to write node pages if checkpoint is not doing in order to avoid memory pressure. Reviewed-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: control the memory footprint used by ino entriesJaegeuk Kim2014-11-063-8/+26
| | | | | | | This patch adds to control the memory footprint used by ino entries. This will conduct best effort, not strictly. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: introduce the number of inode entriesJaegeuk Kim2014-11-063-14/+19
| | | | | | This patch adds to monitor the number of ino entries. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: disable roll-forward when active_logs = 2Jaegeuk Kim2014-11-052-2/+4
| | | | | | | The roll-forward mechanism should be activated when the number of active logs is not 2. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: introduce -o fastboot for reducing booting time onlyJaegeuk Kim2014-11-044-6/+16
| | | | | | | | | If a system wants to reduce the booting time as a top priority, now we can use a mount option, -o fastboot. With this option, f2fs conducts a little bit slow write_checkpoint, but it can avoid the node page reads during the next mount time. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: avoid race condition in handling wait_ioJaegeuk Kim2014-11-043-29/+10
| | | | | | | | | | | | | | | | __submit_merged_bio f2fs_write_end_io f2fs_write_end_io wait_io = X wait_io = x complete(X) complete(X) wait_io = NULL wait_for_completion() free(X) spin_lock(X) kernel panic In order to avoid this, this patch removes the wait_io facility. Instead, we can use wait_on_all_pages_writeback(sbi) to wait for end_ios. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: send discard commands in larger extentJaegeuk Kim2014-11-041-17/+27
| | | | | | | | If there is a chance to make a huge sized discard command, we don't need to split it out, since each blkdev_issue_discard should wait one at a time. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: revisit inline_data to avoid data races and potential bugsJaegeuk Kim2014-11-046-212/+250
| | | | | | | | | | | This patch simplifies the inline_data usage with the following rule. 1. inline_data is set during the file creation. 2. If new data is requested to be written ranges out of inline_data, f2fs converts that inode permanently. 3. There is no cases which converts non-inline_data inode to inline_data. 4. The inline_data flag should be changed under inode page lock. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: remove pointless bit testing in f2fs_delete_entry()Jan Kara2014-11-031-1/+1
| | | | | | | | | There's no point in using test_and_clear_bit_le() when we don't use the return value of the function. Just use clear_bit_le() instead. Coverity-id: 1016434 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
* f2fs: do not discard data protected by the previous checkpointJaegeuk Kim2014-11-031-1/+1
| | | | | | | We should not discard any data protected by the previous checkpoint all the time. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>