summaryrefslogtreecommitdiff
path: root/fs
Commit message (Collapse)AuthorAgeFilesLines
* NFSv4.1: Clean ups and bugfixes for the pNFS read/writeback/commit codeTrond Myklebust2012-03-176-97/+172
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move more pnfs-isms out of the generic commit code. Bugfixes: - filelayout_scan_commit_lists doesn't need to get/put the lseg. In fact since it is run under the inode->i_lock, the lseg_put() can deadlock. - Ensure that we distinguish between what needs to be done for commit-to-data server and what needs to be done for commit-to-MDS using the new flag PG_COMMIT_TO_DS. Otherwise we may end up calling put_lseg() on a bucket for a struct nfs_page that got written through the MDS. - Fix a case where we were using list_del() on an nfs_page->wb_list instead of list_del_init(). - filelayout_initiate_commit needs to call filelayout_commit_release on error instead of the mds_ops->rpc_release(). Otherwise it won't clear the commit lock. Cleanups: - Let the files layout manage the commit lists for the pNFS case. Don't expose stuff like pnfs_choose_commit_list, and the fact that the commit buckets hold references to the layout segment in common code. - Cast out the put_lseg() calls for the struct nfs_read/write_data->lseg into the pNFS layer from whence they came. - Let the pNFS layer manage the NFS_INO_PNFS_COMMIT bit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Fred Isaman <iisaman@netapp.com>
* NFS: Fix a compile error when !defined NFS_DEBUGTrond Myklebust2012-03-141-1/+1
| | | | | | | | | We should use the 'ifdebug' wrapper rather than trying to inline tests of nfs_debug, so that the code compiles correctly when we don't define NFS_DEBUG. Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Rate limit the state manager for lock reclaim warning messagesWilliam Dauchy2012-03-141-1/+2
| | | | | | | Adding rate limit on `Lock reclaim failed` messages since it could fill up system logs Signed-off-by: William Dauchy <wdauchy@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* pnfs-obj: Uglify objio_segment allocation for the sake of the principle :-(Boaz Harrosh2012-03-131-14/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | At some past instance Linus Trovalds wrote: > From: Linus Torvalds <torvalds@linux-foundation.org> > commit a84a79e4d369a73c0130b5858199e949432da4c6 upstream. > > The size is always valid, but variable-length arrays generate worse code > for no good reason (unless the function happens to be inlined and the > compiler sees the length for the simple constant it is). > > Also, there seems to be some code generation problem on POWER, where > Henrik Bakken reports that register r28 can get corrupted under some > subtle circumstances (interrupt happening at the wrong time?). That all > indicates some seriously broken compiler issues, but since variable > length arrays are bad regardless, there's little point in trying to > chase it down. > > "Just don't do that, then". Since then any use of "variable length arrays" has become blasphemous. Even in perfectly good, beautiful, perfectly safe code like the one below where the variable length arrays are only used as a sizeof() parameter, for type-safe dynamic structure allocations. GCC is not executing any stack allocation code. I have produced a small file which defines two functions main1(unsigned numdevs) and main2(unsigned numdevs). main1 uses code as before with call to malloc and main2 uses code as of after this patch. I compiled it as: gcc -O2 -S see_asm.c and here is what I get: <see_asm.s> main1: .LFB7: .cfi_startproc mov %edi, %edi leaq 4(%rdi,%rdi), %rdi salq $3, %rdi jmp malloc .cfi_endproc .LFE7: .size main1, .-main1 .p2align 4,,15 .globl main2 .type main2, @function main2: .LFB8: .cfi_startproc mov %edi, %edi addq $2, %rdi salq $4, %rdi jmp malloc .cfi_endproc .LFE8: .size main2, .-main2 .section .text.startup,"ax",@progbits .p2align 4,,15 </see_asm.s> *Exact* same code !!! So please seriously consider not accepting this patch and leave the perfectly good code intact. CC: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: null dereference in dev_remove()Dan Carpenter2012-03-131-1/+1
| | | | | | | | | | | In commit 5ffaf85541 "NFS: replace global bl_wq with per-net one" we made "msg" a pointer instead of a struct stored in stack memory. But we forgot to change the memset() here so we're still clearing stack memory instead clearing the struct like we intended. It will lead to a kernel crash. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Rate limit the state manager warning messagesTrond Myklebust2012-03-123-4/+6
| | | | | | | | Prevent the state manager from filling up system logs when recovery fails on the server. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
* NFS: Check return value from rpc_queue_upcall()Bryan Schumaker2012-03-121-2/+7
| | | | | | | | | This function could fail to queue the upcall if rpc.idmapd is not running, causing a warning message to be printed. Instead, I want to check the return value and revoke the key if the upcall can't be run. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Only define some function when v4.1 is enabledBryan Schumaker2012-03-121-0/+4
| | | | | | | | Now that the nfs4_cb_match_client() function is static, gcc notices that it is only used when CONFIG_NFS_V4_1 is enabled. Signed-off-by: Bryan Schumaker <bjschuma@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix a number of sparse warningsTrond Myklebust2012-03-1113-33/+38
| | | | | | | | | | | | | Fix a number of "warning: symbol 'foo' was not declared. Should it be static?" conditions. Fix 2 cases of "warning: Using plain integer as NULL pointer" fs/nfs/delegation.c:263:31: warning: restricted fmode_t degrades to integer - We want to allow upgrades to a WRITE delegation, but should otherwise consider servers that hand out duplicate delegations to be borken. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: replace global bl_wq with per-net oneStanislav Kinsbursky2012-03-115-31/+39
| | | | | | | | | | | | | This queue is used for sleeping in kernel and it have to be per-net since we don't want to wake any other waiters except in out network nemespace. BTW, move wq to per-net data is easy. But some way to handle upcall timeouts have to be provided. On message destroy in case of timeout, tasks, waiting for message to be delivered, should be awakened. Thus, some data required to located the right wait queue. Chosen solution replaces rpc_pipe_msg object with new introduced bl_pipe_msg object, containing rpc_pipe_msg and proper wq. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: replace global bl_mount_reply with per-net oneStanislav Kinsbursky2012-03-113-9/+11
| | | | | | | | This global variable is used for blocklayout downcall and thus can be corrupted if case of existence of multiple networks namespaces. Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: remove nfs_inode radix treeFred Isaman2012-03-107-177/+206
| | | | | | | | The radix tree is only being used to compile lists of reqs needing commit. It is simpler to just put the reqs directly into a list. Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: remove NFS_PAGE_TAG_LOCKEDFred Isaman2012-03-102-42/+10
| | | | | | | | The last real use of this tag was removed by commit 7f2f12d963 NFS: Simplify nfs_wb_page() Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.0: Re-establish the callback channel on NFS4ERR_CB_PATHDOWNTrond Myklebust2012-03-103-20/+16
| | | | | | | | | | | | | | When the NFSv4.0 server tells us that it can no-longer talk to us on the callback channel, we should attempt a new SETCLIENTID in order to re-transmit the callback channel information. Note that as long as we do not change the boot verifier, this is a safe procedure; the server is required to keep our state. Also move the function nfs_handle_cb_pathdown to fs/nfs/nfs4state.c, and change the name in order to mark it as being specific to NFSv4.0. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Clean up nfs4_select_rw_stateid()Trond Myklebust2012-03-084-20/+49
| | | | | | | Ensure that we select delegation stateids first, then lock stateids and then open stateids. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Don't copy read delegation stateids in setattrTrond Myklebust2012-03-083-8/+12
| | | | | | The server will just return an NFS4ERR_OPENMODE anyway. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1 cleanup DS stateid error handlingAndy Adamson2012-03-081-7/+2
| | | | | | | | The error handler nfs4_state parameter is never NULL in the pNFS case as the open_context must carry an nfs_state. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Return the delegation if the server returns NFS4ERR_OPENMODETrond Myklebust2012-03-072-1/+13
| | | | | | | | | | If a setattr() fails because of an NFS4ERR_OPENMODE error, it is probably due to us holding a read delegation. Ensure that the recovery routines return that delegation in this case. Reported-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
* NFSv4: Don't free the nfs4_lock_state until after the release_lockownerTrond Myklebust2012-03-073-15/+28
| | | | | | | Otherwise we can end up with sequence id problems if the client reuses the owner_id before the server has processed the release_lockowner Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1 handle DS stateid errorsAndy Adamson2012-03-073-1/+31
| | | | | | | | | | Handle DS READ and WRITE stateid errors by recovering the stateid on the MDS. NFS4ERR_OLD_STATEID is ignored as the client always sends a state sequenceid of zero for DS READ and WRITE stateids. Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: add fh_crc to debug outputWeston Andros Adamson2012-03-071-2/+4
| | | | | | | Print the filehandle crc in two debug messages Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: add filehandle crc for debug displayWeston Andros Adamson2012-03-071-3/+20
| | | | | | | Match wireshark's CRC-32 hash for easier debugging Signed-off-by: Weston Andros Adamson <dros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Add a helper encode_uint64Trond Myklebust2012-03-061-10/+11
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: More xdr cleanupsTrond Myklebust2012-03-061-114/+50
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Cleanup - convert more functions to use encode_op_hdrTrond Myklebust2012-03-061-9/+2
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix nfs4_verifier memory alignmentChuck Lever2012-03-062-33/+39
| | | | | | | | | | | | | | | | | | | | | | | Clean up due to code review. The nfs4_verifier's data field is not guaranteed to be u32-aligned. Casting an array of chars to a u32 * is considered generally hazardous. Fix this by using a __be32 array to generate a verifier's contents, and then byte-copy the contents into the verifier field. The contents of a verifier, for all intents and purposes, are opaque bytes. Only local code that generates a verifier need know the actual content and format. Everyone else compares the full byte array for exact equality. Also, sizeof(nfs4_verifer) is the size of the in-core verifier data structure, but NFS4_VERIFIER_SIZE is the number of octets in an XDR'd verifier. The two are not interchangeable, even if they happen to have the same value. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Add a encode op helperTrond Myklebust2012-03-061-129/+32
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Add a helper for encoding NFSv4 sequence idsTrond Myklebust2012-03-061-19/+28
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Minor clean ups for encode_string()Trond Myklebust2012-03-061-22/+18
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Simplify the struct nfs4_stateidTrond Myklebust2012-03-066-18/+17
| | | | | | | | | Replace the union with the common struct stateid4 as defined in both RFC3530 and RFC5661. This makes it easier to access the sequence id, which will again make implementing support for parallel OPEN calls easier. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Add helpers for basic copying of stateidsTrond Myklebust2012-03-065-36/+38
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Rename nfs4_copy_stateid()Trond Myklebust2012-03-064-4/+4
| | | | | | | It is really a function for selecting the correct stateid to use in a read or write situation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Add a helper for encoding stateidsTrond Myklebust2012-03-061-51/+62
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Add a helper for encoding opaque dataTrond Myklebust2012-03-061-5/+9
| | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Rename encode_stateid() to encode_open_stateid()Trond Myklebust2012-03-061-3/+3
| | | | | | | The current version of encode_stateid really only applies to open stateids. You can't use it for locks, delegations or layouts. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4: Further clean-ups of delegation stateid validationTrond Myklebust2012-03-064-28/+27
| | | | | | | | | Change the name to reflect what we're really doing: testing two stateids for whether or not they match according the the rules in RFC3530 and RFC5661. Move the code from callback_proc.c to nfs4proc.c Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFSv4.1: Fix matching of the stateids when returning a delegationTrond Myklebust2012-03-062-6/+6
| | | | | | | | | | | | nfs41_validate_delegation_stateid is broken if we supply a stateid with a non-zero sequence id. Instead of trying to match the sequence id, the function assumes that we always want to error. While this is true for a delegation callback, it is not true in general. Also fix a typo in nfs4_callback_recall. Reported-by: Andy Adamson <andros@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Properly handle the case where the delegation is revokedTrond Myklebust2012-03-065-4/+57
| | | | | | | | | | | | | | | | If we know that the delegation stateid is bad or revoked, we need to remove that delegation as soon as possible, and then mark all the stateids that relied on that delegation for recovery. We cannot use the delegation as part of the recovery process. Also note that NFSv4.1 uses a different error code (NFS4ERR_DELEG_REVOKED) to indicate that the delegation was revoked. Finally, ensure that setlk() and setattr() can both recover safely from a revoked delegation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
* NFS: Fix a typo in _nfs_display_fhandleTrond Myklebust2012-03-061-1/+1
| | | | | | | | The check for 'fh == NULL' needs to come _before_ we dereference fh. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* NFS: Fix a compile issue when !CONFIG_NFS_V4_1Trond Myklebust2012-03-051-9/+19
| | | | | | | | The attempt to display the implementation ID needs to be conditional on whether or not CONFIG_NFS_V4_1 is defined Reported-by: Bryan Schumaker <Bryan.Schumaker@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* Merge commit 'nfs-for-3.3-4' into nfs-for-nextTrond Myklebust2012-03-0310-166/+117
|\ | | | | | | | | | | | | | | Conflicts: fs/nfs/nfs4proc.c Back-merge of the upstream kernel in order to fix a conflict with the slotid type conversion and implementation id patches...
| * NFSv4: fix server_scope memory leakWeston Andros Adamson2012-02-171-6/+9
| | | | | | | | | | | | | | | | | | server_scope would never be freed if nfs4_check_cl_exchange_flags() returned non-zero Signed-off-by: Weston Andros Adamson <dros@netapp.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFSv4.1: Fix a NFSv4.1 session initialisation regressionTrond Myklebust2012-02-171-65/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit aacd553 (NFSv4.1: cleanup init and reset of session slot tables) introduces a regression in the session initialisation code. New tables now find their sequence ids initialised to 0, rather than the mandated value of 1 (see RFC5661). Fix the problem by merging nfs4_reset_slot_table() and nfs4_init_slot_table(). Since the tbl->max_slots is initialised to 0, the test in nfs4_reset_slot_table for max_reqs != tbl->max_slots will automatically pass for an empty table. Reported-by: Vitaliy Gusev <gusev.vitaliy@nexenta.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
| * NFSv4: Ensure we throw out bad delegation stateids on NFS4ERR_BAD_STATEIDTrond Myklebust2012-02-091-0/+2
| | | | | | | | | | | | | | | | To ensure that we don't just reuse the bad delegation when we attempt to recover the nfs4_state that received the bad stateid error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@vger.kernel.org
| * NFSv4: Fix an Oops in the NFSv4 getacl codeTrond Myklebust2012-02-032-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit bf118a342f10dafe44b14451a1392c3254629a1f (NFSv4: include bitmap in nfsv4 get acl data) introduces the 'acl_scratch' page for the case where we may need to decode multi-page data. However it fails to take into account the fact that the variable may be NULL (for the case where we're not doing multi-page decode), and it also attaches it to the encoding xdr_stream rather than the decoding one. The immediate result is an Oops in nfs4_xdr_enc_getacl due to the call to page_address() with a NULL page pointer. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Andy Adamson <andros@netapp.com> Cc: stable@vger.kernel.org
| * Merge branch 'for-linus' of ↵Linus Torvalds2012-02-025-10/+19
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: fix safety of rbd_put_client() rbd: fix a memory leak in rbd_get_client() ceph: create a new session lock to avoid lock inversion ceph: fix length validation in parse_reply_info() ceph: initialize client debugfs outside of monc->mutex ceph: change "ceph.layout" xattr to be "ceph.file.layout"
| | * ceph: create a new session lock to avoid lock inversionAlex Elder2012-02-024-9/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lockdep was reporting a possible circular lock dependency in dentry_lease_is_valid(). That function needs to sample the session's s_cap_gen and and s_cap_ttl fields coherently, but needs to do so while holding a dentry lock. The s_cap_lock field was being used to protect the two fields, but that can't be taken while holding a lock on a dentry within the session. In most cases, the s_cap_gen and s_cap_ttl fields only get operated on separately. But in three cases they need to be updated together. Implement a new lock to protect the spots updating both fields atomically is required. Signed-off-by: Alex Elder <elder@dreamhost.com> Reviewed-by: Sage Weil <sage@newdream.net>
| | * ceph: fix length validation in parse_reply_info()Xi Wang2012-02-021-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | "len" is read from network and thus needs validation. Otherwise, given a bogus "len" value, p+len could be an out-of-bounds pointer, which is used in further parsing. Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: Sage Weil <sage@newdream.net>
| | * ceph: change "ceph.layout" xattr to be "ceph.file.layout"Alex Elder2012-02-021-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The virtual extended attribute named "ceph.layout" is meaningful only for regular files. Change its name to be "ceph.file.layout" to more directly reflect that in the ceph xattr namespace. Preserve the old "ceph.layout" name for the time being (until we decide it's safe to get rid of it entirely). Add a missing initializer for "readonly" in the terminating entry. Signed-off-by: Alex Elder <elder@dreamhost.com> Reviewed-by: Sage Weil <sage@newdream.net>
| * | Fix race in process_vm_rw_coreChristopher Yeoh2012-02-021-20/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes the race in process_vm_core found by Oleg (see http://article.gmane.org/gmane.linux.kernel/1235667/ for details). This has been updated since I last sent it as the creation of the new mm_access() function did almost exactly the same thing as parts of the previous version of this patch did. In order to use mm_access() even when /proc isn't enabled, we move it to kernel/fork.c where other related process mm access functions already are. Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>