author | Gulam Mohamed <gulam.mohamed@oracle.com> | 2020-11-02 13:31:59 +0000 |
---|---|---|
committer | Gulam Mohamed <gulam.mohamed@oracle.com> | 2020-11-17 05:57:43 +0000 |
commit | 9258c8eae046d98511d92912983778ca57ba201f (patch) | |
tree | bc547109e93beca280476883c003743851fb31ce | |
parent | 348f093311f289635f9e85129aad944ffe1eda91 (diff) | |
download | open-iscsi-9258c8eae046d98511d92912983778ca57ba201f.tar.gz |
iscsid: Poll timeout value to 1 minute for iscsid
Description
===========
This patch has the following two changes
------------------------------------------
Change 1: Specify a poll timeout of 1 minute as the third parameter to
the function iscsid_exec_req() when it is called from sync_session().
Reason: Currently the poll timeout value passed to iscsid_response()
from iscsid_exec_req() is "info->iscsid_req_tmo", which is set to -1 in
"iscsi_sysfs_for_each_session()". When iscsid_response() receives this
-1, it sets the timeout value to ISCSID_REQ_TIMEOUT (1000 ms) and also
sets a local variable "poll_wait" to 1. The while loop that follows
checks "poll_wait" each time the poll times out and, if it is set to 1,
calls "continue". For sessions that keep hitting connection errors
(target service stopped, target node shut down, or any other persistent
connection error), nothing is ever written to the poll fd, so the while
loop never terminates. As a result, the remaining sessions are not
synced (or recovered) when iscsid is restarted for any reason (manual
restart of iscsid or rpm install). A poll timeout of 1 minute seems to
be a reasonable value for slow connections.
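
The loop described under Change 1 can be sketched roughly as follows. This is a minimal illustration, not the open-iscsi source: the function name sketch_wait_readable() and the SKETCH_* return codes are hypothetical, and only the 1000 ms slice and the "poll_wait" flag are taken from the description above. With a timeout of -1 the loop keeps polling a silent fd forever; with a finite timeout it returns after one poll.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical return codes for this sketch; not the open-iscsi values. */
enum { SKETCH_OK = 0, SKETCH_TIMED_OUT = 1, SKETCH_IO_ERR = 2 };

/*
 * Wait until fd becomes readable.
 * timeout_ms == -1 mimics the old behaviour: poll in 1000 ms slices
 * and retry on every timeout. Any other value bounds the wait.
 */
static int sketch_wait_readable(int fd, int timeout_ms)
{
	int poll_wait = 0;

	if (timeout_ms == -1) {
		/* 1000 ms is the ISCSID_REQ_TIMEOUT value quoted above. */
		timeout_ms = 1000;
		poll_wait = 1;
	}

	for (;;) {
		struct pollfd pfd = { .fd = fd, .events = POLLIN };
		int err = poll(&pfd, 1, timeout_ms);

		if (err == 0) {
			if (poll_wait)
				continue;	/* never gives up on a silent fd */
			return SKETCH_TIMED_OUT;
		}
		if (err < 0) {
			if (errno == EINTR)
				continue;	/* interrupted by a signal, retry */
			return SKETCH_IO_ERR;
		}
		return SKETCH_OK;		/* data (or hangup) is pending */
	}
}
```

Calling sketch_wait_readable() on the read end of an empty pipe with a finite timeout returns SKETCH_TIMED_OUT, while the same call with -1 would not return until something is written to the pipe.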
Change 2: Change the error code returned by iscsid_response() to
ISCSI_ERR_SESSION_NOT_CONNECTED when the poll times out (poll returns
0).
Reason: Currently iscsid_response() returns ISCSI_ERR_ISCSID_NOTCONN
when the poll times out and the poll_wait variable is 0 (i.e. the poll
timeout passed to iscsid_response() was not -1). Returning this error
code does not seem correct, for the following two reasons:
a. ISCSI_ERR_ISCSID_NOTCONN should be returned only when we are not
able to connect to iscsid, but being inside iscsid_response() means we
were able to connect to iscsid successfully.
b. When ISCSI_ERR_ISCSID_NOTCONN is returned, sync_session() retries
the request until 30 retries are reached. This causes multiple "iscsi
login task" requests to overlap in the kernel, and the kernel returns an
error to user space indicating "Login/Text in progress. Cannot start
new task.". This repeats continuously, so the session is not recovered
even after the target comes back up. At one point we also observed a
kernel panic in "iscsi_sw_tcp_conn_set_param()" while it was trying to
set the param ISCSI_PARAM_DATADGST_EN. We have a kernel fix for this
panic, which is in review for upstream.
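
The effect of the error code on the retry behaviour in reason (b) can be illustrated with the sketch below. It is a simplified model under stated assumptions, not the sync_session() code: sketch_sync_with_retry(), sketch_fixed_result() and the SKETCH_* codes are hypothetical names, and any delay between retries is omitted. Only the "retry on the not-connected-to-iscsid code, up to 30 times" rule comes from the description above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the ISCSI_ERR_* codes named above. */
enum sketch_err {
	SKETCH_SUCCESS = 0,
	SKETCH_ISCSID_NOTCONN,		/* could not reach the daemon */
	SKETCH_SESSION_NOT_CONNECTED	/* daemon reached, session down */
};

typedef enum sketch_err (*sketch_req_fn)(void *ctx);

/*
 * Send a request and retry only while the result says the daemon was
 * unreachable. Any other result ends the loop after one attempt.
 * The number of attempts made is stored in *attempts when non-NULL.
 */
static enum sketch_err sketch_sync_with_retry(sketch_req_fn send_req,
					      void *ctx, int max_retries,
					      int *attempts)
{
	enum sketch_err rc = SKETCH_ISCSID_NOTCONN;
	int tries = 0;

	while (tries < max_retries) {
		rc = send_req(ctx);
		tries++;
		if (rc != SKETCH_ISCSID_NOTCONN)
			break;
	}
	if (attempts)
		*attempts = tries;
	return rc;
}

/* Test helper: always reports the error code stored in ctx. */
static enum sketch_err sketch_fixed_result(void *ctx)
{
	return *(enum sketch_err *)ctx;
}
```

In this model a request that keeps timing out is re-sent 30 times when the timeout is reported as SKETCH_ISCSID_NOTCONN, but only once when it is reported as SKETCH_SESSION_NOT_CONNECTED, which is the behaviour Change 2 aims for.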
-rw-r--r-- | usr/iscsi_sysfs.c | 2 |
-rw-r--r-- | usr/iscsi_sysfs.h | 1 |
-rw-r--r-- | usr/iscsid_req.c | 2 |
3 files changed, 3 insertions, 2 deletions
diff --git a/usr/iscsi_sysfs.c b/usr/iscsi_sysfs.c
index 435c576..de15b8c 100644
--- a/usr/iscsi_sysfs.c
+++ b/usr/iscsi_sysfs.c
@@ -1426,7 +1426,7 @@ int iscsi_sysfs_for_each_session(void *data, int *nr_found,
 	if (!info)
 		return ISCSI_ERR_NOMEM;
 
-	info->iscsid_req_tmo = -1;
+	info->iscsid_req_tmo = ISCSID_RESP_POLL_TIMEOUT;
 	n = scandir(ISCSI_SESSION_DIR, &namelist, trans_filter,
 		    alphasort);
 	if (n <= 0)
diff --git a/usr/iscsi_sysfs.h b/usr/iscsi_sysfs.h
index 1d0377f..9575c65 100644
--- a/usr/iscsi_sysfs.h
+++ b/usr/iscsi_sysfs.h
@@ -34,6 +34,7 @@ struct iscsi_auth_config;
 struct flashnode_rec;
 
 #define SCSI_MAX_STATE_VALUE 32
+#define ISCSID_RESP_POLL_TIMEOUT 60000
 
 extern void free_transports(void);
 extern char *iscsi_sysfs_get_iscsi_kernel_version(void);
diff --git a/usr/iscsid_req.c b/usr/iscsid_req.c
index 3bbf5b9..a3aba6d 100644
--- a/usr/iscsid_req.c
+++ b/usr/iscsid_req.c
@@ -156,7 +156,7 @@ int iscsid_response(int fd, iscsiadm_cmd_e cmd, iscsiadm_rsp_t *rsp,
 		if (!err) {
 			if (poll_wait)
 				continue;
-			return ISCSI_ERR_ISCSID_NOTCONN;
+			return ISCSI_ERR_SESSION_NOT_CONNECTED;
 		} else if (err < 0) {
 			if (errno == EINTR)
 				continue;