=================================================================
                        Linux* Open-iSCSI
=================================================================

                                                     Jan 26, 2007

Contents
========

- 1. In This Release
- 2. Introduction
- 3. Installation
- 4. Open-iSCSI daemon
- 5. Open-iSCSI Configuration Utility
- 6. Configuration
- 7. Getting Started
- 8. Advanced Configuration
- 9. iSCSI System Info


1. In This Release
==================

This file describes the Linux* Open-iSCSI Initiator. The software was
tested on AMD Opteron (TM) and Intel Xeon (TM).

The latest development release is available at:
http://www.open-iscsi.org

For questions, comments, or contributions send e-mail to:
open-iscsi@googlegroups.com

1.1. Features

- highly optimized and very small-footprint data path;
- persistent configuration database;
- SendTargets discovery;
- CHAP;
- PDU header Digest;
- multiple sessions;

For the most recent list of features please refer to:
http://www.open-iscsi.org/cgi-bin/wiki.pl/Roadmap


2. Introduction
===============

The Open-iSCSI project is a high-performance, transport independent,
multi-platform implementation of RFC 3720 iSCSI.

Open-iSCSI is partitioned into user and kernel parts.

The kernel portion of Open-iSCSI is from-scratch code licensed under
the GPL. The kernel part implements the iSCSI data path (that is,
iSCSI Read and iSCSI Write), and consists of three loadable modules:
scsi_transport_iscsi.ko, libiscsi.ko and iscsi_tcp.ko.

User space contains the entire control plane: configuration manager,
iSCSI Discovery, Login and Logout processing, connection-level error
processing, Nop-In and Nop-Out handling, and (in the future:) Text
processing, iSNS, SLP, Radius, etc.

The user space Open-iSCSI consists of a daemon process called iscsid,
and a management utility, iscsiadm.


3. Installation
===============

As of today, the Open-iSCSI Initiator requires a host running the
Linux operating system with kernel version 2.6.16 or later. 2.6.14 and
2.6.15 are partially supported.

Known issues with 2.6.14 - 2.6.15 support:

- If the device is using a write-back cache, the cache sync command
  will fail during session logout.
- iscsiadm's -P 3 option will not print out scsi devices.
- iscsid will not automatically online devices.

You need to enable "Cryptographic API" under "Cryptographic options"
in the kernel config, and you must enable "CRC32c CRC algorithm" even
if you do not use header or data digests. These are the kernel options
CONFIG_CRYPTO and CONFIG_CRYPTO_CRC32C, respectively.

By default the kernel source found at

	/lib/modules/`uname -r`/build

will be used to compile the open-iscsi modules. To specify a different
kernel to build against use:

	make KSRC=<kernel-src>

or cross-compilation:

	make KSRC=<kernel-src> KARCH="ARCH=um"

To compile on SUSE Linux you'll have to use:

	make KSRC=/usr/src/linux \
	     KBUILD_OUTPUT=/usr/src/linux-obj/<arch>/<config>

where <config> is the kernel configuration to use (e.g. 'smp').

If you choose to install the Debian packages instead of building from
source, please read the file /usr/share/doc/linux-iscsi/README.debian
for information on how to build kernel modules against your specific
kernel.

For Red Hat/Fedora and Debian distributions open-iscsi can be
installed by typing "make install". This will copy iscsid and iscsiadm
to /usr/sbin, the init script to /etc/init.d, and the kernel modules
iscsi_tcp.ko, libiscsi.ko and scsi_transport_iscsi.ko to
/lib/modules/`uname -r`/kernel/drivers/scsi/, overwriting existing
iscsi modules.
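As a concrete example, a build and install against the running kernel
might look like the sketch below (the grep check assumes your kernel
ships its .config in the build directory, and the KSRC path is only an
example):

	# optional sanity check: CRC32c must be enabled in the kernel config
	grep CONFIG_CRYPTO_CRC32C /lib/modules/`uname -r`/build/.config

	# build and install against the running kernel
	make
	make install

	# or build against a separate kernel tree (path is an example)
	make KSRC=/usr/src/linux-2.6.16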
4. Open-iSCSI daemon
====================

The daemon implements the control path of the iSCSI protocol, plus
some management facilities. For example, the daemon could be
configured to automatically re-start discovery at startup, based on
the contents of the persistent iSCSI database (see next section).

For help, run:

	./iscsid --help

Usage: iscsid [OPTION]

  -c, --config=[path]     Execute in the config file
                          (/etc/iscsi/iscsid.conf).
  -f, --foreground        run iscsid in the foreground
  -d, --debug debuglevel  print debugging information
  -u, --uid=uid           run as uid, default is current user
  -g, --gid=gid           run as gid, default is current user group
  -h, --help              display this help and exit
  -v, --version           display version and exit


5. Open-iSCSI Configuration Utility
===================================

Open-iSCSI persistent configuration is implemented as a DBM database
available on all Linux installations.

The database contains two tables:

- Discovery table (/etc/iscsi/send_targets);
- Node table (/etc/iscsi/nodes).

The regular place for iSCSI database files: /etc/iscsi/nodes

The iscsiadm utility is a command-line tool to manage (update, delete,
insert, query) the persistent database.

The utility presents a set of operations that a user can perform on
iSCSI nodes, sessions, connections, and discovery records.

Open-iscsi does not use the term node as defined by the iSCSI RFC,
where a node is a single iSCSI initiator or target. Open-iscsi uses
the term node to refer to a portal on a target, so tools like iscsiadm
require that the --targetname and --portal arguments be used when in
node mode.

For session mode, a session id (sid) is used. The sid of a session can
be found by running iscsiadm -m session -P 1. The session id is not
currently persistent and is partially determined by when the session
is setup.

Note that some of the iSCSI Node and iSCSI Discovery operations do not
require the iSCSI daemon (iscsid) to be running.

For help, run:

	./iscsiadm --help

Usage: iscsiadm [OPTION]

  -m, --mode <op>
	specify operational mode op = <discovery|node|session|iface>

  -m discovery --type=[type] --interface=iscsi_ifacename \
     --portal=[ip:port] --login --print=[N]
	perform [type] discovery for target portal with ip-address [ip]
	and port [port]. See below for how to setup iscsi ifaces for
	software iscsi or override the system defaults. Multiple ifaces
	can be passed in during discovery.

  -m discovery --print=[N]
	display all discovery records from internal persistent discovery
	database.

  -m discovery --interface=[iface] --portal=[ip:port] --print=[N] --login
	perform discovery based on the portal in the database. See above
	for info on the interface argument.

	For the above commands "print" is optional. If used, N can be
	0 or 1.
	0 = The old flat style of output is used.
	1 = The tree style with the interface info is used.
	If print is not used the old flat style is used.

  -m discovery --portal=[ip:port] --op=[op] [--name=[name] --value=[value]]
	perform specific DB operation [op] for specific discovery portal.
	It could be one of: [new], [delete], [update] or [show].
	In case of [update], you have to provide [name] and [value] you
	wish to update.

  -m node
	display all discovered nodes from internal persistent discovery
	database

  -m node --targetname=[name] --portal=[ip:port] \
     --interface=[iscsi_ifacename] \
     [--login|--logout|--rescan|--stats]

  -m node --targetname=[name] --portal=[ip:port] \
     --interface=[driver,HWaddress] \
     --op=[op] [--name=[name] --value=[value]]

  -m node --targetname=[name] --portal=[ip:port] \
     --interface=[iscsi_ifacename] --print=[level]
	perform specific DB operation [op] for specific interface on host
	that will connect to portal on target. targetname, portal and
	interface are optional. See below for how to setup iscsi ifaces
	for software iscsi or override the system defaults.
	op could be one of: [new], [delete], [update] or [show]. In case
	of [update], you have to provide [name] and [value] you wish to
	update.
	Print level can be 0 to 1.
	Rescan will perform a SCSI layer scan of the session to find new
	LUNs.
	Stats prints the iSCSI stats for the session.

  -m node --logoutall=[all|manual|automatic]
	Logout "all" the running sessions or just the ones with a node or
	conn startup value manual or automatic. Nodes marked as ONBOOT
	are skipped.

  -m node --loginall=[all|manual|automatic]
	Login "all" the running sessions or just the ones with a node or
	conn startup value manual or automatic. Nodes marked as ONBOOT
	are skipped.

  -m session
	display all active sessions and connections

  -m session --sid=[sid] [ --print=level | --rescan | --logout ] \
     --op=[op] [--name=[name] --value=[value]]
	perform operation for specific session with session id sid. If no
	sid is given the operation will be performed on all running
	sessions if possible. --logout and --op work like they do in node
	mode, but in session mode targetname and portal info is not
	passed in.
	Print level can be 1 to 3.
	1 = Print basic session info like the node we are connected to
	    and whether we are connected.
	2 = Print iscsi params used.
	3 = Print SCSI info like LUNs, device state.
	If no sid and no operation is given print out the running
	sessions.

  -m iface --interface=iscsi_ifacename --op=[op] \
     [--name=[name] --value=[value]] --print=level
	perform operation on the given interface with name
	iscsi_ifacename. See below for examples.

  -d, --debug debuglevel  print debugging information
  -V, --version           display version and exit
  -h, --help              display this help and exit

5.1 How to setup iSCSI interfaces (iface) for binding
=====================================================

If you wish to allow the network subsystem to figure out the best
path/NIC to use then you can skip this section. For example, if you
have setup your portals and NICs on different subnets, then the
following is not needed for software iscsi.

Warning!!!!!!
This feature is experimental. The interface may change.

When reporting bugs, if you cannot do a "ping -I ethX target_portal",
then check your network settings first. If you cannot ping the portal,
then you will not be able to bind a session to a NIC.

What is a scsi_host and iface for software, hardware and partial
offload iscsi?

Software iscsi, like iscsi_tcp and iser, allocates a scsi_host per
session and does a single connection per session. As a result
/sys/class/scsi_host and /proc/scsi will report a scsi_host for each
connection/session you have logged into.

Hardware iscsi, like Qlogic's qla4xxx, allocates a scsi_host for each
PCI device (each port on an HBA will show up as a different PCI device
so you get a scsi_host per HBA port).
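If you want to see this on a running system, the sysfs and procfs views
mentioned above can be inspected directly; for example (purely
illustrative, the host numbers depend on what you have logged into):

	# software iscsi: one scsi_host per connection/session
	ls /sys/class/scsi_host

	# the SCSI devices attached through those hosts
	cat /proc/scsi/scsi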
To manage both types of initiator stacks, iscsiadm uses the interface
(iface) structure. For each HBA port, or for software iscsi for each
network device (ethX) or NIC that you wish to bind sessions to, you
must create an iface config in /etc/iscsi/ifaces.

The first time you run iscsiadm after a hardware iscsi driver like
qla4xxx has been loaded, iscsiadm will create default iface configs
for you. The configs created by iscsiadm for qlogic should be OK for
most uses and users should not have to modify them.

Running:

	# iscsiadm -m iface
	iface0 qla4xxx,00:c0:dd:08:63:e8,default
	iface1 qla4xxx,00:c0:dd:08:63:ea,default

will report iface configurations that are setup in /etc/iscsi/ifaces.
The format is:

	iface_name transport_name,hwaddress,net_ifacename

For software iscsi, you can create the iface configs by hand, but it
is recommended that you use iscsiadm's iface mode. There is an
iface.example in /etc/iscsi/ifaces which can be used as a template for
the daring.

For each network object you wish to bind a session to you must create
a separate iface config in /etc/iscsi/ifaces, and each iface config
file must have a unique name which is less than or equal to 64
characters.

Example: if you have NIC1 with MAC address 00:0F:1F:92:6B:BF and NIC2
with MAC address 00:C0:DD:08:63:E7, and you want to do software iscsi
over TCP/IP, then in /etc/iscsi/ifaces/iface0 you would enter:

	iface.transport_name = tcp
	iface.hwaddress = 00:0F:1F:92:6B:BF

and in /etc/iscsi/ifaces/iface1 you would enter:

	iface.transport_name = tcp
	iface.hwaddress = 00:C0:DD:08:63:E7

Warning: do not name an iface config file "default". default is a
special value/file that is used by the iscsi tools for backward
compatibility. If you name a config default, then the behavior is not
defined.

To use iscsiadm to create iface0 above for you, run:

	(This will create a new empty iface config. If there was already
	an iface with the name "iface0" this command will overwrite it.)

	# iscsiadm -m iface -I iface0 --op=new

	(This will set the hwaddress.)

	# iscsiadm -m iface -I iface0 --op=update -n iface.hwaddress \
	  -v 00:0F:1F:92:6B:BF

If you have sessions logged in, iscsiadm will not update, delete or
overwrite an iface. You must log out first. If you have an iface bound
to a node/portal but you have not logged in, then iscsiadm will update
the config and all existing bindings.

When you then run iscsiadm to do discovery, it will check for
interfaces in /etc/iscsi/ifaces and bind the portals that are
discovered so that they will be logged in through each iface. This
behavior can also be overridden by passing in the interfaces you want
to use. For example, if you had defined two interfaces but only wanted
to use one, you can use the --interface/-I argument:

	iscsiadm -m discovery -t st -p ip:port -I iface1 -P 1

If you had defined interfaces but wanted the old behavior, where we do
not bind a session to an iface, then you can use the special iface
"default":

	iscsiadm -m discovery -t st -p ip:port -I default -P 1

And if you did not define any interfaces in /etc/iscsi/ifaces and do
not pass anything into iscsiadm, running iscsiadm will do the default
behavior, where we allow the network subsystem to decide which device
to use.
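Putting the above together, a complete software-iscsi binding setup
could look like the following sketch (the iface name, MAC address and
portal are the example values used earlier; substitute your own):

	# create an iface and bind it to NIC1's MAC address
	iscsiadm -m iface -I iface0 --op=new
	iscsiadm -m iface -I iface0 --op=update -n iface.hwaddress \
	  -v 00:0F:1F:92:6B:BF

	# discover through that iface only, then log in to the bound records
	iscsiadm -m discovery -t st -p 192.168.1.1:3260 -I iface0 -P 1
	iscsiadm -m node -l

	# verify which iface each session ended up using
	iscsiadm -m session -P 1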
If you later want to remove the bindings for a specific target and
iface, then you can run:

	iscsiadm -m node -T my_target -I iface0 --op=delete

To do this for a specific portal on a target, run:

	iscsiadm -m node -T my_target -p ip:port -I iface0 --op=delete

If you wanted to delete all bindings for iface0, then you can run:

	iscsiadm -m node -I iface0 --op=delete

And for EqualLogic targets it is sometimes useful to remove by portal
only:

	iscsiadm -m node -p ip:port -I iface0 --op=delete

5.2 iscsiadm examples
=====================

Usage examples using the one-letter options (see the iscsiadm man page
for long options):

Discovery mode:

- SendTargets iSCSI Discovery using the default driver and interface:

	./iscsiadm -m discovery -t sendtargets -p 192.168.1.1:3260

  This will first search /etc/iscsi/ifaces for interfaces using
  software iscsi. If any are found then the nodes found during
  discovery will be setup so that they can be logged in through those
  interfaces.

- SendTargets iSCSI Discovery with a specific interface. If you wish
  to only use a subset of the interfaces in /etc/iscsi/ifaces then you
  can pass them in during discovery:

	./iscsiadm -m discovery -t sendtargets -p 192.168.1.1:3260 \
		--interface=iface0 --interface=iface1

  Note that for software iscsi, we let the network layer select which
  NIC to use for discovery, but for later logins iscsiadm will use the
  NIC defined in the iface config.

  qla4xxx support is very basic and experimental. It does not store
  the record info in the card's FLASH or the node DB, so you must
  rerun discovery every time the driver is reloaded.

Node mode. In node mode you can specify which records you want to log
into by specifying the targetname, ip address, port or interface (if
specifying the interface it must already be setup in the node db).
iscsiadm will search the node db for records which match the values
you pass in, so if you pass in the targetname and interface, iscsiadm
will search for records with those values and operate on only them.
Passing in none of them will result in all node records being operated
on.

- iSCSI Login to all portals on every node/target through each
  interface set in the db:

	./iscsiadm -m node -l

- iSCSI login to all portals on a node/target through each interface
  set in the db:

	./iscsiadm -m node -T iqn.2005-03.com.max -l

- iSCSI login to a specific portal through each interface set in the
  db:

	./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260 -l

  To specify an IPv6 address the following can be used:

	./iscsiadm -m node -T iqn.2005-03.com.max \
		-p 2001:c90::211:9ff:feb8:a9e9 -l

  The above command would use the default port, 3260. To specify a
  port use the following:

	./iscsiadm -m node -T iqn.2005-03.com.max \
		-p [2001:c90::211:9ff:feb8:a9e9]:3260 -l

- iSCSI Login to a specific portal through the NIC setup as iface0:

	./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260 \
		-I iface0 -l

- iSCSI Logout from all portals on every node/target through each
  interface set in the db:

	./iscsiadm -m node -u

  Warning: this does not check startup values like the logout/login
  all option. Do not use this if you are running iscsi on your root
  disk.
- iSCSI logout from all portals on a node/target through each
  interface set in the db:

	./iscsiadm -m node -T iqn.2005-03.com.max -u

- iSCSI logout from a specific portal through each interface set in
  the db:

	./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260 -u

- iSCSI Logout from a specific portal through the NIC setup as iface0:

	./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260 \
		-I iface0 -u

- Changing an iSCSI parameter:

	./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260 \
		-o update -n node.conn[0].iscsi.MaxRecvDataSegmentLength -v 65536

  You can also change parameters for multiple records at once, by
  specifying different combinations of the target, portal and
  interface like above.

- Adding a custom iSCSI portal:

	./iscsiadm -m node -o new -T iqn.2005-03.com.max \
		-p 192.168.0.1:3260,2 -I iface4

  The -I/--interface is optional. If not passed in, "default" is used.
  For tcp or iser, this would allow the network layer to decide what
  is best.

  Note that for this command the target portal group tag (TPGT) should
  be passed in. If it is not passed in on the initial creation command
  then the user must run iscsiadm again to set the value. Also, if the
  TPGT is not initially passed in, the old behavior of not tracking
  whether the record was statically or dynamically created is used.

- Adding a custom NIC config to multiple targets:

	./iscsiadm -m node -o new -I iface4

  This command will add an interface config using the iSCSI and SCSI
  settings from iscsid.conf to every target that is in the node db.

- Removing an iSCSI portal:

	./iscsiadm -m node -o delete -T iqn.2005-03.com.max -p 192.168.0.4:3260

  You can also delete multiple records at once, by specifying
  different combinations of the target, portal and interface like
  above.

- Display iSCSI portal configuration:

	./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260
  or
	./iscsiadm -m node -o show -T iqn.2005-03.com.max -p 192.168.0.4:3260

  You can also display multiple records at once, by specifying
  different combinations of the target, portal and interface like
  above.

  Note: running "iscsiadm -m node" will only display the records. It
  will not display the configuration info. You must run
  "iscsiadm -m node -o show".

- Show all node records:

	./iscsiadm -m node

  This will print the nodes using the old flat format where the
  interface and driver are not displayed. To display that info use the
  -P argument with the argument "1":

	./iscsiadm -m node -P 1

- Show all records in the discovery database:

	./iscsiadm -m discovery

- Show all records in the discovery database and show the targets that
  were discovered from each record:

	./iscsiadm -m discovery -P 1

- Display discovery record settings:

	./iscsiadm -m discovery -p 192.168.0.4:3260

- Display session statistics:

	./iscsiadm -m session -r 1 --stats

  This function also works in node mode. Instead of the "-r $sid"
  argument, you would pass in the node info like targetname and/or
  portal, and/or interface.

- Perform a SCSI scan on a session:

	./iscsiadm -m session -r 1 --rescan

  This function also works in node mode. Instead of the "-r $sid"
  argument, you would pass in the node info like targetname and/or
  portal, and/or interface.

  Note: Rescanning does not delete old LUNs. It will only pick up new
  ones.

- Display running sessions:

	./iscsiadm -m session -P 1


6. Configuration
================

The default configuration file is /etc/iscsi/iscsid.conf. This file
contains only configuration that could be overwritten by iSCSI
Discovery, or manually updated via the iscsiadm utility.
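For example, a few commonly tuned iscsid.conf entries look like this
(the timer values shown are the defaults discussed in section 8; the
startup value is only an example, see section 7.3):

	# log into discovered nodes automatically at service start (see 7.3)
	node.conn[0].startup = automatic

	# iSCSI ping (Nop-Out) interval and timeout, in seconds (see 8.1.1)
	node.conn[0].timeo.noop_out_interval = 10
	node.conn[0].timeo.noop_out_timeout = 15

	# how long to wait for session re-establishment before failing
	# commands to upper layers, in seconds (see 8.1.2)
	node.session.timeo.replacement_timeout = 120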
It's OK if this file does not exist, in which case the compiled-in
default configuration will be used for newly discovered target nodes.

See the man page and the example file for the current syntax.
The man pages for iscsid and iscsiadm are in the doc subdirectory and
can be installed in the appropriate man page directories; they need to
be manually copied into e.g. /usr/local/share/man8.


7. Getting Started
==================

There are three steps needed to set up a system to use iSCSI storage:

7.1. iSCSI startup using the init script or manual startup.
7.2. Discover targets.
7.3. Automate target logins for future system reboots.

The init scripts will start the iSCSI daemon and log into any
connections or nodes that are set up for automatic login. If your
distro does not have an init script, then you will have to start the
daemon and log into the targets manually.

7.1.1 iSCSI startup using the init script
-----------------------------------------

Red Hat or Fedora:
------------------
To start open-iscsi in Red Hat/Fedora you can do:

	service open-iscsi start

To get open-iscsi to automatically start at run time you may have to
run:

	chkconfig --level <levels> open-iscsi on

where <levels> are the run levels.

And, to automatically mount a file system during startup you must have
the partition entry in /etc/fstab marked with the "_netdev" option.
For example this would mount an iscsi disk sdb:

	/dev/sdb /mnt/iscsi ext3 _netdev 0 0

SUSE or Debian:
---------------
Otherwise, if there is an initd script for your distro in etc/initd
that gets installed with "make install",

	/etc/init.d/open-iscsi start

will usually get you started.

7.1.2 Manual Startup:
---------------------

7.1.2.1 Starting up the iSCSI daemon (iscsid) and loading modules:
------------------------------------------------------------------

If there is no initd script, you must start the tools by hand. First
load the iscsi modules with:

	modprobe -q iscsi_tcp

After that, start the iSCSI daemon process:

	./iscsid

or alternatively, start it with debug enabled and with output
redirected to the current console:

	./iscsid -d 8 -f &

7.1.2.2 Logging into Targets:
-----------------------------

Use the configuration utility, iscsiadm, to add/remove/update
Discovery records, iSCSI Node records or monitor active iSCSI sessions
(see above or the iscsiadm man files, and see section 7.2 below for
how to discover targets):

	./iscsiadm -m node

will print out the nodes that have been discovered as:

	10.15.85.19:3260,3 iqn.1992-08.com.netapp:sn.33615311
	10.15.84.19:3260,2 iqn.1992-08.com.netapp:sn.33615311

The format is:

	ip:port,target_portal_group_tag targetname

If you are using the iface argument or want to see the driver info,
use the following:

	./iscsiadm -m node -P 1

	Target: iqn.1992-08.com.netapp:sn.33615311
		Portal: 10.15.84.19:3260,2
			Iface Name: iface2
		Portal: 10.15.85.19:3260,3
			Iface Name: iface2

The format is:

	Target: targetname
		Portal: ip_address:port,tpgt
			Iface Name: iface

where targetname is the name of the target and ip_address:port is the
address and port of the portal. tpgt is the portal group tag of the
portal, and is not used in iscsiadm commands except for static record
creation. And the iface name is the name of the iscsi interface
defined in /etc/iscsi/ifaces. If no interface was defined in
/etc/iscsi/ifaces or passed in, the default behavior is used. Default
here is iscsi_tcp/tcp to be used over whichever NIC the network layer
decides is best.
To login, take the ip, port and targetname from above and run:

	./iscsiadm -m node -T targetname -p ip:port -l

In this example we would run:

	./iscsiadm -m node -T iqn.1992-08.com.netapp:sn.33615311 \
		-p 10.15.84.19:3260 -l

Note: drop the portal group tag from the "iscsiadm -m node" output.

7.2. Discover Targets
---------------------

Once the iSCSI service is running, you can perform discovery using
SendTargets with:

	iscsiadm -m discovery -t sendtargets -p ip:port

where "ip" is the address of the portal and "port" is the port.

Or you can perform discovery using iSNS by setting the address of the
iSNS server in iscsid.conf with the "isns.address" value and running:

	iscsiadm -m discovery -t isns

*** Warning ***
iSNS support is experimental in this release. The command line options
for it will change in future releases.

Both commands will print out the list of all discovered targets and
their portals:

	# iscsiadm -m discovery -t st -p 10.15.85.19:3260
	10.15.84.19:3260,2 iqn.1992-08.com.netapp:sn.33615311
	10.15.85.19:3260,3 iqn.1992-08.com.netapp:sn.33615311

The format for the output is:

	ip:port,tpgt targetname

In this example, for the portal 10.15.85.19 the port is 3260, the
target portal group tag is 3, and the target name is
iqn.1992-08.com.netapp:sn.33615311.

If you would also like to see the iscsi interface which will be used
for each session then use the --print=[N] option:

	iscsiadm -m discovery -t sendtargets -p ip:port -P 1

will print:

	Target: iqn.1992-08.com.netapp:sn.33615311
		Portal: 10.15.84.19:3260,2
			Iface Name: iface2
		Portal: 10.15.85.19:3260,3
			Iface Name: iface2

In this example, the IP address of the first portal is 10.15.84.19,
the port is 3260, the target portal group tag is 2, the target name is
iqn.1992-08.com.netapp:sn.33615311, and the iface being used is
iface2.

While discovery targets are kept in the discovery db, they are useful
only for re-discovery. The discovered targets (a.k.a. nodes) are
stored as records in the node db.

The discovered targets are not logged into yet. Rather than logging
into the discovered nodes (making LUs from those nodes available as
storage), it is better to automate the login to the nodes we need. If
you wish to log into a target manually now, see section
"7.1.2.2 Logging into Targets" above.

7.3. Automate Target Logins for Future System Startups
------------------------------------------------------

Note: this may only work for distros with init scripts.

To automate login to a node, use the following with the record ID
(record ID is the targetname and portal) of the node discovered in the
discovery above:

	iscsiadm -m node -T targetname -p ip:port --op update \
		-n node.conn[0].startup -v automatic

To set the automatic setting for all portals on a target through every
interface setup for each portal, the following can be run:

	iscsiadm -m node -T targetname --op update \
		-n node.conn[0].startup -v automatic

Or, to set the "node.conn[0].startup" attribute to "automatic" as the
default for all sessions, add the following to /etc/iscsi/iscsid.conf:

	node.conn[0].startup = automatic

Setting this in iscsid.conf will not affect existing nodes. It will
only affect nodes that are discovered after setting the value.

To login to all the automated nodes, simply restart the iscsi service,
e.g. /etc/init.d/open-iscsi restart. On your next startup the nodes
will be logged into automatically.
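To double-check a record before rebooting, you can dump it with
-o show and look for the startup values (targetname and ip:port are
placeholders, as above):

	iscsiadm -m node -T targetname -p ip:port -o show | grep startup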
8. Advanced Configuration
=========================

8.1 iSCSI settings for dm-multipath
-----------------------------------

When using dm-multipath, the iSCSI timers should be set so that
commands are quickly failed to the dm-multipath layer. For
dm-multipath you should then set values like "queue if no path", so
that IO errors are retried and queued if all paths are failed in the
multipath layer.

8.1.1 iSCSI ping/Nop-Out settings
---------------------------------

To quickly detect problems in the network, the iSCSI layer will send
iSCSI pings (iSCSI NOP-Out requests) to the target. If a NOP-Out times
out, the iSCSI layer will respond by failing running commands and
asking the SCSI layer to requeue them if possible (SCSI disk commands
get 5 retries if not using multipath). If dm-multipath is being used,
the SCSI layer will fail the command to the multipath layer instead of
retrying. The multipath layer will then retry the command on another
path.

To control how often a NOP-Out is sent, the following value can be
set:

	node.conn[0].timeo.noop_out_interval = X

Where X is in seconds and the default is 10 seconds. To control the
timeout for the NOP-Out the noop_out_timeout value can be used:

	node.conn[0].timeo.noop_out_timeout = X

Again X is in seconds and the default is 15 seconds.

Normally for these values you can use:

	node.conn[0].timeo.noop_out_interval = 5
	node.conn[0].timeo.noop_out_timeout = 10

If there are a lot of IO error messages, then the above values may be
too aggressive, and you may need to increase the values for your
network conditions and workload, or you may need to check your network
for possible problems.

8.1.2 replacement_timeout
-------------------------

The next iSCSI timer that will need to be tweaked is:

	node.session.timeo.replacement_timeout = X

Here X is in seconds. replacement_timeout controls how long to wait
for session re-establishment before failing pending SCSI commands, and
commands that are being handled by the SCSI layer's error handler, up
to a higher level like multipath, or to an application if multipath is
not being used.

8.1.2.1 Running Commands, the SCSI Error Handler, and replacement_timeout
--------------------------------------------------------------------------

Remember from the Nop-Out discussion that if a network problem is
detected, the running commands are failed immediately. There is one
exception to this and that is when the SCSI layer's error handler is
running. To check if the SCSI error handler is running, iscsiadm can
be run as:

	iscsiadm -m session -P 3

You will then see:

	Host Number: X State: Recovery

When the SCSI EH is running, commands will not be failed until
node.session.timeo.replacement_timeout seconds have elapsed.

8.1.2.2 Pending Commands and replacement_timeout
------------------------------------------------

Commonly, the SCSI/BLOCK layer will queue 256 commands, but the path
can only take 32. When a network problem is detected, the 32 commands
in flight will be sent back to the SCSI layer immediately, and because
multipath is being used this will cause the commands to be sent to the
multipath layer for execution on another path. However, the other 224
commands that were still in the SCSI/BLOCK queue will remain there
until the session is re-established or until
node.session.timeo.replacement_timeout seconds have gone by. After
replacement_timeout seconds, the pending commands will be failed to
the multipath layer, and all new incoming commands will be immediately
failed back to the multipath layer.
If a session is later re-established, then new commands will be queued
and executed. Normally, multipathd's path tester mechanism will detect
that the session has been re-established and the path is accessible
again, and it will inform dm-multipath.

8.1.3 Optimal replacement_timeout Value
---------------------------------------

The default value for replacement_timeout is 120 seconds, but because
multipath's "queue if no path" setting can prevent IO errors from
being propagated to the application, replacement_timeout can be set to
a shorter value like 15 to 30 seconds. By setting it lower, pending IO
is quickly sent to a new path and executed while the iSCSI layer
attempts to re-establish the session. If all paths end up being
failed, then the multipath and device mapper layer will internally
queue IO based on the multipath.conf settings, instead of the iSCSI
layer.

8.2 iSCSI settings for iSCSI root
---------------------------------

When accessing the root partition directly through an iSCSI disk, the
iSCSI timers should be set so that the iSCSI layer has several chances
to try to re-establish a session and so that commands are not quickly
requeued to the SCSI layer. Basically you want the opposite of when
using dm-multipath.

For this setup, you can turn off iSCSI pings by setting:

	node.conn[0].timeo.noop_out_interval = 0
	node.conn[0].timeo.noop_out_timeout = 0

And you can set the replacement_timeout to a very long value:

	node.session.timeo.replacement_timeout = 86400


9. iSCSI System Info
====================

To get information about the running sessions, including the session
and device state, session ids (sid) for session mode, and some of the
negotiated parameters, run:

	iscsiadm -m session -P 2

If you are looking for something shorter, like just the sid to node
mapping, run:

	iscsiadm -m session -P 0
or
	iscsiadm -m session

This will print the list of running sessions with the format:

	driver [sid] ip:port,target_portal_group_tag targetname

	# iscsiadm -m session
	tcp [2] 10.15.84.19:3260,2 iqn.1992-08.com.netapp:sn.33615311
	tcp [3] 10.15.85.19:3260,3 iqn.1992-08.com.netapp:sn.33615311

To print the hw address info use the -P option with "1":

	iscsiadm -m session -P 1

This will print the sessions with the following format:

	Target: targetname
		Current Portal: portal currently logged into
		Persistent Portal: portal we would fall back to if we
				   got redirected during login
		Iface Transport: driver/transport_name
		Iface IPaddress: IP address of iface being used
		Iface HWaddress: HW address used to bind session
		Iface Netdev: netdev value used to bind session
		SID: iscsi sysfs session id
		iSCSI Connection State: iscsi state

Note: if an older kernel is being used or if the session is not bound,
then the keyword "default" is printed to indicate that the default
network behavior is being used.

Example:

	# iscsiadm -m session -P 1
	Target: iqn.1992-08.com.netapp:sn.33615311
		Current Portal: 10.15.85.19:3260,3
		Persistent Portal: 10.15.85.19:3260,3
		Iface Transport: tcp
		Iface IPaddress: 10.11.14.37
		Iface HWaddress: default
		Iface Netdev: default
		SID: 7
		iSCSI Connection State: LOGGED IN
		Internal iscsid Session State: NO CHANGE

The connection state is currently not available for qla4xxx.
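Finally, to also see the SCSI devices, LUNs and device state that
belong to each session, use print level 3 (as noted in section 3, on
2.6.14/2.6.15 kernels -P 3 will not print out scsi devices):

	iscsiadm -m session -P 3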