=================================================================
                        Linux* Open-iSCSI
=================================================================

                         March 22, 2010

Contents
========

- 1. In This Release
- 2. Introduction
- 3. Installation
- 4. Open-iSCSI daemon
- 5. Open-iSCSI Configuration Utility
- 6. Configuration
- 7. Getting Started
- 8. Advanced Configuration
- 9. iSCSI System Info


1. In This Release
==================

This file describes the Linux* Open-iSCSI Initiator. The software was
tested on AMD Opteron (TM) and Intel Xeon (TM).

The latest development release is available at:
http://www.open-iscsi.org

For questions, comments, or contributions send e-mail to:
open-iscsi@googlegroups.com

1.1. Features

- highly optimized and very small-footprint data path;
- persistent configuration database;
- SendTargets discovery;
- CHAP;
- PDU header digest;
- multiple sessions.

2. Introduction
===============

The Open-iSCSI project is a high-performance, transport independent,
multi-platform implementation of RFC 3720 iSCSI.

Open-iSCSI is partitioned into user and kernel parts.

The kernel portion of Open-iSCSI is from-scratch code licensed under
the GPL. The kernel part implements the iSCSI data path (that is,
iSCSI Read and iSCSI Write), and consists of three loadable modules:
scsi_transport_iscsi.ko, libiscsi.ko and iscsi_tcp.ko.

User space contains the entire control plane: configuration manager,
iSCSI Discovery, Login and Logout processing, connection-level error
processing, Nop-In and Nop-Out handling, and (in the future) Text
processing, iSNS, SLP, Radius, etc.

The user space Open-iSCSI consists of a daemon process called iscsid,
and a management utility iscsiadm.

3. Installation
===============

As of today, the Open-iSCSI Initiator requires a host running the
Linux operating system with kernel version 2.6.16 or later. 2.6.14 and
2.6.15 are partially supported. Known issues with 2.6.14 - .15
support:

- If the device is using a write back cache, the cache sync command
  will fail during session logout.
- iscsiadm's -P 3 option will not print out scsi devices.
- iscsid will not automatically online devices.

You need to enable "Cryptographic API" under "Cryptographic options"
in the kernel config, and you must enable "CRC32c CRC algorithm" even
if you do not use header or data digests. These are the kernel options
CONFIG_CRYPTO and CONFIG_CRYPTO_CRC32C, respectively.

By default the kernel source found at /lib/modules/`uname -r`/build
will be used to compile the open-iscsi modules. To specify a different
kernel to build against, use:

    make KSRC=[kernel-src]

or, for cross-compilation:

    make KSRC=[kernel-src] KARCH="ARCH=um"

To compile on SUSE Linux you'll have to use

    make KSRC=/usr/src/linux \
         KBUILD_OUTPUT=/usr/src/linux-obj/[arch]/[config]

where [config] is the kernel configuration to use (eg. 'smp').

For Red Hat/Fedora and Debian distributions open-iscsi can be
installed by typing "make install". This will copy iscsid and iscsiadm
to /usr/sbin, the init script to /etc/init.d, and the kernel modules
iscsi_tcp.ko, libiscsi_tcp.ko, libiscsi.ko and scsi_transport_iscsi.ko
to /lib/modules/`uname -r`/kernel/drivers/scsi/, overwriting existing
iscsi modules.

For Debian, be sure to install the linux-headers package that
corresponds to your kernel in order to compile the kernel modules
('aptitude install linux-headers-`uname -r`'). You may also wish to
run 'make -C kernel/ dpkg_divert' before installing kernel modules if
you run a Debian-provided kernel. This will use dpkg-divert(8) to move
the packaged kernel modules out of the way, and ensure that future
kernel upgrades will not overwrite them.

Also, please be aware that the compatibility patches that enable these
iscsi modules to run on kernels older than 2.6.25 will not update the
ib_iser module; you may get warnings related to mismatched symbols on
this driver, in which case you'll be unable to load ib_iser and
open-iscsi simultaneously.
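As an illustration, a typical build-and-install sequence against the
currently running kernel might look like the following (a sketch only;
the KSRC override and the Debian divert step are optional and depend
on your distribution):

    # build user tools and kernel modules against the running kernel
    make

    # optional, Debian-packaged kernels only: divert the packaged
    # iscsi modules so a kernel upgrade does not overwrite ours
    make -C kernel/ dpkg_divert

    # install iscsid, iscsiadm, the init script and the modules
    make install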
4. Open-iSCSI daemon
====================

The daemon implements the control path of the iSCSI protocol, plus
some management facilities. For example, the daemon could be
configured to automatically re-start discovery at startup, based on
the contents of the persistent iSCSI database (see next section).

For help, run:

    ./iscsid --help

Usage: iscsid [OPTION]

  -c, --config=[path]     Execute in the config file (/etc/iscsi/iscsid.conf).
  -f, --foreground        run iscsid in the foreground
  -d, --debug debuglevel  print debugging information
  -u, --uid=uid           run as uid, default is current user
  -g, --gid=gid           run as gid, default is current user group
  -h, --help              display this help and exit
  -v, --version           display version and exit

5. Open-iSCSI Configuration Utility
===================================

Open-iSCSI persistent configuration is implemented as a DBM database
available on all Linux installations.

The database contains two tables:

- Discovery table (/etc/iscsi/send_targets);
- Node table (/etc/iscsi/nodes).

The regular place for iSCSI database files: /etc/iscsi/nodes

The iscsiadm utility is a command-line tool to manage (update, delete,
insert, query) the persistent database.

The utility presents a set of operations that a user can perform on
iSCSI nodes, sessions, connections, and discovery records.

Open-iscsi does not use the term node as defined by the iSCSI RFC,
where a node is a single iSCSI initiator or target. Open-iscsi uses
the term node to refer to a portal on a target, so tools like iscsiadm
require that the --targetname and --portal arguments be used when in
node mode.

For session mode, a session id (sid) is used. The sid of a session can
be found by running "iscsiadm -m session". The session id is not
currently persistent and is partially determined by when the session
is set up.

Note that some of the iSCSI Node and iSCSI Discovery operations do not
require the iSCSI daemon (iscsid) to be running.

For help, run:

    ./iscsiadm --help

Usage: iscsiadm [OPTION]

  -m, --mode           specify operational mode:
                       op = [discovery|node|session|iface|host]

  -m discovery --type=[type] --interface=[iscsi_ifacename] \
     --portal=[ip:port] --login --print=[N] \
     --op=[NEW | UPDATE | DELETE | NONPERSISTENT]

     perform [type] discovery for the target portal with ip-address
     [ip] and port [port].

     This command will not use the discovery record settings. It will
     use the iscsid.conf discovery settings, and it will overwrite the
     discovery record with the iscsid.conf discovery settings if it
     exists. By default, it will then remove records for portals no
     longer returned. And, if a portal is returned by the target, then
     the discovery command will create a new record or modify an
     existing one with values from iscsid.conf and the command line.

     [op] can be passed in multiple times to this command, and it will
     alter the DB manipulation.

     If [op] is passed in and the value is "new", iscsiadm will add
     records for portals that do not yet have records in the db.

     If [op] is passed in and the value is "update", iscsiadm will
     update node records, using info from iscsid.conf and the command
     line, for portals that are returned during discovery and have a
     record in the db.
     If [op] is passed in and the value is "delete", iscsiadm will
     delete records for portals that were not returned during
     discovery.

     If [op] is passed in and the value is "nonpersistent", iscsiadm
     will not store the portals found in the node DB.

     See the example section for more info.

     See below for how to setup iscsi ifaces for software iscsi or
     override the system defaults. Multiple ifaces can be passed in
     during discovery.

  -m discovery --print=[N]

     display all discovery records from the internal persistent
     discovery database.

  -m discovery --interface=[iscsi_ifacename] --portal=[ip:port] \
     --print=[N] --discover

     perform discovery based on the portal in the database. See above
     for info on the interface argument.

     For the above commands "print" is optional. If used, N can be
     0 or 1.
     0 = The old flat style of output is used.
     1 = The tree style with the interface info is used.
     If print is not used, the old flat style is used.

  -m discovery --interface=[iscsi_ifacename] --portal=[ip:port] \
     --print=[N] --login

     perform discovery based on the portal in the database, and log
     into the portals found during discovery. See above for info on
     the interface argument.

     For the above commands "print" is optional. If used, N can be
     0 or 1.
     0 = The old flat style of output is used.
     1 = The tree style with the interface info is used.
     If print is not used, the old flat style is used.

  -m discovery --portal=[ip:port] --op=[op] [--name=[name] --value=[value]]

     perform specific DB operation [op] for a specific discovery
     portal. It could be one of: [new], [delete], [update] or [show].
     In case of [update], you have to provide the [name] and [value]
     you wish to update.

  -m node

     display all discovered nodes from the internal persistent
     discovery database.

  -m node --targetname=[name] --portal=[ip:port] \
     --interface=[iscsi_ifacename] \
     [--login|--logout|--rescan|--stats]

  -m node --targetname=[name] --portal=[ip:port] \
     --interface=[driver,HWaddress] \
     --op=[op] [--name=[name] --value=[value]]

  -m node --targetname=[name] --portal=[ip:port] \
     --interface=[iscsi_ifacename] --print=[level]

     perform specific DB operation [op] for a specific interface on a
     host that will connect to the portal on the target. targetname,
     portal and interface are optional. See below for how to setup
     iscsi ifaces for software iscsi or override the system defaults.

     op could be one of: [new], [delete], [update] or [show]. In case
     of [update], you have to provide the [name] and [value] you wish
     to update.

     [delete] - Note that if a session is using the node record, the
     session will be logged out and then the record will be deleted.

     Print level can be 0 to 1.

     Rescan will perform a SCSI layer scan of the session to find new
     LUNs.

     Stats prints the iSCSI stats for the session.

  -m node --logoutall=[all|manual|automatic]

     Logout "all" the running sessions, or just the ones with a node
     startup value of manual or automatic. Nodes marked as ONBOOT are
     skipped.

  -m node --loginall=[all|manual|automatic]

     Login "all" the running sessions, or just the ones with a node
     startup value of manual or automatic. Nodes marked as ONBOOT are
     skipped.

  -m session

     display all active sessions and connections.

  -m session --sid=[sid] [ --print=level | --rescan | --logout ] \
     --op=[op] [--name=[name] --value=[value]]

     perform operation for a specific session with session id sid. If
     no sid is given, the operation will be performed on all running
     sessions if possible. --logout and --op work like they do in node
     mode, but in session mode the targetname and portal info is not
     passed in.

     Print level can be 0 to 3.
     1 = Print basic session info, like the node we are connected to
         and whether we are connected.
     2 = Print the iscsi params used.
     3 = Print SCSI info like LUNs and device state.

     If no sid and no operation is given, print out the running
     sessions.

  -m iface --interface=iscsi_ifacename --op=[op] \
     [--name=[name] --value=[value]] --print=level

     perform operation on the given interface with name
     iscsi_ifacename. See below for examples.

  -m host --host=hostno --print=level

     Display information for a specific host if hostno is passed in.
     If no hostno is passed in, then info for all hosts is printed.

     Print level can be 0 to 4.
     1 = Print info for the host, like its state, MAC, and netinfo if
         possible.
     2 = Print basic session info for nodes the host is connected to.
     3 = Print the iscsi params used.
     4 = Print SCSI info like LUNs and device state.

  -d, --debug debuglevel   print debugging information
  -V, --version            display version and exit
  -h, --help               display this help and exit

5.1 iSCSI iface setup
=====================

The next sections describe how to setup iSCSI ifaces so you can bind a
session to a NIC port when using software iscsi (section 5.1.1), and
how to setup ifaces for use with offload cards from Chelsio and
Broadcom (section 5.1.2).

5.1.1 How to setup iSCSI interfaces (iface) for binding
=======================================================

If you wish to allow the network subsystem to figure out the best
path/NIC to use, then you can skip this section. For example, if you
have setup your portals and NICs on different subnets, then the
following is not needed for software iscsi.

Warning!!!!!! This feature is experimental. The interface may change.

When reporting bugs, if you cannot do a "ping -I ethX target_portal",
then check your network settings first. If you cannot ping the portal,
then you will not be able to bind a session to a NIC.

What is a scsi_host and iface for software, hardware and partial
offload iscsi?

Software iscsi, like iscsi_tcp and iser, allocates a scsi_host per
session and does a single connection per session. As a result,
/sys/class_scsi_host and /proc/scsi will report a scsi_host for each
connection/session you have logged into. Offload iscsi, like Chelsio
cxgb3i, allocates a scsi_host for each PCI device (each port on an HBA
will show up as a different PCI device, so you get a scsi_host per HBA
port).

To manage both types of initiator stacks, iscsiadm uses the interface
(iface) structure. For each HBA port, or (for software iscsi) for each
network device (ethX) or NIC that you wish to bind sessions to, you
must create an iface config in /etc/iscsi/ifaces.

Running:

    # iscsiadm -m iface
    iface0 qla4xxx,00:c0:dd:08:63:e8,20.15.0.7,default,iqn.2005-06.com.redhat:madmax
    iface1 qla4xxx,00:c0:dd:08:63:ea,20.15.0.9,default,iqn.2005-06.com.redhat:madmax

will report the iface configurations that are setup in
/etc/iscsi/ifaces. The format is:

    iface_name transport_name,hwaddress,ipaddress,net_ifacename,initiatorname

For software iscsi, you can create the iface configs by hand, but it
is recommended that you use iscsiadm's iface mode. There is an
iface.example in /etc/iscsi/ifaces which can be used as a template for
the daring.

For each network object you wish to bind a session to, you must create
a separate iface config in /etc/iscsi/ifaces, and each iface config
file must have a unique name which is less than or equal to 64
characters.

Example: if you have NIC1 with MAC address 00:0F:1F:92:6B:BF and NIC2
with MAC address 00:C0:DD:08:63:E7, and you wanted to do software
iscsi over TCP/IP.
Then in /etc/iscsi/ifaces/iface0 you would enter:

    iface.transport_name = tcp
    iface.hwaddress = 00:0F:1F:92:6B:BF

and in /etc/iscsi/ifaces/iface1 you would enter:

    iface.transport_name = tcp
    iface.hwaddress = 00:C0:DD:08:63:E7

Warning: do not name an iface config file "default" or "iser". These
are special values/files used by the iscsi tools for backward
compatibility. If you name an iface "default" or "iser", the behavior
is not defined.

To use iscsiadm to create iface0 above for you, run:

(This will create a new empty iface config. If there was already an
iface with the name "iface0", this command will overwrite it.)

    # iscsiadm -m iface -I iface0 --op=new

(This will set the hwaddress.)

    # iscsiadm -m iface -I iface0 --op=update -n iface.hwaddress -v 00:0F:1F:92:6B:BF

If you have sessions logged in, iscsiadm will not update or overwrite
an iface. You must log out first. If you have an iface bound to a
node/portal but you have not logged in, then iscsiadm will update the
config and all existing bindings.

You should now skip to 5.1.3 to see how to log in using the iface and
for some helpful management commands.

5.1.2 Setting up an iface for an iSCSI offload card
===================================================

This section describes how to setup ifaces for use with Chelsio and
Broadcom cards.

By default, iscsiadm will create an iface for each Broadcom and
Chelsio port. The iface name will be of the form:

    $transport/driver_name.$MAC_ADDRESS

Running:

    # iscsiadm -m iface
    default tcp,,,,
    iser iser,,,,
    cxgb3i.00:07:43:05:97:07 cxgb3i,00:07:43:05:97:07,,,

will report the iface configurations that are setup in
/etc/iscsi/ifaces. The format is:

    iface_name transport_name,hwaddress,ipaddress,net_ifacename,initiatorname

    iface_name: name of the iface
    transport_name: name of the driver
    hwaddress: MAC address
    ipaddress: IP address to use for this port
    net_ifacename: will be blank, because the value can change between
        reboots. It is used for software iSCSI's vlan or alias binding.
    initiatorname: initiator name to be used if you want to override
        the default one in /etc/iscsi/initiatorname.iscsi.

To display these values in a more friendly format, run:

    iscsiadm -m iface -I cxgb3i.00:07:43:05:97:07
    # BEGIN RECORD 2.0-871
    iface.iscsi_ifacename = cxgb3i.00:07:43:05:97:07
    iface.net_ifacename =
    iface.ipaddress =
    iface.hwaddress = 00:07:43:05:97:07
    iface.transport_name = cxgb3i
    iface.initiatorname =
    # END RECORD

Before you can use the iface, you must set the IP address for the port
with the following command:

    iscsiadm -m iface -I cxgb3i.00:07:43:05:97:07 -o update \
        -n iface.ipaddress -v 20.15.0.66

Note: for the name of the value we want to update, we use the name
from the "iscsiadm -m iface -I cxgb3i.00:07:43:05:97:07" output, which
is "iface.ipaddress".

Now we can use this iface to log in to targets, which is described in
the next section.

5.1.3 Discovering iSCSI targets/portals
=======================================

Be aware that iscsiadm will use the default route to do discovery. It
will not use the iface specified. So if you are using an offload card,
you will need a separate network connection to the target for
discovery purposes.

*This will be fixed in the next version of open-iscsi.*

For compatibility reasons, when you run iscsiadm to do discovery, it
will check for interfaces in /etc/iscsi/ifaces that are using tcp for
the iface.transport, and it will bind the portals that are discovered
so that they will be logged in through those ifaces. This behavior can
also be overridden by passing in the interfaces you want to use.
For the case of offload, like with cxgb3i and bnx2i, this is required,
because the transport will not be tcp.

For example, if you had defined two interfaces but only wanted to use
one, you can use the --interface/-I argument:

    iscsiadm -m discovery -t st -p ip:port -I iface1 -P 1

If you had defined interfaces but wanted the old behavior, where we do
not bind a session to an iface, then you can use the special iface
"default":

    iscsiadm -m discovery -t st -p ip:port -I default -P 1

And if you did not define any interfaces in /etc/iscsi/ifaces and do
not pass anything into iscsiadm, running iscsiadm will do the default
behavior, where we allow the network subsystem to decide which device
to use.

If you later want to remove the bindings for a specific target and
iface, then you can run:

    iscsiadm -m node -T my_target -I iface0 --op=delete

To do this for a specific portal on a target, run:

    iscsiadm -m node -T my_target -p ip:port -I iface0 --op=delete

If you want to delete all bindings for iface0, then you can run:

    iscsiadm -m node -I iface0 --op=delete

And for EqualLogic targets it is sometimes useful to remove by just
the portal:

    iscsiadm -m node -p ip:port -I iface0 --op=delete

Logging into targets is now the same as with software iscsi. See
section 7 for how to get started.

5.2 iscsiadm examples
=====================

Usage examples using the one-letter options (see the iscsiadm man page
for the long options):

Discovery mode:

- SendTargets iSCSI Discovery using the default driver and interface:

    ./iscsiadm -m discovery -t sendtargets -p 192.168.1.1:3260

  This will first search /etc/iscsi/ifaces for interfaces using
  software iscsi. If any are found, then nodes found during discovery
  will be setup so that they can be logged in through those
  interfaces.

- SendTargets iSCSI Discovery updating existing records:

    ./iscsiadm -m discovery -t sendtargets -p 192.168.1.1:3260 \
        -o update

  If a record for targetX and portalY exists in the DB, and is
  returned during discovery, it will be updated with the info from
  iscsid.conf. No new portals will be added and stale portals will
  not be removed.

- SendTargets iSCSI Discovery deleting existing records:

    ./iscsiadm -m discovery -t sendtargets -p 192.168.1.1:3260 \
        -o delete

  If a record for targetX and portalY exists in the DB, but is not
  returned during discovery, it will be removed from the DB. No new
  portals will be added and existing portal records will not be
  changed.

  Note: if a session is logged into a portal whose record we are
  going to delete, it will be logged out and then the record will be
  deleted.

- SendTargets iSCSI Discovery adding new records:

    ./iscsiadm -m discovery -t sendtargets -p 192.168.1.1:3260 \
        -o new

  If targetX and portalY is returned during discovery and does not
  have a record, it will be added. Existing records are not modified.

- SendTargets iSCSI Discovery using multiple ops:

    ./iscsiadm -m discovery -t sendtargets -p 192.168.1.1:3260 \
        -o new -o delete

  This command will add new portals and delete records for portals no
  longer returned. It will not change the record information for
  existing portals.

- SendTargets iSCSI Discovery in nonpersistent mode:

    ./iscsiadm -m discovery -t sendtargets -p 192.168.1.1:3260 \
        -o nonpersistent

  This command will perform discovery, but not manipulate the node DB.

- SendTargets iSCSI Discovery with a specific interface.
  If you wish to only use a subset of the interfaces in
  /etc/iscsi/ifaces, then you can pass them in during discovery:

    ./iscsiadm -m discovery -t sendtargets -p 192.168.1.1:3260 \
        --interface=iface0 --interface=iface1

  Note that for software iscsi, we let the network layer select which
  NIC to use for discovery, but for later logins iscsiadm will use the
  NIC defined in the iface config.

  qla4xxx support is very basic and experimental. It does not store
  the record info in the card's FLASH or in the node DB, so you must
  rerun discovery every time the driver is reloaded.

- SendTargets iSCSI Discovery using the default driver and interface,
  and using the discovery settings for the discovery record with the
  ID [192.168.1.1:3260]:

    ./iscsiadm -m discovery -p 192.168.1.1:3260 --discover

  This will search /etc/iscsi/send_targets for a record with the ID
  [192.168.1.1:3260]. If found, it will perform discovery using the
  settings stored in the record. For the ifaces, this will first
  search /etc/iscsi/ifaces for interfaces using software iscsi. If
  any are found, then nodes found during discovery will be setup so
  that they can be logged in through those interfaces.

  This command also accepts the -o new, delete and update settings
  like above.

Node mode:

In node mode you can specify which records you want to log into by
specifying the targetname, ip address, port or interface (if
specifying the interface, it must already be setup in the node db).
iscsiadm will search the node db for records which match the values
you pass in, so if you pass in the targetname and interface, iscsiadm
will search for records with those values and operate on only them.
Passing in none of them will result in all node records being operated
on.

- iSCSI login to all portals on every node/target through each
  interface set in the db:

    ./iscsiadm -m node -l

- iSCSI login to all portals on a node/target through each interface
  set in the db:

    ./iscsiadm -m node -T iqn.2005-03.com.max -l

- iSCSI login to a specific portal through each interface set in the
  db:

    ./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260 -l

  To specify an IPv6 address, the following can be used:

    ./iscsiadm -m node -T iqn.2005-03.com.max \
        -p 2001:c90::211:9ff:feb8:a9e9 -l

  The above command would use the default port, 3260. To specify a
  port, use the following:

    ./iscsiadm -m node -T iqn.2005-03.com.max \
        -p [2001:c90::211:9ff:feb8:a9e9]:3260 -l

- iSCSI login to a specific portal through the NIC setup as iface0:

    ./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260 \
        -I iface0 -l

- iSCSI logout from all portals on every node/target through each
  interface set in the db:

    ./iscsiadm -m node -u

  Warning: this does not check startup values like the logout/login
  all option does. Do not use this if you are running iscsi on your
  root disk.

- iSCSI logout from all portals on a node/target through each
  interface set in the db:

    ./iscsiadm -m node -T iqn.2005-03.com.max -u

- iSCSI logout from a specific portal through each interface set in
  the db:

    ./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260 -u

- iSCSI logout from a specific portal through the NIC setup as iface0:

    ./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260 \
        -I iface0 -u

- Changing an iSCSI parameter:

    ./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260 \
        -o update -n node.conn[0].iscsi.MaxRecvDataSegmentLength -v 65536

  You can also change parameters for multiple records at once, by
  specifying different combinations of the target, portal and
  interface like above.
- Adding a custom iSCSI portal:

    ./iscsiadm -m node -o new -T iqn.2005-03.com.max \
        -p 192.168.0.1:3260,2 -I iface4

  The -I/--interface is optional. If not passed in, "default" is
  used. For tcp or iser, this would allow the network layer to decide
  what is best.

  Note that for this command the target portal group tag (TPGT)
  should be passed in. If it is not passed in on the initial creation
  command, then the user must run iscsiadm again to set the value.
  Also, if the TPGT is not initially passed in, the old behavior of
  not tracking whether the record was statically or dynamically
  created is used.

- Adding a custom NIC config to multiple targets:

    ./iscsiadm -m node -o new -I iface4

  This command will add an interface config, using the iSCSI and SCSI
  settings from iscsid.conf, to every target that is in the node db.

- Removing an iSCSI portal:

    ./iscsiadm -m node -o delete -T iqn.2005-03.com.max -p 192.168.0.4:3260

  You can also delete multiple records at once, by specifying
  different combinations of the target, portal and interface like
  above.

- Display iSCSI portal configuration:

    ./iscsiadm -m node -T iqn.2005-03.com.max -p 192.168.0.4:3260

  or

    ./iscsiadm -m node -o show -T iqn.2005-03.com.max -p 192.168.0.4:3260

  You can also display multiple records at once, by specifying
  different combinations of the target, portal and interface like
  above.

  Note: running "iscsiadm -m node" will only display the records. It
  will not display the configuration info. You must run
  "iscsiadm -m node -o show".

- Show all node records:

    ./iscsiadm -m node

  This will print the nodes using the old flat format, where the
  interface and driver are not displayed. To display that info, use
  the -P argument with the argument "1":

    ./iscsiadm -m node -P 1

- Show all records in the discovery database:

    ./iscsiadm -m discovery

- Show all records in the discovery database and show the targets
  that were discovered from each record:

    ./iscsiadm -m discovery -P 1

- Display discovery record settings:

    ./iscsiadm -m discovery -p 192.168.0.4:3260

- Display session statistics:

    ./iscsiadm -m session -r 1 --stats

  This function also works in node mode. Instead of the "-r $sid"
  argument, you would pass in the node info like targetname and/or
  portal, and/or interface.

- Perform a SCSI scan on a session:

    ./iscsiadm -m session -r 1 --rescan

  This function also works in node mode. Instead of the "-r $sid"
  argument, you would pass in the node info like targetname and/or
  portal, and/or interface.

  Note: rescanning does not delete old LUNs. It will only pick up new
  ones.

- Display running sessions:

    ./iscsiadm -m session -P 1

6. Configuration
================

The default configuration file is /etc/iscsi/iscsid.conf. This file
contains only configuration that could be overwritten by iSCSI
Discovery, or manually updated via the iscsiadm utility. It is OK if
this file does not exist, in which case the compiled-in default
configuration will be used for newly discovered Target nodes.

See the man page and the example file for the current syntax. The man
pages for iscsid and iscsiadm are in the doc subdirectory and need to
be manually copied into the appropriate man page directory, e.g.
/usr/local/share/man8.
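For orientation, the fragment below sketches a couple of commonly
edited iscsid.conf settings. The CHAP username and password are
illustrative placeholders, not values from this document; the full
list of options and their defaults is in the example iscsid.conf
shipped with the package:

    # start sessions to discovered nodes automatically when the iscsi
    # service starts (see section 7.3)
    node.startup = automatic

    # enable CHAP authentication for normal sessions; replace the
    # placeholder credentials with the ones configured on your target
    node.session.auth.authmethod = CHAP
    node.session.auth.username = someuser
    node.session.auth.password = somepassword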
7. Getting Started
==================

There are three steps needed to set up a system to use iSCSI storage:

7.1. iSCSI startup using the init script or manual startup.
7.2. Discover targets.
7.3. Automate target logins for future system reboots.

The init scripts will start the iSCSI daemon and log into any portals
that are set up for automatic login (discussed in 7.3) or discovered
through the discovery daemon iscsid.conf params (discussed in 7.1.2).

If your distro does not have an init script, then you will have to
start the daemon and log into the targets manually.

7.1.1 iSCSI startup using the init script
-----------------------------------------

Red Hat or Fedora:
------------------
To start open-iscsi in Red Hat/Fedora you can do:

    service open-iscsi start

To get open-iscsi to automatically start at run time you may have to
run:

    chkconfig --level [levels] open-iscsi on

where [levels] are the run levels.

And, to automatically mount a file system during startup, you must
have the partition entry in /etc/fstab marked with the "_netdev"
option. For example, this would mount an iscsi disk sdb:

    /dev/sdb /mnt/iscsi ext3 _netdev 0 0

SUSE or Debian:
---------------
Otherwise, if there is an init script for your distro in etc/initd
that gets installed with "make install",

    /etc/init.d/open-iscsi start

will usually get you started.

7.1.2 Automatic Discovery and Login
-----------------------------------

When iscsid starts, it will check iscsid.conf for:

    discovery.daemon.sendtargets.addresses =
    discovery.daemon.sendtargets.poll_interval =
    discovery.daemon.isns.addresses =
    discovery.daemon.isns.poll_interval =

being set. If an address or addresses are set, iscsid will perform
discovery to the address every poll_interval seconds, and it will log
into any portals found from the discovery source using the ifaces in
/etc/iscsi/ifaces.

Note that for iSNS the poll_interval does not have to be set. If it is
not set, iscsid will only perform rediscovery when it gets a SCN from
the server. SCNs are not supported when using the Microsoft or SLES
iSNS server. If using one of them, you should set the poll_interval.

See iscsid.conf for more examples.

7.1.3 Manual Startup
--------------------

7.1.3.1 Starting up the iSCSI daemon (iscsid) and loading modules
-----------------------------------------------------------------

If there is no init script, you must start the tools by hand. First
load the iscsi modules with:

    modprobe -q iscsi_tcp

After that, start the iSCSI daemon process:

    ./iscsid

or alternatively, start it with debug enabled and with output
redirected to the current console:

    ./iscsid -d 8 -f &

7.1.3.2 Logging into Targets
----------------------------

Use the configuration utility, iscsiadm, to add/remove/update
Discovery records and iSCSI Node records, or to monitor active iSCSI
sessions (see above or the iscsiadm man page, and see section 7.2
below for how to discover targets):

    ./iscsiadm -m node

will print out the nodes that have been discovered, as:

    10.15.85.19:3260,3 iqn.1992-08.com.netapp:sn.33615311
    10.15.84.19:3260,2 iqn.1992-08.com.netapp:sn.33615311

The format is:

    ip:port,target_portal_group_tag targetname

If you are using the iface argument or want to see the driver info,
use the following:

    ./iscsiadm -m node -P 1

    Target: iqn.1992-08.com.netapp:sn.33615311
        Portal: 10.15.84.19:3260,2
            Iface Name: iface2
        Portal: 10.15.85.19:3260,3
            Iface Name: iface2

The format is:

    Target: targetname
        Portal: ip_address:port,tpgt
            Iface Name: iface

where targetname is the name of the target, ip_address:port is the
address and port of the portal, tpgt is the portal group tag of the
portal (it is not used in iscsiadm commands except for static record
creation), and iface name is the name of the iscsi interface defined
in /etc/iscsi/ifaces.
If no interface was defined in /etc/iscsi/ifaces or passed in, the
default behavior is used. Default here is iscsi_tcp/tcp, used over
whichever NIC the network layer decides is best.

To login, take the ip, port and targetname from above and run:

    ./iscsiadm -m node -T targetname -p ip:port -l

In this example we would run:

    ./iscsiadm -m node -T iqn.1992-08.com.netapp:sn.33615311 \
        -p 10.15.84.19:3260 -l

Note: drop the portal group tag from the "iscsiadm -m node" output.

7.2. Discover Targets
---------------------

Once the iSCSI service is running, you can perform discovery using
SendTargets with:

    iscsiadm -m discovery -t sendtargets -p ip:port

where "ip" is the address of the portal and "port" is the port.

To use iSNS, you can run the discovery command with the type as "isns"
and pass in the ip:port:

    iscsiadm -m discovery -t isns -p ip:port

Both commands will print out the list of all discovered targets and
their portals:

    # iscsiadm -m discovery -t st -p 10.15.85.19:3260
    10.15.84.19:3260,2 iqn.1992-08.com.netapp:sn.33615311
    10.15.85.19:3260,3 iqn.1992-08.com.netapp:sn.33615311

The format of the output is:

    ip:port,tpgt targetname

In this example, the target iqn.1992-08.com.netapp:sn.33615311 has two
portals: 10.15.84.19:3260 with target portal group tag 2, and
10.15.85.19:3260 with target portal group tag 3.

If you would also like to see the iscsi interface which will be used
for each session, then use the --print=[N] option:

    iscsiadm -m discovery -t sendtargets -p ip:port -P 1

will print:

    Target: iqn.1992-08.com.netapp:sn.33615311
        Portal: 10.15.84.19:3260,2
            Iface Name: iface2
        Portal: 10.15.85.19:3260,3
            Iface Name: iface2

In this example, the IP address of the first portal is 10.15.84.19,
the port is 3260, the target portal group tag is 2, the target name is
iqn.1992-08.com.netapp:sn.33615311, and the iface being used is
iface2.

While discovery targets are kept in the discovery db, they are useful
only for re-discovery. The discovered targets (a.k.a. nodes) are
stored as records in the node db.

The discovered targets are not logged into yet. Rather than logging
into the discovered nodes (making LUs from those nodes available as
storage), it is better to automate the login to the nodes we need. If
you wish to log into a target manually now, see section
"7.1.3.2 Logging into Targets" above.

7.3. Automate Target Logins for Future System Startups
------------------------------------------------------

Note: this may only work for distros with init scripts.

To automate login to a node, use the following with the record ID
(record ID is the targetname and portal) of the node discovered in the
discovery above:

    iscsiadm -m node -T targetname -p ip:port --op update \
        -n node.startup -v automatic

To set the automatic setting for all portals on a target, through
every interface setup for each portal, the following can be run:

    iscsiadm -m node -T targetname --op update -n node.startup -v automatic

Or, to set "automatic" as the default "node.startup" value for all
sessions, add the following to /etc/iscsi/iscsid.conf:

    node.startup = automatic

Setting this in iscsid.conf will not affect existing nodes. It will
only affect nodes that are discovered after the value is set.

To login to all the automated nodes, simply restart the iscsi service,
e.g. /etc/init.d/open-iscsi restart. On your next startup the nodes
will be logged into automatically.
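To double-check what a record is set to, you can display it and look
at its node.startup value. The target and portal below are the ones
from the earlier example, and the grep is just a convenience; the
exact output line may differ slightly between versions:

    iscsiadm -m node -T iqn.1992-08.com.netapp:sn.33615311 \
        -p 10.15.84.19:3260 -o show | grep node.startup

which should include a line like:

    node.startup = automatic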
8. Advanced Configuration
=========================

8.1 iSCSI settings for dm-multipath
-----------------------------------

When using dm-multipath, the iSCSI timers should be set so that
commands are quickly failed to the dm-multipath layer. For
dm-multipath you should then set values like queue_if_no_path, so that
IO errors are retried and queued if all paths are failed in the
multipath layer.

8.1.1 iSCSI ping/Nop-Out settings
---------------------------------

To quickly detect problems in the network, the iSCSI layer will send
iSCSI pings (iSCSI NOP-Out requests) to the target. If a NOP-Out times
out, the iSCSI layer will respond by failing running commands and
asking the SCSI layer to requeue them if possible (SCSI disk commands
get 5 retries if not using multipath). If dm-multipath is being used,
the SCSI layer will fail the command to the multipath layer instead of
retrying. The multipath layer will then retry the command on another
path.

To control how often a NOP-Out is sent, the following value can be
set:

    node.conn[0].timeo.noop_out_interval = X

where X is in seconds and the default is 10 seconds. To control the
timeout for the NOP-Out, the noop_out_timeout value can be used:

    node.conn[0].timeo.noop_out_timeout = X

Again X is in seconds and the default is 15 seconds.

Normally for these values you can use:

    node.conn[0].timeo.noop_out_interval = 5
    node.conn[0].timeo.noop_out_timeout = 10

If there are a lot of IO error messages, then the above values may be
too aggressive, and you may need to increase the values for your
network conditions and workload, or you may need to check your network
for possible problems.

8.1.2 replacement_timeout
-------------------------

The next iSCSI timer that will need to be tweaked is:

    node.session.timeo.replacement_timeout = X

Here X is in seconds. replacement_timeout controls how long to wait
for session re-establishment before failing pending SCSI commands, and
commands that are being operated on by the SCSI layer's error handler,
up to a higher level like multipath, or to an application if multipath
is not being used.

8.1.2.1 Running Commands, the SCSI Error Handler, and replacement_timeout
--------------------------------------------------------------------------

Remember, from the Nop-Out discussion, that if a network problem is
detected, the running commands are failed immediately. There is one
exception to this, and that is when the SCSI layer's error handler is
running. To check if the SCSI error handler is running, iscsiadm can
be run as:

    iscsiadm -m session -P 3

You will then see:

    Host Number: X
        State: Recovery

When the SCSI EH is running, commands will not be failed until
node.session.timeo.replacement_timeout seconds have elapsed.

To modify the timer that starts the SCSI EH, you can either write
directly to the device's sysfs file:

    echo X > /sys/block/sdX/device/timeout

where X is in seconds, or on most distros you can modify the udev
rule. To modify the udev rule, open /etc/udev/rules.d/50-udev.rules
and find the following lines:

    ACTION=="add", SUBSYSTEM=="scsi" , SYSFS{type}=="0|7|14", \
        RUN+="/bin/sh -c 'echo 60 > /sys$$DEVPATH/timeout'"

and change the "echo 60" part of the line to the value that you want.

The default timeout for normal File System commands is 30 seconds when
udev is not being used. If udev is used, the default is the above
value, which is normally 60 seconds.

8.1.2.2 Pending Commands and replacement_timeout
------------------------------------------------

Commonly, the SCSI/BLOCK layer will queue 128 commands, but the path
can only take 32.
When a network problem is detected, the 32 commands in flight will be
sent back to the SCSI layer immediately, and because multipath is
being used, this will cause the commands to be sent to the multipath
layer for execution on another path. However, the other 96 commands
that were still in the SCSI/BLOCK queue will remain there until the
session is re-established or until
node.session.timeo.replacement_timeout seconds have gone by.

After replacement_timeout seconds, the pending commands will be failed
to the multipath layer, and all new incoming commands will be
immediately failed back to the multipath layer. If a session is later
re-established, then new commands will be queued and executed.
Normally, multipathd's path tester mechanism will detect that the
session has been re-established and the path is accessible again, and
it will inform dm-multipath.

8.1.3 Optimal replacement_timeout Value
---------------------------------------

The default value for replacement_timeout is 120 seconds, but because
multipath's queue_if_no_path and no_path_retry settings can prevent IO
errors from being propagated to the application, replacement_timeout
can be set to a shorter value like 5 to 15 seconds. By setting it
lower, pending IO is quickly sent to a new path and executed while the
iSCSI layer attempts re-establishment of the session. If all paths end
up being failed, then the multipath and device mapper layer will
internally queue IO based on the multipath.conf settings, instead of
the iSCSI layer.
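As a rough illustration of the multipath side of this tuning, a
multipath.conf fragment along the following lines keeps IO queued for
a bounded time when all paths are down. The numbers are illustrative
only; see multipath.conf(5) and your distro's documentation for the
exact syntax and defaults:

    defaults {
        # check paths every 5 seconds
        polling_interval   5
        # if all paths are down, keep queueing IO for up to 12 path
        # checks (roughly 60 seconds here) before failing it up to
        # the application
        no_path_retry      12
    }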
8.2 iSCSI settings for iSCSI root
---------------------------------

When accessing the root partition directly through an iSCSI disk, the
iSCSI timers should be set so that the iSCSI layer has several chances
to try to re-establish a session, and so that commands are not quickly
requeued to the SCSI layer. Basically you want the opposite of when
using dm-multipath.

For this setup, you can turn off iSCSI pings by setting:

    node.conn[0].timeo.noop_out_interval = 0
    node.conn[0].timeo.noop_out_timeout = 0

And you can set replacement_timeout to a very long value:

    node.session.timeo.replacement_timeout = 86400

9. iSCSI System Info
====================

To get information about the running sessions, including the session
and device state, the session ids (sid) for session mode, and some of
the negotiated parameters, run:

    iscsiadm -m session -P 2

If you are looking for something shorter, like just the sid to node
mapping, run:

    iscsiadm -m session -P 0
or
    iscsiadm -m session

This will print the list of running sessions with the format:

    driver [sid] ip:port,target_portal_group_tag targetname

    # iscsiadm -m session
    tcp [2] 10.15.84.19:3260,2 iqn.1992-08.com.netapp:sn.33615311
    tcp [3] 10.15.85.19:3260,3 iqn.1992-08.com.netapp:sn.33615311

To print the hw address info, use the -P option with "1":

    iscsiadm -m session -P 1

This will print the sessions with the following format:

    Target: targetname
        Current Portal: portal currently logged into
        Persistent Portal: portal we would fall back to if we had
                           got redirected during login
            Iface Transport: driver/transport_name
            Iface IPaddress: IP address of iface being used
            Iface HWaddress: HW address used to bind the session
            Iface Netdev: netdev value used to bind the session
            SID: iscsi sysfs session id
            iSCSI Connection State: iscsi state

Note: if an older kernel is being used or if the session is not bound,
then the keyword "default" is printed to indicate that the default
network behavior is being used.

Example:

    # iscsiadm -m session -P 1
    Target: iqn.1992-08.com.netapp:sn.33615311
        Current Portal: 10.15.85.19:3260,3
        Persistent Portal: 10.15.85.19:3260,3
            Iface Transport: tcp
            Iface IPaddress: 10.11.14.37
            Iface HWaddress: default
            Iface Netdev: default
            SID: 7
            iSCSI Connection State: LOGGED IN
            Internal iscsid Session State: NO CHANGE

The connection state is currently not available for qla4xxx.

To get an HBA/Host view of the sessions, there is the host mode:

    iscsiadm -m host

    cxgb3i: [7] 10.10.15.51,[00:07:43:05:97:07],eth3

This prints the list of iSCSI hosts in the system with the format:

    driver [hostno] ipaddress,[hwaddress],net_ifacename,initiatorname

To print this info in a more user friendly way, the -P argument can be
used:

    iscsiadm -m host -P 1

    Host Number: 7
        State: running
        Transport: cxgb3i
        Initiatorname:
        IPaddress: 10.10.15.51
        HWaddress: 00:07:43:05:97:07
        Netdev: eth3

Here you can also see the state of the host.

You can also pass in any value from 1 - 4 to print more info, like the
sessions running through the host, what ifaces are being used and what
devices are accessed through it.

To print the info for a specific host, you can pass in the -H argument
with the host number:

    iscsiadm -m host -P 1 -H 7
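As a final illustration, combining the two options above, the most
verbose host view for host 7 would be requested with the command
below; per the print levels described earlier, this should add session
and SCSI device details to the -P 1 listing shown above (the exact
output varies with the driver and configuration):

    iscsiadm -m host -P 4 -H 7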