Question
How do I configure a highly available iSCSI + multipath DataStore using a Nexenta SAN in OnApp?
Assumptions
The client uses CentOS 6.x or CentOS 5.x (your HVs).
All your HVs and your Backup server will be able to access all the targets on the SAN (network is configured).
The HVs and the SAN have two dedicated NICs each, in separate subnets, e.g. 10.99.140.41 and 10.99.139.41.
Objectives
This how-to article is based on https://github.com/szaydel/Nexenta-Docs-Pub/blob/master/NexentaStorHigh-AvailabilityiSCSIwithLinuxClients.mdown, adapted to our environment; it also explains how to configure the resulting paths and create the DataStores within OnApp. It is provided as-is and is not supported by our support staff. If you have any doubts about HA, etc., ask your hardware vendor to review the configuration.
Settings Used
iSCSI Target Names
iqn.2010-08.org.example:onappiscsi
iqn.1986-03.com.nexenta:ecd714e43149
SCSI (NEXENTASTOR) Target Portal Group name(s) and associated IP(s): TPG / ip:port
tpg_lab9_e1000g1 / 10.99.140.41:3260
tpg_lab9_e1000g2 / 10.99.139.41:3260
SCSI (NEXENTASTOR) Target Group name(s)
tg_onapp
SCSI (NEXENTASTOR) Initiator Group(s)
hg_onapp
Initiator(s) from the CentOS client
iqn.2011-10.test:1:server-testonapp-hv1
Answer
Steps on Nexenta SAN:
Perform the following steps via Management GUI (NMV):
1. First, we create the target portal groups, to which we will bind our iSCSI targets.
Navigate to Data Management > SCSI Target/Target Plus > iSCSI > Target Portal Groups. Click on Create under Manage iSCSI TPGs.
2. Enter the name (in our case tpg_lab9_e1000g1) in the Name field and the IP Address:Port (in our case 10.99.140.41:3260) in the Addresses field, then click Create. We chose a descriptive name for the Target Portal Groups that includes the name of the network interface to which the group is bound.
3. Repeat the above step for the second Target Portal Group. Each group is bound to a single IP address. If more than two interfaces are being used, repeat these actions for each interface/IP.
Next, we are adding iSCSI Targets, Target Groups and Initiator Groups.
Navigate to Data Management > SCSI Target/Target Plus > iSCSI > Targets. Click Create under Manage iSCSI Targets.
Enter the name (in our case iqn.2010-08.org.example:onappiscsi) in the Name field. You may choose to leave this field empty to auto-generate a name.
We are not using aliases, as they are optional, but you may choose to set up an alias for each target.
4. Under iSCSI Target Portal Groups, select one of the Target Portal Groups created earlier. For simplicity of management, we used :01: and :02: in the names of the targets to designate the first and second target, and we relate the first target to tpg_lab9_e1000g1 and the second target to tpg_lab9_e1000g2. Then click Create.
5. Navigate to Data Management > SCSI Target/Target Plus > SCSI Target > Target Groups. Click on Create under Manage Target Groups.
6. Enter the name (in our case tg_onapp) in the Name field and select all targets (in our case, we select two targets we created earlier), then click Create.
Navigate to Data Management > SCSI Target/Target Plus > SCSI Target > Initiator Groups and click Create.
7. Enter the name (in our case hg_onapp) in the Name field. If the initiator names are already known, you may enter them manually in the Additional Initiators field; otherwise we will update this group later, after logging in from our client(s). Then click Create.
8. At this point, we can create mappings as necessary. ZVOLs have to be created first, prior to setting up mappings. Please refer to the User's Guide for further details on the creation of ZVOLs and mappings. For the purposes of this document, we are using four ZVOLs with LUN IDs 10 through 13, hg_onapp as the Host Group and tg_onapp as the Target Group.
Steps Client-Side:
1. A quick validation of the iscsid service is necessary to make sure it is indeed set up correctly. The command chkconfig --list iscsid should return the state of the iscsid service. We expect it to be enabled in runlevels 3, 4, and 5. If it is not enabled, run chkconfig iscsid on, which assumes the defaults and enables the service in runlevels 3, 4, and 5. Keep in mind that different Linux distributions may have different names for services and different tools/methods to enable or disable automatic startup of services.
onapp-test# chkconfig --list iscsid
iscsid 0:off 1:off 2:off 3:on 4:on 5:on 6:off
2. Next, we validate that the multipathd service is working correctly. The mpathconf utility returns information about the state of the multipath configuration:
onapp-test# mpathconf
multipath is enabled
find_multipaths is disabled
user_friendly_names is enabled
dm_multipath module is loaded
multipathd is chkconfiged on
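If any of these come back disabled, on CentOS 6 the multipath stack can usually be enabled in one step with mpathconf; a sketch (check your distribution's mpathconf documentation, as older releases may differ):
onapp-test# mpathconf --enable --with_multipathd y
This creates a default /etc/multipath.conf if one does not exist (we replace it later in this guide) and starts and enables the multipathd service.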
3. Now we configure our iSCSI initiator settings. Depending on your client, the configuration files may or may not be in the same location(s). We modify /etc/iscsi/initiatorname.iscsi with a custom initiator name. By default, the file will already contain an InitiatorName entry. We replace it with a custom entry; this is strictly optional and should be part of your naming-convention decisions.
InitiatorName=iqn.2011-10.test:1:server-testonapp-hv1
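A minimal way to apply this on CentOS, assuming the initiator name above (adjust to your own naming convention), is to rewrite the file and restart iscsid so the new name is picked up:
onapp-test# echo "InitiatorName=iqn.2011-10.test:1:server-testonapp-hv1" > /etc/iscsi/initiatorname.iscsi
onapp-test# service iscsid restart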
4. We are going to create a virtual iSCSI interface for each physical network interface and bind the physical interfaces to the virtual iSCSI interfaces. The end result will be two new iSCSI interfaces, sw-iscsi-0 and sw-iscsi-1, bound to the physical network interfaces eth4 and eth5. Avoid giving logical interfaces the same names as the physical NICs.
First, we create the logical interfaces; corresponding configuration files named sw-iscsi-0 and sw-iscsi-1 will be generated under /var/lib/iscsi/ifaces:
onapp-test# iscsiadm --mode iface --op=new --interface sw-iscsi-0
onapp-test# iscsiadm --mode iface --op=new --interface sw-iscsi-1
5. Following the two iscsiadm commands, we modify the two newly created configuration files with additional parameters. Below is an example of the configuration files modified with details about the physical interfaces. Note that we are defining parameters specific to each interface, and your configuration will certainly vary.
To quickly collect information about each interface, you can simply use the ip command: ip addr show <interface-name> | egrep 'inet|link'.
This configuration explicitly binds our virtual interfaces to the physical interfaces. Each interface is on its own network, in our case 10.99.140.x and 10.99.139.x.
We can always validate the configuration with: for i in 0 1; do iscsiadm -m iface -I sw-iscsi-$i; done, adjusting 0 and 1 to match the numbers of the iSCSI interfaces in use. An iscsiadm-based alternative to editing the files by hand is sketched after the listings below.
onapp-test# cat /var/lib/iscsi/ifaces/sw-iscsi-0
# BEGIN RECORD 6.2.0-873.2.el6
iface.iscsi_ifacename = sw-iscsi-0
iface.hwaddress = 6C:AE:8B:61:54:BC
iface.transport_name = tcp
iface.vlan_id = 0
iface.vlan_priority = 0
iface.iface_num = 0
iface.mtu = 0
iface.port = 0
# END RECORD
onapp-test# cat /var/lib/iscsi/ifaces/sw-iscsi-1
# BEGIN RECORD 6.2.0-873.2.el6
iface.iscsi_ifacename = sw-iscsi-1
iface.hwaddress = 6C:AE:8B:61:54:BD
iface.transport_name = tcp
iface.vlan_id = 0
iface.vlan_priority = 0
iface.iface_num = 0
iface.mtu = 0
iface.port = 0
# END RECORD
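As an alternative to editing these files by hand, the same binding can be applied with iscsiadm itself; a sketch assuming the interface names and MAC addresses shown above:
onapp-test# iscsiadm -m iface -I sw-iscsi-0 --op=update -n iface.hwaddress -v 6C:AE:8B:61:54:BC
onapp-test# iscsiadm -m iface -I sw-iscsi-1 --op=update -n iface.hwaddress -v 6C:AE:8B:61:54:BD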
6. Depending on your client, the TCP/IP kernel parameter rp_filter may need to be tuned in order to allow correct multipathing with iSCSI. For the purposes of this guide, we set it to 0. For each physical interface we add an entry to /etc/sysctl.conf; in our example, we are modifying this tunable for eth4 and eth5. If the kernel tuning is being applied, either set the parameters via the sysctl command (see the sketch below) or reboot the system prior to the next steps.
onapp-test# grep eth[0-9].rp_filter /etc/sysctl.conf
net.ipv4.conf.eth4.rp_filter = 0
net.ipv4.conf.eth5.rp_filter = 0
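To apply the change without a reboot, the same values can be set at runtime with sysctl and the file reloaded; a sketch assuming the interfaces are eth4 and eth5 as in the rest of this guide:
onapp-test# sysctl -w net.ipv4.conf.eth4.rp_filter=0
onapp-test# sysctl -w net.ipv4.conf.eth5.rp_filter=0
onapp-test# sysctl -p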
7. Next, we perform a discovery of the targets via the portals that we exposed previously. We can run the discovery against either of the two portals, and the result should be identical. We discover targets for each configured logical iSCSI interface. If you choose to have more than two logical interfaces, and therefore more than two paths to the SAN, perform the following step for each logical interface:
onapp-test# iscsiadm -m discovery -t sendtargets --portal=10.99.140.41 -I sw-iscsi-0 --discover
onapp-test# iscsiadm -m discovery -t sendtargets --portal=10.99.139.41 -I sw-iscsi-1 --discover
8. We should validate the nodes created as a result of the discovery. We expect to see two nodes for each portal on the SAN.
onapp-test# iscsiadm -m node
10.99.140.41:3260,2 iqn.2010-08.org.example:onappiscsi
10.99.140.41:3260,2 iqn.1986-03.com.nexenta:ecd714e43149
10.99.139.41:3260,2 iqn.2010-08.org.example:onappiscsi
10.99.139.41:3260,2 iqn.1986-03.com.nexenta:ecd714e43149
9. At this point, we have each logical iSCSI interface configured to log in to all known portals on the SAN and into all known targets. Thus, we have a choice to make: we can leave this configuration as is, or we can isolate each logical interface to a single portal on the SAN. To log in to all targets through all interfaces:
onapp-test# iscsiadm -m node -l
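If you decide to isolate interfaces rather than logging everything in, logins can also be issued per target, portal and interface; a sketch using the targets and portals discovered above:
onapp-test# iscsiadm -m node -T iqn.2010-08.org.example:onappiscsi -p 10.99.140.41:3260 -I sw-iscsi-0 --login
onapp-test# iscsiadm -m node -T iqn.1986-03.com.nexenta:ecd714e43149 -p 10.99.139.41:3260 -I sw-iscsi-1 --login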
10. Each node has a directory under /var/lib/iscsi/nodes with a name identical to the target name on the SAN, and a sub-directory for each portal. Here, we can see that the leaf objects of this tree structure are files named after our logical iSCSI interfaces. In fact, these are configuration files generated upon successful target discovery for each interface.
onapp-test# ls -l /var/lib/iscsi/nodes/iqn.2010-08.org.example:onappiscsi/10.99.139.120\,3260\,3/
total 8
-rw------- 1 root root 1878 Jun 28 2013 sw-iscsi-0
-rw------- 1 root root 1878 Jun 28 2013 sw-iscsi-1
onapp-test# ls -l /var/lib/iscsi/nodes/iqn.1986-03.com.nexenta:ecd714e43149/10.99.140.220\,3260\,3/
total 8
-rw------- 1 root root 1862 May 21 08:13 sw-iscsi-0
-rw------- 1 root root 1862 May 21 08:47 sw-iscsi-1
Deleting one of the interface configuration files under each node effectively restricts that interface from logging in to the target. At any time, you can choose which interfaces will log in to which portals. For the purposes of this document, we assume that it is acceptable for each logical iSCSI interface to log in to both portals.
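Rather than removing the files directly, the same restriction can be expressed through iscsiadm; a sketch that would stop sw-iscsi-0 from logging in to one target through the second portal (adjust the target and portal to your layout):
onapp-test# iscsiadm -m node -T iqn.2010-08.org.example:onappiscsi -p 10.99.139.41:3260 -I sw-iscsi-0 --op=delete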
11. Because we have one additional layer between the OS and iSCSI (the DM-MP layer), we want to tune the iSCSI parameter that controls how long it takes to hand off failed commands to the DM-MP layer. Typically, it takes 120 seconds for the iSCSI service to give up on a command when there are issues completing it. We want this period to be much shorter, allowing DM-MP to switch to another path instead of potentially retrying down the same path for two minutes. We modify /etc/iscsi/iscsid.conf, comment out the default entry with the value 120 (seconds), and add a new entry with the value 10.
onapp-test# grep node.session.timeo.replacement_timeout /etc/iscsi/iscsid.conf
#node.session.timeo.replacement_timeout = 120
node.session.timeo.replacement_timeout = 10
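Note that iscsid.conf is read when node records are created, so records that already exist keep the old value. You can either re-run the discovery or update the existing records in place and then log out and back in; a sketch:
onapp-test# iscsiadm -m node --op=update -n node.session.timeo.replacement_timeout -v 10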
12. There are a number of parameters that we have to configure in order for the device mapper to correctly multipath these LUNs. First, we configure the /etc/multipath.conf file with the basic settings necessary to properly manage multipath behavior and path failure. This is not an end-all-be-all configuration, but rather a very good starting point for most NexentaStor deployments. Please be aware that this configuration may fail on systems with older versions of multipath. If you are running Debian or an older RedHat-based distribution, the parameters that will most likely need to be adjusted are checker_timer, getuid_callout and path_selector. Be certain to review your distribution's multipath documentation or a commented sample multipath.conf file.
defaults {
    checker_timer 120
    getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
    no_path_retry 12
    path_checker directio
    path_grouping_policy group_by_serial
    prio const
    polling_interval 10
    queue_without_daemon yes
    rr_min_io 1000
    rr_weight uniform
    selector "round-robin 0"
    udev_dir /dev
    user_friendly_names yes
}
blacklist {
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z]"
    devnode "^sda"
    devnode "^sda[0-9]"
    device {
        vendor DELL
        product "PERC|Universal|Virtual"
    }
}
devices {
    device {
        ## This section is applicable to ALL NEXENTA/COMSTAR provisioned LUNs
        ## and sets sane defaults, necessary to multipath correctly.
        ## Specific parameters and deviations from the defaults should be
        ## configured in the multipaths section on a per-LUN basis.
        ##
        vendor NEXENTA
        product NEXENTASTOR
        path_checker directio
        path_grouping_policy group_by_node_name
        failback immediate
        getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
        rr_min_io 2000
        ##
        ## Adjust `node.session.timeo.replacement_timeout` in /etc/iscsi/iscsid.conf
        ## in order to rapidly fail commands down to the multipath layer
        ## and allow DM-MP to manage path selection after failure:
        ## set node.session.timeo.replacement_timeout = 10
        ##
        ## The features parameter works with the replacement_timeout adjustment.
        features "1 queue_if_no_path"
    }
}
13. Next, we describe each LUN that we are going to access via multiple paths. Any parameters already set in the devices and defaults sections can be skipped, assuming we accept those global settings. Note that we explicitly define the WWID for each LUN. We also supply an alias, which makes it much easier to identify specific LUNs; this is strictly optional. The vendor and product parameters will always be NEXENTA and NEXENTASTOR respectively, unless explicitly changed on the SAN, which is out of scope for this configuration.
multipaths {
    ## Define specifics about each LUN in this section, including any
    ## parameters that will be different from the defaults and device sections
    ##
    multipath {
        alias mpathc
        wwid 3600144f05415410000005077d06b0002
        vendor NEXENTA
        product NEXENTASTOR
        path_selector "service-time 0"
        failback immediate
    }
    multipath {
        alias mpathb
        wwid 3600144f05415410000005077d05b0001
        vendor NEXENTA
        product NEXENTASTOR
        path_selector "service-time 0"
        failback immediate
    }
}
14. After saving the file as /etc/multipath.conf, we flush and reload the device-mapper maps. At this point, we assume the multipathd service is running on the system. Running multipath -v2 gives enough verbose detail to make sure the maps are being created correctly:
onapp-test# multipath -F
onapp-test# multipath -v2
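If the maps still do not look right after editing /etc/multipath.conf, reloading the multipathd service can also help (assuming the stock CentOS init script):
onapp-test# service multipathd reload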
A typical multipath configuration for any single LUN will look very similar to the following:
[onapp-test ~]# multipath -ll
mpathc (3600144f05415410000005077d06b0002) dm-4 NEXENTA,NEXENTASTOR
size=4.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
|- 26:0:0:2 sdg 8:96 active ready running
`- 25:0:0:2 sdd 8:48 active ready running
15. The resulting multipath devices can be used to create the DataStore:
mpathc (3600144f05415410000005077d06b0002) dm-4 NEXENTA,NEXENTASTOR
size=4.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
|- 26:0:0:2 sdg 8:96 active ready running
`- 25:0:0:2 sdd 8:48 active ready running
mpathb (3600144f05415410000005077d05b0001) dm-3 NEXENTA,NEXENTASTOR
size=4.0T features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=1 status=active
|- 26:0:0:1 sdf 8:80 active ready running
`- 25:0:0:1 sdb 8:16 active ready running
Steps in OnApp (Configure the DataStore within OnApp and the HV)
Note: Create a DataStore Zone or use an existing DataStore Zone prior to following these instructions.
1. Go to your Control Panel and create two new DataStores, one for each resulting mpath device (mpathc, mpathb). This creates an entry in the database; we will use the resulting UUID to create the DataStore manually on the HV.
2. Once the DataStores have been created in the UI, log in to the HV. We will now create the DataStores on the HV using the UUIDs. iSCSI/FC data stores use LVM. Create the two physical volumes, one for each DataStore:
pvcreate --metadatasize 50M /dev/mapper/mpathc
pvcreate --metadatasize 50M /dev/mapper/mpathb
3. Create the volume groups using the UUIDs from the UI:
vgcreate onapp-usxck5bmj0vqrg /dev/mapper/mpathc
vgcreate onapp-m8fhwar2eenf04 /dev/mapper/mpathb
4. Check whether both data stores are available within the HVs:
onapp-test # pvscan
PV /dev/mapper/mpathc VG onapp-usxck5bmj0vqrg lvm2 [4.00 TiB / 650.90 GiB free]
PV /dev/mapper/mpathb VG onapp-m8fhwar2eenf04 lvm2 [4.00 TiB / 679.90 GiB free]
PV /dev/sda2 VG vg_onaappkvm1 lvm2 [277.97 GiB / 27.97 GiB free]
Total: 3 [8.66 TiB] / in use: 3 [8.66 TiB] / in no VG: 0 [0 ]
5. Now, in the OnApp UI, go to Settings > Hypervisor Zones and click Manage DataStores, then add the DataStores to the HV zone.
Now, when you create a VM, a logical volume will be created on the chosen DataStore. That will be your VM disk.
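Once a VM has been provisioned, you can confirm this from the HV by listing the logical volumes in the DataStore's volume group; for example, using one of the volume group names created above:
onapp-test# lvs onapp-usxck5bmj0vqrg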