What you will learn in this tip: iSCSI storage area networks (SANs) are generally easier to set up and keep running than their Fibre Channel counterparts. However,
that doesn't mean they're completely trouble-free. If you're having iSCSI SAN performance problems, learn about the first places to look in this tip.
There are several things you can do to boost the performance of your iSCSI storage system, ranging from changing a few settings on your network to adding dedicated hardware to your servers. If you're not getting good iSCSI SAN performance, the first place to start checking is the log files. Check the logs for both the iSCSI network and the storage devices. The answer to your performance problem can usually be found in one or the other, not both, and the messages in the logs will tell you which side of the system is giving you trouble.
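As a first pass over the logs, a small script can count how many lines on each side mention trouble. The following is a minimal sketch, not a vendor tool: the keyword list and the function name `count_keyword_hits` are assumptions made for illustration, and real log wording varies by array vendor and operating system.

```python
from collections import Counter
from typing import Iterable, Sequence

# Assumed keywords for illustration only; adjust to match the wording
# your storage array and network devices actually use in their logs.
KEYWORDS = ("iscsi", "timeout", "reconnect", "link down", "failed")


def count_keyword_hits(lines: Iterable[str],
                       keywords: Sequence[str] = KEYWORDS) -> Counter:
    """Count how many log lines mention each keyword (case-insensitive)."""
    hits = Counter()
    for line in lines:
        lowered = line.lower()
        for keyword in keywords:
            if keyword in lowered:
                hits[keyword] += 1
    return hits
```

To use it, open each log file and pass the file object to `count_keyword_hits`, then compare the counts from the network logs with those from the storage logs. The side with noticeably more hits is the one to read in detail first.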
Problems in the storage array are usually easy to spot, because storage array vendors generally do a good job of monitoring their systems and reporting problems comprehensively in the logs. If the problem is in the array, it (and its solution) will probably pop out at you as soon as you look at the storage logs. Common problems with storage involve failed or failing drives and failed host bus adapters (HBAs).
On the iSCSI network side, things can be trickier. The error messages are much less specific and many of the errors can have multiple causes.
If you are having performance problems with an iSCSI SAN, the problems are usually in the network rather than the storage arrays. Specifically, the first thing to suspect is the configuration of the iSCSI network, but be sure to proceed systematically and eliminate other likely causes.
iSCSI SAN performance issues and network connections
Assuming your storage system logs don't show a problem, the first thing to check is whether you have connectivity to the SAN.
Check the physical connections on the network to make sure they are clean and tight. Check to see that your cables aren't kinked or bent too sharply. If you're suspicious of a connection, try swapping in a new cable. Cabling usually isn't the cause of iSCSI SAN problems (unless you've got a cable running across the floor where someone can trip on it), but cable problems can be difficult to track down, especially if the problem is intermittent.
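Ping the storage system to make sure that the array is reachable from the server. Keep in mind that ping confirms only IP connectivity; to confirm that the LUNs are visible, check what the iSCSI initiator reports after discovery. If you're using VMware ESX, you'll also need to use the vmkping command to test connectivity from the VMkernel interface that carries the iSCSI traffic.
If you have connectivity, but iSCSI still isn't working, check your firewall and make sure it isn't blocking TCP port 3260, the default port that iSCSI targets listen on. iSCSI needs that port available to work.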
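A quick way to tell a firewall problem from a general connectivity problem is to attempt a plain TCP connection to port 3260. The sketch below uses only Python's standard `socket` module; the function name `iscsi_port_reachable` is hypothetical, and the check assumes your target listens on the default port.

```python
import socket

ISCSI_PORT = 3260  # default iSCSI target port mentioned above


def iscsi_port_reachable(host: str, port: int = ISCSI_PORT,
                         timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False
```

For example, `iscsi_port_reachable("192.0.2.10")` (a placeholder address; substitute your array's portal address) returns False if a firewall drops or rejects the connection. If ping succeeds but this check fails, the firewall or the target's listener is the likely culprit.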
Enable jumbo frames at every hop
Most iSCSI SANs use jumbo frames. If yours does, make sure jumbo frames are enabled at every hop across the network: the initiator, the target and every switch or other intermediate device in between. Keep in mind that the default setting for most Ethernet equipment is standard frames, so jumbo frames have to be specifically enabled on each device in the iSCSI chain. If even one device in the path is not set to accept jumbo frames, frames will be dropped.
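One common way to verify the whole path is to send a ping that is as large as a jumbo frame allows and that is not permitted to fragment. The sketch below works out the payload size and builds the command line. It assumes an MTU of 9000 bytes (a typical jumbo frame setting, but check your own equipment), IPv4, and either the Linux iputils `ping` or VMware's `vmkping`; the function names are hypothetical.

```python
from typing import List

IP_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER_BYTES - ICMP_HEADER_BYTES


def jumbo_ping_command(target: str, mtu: int = 9000,
                       vmkernel: bool = False) -> List[str]:
    """Build a don't-fragment ping sized to fill the given MTU."""
    payload = str(max_ping_payload(mtu))
    if vmkernel:
        # vmkping: -d sets the don't-fragment bit, -s sets the payload size
        return ["vmkping", "-d", "-s", payload, target]
    # Linux iputils ping: -M do forbids fragmentation, -c 3 sends three probes
    return ["ping", "-M", "do", "-c", "3", "-s", payload, target]
```

With a 9000-byte MTU the payload works out to 8972 bytes. If that ping fails while a standard-size ping succeeds, at least one device in the path is probably still set to standard frames.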
Check your configuration settings
Make sure you have your entire system properly configured. Check your IP addresses, LUNs and subnet configurations.
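A mistyped address or netmask is easy to miss by eye. The following sketch uses Python's standard `ipaddress` module to check whether a target address falls inside the subnet configured on the initiator's iSCSI interface. It assumes a non-routed iSCSI network in which initiator and target are meant to share a subnet; the function name `same_subnet` and the example addresses are hypothetical.

```python
import ipaddress


def same_subnet(initiator_cidr: str, target_ip: str) -> bool:
    """True if target_ip lies inside the subnet of the initiator's interface.

    initiator_cidr is the interface address with its prefix or netmask,
    for example "10.10.10.21/24" or "10.10.10.21/255.255.255.0".
    """
    interface = ipaddress.ip_interface(initiator_cidr)
    return ipaddress.ip_address(target_ip) in interface.network
```

For instance, `same_subnet("10.10.10.21/24", "10.10.11.50")` returns False, which would point to a wrong address or netmask on one side if the two were supposed to be on the same iSCSI subnet.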
Has iSCSI failed?
If your iSCSI initiator has completely failed, it will commonly display a constant "reconnecting" message as the initiator attempts to establish a connection with the target. Check your configuration and try replacing the initiator.
If performance is poor, the connection may be overwhelmed by other traffic. Again, check the logs. To test, try putting the server and the storage device on their own subnet.
Also check to see that the server has two network interface cards (NICs). One card should be dedicated to iSCSI and the other used for other network traffic. Trying to do both jobs on a single NIC is likely to overload the system and reduce performance.
Instead of using a regular NIC in your server, you should consider a NIC that is TCP/IP Offload Engine (TOE) enabled. TOE moves TCP/IP processing from the server CPU onto the card, which can make a considerable difference in iSCSI SAN performance.
The key to successful iSCSI SAN troubleshooting is to systematically prune the problem tree, eliminating possible causes one at a time. Refer to the iSCSI SAN documentation for settings and device-related issues as needed.
This was first published in November 2010