Slow file copying on ESXi with HP SmartArray p410i


Hi,

So I've got an HP DL160 G6 with an HP Smart Array P410i (1GB RAM-based cache with super-capacitor backup), 3 x 15K RPM 73GB SAS drives in a RAID 5 configuration, and a single 500GB WD enterprise SATA drive connected too.

Now, I know that compared to the Dell PERC6 the HP Smart Array is a steaming pile of trash: despite having a much better spec, it's much slower, for reasons I've never been able to work out.

But this always gets me. I back up my VMs using the SSH server in ESXi, and I have downtime of 50 minutes to transfer a 22GB VM, which is absolutely bloody ridiculous (and it's still not finished copying). I'm just at a loss as to why. The 500GB drive isn't anything amazing, just a standard 7200 RPM drive, but to transfer a VM at less than 7MBps is just... I'm speechless. I can transfer files faster than this on a 32-bit SCSI card.

Does anyone have a similar setup or any tips for this? (It's ESXi 5.1 and both file systems are VMFS-5.) I've got 4 more VMs to back up after this, then some system updates to apply, and it's looking like it's going to take the whole day. I don't think it's a hardware speed problem. I'm thinking it's down to ###### poor drivers from HP for the P410i (hpacucli is a complete joke in comparison to Dell's PERC utilities) or something up with ESXi, but I don't really know how to go about testing either theory or speeding it up.


Is the VMkernel network on the same physical port as the interface on your primary vSwitch, or on a different one?

I've read that performance is awful unless the two are separated (although I have never tried it).


Ah, I don't mean I'm copying the data over SSH, I just mean I'm connected via SSH to do the file copying. It's going direct disk-to-disk on the same host.

At 11:58 I started copying a 10GB VM. 25 minutes later it's still not finished, so this copy is definitely running slower than 7MBps.


How exactly are you doing this copy? Are you going to the datastore and downloading the VM disk?

How is your VMkernel set up? Is it shared with another NIC? I noticed a huge increase in performance when I broke the VMkernel out to its own port group on its own NIC.

post-14624-0-74552400-1422101386.png

 

So here I started a download of a VM; I clicked go at 6:07:30. It's downloading now. This is off an HP N40L with cheap NICs added, and the VMkernel is using the built-in NIC, I believe. If I look at the network performance on the PC I'm downloading the file to, I'm getting pretty decent network utilization.

post-14624-0-94754700-1422101722.png

 

Ok done...  looks like 20min

 

post-14624-0-40134400-1422102622.png

 

See the time created, and then the last modified time. So let's call it 34GB / 20 min = 1.7GB a min = 28MBps, which clearly is not the full speed of my network. I normally see double or triple that from the NAS on the same ESXi host. But it is in line with how the VMkernel works, etc.

edit: So you're doing a disk-to-disk copy on your ESXi host, via a CLI command? Let me test that with this same 34GB file. I have an SSD datastore, and the 250GB disk the box came with as a datastore as well. BRB

OK, so I started at 6:38:30 and so far it's copied 3.8GB in 3 minutes.

At the 10 min mark it's a bit over 12GB:

 

/vmfs/volumes/54c39196-0f3ec6fc-3df2-001f29541714/test # ls -la
total 12049416
drwxr-xr-x    1 root     root           420 Jan 24 12:38 .
drwxr-xr-t    1 root     root          1400 Jan 24 12:37 ..
-rw-------    1 root     root     12343582720 Jan 24 12:48 w7x64-clean-flat.vmdk
/vmfs/volumes/54c39196-0f3ec6fc-3df2-001f29541714/test #

 

So I would have to say it seems a bit slower than the download copy. But that 250GB disk is a pretty old, crappy disk ;)

So at 20 min it's at roughly 24GB, which works out to about 20MBps. That's about 3x what you're seeing, and this is just the controller that comes with the N40L.
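
For anyone wanting to redo the sums in this thread, here is a rough sketch of the arithmetic. The helper name mbps is made up for illustration, and it assumes decimal units (1GB = 1000MB), which is how the figures above were rounded.

```shell
# mbps SIZE_MB SECONDS -> average throughput in MB/s, to one decimal place.
# Hypothetical helper; assumes decimal units (1GB = 1000MB).
mbps() {
    awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.1f\n", mb / s }'
}

mbps 34000 1200   # 34GB downloaded in 20 min
mbps 24000 1200   # 24GB copied with cp in 20 min
mbps 22000 3000   # 22GB in 50 min, the original complaint
```

It only gives an average over the whole transfer, so it says nothing about where the time went.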


Not an answer to your problem but you could use ghettoVCB which is a free backup script for ESXi and works quite well. Bit fiddly to set up but once done it's fine. 

 

I have it scripted so I just log into the host and run a script to do the backup then once done I can copy the backup files without having to take down any of the servers.


I've now shut down all VMs except one (my W7 remote management VM) and I've taken a screenshot of it... Something really is not right here.

aNS6w1g.png

EDIT: Changed around the graph output (bit hard on a small VNC screen) and it's apparently reading at 6MBps from the main drive and writing to the backup drive at 15MBps... I can't understand how it's writing twice the data it's reading!

 

 

I can't use ghettoVCB, as I've got snapshots disabled and have all changes written to the disk as they're performed.

Edited by n_K

Ok so it finished

 

/vmfs/volumes/54c39196-0f3ec6fc-3df2-001f29541714/test # stat w7x64-clean-flat.vmdk
  File: w7x64-clean-flat.vmdk
  Size: 34359738368     Blocks: 67108864   IO Block: 131072 regular file
Device: 831f0b1fc2871364h/9448282774282113892d  Inode: 4225796     Links: 1
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-01-24 12:38:34.000000000
Modify: 2015-01-24 13:06:49.000000000
Change: 2015-01-24 13:06:49.000000000
 

 

So that's 28 min for 34GB, roughly 20MBps, which yeah is blowing you away on crappier hardware.... Hmmmmmm??
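
The same figure can be worked out from the stat output directly. This is only a sketch: the function names hms_to_s and copy_rate are invented here, and it assumes (as the post does) that the Access time marks the start of the copy and the Modify time marks the end, with both on the same day.

```shell
# hms_to_s HH:MM:SS -> seconds since midnight (hypothetical helper)
hms_to_s() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# copy_rate BYTES START END -> average MB/s, with 1MB = 1,000,000 bytes.
# START and END are same-day HH:MM:SS values, e.g. the Access and Modify
# times printed by stat. Assumes Access = copy start, Modify = copy end.
copy_rate() {
    start=$(hms_to_s "$2")
    end=$(hms_to_s "$3")
    awk -v b="$1" -v t="$((end - start))" 'BEGIN { printf "%.1f\n", b / t / 1000000 }'
}

# Size, Access and Modify values taken from the stat output above
copy_rate 34359738368 12:38:34 13:06:49
```

With those inputs the elapsed time is 1695 seconds, which lands on the same "roughly 20MBps".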


You could try the dd command instead of cp. I'm doing a test now and it looks like 4.9GB in 3 minutes vs the 3.8GB with the cp command. Let me try upping the bs from 1M.

edit: Well, using dd seems to get me the speeds I saw with the download: about 28MBps vs the 20MBps I was seeing with cp.
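
The exact dd command line wasn't posted, so the following is a guess at its shape rather than what was actually run. The wrapper name copy_with_dd is hypothetical; if=, of= and bs= are standard dd operands, and the 1M default matches the block size mentioned above.

```shell
# copy_with_dd SRC DST [BS] -> copy SRC to DST in BS-sized blocks.
# Hypothetical wrapper; BS defaults to 1M, the starting value in the post.
copy_with_dd() {
    dd if="$1" of="$2" bs="${3:-1M}"
}

# Example call (commented out; both paths are placeholders, not the
# real datastore layout):
# copy_with_dd /vmfs/volumes/SRC_DATASTORE/vm/vm-flat.vmdk \
#              /vmfs/volumes/DST_DATASTORE/test/vm-flat.vmdk 1M
```

Trying a few block sizes on a small file first is cheaper than finding out on a 34GB disk.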

 

edit2: OK, it seems cp has been deprecated for a while on ESXi. You're supposed to use the vmkfstools command.

 

http://www.vmware.com/pdf/esx_3p_scvcons.pdf

For performance and data placement reasons, do not use scp or cp; instead, use vmkfstools, the Virtual Machine Importer tool from VMware, or the SDK APIs to manipulate your virtual disks. You should see very significant performance improvements if you use the recommended tools.

 

So doing a copy of that same vm using

/vmfs/volumes/535605bc-d0c25a0d-7cf0-001f29541714/w7 # vmkfstools -i /vmfs/volumes/datastore0/w7/w7x64-clean.vmdk /vmfs/volumes/datastore1/test/test.vmdk
Destination disk format: VMFS zeroedthick
Cloning disk '/vmfs/volumes/datastore0/w7/w7x64-clean.vmdk'...
Clone: 100% done.
 

was done in 4.25 min or 133MBps

 

/vmfs/volumes/54c39196-0f3ec6fc-3df2-001f29541714/test # stat test.vmdk
  File: test.vmdk
  Size: 514             Blocks: 0          IO Block: 131072 regular file
Device: 831f0b1fc2871364h/9448282774282113892d  Inode: 8420100     Links: 1
Access: (0600/-rw-------)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-01-24 13:38:46.000000000
Modify: 2015-01-24 13:42:57.000000000
Change: 2015-01-24 13:42:57.000000000
 

 

post-14624-0-99362500-1422107393.png

 


Argh, the dd is an annoying BusyBox version. I just started running it with both ibs and obs set to 16MB and will see what happens with an 8GB file!

Did not know that about cp on ESXi, thanks budman! I'll see if dd increases the speed, and if not I'll retry with that command. I'm assuming it doesn't bother copying the blank space and speeds up the transfer that way; the VM I'm copying now is 90% utilised, so it probably wouldn't save much time.

EDIT: OK, no, there is definitely something not right: 3Gbps link speeds between the SATA/SAS disks and the controller, and 0.2GB copied in 30 seconds.


Dude, see my edit. Use vmkfstools -i src dst

My test shows a SCREAMING difference. What have you got to lose? It sure cannot be any slower than your dd or cp commands ;)


Curious as to why you disabled snapshots.


Just tried it on a 23GB disk after the 8GB dd finished, budman:

/vmfs/volumes/508aa94d-fbcf15ba-0faf-68b599b49d30/Jan 24 2015/W7 # vmkfstools -i /vmfs/volumes/Main/Windows\ 7\ Professional-N\ x64/Windows\ 7\ Professional-N\ x64.vmdk /vmfs/volumes/Backup\ Disk/Jan\ 24\ 2015/W7/
Destination disk format: VMFS zeroedthick
Cloning disk '/vmfs/volumes/Main/Windows 7 Professional-N x64/Windows 7 Professional-N x64.vmdk'...
Failed to clone disk: The file already exists (39).
Ignore, I'm being dense and not putting in the filename!

 

Snapshots are disabled because they use space, which I don't have that much of.


Yeah, that is much better ;) Should be a helpful thread for other people, I think. I don't normally move files between datastores.

Now, I'm not sure what your original was. Was it thick or thin? Notice that the clone defaults to zeroedthick, so if the disk was thin before, your backup isn't. If you want to keep it thin you can add -d thin on the end, but that took about double the time to copy. Still, I would think even 10 minutes would be much better for you than what you were seeing.
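
Putting the pieces from this thread together, a wrapper along these lines might save some typing. Treat it as a sketch: clone_vmdk and the DRY_RUN switch are invented for illustration, while vmkfstools -i and -d thin are the options used above. It also guards against the "file already exists" slip from earlier by insisting the destination is a .vmdk file name rather than a directory.

```shell
# clone_vmdk SRC DST [FORMAT]
# Hypothetical wrapper around "vmkfstools -i SRC DST [-d FORMAT]".
# SRC and DST are .vmdk descriptor paths; DST must name a file, not a
# directory. FORMAT is optional (e.g. thin); without it the clone output
# in this thread reports zeroedthick.
# Set DRY_RUN=1 to print the command instead of running it.
clone_vmdk() {
    src=$1
    dst=$2
    fmt=${3:-}
    case "$dst" in
        *.vmdk) ;;
        *) echo "destination must be a .vmdk file name: $dst" >&2
           return 1 ;;
    esac
    set -- vmkfstools -i "$src" "$dst"
    if [ -n "$fmt" ]; then
        set -- "$@" -d "$fmt"
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

Running it once with DRY_RUN=1 shows exactly what would be executed, which is handy with datastore paths that contain spaces.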


Not so sure it's really a performance booster doing that any more. On an SSD datastore it makes no difference, for sure. So your storage is local; is it VAAI? Do you see hardware acceleration when you look at your datastores?

There are lots of variables at play when it comes to performance. As we see in this example, using a deprecated but common method that many people would reach for can have huge performance implications. It comes down to your requirements. I for sure have my storage overprovisioned: it's a small datastore and I play with lots of VMs, so I don't see any reason to suck up all the space with zeros ;)
