A few days ago I posted a comparison between FreeBSD’s bhyve and VMware ESXi 5.5. I received a lot of feedback on the results of our test, so we decided to investigate further with a new round of tests, taking a more scientific approach.
As in the previous test, we used a standard “empty” FreeBSD 10 machine with the latest portsnap as our main “template”. The VM used “ahci-hd” as the storage backend and the tests were run over SSH, not on the local console. We always started from this template for every test and ran the same test in different scenarios. The hardware was the same as in the previous tests.
Note: I didn’t mention it in the previous post, but our first round of tests was run on a ZFS filesystem with both compression and deduplication enabled.
In our first run of tests we compiled bash. For this second run we went for a compile again, but we changed the package: we switched from bash to MySQL Server 5.6. We made this choice because MySQL takes longer to compile than bash and also because MySQL compiles almost without dependencies. We pre-fetched the packages and put them in /usr/ports/distfiles, so the timings you will see are strictly compile time. Another difference is that this time we went for a simple “make -DBATCH” instead of “make -DBATCH install clean” as in our previous test. So the compile command was:
cd /usr/ports/databases/mysql56-server
time make -DBATCH
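For readers who would rather script the measurement than read it off the terminal, here is a minimal Python sketch of the three figures that `time` reports: wall-clock (real), user and system time of a child process. The helper name `timed_run` is made up for this illustration, and this is not the tooling used for the tests, which relied on the command above.

```python
import resource
import subprocess
import sys
import time


def timed_run(argv, cwd=None):
    """Run a command and return its real, user and system time in seconds."""
    # CPU time already charged to earlier child processes of this script.
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, cwd=cwd, check=True)
    real = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "real": real,
        "user": after.ru_utime - before.ru_utime,
        "sys": after.ru_stime - before.ru_stime,
    }


if __name__ == "__main__":
    # Harmless stand-in command. On a test VM the call would look like:
    # timed_run(["make", "-DBATCH"], cwd="/usr/ports/databases/mysql56-server")
    print(timed_run([sys.executable, "-c", "sum(range(10**6))"]))
```

The `resource` module is only available on Unix-like systems, which is fine for a FreeBSD or Linux guest.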
We also decided to expand our testing a bit further and create a wider scenario. We also received a suggestion from Peter Grehan about using the “-P” command line switch when invoking bhyve, so our testing matrix became the following:
- Test VMware ESXi
- Test bhyve on ZFS with and without the -P switch
- Test bhyve on UFS with and without the “noatime” mount option
Once again: the ZFS filesystem was both compression-enabled and deduplication-enabled.
As in the previous test, we ran the following scenarios:
- 1 Virtual Machine with 1 CPU
- 1 Virtual Machine with 2 CPUs
- 20 Virtual Machines with 1 CPU each
- 20 Virtual Machines with 2 CPUs each
So, let’s jump to the stuff.
First test: a single machine
Single CPU VM
In this test we ran a single VM with a single virtual CPU on a totally idle host; this represents the best-case scenario. These are the results:
- ESXi took 13 minutes and 54 seconds (12 minutes and 41 seconds of user time, 1 minute and 2 seconds of system time)
- bhyve on ZFS took 16 minutes and 17 seconds (14 minutes and 30 seconds of user time, 1 minute and 27 seconds of system time)
- bhyve on ZFS with the -P switch took 16 minutes and 12 seconds (14 minutes and 33 seconds of user time, 1 minute and 30 seconds of system time)
- bhyve on UFS took 15 minutes and 56 seconds (14 minutes and 19 seconds of user time, 1 minute and 26 seconds of system time)
- bhyve on UFS mounted with noatime and with the -P switch took 15 minutes and 37 seconds (14 minutes and 6 seconds of user time, 1 minute and 22 seconds of system time)
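To put these runs on a single scale, the wall-clock times above can be converted to seconds and expressed as a slowdown relative to ESXi. The following is a small worked sketch; the helper names `to_seconds` and `percent_slower` are invented for the example, and the inputs are the rounded times from the list, so the output may differ slightly from figures computed on the raw data.

```python
def to_seconds(minutes, seconds):
    """Convert an 'M minutes and S seconds' figure to plain seconds."""
    return minutes * 60 + seconds


def percent_slower(candidate, baseline):
    """How much longer `candidate` took than `baseline`, in percent."""
    return (candidate - baseline) / baseline * 100


# Wall-clock times from the list above.
esxi = to_seconds(13, 54)
bhyve_runs = {
    "bhyve on ZFS": to_seconds(16, 17),
    "bhyve -P on ZFS": to_seconds(16, 12),
    "bhyve on UFS": to_seconds(15, 56),
    "bhyve -P on UFS noatime": to_seconds(15, 37),
}

slowdowns = {name: percent_slower(t, esxi) for name, t in bhyve_runs.items()}
for name, pct in slowdowns.items():
    print(f"{name}: {pct:.1f}% slower than ESXi")
```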
The fastest bhyve configuration was -P on UFS mounted with noatime; compared with ESXi, bhyve was 12% slower in this scenario. Here’s the magical graph for this scenario:
Dual CPU VM
In this test we ran a single VM configured with two virtual CPUs on a totally idle host. Last time this test was slower than on a single-CPU machine, but this time things went differently:
- ESXi took 10 minutes and 57 seconds (14 minutes and 55 seconds of user time, 1 minute and 39 seconds of system time)
- bhyve on ZFS took 13 minutes and 16 seconds (17 minutes and 35 seconds of user time, 2 minutes and 15 seconds of system time)
- bhyve with -P on ZFS took 13 minutes and 26 seconds (17 minutes and 45 seconds of user time, 2 minutes and 18 seconds of system time)
- bhyve on UFS took 13 minutes and 5 seconds (17 minutes and 16 seconds of user time, 2 minutes and 9 seconds of system time)
- bhyve with -P on UFS noatime took 13 minutes and 13 seconds (17 minutes and 22 seconds of user time, 2 minutes and 15 seconds of system time)
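Two quick ratios help when reading these numbers: how much faster the build got with the second virtual CPU, and how many virtual CPUs were busy on average, taken as (user + system) / real. The sketch below works them out for ESXi and for bhyve on UFS from the single-CPU and dual-CPU lists above. The function names are made up for the example, and reading the second ratio as "busy vCPUs" is an interpretation, not something measured directly.

```python
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds


# (real, user, sys) in seconds, from the single-CPU and dual-CPU lists.
single = {
    "ESXi": (to_seconds(13, 54), to_seconds(12, 41), to_seconds(1, 2)),
    "bhyve on UFS": (to_seconds(15, 56), to_seconds(14, 19), to_seconds(1, 26)),
}
dual = {
    "ESXi": (to_seconds(10, 57), to_seconds(14, 55), to_seconds(1, 39)),
    "bhyve on UFS": (to_seconds(13, 5), to_seconds(17, 16), to_seconds(2, 9)),
}


def speedup(name):
    """Wall-clock speedup when going from one to two virtual CPUs."""
    return single[name][0] / dual[name][0]


def busy_cpus(real, user, system):
    """Average number of CPUs kept busy: CPU time divided by wall-clock time."""
    return (user + system) / real


for name in single:
    print(f"{name}: {speedup(name):.2f}x faster with 2 vCPUs, "
          f"{busy_cpus(*dual[name]):.2f} vCPUs busy on average")
```

A value well above 1 for the second ratio means the build did run work on both virtual CPUs, although it stays clearly short of 2.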
The fastest bhyve configuration was plain UFS; compared with ESXi, bhyve was 19% slower in this scenario. I think that number would have been better if the filesystem had been mounted with noatime, but we didn’t try that combination without -P. The usual graph:
Second test: multiple machines
Single CPU VM
In this test we ran 20 virtual machines compiling at the same time, each configured with a single virtual CPU. The compilation was started at the very same time on each machine using cluster-ssh. The following times are averages over the 20 VMs.
- ESXi took 42 minutes and 50 seconds (28 minutes and 34 seconds of user time, 2 minutes and 6 seconds of system time)
- bhyve -P on ZFS took 37 minutes and 7 seconds (33 minutes and 2 seconds of user time, 2 minutes and 59 seconds of system time)
- bhyve -P on UFS noatime took 43 minutes and 45 seconds (30 minutes and 45 seconds of user time, 2 minutes and 44 seconds of system time)
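With 20 VMs competing for the host, it is also worth comparing user + system time against wall-clock time for each run. The gap is time the guest did not book as CPU time; whether that is disk wait, a virtual CPU waiting for a physical one, or an artefact of guest time accounting cannot be told from these numbers alone. A minimal sketch of the arithmetic, with `cpu_share` as a name invented for the example:

```python
def to_seconds(minutes, seconds):
    return minutes * 60 + seconds


def cpu_share(real, user, system):
    """Fraction of the wall-clock time the guest accounted as CPU time."""
    return (user + system) / real


# (real, user, sys) averages from the list above.
runs = {
    "ESXi": (to_seconds(42, 50), to_seconds(28, 34), to_seconds(2, 6)),
    "bhyve -P on ZFS": (to_seconds(37, 7), to_seconds(33, 2), to_seconds(2, 59)),
    "bhyve -P on UFS noatime": (to_seconds(43, 45), to_seconds(30, 45), to_seconds(2, 44)),
}

shares = {name: cpu_share(*times) for name, times in runs.items()}
for name, share in shares.items():
    print(f"{name}: {share:.0%} on CPU, {1 - share:.0%} not accounted as CPU time")
```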
The fastest bhyve configuration was -P on ZFS; compared with ESXi, bhyve was 14% faster in this scenario. Even the slower bhyve -P on UFS noatime was just 2% slower than ESXi. Here’s the graph:
Dual CPU VM
In this test we ran 20 parallel VMs, all configured with two virtual CPUs. The following are the average times over all the machines. Note: we “lost” one machine (ssh disconnected) in the “bhyve on UFS” scenario, so that average is over 19 machines, not 20. The compilation was almost over when we lost the machine, so the timings are still reasonably correct.
Note: this is the scenario we could not complete in our previous test because it ran “forever”. The -P flag seems to have solved the issue.
- ESXi took 44 minutes and 16 seconds (49 minutes and 49 seconds of user time, 5 minutes and 26 seconds of system time) – note: this is slower than a 1CPU VM
- bhyve -P on ZFS took 1 hour, 20 minutes and 58 seconds (54 minutes and 22 seconds of user time, 53 minutes and 57 seconds of system time (!)) – note: this is much slower than a 1CPU VM
- bhyve -P on UFS noatime took 1 hour, 6 minutes and 18 seconds (41 minutes and 7 seconds of user time, 36 minutes and 3 seconds of system time) – note: this is slower than a 1CPU VM
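The striking figure in this list is the system time of the bhyve guests. Expressing system time as a share of total CPU time (user + system) makes the difference easy to see. The sketch below only redoes the arithmetic on the averages above; `system_share` is a name chosen for the example.

```python
def to_seconds(hours, minutes, seconds):
    return hours * 3600 + minutes * 60 + seconds


def system_share(user, system):
    """Share of the accounted CPU time that was system (kernel) time."""
    return system / (user + system)


# (real, user, sys) averages from the list above.
runs = {
    "ESXi": (to_seconds(0, 44, 16), to_seconds(0, 49, 49), to_seconds(0, 5, 26)),
    "bhyve -P on ZFS": (to_seconds(1, 20, 58), to_seconds(0, 54, 22), to_seconds(0, 53, 57)),
    "bhyve -P on UFS noatime": (to_seconds(1, 6, 18), to_seconds(0, 41, 7), to_seconds(0, 36, 3)),
}

sys_shares = {name: system_share(user, system)
              for name, (real, user, system) in runs.items()}
for name, share in sys_shares.items():
    print(f"{name}: {share:.0%} of CPU time was system time")
```

Under ESXi roughly a tenth of the CPU time is system time, while in both bhyve runs it is close to half.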
The fastest bhyve configuration was -P on UFS noatime; compared with ESXi, bhyve was 49% slower in this scenario. Here’s the last graph:
I don’t think there is a clear pattern in our results, although you can easily spot the reason for every result we got.
ZFS or UFS?
First of all, the hard disks in our server are pretty slow (7200 RPM), so when the host has enough spare CPU power for ZFS compression and deduplication, the reduced amount of data written to disk becomes an advantage. Then again, UFS is clearly faster than ZFS, especially when a single machine is working. The big boost in the third scenario (20 VMs, 1 CPU) is caused, I think, by the combination of having 20 tasks reading and writing the same data (so deduplication reduces disk activity) while still having some spare CPU cycles to compress the data. On the contrary, ZFS behaved very badly under extreme load conditions like the 20 VMs, 2 CPUs scenario. I don’t think there is a clear magical solution here.
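To see the ZFS versus UFS picture across all four scenarios at once, the wall-clock times of the two -P runs (ZFS, and UFS mounted with noatime) can be compared directly. This is a small worked sketch on the rounded times reported earlier; the helper `zfs_vs_ufs` is a name invented for the example.

```python
def to_seconds(hours, minutes, seconds):
    return hours * 3600 + minutes * 60 + seconds


def zfs_vs_ufs(zfs, ufs):
    """Positive: ZFS took longer than UFS. Negative: ZFS was quicker."""
    return (zfs - ufs) / ufs * 100


# (bhyve -P on ZFS, bhyve -P on UFS noatime) wall-clock times.
scenarios = {
    "1 VM, 1 CPU": (to_seconds(0, 16, 12), to_seconds(0, 15, 37)),
    "1 VM, 2 CPUs": (to_seconds(0, 13, 26), to_seconds(0, 13, 13)),
    "20 VMs, 1 CPU": (to_seconds(0, 37, 7), to_seconds(0, 43, 45)),
    "20 VMs, 2 CPUs": (to_seconds(1, 20, 58), to_seconds(1, 6, 18)),
}

deltas = {name: zfs_vs_ufs(zfs, ufs) for name, (zfs, ufs) in scenarios.items()}
for name, delta in deltas.items():
    print(f"{name}: ZFS vs UFS {delta:+.1f}%")
```

Only the 20 VMs, 1 CPU scenario comes out negative, which is the case where ZFS was the quicker of the two.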
Bash or MySQL?
As you can see, the results from compiling MySQL are pretty different from the ones we got compiling Bash. I think it has something to do with the compiler parameters for the two packages. MySQL’s compilation probably takes advantage of the two virtual CPUs, while we saw that when compiling Bash the second CPU wasn’t an improvement at all!
ESXi or bhyve?
As we said in part 1, bhyve is a totally new product, so ESXi is clearly in a dominant position right now. Then again, it’s nice to see that the gap is definitely not big!
So our final choice would be to adopt bhyve and to put the VMs on ZFS. Although it is slower than UFS, my perception is that the advantages of deduplication and compression outweigh the performance hit. After all, a server usually spends a lot of time idle, but its disks are still there, so saving some space is a plus in my opinion. This is especially true because we’ll be using an external iSCSI SAN as the final storage for the VMs, and having data compressed before it reaches the iSCSI network card, thus reducing the traffic on the iSCSI network, is a must.
That’s it, it was fun for us, I hope this was useful for you.
The Linux thing 🙂
Someone asked us to do a comparison between ESXi and bhyve using a Linux machine. We did that as well, although I have to admit we didn’t put too much work into it. Under Linux we compiled perl, which seemed a reasonably big thing to use as a test.
- ESXi took 2 minutes and 7 seconds to compile perl
- bhyve on ZFS took 2 minutes and 23 seconds to compile perl
We weren’t very strict about tracking this data, but I think the point we can make is that even when virtualizing Linux, bhyve and ESXi can compete.
For those who are interested in numbers, here’s the Excel file with all the numbers.