Hey,

 

I noticed a thread over on the pfSense forums while looking for any input on a problem I just noticed on my ESXi host. I did chime in on the forums there, but figured I would open a question here to see what others are experiencing as well as thoughts on the issue...

 

Doing ~80 Mbps of transfer on my pfSense VM has been netting me extremely high CPU usage on the host. CPU usage as registered inside the pfSense VM is very small (~20% or so), but VMware is reporting it using over 135% of the host CPU. Very strange...

 

The ESXi host is a Dell R620 with Intel NICs. I have read that some people were having this problem with low-quality non-Intel NICs, but that isn't the case here. The box is also on the VMware HCL and was installed using the official Dell ESXi 5.5 installer.

 

I haven't yet tried upgrading to 5.5 U1 (that is planned for the weekend), so I'm not sure if it is fixed in the update, though nothing in the release notes points to this area.

 

For anyone else running pfSense in a VM, are you having a similar issue with CPU usage during active load?

Try a different adapter type on the VM? Maybe one that has (better) driver support in ESXi.

 

I haven't yet tried a different adapter, but from what I read on the pfSense forums, switching to the virtualized adapter doesn't seem to change things. I will give this a shot though.

Are you using VMXNET3 in the pfSense VM? Did you install from ISO or use the appliance?

 

E1000, as set up by the appliance.

 

pfSense is version 2.1

I would have to wire something into one of my segments and put something on the WAN side to check that kind of throughput. I only have 28 Mbps down on my internet connection, so it is hard to really load up my pfSense VM. I could test across segments fairly easily, but if anything is causing extra CPU I would guess it is NAT rather than plain routing, so I would have to put something on my WAN side to get speeds high enough to see 80 Mbps.

 

Sounds like a fun weekend project ;)  I am running ESXi 5.5 and will get around to U1, maybe this weekend as well.

My feeling was something along the NAT lines as well, but then I would expect the high CPU usage to be more internal to the pfSense VM as that would be doing the actual NAT processing. I've attached the screenshot below of what I'm seeing from ESXi on this...

 

In my case, the additional load isn't going to affect the overall server, but it just seems very strange to me that this level of CPU use is occurring in the way it is.

 

8OVaBXLl.png

Question: why are you running the 64-bit version? You have only given it 1 GB of RAM, so there is no reason for the 64-bit version that I can see. I run 32-bit. And maybe it's that third-party software; I look at the performance graph in ESXi and don't see it. Let me grab that third-party tool.

 

edit: OK, I went a different route here. If pfSense shows 100% CPU, what does ESXi show? So I grabbed cpuburn for pfSense:

pkg_add -r http://ftp-archive.freebsd.org/pub/FreeBSD/ports/i386/packages-8.3-release/All/cpuburn-1.4.tbz
 

I then ran burnMMX since my host is AMD, etc.

 

post-14624-0-76596100-1395579857.png

 

Looks to be a decent match to me. Other than yours showing more than 100% usage, which is kind of impossible ;) and your GUI widget showing low, they seem to be matching up to me.

 

Can you repeat my test and see what you get?

 

Also, do you have any resource pools set up, or any reservations or limits on memory/CPU/disk, etc.? What are your shares set to: low, normal, high? What are the other VMs doing at this time? Reported CPU can be off depending on what other machines are using, but I would think that would show in the other direction, i.e. ESXi showing lower for that VM.

 

What is needed is a way to create a specific load that holds at that level for a while, so you can let the reporting tools stabilize. That is why I went with maxing out my VM. I don't know how often that little widget updates, and I'm not even sure where it gets its data. What do your graphs show for CPU usage at the time? Those are averages over a period. I also didn't notice how many cores you have given to the VM and how many are in the host.
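One rough way to hold a fixed, partial load (rather than only pegging the VM at 100% with cpuburn) is a duty-cycle loop: spin for a fraction of each short period and sleep for the rest. The sketch below is only an illustration under assumptions: it needs a Python interpreter in the guest, which a stock pfSense install may not have, and `hold_cpu_load` and its parameters are hypothetical names, not part of any tool mentioned in this thread.

```python
import time


def hold_cpu_load(target_fraction: float, duration_s: float, period_s: float = 0.1) -> float:
    """Keep one core busy for roughly target_fraction of the time.

    Spins for target_fraction of every period_s window and sleeps for the
    remainder, repeating until duration_s has elapsed. Returns the busy
    fraction actually achieved, measured with a monotonic clock.
    """
    if not 0.0 <= target_fraction <= 1.0:
        raise ValueError("target_fraction must be between 0.0 and 1.0")

    busy_total = 0.0
    start = time.perf_counter()
    end = start + duration_s

    while time.perf_counter() < end:
        period_start = time.perf_counter()
        busy_until = period_start + target_fraction * period_s

        # Busy phase: burn CPU until this period's busy budget is used up.
        while time.perf_counter() < busy_until:
            pass
        busy_total += time.perf_counter() - period_start

        # Idle phase: sleep for whatever is left of the period.
        remaining = period_s - (time.perf_counter() - period_start)
        if remaining > 0:
            time.sleep(remaining)

    elapsed = time.perf_counter() - start
    return busy_total / elapsed
```

Calling something like `hold_cpu_load(0.5, 300)` should hold about half of one vCPU busy for five minutes, which gives the widget and the ESXi graphs time to settle on an average before you compare them. The achieved fraction is approximate, since sleep timing inside a VM is not exact.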

 

Also, I notice your VM hardware is at version 7? Mine is 9; I backed it off from 10, the current version with 5.5, so I could still edit it with the vSphere Client. Curious why you haven't updated?

I upgraded to VMware ESXi 5.5 U1 and, as suspected, that didn't have any impact on the problem.

 

In regards to running 64-bit, I typically run the 64-bit version of my server software unless I have a solid reason not to. Most of my server software either doesn't offer a 32-bit version or, where one is offered, doesn't give it the same level of support. That said, the pressure to move to 64-bit probably doesn't exist for pfSense, given the hardware it is expected to run on. I can migrate to 32-bit to see if there is some issue with the 64-bit flavor...

 

The usage above 100% isn't impossible. It is showing you how much of the host CPU is being used along with ESXi overhead usage (or it is taking into account Intel Turbo Boost, but I'm inclined to believe it is usage + ESXi overhead).
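To put rough numbers on that reading, here is a small arithmetic sketch using the figures quoted earlier in the thread (about 20% inside the guest, over 135% reported by ESXi). It assumes both percentages are measured against a single core, which may not be exactly how each tool normalizes its counters, and the function names are made up for illustration.

```python
def virtualization_overhead(guest_pct: float, host_pct: float) -> float:
    """Host CPU charged to the VM beyond what the guest itself sees,
    in percentage points of one core."""
    return host_pct - guest_pct


def overhead_share(guest_pct: float, host_pct: float) -> float:
    """Fraction of the host-side figure that the guest cannot account for."""
    return (host_pct - guest_pct) / host_pct


# Figures from this thread: ~20% inside pfSense, ~135% as seen by ESXi.
guest, host = 20.0, 135.0
print(f"unaccounted: {virtualization_overhead(guest, host):.0f} points")
print(f"share of host figure: {overhead_share(guest, host):.0%}")
```

Under those assumptions the guest accounts for only about 20 of the 135 points, so roughly 85% of what the host reports would be work done on the VM's behalf outside the guest (device emulation, interrupt handling and so on).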

 

KRH0yZXl.png

 

Overall, the ESXi host isn't being taxed in any way. It has 16 cores available so the excessive usage by pfSense isn't causing any adverse performance elsewhere.

 

The pfSense VM has been given 1 vCPU. The host has dual CPUs with 8 cores each along with HyperThreading enabled.

 

I have found a way to reproduce the issue easily. Loading up a Usenet client and hitting the pfSense box with 20 connections at a time at full line speed seems to trigger it easily, but even "light" work (such as Netflix) can cause it to spike pretty well. The lower usage point you see between the first high peak and the later high peak is just Netflix streaming.

 

I will upgrade the vHardware. I haven't raised it from what was configured by the VMware appliance.

So what is the concern here: that the CPU widget is not showing correctly, or that ESXi is not? Did you run cpuburn to see what happens when you load it up? What do top, vmstat, or other tools on pfSense say your load is while you're doing this high-speed download?

The concern is really centered around what is causing this. Is it a configuration problem or driver problem dealing with the NIC, etc...

 

The CPU widget isn't showing incorrectly, it matches what ESXi itself is showing...

 

The usage is correctly reported when the pfSense VM is under load internally using CPUBurn.

Hmmm, OK, I've got my son's laptop moving a large amount of files over wireless from my wired network, so there is some network I/O. pfSense shows about 30 and the ESXi meter you gave shows about 30.

 

post-14624-0-72171800-1395619437.png

 

Looking at the pfSense ESXi graph, this looks right as well.

 

post-14624-0-40453500-1395619520.png

 

Now what doesn't look right is when I load up a big download to max out my download pipe (internet). I would have to plug something wired into my WLAN segment at gigabit to really load up pfSense routing traffic and moving data.

 

And it's way higher than it should be. This is what you're talking about, right?

 

post-14624-0-27830100-1395620230.png

 

So pfSense shows 40, while ESXi is showing 60+. But I think it comes down to this:

 

http://blog.logicmonitor.com/2013/02/25/a-tale-of-two-metrics-windows-cpu-or-vcenter-vm-cpu/

 

There are times when the Guest OS (windows perfmon, etc) will show lower CPU usage than VMware reports.  The guest doesn't know anything about the CPU used to virtualize the hardware resources it is requesting. ESXi does, and accurately attributes that load. Comparing the top two graphs, you can note that outside the period of load test, Windows reports a slightly lower CPU resource usage than does ESXi.

 

So I don't think there is anything really that off here. When I download from the internet I am crossing physical NICs. When I move data from LAN to WLAN, the physical NIC is a dual-port card. So that could have an effect on the actual ESXi host CPU usage.

If the CPU widget in pfSense and what vCenter is reporting are coherent, then the problem most likely lies within the VM, specifically the NIC driver/model.

The ESXi E1000 driver is notoriously bad; it's a generic, fit-all emulation driver meant for cases where VMXNET3 can't be loaded.

 

Apparently VMXNET3 can work on pfSense, so you should give that a try.

 

(60% CPU use for a 3.5MB/s load is bonkers btw)
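For a rough sense of scale, the two data points in this thread can be turned into host CPU cost per Mbps. This is only back-of-the-envelope arithmetic: it assumes 1 MB/s equals 8 Mbps (decimal units), takes the percentages at face value as share of one core, and uses helper names invented for the example.

```python
def mbytes_to_mbits(mbytes_per_s: float) -> float:
    """Convert MB/s to Mbps, assuming decimal units (1 byte = 8 bits)."""
    return mbytes_per_s * 8.0


def cpu_pct_per_mbit(cpu_pct: float, mbits_per_s: float) -> float:
    """Host CPU percentage points spent per Mbps of traffic."""
    if mbits_per_s <= 0:
        raise ValueError("throughput must be positive")
    return cpu_pct / mbits_per_s


# Host-side figures quoted in this thread (approximate readings off graphs).
samples = [
    ("3.5 MB/s internet download", 60.0, mbytes_to_mbits(3.5)),
    ("~80 Mbps transfer", 135.0, 80.0),
]

for label, cpu_pct, mbps in samples:
    cost = cpu_pct_per_mbit(cpu_pct, mbps)
    print(f"{label}: {mbps:.0f} Mbps at {cpu_pct:.0f}% -> {cost:.2f}% per Mbps")
```

Both cases land at around 2% of a core per Mbps on the host side, which is the part that looks out of proportion for simple NAT and routing.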

"(60% CPU use for a 3.5MB/s load is bonkers btw)"

 

Agree, it seems odd, which is why I posted the LAN to WLAN test without NAT, where it's 30 and shows correctly. I am currently running E1000 because I had some issues with VMXNET3 and my VPN client before, but I could give it a try again and see what it reports for CPU.

 

I don't have any issues with what it reports as usage. I don't have any problems moving files to and from my VMs, or with internet speed, since I only have a 25 Mbps plan, so it kind of doesn't matter to me if it reports 1% or 100% ;)  I can max out my internet download while also getting 70+ MBps from a VM to my machine and watching a movie on my media player off the same VM.

 

But I am curious why there is such a difference in what is reported, and whether a different driver brings it lower. I can certainly test that out.
