Microsoft's Xbox One engineers reveal more about console's CPU

Earlier this week, Eurogamer posted an excerpt from a long interview with two of Microsoft's Xbox One hardware engineers, in which they revealed that game developers will eventually be able to access the 10 percent of the console's GPU that is currently reserved for other features. Now, the website has posted the full interview with the engineers, Andrew Goossen and Nick Baker, in which the two men go deep into hardware details for the Xbox One.

The highly technical interview has lots of information for hardware junkies, including more on the Xbox One's custom APU, which Microsoft co-designed with AMD. The CPU has eight cores based on AMD's Jaguar design. When asked why Microsoft picked this configuration instead of four Piledriver-based cores, Baker stated:

The extra power and area associated with getting that additional IPC boost going from Jaguar to Piledriver... It's not the right decision to make for a console. Being able to hit the sweet spot of power/performance per area and make it a more parallel problem. That's what it's all about. How we're partitioning cores between the title and the operating system works out as well in that respect.

The Xbox One has 15 processors inside the system on a chip, according to Baker, "eight inside the audio block, four move engines, one video encode, one video decode and one video compositor/resizer." The audio block was designed completely by Microsoft and made to handle 512 simultaneous voices for audio in games. It can also handle the speech pre-processing for the Kinect add-on.
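As a quick sanity check on Baker's breakdown, the quoted counts do add up to 15. Here is a minimal sketch in Python; the dictionary keys and the function name are illustrative labels chosen for this example, not identifiers from Microsoft or the interview:

```python
# Processor counts per block, as quoted by Nick Baker in the interview.
# The key names are descriptive labels only (an assumption for this sketch).
soc_processors = {
    "audio block": 8,
    "move engines": 4,
    "video encode": 1,
    "video decode": 1,
    "video compositor/resizer": 1,
}


def total_processors(blocks):
    """Sum the per-block processor counts."""
    return sum(blocks.values())


total = total_processors(soc_processors)
print(total)  # 8 + 4 + 1 + 1 + 1 = 15
```

Note that this tally covers only the special-purpose processors Baker lists; the eight Jaguar CPU cores are counted separately.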

Source: Eurogamer | Image via Microsoft

38 Comments

512 audio channels..

To think Microsoft killed hardware audio on PC with Vista+ and brought the limit down natively to 32.

I'm thankful I've owned a creative card with ALchemy support, but that's still only 128 channels.. and if ALchemy works properly in the first place.

512 natively would be amazing. Guess we have to rely on FMOD or new audio engines.. OpenAL is dead because Creative is awful; so sad.

Shaun said,
I CAN'T WAIT!!!!!...for the NSA to be in my living room 100% of the time...

Room? I think it's there in your pocket and even sleeps next to you at times *Mobile phone*

StandingInAlley said,

Room? I think it's there in your pocket and even sleeps next to you at times *Mobile phone*

I can't understand this argument that people have about the Kinect. If you have a smartphone and are scared of the NSA -- game over. The Kinect 2.0 represents so little data compared to a smartphone.

Just off the top of my head:
Read every message I send: email, text, web browser, etc.
Listen to my calls
Read my address book.
Know when I get up, go to bed
Where I work
When I leave work
Where I shop
Listen in on my surroundings
etc...

A Kinect 2.0 (which is optional to use) might see me 3 hours a day.

My smartphone is with me 24/7. If the NSA wants to use the Kinect 2 -- it is like getting sprinkles on ice cream. They have all the good stuff with your cell phone, the Kinect would just be adding a little bit to it.

Shaun said,
I CAN'T WAIT!!!!!...for the NSA to be in my living room 100% of the time...

You should be banned, seriously. Huge troll

adam7288 said,
512? I think that's more than enough voices.

It's 512 channels, not voices, so 512 different concurrent noises at once. 512 is pretty standard.

Nick Baker said, and I quote: "The goal was to run 512 simultaneous voices for game audio as well as being able to do speech pre-processing for Kinect."

So yeah, not standard at all.

n_K said,

It's 512 channels, not voices, so 512 different concurrent noises at once. 512 is pretty standard.

The audio chip on the X1 out-powers any PC soundcard you can buy to date. 8 core DSP processors which can handle all audio effects with 512 voices. A HUGE benefit gained to the system by offloading this from the CPU/GPU.

JonnyLH said,

The audio chip on the X1 out-powers any PC soundcard you can buy to date. 8 core DSP processors which can handle all audio effects with 512 voices. A HUGE benefit gained to the system by offloading this from the CPU/GPU.

I hope this isn't flame bait, but I'm honestly curious how this compares to the PS4. As I understand it, both the Xbone and the PS4 offload the audio from the main CPU. The big difference is that on the Xbone the APU is built into the SoC while on the PS4 it is handled by a separate chip. The PS4's separate chip serves as an Audio DSP with support for "many" audio streams (not sure how that compares to the XBone's 512) as well as audio/video compression, background upload/download, and a zlib decoder. I seriously doubt the differences would be enough to sway anyone one way or the other on their console choice, but I'd just find it interesting to see a detailed comparison of the two approaches.

JonnyLH said,

The audio chip on the X1 out-powers any PC soundcard you can buy to date. 8 core DSP processors which can handle all audio effects with 512 voices. A HUGE benefit gained to the system by offloading this from the CPU/GPU.

And an old module tracker (modplug) supports that on PCs and has done for years.
As said, NOTHING NEW.

Asmodai said,

I hope this isn't flame bait, but I'm honestly curious how this compares to the PS4. As I understand it, both the Xbone and the PS4 offload the audio from the main CPU. The big difference is that on the Xbone the APU is built into the SoC while on the PS4 it is handled by a separate chip. The PS4's separate chip serves as an Audio DSP with support for "many" audio streams (not sure how that compares to the XBone's 512) as well as audio/video compression, background upload/download, and a zlib decoder. I seriously doubt the differences would be enough to sway anyone one way or the other on their console choice, but I'd just find it interesting to see a detailed comparison of the two approaches.

The common belief, especially around GAF and N4G, is that the PS4 has a similar chip, which is quite incorrect. The PS4's chip is just an encoder/decoder which can take all the voices processed via the GPU (hence Sony's extra CUs and heavy support for GPGPU) and encode all the different formats into a common stream which is output to your system. It doesn't do any processing of the audio, so to speak. Hope that helps.

n_K said,

And an old module tracker (modplug) supports that on PCs and has done for years.
As said, NOTHING NEW.

A software interface can make up for 8 CPU cores which handle DSP processing? Come on..

JonnyLH said,

The common belief, especially around GAF and N4G, is that the PS4 has a similar chip, which is quite incorrect. The PS4's chip is just an encoder/decoder which can take all the voices processed via the GPU (hence Sony's extra CUs and heavy support for GPGPU) and encode all the different formats into a common stream which is output to your system. It doesn't do any processing of the audio, so to speak. Hope that helps.

Do you have a source article that does a direct comparison? I've read a lot of things too and I don't know what to believe and what not to. Like I said I'd be VERY interested in a reputable tech site doing a full comparison instead of reading a little about one in this article and a little about the other in another article and then trying to piece together a comparison on my own.

It is my understanding that encoding/decoding is by far the most processor-intensive part of audio, but I too have heard that it is most (if not all) of the dedicated hardware capability of the PS4 APU. I've also heard the PS4 is 200+ streams, so about half the 512 of the XBone. Complex audio processing on the PS4 can be done on either the CPU or the GPU as I understand it, and will likely be CPU-based on launch titles and move to the GPU during the console's life as developers become more familiar with the ins and outs of the architecture.

I've also heard much of the Xbone's dedicated audio hardware is reserved for Kinect 2 and not directly accessible by game developers (of course, if they are making a Kinect-enabled game they will be using the hardware indirectly through the Kinect API).

Anyway like I said I'm just curious from a tech geek point of view I already have my pre-order in and audio specs aren't going to change my console choice one way or the other. Would be a fascinating read though.

JonnyLH said,

A software interface can make up for 8 CPU cores which handle DSP processing? Come on..

You do know what a sound card is, right? Yes, correct, it's a DSP (or many)! and a DAC. We've finally come to the conclusion this is nothing special or new!

n_K said,

You do know what a sound card is, right? Yes, correct, it's a DSP (or many)! and a DAC. We've finally come to the conclusion this is nothing special or new!

That it can do 512 and process all the audio in hardware is special and new. From everything that's been said, the PS4 can decode a couple hundred but has no special 3D audio hardware, so it will have to make up for that using GPGPU, cutting into its supposed advantage on the GPU.

We shall see how it all plays out, but I just laugh at people who think the GPU advantage is the only thing that matters.

Randomevent said,

That it can do 512 and process all the audio in hardware is special and new. From everything that's been said, the PS4 can decode a couple hundred but has no special 3D audio hardware, so it will have to make up for that using GPGPU, cutting into its supposed advantage on the GPU.

We shall see how it all plays out, but I just laugh at people who think the GPU advantage is the only thing that matters.


Aha. There's no doubt that the PS4 has the stronger GPU by a long shot. Just the different approach from MS with the larger focus on offloading and the extra benefit of keeping some general purpose RAM to not bottleneck the CPU because of the shared bus. Jaguars are known to be more prone to latency, which makes it worse.

Even though everyone is shouting from the rooftops about the "similar" architecture, they're miles apart, and the differences in the engines between the platforms will be as immense as in the previous generation.

JonnyLH said,

Even though everyone is shouting from the rooftops about the "similar" architecture, they're miles apart, and the differences in the engines between the platforms will be as immense as in the previous generation.

I disagree to some degree. They are very similar in end result, but the X1's more complicated design may end up a short-term weakness and a long-term strength.

I'm not really sure at the moment though.

Asmodai said,

Do you have a source article that does a direct comparison? I've read a lot of things too and I don't know what to believe and what not to. Like I said I'd be VERY interested in a reputable tech site doing a full comparison instead of reading a little about one in this article and a little about the other in another article and then trying to piece together a comparison on my own.

It is my understanding that encoding/decoding is by far the most processor-intensive part of audio, but I too have heard that it is most (if not all) of the dedicated hardware capability of the PS4 APU. I've also heard the PS4 is 200+ streams, so about half the 512 of the XBone. Complex audio processing on the PS4 can be done on either the CPU or the GPU as I understand it, and will likely be CPU-based on launch titles and move to the GPU during the console's life as developers become more familiar with the ins and outs of the architecture.

I've also heard much of the Xbone's dedicated audio hardware is reserved for Kinect 2 and not directly accessible by game developers (of course, if they are making a Kinect-enabled game they will be using the hardware indirectly through the Kinect API).

Anyway like I said I'm just curious from a tech geek point of view I already have my pre-order in and audio specs aren't going to change my console choice one way or the other. Would be a fascinating read though.


There isn't, unfortunately, but there is a detailed explanation of the audio chip in the X1 from the Hot Chips information and the DF interview. The only things we've got from Sony are very fluffy quotes from Mark Cerny, which confirm the suspicions of a more focused GPGPU approach with just offloading audio decoding/encoding.

Regarding your 2nd paragraph, the audio processing will be GPU based. There's just too much to think about regarding Jaguar cores' heavy audio processing and the architecture of sharing the bus with the CPU. For example, you could easily take up numerous cores to circumvent the return latency. You don't have the same worries in the GPU because of a GPU's stream nature and higher memory bus; it isn't affected by memory latency. Although, this is all speculation due to Sony's nature of being tight-lipped around specs and architecture.

There's some reserved function on the X1's audio chip for Kinect but the 4 DSP cores can be fully utilized for heavy audio processing and it is a huge gain to remove this from GPU/CPU, even though 10% of the GPU is reserved for OS GPGPU.

Randomevent said,

I disagree to some degree. They are very similar in end result, but the X1's more complicated design may end up a short-term weakness and a long-term strength.

I'm not really sure at the moment though.


Well yes, it's not going to be as different as the previous generation but rather still big differences between both.

Due to the nature of how low-level these consoles actually are, the architectures of the software will have to cater for a lot of GPGPU on the PS4, the X1 GPU purely worrying about rendering (with the exception of the 10% OS reserve) and the big shift towards offloading (Move engines, SHAPE). Without the common interface, unfortunately these developers will have to work hard to suit both of them to their optimum. Although, high-level APIs do exist on both and will probably be used more for release games. For example, the "mono" driver was only released post-E3 and was purely due to developer request of wanting to go lower. With that, there will be a huge development cost to re-write a game engine to support that driver's new functionality. Probably something Crytek hasn't done for Ryse. Think the Mantle update for BF4.

JonnyLH said,

There isn't, unfortunately, but there is a detailed explanation of the audio chip in the X1 from the Hot Chips information and the DF interview. The only things we've got from Sony are very fluffy quotes from Mark Cerny, which confirm the suspicions of a more focused GPGPU approach with just offloading audio decoding/encoding.

It is my understanding that audio decoding/encoding is by far the most processor intensive part of audio so saying they "just" offload that seems to be a bit misleading. It seems the XBone does handle more streams in hardware though and they may even be of higher quality since Sony has been very tight lipped as you said.

JonnyLH said,

Regarding your 2nd paragraph, the audio processing will be GPU based. There's just too much to think about regarding Jaguar cores' heavy audio processing and the architecture of sharing the bus with the CPU. For example, you could easily take up numerous cores to circumvent the return latency. You don't have the same worries in the GPU because of a GPU's stream nature and higher memory bus; it isn't affected by memory latency. Although, this is all speculation due to Sony's nature of being tight-lipped around specs and architecture.

The Sony produced slide about the ACP clearly states "Presumably CPU, but GPGPU is also a possibility" so it doesn't sound so firmly GPU based to me. In any event that's why I'd like to see a reputable 3rd party do a direct comparison.

JonnyLH said,

There's some reserved function on the X1's audio chip for Kinect but the 4 DSP cores can be fully utilized for heavy audio processing and it is a huge gain to remove this from GPU/CPU, even though 10% of the GPU is reserved for OS GPGPU.

Keep in mind also that this isn't a straight apples to apples comparison. It does look to me like the Xbone is better able to offload audio from the CPU/GPU; however, it isn't as simple as that. As I understand it the ACP on the PS4 is part of the "companion core" secondary chip on the PS4 but it's not ALL this secondary chip does. It also handles video encoding/decoding for the PS4 which is handled by the GPU on the XBone. So it's entirely possible that everything MS gains by offloading sound processing from the CPU/GPU could be lost by not offloading the video encoding/decoding, especially since both systems are apparently always recording the last few minutes of play for you to be able to share on the fly. Additionally the PS4 "companion core" does zlib decoding of compressed data and background uploads/downloads so the main SoC doesn't have to. I honestly don't know if these offset or what; again, that's why I'd be VERY interested in seeing a more comprehensive direct comparison by a reputable tech site.

Asmodai said,

Keep in mind also that this isn't a straight apples to apples comparison. It does look to me like the Xbone is better able to offload audio from the CPU/GPU; however, it isn't as simple as that. As I understand it the ACP on the PS4 is part of the "companion core" secondary chip on the PS4 but it's not ALL this secondary chip does. It also handles video encoding/decoding for the PS4 which is handled by the GPU on the XBone. So it's entirely possible that everything MS gains by offloading sound processing from the CPU/GPU could be lost by not offloading the video encoding/decoding, especially since both systems are apparently always recording the last few minutes of play for you to be able to share on the fly. Additionally the PS4 "companion core" does zlib decoding of compressed data and background uploads/downloads so the main SoC doesn't have to. I honestly don't know if these offset or what; again, that's why I'd be VERY interested in seeing a more comprehensive direct comparison by a reputable tech site.

The X1 also has a coprocessor for video encoding and decoding so that's not really an advantage for either console. I'm sure it has compression assists as well...especially considering every console ever made has had that, afaik.

Randomevent said,

The X1 also has a coprocessor for video encoding and decoding so that's not really an advantage for either console. I'm sure it has compression assists as well...especially considering every console ever made has had that, afaik.

I guess it depends on what you call a "coprocessor". I didn't mean to imply the Xbone doesn't have hardware to do video encoding and decoding. On the Xbone just about everything is inside a single SoC chip. Within that chip what I was calling the CPU, the GPU, and the ACP is based off this diagram:

http://images.eurogamer.net/20...9/33.png/EG11/resize/600x-1

where everything green I was calling GPU, everything blue is CPU, everything purple is ACP, and gray is Memory. As you can see the Video Encode/Decode is clearly in the Green GPU section.

On the PS4 all this is NOT within the same SoC. The PS4 has a second chip but the breakdown isn't as simple as the purple section from that XBone diagram being pulled out and being put on a separate chip... yet that's exactly how most comparisons thus far seem to be going. Instead the PS4's second chip does some functions (audio encoding/decoding) from the purple, some from the green (video encoding/decoding), etc. which offloads all of that from the primary SoC.

Asmodai said,
On the PS4 all this is NOT within the same SoC. The PS4 has a second chip but the breakdown isn't as simple as the purple section from that XBone diagram being pulled out and being put on a separate chip... yet that's exactly how most comparisons thus far seem to be going. Instead the PS4's second chip does some functions (audio encoding/decoding) from the purple, some from the green (video encoding/decoding), etc. which offloads all of that from the primary SoC.

I'm not sure what your point is. They both have the functionality; which is more effective at it isn't up to which is on a separate chip.

Randomevent said,

I'm not sure what your point is. They both have the functionality; which is more effective at it isn't up to which is on a separate chip.

The point was just that it's not an apples to apples comparison as the breakdown of the hardware is different. I don't KNOW how big of a difference that makes; that's what I'm saying I'd like to see a tech site cover. You sound like you're asking me to explain how it makes a big difference when I'm saying I don't know if it does or not. It COULD; for example, the fact that all the video encode/decode is in the "GPU" section of the Xbone SoC (the stuff in green in that image) could mean it competes for resources with the graphics core, whereas having it on a separate chip (especially one that functions even when the main SoC is completely powered down, as with the PS4) could mean it doesn't compete for resources with the graphics core, thus freeing up the graphics core to do more (such as audio). Again though, I'm not saying that's the case; I don't know. I'm just saying I'd like to read a technical comparison of the two architectures by a reputable tech site.

psionicinversion said,
you could imagine it would come to fisticuffs over which architecture is better as neither would concede failings in their own stuff hah

They probably are more interested in discussing why certain avenues were taken and understand the mindset behind the routes each other took. These guys are probably well past fan-boy fighting over consoles.

uxo22 said,

They probably are more interested in discussing why certain avenues were taken and understand the mindset behind the routes each other took. These guys are probably well past fan-boy fighting over consoles.

Oh yeah. They might even go over, "jeez, if budget was no object, or you had double the budget, what would you have gone for?" type discussions. I mean, wow... when you're in a niche industry, you don't have many people you can actually bounce ideas off of, you know?

I'd say Netflix was a given... they were there at the launch of Windows 8, so I'm betting they will be for the X1 as well.

It is getting so close to release, I am excited. I wonder which 3rd party apps will be ready at launch, hoping for HBO Go and Netflix!