The complete XBOX One architects interview - DF

xbox one

35 replies to this topic

#1 vcfan

vcfan

    POP POP RET

  • Tech Issues Solved: 3
  • Joined: 12-June 11

Posted 05 October 2013 - 20:15

There's a lot of misinformation out there and a lot of people who don't get it.

 

Real-world RAM performance over 200GB/s. ~50% more bandwidth than the competition
 

It will be true that you can go directly, simultaneously to DRAM and ESRAM.

That equivalent on ESRAM would be 218GB/s. However, just like main memory, it's rare to be able to achieve that over long periods of time so typically an external memory interface you run at 70-80 per cent efficiency.

we've measured about 140-150GB/s for ESRAM. That's real code running. That's not some diagnostic or some simulation case or something like that. That is real code that is running at that bandwidth. You can add that to the external memory and say that that probably achieves in similar conditions 50-55GB/s and add those two together you're getting in the order of 200GB/s across the main memory and internally.

Digital Foundry: So 140-150GB/s is a realistic target and you can integrate DDR3 bandwidth simultaneously?

Nick Baker: Yes. That's been measured.
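
The arithmetic behind the "order of 200GB/s" figure quoted above can be checked in a few lines. This is only a sketch: the numbers are the ones quoted from the interview, the function names are made up for illustration, and it assumes (as the interview claims) that ESRAM and DDR3 traffic really can run at the same time.

```python
# Figures quoted from the interview above, in GB/s.
ESRAM_PEAK = 218             # "that equivalent on ESRAM would be 218GB/s"
ESRAM_MEASURED = (140, 150)  # measured with real code
DDR3_MEASURED = (50, 55)     # estimate for main memory in similar conditions


def combined_range(a, b):
    """Add two (low, high) ranges, assuming both memories are used at once."""
    return (a[0] + b[0], a[1] + b[1])


def efficiency(measured, peak):
    """Fraction of peak bandwidth reached at each end of a measured range."""
    return tuple(m / peak for m in measured)


low, high = combined_range(ESRAM_MEASURED, DDR3_MEASURED)
print(f"combined: {low}-{high} GB/s")  # combined: 190-205 GB/s

eff_low, eff_high = efficiency(ESRAM_MEASURED, ESRAM_PEAK)
print(f"ESRAM efficiency vs peak: {eff_low:.0%}-{eff_high:.0%}")
```

So "about 200GB/s" is the sum of two measured ranges, not a single measured number, and the ESRAM part sits somewhat below the 70-80 per cent efficiency mentioned for an external memory interface.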

The biggest thing in terms of the number of compute units, that's been something that's been very easy to focus on. It's like, hey, let's count up the number of CUs, count up the gigaflops and declare the winner based on that. My take on it is that when you buy a graphics card, do you go by the specs or do you actually run some benchmarks? Firstly though, we don't have any games out. You can't see the games. When you see the games you'll be saying, "What is the performance difference between them?" The games are the benchmarks.

 

 

Explaining what balanced means in a system.

The goal of a 'balanced' system is by definition not to be consistently bottlenecked on any one area. In general with a balanced system there should rarely be a single bottleneck over the course of any given frame - parts of the frame can be fill-rate bound, other can be ALU bound, others can be fetch bound, others can be memory bound, others can be wave occupancy bound, others can be draw-setup bound, others can be state change bound, etc. To complicate matters further, the GPU bottlenecks can change within the course of a single draw call!

 

How important the CPU is to frame-rates, and why CPU offloading was a big part of the design

Another very important thing for us in terms of design on the system was to ensure that our game had smooth frame-rates. Interestingly, the biggest source of your frame-rate drops actually comes from the CPU, not the GPU. Adding the margin on the CPU... we actually had titles that were losing frames largely because they were CPU-bound in terms of their core threads. In providing what looks like a very little boost, it's actually a very significant win for us in making sure that we get the steady frame-rates on our console. And so that was a key design goal of ours - and we've got a lot of CPU offload going on.

 

 

 

The scaler sounds cool. It can dynamically change per frame to reduce frame drops.

We've done things on the GPU side as well with our hardware overlays to ensure more consistent frame-rates. We have two independent layers we can give to the titles where one can be 3D content, one can be the HUD. We have a higher quality scaler than we had on Xbox 360. What this does is that we actually allow you to change the scaler parameters on a frame-by-frame basis. I talked about CPU glitches causing frame glitches... GPU workloads tend to be more coherent frame to frame. There doesn't tend to be big spikes like you get on the CPU and so you can adapt to that.

 

About the function of the eMMC memory.

 

Digital Foundry: Another thing that came up from the Hot Chips presentation that was new information was the eMMC NAND which I hadn't seen any mention of. I'm told it's not available for titles. So what does it do?

Andrew Goossen: Sure. We use it as a cache system-side to improve system response and again not disturb system performance on the titles running underneath. So what it does is that it makes our boot times faster when you're not coming out of the sleep mode - if you're doing the cold boot. It caches the operating system on there. It also caches system data on there while you're actually running the titles and when you have the snap applications running concurrently. It's so that we're not going and hitting the hard disk at the same time that the title is. All the game data is on the HDD. We wanted to be moving that head around and not worrying about the system coming in and monkeying with the head at an inopportune time.

 

Clock increases

Digital Foundry: Can you talk us through how you arrived at the CPU and GPU increases that you did and did it have any effect on production yield?

Nick Baker: We knew we had headroom. We didn't know what we wanted to do with it until we had real titles to test on. How much do you increase the GPU by? How much do you increase the CPU by?



lots more interesting details about the architecture

 

 

http://www.eurogamer...x-one-interview




#2 GotBored

GotBored

    Brain Trust

  • Tech Issues Solved: 3
  • Joined: 24-June 13
  • OS: Windows 8.1
  • Phone: iPhone 5

Posted 06 October 2013 - 05:57

So for the first few milliseconds of usage of the 32MB of ESRAM you can transfer at a rate of 150GB/s but what happens after that when the data in the ESRAM is read by the Processor and the ESRAM needs to put more data inside?

It gets data off the DDR3 which has a transfer rate of 50-55GB/s? or worse cause they mention that the DDR3 and ESRAM will function simultaneously from the HDD? 

 

All Microsoft did here was put in an additional step in processing, which gives the impression that it has one thing over the PS4 in terms of system specs. Realistically, that 32MB of ESRAM will be pointless outside of maybe UI usage, as today's games go through 32MB of data almost instantaneously; after that the next step in the process becomes the bottleneck: the Xbox One's DDR3, then after that the HDD.

 

In regards to the eMMC NAND, it's flash memory, the same as an SSD; basically they put a solid-state chip in there with the OS for faster boot times. This is the only system specs advantage over the PS4, but it's a short-lived one, because the PS4 HDD is user-replaceable and anyone who wants the faster boot times can just replace it with an SSD. Doing so will give more benefits than a small eMMC NAND flash memory chip, because it will also improve game load times and future-proof the console for when OS data becomes too big for a size-limited eMMC NAND chip.

 

I know people get upset when someone compares system stats of the Xbox One with the PS4, but as it is its direct competitor the PS4 is the benchmark for comparisons. 



#3 JonnyLH

JonnyLH

    I say things.

  • Joined: 15-February 13
  • Location: UK
  • OS: W8, W7, WP8, iOS, Ubuntu
  • Phone: Nokia Lumia 920

Posted 06 October 2013 - 12:13

I saw this article when it first got published and it's a very good read. It's fascinating how and why they've gone with the design changes they have, especially around the software stack and building the box around virtualisation with no overheads.

 

So for the first few milliseconds of usage of the 32MB of ESRAM you can transfer at a rate of 150GB/s but what happens after that when the data in the ESRAM is read by the Processor and the ESRAM needs to put more data inside?

It gets data off the DDR3 which has a transfer rate of 50-55GB/s? or worse cause they mention that the DDR3 and ESRAM will function simultaneously from the HDD? 

 

All Microsoft did here was put in an additional step in processing, which gives the impression that it has one thing over the PS4 in terms of system specs. Realistically, that 32MB of ESRAM will be pointless outside of maybe UI usage, as today's games go through 32MB of data almost instantaneously; after that the next step in the process becomes the bottleneck: the Xbox One's DDR3, then after that the HDD.

 

In regards to the eMMC NAND, it's flash memory, the same as an SSD; basically they put a solid-state chip in there with the OS for faster boot times. This is the only system specs advantage over the PS4, but it's a short-lived one, because the PS4 HDD is user-replaceable and anyone who wants the faster boot times can just replace it with an SSD. Doing so will give more benefits than a small eMMC NAND flash memory chip, because it will also improve game load times and future-proof the console for when OS data becomes too big for a size-limited eMMC NAND chip.

 

I know people get upset when someone compares system stats of the Xbox One with the PS4, but as it is its direct competitor the PS4 is the benchmark for comparisons. 

Why do people continue to question the best system architects in the world and their decisions? It's beyond me, honestly.

Have you read anything around the article? They aren't throwing numbers around; they got the 204GB/s mark with real code running on the box. Questioning those rates is invalid since MS have actually had those transfer rates on retail hardware.

The eSRAM is useless? It's the key to the RAM infrastructure on the X1. Free AA, lightning-fast post-processing with little overhead. It's a tool which has been used successfully throughout the 360's life cycle. As said in the article, if you have an artefact which has little overdraw, then that'll spill over into DDR3 because it simply doesn't need to lie in eSRAM; it's not going to need that extra bandwidth to post-process.

The X1 turns on instantly when you say "Xbox On" because of its reserved state in the flash. If the PS4 always boots cold, which I'm quite sure it does, it'll never just boot instantly.

This attitude just sums up the threads on N4G and NeoGAF around this; it's ridiculous how false the information people are throwing around is. People claiming balance is a PR term obviously don't have the slightest knowledge of system architecture. In a pipeline, if your stages have power ratios of 1:1:0.5:1, your overall power is 0.5. It's a simple analogy.
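
The pipeline analogy above can be written down directly. This is a minimal sketch under the simplifying assumption that the stages run strictly in series, so the slowest one sets the pace; the function name is hypothetical and real GPU pipelines overlap work in ways this ignores.

```python
def pipeline_throughput(stage_rates):
    """Throughput of stages in series: limited by the slowest stage."""
    if not stage_rates:
        raise ValueError("need at least one stage")
    return min(stage_rates)


# The 1:1:0.5:1 example from the post above.
print(pipeline_throughput([1.0, 1.0, 0.5, 1.0]))  # 0.5

# Doubling an already-fast stage changes nothing; only the 0.5 stage matters.
print(pipeline_throughput([2.0, 1.0, 0.5, 1.0]))  # 0.5
```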



#4 Nilus

Nilus

    Neowinian

  • Joined: 09-March 13
  • Location: England

Posted 06 October 2013 - 15:35

Excuse my ignorance here, I'm just trying to get my head around how these types of memory work.

So whilst they are giving raw numbers, saying DDR3+ESRAM is faster than the GDDR5 in the PS4, how do the amounts of that RAM affect things?

So they have 32MB of memory running at the faster speed, meaning, in my mind, that they can shift 32MB of data at that rate. Whereas the PS4 has GDDR5 at a slower speed than ESRAM, but it has 8GB of it. Meaning it can shift 8GB of data at the slower speed, but due to the volume it would still be shifting far more data?

Does any of that make sense or is it complete rubbish? I am no hardware engineer, I'm not even an armchair hardware engineer; as I said, just trying to think it through logically and get my head around it.
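
One way to think the question above through: bandwidth is a rate, so it is not used up once the memory has been read through a single time. The sketch below assumes a 60 frames-per-second target (an assumption, not from the thread) and uses the upper measured ESRAM figure quoted earlier; the function name is made up for illustration.

```python
FPS = 60               # assumed frame-rate target, not from the article
ESRAM_BW_GB_S = 150    # upper measured ESRAM figure quoted in the thread
ESRAM_SIZE_MB = 32


def per_frame_traffic_mb(bandwidth_gb_s, fps):
    """Data that can cross the memory bus during one frame, in MB."""
    return bandwidth_gb_s * 1000 / fps


traffic = per_frame_traffic_mb(ESRAM_BW_GB_S, FPS)
passes = traffic / ESRAM_SIZE_MB

print(f"{traffic:.0f} MB of ESRAM traffic per frame")     # 2500 MB
print(f"about {passes:.0f} full passes over 32MB per frame")  # about 78
```

In other words, the size of a memory pool says how much can live there at once, while the bandwidth says how often it can be read and rewritten; the two numbers answer different questions.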



#5 OP vcfan


Posted 06 October 2013 - 18:29

So for the first few milliseconds of usage of the 32MB of ESRAM you can transfer at a rate of 150GB/s but what happens after that when the data in the ESRAM is read by the Processor and the ESRAM needs to put more data inside?
It gets data off the DDR3 which has a transfer rate of 50-55GB/s? or worse cause they mention that the DDR3 and ESRAM will function simultaneously from the HDD?

 
Don't be silly. A 32-bit 1080p render target is only 8-12MB. Read the article; it's explained by the engineer, who says the move engines can shift relevant data in or out of the ESRAM. So let's say you have a 1080p render target in the ESRAM that you are working on, and soon you need to swap to another one. The move engine would have already shifted that next render target into ESRAM from DDR during unused ESRAM memory cycles, so you can switch to it instantaneously with no memory operations when you are ready, and the previous render target will be shifted back out the same way.

You're also forgetting one huge point: the compression/decompression engines are part of the move engines. The move engines can shift this data even quicker because it's compressed, so fewer cycles for the move.

From the article:
 

Digital Foundry: Obviously though, you are limited to just 32MB of ESRAM. Potentially you could be looking at say, four 1080p render targets, 32 bits per pixel, 32 bits of depth - that's 48MB straight away. So are you saying that you can effectively separate render targets so that some live in DDR3 and the crucial high-bandwidth ones reside in ESRAM?

Andrew Goossen: Oh, absolutely. And you can even make it so that portions of your render target that have very little overdraw... For example, if you're doing a racing game and your sky has very little overdraw, you could stick those subsets of your resources into DDR to improve ESRAM utilisation. On the GPU we added some compressed render target formats like our 6e4 [six bit mantissa and four bits exponent per component] and 7e3 HDR float formats [where the 6e4 formats] that were very, very popular on Xbox 360, which instead of doing a 16-bit float per component 64bpp render target, you can do the equivalent with us using 32 bits - so we did a lot of focus on really maximizing efficiency and utilisation of that ESRAM.

You can use the Move Engines to move these things asynchronously in concert with the GPU so the GPU isn't spending any time on the move. You've got the DMA engine doing it. Now the GPU can go on and immediately work on the next render target rather than simply move bits around.
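
The render target sizes discussed above are easy to sanity-check. This is a rough sketch that assumes a plain 1920x1080 surface with no MSAA, tiling or alignment padding, and treats the 32MB of ESRAM as 32 MiB; the helper name is hypothetical.

```python
MIB = 1024 * 1024
ESRAM_BYTES = 32 * MIB   # assumes "32MB" means 32 MiB


def render_target_bytes(width, height, bits_per_pixel):
    """Raw size of one render target, ignoring padding, tiling and MSAA."""
    return width * height * bits_per_pixel // 8


rt_32bpp = render_target_bytes(1920, 1080, 32)  # e.g. 8 bits per channel, or 7e3/6e4
rt_64bpp = render_target_bytes(1920, 1080, 64)  # 16-bit float per component

print(f"32bpp target: {rt_32bpp / MIB:.2f} MiB")  # 7.91 MiB
print(f"64bpp target: {rt_64bpp / MIB:.2f} MiB")  # 15.82 MiB
print("32bpp targets that fit in ESRAM:", ESRAM_BYTES // rt_32bpp)  # 4
print("64bpp targets that fit in ESRAM:", ESRAM_BYTES // rt_64bpp)  # 2
```

Under those assumptions a 32bpp HDR format halves the footprint of a 64bpp one, which is the point Goossen makes about the compressed formats.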



#6 Enron

Enron

    Windows for Workgroups

  • Tech Issues Solved: 1
  • Joined: 30-May 11
  • OS: Windows 8.1 U1
  • Phone: Nokia Lumia 900

Posted 06 October 2013 - 18:36

Xbox One is going to be great.



#7 SnoopZ

SnoopZ

    Resistance is Futile

  • Joined: 02-December 04
  • Location: The Twilight Zone

Posted 06 October 2013 - 19:00

Xbox One is going to be great.

From what I saw at the Eurogamer show it seems to be great and I am pleased I chose it.



#8 Showan

Showan

    Neowinian Senior

  • Joined: 28-November 12
  • Location: Amurrika
  • OS: W7, W8
  • Phone: Lumia 521

Posted 06 October 2013 - 20:25

There's also quite a number of other design aspects and requirements that we put in around things like latency, steady frame-rates and that the titles aren't interrupted by the system and other things like that. You'll see this very much as a pervasive ongoing theme in our system design.



-Andrew Goossen

#9 Blackhearted

Blackhearted

    .....

  • Joined: 26-February 04
  • Location: Ohio
  • Phone: Samsung Galaxy S2 (VM)

Posted 06 October 2013 - 20:47

I saw this article when it first got published and it's a very good read. It's fascinating how and why they've gone with the design changes they have, especially around the software stack and building the box around virtualisation with no overheads.

Why do people continue to question the best system architects in the world and their decisions? It's beyond me, honestly.

Have you read anything around the article? They aren't throwing numbers around; they got the 204GB/s mark with real code running on the box. Questioning those rates is invalid since MS have actually had those transfer rates on retail hardware.

The eSRAM is useless? It's the key to the RAM infrastructure on the X1. Free AA, lightning-fast post-processing with little overhead. It's a tool which has been used successfully throughout the 360's life cycle. As said in the article, if you have an artefact which has little overdraw, then that'll spill over into DDR3 because it simply doesn't need to lie in eSRAM; it's not going to need that extra bandwidth to post-process.

The X1 turns on instantly when you say "Xbox On" because of its reserved state in the flash. If the PS4 always boots cold, which I'm quite sure it does, it'll never just boot instantly.

This attitude just sums up the threads on N4G and NeoGAF around this; it's ridiculous how false the information people are throwing around is. People claiming balance is a PR term obviously don't have the slightest knowledge of system architecture. In a pipeline, if your stages have power ratios of 1:1:0.5:1, your overall power is 0.5. It's a simple analogy.

 

I don't think you read the article either. They got 140-150GB/s out of the ESRAM running real code. A far cry from that 204GB/s you are throwing around.

Also, there's no such thing as completely "Free AA". It didn't exist on the 360, and it likely won't here. Now if you were to say "low cost" then you might be right.

 



#10 OP vcfan


Posted 06 October 2013 - 21:28

Also, there's no such thing as completely "Free AA". It didn't exist on the 360, and it likely won't here. Now if you were to say "low cost" then you might be right.

Actually, there is such a thing as free AA on the 360. The EDRAM die had 192 component processors that could be used to do 4xMSAA without affecting the performance of the GPU. Developers didn't have to use it for AA though, so they could choose to use some low-cost shader-based AA and use the component processors for something else. GTA4 on the 360 had 2xMSAA at 720p, unlike the PS3 version, and it likely used this method.



#11 JonnyLH


Posted 06 October 2013 - 22:06

I don't think you read the article either. They got 140-150GB/s out of the ESRAM running real code. A far cry from that 204GB/s you are throwing around.

Also, there's no such thing as completely "Free AA". It didn't exist on the 360, and it likely won't here. Now if you were to say "low cost" then you might be right.

154+50=204, is there something I missed? He says this in the article, with running code.

There is such a thing as free AA. Well, free in terms of the GPU.

#12 trooper11

trooper11

    Neowinian Senior

  • Tech Issues Solved: 5
  • Joined: 21-November 12

Posted 07 October 2013 - 01:15

 

In regards to the eMMC NAND, it's flash memory, the same as an SSD; basically they put a solid-state chip in there with the OS for faster boot times. This is the only system specs advantage over the PS4, but it's a short-lived one, because the PS4 HDD is user-replaceable and anyone who wants the faster boot times can just replace it with an SSD. Doing so will give more benefits than a small eMMC NAND flash memory chip, because it will also improve game load times and future-proof the console for when OS data becomes too big for a size-limited eMMC NAND chip.

 

I know people get upset when someone compares system stats of the Xbox One with the PS4, but as it is its direct competitor the PS4 is the benchmark for comparisons. 

 

 

You are right about the PS4 having the bonus option of using an SSD, but you forgot to list any of the negatives.

First of all, if you switch to an SSD, you're going to be extremely limited storage-wise vs a standard drive. Secondly, you're going to pay a lot vs a standard HDD. I would hazard a guess that your average gamer will not be making that switch.

I think this is a wash. MS offers a console that can already offer the advantages of an SSD without requiring the user to buy one. I'm not sure about the OS data getting bigger over time though. I mean, usually Sony and MS shrink the OS footprint over time, so I don't know if that would be growing. Maybe you're talking about user data, so I don't know.



#13 Showan


Posted 07 October 2013 - 03:23

This is a brave move on Microsoft's part.

After reading that (only grasping bits and pieces), it seems Microsoft has built one Box (no pun intended) to take on all its challengers in one shot.

They really meant it when they said you only need this one device for ALL your entertainment needs.

 

And by design it's left very very open...  I honestly believe app makers and Indies are going to be in their glory come CES and E3 '14.

 

As I stated in another post... I was thinking so small for Xbox One. 



#14 GotBored


Posted 07 October 2013 - 03:36

You are right about the PS4 having the bonus option of using an SSD, but you forgot to list any of the negatives.

First of all, if you switch to an SSD, you're going to be extremely limited storage-wise vs a standard drive. Secondly, you're going to pay a lot vs a standard HDD. I would hazard a guess that your average gamer will not be making that switch.

I think this is a wash. MS offers a console that can already offer the advantages of an SSD without requiring the user to buy one. I'm not sure about the OS data getting bigger over time though. I mean, usually Sony and MS shrink the OS footprint over time, so I don't know if that would be growing. Maybe you're talking about user data, so I don't know.

 

I will agree that in order for PS4 users to get the extra benefit they will have to pay a premium (as SSDs are fairly expensive in comparison to HDDs). But size-wise you are not limited by an SSD: the standard hard drive on both the PS4 and Xbox One is only 500GB, and 1TB SSDs, which are double that size, have been on sale to average consumers for a few months now (while 2TB SSDs are available to the not-so-average consumer at a ridiculous price).

- SSDs are also increasing in size, with a bigger one out every few months, so the gap will only get bigger over time.

 

I was under the impression that the OS footprint expands over time which is the reason why both Sony and Microsoft reserved more RAM for OS operations than they needed to for future updates and OS additions. And the thing that changed over time was the reserved RAM as they get closer to finalizing the console OS they give more RAM to games as the OS size is more definitive. 



#15 spacer

spacer

    I'm awesome

  • Joined: 09-November 06
  • Location: Connecticut, USA
  • OS: Windows 7
  • Phone: Nexus 4

Posted 07 October 2013 - 04:07

The article was a good read, but theoretical numbers are a long way from practical ones. So far the performance of the games, from what we can tell, has been less than stellar. Time will tell if the One's architecture is really as good as it needs to be.