
37 posts in this topic

Posted

This guy right here gives a plain explanation for why eSRAM is more complicated

 

http://youtu.be/JJW5OKbh0WA?t=38m57s


Posted

Correct me if I'm wrong, but isn't MS using eSRAM, not eDRAM?


Posted

Correct me if I'm wrong, but isn't MS using eSRAM, not eDRAM?

The argument remains the same

 

The CPU cannot access that eSRAM at all; it is only there as a GPU cache.
Compute workloads can only run across the slower DDR3. If the GPU needs to return a result to the CPU, it has to be written to the common DDR3, as that is the only coherent memory in the system.

Truth be told, the Xbone's memory arrangement isn't *terrible*; there are a number of systems out there that use something similar. The PS4 with its hUMA configuration is just substantially better.
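To put a rough number on that round trip, here's a minimal sketch; the ~68 GB/s DDR3 figure is an assumed ballpark rather than a confirmed spec:

```python
# Rough cost of returning a GPU compute result through DDR3, the only
# coherent pool in this design. The ~68 GB/s DDR3 figure is an assumed
# ballpark, not a confirmed spec; this counts pure transfer time and
# ignores latency and contention, so real numbers would be worse.

def copy_ms(buffer_mib, bandwidth_gb_s=68.0):
    """Milliseconds to move `buffer_mib` MiB across a link of
    `bandwidth_gb_s` GB/s (pure transfer time)."""
    return buffer_mib * 2**20 / (bandwidth_gb_s * 1e9) * 1000

# A 64 MiB compute result costs about a millisecond of pure transfer
# time each way -- noticeable against a 16.6 ms/frame (60 fps) budget.
print(round(copy_ms(64), 2))  # ~0.99 ms
```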


Posted

This Sony guy just admitted the eDRAM method does indeed give you more bandwidth, but it's more complicated. Just because Sony doesn't know how to make it easier for developers to use doesn't mean Microsoft doesn't. And Microsoft isn't even using eDRAM; they are using eSRAM, which is way faster.


Posted

The argument remains the same

 

The CPU cannot access that eSRAM at all; it is only there as a GPU cache.
Compute workloads can only run across the slower DDR3. If the GPU needs to return a result to the CPU, it has to be written to the common DDR3, as that is the only coherent memory in the system.

Truth be told, the Xbone's memory arrangement isn't *terrible*; there are a number of systems out there that use something similar. The PS4 with its hUMA configuration is just substantially better.

 

 

Do you know what's more important than memory architecture? A good API and SDK that allow you to make use of the system, or even make use of the system for you.

MS has been making great SDKs that let you do this since the original Xbox, and if you don't purposely make use of it, it makes use of it for you. Sony's history of making good SDKs is... well, nonexistent.

Meanwhile, developers who pretty much lived in the Sony bubble and lived and breathed Sony (i.e. the Metal Gear Solid dude) are saying the difference is minimal and won't have any real effect.


Posted

This Sony guy just admitted the eDRAM method does indeed give you more bandwidth, but it's more complicated. Just because Sony doesn't know how to make it easier for developers to use doesn't mean Microsoft doesn't. And Microsoft isn't even using eDRAM; they are using eSRAM, which is way faster.

Do you know what's more important than memory architecture? A good API and SDK that allow you to make use of the system, or even make use of the system for you.

MS has been making great SDKs that let you do this since the original Xbox, and if you don't purposely make use of it, it makes use of it for you. Sony's history of making good SDKs is... well, nonexistent.

Meanwhile, developers who pretty much lived in the Sony bubble and lived and breathed Sony (i.e. the Metal Gear Solid dude) are saying the difference is minimal and won't have any real effect.

 

Throwing ungodly amounts of bandwidth at a GPU does nothing for it unless the GPU actually has the execution resources to make use of it.
It's like installing an 8-lane highway in a town with only 12 people. You have plenty of wide-open lanes, but you can never fill them.

The PS4 has more bandwidth and 50% more ALUs than the Xbone. It has that high bandwidth because it actually has the GPU to use it.
A reference Radeon 7850 with 16 GCN engines has 153.6 GB/s of memory bandwidth.
The PS4's GPU with 18 GCN engines has 176 GB/s of memory bandwidth.
The Xbone only has 12 GCN engines. Giving 12 GCN engines 100,000,000 GB/s of memory bandwidth will literally not improve their performance at all over even 150 GB/s.
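That "bandwidth without compute" point can be made concrete with a toy roofline model (a standard way of reasoning about compute-bound vs. bandwidth-bound workloads). A minimal Python sketch, using the thread's CU counts and ballpark GFLOP/s figures as assumptions:

```python
# Toy roofline model: attainable throughput is capped by whichever is
# lower, raw compute or (bandwidth x arithmetic intensity).
# The GPU figures below are illustrative ballpark numbers from this
# thread, not verified hardware specs.

def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Attainable GFLOP/s under the classic roofline model."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

# Assumed peaks: 18 CUs @ 800 MHz ~= 1843 GFLOP/s; 12 CUs ~= 1229
# (64 lanes/CU x 2 FLOPs x clock), using the thread's CU counts.
ps4   = attainable_gflops(1843, 176, flops_per_byte=4)  # 704: bandwidth-bound
xbone = attainable_gflops(1229, 1e8, flops_per_byte=4)  # huge BW: capped at 1229

# Past the point where bandwidth covers peak compute, extra GB/s adds nothing:
assert attainable_gflops(1229, 150, 10) == attainable_gflops(1229, 100_000_000, 10)
```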


Posted

Throwing ungodly amounts of bandwidth at a GPU does nothing for it unless the GPU actually has the execution resources to make use of it.
It's like installing an 8-lane highway in a town with only 12 people. You have plenty of wide-open lanes, but you can never fill them.

The PS4 has more bandwidth and 50% more ALUs than the Xbone. It has that high bandwidth because it actually has the GPU to use it.
A reference Radeon 7850 with 16 GCN engines has 153.6 GB/s of memory bandwidth.
The PS4's GPU with 18 GCN engines has 176 GB/s of memory bandwidth.
The Xbone only has 12 GCN engines. Giving 12 GCN engines 100,000,000 GB/s of memory bandwidth will literally not improve their performance at all over even 150 GB/s.

 

 

LOL, so now anything over 150 GB/s is useless... Right. Now that the tables have turned, and it turns out the Xbox One has more bandwidth, all of a sudden it doesn't matter.

And lack of bandwidth caps your card; it doesn't matter if it has 100,000,000 GCN engines. Unless you have benchmarks, you cannot speak of performance. We know Microsoft had the eDRAM configuration last gen, and developers didn't have to jump through crazy hoops to develop games. Microsoft has the right tools to take advantage of their configuration. Microsoft also had an ASIC last gen that took a lot of computation-intensive elements away from the GPU, such as MSAA, alpha blending, and Z-buffering. You don't have the die shots down to the gate and transistor levels.

You are also forgetting the fact that the Xbox One will have a custom HD audio engine chip that will free up a whole bunch of CPU cycles. This matters a whole lot when it comes to actual performance.

In that video, Sony basically just admitted they used the easier method because they don't want to overcomplicate things for developers, since they proved last gen they don't have the software know-how to take advantage of such complex systems. They just did the exact same thing with their PS4 chip, taking the easy way out by using 18 GCNs. Microsoft took the complicated way last time around by using the eDRAM configuration and some other custom logic on their chip. Looks like they are doing the same today, and by the sounds of it, with stuff like this latest news coming out, I wouldn't be surprised if the Xbox One ends up being the one with the much superior and better-performing hardware.


Posted

LOL, so now anything over 150 GB/s is useless... Right. Now that the tables have turned, and it turns out the Xbox One has more bandwidth, all of a sudden it doesn't matter.

And lack of bandwidth caps your card; it doesn't matter if it has 100,000,000 GCN engines. Unless you have benchmarks, you cannot speak of performance. We know Microsoft had the eDRAM configuration last gen, and developers didn't have to jump through crazy hoops to develop games. Microsoft has the right tools to take advantage of their configuration. Microsoft also had an ASIC last gen that took a lot of computation-intensive elements away from the GPU, such as MSAA, alpha blending, and Z-buffering. You don't have the die shots down to the gate and transistor levels.

You are also forgetting the fact that the Xbox One will have a custom HD audio engine chip that will free up a whole bunch of CPU cycles. This matters a whole lot when it comes to actual performance.

In that video, Sony basically just admitted they used the easier method because they don't want to overcomplicate things for developers, since they proved last gen they don't have the software know-how to take advantage of such complex systems. They just did the exact same thing with their chip, taking the easy way out by using 18 GCNs. Microsoft took the complicated way last time around by using the eDRAM configuration and some other custom logic on their chip.

LOL, you see how fanboys can turn ish in their favour...


Posted

Throwing ungodly amounts of bandwidth at a GPU does nothing for it unless the GPU actually has the execution resources to make use of it.
It's like installing an 8-lane highway in a town with only 12 people. You have plenty of wide-open lanes, but you can never fill them.

The PS4 has more bandwidth and 50% more ALUs than the Xbone. It has that high bandwidth because it actually has the GPU to use it.
A reference Radeon 7850 with 16 GCN engines has 153.6 GB/s of memory bandwidth.
The PS4's GPU with 18 GCN engines has 176 GB/s of memory bandwidth.
The Xbone only has 12 GCN engines. Giving 12 GCN engines 100,000,000 GB/s of memory bandwidth will literally not improve their performance at all over even 150 GB/s.

Don't bother trying to explain anything technical to those on team Xbox. As we've seen in the big thread about the PS4/One specs, it's largely a waste of effort, as they don't care much about anything other than their own theories.

LOL, so now anything over 150 GB/s is useless... Right. Now that the tables have turned, and it turns out the Xbox One has more bandwidth, all of a sudden it doesn't matter.

And lack of bandwidth caps your card; it doesn't matter if it has 100,000,000 GCN engines. Unless you have benchmarks, you cannot speak of performance. We know Microsoft had the eDRAM configuration last gen, and developers didn't have to jump through crazy hoops to develop games. Microsoft has the right tools to take advantage of their configuration. Microsoft also had an ASIC last gen that took a lot of computation-intensive elements away from the GPU, such as MSAA, alpha blending, and Z-buffering. You don't have the die shots down to the gate and transistor levels.

You are also forgetting the fact that the Xbox One will have a custom HD audio engine chip that will free up a whole bunch of CPU cycles. This matters a whole lot when it comes to actual performance.

On a GPU of the One's power, 150 GB/s is a bit higher than necessary, and anything more than that is definitely more than it could take full advantage of. There's no being a fan of one side or another about it; it's just a simple fact. Another fact is that even if the increased bandwidth gives it a little boost in performance, it still won't suddenly make it able to match the peak performance of a box with a noticeably more powerful GPU.

Also, this isn't the '90s anymore. The difference a dedicated audio chip will make to performance on a modern computer or game console is minuscule.


Posted

Don't bother trying to explain anything technical to those on team Xbox. As we've seen in the big thread about the PS4/One specs, it's largely a waste of effort, as they don't care much about anything other than their own theories.

That's fine; at least those who want to read up and remain objective will have the info readily available for them to see.


Posted

On a GPU of the One's power, 150 GB/s is a bit higher than necessary, and anything more than that is definitely more than it could take full advantage of.

Really? Care to give some technical examples to show us why 150 GB/s is more than necessary?

There's no being a fan of one side or another about it; it's just a simple fact. Another fact is that even if the increased bandwidth gives it a little boost in performance, it still won't suddenly make it able to match the peak performance of a box with a noticeably more powerful GPU.

Again, GCNs are just one aspect of a GPU, just like bus width, RAM configuration, and other things we don't know about. There are examples from last gen that show there are more customized elements in GPUs that massively affect performance.

Also, this isn't the '90s anymore. The difference a dedicated audio chip will make to performance on a modern computer or game console is minuscule.

Are you kidding? We're not talking about a dumb buffer fill like a CPU-assisted sound card. Try, for example, using some audio processing plugins in your digital audio workstation application, and wait for your powerful CPU to process the audio signal. Yeah, not that simple.


Posted

LOL, so now anything over 150 GB/s is useless... Right. Now that the tables have turned, and it turns out the Xbox One has more bandwidth, all of a sudden it doesn't matter.

And lack of bandwidth caps your card; it doesn't matter if it has 100,000,000 GCN engines. Unless you have benchmarks, you cannot speak of performance. We know Microsoft had the eDRAM configuration last gen, and developers didn't have to jump through crazy hoops to develop games. Microsoft has the right tools to take advantage of their configuration. Microsoft also had an ASIC last gen that took a lot of computation-intensive elements away from the GPU, such as MSAA, alpha blending, and Z-buffering. You don't have the die shots down to the gate and transistor levels.

You are also forgetting the fact that the Xbox One will have a custom HD audio engine chip that will free up a whole bunch of CPU cycles. This matters a whole lot when it comes to actual performance.

In that video, Sony basically just admitted they used the easier method because they don't want to overcomplicate things for developers, since they proved last gen they don't have the software know-how to take advantage of such complex systems. They just did the exact same thing with their PS4 chip, taking the easy way out by using 18 GCNs. Microsoft took the complicated way last time around by using the eDRAM configuration and some other custom logic on their chip. Looks like they are doing the same today, and by the sounds of it, with stuff like this latest news coming out, I wouldn't be surprised if the Xbox One ends up being the one with the much superior and better-performing hardware.

Just to answer

The XBone is 60% less powerful than the PS4 in processing; adding 30 times more bandwidth won't help for jack because of that. There's also the 3GB of RAM reserved for the OS at all times and 2 for Kinect. What developers have asked for the most is more RAM.

If your embedded memory bandwidth is 1000 TB/s, it's still 32MB, and it's isolated from the main RAM pool, meaning you're going to have to go through those "move engine" co-processors to get there. The more I learn about the XBone's design, the more I think the move engines are really going to be its Achilles' heel. If you're shuffling data in and out of 32MB of what is essentially glorified cache, those move engines are going to have to be cranking a mile a minute; developers will likely have to reprogram them to suit their needs depending on the game, and they WILL cause bottlenecks no matter how you slice it.
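A back-of-envelope sketch of that shuffling problem; the DMA bandwidth figure here is a made-up assumption purely for illustration:

```python
# Back-of-envelope: how often can you fully refill a 32 MB scratchpad
# through a DMA path of a given bandwidth? The 25 GB/s figure below is
# an illustrative assumption, not a confirmed spec for the move engines.

ESRAM_BYTES = 32 * 1024 * 1024

def refills_per_frame(dma_bandwidth_gbs, fps=60):
    """Full 32 MB swaps possible per frame at the given DMA bandwidth."""
    bytes_per_second = dma_bandwidth_gbs * 1e9
    return bytes_per_second / ESRAM_BYTES / fps

# At an assumed 25 GB/s DMA path and 60 fps you get roughly a dozen
# full 32 MB swaps per frame -- every one of them is latency and
# scheduling work that the developer (or the tools) must hide.
print(round(refills_per_frame(25), 1))
```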


Posted

Serious question: Do games these days (or the near future) on a high-end PC or console actually take that kind of bandwidth and make use of it?


Posted

It's simply not true. One of the Xbox 360's greatest advantages was its 10MB eDRAM buffer. It was not difficult to use, especially at 720p. Even at 1080p, with some extra legwork, it was very effective. The thing to keep in mind is that it's tailored to the most memory-bandwidth-intensive tasks the system performs, which are those with high-throughput, low-size requirements: namely, the frame buffer. This frees the main memory to have ample bandwidth for everything else, which tends to be tasks requiring a lot of low-latency memory but not necessarily all that much bandwidth. These are generalizations, of course, but the success of this model in the 360, and the fact that the XB1 has a larger, faster chunk of it (enough for a 1080p frame buffer without any fancy tricks), is encouraging.

Also, take note of the DX 11.2 work discussed at Build this week. One of the big focuses was on making drastically more efficient use of graphics memory via an impressive new tiled-resources architecture. Don't think for a second that they didn't design the XB1 with this in mind.
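For what it's worth, the frame buffer sizing is easy to check with napkin arithmetic (uncompressed 32-bit color and depth assumed; real engines use more render targets, so treat these as lower bounds):

```python
# Rough frame buffer sizing: 32-bit color plus 32-bit depth, no
# compression. This is the simple arithmetic behind "32 MB is enough
# for a 1080p frame buffer".

def buffer_mb(width, height, bytes_per_pixel=4, samples=1):
    """Size of one render target in MiB."""
    return width * height * bytes_per_pixel * samples / 2**20

# 1080p color + depth comfortably fits the XB1's 32 MB of eSRAM:
color_1080 = buffer_mb(1920, 1080)  # ~7.9 MiB
depth_1080 = buffer_mb(1920, 1080)  # ~7.9 MiB
assert color_1080 + depth_1080 < 32

# 720p with 4x MSAA (color + depth) overflows the 360's 10 MB eDRAM,
# which is why the 360 needed predicated tiling for full-res MSAA:
msaa_720 = buffer_mb(1280, 720, samples=4) * 2  # ~28 MiB
assert msaa_720 > 10
```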



Posted

Throwing ungodly amounts of bandwidth at a GPU does nothing for it unless the GPU actually has the execution resources to make use of it.
 

And what makes you think that it doesn't... :s

In the same video posted, Mark Cerny states that the GPU is doing far more than just graphics; GPUs can do so much more nowadays.


Posted

And what makes you think that it doesn't... :s

In the same video posted, Mark Cerny states that the GPU is doing far more than just graphics; GPUs can do so much more nowadays.

On a GPU with 12 GCN engines clocked at just 800MHz, yes. That is a fact. Increasing memory bandwidth does literally nothing if the GPU itself can't process workloads fast enough to consistently fill the memory.


Posted

The point is, it's not only about the power, bandwidth, or even faster memory speed.

Unified memory is the type of architecture most developers wanted; it means faster use of resources and no segmentation or complicated way to manage memory between the CPU and GPU. In the end it's a more optimized approach.

Too bad all of this is wasted on multiplatform games, since every single developer will just develop with the lowest common spec in mind. But first-party and exclusive games will shine...


Posted

Just to answer

 

The XBone is 60% less powerful than the PS4 in processing; adding 30 times more bandwidth won't help for jack because of that. There's also the 3GB of RAM reserved for the OS at all times and 2 for Kinect. What developers have asked for the most is more RAM.

No, that is incorrect. The Xbox has 33% less compute unit power than the PS4, not 60%.

And just so you understand how compute units don't tell you actual performance, I'm going to show you some examples. Last gen, the Xbox 360 had the eDRAM die that also enabled penalty-free 4xMSAA, alpha blending, and Z-buffering.

Check out these benchmarks at around 1080p resolutions, where turning on 4xMSAA nearly halves the framerate on GPUs. Yeah, the Xbox 360 had this ASIC that did that stuff without affecting performance. Instead of your compute units doing all the work, it is offloaded to special logic that does it for you, freeing up the compute units for other functions. This shows how performance here was not told by how many compute units the card had.

If you compare a card that had this ASIC and 8 compute units vs. a card that has 16 compute units, then based on these benchmarks you could make an assumption that these two cards could run games at the exact same frames per second.

http://www.hardocp.com/article/2011/07/18/nvidias_new_fxaa_antialiasing_technology/2#.Uc5Ab8DD9dg

If your embedded memory bandwidth is 1000 TB/s, it's still 32MB, and it's isolated from the main RAM pool, meaning you're going to have to go through those "move engine" co-processors to get there. The more I learn about the XBone's design, the more I think the move engines are really going to be its Achilles' heel. If you're shuffling data in and out of 32MB of what is essentially glorified cache, those move engines are going to have to be cranking a mile a minute; developers will likely have to reprogram them to suit their needs depending on the game, and they WILL cause bottlenecks no matter how you slice it.

Cause bottlenecks? What are you talking about? You have no idea what developers have to do or not do, because you don't have the development tools and you don't know how it works. The Xbox 360 had 10MB of eDRAM, and I've never heard complaints about how hard it was to develop for. I'm going to assume it was simple because of the tools provided by Microsoft, given their decades of experience with software.



Posted

Just to answer

 

The XBone is 60% less powerful than the PS4 in processing; adding 30 times more bandwidth won't help for jack because of that. There's also the 3GB of RAM reserved for the OS at all times and 2 for Kinect. What developers have asked for the most is more RAM.

If your embedded memory bandwidth is 1000 TB/s, it's still 32MB, and it's isolated from the main RAM pool, meaning you're going to have to go through those "move engine" co-processors to get there. The more I learn about the XBone's design, the more I think the move engines are really going to be its Achilles' heel. If you're shuffling data in and out of 32MB of what is essentially glorified cache, those move engines are going to have to be cranking a mile a minute; developers will likely have to reprogram them to suit their needs depending on the game, and they WILL cause bottlenecks no matter how you slice it.

 

Where do people get these percentages?

 

I've heard 15, 30, 40, 50 and now 60%. Seems like no one really knows anything and they're just throwing numbers around.


Posted

On a GPU with 12 GCN engines clocked at just 800MHz, yes. That is a fact. Increasing memory bandwidth does literally nothing if the GPU itself can't process workloads fast enough to consistently fill the memory.

 

GPUs have a large number of cores capable of performing a huge amount of calculations simultaneously. My GTX 560 can run a simple ray tracer in real time using CUDA; these things are anything but lacking in processing power. The bottleneck has always been getting data to the cores faster.


Posted

Really? Care to give some technical examples to show us why 150 GB/s is more than necessary?

Again, GCNs are just one aspect of a GPU, just like bus width, RAM configuration, and other things we don't know about. There are examples from last gen that show there are more customized elements in GPUs that massively affect performance.

Are you kidding? We're not talking about a dumb buffer fill like a CPU-assisted sound card. Try, for example, using some audio processing plugins in your digital audio workstation application, and wait for your powerful CPU to process the audio signal. Yeah, not that simple.

Just look at various graphics cards. You don't see many, if any, with around the same power as the One's GPU sporting 150 GB/s or more of bandwidth. There's a reason for that, and it should be pretty obvious.

Yes, compute is one aspect, that's true, but when you cut down that part of a GPU you usually cut down other parts in the process. Again, look at any modern PC GPU for examples if you're unsure.

Too bad this is a game console and not a digital audio workstation. If it were the latter, you might actually have some kind of point.


Posted

That's not quite true. Memory bandwidth for the frame buffer lets you do better AA, for example, without all that substantial an effect on compute resources for these kinds of chips. And unless you're targeting 4K rendering, the max resolution you have is 1080p, which is far less demanding on memory bandwidth than the higher resolutions some hardcore gamers might run their PC games at.

 

Apparently quoting is broken. Was replying to Motoko saying that bandwidth does nothing with fewer compute engines.
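A quick sketch of how frame-buffer traffic scales with resolution (the overdraw factor is a made-up illustrative number, not a measured workload):

```python
# Frame-buffer traffic scales linearly with pixel count, so a 1080p
# target is far less bandwidth-hungry than 4K. The overdraw factor is
# an illustrative assumption about a hypothetical workload.

def fb_traffic_gbs(width, height, fps=60, bytes_per_pixel=4, overdraw=4):
    """GB/s of raw color writes, assuming each pixel is touched
    `overdraw` times per frame (hypothetical workload)."""
    return width * height * bytes_per_pixel * overdraw * fps / 1e9

gb_1080 = fb_traffic_gbs(1920, 1080)  # ~2 GB/s
gb_4k   = fb_traffic_gbs(3840, 2160)  # exactly 4x the 1080p figure
assert abs(gb_4k / gb_1080 - 4) < 1e-9
```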


Posted

No, that is incorrect. The Xbox has 33% less compute unit power than the PS4, not 60%.

And just so you understand how compute units don't tell you actual performance, I'm going to show you some examples. Last gen, the Xbox 360 had the eDRAM die that also enabled penalty-free 4xMSAA, alpha blending, and Z-buffering.

Check out these benchmarks at around 1080p resolutions, where turning on 4xMSAA nearly halves the framerate on GPUs. Yeah, the Xbox 360 had this ASIC that did that stuff without affecting performance. Instead of your compute units doing all the work, it is offloaded to special logic that does it for you, freeing up the compute units for other functions. This shows how performance here was not told by how many compute units the card had.

If you compare a card that had this ASIC and 8 compute units vs. a card that has 16 compute units, then based on these benchmarks you could make an assumption that these two cards could run games at the exact same frames per second.

http://www.hardocp.com/article/2011/07/18/nvidias_new_fxaa_antialiasing_technology/2#.Uc5Ab8DD9dg

Cause bottlenecks? What are you talking about? You have no idea what developers have to do or not do, because you don't have the development tools and you don't know how it works. The Xbox 360 had 10MB of eDRAM, and I've never heard complaints about how hard it was to develop for. I'm going to assume it was simple because of the tools provided by Microsoft, given their decades of experience with software.

The most benefit the eSRAM will be able to provide is the framebuffer advantage that the eDRAM provided on the 360, but that tech is going to be essentially a moot point now, because the PS4's unified RAM is more than up to the task of a 1080p framebuffer. If you wanted to do anything ELSE with it, the amount of transferring in and out of main RAM you'd have to do (particularly for CPU tasks exceeding the cache, because the CPU has no direct access to the eSRAM) would essentially nullify any sort of benefit.

The eDRAM in the 360 is only there for use as a frame buffer. It can essentially be free AA. It is not even remotely similar to the eSRAM employed in the Xbone.

Just to clarify further: the eDRAM of the Xbox 360 is not just memory; it does half the drawing work of the GPU. Besides the 10MB of memory, there are 192 ROPs embedded on the memory die itself, hence the "free MSAA". The memory draws itself and asks the GPU for the pixel-shaded data.

Now the Xbone memory is just memory, so no free MSAA for it.


Posted

Motoko, you don't seem to understand how the embedded memory is used. It is never copied to main memory. It is written by the GPU (and read when it's modifying it) and then pushed out to the screen directly. At least this is the model used in the 360. It is very easy and natural to work with.


Posted

Motoko, you don't seem to understand how the embedded memory is used. It is never copied to main memory. It is written by the GPU (and read when it's modifying it) and then pushed out to the screen directly. At least this is the model used in the 360. It is very easy and natural to work with.

Please refer to my post above.

