DirectX 11: Sooner than You Think


While most of the attention at Nvision 2008 seemed to revolve around the LAN party, professional gaming tournaments, booth babes, and interesting games and applications, there was also a professional development conference going on.

Most of the action revolved around CUDA (compute unified device architecture), Nvidia's technology for using GPUs in massively parallel applications. However, Kev Gee of Microsoft showed up and gave a quick one-hour overview of Microsoft's upcoming DirectX 11, Direct3D in particular.

Given the brief time, Gee had to compress a lot of information into an hour, but we came away with a better understanding of what DirectX 11 will bring to the table. In some ways, it's not as radical a change as DirectX 10 was. But it does introduce some important features. What follows is based on Kev Gee's talk at Nvision 2008. Since DirectX 11 is a work in progress, some of these bits of information could change before release.

Building on DirectX 10

DirectX 10 only works with Windows Vista, and offers no backward compatibility with Windows XP. Microsoft cites the fundamental change of display driver model in Vista as a requirement for DX10. DirectX 11 will also work with Vista, but is also targeted at Windows 7, the working title for Microsoft's next operating system release. (Windows 7 is expected to use a fundamentally similar display driver model to Vista, if not exactly the same.)

Let's take a look at the DirectX 11 pipeline.

[Image: DirectX 11 pipeline diagram]

There are three new stages in the standard graphics pipeline: the Hull Shader, Tessellator, and Domain Shader. In addition, changes have been made to the pixel shader to enable compute shaders (for general purpose applications). We'll touch on those shortly.

In addition to the new pipeline stages, DirectX is being tweaked to fully support multithreading. So DirectX 11 DLLs will spawn threads as appropriate on multicore and SMT-enabled CPUs.

Another key addition is a set of new texture compression formats, which enable better image quality and will support high dynamic range. Again, we'll touch on this in more detail a bit later.

A host of lesser features are also being implemented; most don't require new hardware. They include raising the resource limit to 2GB, increasing texture limits to 16K, and adding support for double-precision floating point (this last one is optional, and is aimed at compute shaders).

Now let's drill down on some of the features.

Hardware Tessellation

One of the key goals for DirectX 11 is to enable more robust character authoring, while reducing the time to create complex and realistic characters. The trend has been to build characters with dense triangle meshes, then reduce the complexity depending on the target platform.

This creates a problem: The end result doesn't really jibe with the artist's conception.

Artists and game designers have been pushing for characters with denser triangle meshes, which enable more detailed characters. Animation complexity is also increasing. The net result is fewer pointy heads and moonwalking characters.

More detailed characters with increasingly complex animation drive up memory and storage requirements. This results in bandwidth issues: load times increase, and memory demands on graphics cards go up.

The answer is to use the power of the GPU to generate this additional complexity: hardware tessellation. Industry watchers were a little disappointed that hardware tessellation didn't make it into DX10, but it will be fully implemented in DX11. Note that this is the one feature that absolutely requires DirectX 11 hardware. When Gee was asked if the hardware tessellator currently built into AMD Radeon HD series GPUs would support DX11 tessellation, the answer was "No."

Gee went on to explain that DX11 tessellation is more robust and general than the solution built into current AMD GPUs. The AMD hardware is essentially the same as the tessellation unit in the Xbox 360; DX11 tessellation is a superset of the AMD approach.

The hull shader takes control points for a patch as an input. Note that this is the first appearance of patch-based data used in DirectX. The output of the hull shader essentially tells the tessellator stage how much to tessellate. The tessellator itself is a fixed function unit, taking the outputs from the hull shader and generating the added geometry. The domain shader calculates the vertex positions from the tessellation data, which is passed to the geometry shader.

It's important to recognize that the key primitive used in the tessellator is no longer a triangle: It's a patch. A patch represents a curve or region, and can be represented by a triangle, but the more common representation is a quad, used in many 3D authoring applications.

What all this means is that fully compliant DirectX 11 hardware can procedurally generate complex geometry out of relatively sparse data sets, improving bandwidth and storage requirements. This also affects animation, as changes in the control points of the patch can affect the final output in each frame.
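As a rough illustration of that flow, here is a CPU-side sketch in Python. It is not Direct3D code: the function names, the distance-based factor, the cap of 64, and the bilinear patch are all assumptions made for this example. It only shows how a handful of control points can expand into many vertices, and how the same input can yield more or less detail.

```python
# Conceptual sketch of the hull -> tessellator -> domain flow.
# Every name here is hypothetical; none of this is Direct3D API code.

def hull_stage(control_points, distance):
    """Decide how much to tessellate: closer patches get more subdivisions."""
    # The cap of 64 and the distance formula are arbitrary choices for this sketch.
    factor = max(1, min(64, int(64 / max(distance, 1.0))))
    return control_points, factor

def tessellator_stage(factor):
    """Fixed-function step: emit a regular grid of (u, v) coordinates."""
    steps = [i / factor for i in range(factor + 1)]
    return [(u, v) for v in steps for u in steps]

def domain_stage(control_points, uv):
    """Evaluate a bilinear quad patch at (u, v) to get one vertex position."""
    p00, p10, p01, p11 = control_points
    u, v = uv
    return tuple(
        (1 - u) * (1 - v) * a + u * (1 - v) * b + (1 - u) * v * c + u * v * d
        for a, b, c, d in zip(p00, p10, p01, p11)
    )

def tessellate(control_points, distance):
    points, factor = hull_stage(control_points, distance)
    return [domain_stage(points, uv) for uv in tessellator_stage(factor)]

# Four control points describe the whole patch.
quad = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
near = tessellate(quad, distance=4.0)    # factor 16 -> 17 * 17 = 289 vertices
far = tessellate(quad, distance=32.0)    # factor 2  -> 3 * 3 = 9 vertices
print(len(near), len(far))
```

The input data is identical in both calls; only the factor chosen in the hull step changes how much geometry comes out.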

The cool thing about hardware tessellation is that it's scalable. It's possible that low end hardware would simply generate less complex models than high-end hardware, while the actual data fed into the GPUs remains the same.

Compute Shader

Nvidia and AMD have been pushing GP-GPU for several years now. The 8 series GPUs actually had hardware in place to better enable Nvidia GPUs to act as general purpose compute engines, and the latest 200 series GPUs expand on that.

Companies are sitting up and taking notice of the performance gains possible in certain classes of applications when using the highly parallel compute engines that are part of a modern GPU. Apple is working with the Khronos Group on OpenCL, a standards-based method for general purpose GPU computing, modeled on OpenGL. AMD's Stream SDK enables GP-GPU support for Radeon HD series hardware, across multiple operating systems. Nvidia is probably the furthest along, with its CUDA technology; a host of applications using CUDA is starting to emerge.

DirectX 11 weighs in with compute shaders. The compute shader uses the resources of the GPU to perform post-processing chores, such as blur effects. This required adding syntax and constructs to the DirectX HLSL (high level shading language). The graphics pipeline can now generate data structures that are better suited to general purpose applications, which then can be operated on by the compute shader.
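To give a feel for the kind of post-processing work involved, here is a small Python sketch of a 3x3 box blur written as a per-pixel kernel. It is only an analogy: the function names and the "dispatch" loop are invented for this example, and real compute shaders would be written in HLSL and run many of these kernel invocations in parallel on the GPU.

```python
def blur_kernel(image, x, y):
    """Work done for one output pixel: average the 3x3 neighbourhood of (x, y)."""
    height, width = len(image), len(image[0])
    total, count = 0.0, 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:   # skip samples off the edge
                total += image[ny][nx]
                count += 1
    return total / count

def dispatch(image):
    """Run the kernel once per pixel; here serially, on a GPU in parallel."""
    return [
        [blur_kernel(image, x, y) for x in range(len(image[0]))]
        for y in range(len(image))
    ]

# A single bright pixel in the middle of a dark 3x3 frame.
frame = [
    [0.0, 0.0, 0.0],
    [0.0, 9.0, 0.0],
    [0.0, 0.0, 0.0],
]
blurred = dispatch(frame)
print(blurred[1][1])   # centre pixel: 9.0 averaged over 9 samples -> 1.0
```

Each output pixel depends only on the input image, so the invocations are independent of one another, which is what makes this sort of job a good fit for a highly parallel processor.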

Note that the diagram doesn't imply that the compute shader is somehow part of the pixel shader. Rather, it's a shader that can take output from the graphics pipeline, after that data has passed through the pixel shader.

It's great that Microsoft is implementing compute shaders in DirectX. Once DX11 ships, GPU programmers will have a full array of tools to support general purpose applications on the GPU:

  • CUDA on Windows, MacOS, and Linux on Nvidia GPUs and Intel (and presumably AMD) CPUs
  • Stream SDK for AMD GPUs and CPUs on Windows and Linux
  • OpenCL on MacOS (and possibly other OSes) on both Nvidia and AMD
  • DirectX 11 compute shaders on both Nvidia and ATI GPUs and, presumably, Intel and AMD CPUs in Windows.

Windows and Mac OS programmers in particular won't have to choose which hardware they'll run on; the respective GP-GPU API will generalize support. It's likely that OpenCL will also show up on open source platforms (BSD and Linux). At that point, the future for CUDA and Stream SDK may be limited to vertical applications requiring "closer to the metal" performance.

Multithreading and Dynamic Shader Linkage

Multithreading is a hot topic. Today, dual core CPUs are mainstream, and if Intel's announcement of the Q8300 quad core CPU is any indication, a future where four cores become mainstream isn't that far off.

Both AMD and Nvidia have built better multithreading support into their respective graphics drivers, but the majority of DirectX is still single threaded. Microsoft will rectify this shortfall in DX11, and those benefits will even accrue to applications running on DirectX 10 hardware.

Multithreading support will include asynchronous resource loading, which can actually happen while rendering threads are executing. Draw and state submission will also be threaded, which will allow rendering work to be spread out across multiple threads.
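One way to picture threaded draw submission is as several worker threads each recording their own list of draw calls, with a single thread replaying those lists later. The Python sketch below borrows the immediate/deferred context terms DirectX 11 uses, but the classes and methods are hypothetical stand-ins written for this example, not the Direct3D API.

```python
import threading

class DeferredContext:
    """Hypothetical stand-in: records draw calls instead of executing them."""
    def __init__(self):
        self.commands = []

    def draw(self, object_name):
        self.commands.append(("draw", object_name))

    def finish(self):
        """Hand back the recorded list; nothing has been rendered yet."""
        recorded, self.commands = self.commands, []
        return recorded

class ImmediateContext:
    """Hypothetical stand-in: the one context that actually submits work."""
    def __init__(self):
        self.submitted = []

    def execute(self, command_list):
        self.submitted.extend(command_list)

def record_scene_chunk(objects, results, slot):
    ctx = DeferredContext()            # one recording context per worker thread
    for name in objects:
        ctx.draw(name)
    results[slot] = ctx.finish()       # each worker writes only its own slot

chunks = [["terrain", "trees"], ["player", "enemies"], ["particles"]]
results = [None] * len(chunks)
workers = [
    threading.Thread(target=record_scene_chunk, args=(chunk, results, i))
    for i, chunk in enumerate(chunks)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

immediate = ImmediateContext()
for command_list in results:           # replayed in a fixed order on one thread
    immediate.execute(command_list)
print(immediate.submitted)
```

Recording can happen in any order across threads, while the replay order stays under the control of the thread that owns the immediate context.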

To facilitate all this, DirectX 11 devices are split into device, immediate context, and deferred context interfaces. The immediate context is the current device for state and drawing, while deferred contexts are the per-thread device contexts for future renders. Each device interface can spawn thread resources as needed. The deferred context has support for a type of display list per object. Note that the rendering is actually deferred; this is not the same as drawing to a back buffer and flipping. Rather, each deferred context holds the display list (draw calls) ready for rendering when appropriate.

Dynamic Shader Linkage

Shader linkage is just another step along the way to make DirectX a more flexible and general purpose compute environment. Today, if multiple shaders need to be invoked, a large "uber shader" is created. This contains all the conditional statements needed to invoke whichever individual shader may be needed for a particular situation.

The problem is that this can create huge, unwieldy shaders that are difficult to debug. They also make less efficient usage of available hardware resources.
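For contrast, here is a rough Python analogy of the two styles: a single "uber" routine that branches on a material ID, next to small classes behind a common interface, which is the general direction object oriented shader code takes. The material names and shading formulas are made up for this example, and Python's classes are only a loose stand-in for whatever HLSL ends up offering.

```python
# Uber-shader style: one routine carries every material's branch.
def uber_shade(material_id, base):
    if material_id == 0:                 # matte
        return base * 0.8
    if material_id == 1:                 # glossy
        return min(1.0, base * 1.2 + 0.1)
    if material_id == 2:                 # emissive
        return 1.0
    raise ValueError("unknown material id")

# Interface style: each material is a small class behind a common interface.
class Material:
    def shade(self, base):
        raise NotImplementedError

class Matte(Material):
    def shade(self, base):
        return base * 0.8

class Glossy(Material):
    def shade(self, base):
        return min(1.0, base * 1.2 + 0.1)

class Emissive(Material):
    def shade(self, base):
        return 1.0

def render(material, base):
    """The caller only sees the interface; the concrete class is picked at run time."""
    return material.shade(base)

# Both styles give the same results; only the code organization differs.
materials = [Matte(), Glossy(), Emissive()]
for material_id, material in enumerate(materials):
    assert render(material, 0.5) == uber_shade(material_id, 0.5)
```

Adding a fourth material to the first version means editing the shared routine; in the second it means adding one more small class, and each class can be tested on its own.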

Microsoft's solution is to introduce object oriented features to HLSL: interfaces and classes. This lets graphics programmers create shaders that behave like subroutines that are only loaded when needed.

Improved Texture Compression and Hardware Support

Today's DirectX texture compression is showing its age. When multiple textures are decompressed and displayed, the results are often blocky looking textures, even when the textures themselves are high resolution. On top of that, there's no support for compression of high dynamic range textures.

DirectX 11 introduces two new texture formats, BC6 (sometimes called BC6H) and BC7. BC6 supports HDR textures with 6:1 lossy compression (16 bits per channel). This allows for high visual quality, but it's not lossless.

BC7 works with LDR (low dynamic range) formats, and can include alpha. It offers 3:1 compression for RGB or 4:1 for RGB + alpha. Visual quality should be very high with this format.
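Those ratios can be checked with a little arithmetic. The short Python sketch below assumes that both formats store each 4x4 block of texels in 128 bits; that figure comes from the published format descriptions rather than from Gee's talk, so treat it as an assumption of this example.

```python
BLOCK_BITS = 128            # assumed: one compressed 4x4 block takes 128 bits
TEXELS_PER_BLOCK = 4 * 4

def compression_ratio(bits_per_channel, channels):
    """Uncompressed bits per texel divided by compressed bits per texel."""
    uncompressed_bpp = bits_per_channel * channels
    compressed_bpp = BLOCK_BITS / TEXELS_PER_BLOCK      # 8 bits per texel
    return uncompressed_bpp / compressed_bpp

bc6_hdr_rgb = compression_ratio(16, 3)   # 48 bits per texel down to 8
bc7_rgb = compression_ratio(8, 3)        # 24 bits per texel down to 8
bc7_rgba = compression_ratio(8, 4)       # 32 bits per texel down to 8
print(bc6_hdr_rgb, bc7_rgb, bc7_rgba)    # 6.0 3.0 4.0
```

Under that assumption the numbers line up with the 6:1, 3:1, and 4:1 figures quoted in the talk.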

Microsoft will now require that DX11 hardware decompress textures in a way that exactly matches the DX11 spec. Currently, there's some room for "interpretation" in the way that DX10 and below hardware handles texture decompression.

The block types are designed to offer smoother gradients and much less blocky textures.

Support for DirectX 10 Hardware

Quite a few features, with the exception of hardware tessellation, will be supported on DX10 hardware. Of course, DX10 hardware will continue to run games and apps in DX10 mode. But unlike DX10, which only runs on DX10-compliant hardware, elements of DX11-specific features will also run on DX10 hardware.

Multithreading will work, although deferred contexts will have to be implemented at the API (software) level rather than in the hardware. The object oriented features added should also work, though how efficiently is anyone's guess. The new texture compression formats could be implemented at the driver level, though that would be slower than dedicated hardware.

Sooner Than You Think

While the final bits for DirectX 11 are targeted to ship with the first version of Windows 7, Microsoft will be delivering a preview version of the API as early as November, 2008. At that time, we'll have a better gauge of which features will actually make the cut. But we won't get a full picture until the Windows 7 timeframe, which is likely to be sometime in 2010 (though betas may appear in 2009).

Source: ExtremeTech

Sooner than I think? "While the final bits for DirectX 11 are targeted to ship with the first version of Windows 7" Funny, I thought it was coming with Windows 7.

That goes with the last thing I had heard as well. I really want them to take their time on this one and make sure its features are utilized.

Sooner than I think? "While the final bits for DirectX 11 are targeted to ship with the first version of Windows 7" Funny, I thought it was coming with Windows 7.

"Microsoft will be delivering a preview version of the API as early as November, 2008. At that time, we'll have a better gauge of which features will actually make the cut."

Sooner than I thought.

There aren't any DX10 games out there that perform as they should. If I remember correctly, DX10 was supposed to bring improved image quality and performance at the same time. I'm pretty sure DX11 will conform to the same trend.

All of the new features sound very very interesting (at least to me). The new compression methods, tessellation, compute shaders, etc. Sounds like DX11 will be what DX10 was "supposed" to be. The multi-threadedness of DX11 should give a nice boost in performance as well. Maybe DX11 will finally show better performance than DX9 with better looking graphics.

Will we need new graphic cards? :pinch:

Most DX11 features will be backwards compatible with DX10 hardware, but it'll run slower. It won't be as good as dedicated DX11 hardware.

and be prepared to buy W7 license muhahahaha to get DX11

Directx 11 will be for vista as well because they share the same driver model, the reason dx 10 is vista only is because it is designed for the new vista model and it would take a huge overhaul of dx 10 or xp's driver model to port it to xp. That problem won't exist with windows 7 and microsoft would not be dumb enough to restrict it to 7.

That's not really true, there was a programmer who came out with a wrapper for DX10 in which it could be ported to XP, of course MS wanted nothing to do with it...so that is why we still see DX10 for Vista.

I hope it doesn't fail at life like dx10.

So far, after what, over a year and a half, we've got a single game that performs slightly better in dx10, Assassin's Creed. We were supposed to get better gfx and better performance and so far we have very slightly better gfx at the cost of generally huge performance hits.

Directx 11 will be for vista as well because they share the same driver model, the reason dx 10 is vista only is because it is designed for the new vista model and it would take a huge overhaul of dx 10 or xp's driver model to port it to xp. That problem won't exist with windows 7 and microsoft would not be dumb enough to restrict it to 7.

i know

any way if we went DX11 >>> splash more cash at Graphic card

as DX11 wont work with older card or will be slower to begin with

i would lol if ms locked out DX11 for W7 to begin with

that mean they will have the same problem as DX10

Then they would have slow uptake by devs

it is the chicken and the egg !

That's not really true, there was a programmer who came out with a wrapper for DX10 in which it could be ported to XP, of course MS wanted nothing to do with it...so that is why we still see DX10 for Vista.

The "ported" directx 10 project is crap all it does is convert directx 9 calls to direct x 10 creating much more software overhead (Which is what directx 10 is supposed to eliminate) and some things directx 10 does just can't work with this method. There is no dx 10 for xp. Period.

The only way you're going to have DirectX 10 (or 11) on an old OS like XP, is to emulate the calls that can't be done on hardware, and translate the ones that can be done.

As you'd expect, this isn't fast.

yea

emulation = slower most of the time

I hope it doesn't fail at life like dx10.

So far, after what, over a year and a half, we've got a single game that performs slightly better in dx10, Assassin's Creed. We were supposed to get better gfx and better performance and so far we have very slightly better gfx at the cost of generally huge performance hits.

Maybe when games use DX10 natively you can actually have something to comment on. Games now do not use DX10 for what it's worth at all. It's just marketing, nothing more.

yeah that was kind of my whole point

when dx10 isn't worthless let me know ;)

Well, you'd have to tell the games devs to make native dx10 games first. It's not the API that's worthless in a sense. It's just that devs are always behind the API. This was the case for DX9 as well, they've had DX8.1 games for a good time till they slowly moved over to DX9.

But I think we'll see more DX10 games in 2009 once the market for them is big enough. Right now they just don't wanna have to code two different games to support both XP and Vista.
