• 0

unsafe void DoSomething<T>(T[] array) where T : struct


Question

This is not allowed:

fixed (T* ptr = &array[0])

Also not allowed:

unsafe void DoSomething<T>(void* ptr) where T : struct
{
    T* s = (T*)ptr;
}

even though T is constrained to be a struct, which makes this feel like an artificial limitation AFAIK.

 

I have a generic function used by N classes, and I want to make it unsafe to get more speed; several more functions just like it would follow.

I could write one non-generic function to cover three of the scenarios, since those three structs are all 24 bytes and thus layout-compatible.

Any better solution though?

 

One more,

If I am going to use unsafe everywhere, is there any reason I shouldn't just use Marshal.AllocHGlobal instead of new T[]?


14 answers to this question

Recommended Posts

  • 0

From this it looks like the restriction is because a struct may contain a reference to a managed type.

That makes sense. But alas, a rock and a hard place...
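For what it's worth, one workaround is to pin through GCHandle instead of fixed, which does compile for a generic T[]; the runtime rejects the Alloc call at run time if T turns out to contain managed references, so the struct constraint alone is not a guarantee. A rough sketch (WithPinned is an illustrative helper name, not a framework API):

```csharp
using System;
using System.Runtime.InteropServices;

static class PinningExample
{
    // Pins a generic array and hands its base address to a callback.
    // GCHandleType.Pinned throws ArgumentException at runtime for
    // non-blittable T, so test this with your actual struct types.
    public static void WithPinned<T>(T[] array, Action<IntPtr> body) where T : struct
    {
        GCHandle handle = GCHandle.Alloc(array, GCHandleType.Pinned);
        try
        {
            body(handle.AddrOfPinnedObject());
        }
        finally
        {
            handle.Free(); // always unpin, even if the callback throws
        }
    }
}
```

The callback can then cast the IntPtr to a concrete pointer type inside its own unsafe block; the handle allocation has a cost, so this only pays off around a sizable chunk of work.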

  • 0

This is one place where knowing F# would help you :)

let doSomething<'T when 'T:unmanaged> (ptr:nativeint) =
    let s = NativePtr.ofNativeInt<'T>(ptr)
    // Do whatever you want!
    ()

Although, if you could describe in more detail what it is that you're trying to optimize and why unsafe code seems like a good idea to speed things up (it rarely is), we could give you better advice.


  • 0

This is one place where knowing F# would help you :)

let doSomething<'T when 'T:unmanaged> (ptr:nativeint) =
    let s = NativePtr.ofNativeInt<'T>(ptr)
    // Do whatever you want!
    ()
Although, if you could describe in more detail what it is that you're trying to optimize and why unsafe code seems like a good idea to speed things up (it rarely is), we could give you better advice.
I want to remove bounds checking for faster array manipulation.

T[] being a 1D representation of a matrix
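For reference, assuming row-major storage (the post doesn't say which convention is used), the flat indexing and a transpose look like this; Get and Transpose are illustrative helpers, not anything from the poster's code:

```csharp
static class FlatMatrix
{
    // Row-major: element (row, col) of a rows x cols matrix
    // lives at index row * cols + col in the flat array.
    public static double Get(double[] m, int cols, int row, int col)
    {
        return m[row * cols + col];
    }

    // Transposing a rows x cols matrix yields a cols x rows matrix;
    // element (r, c) moves to flat index c * rows + r.
    public static double[] Transpose(double[] m, int rows, int cols)
    {
        var t = new double[m.Length];
        for (int r = 0; r < rows; ++r)
            for (int c = 0; c < cols; ++c)
                t[c * rows + r] = m[r * cols + c];
        return t;
    }
}
```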


  • 0

I don't think you have much to gain simply by removing bounds checking. Pinning is not free, so the speedup at least has to exceed that cost. A bounds check is almost insignificant as an operation. In this sort of code, your memory layout and access patterns (what fits in a cache line, etc.) matter a lot more than saving one CPU instruction.

Also consider the new Vector types that compile to SIMD instructions in the next version of .NET.

If you need highly optimized numerical code one option is also to write those particular methods in native C++.


  • 0

I disagree; you can get significant gains from using pointers to avoid bounds checking. In the overwhelming majority of situations the JIT compiler will not remove it. Also, the 64-bit JIT provides fewer optimizations compared to the 32-bit version.


  • 0

I disagree; you can get significant gains from using pointers to avoid bounds checking. In the overwhelming majority of situations the JIT compiler will not remove it. Also, the 64-bit JIT provides fewer optimizations compared to the 32-bit version.

I would be curious to see a benchmark showing when this is the case. Simple test:

 

using System;
using System.Diagnostics;
using System.Linq;

namespace ConsoleApplication2 {

    class Program {

        unsafe static double SumArrayUnsafe(double* arr, int length) {
            double sum = 0;
            for (int i = 0; i < length; ++i) {
                sum += arr[i];
            }
            return sum;
        }

        static double SumArrayWithoutBoundsCheck(double[] arr) {
            double sum = 0;
            for (int i = 0; i < arr.Length; ++i) {
                sum += arr[i];
            }
            return sum;
        }


        static double SumArrayWithBoundsCheck(double[] arr) {
            double sum = 0;
            for (int i = 0; i < arr.Length; ++i) {
                if (i < arr.Length) {
                    sum += arr[i];
                }
            }
            return sum;
        }

        unsafe static void Main() {
            var rand = new Random();
            var arr = Enumerable.Range(0, 1000000).Select(_ => rand.NextDouble()).ToArray();
            SumArrayWithBoundsCheck(arr);
            SumArrayWithoutBoundsCheck(arr);
            fixed (double* pinnedArr = arr) {
                SumArrayUnsafe(pinnedArr, arr.Length);
            }

            var sw = Stopwatch.StartNew();
            for (int i = 0; i < 1000; ++i) {
                SumArrayWithBoundsCheck(arr);
            }
            var elapsed = sw.Elapsed;
            Console.WriteLine("With bounds check : {0:0.000} seconds", elapsed.TotalSeconds);

            sw = Stopwatch.StartNew();
            for (int i = 0; i < 1000; ++i) {
                SumArrayWithoutBoundsCheck(arr);
            }
            elapsed = sw.Elapsed;
            Console.WriteLine("Without bounds check : {0:0.000} seconds", elapsed.TotalSeconds);


            fixed (double* pinnedArr = arr) {
                sw = Stopwatch.StartNew();
                for (int i = 0; i < 1000; ++i) {
                    SumArrayUnsafe(pinnedArr, arr.Length);
                }
                elapsed = sw.Elapsed;
                Console.WriteLine("Unsafe code : {0:0.000} seconds", elapsed.TotalSeconds);
            }
        }
    }
}

Results on my machine on .NET 4.5 32-bit Release:

 

With bounds check : 0,784 seconds

Without bounds check : 0,777 seconds

Unsafe code : 0,776 seconds

It's barely a 1% difference. And yes the bounds check in "Without bounds check" is optimized away (the code relies on that), and the redundant check in "With bounds check" isn't optimized away, I looked at the assembly.


  • 0

I am actually compiling for x64 because I mostly work with double precision and it's harder to run out of memory there.

 

I don't think you have much to gain simply by removing bounds checking. Pinning is not free, so the speedup at least has to exceed that cost. A bounds check is almost insignificant as an operation. In this sort of code, your memory layout and access patterns (what fits in a cache line, etc.) matter a lot more than saving one CPU instruction.

Also consider the new Vector types that compile to SIMD instructions in the next version of .NET.

If you need highly optimized numerical code one option is also to write those particular methods in native C++.

Yes, pinning is not free, which brings me to a question: should I just allocate a chunk of memory via Marshal.AllocHGlobal? I am already doing this with a binary array.

 

Is there any free optimized .NET / C++ library capable of working with, say, a double[] array as a matrix (flip, transpose, etc.)?


  • 0

Yes, pinning is not free, which brings me to a question: should I just allocate a chunk of memory via Marshal.AllocHGlobal? I am already doing this with a binary array.

 

Is there any free optimized .NET / C++ library capable of working with, say, a double[] array as a matrix (flip, transpose, etc.)?

Math.NET Numerics is quite comprehensive and well optimized. It has a highly generic Matrix type that supports different storage strategies optimized for different scenarios (sparse, dense, diagonal, etc.).

 

For C++, Blaze looks pretty good. If you want the fastest, there's Intel MKL, but that's not free. Note that Math.NET Numerics can leverage Intel MKL, but the bundled license is only for personal use; if you redistribute, you'll have to buy your own license.

 

And yes, by allocating directly on the unmanaged heap you spare the GC from having to work around a large pinned array, so that could help as well. But again, given that bounds checking isn't a significant cost in general, I'm not sure it's really worth the trouble. You're giving up pretty much all type safety there.
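If you do go that route, the shape of the code is roughly this; note that Marshal.AllocHGlobal neither zeroes the memory nor frees it for you (FirstAfterFill is just an illustrative wrapper, not a real API):

```csharp
using System;
using System.Runtime.InteropServices;

static class UnmanagedBuffer
{
    // Allocates length doubles on the unmanaged heap, fills them, and
    // returns the first element. The block never moves, so no pinning
    // is needed, but it is uninitialized and invisible to the GC.
    public static unsafe double FirstAfterFill(int length, double value)
    {
        IntPtr block = Marshal.AllocHGlobal(length * sizeof(double));
        try
        {
            double* p = (double*)block;
            for (int i = 0; i < length; ++i)
                p[i] = value; // must initialize: AllocHGlobal does not zero memory
            return p[0];
        }
        finally
        {
            Marshal.FreeHGlobal(block); // manual lifetime management
        }
    }
}
```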


  • 0

I would be curious to see a benchmark showing when this is the case. [...] It's barely a 1% difference. And yes the bounds check in "Without bounds check" is optimized away (the code relies on that), and the redundant check in "With bounds check" isn't optimized away, I looked at the assembly.

 

Those scenarios are far too simple. Yes, it will remove the bounds check there, because it is easy to tell that it's impossible to go out of bounds with that code. I can't post sample code because it's proprietary, but when you have large arrays (> 50 million elements) and you have to process the data in chunks, you end up with code where the bounds cannot be inferred directly.

 

Another source of improvement using unsafe code is large bitmap manipulation.
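For the bitmap case, the usual pattern is Bitmap.LockBits plus pointer arithmetic over Scan0 instead of GetPixel/SetPixel. A sketch assuming 32bpp ARGB input (Invert is an illustrative example, and System.Drawing is Windows-oriented):

```csharp
using System.Drawing;
using System.Drawing.Imaging;

static class BitmapOps
{
    // Inverts the color channels of a 32bpp ARGB bitmap in place via
    // direct pointer access, avoiding per-pixel GetPixel/SetPixel calls.
    public static unsafe void Invert(Bitmap bmp)
    {
        var rect = new Rectangle(0, 0, bmp.Width, bmp.Height);
        BitmapData data = bmp.LockBits(rect, ImageLockMode.ReadWrite,
                                       PixelFormat.Format32bppArgb);
        try
        {
            for (int y = 0; y < data.Height; ++y)
            {
                // Stride can exceed Width * 4 and can be negative for
                // bottom-up bitmaps, so walk row by row from Scan0 rather
                // than treating the buffer as one contiguous array.
                byte* row = (byte*)data.Scan0 + y * data.Stride;
                for (int x = 0; x < data.Width; ++x)
                {
                    row[x * 4 + 0] = (byte)(255 - row[x * 4 + 0]); // B
                    row[x * 4 + 1] = (byte)(255 - row[x * 4 + 1]); // G
                    row[x * 4 + 2] = (byte)(255 - row[x * 4 + 2]); // R
                    // byte 3 is alpha, left untouched
                }
            }
        }
        finally
        {
            bmp.UnlockBits(data);
        }
    }
}
```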


  • 0

Those scenarios are far too simple. Yes, it will remove the bounds check there, because it is easy to tell that it's impossible to go out of bounds with that code. I can't post sample code because it's proprietary, but when you have large arrays (> 50 million elements) and you have to process the data in chunks, you end up with code where the bounds cannot be inferred directly.

I'm not arguing about whether bounds checks get removed; I'm arguing that bounds checks are an insignificant cost, and that's what my test shows. I'd like to see a scenario in which array bounds checking is a significant cost, one that's worth pinning and giving up type safety for.


  • 0

If the code is long-running and makes many reads/writes, you will see a benefit. The importance of that extra performance depends on what you are doing and what kind of program you are running, but having put in significant time benchmarking all the options (including allocating on the native heap), we determined that for us it was worth it.

 

It's a fairly unique situation, and that is the only part of the code base that uses unsafe code, but the point of my original post was simply that there are scenarios where you can get a measurable performance improvement.


  • 0

Is there any free optimized .NET / C++ library capable of working with say a double[] array as a matrix (flip, transpose, etc...)?

By the way, there will be a hardware-accelerated matrix type coming as standard in the next version of .NET: http://blogs.msdn.com/b/dotnet/archive/2014/11/05/using-system-numerics-vector-for-graphics-programming.aspx

 

:)
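As an illustration of what that buys for code like the summing benchmark above, here is a sketch with Vector<T> (available via the System.Numerics.Vectors package on .NET 4.5-era runtimes; SimdSum is an illustrative name, and the vector width depends on the hardware):

```csharp
using System.Numerics;

static class SimdSum
{
    // Sums an array in SIMD-width chunks, then finishes the tail scalar-wise.
    public static double Sum(double[] arr)
    {
        int width = Vector<double>.Count;       // e.g. 4 lanes with AVX
        var acc = Vector<double>.Zero;
        int i = 0;
        for (; i <= arr.Length - width; i += width)
            acc += new Vector<double>(arr, i);  // loads one SIMD-width chunk
        double sum = 0;
        for (int lane = 0; lane < width; ++lane)
            sum += acc[lane];                   // horizontal add across lanes
        for (; i < arr.Length; ++i)
            sum += arr[i];                      // scalar tail
        return sum;
    }
}
```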


This topic is now closed to further replies.