+Red King (Subscriber²) Posted October 30, 2014

Not allowed:

```csharp
fixed (T* ptr = &array[0])
```

Also not allowed:

```csharp
unsafe void DoSomething<T>(void* ptr) where T : struct
{
    T* s = (T*)ptr;
}
```

This is disallowed even though T is a struct, which makes it an artificial limitation AFAIK.

I have a generic function used by N classes and I want to make it unsafe to get more speed, followed by a few more functions just like it. I can write one function to take care of 3 scenarios, as three structs are 24 bytes and thus interchangeable. Any better solution though?

One more: if I am going to use unsafe everywhere, any reason I shouldn't just AllocHGlobal instead of using new T?
Lant Posted October 30, 2014

From this it looks like the restriction is because a struct may contain a reference to a managed type.
+Red King Posted October 30, 2014

> From this it looks like the restriction is because a struct may contain a reference to a managed type.

That makes sense. But alas, a rock and a hard place...
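For readers finding this thread later: C# 7.3 eventually added an `unmanaged` generic constraint that lifts exactly this restriction (it did not exist when this was posted). A minimal sketch, with hypothetical example names:

```csharp
using System;

static class Demo
{
    // `where T : unmanaged` guarantees T contains no managed references,
    // so taking a T* is legal (requires C# 7.3 or later).
    unsafe static T ReadFirst<T>(void* ptr) where T : unmanaged
    {
        T* s = (T*)ptr;
        return *s;
    }

    unsafe static void Main()
    {
        int[] data = { 42, 7 };
        fixed (int* p = data)
        {
            Console.WriteLine(ReadFirst<int>(p)); // prints 42
        }
    }
}
```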
Andre S. (Veteran) Posted October 31, 2014

This is one place where knowing F# would help you :)

```fsharp
let doSomething<'T when 'T : unmanaged> (ptr: nativeint) =
    let s = NativePtr.ofNativeInt<'T> ptr
    // Do whatever you want!
    ()
```

Although, if you could describe in more detail what it is that you're trying to optimize and why unsafe code seems like a good idea to speed things up (it rarely is), we could give you better advice.
+Red King Posted October 31, 2014

> Although, if you could describe in more detail what it is that you're trying to optimize and why unsafe code seems like a good idea to speed things up (it rarely is), we could give you better advice.

I want to remove bounds checking for faster array manipulation, T[] being a 1D representation of a matrix.
Andre S. (Veteran) Posted October 31, 2014

I don't think you have much to gain simply by removing bounds checking. Pinning is not free, so the speedup has at least to exceed the cost of doing that. Bounds checks are almost insignificant as an operation. In this sort of code your memory layout and access patterns (what fits in a cache line, etc.) matter a lot more than saving one CPU instruction.

Also consider the new Vector types that compile to SIMD instructions in the next version of .NET.

If you need highly optimized numerical code, one option is also to write those particular methods in native C++.
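As a rough sketch of what those Vector types look like in use (assuming the `System.Numerics.Vectors` API as it eventually shipped; it was still in preview at the time of this thread):

```csharp
using System;
using System.Numerics;

static class SimdSum
{
    // Sums an array using Vector<double>, which RyuJIT maps to SIMD
    // registers on hardware that supports it.
    static double Sum(double[] arr)
    {
        var acc = Vector<double>.Zero;
        int width = Vector<double>.Count;      // e.g. 4 doubles with AVX
        int i = 0;
        for (; i <= arr.Length - width; i += width)
            acc += new Vector<double>(arr, i); // loads `width` elements at once
        double sum = 0;
        for (int j = 0; j < width; ++j)        // horizontal add of the lanes
            sum += acc[j];
        for (; i < arr.Length; ++i)            // scalar tail
            sum += arr[i];
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(Sum(new double[] { 1, 2, 3, 4, 5 })); // prints 15
    }
}
```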
Lant Posted October 31, 2014

Are you sure that the compiler isn't optimising away the bounds checking? For some cases the bounds checking is eliminated (and this was from 2009, so hopefully the situation has improved since then).
notchinese Posted October 31, 2014

I disagree; you can get significant gains from using pointers to avoid bounds checking. In the overwhelming majority of situations the JIT compiler will not remove it. Also, the 64-bit JIT provides fewer optimizations compared to the 32-bit version.
Andre S. (Veteran) Posted October 31, 2014

> I disagree; you can get significant gains from using pointers to avoid bounds checking. In the overwhelming majority of situations the JIT compiler will not remove it. Also, the 64-bit JIT provides fewer optimizations compared to the 32-bit version.

I would be curious to see a benchmark showing when this is the case. Simple test:

```csharp
using System;
using System.Diagnostics;
using System.Linq;

namespace ConsoleApplication2
{
    class Program
    {
        unsafe static double SumArrayUnsafe(double* arr, int length)
        {
            double sum = 0;
            for (int i = 0; i < length; ++i)
            {
                sum += arr[i];
            }
            return sum;
        }

        static double SumArrayWithoutBoundsCheck(double[] arr)
        {
            double sum = 0;
            for (int i = 0; i < arr.Length; ++i)
            {
                sum += arr[i];
            }
            return sum;
        }

        static double SumArrayWithBoundsCheck(double[] arr)
        {
            double sum = 0;
            for (int i = 0; i < arr.Length; ++i)
            {
                if (i < arr.Length)
                {
                    sum += arr[i];
                }
            }
            return sum;
        }

        unsafe static void Main()
        {
            var rand = new Random();
            var arr = Enumerable.Range(0, 1000000).Select(_ => rand.NextDouble()).ToArray();

            // Warm up the JIT before timing.
            SumArrayWithBoundsCheck(arr);
            SumArrayWithoutBoundsCheck(arr);
            fixed (double* pinnedArr = arr)
            {
                SumArrayUnsafe(pinnedArr, arr.Length);
            }

            var sw = Stopwatch.StartNew();
            for (int i = 0; i < 1000; ++i)
            {
                SumArrayWithBoundsCheck(arr);
            }
            var elapsed = sw.Elapsed;
            Console.WriteLine("With bounds check    : {0:0.000} seconds", elapsed.TotalSeconds);

            sw = Stopwatch.StartNew();
            for (int i = 0; i < 1000; ++i)
            {
                SumArrayWithoutBoundsCheck(arr);
            }
            elapsed = sw.Elapsed;
            Console.WriteLine("Without bounds check : {0:0.000} seconds", elapsed.TotalSeconds);

            fixed (double* pinnedArr = arr)
            {
                sw = Stopwatch.StartNew();
                for (int i = 0; i < 1000; ++i)
                {
                    SumArrayUnsafe(pinnedArr, arr.Length);
                }
                elapsed = sw.Elapsed;
                Console.WriteLine("Unsafe code          : {0:0.000} seconds", elapsed.TotalSeconds);
            }
        }
    }
}
```

Results on my machine on .NET 4.5 32-bit Release:

With bounds check    : 0,784 seconds
Without bounds check : 0,777 seconds
Unsafe code          : 0,776 seconds

It's barely a 1% difference. And yes, the bounds check in "Without bounds check" is optimized away (the code relies on that), and the redundant check in "With bounds check" isn't optimized away; I looked at the assembly.
+Red King Posted October 31, 2014

I am actually compiling x64 because I'm mostly working with double precision and it's harder to OOM.

> I don't think you have much to gain simply by removing bounds checking. Pinning is not free, so the speedup has at least to exceed the cost of doing that. [...]

Yes, pinning is not free, which brings me to a question: should I just allocate a chunk of memory via Marshal.AllocHGlobal? I am already doing this with a binary array.

Is there any free optimized .NET / C++ library capable of working with, say, a double[] array as a matrix (flip, transpose, etc.)?
Andre S. (Veteran) Posted October 31, 2014

> Yes, pinning is not free, which brings me to a question: should I just allocate a chunk of memory via Marshal.AllocHGlobal? [...] Is there any free optimized .NET / C++ library capable of working with, say, a double[] array as a matrix (flip, transpose, etc.)?

Math.NET Numerics is quite comprehensive and well optimized. It has a highly generic Matrix type that supports different storage strategies optimized for different scenarios (sparse, dense, diagonal, etc.). For C++, Blaze looks pretty good. If you want the fastest, there is Intel MKL, but that's not free. Note that Math.NET Numerics can leverage Intel MKL, but the license is only for personal use; if you redistribute, you'll have to buy your own license.

And yes, by allocating directly on the unmanaged heap you spare the GC from having to work around a large pinned array, so that could help as well. But again, given that bounds checking isn't a significant cost in general, I'm not sure that it's really worth the trouble. You're giving up pretty much all type safety there.
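A minimal sketch of the unmanaged-heap approach under discussion, using the standard `Marshal.AllocHGlobal` / `FreeHGlobal` pattern; note that, unlike `new double[]`, this memory must be freed manually:

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeBufferDemo
{
    unsafe static void Main()
    {
        const int count = 4;
        // Allocate `count` doubles on the unmanaged heap; the GC never sees
        // or moves this memory, so no pinning is needed.
        IntPtr buffer = Marshal.AllocHGlobal(count * sizeof(double));
        try
        {
            double* p = (double*)buffer;
            for (int i = 0; i < count; ++i)
                p[i] = i * 1.5;

            double sum = 0;
            for (int i = 0; i < count; ++i)
                sum += p[i];

            Console.WriteLine(sum); // 0 + 1.5 + 3 + 4.5 = 9
        }
        finally
        {
            // Unlike a managed array, this must be freed explicitly.
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```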
notchinese Posted November 2, 2014

> I would be curious to see a benchmark showing when this is the case. Simple test: [...]

Those scenarios are far too simple. Yes, it will remove the bounds check there, because it is easy to tell that it's impossible to go beyond the bounds with that code. I can't post sample code because it's proprietary, but when you have large arrays (> 50 million) and you have to process the data in chunks, you get code where the bounds cannot be inferred directly. Another source of improvement using unsafe code is large bitmap manipulation.
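To illustrate the kind of pattern being described (a hypothetical sketch, not the proprietary code): when the loop bounds come from runtime chunk offsets rather than `arr.Length`, the JIT generally cannot prove the accesses are in range, so the safe version typically keeps a bounds check per element:

```csharp
using System;

static class ChunkDemo
{
    // Safe version: the JIT cannot relate `offset + count` to arr.Length,
    // so each arr[i] access typically keeps its bounds check.
    static double SumChunk(double[] arr, int offset, int count)
    {
        double sum = 0;
        for (int i = offset; i < offset + count; ++i)
            sum += arr[i];
        return sum;
    }

    // Unsafe version: no checks; the caller must guarantee the range is valid.
    unsafe static double SumChunkUnsafe(double* arr, int offset, int count)
    {
        double sum = 0;
        for (int i = offset; i < offset + count; ++i)
            sum += arr[i];
        return sum;
    }

    unsafe static void Main()
    {
        double[] arr = { 1, 2, 3, 4, 5, 6 };
        Console.WriteLine(SumChunk(arr, 2, 3)); // 3 + 4 + 5, prints 12
        fixed (double* p = arr)
        {
            Console.WriteLine(SumChunkUnsafe(p, 2, 3)); // prints 12
        }
    }
}
```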
Andre S. (Veteran) Posted November 2, 2014

> Those scenarios are far too simple. Yes, it will remove the bounds check there, because it is easy to tell that it's impossible to go beyond the bounds with that code. I can't post sample code because it's proprietary, but when you have large arrays (> 50 million) and you have to process the data in chunks, you get code where the bounds cannot be inferred directly.

I'm not arguing about whether bounds checks get removed; I'm arguing that bounds checks are an insignificant cost, and that's what the test I did shows. I'd like to see a scenario in which array bounds checking is a significant cost, one that's worth pinning and giving up type safety for.
notchinese Posted November 2, 2014

If the code is long-running and makes many reads/writes, you will see a benefit. The importance of that extra performance depends on what you are doing and what kind of program you are running, but having put in significant time benchmarking all the options (including allocating on the native heap), we determined that for us it was worth it. It's a fairly unique situation, and that is the only part of that code base that uses unsafe code, but the point of my original post was simply that there are scenarios where you can get a measurable performance improvement.
Andre S. (Veteran) Posted November 5, 2014

> Is there any free optimized .NET / C++ library capable of working with, say, a double[] array as a matrix (flip, transpose, etc.)?

By the way, there will be a hardware-accelerated matrix type coming as standard in the next version of .NET: http://blogs.msdn.com/b/dotnet/archive/2014/11/05/using-system-numerics-vector-for-graphics-programming.aspx :)