Step Into Referenced C++ Project from C# Project


Question

EDIT 2:

Latest problems and questions are most likely on the last page. This thread is now used for the C++ / C# questions I have (so as not to create a thread per question).

 

I have a C# (GUI) project which generates a fractal in the dumbest way possible, and I thought it would be fun to code the performance-critical section in C++, a language that I don't know.

 

So I added a blank C++ CLR Library project and managed to set it to 64-bit after a few failed attempts.

I referenced the C++ project from the C# portion.

 

Issue is I cannot step into it. I pause where I call the C++ class and tell it to Step Into and it doesn't.

What am I doing wrong here?

 

EDIT,

 

Some other questions,

 

C# byte is just an unsigned char in C++? (I am going to be dealing w/ BGR values)

If I want to run a function in a thread in C++, is it true that I need to pass a struct pointer containing all the values?

Edited by _Alexander

13 answers to this question


  • 0

Make sure you check "Unmanaged code debugging" in your C# project options.

 

C# byte is an unsigned byte, exactly like the C++ "unsigned char", indeed.

Keep in mind "int" is a 32-bit integer in C# (it's an alias for System.Int32) while "int" is platform-dependent in C++. Stick to cstdint types to avoid possible confusion there.
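To make the type mapping concrete, here is a minimal sketch using the cstdint types. The alias names CsByte and CsInt and the Bgr struct are hypothetical, chosen only for illustration; they are not part of any library.

```cpp
#include <cstdint>

// Fixed-width aliases that line up with the C# types discussed above.
using CsByte = std::uint8_t;  // C# byte (System.Byte), 8-bit unsigned
using CsInt  = std::int32_t;  // C# int  (System.Int32), 32-bit signed

// One pixel in BGR order (hypothetical helper type, for illustration only).
struct Bgr {
    CsByte b;
    CsByte g;
    CsByte r;
};

// Compile-time checks: the build fails if any size assumption is wrong.
static_assert(sizeof(CsByte) == 1, "byte must be one byte wide");
static_assert(sizeof(CsInt) == 4, "int must be four bytes wide");
static_assert(sizeof(Bgr) == 3, "BGR pixel must be tightly packed");
```

The static_assert lines cost nothing at run time, so a size mismatch between the two sides of the interop boundary shows up as a build error rather than as corrupted pixels.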

 

I'm not too familiar with standard threads in C++ seeing as it's a novelty, but from a quick glance at the documentation it looks like you pass a function and each of its arguments separately. Actually since you're doing C++/CLI you probably just want to stick with System::Threading::Thread.
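For reference, a rough sketch of what the standard C++11 version could look like in native code (not C++/CLI). The names fill_rows and run_in_bands are made up for this example, and the worker just writes a band number instead of computing a fractal. No struct pointer is needed: each argument is handed to std::thread separately.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical worker: marks rows [from, to) with the given value.
void fill_rows(std::vector<int>& rows, int from, int to, int value) {
    for (int y = from; y < to; ++y) rows[y] = value;
}

// Splits `height` rows into bands of `amount` rows, one thread per band.
std::vector<int> run_in_bands(int height, int amount) {
    assert(height >= 0 && amount > 0);
    std::vector<int> rows(height, 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < height; i += amount) {
        const int to = std::min(i + amount, height);
        // Arguments are passed one by one; std::ref is required to pass
        // a reference, because std::thread copies its arguments by default.
        workers.emplace_back(fill_rows, std::ref(rows), i, to, i / amount + 1);
    }
    // Every thread must be joined (or detached) before it is destroyed.
    for (std::thread& t : workers) t.join();
    return rows;
}
```

Each band touches a disjoint range of rows, so no locking is needed in this sketch.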

 

Also keep in mind there won't be significant performance advantages to doing numeric code in C++ unless you compile that specific code with /clr disabled and all optimisations on. Actually C++ tends to be slower than C# in debug builds.

  • 0

Actually, turns out, because I didn't have a body defined for the function, it just glossed over it.

I defined body and got it to go in it.

 

Anyway, I do not like C++/CLI anymore. Just look at how nasty it is (this is the only way I managed to start a thread)

System::Collections::Generic::List<Thread^>^ list = gcnew System::Collections::Generic::List<Thread^>(LogicalProcessors());
for (int i = 0; i < HEIGHT; i += amount)
{
    int to = System::Math::Min(i + amount, HEIGHT);
    ParameterizedThreadStart^ st = gcnew ParameterizedThreadStart(this, &Fractal::Parallel);
    Thread^ th = gcnew System::Threading::Thread(st);
    array<Int32>^ value2 = {i, to, bitmapData->Stride, cCount};
    th->Start(value2);
    list->Add(th);
}

and it is slightly slower.

 

And finally managed to get normal C++ to work,

extern "C"
{
    __declspec(dllexport) int Test(int a, int b)
    {
        return a + b;
    }
}

I will try making it give back a BGR byte array, and after that the C++ extensions from Intel and NVIDIA look interesting...

  • 0

Well if you understand the memory models of .NET and native C++ perfectly, then C++/CLI is no more difficult than combining both all the time :P That is to say it's a total cluster###### (4 different concepts of reference! *, &, ^, %). But I still like it for any kind of involved interop code; P/Invoke gets a lot messier a lot faster IMO.

  • 0

So I hacked together the normal C++ version, here is the performance difference (when debugging),

 

Safe = Managed C#

Unsafe = C++ library

Safe V. Unsafe
6255.649 V. 296.4006
6318.052 V. 405.6007
6318.0506 V. 280.7993
6520.8494 V. 343.2054
6458.4503 V. 405.6058
6255.6451 V. 265.2043
6427.2456 V. 327.6034
6318.0526 V. 421.202
6333.6444 V. 421.2078
7300.8531 V. 483.6044
UNSAFE WINS = 10
SAFE WINS = 0
SAFE COUNT = 64506.4921
UNSAFE COUNT = 3650.4337

Does anyone know how to marshal a byte* ptr from an extern function into another byte* ptr in C#?

  • 0

That's an unusually large difference; I suspect your C# version could be improved. Are you doing any byte-per-byte copies rather than using Array.Copy or Marshal.Copy, by any chance?

 

  Quote
Anyone know how do you Marshal a byte* ptr from an extern function into another byte *ptr in C#?

 

Can't you just use IntPtr?

  • 0

My suggestion would be to work in either managed(including C++/CLI) or unmanaged but not both, just to avoid all the hassle. But understandably, sometimes that can't be helped, especially since going for both the feature set of .Net and the speed of optimized machine code is always tempting. Something like this http://stackoverflow.com/questions/7679522/return-array-of-integers-from-cross-platform-dll might be what you're looking for?

  • 0
  On 09/08/2013 at 01:47, Salutary7 said:

My suggestion would be to work in either managed(including C++/CLI) or unmanaged but not both, just to avoid all the hassle. But understandably, sometimes that can't be helped, especially since going for both the feature set of .Net and the speed of optimized machine code is always tempting. Something like this http://stackoverflow.com/questions/7679522/return-array-of-integers-from-cross-platform-dll might be what you're looking for?

Actually I am just trying everything out.

 

Right now I am trying to figure the best way to generate an byte array in C++ and pass it to C#. Efficiently with no memory leaks.

 

In C++ I have this:

extern "C"
{
    __declspec(dllexport) unsigned char* Generate(
        int width, int height, int iterations, double cReal, double cImaginary, double minX, double maxX, double minY, double maxY,
        int cCount, int stride
    )
    {
        Fractal* f = new Fractal(width, height, iterations, cReal, cImaginary, minX, maxX, minY, maxY);
        unsigned char* ptr = f->Run(cCount, stride);
        f->~Fractal();
        return ptr;
    }
}

I have no idea whether explicitly calling the destructor is needed in the case of extern.

The destructor for the class only delete[]s the int* array (the iteration count for each coordinate), not the unsigned char* memory.

 

This gets passed to C# code

[DllImport(@"E:\SHARED\FractalViewer\x64\Debug\PureCPP.dll")]
private static extern IntPtr Generate( //byte* Generate(
    int width, int height, int iterations, double cReal, double cImaginary, double minX, double maxX, double minY, double maxY,
    int cCount, int stride
);

But I am not sure how to do the code below properly (a maximum-speed copy without an intermediate array, plus proper deallocation):

 
BitmapData bmData = image.LockBits(
    new System.Drawing.Rectangle(0, 0, image.Width, image.Height),
    ImageLockMode.WriteOnly,
    image.PixelFormat);
IntPtr ptr = Generate(WIDTH, HEIGHT, iterations, cReal, cImaginary, minX, maxX, minY, maxY, 3, bmData.Stride);
// The code below feels bad, is there a way to use Marshal.SomeFunction(..) to do this?
UInt64* rgb = (UInt64*)ptr.ToPointer();
UInt64* img = (UInt64*)bmData.Scan0.ToPointer();
int imgSizeIn64 = (bmData.Stride * HEIGHT) >> 3;
for (int index = 0; index < imgSizeIn64; index++)
{
    img[index] = rgb[index];
}
// Marshal.FreeHGlobal(ptr); // Invalid Access to Memory Region
// Marshal.FreeBSTR(ptr); // Stops Working
// Marshal.FreeCoTaskMem(ptr); // Access Violation
// What do Here?
image.UnlockBits(bmData);
return image;

 

I just ran this for a 9000 x 9000 surface, 26 times, sitting at 4GB =(

 

EDIT: This solution will probably get me punched in the nads:

private static extern void DeAlloc(IntPtr ptr);
  • 0
  On 08/08/2013 at 23:27, _Alexander said:

So I hacked together the normal C++ version, here is the performance difference (when debugging),

 

Safe = Managed C#

Unsafe = C++ library

Safe V. Unsafe
6255.649 V. 296.4006
6318.052 V. 405.6007
6318.0506 V. 280.7993
6520.8494 V. 343.2054
6458.4503 V. 405.6058
6255.6451 V. 265.2043
6427.2456 V. 327.6034
6318.0526 V. 421.202
6333.6444 V. 421.2078
7300.8531 V. 483.6044
UNSAFE WINS = 10
SAFE WINS = 0
SAFE COUNT = 64506.4921
UNSAFE COUNT = 3650.4337

Anyone know how do you Marshal a byte* ptr from an extern function into another byte *ptr in C#?

 

To accurately benchmark .NET code you must compile it in release mode, run it WITHOUT the debugger attached, and run the code before benchmarking it, so it will get optimized by the jitter.

  • 0
  On 09/08/2013 at 04:20, notchinese said:

To accurately benchmark .NET code you must compile it in release mode, run it WITHOUT the debugger attached, and run the code before benchmarking it, so it will get optimized by the jitter.

 

The results are not pretty,

9000 x 9000, 50 iterations, i5 3570k @ 4 GHz, 4 threads

Safe V. Unsafe
3223.0689 V. 3421.2571
3216.0616 V. 3420.2569
3222.0691 V. 3411.2463
3464.3002 V. 3419.2537
3246.0926 V. 3427.2612
3242.0882 V. 3433.2674
3447.2835 V. 3424.2587
3477.3106 V. 3426.2625
3221.0686 V. 3421.2555
3456.2926 V. 3422.2565
UNSAFE WINS = 4
SAFE WINS = 6
SAFE COUNT = 33215.6359
UNSAFE COUNT = 34226.5758

 

I guess I need to work on this hellish pile of what the hell,

inline int Fractal::GetIterationCount(double a, double b)
{
    int iterationCount = 0;
    double _a = a;
    double _b = b;

    double _aSq = a * a;
    double _bSq = b * b;
    for (int i = 0; i < iterations; i++)
    {
        _a = _aSq - _bSq;
        _b = 2 * a * b;
        _a += cReal;
        _b += cImaginary;
        _aSq = _a * _a;
        _bSq = _b * _b;
        if ((_aSq + _bSq) < thesholdSq)
        {
            iterationCount = i;
        }

        a = _a;
        b = _b;
    }

    return iterationCount;
}
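
One common way to tidy a loop like this is the usual escape-time formulation, which stops as soon as the point leaves the threshold circle instead of always running every iteration. The sketch below is a different formulation from the code above, not a drop-in replacement: it returns the first iteration at which the point escapes, and the free function escape_iterations with its explicit parameters is a hypothetical stand-in for the class members.

```cpp
#include <cassert>

// Returns the index of the first iteration at which |z|^2 >= thresholdSq,
// or maxIterations if the point never escapes. Iterates z -> z^2 + c with
// z = a + b*i and c = cReal + cImaginary*i.
int escape_iterations(double a, double b,
                      double cReal, double cImaginary,
                      int maxIterations, double thresholdSq) {
    assert(maxIterations >= 0);
    for (int i = 0; i < maxIterations; ++i) {
        const double aSq = a * a;
        const double bSq = b * b;
        if (aSq + bSq >= thresholdSq) {
            return i;  // early exit: no work is wasted on escaped points
        }
        const double nextA = aSq - bSq + cReal;
        b = 2.0 * a * b + cImaginary;  // uses the old a, before it is updated
        a = nextA;
    }
    return maxIterations;
}
```

For points that escape quickly, the early return skips most of the loop, which is typically where most of the time goes on large images.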
  • 0
extern "C"
{
    __declspec(dllexport) unsigned char* Generate(int width, int height, int iterations, double cReal, double cImaginary, double minX, double maxX, double minY, double maxY, int cCount, int stride)
    {
        Fractal* f = new Fractal(width, height, iterations, cReal, cImaginary, minX, maxX, minY, maxY);
        unsigned char* ptr = f->Run(cCount, stride);
        f->~Fractal();
        return ptr;
    }
}

In C++, use the stack as much as possible rather than new/delete. So your function becomes:

Fractal f(width, height, iterations, cReal, cImaginary, minX, maxX, minY, maxY);
return f.Run(cCount, stride);

No memory leaks.

Anything you allocate with new must be de-allocated with delete, so if you wanted to do it with new:

Fractal* f = new Fractal(width, height, iterations, cReal, cImaginary, minX, maxX, minY, maxY);
unsigned char* result = f->Run(cCount, stride);
delete f;
return result;

delete calls the destructor. In C++/CLI, you also have to deal with reference types implementing IDisposable, in which case the same applies except they're allocated with gcnew rather than new. In this case delete calls Dispose().

 

Your code seems weird because result points to an array allocated by the Fractal object, but you destroy this object before returning, therefore Fractal cannot free the array; someone else has to do it which makes the code brittle. In general when you're dealing with unmanaged resources, make sure the object responsible for allocating them is also responsible for de-allocating them. For instance here the caller could be responsible for pre-allocating the array and freeing it when it's done with the data.
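
A minimal sketch of that caller-allocates approach, assuming a hypothetical export named GenerateInto and a FRACTAL_API macro invented here for portability. The pixel values are placeholders, not a fractal; the point is that the function allocates nothing, so there is nothing for it to leak.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(_WIN32)
#define FRACTAL_API extern "C" __declspec(dllexport)
#else
#define FRACTAL_API extern "C"
#endif

// Hypothetical export: writes 24-bit BGR pixels into memory the caller owns.
// `stride` is the byte distance between rows and may exceed width * 3.
FRACTAL_API void GenerateInto(std::uint8_t* dst, int width, int height, int stride) {
    assert(dst != nullptr && stride >= width * 3);
    for (int y = 0; y < height; ++y) {
        std::uint8_t* row = dst + static_cast<std::ptrdiff_t>(y) * stride;
        for (int x = 0; x < width; ++x) {
            row[x * 3 + 0] = static_cast<std::uint8_t>(x);  // B (placeholder)
            row[x * 3 + 1] = static_cast<std::uint8_t>(y);  // G (placeholder)
            row[x * 3 + 2] = 0;                             // R (placeholder)
        }
        // Padding bytes between width * 3 and stride are left untouched.
    }
}
```

On the C# side the pointer would come from whatever buffer the caller owns, so allocation and release stay in one place.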

 

Or just use managed types instead.

 

  Quote
Right now I am trying to figure the best way to generate an byte array in C++ and pass it to C#. Efficiently with no memory leaks.

 

 

For an unmanaged array:

unsigned char* arr = new unsigned char[dim];
// pass to C# as IntPtr
delete[] arr;

For a managed array:

array<unsigned char>^ arr = gcnew array<unsigned char>(dim);
// pass to C# as array<unsigned char>^

And no need to delete as it's GCed. Try to rely on managed types as much as possible to avoid leaks. Managed arrays can be temporarily pinned to access them using pointers.

 

// The code below feels bad, is there a way to use Marshal.SomeFunction(..) to do this?
UInt64* rgb = (UInt64 *) ptr.ToPointer();
UInt64* img = (UInt64*)bmData.Scan0.ToPointer();
int imgSizeIn64 = (bmData.Stride * HEIGHT) >> 3;
for (int index = 0; index < imgSizeIn64; index++)
{
    img[index] = rgb[index];
}

For unmanaged to unmanaged, use memcpy - you'll have to P/Invoke it from C# (hey, the example on that page is exactly what you're trying to do!). For managed to unmanaged or vice-versa, use Marshal.Copy. For managed to managed, use Array.Copy. This is way faster than byte-per-byte copy. You can probably speed up your code by 2-4 times just by using a copy function rather than a loop like this.
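
If the copy is done on the C++ side, a sketch along these lines would handle the case where the two buffers have different strides. The helper copy_rows is hypothetical and assumes both buffers hold at least stride * height bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies `height` rows of `rowBytes` bytes each between two pixel buffers.
void copy_rows(unsigned char* dst, int dstStride,
               const unsigned char* src, int srcStride,
               int rowBytes, int height) {
    assert(rowBytes <= dstStride && rowBytes <= srcStride);
    if (dstStride == srcStride) {
        // Same layout: one bulk copy covers the whole image, padding included.
        std::memcpy(dst, src, static_cast<std::size_t>(srcStride) * height);
        return;
    }
    // Different layouts: copy the meaningful bytes of each row separately.
    for (int y = 0; y < height; ++y) {
        std::memcpy(dst + static_cast<std::ptrdiff_t>(y) * dstStride,
                    src + static_cast<std::ptrdiff_t>(y) * srcStride,
                    static_cast<std::size_t>(rowBytes));
    }
}
```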

 

// Marshal.FreeHGlobal(ptr); // Invalid Access to Memory Region
// Marshal.FreeBSTR(ptr); // Stops Working
// Marshal.FreeCoTaskMem(ptr); // Access Violation
// What do Here?

Assuming ptr was allocated using new, just use delete. Anything you allocate with new, call delete on it when you're done. Anything you allocate with new[], call delete[] on it when you're done.

 

Marshal.FreeHGlobal is for Marshal.AllocHGlobal, etc.
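
In practice that means the DLL that called new[] also has to export the matching delete[]. A sketch of such a pair follows: DeAlloc matches the name declared on the C# side earlier in the thread, while AllocPixels and the FRACTAL_API macro are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>

#if defined(_WIN32)
#define FRACTAL_API extern "C" __declspec(dllexport)
#else
#define FRACTAL_API extern "C"
#endif

// Allocates a zero-initialised buffer with new[]; the caller owns it until
// it hands the pointer back to DeAlloc.
FRACTAL_API unsigned char* AllocPixels(int byteCount) {
    assert(byteCount > 0);
    return new unsigned char[static_cast<std::size_t>(byteCount)]();
}

// Releases a buffer obtained from AllocPixels. It must run inside the same
// module that allocated the memory, which is why it is an export.
FRACTAL_API void DeAlloc(unsigned char* ptr) {
    delete[] ptr;  // delete[] on a null pointer is a harmless no-op
}
```

From C# the pointer stays an IntPtr the whole time and is only ever handed back to DeAlloc, never to a Marshal.Free* method.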

 

That said, your life would be a lot simpler if you used a managed array and Marshal.Copy'd into the bitmap. You don't really gain performance by using an unmanaged array (it's just a blob of memory in any case); the possible gain here is having the C++ optimiser go over your inner loops and do its magic.

  • 0

Ok memcpy works. Awesome. 

 

I use new on unsigned int* color and I delete[] it in the destructor.

And the "// pass to C# as IntPtr" after memcpy() should call back to the unmanaged C++ code to delete[] the ptr?

 

I do not understand this part well. I also do not understand why the bitmap's internal pointer to the RGB array can't just be modified to avoid memcpy altogether.

 

I am still struggling in three areas - C++ is as fast as managed C# and not faster, looking for a good coloring algorithm, and for some reason the image is flipped on the X axis.

Today, in its entirety, was dedicated to researching __m128d and __m256d, and I made a homebrew __m128d implementation. The homebrew version was futile performance-wise; I will search Google and Stack Overflow...

I also noticed other fractal implementations use single and not double precision. Will start working on float version...

One thing that I was struggling with was the fact that Stride for the bitmap was not width*3, I think was because width * height & 0xF was not zero.
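
A note on the stride observation: as far as I know, GDI+ rounds each scan line up to a multiple of four bytes, so the stride depends on the width and pixel format alone, not on width * height. A minimal sketch of that rounding, using a hypothetical helper named stride_for (not part of any library):

```cpp
#include <cassert>

// Bytes per scan line when each row is padded to a 4-byte boundary.
int stride_for(int width, int bitsPerPixel) {
    assert(width > 0 && bitsPerPixel > 0);
    // Round the row's bit count up to whole 32-bit units, then to bytes.
    return ((width * bitsPerPixel + 31) / 32) * 4;
}
```

With 24 bits per pixel, a width of 9000 gives 27000 bytes (exactly width * 3), while a width of 3 gives 12 bytes rather than 9, so any loop that assumes width * 3 drifts on widths that are not a multiple of four.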

  • 0
  On 11/08/2013 at 05:12, _Alexander said:

Ok memcpy works. Awesome. 

 

I use new on unsigned int* color and I delete[] it in the destructor.

And the "// pass to C# as IntPtr" after memcpy() should call back to the unmanaged C++ code to delete[] the ptr?

 

I do not understand well here.I also do not understand why the bitmap internal pointer to the rgb array can't just be modified to avoid memcpy altogether.

If you're creating an unmanaged array in C++ then you need to deallocate it in C++ when you're done with the data. This is certainly a messy approach; it'd be much better to create the buffer from C#, pass it to C++, have the C++ code fill it up and free the buffer in C# later. You could indeed just create a bitmap, lock the bits, pass that pointer to C++, and unlock the bits when C++ is done. That avoids any copy or manual memory management: just Dispose() the bitmap when you're done with it. That's how I'd do it in any case.

  Quote
I am still struggling in three areas - C++ is as fast as managed C# and not faster, looking for a good coloring algorithm, and for some reason the image is flipped on the X axis.
Today, in its entirety was dedicated to researching _m128d and _m256d and made a homebrew _m128d implementation. Homebrew futile performance wise - will seek google and stackoverflow...
I also noticed other fractal implementations use single and not double precision. Will start working on float version...
One thing that I was struggling with was the fact that Stride for the bitmap was not width*3, I think was because width * height & 0xF was not zero.

 

SSE intrinsics are definitely worth investigating for your use case, but the learning curve is steep. Make sure you're compiling with all optimisations on and /clr disabled for that specific function (put it in a separate .cpp so you can disable /clr just for that code). If you're not compiling as native code with all optimisations on, then writing it in C++ is useless. Keep in mind that anything above SSE2 has limited compatibility: AVX is only supported from Bulldozer (AMD) and Sandy Bridge (Intel) onward, both released in 2011, so AVX code will just crash on anything earlier. I'd generally stick with SSE2 for compatibility.
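
To give an idea of the shape of SSE2 code, here is a small sketch that advances two independent points by one z -> z^2 + c step at a time, one point per lane of a __m128d. The function names are hypothetical, the code assumes an x86/x64 target, and it leaves out the escape test and per-lane iteration counting that a real implementation needs.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2 intrinsics

// One z -> z^2 + c step for two points at once (real parts in a, imaginary in b).
inline void julia_step_sse2(__m128d& a, __m128d& b, __m128d cReal, __m128d cImag) {
    const __m128d aSq = _mm_mul_pd(a, a);
    const __m128d bSq = _mm_mul_pd(b, b);
    const __m128d ab  = _mm_mul_pd(a, b);
    a = _mm_add_pd(_mm_sub_pd(aSq, bSq), cReal);  // a^2 - b^2 + cReal
    b = _mm_add_pd(_mm_add_pd(ab, ab), cImag);    // 2ab + cImaginary
}

// Runs `steps` iterations on two points and stores the resulting coordinates.
void julia_steps_pair(const double a0[2], const double b0[2],
                      double cReal, double cImaginary, int steps,
                      double outA[2], double outB[2]) {
    assert(steps >= 0);
    __m128d a = _mm_loadu_pd(a0);  // unaligned loads: no alignment assumed
    __m128d b = _mm_loadu_pd(b0);
    const __m128d cr = _mm_set1_pd(cReal);
    const __m128d ci = _mm_set1_pd(cImaginary);
    for (int i = 0; i < steps; ++i) julia_step_sse2(a, b, cr, ci);
    _mm_storeu_pd(outA, a);
    _mm_storeu_pd(outB, b);
}
```

The awkward part in practice is that the two lanes usually escape at different iterations, so the loop needs a per-lane mask (for example from _mm_cmplt_pd) rather than a simple break.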

This topic is now closed to further replies.