[C,C++] Subtle obscure differences


41 replies to this topic

#1 +snaphat (Myles Landwehr)

snaphat (Myles Landwehr)

    Electrical & Computer Engineer

  • Tech Issues Solved: 29
  • Joined: 23-August 05
  • OS: Win/Lin/Bsd/Osx
  • Phone: dumb phone

Posted 22 March 2014 - 01:29

Sometimes when I am doing low-level development I run into interesting incompatibilities between C and C++. The type of thing you may have known at one time, but that you eventually forget unless you know the specification of both languages inside-out.

 

One such case happened to me today with code similar to the following:

// foo.h:
int RAM[10000];

// foo.c:
#include "foo.h"
int main() {
    return (unsigned long long)RAM;
}

// bar.c
#include "foo.h"
int bar() {
    return (unsigned long long)RAM;
}

This particular code builds as C, but not as C++, if you link both foo.c and bar.c into the same binary. Why? Because there is a subtle difference in how C and C++ treat uninitialized global variables (RAM[] in this case). In C, such a declaration is only a tentative definition, and many toolchains emit it as a common symbol, so the linker merges the copies from each object file into a single instance of the variable. (Strictly speaking this is a widespread extension rather than something the C standard guarantees, and GCC has defaulted to -fno-common, which turns it off, since version 10.) In C++ there is no such thing as a tentative definition or common linkage. If you define the same variable in two translation units, the two definitions conflict. So in the latter case you will see the following error:

/tmp/ccK4Aa4B.o:(.bss+0x0): multiple definition of `RAM'
/tmp/cc8u2dAO.o:(.bss+0x0): first defined here
collect2: error: ld returned 1 exit status

Of course, you can get around the limitation in C++ by doing the following instead:

// foo.h:
extern int RAM[10000]; //modified

// foo.c:
#include "foo.h"
int RAM[10000]; //added
int main() {
    return (unsigned long long)RAM;
}

// bar.c
#include "foo.h"
int bar() {
    return (unsigned long long)RAM;
}
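The reason the extern version works in both languages is that an extern line without an initializer is only a declaration, so it can be repeated as often as you like, while the definition appears exactly once. Here is a minimal single-file sketch of that split. The helper ram_elements() is a made-up name for illustration and is not part of the code above.

```c
/* What every file sees after including foo.h: declarations only. */
extern int RAM[10000];
extern int RAM[10000]; /* repeating a declaration is harmless */

/* What exactly one source file provides: the single definition. */
int RAM[10000];

/* Hypothetical helper, for illustration only: element count of RAM. */
unsigned long ram_elements(void)
{
    return sizeof RAM / sizeof RAM[0];
}
```

This compiles the same way as C and as C++, because nothing in it relies on tentative definitions being merged.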

The interesting part is some of the implications for C programmers. Suppose, for example, you borrowed a database implementation that relied on common linkage and you accidentally reused one of its global names in your own code. In this case, you would have a subtle, silent bug on your hands. The variable would be shared between your code and the database and you might never know! Here's another example that shows some interesting things that occur in these cases:

// baz.c:
#include <stdio.h>
int test; //4 byte declaration.
unsigned long long qux();

int main()
{
    printf("wrong size returned in baz: %zu\n", sizeof(test)); //Oops!
    qux();
    printf("set value in baz: %d\n", test);
    return test;
}


// qux.c:
#include <stdio.h>
unsigned long long test; //8 byte declaration.
unsigned long long test;//you can redeclare without error.


unsigned long long qux() {
    test = 0xFF;
    printf("correct size in qux: %zu\n", sizeof(test));
    printf("set value in qux: %llu\n", test);
    return test;
}

Output:

wrong size returned in baz: 4
correct size in qux: 8
set value in qux: 255
set value in baz: 255

There are a few interesting things to note: (1) the variable is declared twice in qux.c without error, (2) it is declared once more with a different-sized type in baz.c, and (3) it has actually been merged and is eight bytes large. Yet the wrong size is printed in baz.c and the correct size only in qux.c. So the merge is silent even with incompatible types. The final thing to note is that even if you enable verbose warnings in the compiler, you still won't see this, because the compiler only ever looks at one source file at a time.
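Point (1) can be seen in isolation. The following is a small sketch, assuming a C (not C++) compiler, of several tentative definitions of one object in a single file, followed by one real definition. The helper names test_size() and test_value() are made up for this illustration.

```c
#include <stddef.h>

/* Tentative definitions: no initializer and no storage-class specifier.
   C allows any number of these for the same object in one file. */
unsigned long long test;
unsigned long long test;

/* At most one definition with an initializer may appear. The tentative
   definitions above then refer to this same object. */
unsigned long long test = 0xFF;

/* Hypothetical helpers, for illustration only. */
size_t test_size(void)
{
    return sizeof test;
}

unsigned long long test_value(void)
{
    return test;
}
```

Within one file the types have to agree, so the compiler would reject an `int test;` next to these. The size mismatch in the baz.c/qux.c example only slips through because the two declarations live in different files.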




#2 Aheer.R.S.

Aheer.R.S.

    I cannot Teach Him, the Boy has no Patience!

  • Tech Issues Solved: 9
  • Joined: 15-October 10
  • Location: Wolverhampton, West Midlands

Posted 22 March 2014 - 01:31

^ I have no idea what you are talking about, and I'm mostly bored

 

BUT

 

You are worthy of my respect and praise



#3 Praetor

Praetor

    ASCii / ANSi Designer

  • Tech Issues Solved: 7
  • Joined: 05-June 02
  • Location: Lisbon
  • OS: Windows Eight dot One dot One 1!one

Posted 22 March 2014 - 01:39

^ I have no idea what you are talking about, and I'm mostly bored

 

BUT

 

You are worthy of my respect and praise

:laugh:

That was Aheer.R.S.'s way of saying:

cool-story-bro.jpg


Edited by Praetor, 22 March 2014 - 01:39.


#4 OP +snaphat (Myles Landwehr)

Posted 22 March 2014 - 01:39

^ I could have titled the thread obscure quirks between C/C++. Basically, I'm just highlighting differences in how C and C++ handle global variables in certain cases and some potentially screwy behaviors that result from C's way of doing it.



#5 Andre S.

Andre S.

    Asik

  • Tech Issues Solved: 14
  • Joined: 26-October 05

Posted 22 March 2014 - 01:40

Thanks for sharing, you reinforce my resolve never to program in C++ ever again. :)

I decided a while ago that all the time I spent learning the intricacies and incoherences of this underspecified amateurish patchwork of ideas was better spent writing actually useful code in a language that generally makes sense (which turns out to be most languages out there except for C++ and a few others). I care about performance but my own sanity comes first.

 

That said if you happen to work with a platform where only C++ makes sense (and I know there are many), then you have all my sympathy.



#6 Aheer.R.S.

Posted 22 March 2014 - 01:41

I'm sorry dude, the language of programming is beyond my knowledge field, I just wished to display my awe in a non offensive manner :)



#7 OP +snaphat (Myles Landwehr)

Posted 22 March 2014 - 01:46

Thanks for sharing, you reinforce my resolve never to program in C++ ever again. :)

I decided a while ago that all the time I spent learning the intricacies and incoherences of this underspecified amateurish patchwork of ideas was better spent writing actually useful code in a language that generally makes sense (which turns out to be most languages out there except for C++ and a few others). I care about performance but my own sanity comes first.

 

That said if you happen to work with a platform where only C++ makes sense (and I know there are many), then you have all my sympathy.

Well, to be fair, I'm working on a platform where only C works normally... so even worse. But for the moment I'm jumping to C++ for the STL (and only the STL, no actual OOP). Hence me (re-)finding the above nonsensical differences :)

 

Sadly, I wanted the nonsensical C behavior in this case because I wanted to save a few lines of code (i.e. not having to declare externs and such)  :laugh:



#8 Praetor

Posted 22 March 2014 - 01:55

^ I could have titled the thread obscure quirks between C/C++. Basically, I'm just highlighting differences in how C and C++ handle global variables in certain cases and some potentially screwy behaviors that result from C's way of doing it.

A6jSmoN.jpg



#9 Aheer.R.S.

Posted 22 March 2014 - 01:57

A6jSmoN.jpg

lol that's mean

 

Big bully, I'm telling... :p



#10 Praetor

Posted 22 March 2014 - 02:04

lol that's mean

 

Big bully, I'm telling... :p

 

lol I'm just being funny, not trying to be offensive, and I'm pretty sure that Myles would understand that. In fact, most of the time when I'm talking to my wife about technology or some very interesting topic, like DNS zones that were really messed up or why some exquisite update brought a particular server to its knees, she's the one in the picture saying that to me. So yeah, I understand that.



#11 OP +snaphat (Myles Landwehr)

Posted 22 March 2014 - 03:51

^ I'll let you guys into a little off topic secret, my middle name is Myles. I couldn't fit my full name with the nick so I chopped off my first name, Aaron.  ;)



#12 firey

firey

    F͎̗͉͎͈͑͡ȉ͎̣̐́ṙ͖̺͕͙̓̌è̤̞͉̟̲͇̍̍̾̓ͥͅy͓̍̎̌̏̒

  • Tech Issues Solved: 8
  • Joined: 30-October 05
  • Location: Alberta, Canada
  • OS: Windows 7
  • Phone: Android (4.4.2)

Posted 22 March 2014 - 15:28

^ I'll let you guys into a little off topic secret, my middle name is Myles. I couldn't fit my full name with the nick so I chopped off my first name, Aaron.  ;)

 

okay Myles.



#13 simplezz

simplezz

    Neowinian Senior

  • Tech Issues Solved: 8
  • Joined: 01-February 12

Posted 22 March 2014 - 15:57

Sometimes when I am doing low-level development I run into interesting incompatibilities between C and C++.

Not to be pedantic, but C and C++ aren't low level. That distinction is reserved for assembler. At best they could be labelled as medium/high. After all they both have the same constructs as almost every other high level language. The only thing setting them apart is memory management, which I don't see as low level because the details are hidden behind malloc and new.
 

The interesting part is some of the implications for C programmers. Suppose for example, you borrowed a database implementation that employed common linkage & you just happened to accidentally clobber over the name in your own code. In this case, you would have a subtle silent bug on your hand. The variable would be shared between your code and the database and you might never know!

That's what static is for:
#define PRIVATE static
PRIVATE int module_level_global;
To restrict global variables to module level.

Globals are generally a bad idea to begin with except in specific cases. It's almost always better to have a module encapsulate it and provide functions which manipulate it.
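As a rough sketch of that idea, a module can keep its variable static and expose only functions that touch it. The file name counter.c and the counter_* function names below are made up for illustration.

```c
/* counter.c (hypothetical module) */

/* Internal linkage: other source files cannot see or clash with this
   name, so it can never be merged with someone else's global. */
static int module_level_global;

/* The only way other files can touch the value is through these. */
void counter_reset(void)
{
    module_level_global = 0;
}

int counter_increment(void)
{
    return ++module_level_global;
}

int counter_get(void)
{
    return module_level_global;
}
```

Another file that happens to define its own module_level_global gets a completely separate object, so the silent sharing described in the first post cannot happen with this layout.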

#14 Andre S.

Posted 22 March 2014 - 19:02

Not to be pedantic, but C and C++ aren't low level. That distinction is reserved for assembler. 

The creator of the C language disagrees with you:
 

C is a relatively "low-level" language. This characterization is not pejorative; it simply means that C deals with the same sort of objects that most computers do, namely characters, numbers, and addresses. These may be combined and moved about with the arithmetic and logical operators implemented by real machines.

C provides no operations to deal directly with composite objects such as character strings, sets, lists or arrays. There are no operations that manipulate an entire array or string, although structures may be copied as a unit. The language does not define any storage allocation facility other than static definition and the stack discipline provided by the local variables of functions; there is no heap or garbage collection. Finally, C itself provides no input/output facilities; there are no READ or WRITE statements, and no built-in file access methods. All of these higher-level mechanisms must be provided by explicitly called functions. Most C implementations have included a reasonably standard collection of such functions.

Similarly, C offers only straightforward, single-thread control flow: tests, loops, grouping, and subprograms, but not multiprogramming, parallel operations, synchronization, or coroutines.
 

http://net.pku.edu.c...ng_Language.pdf

 



#15 simplezz

Posted 22 March 2014 - 20:00

The creator of the C language disagrees with you:

He didn't explicitly state it was low level. When he says 'relatively', he means compared to very high level languages:

C is a general-purpose programming language which features economy of expression, modern control flow and data structures, and a rich set of operators. C is not a "very high level" language, nor a "big" one, and is not specialized to any particular area of application.


Very high level programming languages are often domain specific, something C certainly isn't.