Why is a C++ Hello World binary larger than the equivalent C binary?

In his FAQ, Bjarne Stroustrup says that when compiled with gcc -O2, the file sizes of a hello world written in C and one written in C++ are identical.

Reference: http://www.stroustrup.com/bs_faq.html#Hello-world

I decided to try this. Here is the C version:

    #include <stdio.h>

    int main(int argc, char* argv[])
    {
        printf("Hello world!\n");
        return 0;
    }

And here is the C++ version:

    #include <iostream>

    int main(int argc, char* argv[])
    {
        std::cout << "Hello world!\n";
        return 0;
    }

Here I compile, and the sizes are different:

    r00t@wutdo:~/hello$ ls
    hello.c  hello.cpp
    r00t@wutdo:~/hello$ gcc -O2 hello.c -o c.out
    r00t@wutdo:~/hello$ g++ -O2 hello.cpp -o cpp.out
    r00t@wutdo:~/hello$ ls -l
    total 32
    -rwxr-xr-x 1 r00t r00t 8559 Sep  1 18:00 c.out
    -rwxr-xr-x 1 r00t r00t 8938 Sep  1 18:01 cpp.out
    -rw-r--r-- 1 r00t r00t   95 Sep  1 17:59 hello.c
    -rw-r--r-- 1 r00t r00t  117 Sep  1 17:59 hello.cpp
    r00t@wutdo:~/hello$ size c.out cpp.out
       text    data     bss     dec     hex filename
       1191     560       8    1759     6df c.out
       1865     608     280    2753     ac1 cpp.out

I replaced std::endl with \n and it made the binary smaller. I figured something this simple would be inlined, and am disappointed it's not.

Also wow, the optimized builds have hundreds of lines of assembly output? I can write hello world with like 5 assembly instructions using sys_write, so what's up with all the extra stuff? Why does C put so much extra on the stack for setup? I mean, like 50 bytes of assembly vs. 8 KB of C, why?

Answer1:

You're looking at a mix of information that's easily misinterpreted. The 8559 and 8938 byte file sizes are largely meaningless since they're mostly headers with symbol names and other misc information for at least minimal debugging purposes. The somewhat meaningful numbers are the size(1) output you added later:

    r00t@wutdo:~/hello$ size c.out cpp.out
       text    data     bss     dec     hex filename
       1191     560       8    1759     6df c.out
       1865     608     280    2753     ac1 cpp.out

You could get a more detailed breakdown by using the -A option to size, but in short, the differences here are fairly trivial.

What's more interesting is that Bjarne Stroustrup never mentioned whether he was talking about static or dynamic linking. In your case, both programs are dynamically linked, so the size differences have nothing to do with the actual size cost of stdio or iostream; you're just measuring the cost of the calling code, or (more likely, based on the other comments/answer) the base overhead of exception-handling support for C++.

Now, there is a common claim that a statically linked C++ iostream-based hello world can be even smaller than a printf-based one, since the compiler can see exactly which overloaded versions of operator<< are used and optimize out unneeded code (such as expensive floating-point printing), whereas printf's use of format strings makes this difficult in the common case and impossible in general. However, I've never seen a C++ implementation where a statically linked iostream-based hello program came anywhere close to being as small as, much less smaller than, a printf-based one in C.

Answer2:

I think he's treating the half kilobyte as a rounding error. Both are "9 kilobytes" and that's what you'll see in a typical file browser. They aren't exactly the same because, under the hood, the C and C++ libraries are quite different. If you're already familiar with your disassembler, you can see the details of the difference for yourself.

The "extra stuff" is for the sake of importing symbols from the standard library's shared library, and for handling C++ exceptions. Strangely enough, much of the GCC-compiled C executable is taken up by C++ exception handling tables. I've not figured out how to strip them using GCC.

endl is inlined, but it contains calls to print the \n character and flush the stream, which are not inlined. The difference in size is due to importing those from the standard library.

In truth, individual kilobytes seldom matter on any system with dynamically loaded libraries. Self-contained code, such as on an embedded system, would need to include the standard library functionality it uses, and the C++ standard library tends to be heavier than its C counterpart, <iostream> vs. <stdio.h> in particular.
