We're always interested in getting feedback. E-mail us if you like this guide, if you think that important material is omitted, if you encounter errors in the code examples or in the documentation, if you find any typos, or generally just if you feel like e-mailing. Mail to Frank Brokken or use an e-mail form. Please state the concerned document version, found in the title. If you're interested in a printable PostScript copy, pick up your own copy in zip-format by ftp from ftp.icce.rug.nl/pub/http.
This document presents an introduction to programming in C++. It is a
guide for C/C++ programming courses, that Frank gives yearly at the
University of Groningen. As such, this document is not a complete
C/C++ handbook, but rather serves as an addition to other
documentation sources (e.g., the Dutch book De programmeertaal C,
Brokken and Kubat, University of Groningen 1996,
or the Microsoft C/C++ tutorial).
The reader should realize that extensive knowledge of the C programming
language is assumed and required. This document continues where topics of the
C programming language end, such as pointers, memory allocation and
compound types.
The version number of this document (currently 4.3.1a) is updated when the
contents of the document change. The first number is the major number,
and will probably not be changed for some time: it indicates a major
rewriting. The middle number is increased when new information is added to the
document. The last number only indicates small changes; it is increased when,
e.g., typos are corrected.
This document is published by the ICCE, University of Groningen, the
Netherlands. This document was typeset using the yodl formatting system.
All rights reserved. No part of this document may be published or
changed without prior consent of the authors. Direct all correspondence
concerning suggestions, additions, improvements or changes in this
document to the first author:
In this chapter a first impression of C++ is presented. A few extensions
to C are reviewed and a tip of the mysterious veil surrounding object
oriented programming (OOP) is lifted.
The original version of the guide was written by Frank and Karel in
Dutch and in LaTeX format. After some time, Karel Kubat rewrote the text and
converted the guide to a more suitable format and (of course) to English in
September 1994.
The first version of the guide appeared on the net in October 1994. By then it
had been converted to SGML.
In time several chapters were added, and the contents were modified
thanks to countless readers who sent us their comments, which enabled us to
correct some typos and improve unclear parts.
The transition from major version three to major version four was realized by
Frank: again new chapters were added, and the source document was converted
from SGML to Yodl.
The C++ Annotations are not freely distributable. Be sure to read the legal notes.
Reading the annotations beyond this point implies that you are aware of the restrictions that we pose and that you agree with them.
If you like this document, tell your friends about it. Even better, let us
know by sending email to Frank: frank@icce.rug.nl.
Major version 4 represents a major rewrite of the previous
version 3.4.14: The document was rewritten from SGML to
Yodl, and many
new sections were added. All sections got a tune-up. The distribution basis,
however, hasn't changed: see the introduction.
The upgrade from version 4.1.* to 4.2.* was the result of the inclusion of
section 3.1.3 about the bool data type in chapter 3. The distinction between
differences between C and C++ and extensions of the C programming language is
(albeit a bit fuzzy) reflected in the introduction chapter and the chapter on
first impressions of C++: the introduction chapter covers some differences
between C and C++, whereas the chapter about first impressions of C++ covers
some extensions of the C programming language as found in C++.
The decision to upgrade from version 4.2.* to 4.3.* was made after realizing
that the lexical scanner function yylex() can be defined in the scanner class
that is derived from yyFlexLexer. Under this approach the yylex() function can
access the members of the class derived from yyFlexLexer as well as the public
and protected members of yyFlexLexer. The result of all this is a clean
implementation of the rules defined in the flex++ specification file. See
section 15.4.1 for details.
Version 4.3.1a is a precursor of 4.3.2. In 4.3.1a most of the typos I've
received since the last update have been processed. In version 4.3.2 the
following modifications will be incorporated as well:
&-operator
(this->*pointer)(...) construction inside member functions of the
class in which the pointer to member functions is defined.
C++ was originally a `pre-compiler', similar to the preprocessor of
C, which converted special constructions in its source code to plain
C. This code was then compiled by a normal C compiler. The
`pre-code', which was read by the C++ pre-compiler, was usually located
in a file with the extension .cc, .C or .cpp. This file would then be
converted to a C source file with the extension .c, which was compiled and
linked.
The nomenclature of C++ source files remains: the extensions .cc and .cpp
are usually still used. However, the preliminary work of a C++
pre-compiler is in modern compilers usually included in the actual compilation
process. Often compilers will determine the type of a source file by the
extension. This holds true for Borland's and Microsoft's C++ compilers,
which assume a C++ source for an extension .cpp. The GNU compiler gcc, which
is available on many Unix platforms, assumes C++ for the extension .cc (and
accepts .cpp and .C as well).
The fact that C++ used to be compiled into C code is also visible
from the fact that C++ is a superset of C: C++ offers all
possibilities of C, and more. This makes the transition from C to
C++ quite easy. Programmers who are familiar with C may start
`programming in C++' by using source files with an extension .cc or .cpp
instead of .c, and can then just comfortably slide into all the
possibilities that C++ offers. No abrupt change of habits is required.
.cc and run it through a C++ compiler:
In C, sizeof('c') equals sizeof(int), 'c' being any ASCII character. The
underlying philosophy is probably that chars, when passed as arguments to
functions, are passed as integers anyway. Furthermore, the C compiler handles
a character constant like 'c' as an integer constant. Hence, in C, the
function calls

    putchar(10);

and

    putchar('\n');

are synonyms.
In contrast, in C++, sizeof('c') is always 1 (but see also section
3.1.4), while an int is still an int. As we shall see later (see
section 2.5.9), the two function calls

    somefunc(10);

and

    somefunc('\n');

are handled by quite separate functions: C++ discriminates functions by
their arguments, which are different in these two calls: one function
requires an int while the other one requires a char.
The declaration

    extern void func();

means in C that a function func() exists, which returns no value. However,
in C, the declaration doesn't specify which arguments (if any) the function
takes. In contrast, such a declaration in C++ means that the function func()
takes no arguments at all.
Which of these allegations are true? In our opinion, C++ is a little
overrated; in general this holds true for object-oriented programming (OOP)
as a whole. The enthusiasm around C++ somewhat resembles the
former allegations about Artificial-Intelligence (AI) languages like Lisp and
Prolog: these languages were supposed to solve the most difficult AI-problems
`almost without effort'. Obviously, overly promising stories about any
programming language must be exaggerated; in the end, each problem can be coded
in any programming language (even BASIC or assembly language).
The advantages or
disadvantages of a given programming language aren't in `what you can do with
them', but rather in `which tools the language offers to make the job easier'.
Concerning the above allegations of C++, we think that the following can
be concluded. The development of new programs while existing code is reused
can also be realized in C by, e.g., using function libraries: thus, handy
functions can be collected in a library and need not be re-invented with each
new program. Still, C++ offers its specific syntax possibilities for
code reuse, apart from function libraries (see chapter 11).
Creating and using new data types is also very well possible in C; e.g.,
by using structs, typedefs etc. From these types other types can be
derived, thus leading to structs containing structs and so on.
Memory management is in principle in C++ as easy or as difficult as in
C. This holds especially when dedicated C functions such as xmalloc() and
xrealloc() are used (these functions, which are often present in our
C programs, allocate memory or abort the program when the memory pool is
exhausted). In short, memory management in C or in
C++ can be coded `elegantly', `ugly' or anything in between --
this depends on the developer rather than on the language.
Concerning `bug proneness' we can say that C++ indeed uses stricter type
checking than C. However, most modern C compilers implement
`warning levels'; it is then the programmer's choice to disregard or heed a
generated warning. In C++ many of such warnings become fatal errors (the
compilation stops).
As far as `data hiding' is concerned, C does offer some tools. E.g.,
where possible, local or static variables can be used and special data
types such as structs can be manipulated by dedicated functions. Using
such techniques, data hiding can be realized even in C; though it needs
to be said that C++ offers special syntactical constructions. In
contrast, programmers who prefer to use a global variable int i for
each counter variable will quite likely not benefit from the concept of data
hiding, be it in C or C++.
Concluding, C++ in particular and OOP in general are not solutions to all
programming problems. C++, however, does offer some elegant syntactical
possibilities which are worthwhile investigating. At the same time, the level
of grammatical complexity of C++ has increased significantly compared to
C. In time we got used to this increased level of complexity, but the
transition was neither fast nor painless. With the annotations we hope to
help the reader make the transition from C to C++ by providing,
indeed, our annotations to what is found in some textbooks on C++. We
hope you like this document and may benefit from it: Good luck!
static).
In contrast, or maybe better: in addition to this,
an object-oriented approach identifies the keywords
in the problem. These keywords are then depicted in a diagram and arrows are
drawn between these keywords to define an internal hierarchy. The keywords
will be the objects in the implementation and the hierarchy defines the
relationship between these objects. The term object is used here to describe a
limited, well-defined structure, containing all information about some
entity: data types and functions to manipulate the data.
As an example of an object-oriented approach, an illustration follows:
The employees and owner of a car dealer and auto garage company are paid
as follows. First, mechanics who work in the garage are paid a certain sum
each month. Second, the owner of the company receives a fixed amount each
month. Third, there are car salesmen who work in the showroom and receive
their salary each month plus a bonus per sold car. Finally, the company
employs second-hand car purchasers who travel around; these employees
receive their monthly salary, a bonus per bought car, and a restitution of
their travel expenses.
When representing the above salary administration, the keywords could be
mechanics, owner, salesmen and purchasers. The properties of such units are: a
monthly salary, sometimes a bonus per purchase or sale, and sometimes
restitution of travel expenses. When analyzing the problem in this manner we
arrive at the following representation:
In the hierarchy of objects we would define the dependency between the
first two objects by letting the car salesmen be `derived' from
the owner and mechanics.
The hierarchy of the thus identified objects is further illustrated
in figure 1.
The overall process in the definition of a hierarchy such as the above starts
with the description of the most simple type. Subsequently more complex types
are derived, while each derivation adds a little functionality. From these
derived types, more complex types can be derived ad infinitum, until a
representation of the entire problem can be made.
In C++ each of the objects can be represented in a
class, containing the necessary functionality to do useful
things with the variables (called objects) of these classes. Usually, not all
of the functionality and not all of the properties of a class are
available to objects of other classes. As we will see, classes tend to
encapsulate their properties in such a way that they are not immediately
accessible from the outside world. Instead, dedicated functions are normally
used to reach or modify the properties of objects.
In C++ an end-of-line comment starts with // and ends with the
end-of-line marker. The standard C comment, delimited by /* and */
can still be used in C++:

    int main()
    {
        // this is end-of-line comment
        // one comment per line

        /*
            this is standard-C comment, over more than
            one line
        */
        return (0);
    }

The end-of-line comment was already implemented as an extension to C
in some C compilers, such as the Microsoft C Compiler V5.
0. In C, where pointers are concerned, NULL is often used. This difference is
purely stylistic, though one that is widely adopted. In C++ there's no need
anymore to use NULL. Indeed, according to the descriptions of the
pointer-returning operator new, 0 rather than NULL is returned when memory
allocation fails.
The program

    int main()
    {
        printf("Hello World\n");
        return (0);
    }

does often compile under C, though with a warning that printf() is
not a known function. Many C++ compilers will fail to produce code in
such a situation. (When GNU's g++ compiler encounters an unknown
function, it assumes that an `ordinary' C function is meant. It does complain,
however.) The error is of course the missing #include <stdio.h> directive.
A function prototype with an empty argument list, such as

    extern void func();

means in C that the argument list of the declared function is not
prototyped: the compiler will not be able to warn against improper argument
usage. When declaring a function in C which has no arguments, the keyword
void is used, as in:

    extern void func(void);

Because C++ maintains strict type checking, an empty argument list is
interpreted as the absence of any parameter. The keyword void can then be
left out. In C++ the above two declarations are equivalent.
C++ compilers predefine the symbol __cplusplus: it is as if each source file
were prefixed with the preprocessor directive #define __cplusplus.
We shall see examples of the usage of this symbol in the following sections.
As an example, the following code fragment declares a function xmalloc()
which is a C function:

    extern "C" void *xmalloc(unsigned size);

This declaration is analogous to a declaration in C, except that the
prototype is prefixed with extern "C".
A slightly different way to declare C functions is the following:

    extern "C"
    {
        .
        .   (declarations)
        .
    }

It is also possible to place preprocessor directives at the location of the
declarations. E.g., a C header file myheader.h which declares
C functions can be included in a C++ source file as follows:

    extern "C"
    {
    #   include <myheader.h>
    }

The methods presented above can be used without problem, but are not very
common. A more frequently used method to declare external C functions is
presented below.
The combination of the predefined symbol __cplusplus and of the
possibility to define extern "C" functions offers the ability to
create header files for both C and C++. Such a header file might,
e.g., declare a group of functions which are to be used in both C and
C++ programs.
The setup of such a header file is as follows:

    #ifdef __cplusplus
    extern "C"
    {
    #endif
        .
        .   (the declaration of C-functions occurs
        .   here, e.g.:)
        extern void *xmalloc(unsigned size);
        .
    #ifdef __cplusplus
    }
    #endif

Using this setup, a normal C header file is enclosed by extern "C" {
which occurs at the start of the file and by }, which
occurs at the end of the file. The #ifdef directives test for the type of
the compilation: C or C++. The `standard' header files, such as
stdio.h, are built in this manner and therefore usable for both C
and C++.
An extra addition which is often seen is the following. Usually it is
desirable to avoid multiple inclusions of the same header file. This can
easily be achieved by including an #ifndef directive in the header file.
An example of a file myheader.h would then be:

    #ifndef _MYHEADER_H_
    #define _MYHEADER_H_
        .
        .   (the declarations of the header file follow here,
        .   with #ifdef __cplusplus etc. directives)
        .
    #endif

When this file is scanned for the first time by the preprocessor, the
symbol _MYHEADER_H_ is not yet defined. The #ifndef condition
succeeds and all declarations are scanned. In addition, the symbol
_MYHEADER_H_ is defined.
When this file is scanned for a second time during the same compilation,
the symbol _MYHEADER_H_ is defined. All information between the
#ifndef and #endif directives is skipped.
The symbol name _MYHEADER_H_ serves in this context only for recognition
purposes. E.g., the name of the header file can be used for this purpose, in
capitals, with an underscore character instead of a dot.
There is more to be said about header files. In section 4.7 the
preferred organization of header files when C++ classes are used is
discussed.
Furthermore local variables can be defined in some statements, just prior to
their usage. A typical example is the for statement:

    #include <stdio.h>

    int main()
    {
        for (int i = 0; i < 20; i++)
            printf("%d\n", i);
        return (0);
    }

In this code fragment the variable i is created inside the for
statement. According to the ANSI-standard, the variable does not exist
prior to the for-statement and not beyond the for-statement.
With some compilers, the variable continues to exist after the execution of
the for-statement, but a warning like

    warning: name lookup of `i' changed for new ANSI `for' scoping
    using obsolete binding at `i'

will be issued when the variable is used outside of the for-loop. The
implication seems clear: define a variable just before the for-statement
if it's to be used beyond that statement, otherwise the variable can be
defined at the for-statement itself.
Defining local variables when they're needed requires a little getting used
to. However, eventually it tends to produce more readable code than defining
variables at the beginning of compound statements. We suggest the following
rules of thumb for defining local variables:
{,
for-statement, but also all situations where a variable is only needed, say,
half-way through the function.

    #include <stdio.h>

    void show(int val)
    {
        printf("Integer: %d\n", val);
    }

    void show(double val)
    {
        printf("Double: %lf\n", val);
    }

    void show(char *val)
    {
        printf("String: %s\n", val);
    }

    int main()
    {
        show(12);
        show(3.1415);
        show("Hello World\n!");
        return (0);
    }

In the above fragment three functions show() are defined, which only
differ in their argument lists: int, double and char *. The
functions have the same name. The definition of several functions with the
same name is called `function overloading'.
It is interesting that the way in which the C++ compiler implements
function overloading is quite simple. Although the functions share the same
name in the source text (in this example show()), the compiler --and
hence the linker-- use quite different names. The conversion of a name in the
source file to an internally used name is called `name mangling'. E.g., the
C++ compiler might convert the name void show(int) to the
internal name VshowI, while an analogous function with a char*
argument might be called VshowCP. The actual names which are internally
used depend on the compiler and are not relevant for the programmer, except
where these names show up in e.g., a listing of the contents of a library.
A few remarks concerning function overloading are:
In the example above, the functions show() are still somewhat related (they
print information to the screen).
However, it is also quite possible to define two functions
lookup(), one of which would find a name in a list while the other
would determine the video mode. In this case the two functions have
nothing in common except for their name. It would therefore be more
practical to use names which suggest the action; say, findname() and
getvidmode().
A statement such as

    printf("Hello World!\n");

holds no information concerning the return value of the function
printf(). (The return value is, by the way, an integer which
states the number of printed characters. This return value is practically
never inspected.) Two functions printf() which would only
differ in their return type could therefore not be distinguished by the
compiler.
Consider the statement

    show(0);

given the three functions show() above. The zero could be
interpreted here as a NULL pointer to a char, i.e., a
(char *)0, or as an integer with the value zero. C++ will
choose to call the function expecting an integer argument, which might not
be what one expects.
An example is shown below:

    #include <stdio.h>

    void showstring(char const *str = "Hello World!\n")
    {
        printf("%s", str);      // never pass str itself as the format
    }

    int main()
    {
        showstring("Here's an explicit argument.\n");

        showstring();           // in fact this says:
                                // showstring("Hello World!\n");
        return (0);
    }

The possibility to omit arguments in situations where default arguments are
defined is just a nice touch: the compiler will supply the missing argument
when not specified. The code of the program becomes by no means shorter or
more efficient.
Functions may be defined with more than one default argument:

    void two_ints(int a = 1, int b = 4)
    {
        .
        .
        .
    }

    int main()
    {
        two_ints();            // arguments: 1, 4
        two_ints(20);          // arguments: 20, 4
        two_ints(20, 5);       // arguments: 20, 5
        return (0);
    }

When the function two_ints() is called, the compiler supplies one or two
arguments when necessary. A statement like two_ints(,6) is however
not allowed: when arguments are omitted they must be on the right-hand side.
Default arguments must be known to the compiler when the code is generated
where the arguments may have to be supplied. Often this means that the default
arguments are present in a header file:

    // sample header file
    extern void two_ints(int a = 1, int b = 4);

    // code of function in, say, two.cc
    void two_ints(int a, int b)
    {
        .
        .
    }

Note that supplying the default arguments in the function definition instead
of in the header file would not be the correct approach.
The keyword typedef is still allowed in C++, but is no longer necessary when
it is used as a prefix in union, struct or enum definitions.
This is illustrated in the following example:

    struct somestruct
    {
        int     a;
        double  d;
        char    string[80];
    };

When a struct, union or other compound type is defined, the tag of
this type can be used as type name (this is somestruct in the above
example):

    somestruct what;

    what.d = 3.1415;

In C++, functions may be declared as part of a struct. This
is the first concrete example of the definition of an object: as was described
previously (see section 2.4), an object is a structure containing
all involved code and data.
A definition of a struct point is given in the code fragment below.
In this structure, two int data fields and one function draw() are
declared.

    struct point            // definition of a screen
    {                       // dot:
        int
            x,              // coordinates
            y;              // x/y
        void
            draw(void);     // drawing function
    };

A similar structure could be part of a painting program and could, e.g.,
represent a pixel in the drawing. Concerning this struct it should be
noted that:
The function draw() which occurs in the struct definition
is only a declaration. The actual code of the function, or in other
words the actions which the function should perform, is located
elsewhere: in the code section of the program, where all code is
collected. We will describe the actual definitions of functions inside
structs later (see section 3.2).
The size of the struct point is just two ints. Even
though a function is declared in the structure, its size is not affected
by this. The compiler implements this behavior by allowing the function
draw() to be known only in the context of a point.
The point structure could be used as follows:

    point                   // two points on
        a,                  // screen
        b;

    a.x = 0;                // define first dot
    a.y = 10;               // and draw it
    a.draw();

    b = a;                  // copy a to b
    b.y = 20;               // redefine y-coord
    b.draw();               // and draw it

The function which is part of the structure is selected in a similar manner in
which data fields are selected; i.e., using the field selector operator
(.). When pointers to structs are used, -> can be used.
The idea of this syntactical construction is that several types may contain
functions with the same name. E.g., a structure representing a circle might
contain three int values: two values for the coordinates of the center of
the circle and one value for the radius. Analogously to the point
structure, a function draw() could be declared which would draw the
circle.