Rebuild the World!

Acceptable Subsets of Plain C and C++

Contents:

Preamble
Restrictions for plain C
Restrictions for C++
For both languages
- #include"" vs. #include<>
- No GNU extensions

Preamble

Technical standards are usually made by committees. On the other hand, it is well known that committees, by their very nature, are unable to produce anything useful. In the very best case, committee-made things are useless, but far more often they are seriously harmful.

Speaking of programming languages in particular, the committees habitually issue a command to the whole world: from now on, the language that people know becomes totally different and everyone must agree.

It seems obvious that no one in the world can have the power to command the whole world. No one, ever. If governments let various “standard bodies”, such as ISO, do what they do, this only means the governments have gone far beyond the limits of the acceptable.

C99 and later “standards” explain languages which are totally different, and they have nothing to do with the C language; only C90 was more or less close to what the C language is. The same is true for the C++ language, but starting with the very first “standard”, issued in 1998. None of the so-called “C++ standards” ever had anything in common with what C++ really is.

Honest behavior would be to give these specifications different names. History has examples of such good behavior, such as the language named Scheme, even though everyone understands it is just another Lisp dialect. Even C++ itself is an example of more or less honest naming: although the name obviously suggests that it is the same C but better, it is still a different name, and Stroustrup never tried to convince the world that plain C is from now on obsolete and C++ should be used instead.

With this in mind, it becomes clear that what the standard committees do is, plainly speaking, fraud. Industry is conservative, and it is very hard to convince it to adopt a completely new language; so the committees use the names of well-known languages to endorse what they create.

BTW, even if we take these standards simply as specifications for new languages, they are horrible. This is because the committee members bear no responsibility for what they do. They don't even have to implement their own ideas; they only need to vote for them, and as a consequence a lot of other people will have no choice but to implement what the committee voted for.

We can't stop the committees, at least right now. We've got no power to scatter them. But there's one thing we can do right now: we can boycott everything the damn committees do. So, let's do at least what we can.

Restrictions for plain C

Before we start, let's recall one trivial thing we must always remember: C and C++ are two completely different programming languages.

No features from C99 and later "standards"

First of all, it is prohibited to use VLAs.

Once again, it is prohibited to use VLAs.

For those who didn't understand: it is prohibited to use VLAs.
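For a buffer whose size is only known at run time, one possible replacement for a VLA is plain malloc and free. The sketch below is only an illustration: the function is_permutation and its task are invented for this example, and the scratch array seen is exactly the kind of thing people tend to declare as a VLA.

```c
#include <stdlib.h>

/* Hypothetical example: returns 1 if v[0..n-1] is a permutation of
   0..n-1, 0 if it is not, and -1 if the scratch buffer can't be
   allocated. */
int is_permutation(const int *v, int n)
{
    char *seen;
    int i, res;

    if (n <= 0)
        return 0;
    /* a C99 VLA would be written as: char seen[n];
       here the buffer comes from the heap instead */
    seen = malloc(n);
    if (!seen)
        return -1;
    for (i = 0; i < n; i++)
        seen[i] = 0;
    res = 1;
    for (i = 0; i < n; i++) {
        if (v[i] < 0 || v[i] >= n || seen[v[i]]) {
            res = 0;    /* out of range or already met */
            break;
        }
        seen[v[i]] = 1;
    }
    free(seen);
    return res;
}
```

Unlike a VLA, this version can report an allocation failure to the caller instead of silently overflowing the stack.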

But, well, VLAs are not the only thing to be prohibited. As a rule, C99 is not C, and all the later crap such as C11 or C23 has nothing to do with the C language at all; if we write in C, then let's write in C.

There are no "line" comments in C, that is, comments that start with “//”. In plain C, only /* ... */ comments are allowed.

A variable or a type in plain C can only be defined (or even just declared) either in global space or at the very start of a block; no declarations and definitions may come after a statement. So, in particular, this is illegal:

  int f(int n)
  {
      int *a;
      a = malloc(n);
      int x;
      /* ... */
  }

This is because the variable x is defined after the statement (the one which contains malloc). No more declarations are allowed in the block once at least one statement is encountered. Well, remember we're discussing plain C now. In C++, declarations may be placed wherever you want, but plain C is not C++.
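A version of the fragment above that respects the rule might look as follows. The original does not show what f actually does, so the body here (filling the array with 1..n and summing it) is purely an assumption made to get something compilable; the points of interest are where the definitions sit and that a nested block may start with definitions of its own.

```c
#include <stdlib.h>

/* Hypothetical body: stores 1..n in a temporary array and returns the
   sum, or -1 if n is not positive or the allocation fails. */
int f(int n)
{
    int *a;     /* all definitions of the block come first... */
    int x;
    int i;

    if (n <= 0)                     /* ...statements start only here */
        return -1;
    a = malloc(n * sizeof(*a));
    if (!a)
        return -1;
    x = 0;
    for (i = 0; i < n; i++) {
        int v = i + 1;  /* fine: this is the start of a nested block */
        a[i] = v;
        x += a[i];
    }
    free(a);
    return x;
}
```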

Certainly, ugly things like complex numbers, wide chars and L-strings are not allowed. Complex numbers are too uncommon to be supported at the language level (as opposed to the library level); as for those “wide” chars and strings, please recall that all program source code must be kept ASCII only.

However, there are less obvious limitations as well. First of all, there is no bool type in plain C. Arithmetic zero stands for false, any arithmetic non-zero is true, and in case you need to explicitly specify boolean truth, use 1.
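As a small sketch of how this looks in practice, here is a predicate that returns int and a caller that uses the result directly as a condition. Both function names are made up for this illustration.

```c
/* Returns 1 if year is a leap year in the Gregorian calendar,
   0 otherwise; int plays the role of the boolean type. */
int is_leap_year(int year)
{
    if (year % 400 == 0)
        return 1;
    if (year % 100 == 0)
        return 0;
    return year % 4 == 0;   /* a comparison yields int 0 or 1 */
}

/* Counts leap years in the closed range [from, to]. */
int count_leap_years(int from, int to)
{
    int y, count;

    count = 0;
    for (y = from; y <= to; y++)
        if (is_leap_year(y))    /* any non-zero value means true */
            count++;
    return count;
}
```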

Please don't even think about using various “optimization hints” such as the restrict keyword or the likely and unlikely macros. Simply forget about them.

Also, plain C has neither designated initializers nor compound literals. So everything in the following code is wrong:

  struct mystr s1 = { .name = "John", .count = 5, .avg = 2.7 };
  s2 = (struct mystr) { .name = "John", .count = 5, .avg = 2.7 };

You may regret losing these, as they are convenient. The problem is that they come from “standards”.
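What remains available is the positional initializer and ordinary member-by-member assignment. The sketch below assumes a layout for struct mystr (the original never shows its definition), and the helper mystr_set is a hypothetical name introduced only to stand in for the compound literal.

```c
/* Assumed layout; the real struct mystr is not shown in the text */
struct mystr {
    const char *name;
    int count;
    double avg;
};

/* Positional initializer: values go in the order of the members */
struct mystr s1 = { "John", 5, 2.7 };

/* Hypothetical helper replacing the compound literal assignment */
void mystr_set(struct mystr *s, const char *name, int count, double avg)
{
    s->name = name;
    s->count = count;
    s->avg = avg;
}
```

With this helper, the second line of the wrong fragment becomes mystr_set(&s2, "John", 5, 2.7). The price is that the initializer now depends on the order of the members, so the struct definition should be kept close at hand.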

And one more thing: there are no inline functions in C.

Roughly speaking, we limit what we use in plain C to what is included in ANSI C, plus the long long type, which is not there but was supported by nearly all C compilers that existed at the time. This is not a strict rule: if you manage to find something “interesting” (weird) in the official text of ANSI C, this in no way means you may demand that we comply. The opposite is true, too: if something is not there, that by itself does not mean we can't use it, especially if we talk about the library and not the language itself.

Allowed subset of the standard library

If something is included in the standard C library, this doesn't automatically mean you should use it. The trend is that one day we'll have to create our own library to be used instead of the “standard” one, and the less of libc we use now, the less effort it will take to get rid of it.

Unfortunately, it is a bit hard to give a complete answer as to what is okay and what is not okay here; maybe such an answer will be given later. As of now, the following is definitely allowed:

- system call wrappers and their infrastructure such as constant definitions;
- library functions of the exec* family;
- the higher-level input/output functions declared in stdio.h, except for fread and fwrite which are senseless (use syscalls instead), and the gets function which must never be used for obvious security reasons;
- the errno variable;
- the functions malloc, free, getenv, setenv, unsetenv, exit and _exit from the stdlib.h header file;
- the string manipulation functions declared in the string.h header file;
- the math functions declared in the math.h header file and available with -lm.
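To illustrate the "use syscalls instead" remark about fwrite, here is a minimal sketch of a helper built directly on the write system call wrapper. The name write_all is invented for this example; the text does not prescribe any particular helper.

```c
#include <unistd.h>
#include <errno.h>

/* Hypothetical helper: writes exactly len bytes from buf to fd,
   continuing after short writes and retrying when interrupted by a
   signal. Returns 0 on success, -1 on error (errno is left as set
   by write). */
int write_all(int fd, const char *buf, int len)
{
    int done;
    int r;

    done = 0;
    while (done < len) {
        r = write(fd, buf + done, len - done);
        if (r == -1) {
            if (errno == EINTR)
                continue;       /* interrupted, simply try again */
            return -1;
        }
        done += r;              /* may be less than requested */
    }
    return 0;
}
```

The read side can be handled in the same manner, with a loop around read that stops on 0 (end of file) or -1 (error).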

On the other hand, all the following is definitely not allowed:

- any “extensions” such as GNU extensions;
- everything that depends on locales, such as functions from the ctype.h header file; we have to make an exception for the [fvs]printf function family here (and perhaps for the [fvs]scanf as well), and that's a problem, but perhaps our own version of these functions will not depend on locales;
- everything that ruins the possibility to build statically with glibc, such as getpwnam and company;
- everything that uses or depends on threads;
- functions invented specifically to support threads, usually named with the _r suffix, such as strtok_r.

For all features not included in either of these lists, the situation is open to discussion. For features that “formally” appear on both lists, such as the strtok_r function which is also “a function from string.h”, the list of prohibitions takes precedence.
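Since ctype.h is off limits, character classification has to be done by hand. One possible sketch is shown below; the function names are hypothetical, and the code assumes an ASCII execution character set, where the letters of each case are contiguous.

```c
/* Locale-independent replacements for a few ctype.h functions.
   Assumption: ASCII, so 'a'..'z' and 'A'..'Z' are contiguous. */

int ascii_isdigit(int c)
{
    return c >= '0' && c <= '9';
}

int ascii_isalpha(int c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* Converts a lowercase ASCII letter to uppercase and returns any
   other value unchanged. */
int ascii_toupper(int c)
{
    return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}
```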

Restrictions for C++

No C++ standard library

The so-called standard library of C++ must not be used in any form. Simply speaking, you must not include any header files that have no “.h” suffix and you must not use any names from that damn “namespace std”, neither with explicit std:: prefix, nor with the using namespace directive (as we'll see in the next section, the using directive is prohibited as such, because namespaces themselves are prohibited).

Compilers usually provide “legacy” headers such as iostream.h, vector.h and the like. These are prohibited, too.

It is, of course, strictly forbidden to include “plain C compatibility” headers such as cstdio, cstring and so on. As we already mentioned, some header files from the plain C standard library are not allowed, but many of them are allowed, and they must be included exactly as you would do it in plain C.

People often ask one (stupid) question: okay, but what should we use instead of the standard library? Sometimes it is possible to answer this question correctly. Instead of iostream, either plain C “high-level” input-output may be used (that is, the functions and types declared in stdio.h), or you can use i/o system calls such as open, close, read, write etc. directly. Instead of the (very stupid and ugly) string class, each particular software project may invent its own implementation.

It is very important to understand (and accept!) that no replacement for STL templates (containers and algorithms) is allowed, neither from other libraries nor written by yourself. Data structures are built exactly the same way as in plain C, manually, for every particular task. There is no such thing as “generic” data structures. Period.
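As an illustration of what "manually, for every particular task" may mean, here is a sketch of a singly linked list of integers in plain C. The task (keeping integers and summing them) and all the names are assumptions made for this example; another task would get its own item structure and its own few functions.

```c
#include <stdlib.h>

/* A list item made for this particular task only */
struct int_item {
    int value;
    struct int_item *next;
};

/* Adds value at the head of the list. Returns the new head, or 0 if
   malloc fails (the existing list is left untouched). */
struct int_item *int_list_push(struct int_item *head, int value)
{
    struct int_item *p;

    p = malloc(sizeof(*p));
    if (!p)
        return 0;
    p->value = value;
    p->next = head;
    return p;
}

/* Returns the sum of all values in the list (0 for an empty list). */
long int_list_sum(const struct int_item *head)
{
    long s;

    s = 0;
    for (; head; head = head->next)
        s += head->value;
    return s;
}

/* Releases every item of the list. */
void int_list_free(struct int_item *head)
{
    struct int_item *tmp;

    while (head) {
        tmp = head->next;
        free(head);
        head = tmp;
    }
}
```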

No standard-invented “features”

None of the “features” introduced by the so-called “standards” is allowed here. Just to start with, your code must not contain any of the following keywords: using, namespace, typename (the class keyword is to be used in template arguments that represent types), nullptr (the numerical 0 must be used to denote the null address), auto (damn if you feel you need it, you're wrong: don't prevent the compiler from detecting your errors!), constexpr, consteval, constinit, noexcept, final, override, import, module, requires, export, co_yield, co_return, co_await, wchar_t, char8_t, char16_t, char32_t, alignas, alignof, register, static_assert, thread_local...

Something may be missing from this list, but you've got the idea: if a new keyword is “added” by another damn “standard” or its meaning changed, then the keyword is prohibited here.

Not only keywords are prohibited; all the “concepts” introduced in these “standards” are prohibited, too. Don't even think about all these lambdas, coroutines, structured bindings, move semantics (that is, when the type of a constructor's argument is declared with &&), variadic templates etc.

Once again: everything that comes from C++ “standards” is prohibited. No exceptions from this rule are to be made, ever.

No loop var definition within for head

Please don't do this:

    for (int i = 0; i < 10; ++i) {

Instead, define your loop variable right before the loop head, like this:

    int i;
    for (i = 0; i < 10; i++) {

Beware of exception handling

C++ exception handling sometimes may save a lot of time and effort; actually, exceptions don't come from “standards”, and despite their terrible inefficiency, sometimes it is okay to use them.

However, the idea behind C++ exception handling (which is that a particular exceptional situation is to be identified by a type) is weird; furthermore, the implementation of C++ exceptions is a bloated piece of runtime that adds a lot of machine code to your functions, and all this mess is definitely no good; disabling exceptions with the compiler's flags may be desirable in many cases.

So, a decision is to be made for each particular software project whether it will use exceptions or not.

Please be especially careful with the decision if your project (or a part of it) is a software library. Using exceptions in a library means the library can't be used by projects that decide not to have exceptions. Perhaps the very minimum for a library must be a possibility to get rid of exceptions with a compile-time option, but if you implement such an option, you'll see there's no longer any gain from exceptions as you have to implement traditional error handling anyway. So it is always a better option for a library not to rely on exception handling.
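For reference, "traditional error handling" usually means returning a status code and passing the result through a pointer argument. The sketch below is written in the plain C subset; the function parse_uint and the enum constants are hypothetical names chosen for the example and are not taken from any real library.

```c
#include <limits.h>

/* Status codes returned by parse_uint */
enum parse_result {
    parse_ok = 0,
    parse_empty,        /* the string has no characters at all */
    parse_bad_char,     /* a character other than 0..9 was met */
    parse_overflow      /* the value doesn't fit unsigned long */
};

/* Parses a non-negative decimal number from s. On success stores the
   value in *out and returns parse_ok; on failure returns the reason
   and leaves *out untouched. */
int parse_uint(const char *s, unsigned long *out)
{
    unsigned long v;
    unsigned long d;

    if (!*s)
        return parse_empty;
    v = 0;
    for (; *s; s++) {
        if (*s < '0' || *s > '9')
            return parse_bad_char;
        d = *s - '0';
        /* v * 10 + d would exceed ULONG_MAX */
        if (v > (ULONG_MAX - d) / 10)
            return parse_overflow;
        v = v * 10 + d;
    }
    *out = v;
    return parse_ok;
}
```

The caller checks the returned code right at the call site, so such a function works the same way whether or not the surrounding project has exceptions enabled.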

No RTTI nor dynamic_cast

These do not come from standards either, but RTTI is too monstrous, and dynamic_cast is too slow. Don't use either of them.

No member pointers

There is absolutely nothing wrong with member pointers, except for one thing: if you really need a member pointer, then your class (or structure, heh?) is overcomplicated. Instead of member pointers, better try to refactor your classes and methods so that they stop being that complicated.

Default argument values

Default values for arguments of functions and methods are discouraged but still allowed. However, we restrict what can serve as the default value. The standards are too liberal on this. Within the RebuildWorld project, only explicit compile-time constants may be used in this role. These include explicit numerical, char and string literals, macros that are known to expand to such literals, and enum constants. Nothing else is allowed.

Templates are okay, but not for containers

Many similar guides prohibit templates altogether. This might surprise you, but here we disagree: within the RebuildWorld project, templates as such are allowed. What is not allowed is using templates to create generic container classes and “algorithms” for them, like STL does.

People often ask something like "hey, but what else can templates be used for?!" Okay, if you don't know the answer, then simply don't use templates at all. However, once you encounter a small task where templates can make things better (not being used to create another damn generic container), then, well, recall this section.

For both languages

#include"" vs. #include<>

Before the committees touched all this stuff with their dirty hands, the difference between #include "" and #include <> was obvious: the form #include "" was used for the headers that belong to your program itself, while #include <> was for the header files external to your program, that is, headers from libraries, no matter whether it is the so-called “standard” library or any other. So this is the principle we continue to use, no matter what these brainless irresponsible terrorists invent (all these 'modules', 'system directories' etc).

In particular, all the libraries included within the source tarball are still libraries, so their headers must be included with #include <>.

The #include "" form is only to be used for headers that belong to the main part of the program's code, that is, the part which the author or authors don't consider as libraries. Typically the main code is placed in its own directory.

No GNU extensions

Well, do we really have to mention that horrible things like nested functions, VLAs in C++ (which were initially introduced for plain C only, starting with the C99 'standard') and lots of other monsters that gcc supports as “extensions” are not to be used? Heh, actually they are so ugly that even damn standard committees don't want to adopt them.
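For those who got used to nested functions, the usual plain C substitute is an ordinary file-level function that receives its context through an explicit pointer. The following is only a sketch: count_if, above_limit and count_above are names invented for this illustration.

```c
/* Callback type: the context is passed explicitly as userdata
   instead of being captured from an enclosing function */
typedef int (*int_pred)(int value, void *userdata);

/* Counts the elements of v[0..n-1] for which pred returns non-zero. */
int count_if(const int *v, int n, int_pred pred, void *userdata)
{
    int i, cnt;

    cnt = 0;
    for (i = 0; i < n; i++)
        if (pred(v[i], userdata))
            cnt++;
    return cnt;
}

/* With the gcc extension this would be nested inside count_above and
   would read `limit` directly; here the limit arrives via userdata. */
static int above_limit(int value, void *userdata)
{
    return value > *(int *)userdata;
}

/* Counts the elements of v[0..n-1] that are greater than limit. */
int count_above(const int *v, int n, int limit)
{
    return count_if(v, n, above_limit, &limit);
}
```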
