Modules In Depth
The previous article, C++ Modules, introduced the basic concepts and syntax of modules but offered no systematic guidance. This article summarizes my experience using modules over the past few years and should cover what most people need.
Preprocessor
Modules do not change the way the preprocessor works, so you can still use #include directives normally.
import module_name; and module module_name; are also preprocessor directives: they cannot be produced by macro expansion and must occupy a line by themselves. module module_name; is handled before conditional compilation, so you cannot use conditional compilation to enable it selectively. Ordinary export declarations are not preprocessor directives and are not subject to these restrictions.
A recommended practice is to use an EXPORT macro inside lib.h to control which declarations are exported, and then, in the .cpp file, declare the module and include the header:
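The layout described here might look like the following minimal sketch. The names lib.h, EXPORT, and lib_func follow the text; the `<string>` dependency, the lib.cpp file name, and wrapping the include in extern "C++" inside the .cpp file are assumptions made for illustration.

```cpp
// lib.h (still usable as an ordinary header)
#pragma once
#include <string>

#ifndef EXPORT
#define EXPORT                  // expands to nothing for header users
#endif

EXPORT std::string lib_func();

// lib.cpp (module interface unit)
module;
#include <string>               // lib.h's dependencies go in the global module fragment
export module lib;

#define EXPORT export           // turn the markers into real export declarations

extern "C++" {                  // assumption: keep the entities unattached to module lib
#include "lib.h"
}
```

Header users never see the word export, and the module unit never has to repeat a declaration.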
With this approach, lib.h and the module implementation are completely isolated.
Modules cannot export macros, and what an import makes available does not depend on the importing context. If a modular library wants to provide macros to users, it must therefore isolate them in a separate header file.
If a library supports configuration via macros, then after converting it to a module, the macros must be defined when the module itself is compiled, because definitions in the importing code do not reach it.
You can use a config.h file: let users provide definitions in it, and then #include "config.h" in the module implementation file. The __has_include preprocessor operator is very useful for testing whether config.h exists.
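A minimal sketch of the optional-config pattern follows. The file name my_lib_config.h comes from the text; the macro MY_LIB_BUFFER_SIZE and the function my_lib_buffer_size are hypothetical. The logic is shown as plain preprocessor code so it can be tried in a single file; in a module unit it would sit in the global module fragment, after `module;`.

```cpp
// Pull in the user's configuration only if they supplied one.
#if __has_include("my_lib_config.h")
#include "my_lib_config.h"
#endif

// Fall back to a default when the user did not configure this setting.
// MY_LIB_BUFFER_SIZE is a hypothetical option used for illustration.
#ifndef MY_LIB_BUFFER_SIZE
#define MY_LIB_BUFFER_SIZE 256
#endif

constexpr int my_lib_buffer_size() { return MY_LIB_BUFFER_SIZE; }
```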
If the user provides my_lib_config.h, its macros will be included and take effect; otherwise, the library uses default behaviour.
Module Implementation Units and Module Interface Units
Distinguishing between module implementation units and module interface units generally serves two purposes.
First, it separates implementation from interface. If some definitions are not distributed in source form, you can put them in module implementation units during development and distribute only the module interface units to users.
Second, after modifying only the contents of a module implementation unit, code that depends on that module does not need to be rebuilt. For large codebases, this can speed up the iteration cycle.
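As a rough sketch of this split, assuming a module named math with a single function add (both named in the text) and a .cppm extension for the interface file (an assumption; the extension depends on your toolchain):

```cpp
// math.cppm (module interface unit: all that importers depend on)
export module math;

export int add(int a, int b);

// math_impl.cpp (module implementation unit: not seen by importers)
module math;

int add(int a, int b) { return a + b; }

// main.cpp
import math;

int main() { return add(1, 2) == 3 ? 0 : 1; }
```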
When you modify only the implementation details of add inside math_impl.cpp, no translation unit that depends on module math needs to be recompiled, because they depend only on the module interface.
Note that module partitions also support the distinction between implementation and interface units, so using partition implementation units can also avoid unnecessary builds.
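One possible layout is sketched below. The function mul and the partition name detail_impl are assumptions: a module may not contain two partitions with the same name, so the implementation side gets its own partition name here even though the file is called math-detail_impl.cpp.

```cpp
// math-detail.cppm (partition interface unit)
export module math:detail;

export int mul(int a, int b);

// math-detail_impl.cpp (partition implementation unit)
module math:detail_impl;        // hypothetical partition name
import :detail;                 // makes the declaration of mul visible

int mul(int a, int b) { return a * b; }

// math.cppm (primary module interface unit)
export module math;
export import :detail;
```

Nothing imports :detail_impl, so editing it leaves the interface units and their importers untouched.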
Modifying math-detail_impl.cpp likewise does not trigger rebuilding of code that depends on math.
Module Partitions vs. Multiple Modules
Generally speaking, if your project is a library, it is fine to wrap it as a single module. This is also how the standard library is currently provided. Therefore, if you are a beginner, I strongly advise against spending time on module partitions.
Only after you have already implemented your library as a module and are familiar with the various implementation approaches should you try module partitions.
Using module partitions usually serves two purposes:
- Improving parallelism
- Moving toward a pure‑module approach, i.e., transforming the project so that it uses almost no #include
If the library you develop is very large and the boundaries between functionalities are clear, using multiple modules is also acceptable. A single module puts every interface in one place, so each import loads all of it; splitting the library lets users import less, which avoids parsing unnecessary data, speeds up compilation, and reduces memory usage.
If you are developing end‑user software, you do not need to provide a single module for user convenience, so module partitions are unnecessary. In that case, using multiple modules is appropriate.
One limitation of multiple modules is that if class A is declared in module A, then module B cannot provide a definition of A; in other words, definitions in module B are not considered to complete declarations from module A. Module partitions do not have this restriction. If your code relies on this pattern, you must put it in a single module or use partitions.
Incorrect example:
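A sketch of the failing pattern, using the names A and B from the paragraph above (the file names and the member value are assumptions):

```cpp
// a.cppm
export module A;

export class A;                 // declared, but not defined, in module A

// b.cppm
export module B;
import A;

class A { int value; };         // ill-formed: this A would be attached to module B
                                // and cannot complete the A owned by module A
```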
With module partitions, completing the definition is legal:
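A sketch of the partition version follows. The module name lib and the partition names fwd and def are hypothetical.

```cpp
// lib-fwd.cppm
export module lib:fwd;

export class A;                 // forward declaration only

// lib-def.cppm
export module lib:def;
import :fwd;

export class A {                // OK: same module, so this completes A
public:
    int value = 0;
};

// lib.cppm (primary module interface unit)
export module lib;
export import :fwd;
export import :def;
```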
When I implemented modules for C++/WinRT, I deliberately chose multiple modules rather than a single winrt module with partitions. That way, I improved parallelism (Windows.*.h files are about 10 times larger than the standard library) while avoiding rebuilding code unrelated to the IDL being iterated.
Avoiding ODR Violations
Modules themselves do not change any ODR-related rules, but modularisation can introduce new ODR violations, expose existing ones, or change the behaviour of buggy code that already violates the ODR.
The primary cause of ODR violations is incompatible compilation options – for example, all four major compilers support changing the signedness of char. Historically, there has been no definitive list of which options cause ODR violations when used inconsistently. Compilers that have implemented modules maintain an internal list that is constantly improved, and in most cases they can report compatibility issues.
Beyond compilation options, most ODR violations come from structures or functions in the same namespace having different definitions under different conditions or in different files.
For instance, some code checks the NDEBUG macro to provide different definitions; similarly, Windows' UNICODE and _UNICODE macros. Defining these macros in one file and not defining them in another can lead to ODR violations. Therefore, you must ensure that such macros are defined consistently everywhere they are needed.
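A minimal sketch of such a header follows. The struct name Data matches the text; the file name data.h and the members are assumptions for illustration.

```cpp
// data.h (hypothetical header used by both the module and user code)
#include <cstddef>

struct Data {
    int value;
#ifndef NDEBUG
    int debug_cookie;           // exists only in builds that do not define NDEBUG
#endif
};

// The size, and therefore the layout, depends on whether NDEBUG is defined,
// so two translation units that disagree on NDEBUG disagree on Data.
constexpr std::size_t data_size = sizeof(Data);
```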
If NDEBUG is defined when compiling the module, but user code that imports the module does not define NDEBUG and includes the same header, Data will have two inconsistent definitions, violating the ODR.
Another case is when a template specialization is defined in a different file from the primary template / template parameters. If a user does not consistently include the file containing the specialization, an ODR violation occurs.
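The specialization case can be sketched as follows, with three hypothetical headers shown in one listing. The names is_fast and Widget are invented for illustration.

```cpp
// traits.h: primary template
template <class T>
struct is_fast { static constexpr bool value = false; };

// widget.h: the type used as a template argument
struct Widget {};

// widget_traits.h: specialization, in a file separate from both of the above
template <>
struct is_fast<Widget> { static constexpr bool value = true; };

// A translation unit that includes all three headers sees true here.
// One that forgets widget_traits.h would instantiate the primary template
// and see false for the same is_fast<Widget>, which violates the ODR.
constexpr bool widget_is_fast = is_fast<Widget>::value;
```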
If your library is to be provided both as headers and as modules, you must mark the declarations in the module's source with extern "C++"; otherwise they are attached to the named module and are different entities from the ones users get by including the header.
If user code in one file does both #include "lib.h" and import lib;, there will be two different declarations of lib_func, which the compiler will report as conflicting. If the #include and the import are in different files, the program violates the ODR.
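The difference can be sketched as two alternative versions of the same interface file. The names lib and lib_func follow the text; the file name lib.cppm is an assumption.

```cpp
// lib.h
#pragma once
void lib_func();

// lib.cppm, variant 1: the declaration is attached to module lib
export module lib;
export void lib_func();         // a different entity from the lib_func in lib.h

// lib.cppm, variant 2: the declaration is not attached to any named module
export module lib;
extern "C++" {
export void lib_func();         // the same entity as the lib_func in lib.h
}
```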
Strictly speaking, building binary libraries with different compiler versions or different standard versions can also cause ODR violations because they lead to differing definitions. In practice, however, binaries are isolated at the ABI level, so this is generally not considered problematic. With modules, since compiled modules store compiler-internal structures rather than binaries, modules built with different compiler versions or different standards are indeed incompatible.
If your library is purely modular, you can actually drop inline at namespace scope because you no longer need to resolve symbol conflicts.
An inline function inside a module is genuinely inline rather than a mere hint: the standard's rules for inline allow the module's intermediate file to expose the function body, so importers can obtain the body from that file and expand it directly.
At the same time, the new design for inline introduces a "defect": an inline function inside a module can neither call functions with internal linkage nor use variables with internal linkage, because they are not part of the module's public interface. Thus, when the inline function is expanded in other modules, it cannot rely on them.
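A small sketch of this restriction, with a hypothetical module counter and hypothetical functions next_id and make_id:

```cpp
// counter.cppm
export module counter;

static int next_id() {          // internal linkage: local to this translation unit
    static int id = 0;
    return ++id;
}

// Ill-formed if uncommented: the body of an inline function is visible to
// importers, so it may not name the internal-linkage function next_id.
// export inline int make_id() { return next_id(); }

// OK: the body of a non-inline function stays inside this translation unit.
export int make_id() { return next_id(); }
```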
However, C++ LTO (including ThinLTO) and PGO (including SPGO) techniques are now quite mature, and there are even tools like BOLT that directly optimise binaries. Usually, writing inline is not necessary.
Export Using Declarations
using declarations have always had a special role: for function overloads (including function templates among the overload set), a single using declaration brings in all those overloads; for function templates, class templates, and alias templates, it refers to the primary template.
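The behaviour described above can be tried without modules. In this sketch the namespaces detail and api and the names twice and tag are all hypothetical.

```cpp
namespace detail {
    // Two overloads sharing one name.
    inline int twice(int x) { return 2 * x; }
    inline double twice(double x) { return 2.0 * x; }

    // A primary template and one explicit specialization.
    template <class T>
    struct tag { static constexpr int id = 0; };

    template <>
    struct tag<bool> { static constexpr int id = 1; };
}

namespace api {
    using detail::twice;        // brings in both overloads
    using detail::tag;          // names the primary template; the specialization
                                // is found through it
}
```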
Modules allow exporting using declarations, which means you can safely export everything you need with a single using declaration, as long as the entity exists, without having to add export to each overload individually. Moreover, it automatically selects the appropriate declarations, avoiding the risk of adding duplicate export to repeated declarations or incorrectly adding export to specialisations.
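A sketch of the using style follows, assuming a hypothetical header shapes.h that declares two overloads of area and a class template box with one specialization in namespace shapes. The header is included in the global module fragment, so the names it declares are not attached to the module and can be exported through using declarations.

```cpp
// shapes.cppm
module;
#include "shapes.h"             // hypothetical header declaring shapes::area and shapes::box

export module shapes;

export namespace shapes {
    using shapes::area;         // exports every overload of area
    using shapes::box;          // exports the primary template; specializations follow it
}
```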
Wrapping a specialization in an export block is meaningless: it exports neither the primary template nor the specialisation, because only the primary template can be exported. Adding export directly to a specialization is an error, whereas a using declaration picks up the appropriate declarations automatically.
Thus, there are two styles for implementing modules: the using style and the direct style. In practice, libc++ and libstdc++ use the using style, while MSVC's STL uses the direct style.
Module Wrapping Strategies
In the previous article I explained the meaning of extern "C++" in the context of modules: it allows an entity to be exposed to users both through a module and through a header while remaining the same entity. Together with export using, this gives rise to several styles for providing both a module and a header.
Style 1:
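One reading of Style 1, pieced together from the surrounding description, is sketched below. It assumes the EXPORT macro from earlier in the article, extern "C++" placed inside lib.h, and a module named A as in the later styles; the declarations x and lib_func are placeholders.

```cpp
// lib.h
#pragma once

#ifndef EXPORT
#define EXPORT                  // expands to nothing for header users
#endif

extern "C++" {                  // harmless in ordinary header use
EXPORT struct x { int value; };
EXPORT void lib_func();
}

// module.cpp
export module A;

#define EXPORT export
#include "lib.h"
```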
extern "C++" can also be placed in module.cpp wrapping the #include "lib.h".
Style 2:
Style 3:
All three styles above require that you can modify lib.h. If you cannot modify it, there are two alternative styles:
Style 4:
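A sketch of Style 4 follows. The header `<upstream_library>` is the placeholder used in the text, and x and lib_func are assumed to be declared by lib.hpp.

```cpp
// module.cpp
module;
#include <upstream_library>     // placeholder: whatever lib.hpp itself includes

export module A;

extern "C++" {
#include "lib.hpp"              // declarations stay outside module A
}

export using ::x;
export using ::lib_func;
```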
Because you cannot modify lib.hpp, you must include its contents inside extern "C++" so that none of its declarations are attached to a named module. Then you use export using to export them. Before the module declaration, you also need to include <upstream_library> to satisfy dependencies, so that the upstream declarations are not pulled into the extern "C++" block.
Style 5:
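A sketch of Style 5, again assuming that the header declares x and lib_func:

```cpp
// module.cpp
module;
#include "lib.h"                // everything declared here belongs to the global module

export module A;

export using ::x;
export using ::lib_func;
```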
Including the header in the global module fragment makes all declarations belong to the global module, not to the current module A. Therefore, you can include it directly without extern "C++" and then use export using to export. This looks cleaner, but it actually adds unnecessary declarations to the global module fragment.
As I have shown, if you cannot modify lib.h, you should use the using style rather than redeclaring.
The potential risk of redeclaring struct x is that if lib.h removes x in a future update, module A would export its own declaration of x, causing the code's behaviour to change or producing hard-to-read errors, such as ODR violations.
Include After Import
The standard does not forbid including the same thing after importing it (e.g., the standard library), but in practice GCC, Clang, and MSVC all still have many bugs when an include follows an import. At present, we can consider this a common pitfall:
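The two orderings can be sketched as follows. The example assumes a toolchain that provides `import std;`, and the file names are hypothetical.

```cpp
// fragile.cpp: include after import, the order that tends to hit compiler bugs
import std;
#include <vector>

std::vector<int> make_fragile() { return {1, 2, 3}; }

// robust.cpp: include first, import later
#include <vector>
import std;

std::vector<int> make_robust() { return {1, 2, 3}; }
```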
The reason is probably the complexity of C++ syntax: when including after importing, the compiler must perform parsing and merging of duplicate declarations simultaneously, increasing implementation complexity.
When the code includes first and imports later, the compiler only needs to parse first, and merging is done after parsing, which is relatively simpler.
Therefore, for now you should ensure that includes come before imports. If you really cannot avoid this issue, consider the wrapping techniques described above: create module wrappers for those headers, turning them into modules to solve the problem.