Binary resource inclusion (since C23)

From cppreference.com

#embed is a preprocessor directive to include (binary) resources in the build, where a resource is defined as a source of data accessible from the translation environment.

Syntax

#embed < h-char-sequence > embed-parameter-sequence (optional) new-line (1)
#embed " q-char-sequence " embed-parameter-sequence (optional) new-line (2)
#embed pp-tokens new-line (3)
__has_embed ( " q-char-sequence " embed-parameter-sequence (optional) )
__has_embed ( < h-char-sequence > embed-parameter-sequence (optional) )
(4)
__has_embed ( string-literal pp-balanced-token-sequence (optional) )
__has_embed ( < h-pp-tokens > pp-balanced-token-sequence (optional) )
(5)
1) Searches for a resource identified uniquely by h-char-sequence
2) Searches for a resource identified by q-char-sequence and replaces the directive by a list of integers corresponding to the data of the resource. It may fallback to (1)
3) If neither (1) nor (2) is matched, pp-tokens will undergo macro replacement. The directive after replacement will be tried to match with (1) or (2)
4) Checks whether a resource is available for embedding, whether it is empty or not and whether the parameters passed are supported by the implementation.
5) If (4) is not matched, h-pp-tokens and pp-balanced-token-sequence will undergo macro replacement. The directive after replacement will be tried to match with (4)
new-line - The new-line character
h-char-sequence - A sequence of one or more h-chars, where the appearance of any of the following causes undefined behavior:
  • the character '
  • the character "
  • the character \
  • the character sequence //
  • the character sequence /*
h-char - Any member of the source character set except new-line and >
q-char-sequence - A sequence of one or more q-chars, where the appearance of any of the following causes undefined behavior:
  • the character '
  • the character \
  • the character sequence //
  • the character sequence /*
q-char - Any member of the source character set except new-line and "
pp-tokens - A sequence of one or more preprocessing tokens
string-literal - A string literal
h-pp-tokens - A sequence of one or more preprocessing tokens except >
embed-parameter-sequence - A sequence of one or more pp-parameter s. Note that unlike an attribute-list, this sequence is not comma separated.
pp-parameter - An attribute-token (see: attributes) but comprised of preprocessing tokens instead of tokens.
pp-balanced-token-sequence - A balanced-token-sequence (see: attributes) but comprised of preprocessing tokens instead of tokens

Explanation

1) Searches for the resource identified by h-char-sequence in implementation-defined manner.
2) Searches for the resource identified by q-char-sequence in implementation-defined manner. For (1,2), the implementations typically use a mechanism similar to, but distinct from, the implementation-defined search paths used for source file inclusion. The construct __has_embed(__FILE__ ... appears in one of the examples in the standard, suggesting, in case (2)
3) The preprocessing tokens after embed in the directive are processed just as in normal text (i.e., each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens). The directive resulting after all replacements shall match one of the two previous forms. The method by which a sequence of preprocessing tokens between < and > preprocessing token pair or a pair of "
4) The resource identified by h-char-sequence or q-char-sequence is searched for as if that preprocessing token sequence were the pp-tokens in syntax (3), except that no further macro expansion is performed. If such a directive would not satisfy the syntactic requirements of an #embed directive, the program is ill-formed. The __has__embed expression evaluates to __STDC_EMBED_FOUND__ if the search for the resource succeeds, the resource is non empty and all the parameters are supported, to __STDC_EMBED_EMPTY__ if the resource is empty and all the parameters are supported, and to __STDC_EMBED_NOT_FOUND__
5) This form is considered only if syntax (4) does not match, in which case the preprocessing tokens are processed just as in normal text.

In the case the resource is not found or one of the parameters is not supported by the implementation, the program is ill-formed.

__has_embed can be expanded in the expression of #if and #elif. It is treated as a defined macro by #ifdef, #ifndef, #elifdef, #elifndef and defined

A resource has an implementation resource width which is the implementation-defined size in bits of the located resource. Its resource width is the implementation resource width unless modified by a limit parameter. If the resource width is 0, the resource is considered empty. The embed element width is equal to CHAR_BIT

The expansion of a #embed directive is a token sequence formed from the list of integer constant expressions

The values of the integer constant expressions in the expanded sequence are determined by an implementation-defined mapping of the resource’s data. Each integer constant expression’s value is in the range [02embed element width)

  1. The list of integer constant expressions is used to initialize an array of a type compatible with unsigned char, or compatible with char if char
  2. The embed element width is equal to CHAR_BIT,

then the contents of the initialized elements of the array are as-if the resource’s binary data is fread into the array at translation time.

Implementations are encouraged to take into account translation-time bit and byte orders as well as execution-time bit and byte orders to more appropriately represent the resource’s binary data from the directive. This maximizes the chance that, if the resource referenced at translation time through the #embed directive is the same one accessed through execution-time means, the data that is e.g. fread or similar into contiguous storage will compare bit-for-bit equal to an array of character type initialized from an #embed

Parameters

The standard defines the parameters limit, prefix, suffix and if_empty

limit

limit( constant-expression ) (1)
__limit__( constant-expression ) (2)

The limit embed parameter can appear at most once in the embed parameter sequence. It must have an argument, which must be an integer (preprocessor) constant expression that evaluates to a non negative number and does not contain the token defined

suffix

suffix( pp-balanced-token-sequence (optional) ) (1)
__suffix__( pp-balanced-token-sequence (optional) ) (2)

The suffix

prefix

prefix( pp-balanced-token-sequence (optional) ) (1)
__prefix__( pp-balanced-token-sequence (optional) ) (2)

The prefix

if_empty

if_empty( pp-balanced-token-sequence (optional) ) (1)
__if_empty__( pp-balanced-token-sequence (optional) ) (2)

The if_empty

Example

#include <stdint.h>
#include <stdio.h>
 
const uint8_t image_data[] = {
#embed "image.png"
};
 
const char message[] = {
#embed "message.txt" if_empty('M', 'i', 's', 's', 'i', 'n', 'g', '\n')
,'\0' // null terminator
};
 
void dump(const uint8_t arr[], size_t size)
{
    for (size_t i = 0; i != size; ++i)
        printf("%02X%c", arr[i], (i + 1) % 16 ? ' ' : '\n');
    puts("");
}
 
int main()
{
    puts("image_data[]:");
    dump(image_data, sizeof image_data);
    puts("message[]:");
    dump((const uint8_t*)message, sizeof message);
}

Possible output:

image_data[]:
89 50 4E 47 0D 0A 1A 0A 00 00 00 0D 49 48 44 52
00 00 00 01 00 00 00 01 01 03 00 00 00 25 DB 56
...
message[]:
4D 69 73 73 69 6E 67 0A 00

References

  • C23 standard (ISO/IEC 9899:2024):
  • 6.4.7 Header names (p: 69)
  • 6.10.1 Conditional inclusion (p: 165-169)
  • 6.10.2 Binary resource inclusion (p: 170-177)