{"id":5011,"date":"2021-10-16T00:01:55","date_gmt":"2021-10-16T07:01:55","guid":{"rendered":"https:\/\/c-for-dummies.com\/blog\/?p=5011"},"modified":"2021-10-09T16:51:14","modified_gmt":"2021-10-09T23:51:14","slug":"trigraph-sequences","status":"publish","type":"post","link":"https:\/\/c-for-dummies.com\/blog\/?p=5011","title":{"rendered":"Trigraph Sequences"},"content":{"rendered":"<p>I doubt you&#8217;ve ever used a trigraph. If you saw a trigraph in some C code, you might assume it was a typo or, from the early days of telecommunications, a modem burp. But trigraphs present a legitimate if not arcane way to represent certain characters, a holdover from the days of teletype input and primitive, barely-ASCII keyboards.<br \/>\n<!--more--><br \/>\nAs the primary input devices back in the mainframe days, teletype machines lacked certain symbols now common on computer keyboards:<\/p>\n<p><code># [ \\ ] ^ { | } ~<\/code><\/p>\n<p>If you&#8217;re going to code C, even in 1973 on a teletype machine connected to a mainframe across campus, you need to use these characters. To emulate them, the trigraph sequence is available. Here&#8217;s the definition from the C standard:<\/p>\n<blockquote><p>The trigraph sequences enable the input of characters that are not defined in the Invariant Code Set as described in ISO\/IEC 646, which is a subset of the seven-bit US ASCII code set.<\/p><\/blockquote>\n<p>A trigraph sequence starts with two question marks, <code>??<\/code>. 
The third character references the missing symbol, as shown in this table:<\/p>\n<table>\n<tr>\n<td>Trigraph<\/td>\n<td>Char.<\/td>\n<td>Trigraph<\/td>\n<td>Char.<\/td>\n<td>Trigraph<\/td>\n<td>Char.<\/td>\n<\/tr>\n<tr>\n<td>??=<\/td>\n<td>#<\/td>\n<td>??(<\/td>\n<td>[<\/td>\n<td>??\/<\/td>\n<td>\\<\/td>\n<\/tr>\n<tr>\n<td>??)<\/td>\n<td>]<\/td>\n<td>??&#39;<\/td>\n<td>^<\/td>\n<td>??&lt;<\/td>\n<td>{<\/td>\n<\/tr>\n<tr>\n<td>??!<\/td>\n<td>|<\/td>\n<td>??&gt;<\/td>\n<td>}<\/td>\n<td>??-<\/td>\n<td>~<\/td>\n<\/tr>\n<\/table>\n<p>In older C code, you might see something like:<\/p>\n<p><code>printf(\"phone ??= \");<\/code><\/p>\n<p>The <code>??=<\/code> trigraph represents the # character. It&#8217;s translated into the proper character during the first phase of translation, before the preprocessor even runs. Then the source code is compiled.<\/p>\n<p>Modern compilers dislike trigraphs. Even your editor may choke on the sequences, improperly interpreting them and causing any context-based color coding to go haywire. Yet, according to the C standard, trigraphs are still valid in C.<\/p>\n<p>The following code outputs the pantheon of trigraph characters. Presenting them in a string is the only way I could keep my editor from getting cross with me:<\/p>\n<h3><a href=\"https:\/\/github.com\/dangookin\/C-For-Dummies-Blog\/blob\/master\/2021_10_16-Lesson.c\" rel=\"noopener\" target=\"_blank\">2021_10_16-Lesson.c<\/a><\/h3>\n<pre class=\"screen\">\r\n#include &lt;stdio.h&gt;\r\n\r\nint main()\r\n{\r\n    char trigraph[] =  \"??= ??( ??\/ ??) ??' ??&lt; ??! ??&gt; ??-\";\r\n\r\n    printf(\"%s\\n\",trigraph);\r\n\r\n    return(0);\r\n}<\/pre>\n<p>Beyond making the editor uncomfortable, trigraphs are flagged with warnings by modern compilers. The above code generates 9 warnings with <em>clang<\/em>, one for each trigraph, even with the trigraphs enclosed in double quotes. The warning states that the trigraph is ignored.<\/p>\n<p>Here&#8217;s the program&#8217;s output:<\/p>\n<p><code>??= ??( ??\/ ??) ??' ??&lt; ??! 
??&gt; ??-<\/code><\/p>\n<p>To enable the trigraphs with <em>clang<\/em>, compile in a strict ISO C mode rather than the default GNU dialect. Yep, even though the current standard allows for trigraphs, your compiler may choose to ignore them by default. To properly process the trigraphs, use a switch such as <code>-std=c89<\/code>, which tells the compiler to follow the C89 standard:<\/p>\n<p><code>clang -std=c89 2021_10_16-Lesson.c<\/code><\/p>\n<p>Warnings about the trigraphs may still appear, though the warning now states that the trigraph is converted into the proper, corresponding character. Further, because the <code>??\/<\/code> trigraph is converted to the \\ (backslash) and is followed by a space, the compiler reports an unknown escape sequence. Here is the updated output:<\/p>\n<p><code># [  ] ^ { | } ~<\/code><\/p>\n<p>To fix the escape sequence, double up on the <code>??\/<\/code> trigraph: <code>??\/??\/<\/code>. The resulting double backslash is the escape sequence for a literal backslash, which is now output:<\/p>\n<p><code># [ \\ ] ^ { | } ~<\/code><\/p>\n<p>Trigraphs are an interesting and delightfully cryptic relic of days gone by. I really wish I was actively coding back when these things were common and the nerds knew the trigraph sequences by heart. It&#8217;s sad to lose such legacies.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Valid in C source code, but rarely used these days, trigraphs present another way to represent certain characters. 
<a href=\"https:\/\/c-for-dummies.com\/blog\/?p=5011\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[18],"class_list":["post-5011","post","type-post","status-publish","format-standard","hentry","category-main","tag-trigraph"],"_links":{"self":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5011","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5011"}],"version-history":[{"count":5,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5011\/revisions"}],"predecessor-version":[{"id":5017,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/5011\/revisions\/5017"}],"wp:attachment":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5011"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5011"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5011"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}