{"id":4782,"date":"2021-06-05T00:01:34","date_gmt":"2021-06-05T07:01:34","guid":{"rendered":"https:\/\/c-for-dummies.com\/blog\/?p=4782"},"modified":"2021-06-19T07:27:17","modified_gmt":"2021-06-19T14:27:17","slug":"understanding-the-glob","status":"publish","type":"post","link":"https:\/\/c-for-dummies.com\/blog\/?p=4782","title":{"rendered":"Understanding the Glob"},"content":{"rendered":"<p>From the history of the Unix operating system, <em>glob<\/em> is the term used for wildcard matching in filenames. It&#8217;s short for <em>global<\/em>, which to me means that two extra bytes of storage (for <code>'a'<\/code> and <code>'l'<\/code>) were important back in the day.<br \/>\n<!--more--><br \/>\nThe two common glob characters are <code>*<\/code> to match a cluster of characters in a filename, and <code>?<\/code> to match a single character. So <code>*.c<\/code> matches all C language source code files &mdash; but only when your code properly interprets the glob character input. Otherwise, as shown in <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=4775\">last week&#8217;s Lesson<\/a>, only the first matching filename in a directory is returned. (An explanation of this effect is forthcoming.)<\/p>\n<p>Wildcards are always active in Windows. In Linux\/Unix, the glob feature must be active for the wildcards to match filename characters. In bash and other shells, the feature is on by default. Glob functionality is disabled by activating the <em>noglob<\/em> setting. To do so, use the <em>set<\/em> command:<\/p>\n<p><code>set -o noglob<\/code><\/p>\n<p>No feedback is generated, but after issuing the above command any wildcards are interpreted literally. 
For example:<\/p>\n<p><code>$ ls *.c<br \/>\nls: *.c: No such file or directory<\/code><\/p>\n<p>To reactivate glob, use this command:<\/p>\n<p><code>set +o noglob<\/code><\/p>\n<p>(You can review all shell settings by typing the <strong>set -o<\/strong> command.)<\/p>\n<p>In your C code, the <em>glob()<\/em> function helps evaluate filename input that includes the global wildcard characters. Here is the <em>man<\/em> page format for the <em>glob()<\/em> function, which requires that the <code>glob.h<\/code> header file be included:<\/p>\n<p><code>int glob(const char * restrict pattern, int flags, int (*errfunc)(const char *epath, int errno), glob_t * restrict pglob);<\/code><\/p>\n<p>The four arguments are:<\/p>\n<p><code><em>pattern<\/em><\/code>, which is the pathname\/wildcard pattern to match, such as <code>*.c<\/code>.<\/p>\n<p><code><em>flags<\/em><\/code> are a series of options to modify the function&#8217;s behavior. Defined constants set the options, which can be combined by bitwise OR&#8217;ing them with each other. Plenty of options are available as documented on the <em>man<\/em> page.<\/p>\n<p><code><em>errfunc<\/em><\/code> is an error-handling function that <em>glob()<\/em> calls when it encounters a directory it cannot open or read. It can be set to <code>NULL<\/code> if this concept overwhelms you.<\/p>\n<p>The final argument <code><em>pglob<\/em><\/code> is a pointer to a <em>glob_t<\/em> structure, which the function fills with the count of matching files and an array of their pathnames.<\/p>\n<p>The <em>glob()<\/em> function returns zero upon success. Otherwise it returns an error code, which I recommend testing against a slate of defined constants, such as <code>GLOB_NOMATCH<\/code> when no matching files are found. 
Here is sample code:<\/p>\n<h3><a href=\"https:\/\/github.com\/dangookin\/C-For-Dummies-Blog\/blob\/master\/2021_06_05-Lesson.c\" rel=\"noopener\" target=\"_blank\">2021_06_05-Lesson.c<\/a><\/h3>\n<pre class=\"screen\">\r\n#include &lt;stdio.h&gt;\r\n#include &lt;stdlib.h&gt;\r\n#include &lt;glob.h&gt;\r\n\r\nint main()\r\n{\r\n    char **found;\r\n    glob_t gstruct;\r\n    int r;\r\n\r\n    r = glob(\"*.c\", GLOB_ERR , NULL, &amp;gstruct);\r\n    <span class=\"comments\">\/* check for errors *\/<\/span>\r\n    if( r!=0 )\r\n    {\r\n        if( r==GLOB_NOMATCH )\r\n            fprintf(stderr,\"No matches\\n\");\r\n        else\r\n            fprintf(stderr,\"Some kinda glob error\\n\");\r\n        exit(1);\r\n    }\r\n    \r\n    <span class=\"comments\">\/* success, output found filenames *\/<\/span>\r\n    printf(\"Found %zu filename matches\\n\",gstruct.gl_pathc);\r\n    found = gstruct.gl_pathv;\r\n    while(*found)\r\n    {\r\n        printf(\"%s\\n\",*found);\r\n        found++;\r\n    }\r\n\r\n    <span class=\"comments\">\/* release the memory glob() allocated *\/<\/span>\r\n    globfree(&amp;gstruct);\r\n\r\n    return(0);\r\n}<\/pre>\n<p>The <em>glob()<\/em> function is called at Line 11 with the wildcard argument <code>*.c<\/code>. Errors are handled at Line 13. When an error occurs (<code>r!=0<\/code>), a second check is done at Line 15 with the defined constant <code>GLOB_NOMATCH<\/code>. This condition reflects when no files match the wildcard given, and an appropriate message is output.<\/p>\n<p>Upon success, the number of matching files held in the <em>gl_pathc<\/em> member of structure <code>gstruct<\/code> is output at Line 23. Double-pointer <code>found<\/code> is assigned to <code>gstruct.gl_pathv<\/code>, the base of a <code>NULL<\/code>-terminated array of pathname strings. A <em>while<\/em> loop processes and outputs the names, after which <em>globfree()<\/em> releases the memory that <em>glob()<\/em> allocated. 
Here is sample output:<\/p>\n<p><code>Found 4 filename matches<br \/>\ncowsbulls.c<br \/>\nfun.c<br \/>\nlogs.c<br \/>\nmem_binary.c<\/code><\/p>\n<p>In <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=4801\">next week&#8217;s Lesson<\/a>, I return to the original question of how to handle wildcards in command line input.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>You may know them as wildcards in Windows, but in the Linux\/Unix universe it&#8217;s the <em>glob<\/em>. <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=4782\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-4782","post","type-post","status-publish","format-standard","hentry","category-main"],"_links":{"self":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/4782","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4782"}],"version-history":[{"count":9,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/4782\/revisions"}],"predecessor-version":[{"id":4833,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/4782\/revisions\/4833"}],"wp:attachment":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4782"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2F
v2%2Fcategories&post=4782"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4782"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}