{"id":4595,"date":"2021-02-06T00:01:07","date_gmt":"2021-02-06T08:01:07","guid":{"rendered":"https:\/\/c-for-dummies.com\/blog\/?p=4595"},"modified":"2021-02-13T09:26:42","modified_gmt":"2021-02-13T17:26:42","slug":"parsing-and-converting","status":"publish","type":"post","link":"https:\/\/c-for-dummies.com\/blog\/?p=4595","title":{"rendered":"Parsing and Converting"},"content":{"rendered":"<p>The goal stated in <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=4568\">last week&#8217;s Lesson<\/a> is to convert a date formatted in a filename string into a <em>time_t<\/em> value. The filename string must be scanned for expected year, month, and day values. This process involves a custom function, <em>convert()<\/em>, as well as the <em>strtol()<\/em> function to translate strings of digits into <em>long int<\/em> values.<br \/>\n<!--more--><br \/>\nThe <em>convert()<\/em> function references a string and a length. It pulls the given number of characters from the string, creating a substring for further manipulation:<\/p>\n<p><code>char *convert( char *s, int size )<\/code><\/p>\n<p>Argument <code>*s<\/code> references the string, and <em>int<\/em> variable <code>size<\/code> sets the substring&#8217;s length. Characters are copied one at a time from <code>s<\/code> to a <em>static char<\/em> <code>buffer[]<\/code>. Because the buffer is declared <em>static<\/em>, its contents aren&#8217;t discarded when the function terminates, which is why its address can be returned to the caller. Each call overwrites the buffer, so the result must be used before <em>convert()<\/em> is called again.<\/p>\n<p>Within the <em>convert()<\/em> function, a <em>for<\/em> loop processes each character in the string according to the <code>size<\/code> value. An <em>if<\/em> test checks for a period (separating the filename from the extension) or the null character. If either is encountered, the program terminates, as the filename string is most likely improperly formed. 
Otherwise, the character is set into the buffer:<\/p>\n<p><code>buffer[x] = c<\/code>.<\/p>\n<p>After the <em>for<\/em> loop ends, the <code>buffer[]<\/code> string is capped with a null character terminator, <code>buffer[x] = '\\0'<\/code>. The address of <code>buffer[]<\/code> is returned to the <em>main()<\/em> function, where it is passed to the <em>strtol()<\/em> function to generate a <em>long int<\/em> value. For example:<\/p>\n<p><code>month = strtol(convert(filename+4,2),NULL,10);<\/code><\/p>\n<p>Above, a substring two characters long is extracted starting at the fifth character of <code>filename<\/code> (<code>filename+4<\/code>) and returned as its own string. The new string is used immediately in the <em>strtol()<\/em> function to obtain an integer value. This value is stored in the <code>month<\/code> variable.<\/p>\n<p>Here is the full code:<\/p>\n<h3><a href=\"https:\/\/github.com\/dangookin\/C-For-Dummies-Blog\/blob\/master\/2021_02_06-Lesson.c\" rel=\"noopener\" target=\"_blank\">2021_02_06-Lesson.c<\/a><\/h3>\n<pre class=\"screen\">\r\n#include &lt;stdio.h&gt;\r\n#include &lt;stdlib.h&gt;\r\n\r\n<span class=\"comments\">\/* copy and convert the digits *\/<\/span>\r\nchar *convert( char *s, int size )\r\n{\r\n    int x;\r\n    static char buffer[5];\r\n    char c;\r\n\r\n    <span class=\"comments\">\/* avoid buffer overflow *\/<\/span>\r\n    if( size &gt; 4 )\r\n    {\r\n        fprintf(stderr,\"Buffer overflow: %d\\n\",size);\r\n        exit(1);\r\n    }\r\n\r\n    <span class=\"comments\">\/* process the given number of characters *\/<\/span>\r\n    for( x=0; x&lt;size; x++ )\r\n    {\r\n        c = *(s+x);\r\n        if( c=='.' 
|| c=='\\0' )\r\n        {\r\n            fprintf(stderr,\"Malformed filename\\n\");\r\n            exit(2);\r\n        }\r\n        buffer[x] = c;\r\n    }\r\n    buffer[x] = '\\0';\r\n\r\n    return(buffer);\r\n}\r\n\r\nint main(int argc, char *argv[])\r\n{\r\n    char *filename;\r\n    int year, month, day;\r\n    \r\n    <span class=\"comments\">\/* check for filename argument *\/<\/span>\r\n    if( argc&lt;2 )\r\n    {\r\n        <span class=\"comments\">\/* output error message to standard error *\/<\/span>\r\n        fprintf(stderr,\"Filename option required\\n\\n\");\r\n        <span class=\"comments\">\/* leave with exit code 1 *\/<\/span>\r\n        exit(1);\r\n    }\r\n    <span class=\"comments\">\/* assign to pointer for convenience *\/<\/span>\r\n    filename = argv[1];\r\n\r\n    <span class=\"comments\">\/* code to confirm that the file exists goes here *\/<\/span>\r\n    <span class=\"comments\">\/* ... *\/<\/span>\r\n    \r\n    <span class=\"comments\">\/* extract integers. *\/<\/span>\r\n    year = strtol(convert(filename+0,4),NULL,10);\r\n    month = strtol(convert(filename+4,2),NULL,10);\r\n    day = strtol(convert(filename+6,2),NULL,10);\r\n\r\n    <span class=\"comments\">\/* output results *\/<\/span>\r\n    printf(\"%4d %2d %2d\\n\",year,month,day);\r\n\r\n    return(0);\r\n}<\/pre>\n<p>This code assumes the filename to be in the proper format. The only checks are the <code>argc<\/code> test in <em>main()<\/em>, which confirms that an argument is present, and the tests in <em>convert()<\/em> for an early period or null character. Here is the output generated when using the filename <code>20210115.txt<\/code>:<\/p>\n<p><code>2021&nbsp;&nbsp;1 15<\/code><\/p>\n<p>And for the filename <code>2021okay.txt<\/code>:<\/p>\n<p><code>2021&nbsp;&nbsp;0&nbsp;&nbsp;0<\/code><\/p>\n<p>The <em>convert()<\/em> function doesn&#8217;t validate proper input. Because <em>strtol()<\/em> returns zero when it finds no digits to convert, non-digit text such as <code>ok<\/code> and <code>ay<\/code> is translated into zero values, as shown in the above output. 
This condition could be tested for later in the code, though my feeling is that zero is still a valid number.<\/p>\n<p>In <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=4590\">next week&#8217;s Lesson<\/a>, I add the time functions required to convert the <code>year<\/code>, <code>month<\/code>, and <code>day<\/code> int values into a <em>time_t<\/em> value.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Data from a filename string must be parsed into individual year, month, and day integers. <a href=\"https:\/\/c-for-dummies.com\/blog\/?p=4595\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-4595","post","type-post","status-publish","format-standard","hentry","category-main"],"_links":{"self":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/4595","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=4595"}],"version-history":[{"count":5,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/4595\/revisions"}],"predecessor-version":[{"id":4627,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/4595\/revisions\/4627"}],"wp:attachment":[{"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=4595"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.
php?rest_route=%2Fwp%2Fv2%2Fcategories&post=4595"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/c-for-dummies.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=4595"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}