Last Updated: 2024-11-11 Mon 14:37

CMSC216 Lab10: I/O Redirection and File Statistics

CODE DISTRIBUTION: lab10-code.zip

  • Download the code distribution
  • See further setup instructions below

CHANGELOG:

Mon Nov 11 02:34:43 PM EST 2024

A second bug in Lab10 was reported: the testing files for problems 1 and 2 had their names swapped. This and the earlier Makefile problem can both be corrected by running the following command in the lab10-code directory:

  curl https://www.cs.umd.edu/~profk/216/lab10-update.sh |bash

This will eliminate the incorrect versions of testing files and update them to correct ones.

Mon Nov 11 11:50:54 AM EST 2024

A bug in the Lab10 Makefile was identified which affects the command make test-prob1. The following in Makefile should be corrected as indicated:

  68: test-prob1 : test-setup redirect_child
  69:	./testy test_redirect_child.org  # CORRECT: add "_child" to existing test_redirect.org
  #     ./testy test_redirect.org        # INCORRECT

The Makefile has been updated in the current codepack.

1 Rationale

This lab covers two independent but short concepts in separate problems.

Unix maintains a table of open File Descriptors for each running process. Using the dup() and dup2() system calls, programs can manipulate this table to achieve interesting effects, notably redirection of output from standard locations to other places. This exercise demonstrates some common techniques for doing so and will acquaint students with the basics of how the file descriptor table works and how it is inherited by child processes.

Systems programs frequently must check for the existence of files and interrogate their properties. The second problem studies two common system calls for this task, access() and stat(), and employs them in a few simple tasks to acquaint students with their use.

Grading Policy

Credit for this exercise is earned by completing the code/answers here and submitting a Zip of the work to Gradescope. Students are responsible for checking that the results produced locally via make test are reflected on Gradescope after submitting their completed Zip. Successful completion earns 1 Engagement Point.

Lab Exercises are open resource/open collaboration and students are encouraged to cooperate on labs. Students may submit work as groups of up to 5 to Gradescope: one person submits then adds the names of their group members to the submission.

See the full policies in the course syllabus.

2 Codepack

The codepack for this exercise is linked at the top of this document. Always download it and unzip/unpack it. It should contain the following files which are briefly described.

File               Use      Description
QUESTIONS.txt      EDIT     Questions to answer: fill in the multiple choice selections in this file.

switch_stdout.c    Study    Problem 1: C file to study to answer QUIZ questions
redirect_child.c   EDIT     Problem 1: I/O redirection program to complete
nums.txt           Data     Problem 1: Data used in the tests

stat_demo.c        Study    Problem 2 demo showing access() / stat() system calls
newer_file.c       EDIT     Problem 2 code to complete
QUESTIONS.txt.bk   Backup   Backup copy of the original file to help revert if needed
Makefile           Build    Enables make test and make zip
testy              Testing  Test running scripts
test_lab10.org     Testing  Tests for this exercise

3 Problem 1: I/O Redirection

Programs often need to deal with open files for reading and writing. The UNIX Operating System (Linux included) maintains a per-process data structure called the File Descriptor Table that tracks all open files. Some entries in this table are created automatically, like Standard Input and Standard Output. Others are created via the open() system call. The table is maintained in Kernel Space and can only be altered via system calls like open() / close() / dup() / dup2().

It is useful to have some diagrams of how the dup() and dup2() system calls manipulate the table of file descriptors. The following diagrams will be discussed in lecture and may be used by course staff to assist students in understanding how programs like switch_stdout.c work.

Fork and Child File Descriptors

fork-fds.png

Figure 1: Effects of open()'ing a file then calling fork() : the child and parent both refer to the same open file.

dup() and dup2() System calls

dup-and-dup2.png

Figure 2: LEFT: Effect of calling dup() to create a duplicate file descriptor table entry. RIGHT: Effect of calling dup2() to overwrite one file descriptor entry with another.

4 Problem 2: stat() and access() System Calls

Basic file statistics such as file size, modification times, and general file type are maintained by the operating system and are made available to programs through system calls like stat(). This call populates a struct stat with data on the file; the stat_demo.c file shows the contents of this struct and the means to access its parts. Occasionally only a simple check is needed, such as whether a particular file exists, and for this the simpler access() system call is useful.

This problem demonstrates stat() / access() and asks students to complete a program that uses them.

5 QUESTIONS.txt File Contents

Below are the contents of the QUESTIONS.txt file for the exercise. Follow the instructions in it to complete the QUIZ and CODE questions for the exercise.

                           _________________

                            LAB10 QUESTIONS
                           _________________





Exercise Instructions
=====================

  Follow the instructions below to experiment with topics related to
  this exercise.
  - For sections marked QUIZ, fill in an (X) for the appropriate
    response in this file. Use the command `make test-quiz' to see if
    all of your answers are correct.
  - For sections marked CODE, complete the code indicated. Use the
    command `make test-code' to check if your code is complete.
  - DO NOT CHANGE any parts of this file except the QUIZ sections as it
    may interfere with the tests otherwise.
  - If your `QUESTIONS.txt' file seems corrupted, restore it by copying
    over the `QUESTIONS.txt.bk' backup file.
  - When you complete the exercises, check your answers with `make test'
    and if all is well, create a zip file with `make zip' and upload it
    to Gradescope. Ensure that the Autograder there reflects your local
    results.
  - IF YOU WORK IN A GROUP only one member needs to submit and then add
    the names of their group.


QUIZ Problem 1: Questions on switch_stdout.c
============================================

  Analyze the `switch_stdout.c' program. Compile and run it via
  ,----
  | > make switch_stdout
  | ...
  | > ./switch_stdout
  | ...
  `----

  Analyze the code and focus your attention on the use of `open() /
  dup() / dup2()' which this program demonstrates.

  Answer the following Questions about the techniques used in this
  program. You may need to consult the Manual Page / Documentation on
  some functions to answer confidently.


Program Output
~~~~~~~~~~~~~~

  Which of the following is the output for `switch_stdout' when run?
  (each of 1. 2. 3. appear on separate lines in the output)
  - ( ) 1. Now you see me. 2. Now you don't!  3. How mysterious...
  - ( ) 1. Now you see me. 2. Now you don't!
  - ( ) 1. Now you see me. 3. How mysterious...
  - ( ) 1. Now you see me.


open() system call
~~~~~~~~~~~~~~~~~~

  The `open()' system call is used to open a file for writing in the
  example.  What is returned by this system call?
  - ( ) A `FILE *' which is passed to subsequent I/O operations or
    `NULL' for failure
  - ( ) An integer file descriptor which is >= 0 for success and -1 for
    failure
  - ( ) An integer return code that is 1 for success and 0 for failure
  - ( ) A `char *' which is the name of the opened file or `NULL' for
    failure


Use of dup()
~~~~~~~~~~~~

  Which of the following best describes how the `dup()' system call is
  used in `switch_stdout.c'?
  - ( ) It creates a duplicate of a file descriptor allowing standard
    output to be restored to the screen late in the program.
  - ( ) It manipulates the file descriptor table so output that would go
    to the screen goes into a file instead.
  - ( ) It duplicates an existing file creating an efficient copy of it
    on disk.
  - ( ) It creates a child process that prints to a file instead of the
    screen.


Use of dup2()
~~~~~~~~~~~~~

  Which of the following best describes how the `dup2()' system call is
  used in `switch_stdout.c'?
  - ( ) It creates a duplicate of a file descriptor allowing standard
    output to be restored to the screen late in the program.
  - ( ) It manipulates the file descriptor table so output that would go
    to the screen goes into a file instead.
  - ( ) It duplicates an existing file creating an efficient copy of it
    on disk.
  - ( ) It creates a child process that prints to a file instead of the
    screen.


printf() changes in behavior
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Good old `printf()' is used in `switch_stdout.c' in several places but
  seems to change its behavior in some of these spots. Which of the
  following best describes this variation in behavior?
  - ( ) `printf()' is called with different arguments that cause it to
    print to different destinations, sometimes standard output,
    sometimes a file
  - ( ) `printf()' is called the same way in each case but automatically
    begins printing to a file that is `open()''d and when it is
    `close()''d, `printf()' reverts to printing to the screen
  - ( ) `printf()' is called the same in each case and always prints to
    standard output but by changing what is in the file descriptor table
    at that position, output goes to the screen or to a file.


CODE Problem 1: Complete redirect_child.c
=========================================

  An incomplete file called `redirect_child.c' is present in the code
  pack which is intended to perform the following operations.
  1. Create a child process
  2. The child process redirects its output into a file
  3. The child process then exec()'s a new process image which will
     place output in the file specified
  4. The parent process blocks until the child is complete and then
     shows the output file.

  A skeleton is provided but a number of TODO/??? items must be filled
  in to complete the program.  A number of techniques are required to
  complete the code:
  - Process manipulation from previous labs
  - I/O redirection demonstrated in this lab in `switch_stdout.c'

  Employ these to complete the program and pass the tests available in
  `make test-code'.

  A correct run looks like the following.
  ,----
  | > make
  | gcc -Wall -Werror -g  -o switch_stdout switch_stdout.c
  | gcc -Wall -Werror -g  -o redirect_child redirect_child.c
  | 
  | > ./redirect_child 
  | usage: ./redirect_child <childfile>
  | 
  | > ./redirect_child childout.txt
  | Removing file 'childout.txt' prior to run via a subshell
  | Creating a child to do 'wc'
  | Parent waiting for child to complete
  | Child redirecting output to 'childout.txt', then exec()'ing
  | Child complete, return code 0
  | Showing output of 'childout.txt' via 'cat' command
  | 25 25 66 nums.txt
  | 
  | > cat childout.txt 
  | 25 25 66 nums.txt
  `----


QUIZ PROBLEM 2: Questions on stat_demo.c
========================================

  Analyze the `stat_demo.c' program. Compile and run it via
  ,----
  | >> make stat_demo
  | ...
  | >> ./stat_demo somefile.txt
  | ...
  | >> ./stat_demo a_dirname/
  `----

  Finally, you can contrast the behavior of `stat_demo.c' to the shell
  command `stat' which provides similar functionality.

  Answer the following questions about how the system call works.


File Access
~~~~~~~~~~~

  Which of the following best describes how the `access()' system call
  is used?
  - ( ) It alters read/write access to a file by adjusting its mode bits
  - ( ) It is called to determine if a file exists using the `F_OK' flag
  - ( ) It ensures a file exists and can be accessed by creating it if
    necessary
  - ( ) It must be called before `stat()' in order to initialize a
    `struct stat' data structure


File Size
~~~~~~~~~

  How does `stat()' report the Size of a file?
  - ( ) An integer is passed to the call to be set to the size
  - ( ) The return value is the number of bytes in the file
  - ( ) The size is printed to standard output during the function call
  - ( ) The field `sb.st_size' contains the number of bytes in file


File Kind
~~~~~~~~~

  `stat()' reports the "kind" of file being queried. How is this
  information used in `stat_demo.c'?
  - ( ) The field `sb.st_filetype' is set to a string like "file" or
    "pipe" to indicate the type of a file and that field can be printed
    directly
  - ( ) An integer passed in as an address to the call is set to the
    type of the file
  - ( ) The kind is encoded in the `sb.st_mode' field of the struct and
    macros are used to distinguish the kind and print an appropriate
    message.
  - ( ) The kind is encoded in the `sb.st_mode' field of the struct and
    functions are used to distinguish the kind and print an appropriate
    message.

  Which one of the following is NOT a file "kind" reported by `stat() /
  lstat()'?
  - ( ) Directory
  - ( ) Binary
  - ( ) Socket
  - ( ) Symbolic Link (symlink)


mtime / ctime
~~~~~~~~~~~~~

  Do some research and determine the difference between the `ctime' and
  `mtime' fields that are reported by `stat() / lstat()'. Which of the
  following best describes this difference:
  - ( ) `mtime' and `ctime' actually always report the same time and
    their redundancy is a historical oddity.
  - ( ) `mtime' indicates the last time that the file was moved from one
    directory to another while `ctime' indicates the last change to the
    data in the file
  - ( ) `mtime' is the last time of modification when data was altered
    in the file while `ctime' is the last access time when data was read
    from the file
  - ( ) `mtime' is when the actual data of a file changes while `ctime'
    is associated with permissions, links, or other meta data changes
    associated with the file.


Additional Items to Observe
~~~~~~~~~~~~~~~~~~~~~~~~~~~

  The functions `ctime()' and `strmode()' both use an interesting
  technique to make it possible to easily produce a printable string for
  situations like those in `stat_demo.c'.  Time permitting, examine the
  source code for `strmode()' in `strmode.c' and discuss with a TA how a
  string is returned but there is no requirement to free that string.

  Finally, like many system calls, `stat()' has a command-line
  equivalent in the form of the `stat' command which prints to the
  screen many of the statistics of a file and is useful at times.
  ,----
  | >> stat A.txt
  |   File: A.txt
  |   Size: 2         	Blocks: 8          IO Block: 4096   regular file
  | Device: 254,1	Inode: 7080409     Links: 1
  | Access: (0644/-rw-r--r--)  Uid: ( 1000/kauffman)   Gid: (  985/   users)
  | Access: 2024-11-07 14:15:55.332936076 -0500
  | Modify: 2024-11-07 14:10:55.332925991 -0500
  | Change: 2024-11-07 14:15:55.329602657 -0500
  |  Birth: 2024-11-06 16:43:41.436780892 -0500
  `----


CODE PROBLEM 2: newer_file.c Program
====================================

  Fill in the template code provided in `newer_file.c'. The intent of
  the program is to ensure that two named files exist and then compare
  their modification times to report which file is older or newer.  This
  will require use of the `access()' and `stat()' system calls as well
  as a `diff_timespec()' function that is provided in the
  template. Below is a demonstration of how the complete program should
  work.

  ,----
  | >> make newer_file              # build
  | gcc -Wall -Werror -g -Og -o newer_file newer_file.c
  | 
  | >> ./newer_file
  | Usage: ./newer_file <file1> <file2>
  | 
  | >> echo A > A.txt               # create 3 files
  | >> echo B > B.txt               
  | >> echo C > C.txt               # newest
  | >> touch -d '-5min' A.txt       # oldest
  | >> touch -d '-3min' B.txt       # middle
  | 
  | >> ./newer_file A.txt B.txt
  | A.txt is OLDER than B.txt
  | 
  | >> ./newer_file C.txt B.txt
  | C.txt is NEWER than B.txt
  | 
  | >> ./newer_file C.txt C.txt
  | C.txt and C.txt are EQUAL in age
  | 
  | >> ./newer_file C.txt X.txt
  | X.txt cannot be accessed
  | 
  | >> ./newer_file Z.txt X.txt
  | Both Z.txt and X.txt cannot be accessed
  `----

6 Submission

Follow the instructions at the end of Lab01 if you need a refresher on how to upload your completed exercise zip to Gradescope.


Author: Chris Kauffman (profk@umd.edu)
Date: 2024-11-11 Mon 14:37