CSCI 4061 HW08: mmap() / Basic Signals and Handlers
- Due: 11:59pm Mon 3/22/2021
- Approximately 0.83% of total grade
- Homework and Quizzes are open resource/open collaboration. You must submit your own work but you may freely discuss HW topics with other members of the class.
CODE DISTRIBUTION: hw08-code.zip
CHANGELOG: Empty
1 Rationale
Files are often stored in "binary format" for efficiency of storage
and access. Rather than more familiar formatted text formats, these
formats require use of binary file I/O to manipulate them, frequently
low level Unix read() / write() calls. They also often require
jumping to different positions in the file which can be done via the
lseek() system call. These are explored in this HW.
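As a rough illustration of the binary I/O described above, the sketch below writes an array of fixed-size structs with write() and later uses lseek() to jump straight to one record before read()-ing it back. The record_t struct, the function names, and the file layout are hypothetical and chosen only for this example; they are not the layout defined in department.h.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical fixed-size record, NOT the format used by the HW codepack
typedef struct {
  int id;
  char name[28];
} record_t;

// Write `count` records to `path` as raw bytes.
// Returns 0 on success, -1 on any error.
int write_records(const char *path, const record_t *recs, int count) {
  int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd == -1) return -1;
  ssize_t nbytes = (ssize_t)(count * sizeof(record_t)); // sizeof() gives bytes per record
  ssize_t written = write(fd, recs, (size_t)nbytes);
  close(fd);
  return written == nbytes ? 0 : -1;
}

// Read record number `index` (0-based) from `path` into *out.
// Returns 0 on success, -1 on error or if the record does not exist.
int read_record(const char *path, int index, record_t *out) {
  int fd = open(path, O_RDONLY);
  if (fd == -1) return -1;
  off_t pos = (off_t)index * (off_t)sizeof(record_t);   // byte position of the record
  if (lseek(fd, pos, SEEK_SET) == (off_t)-1) {           // jump there from file start
    close(fd);
    return -1;
  }
  ssize_t nread = read(fd, out, sizeof(record_t));       // bytes land directly in the struct
  close(fd);
  return nread == (ssize_t)sizeof(record_t) ? 0 : -1;
}
```

Because every record has the same size, the position of record i is simply i * sizeof(record_t), which is why no scanning of earlier records is needed.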
A viable alternative to file I/O is to make use of memory mapped files
through mmap(). This utilizes a system call to expose files as a
pointer into operating system managed space which holds parts of the
file in main memory. While equivalent in power to standard I/O,
mmap() avoids the need for intermediate buffers and allows pointer
arithmetic to be used to locate and alter the file.
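A minimal sketch of the idea, assuming a file that holds nothing but raw int values: the file is mapped read-only and then treated as an ordinary array. The function name sum_ints_mmap() and the all-ints file layout are assumptions made for this example and do not come from the HW code.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Sum the ints stored in a binary file by mapping it into memory
// instead of read()-ing it into a buffer.
// Returns 0 and stores the sum in *total on success, -1 on error.
int sum_ints_mmap(const char *path, long *total) {
  int fd = open(path, O_RDONLY);
  if (fd == -1) return -1;

  struct stat sb;                          // fstat() reports the file size in bytes
  if (fstat(fd, &sb) == -1 || sb.st_size == 0) {
    close(fd);
    return -1;
  }
  size_t size = (size_t)sb.st_size;

  // NULL lets the OS pick the address; the result points at the file's bytes
  char *file_bytes = mmap(NULL, size, PROT_READ, MAP_SHARED, fd, 0);
  close(fd);                               // the mapping remains valid after close()
  if (file_bytes == MAP_FAILED) return -1;

  int *nums = (int *)file_bytes;           // mapping is page-aligned, so the cast is safe
  size_t count = size / sizeof(int);
  long sum = 0;
  for (size_t i = 0; i < count; i++) {
    sum += nums[i];                        // plain array indexing, no read() calls
  }

  munmap(file_bytes, size);                // release the mapping when finished
  *total = sum;
  return 0;
}
```

Note that mmap() signals failure with MAP_FAILED rather than NULL, so that is the value to check.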
Signals are one of the simplest forms of communication between
processes. They are essential for management of running programs and
can also be used for other purposes. This HW explores the C function
kill() which sends signals and the signal handler setup function
signal() which can allow signals to be caught and handled. It
assumes basic familiarity with the shell commands kill and pkill
which also send signals primarily to terminate misbehaving processes.
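The following sketch shows one way a parent might send a signal to a child with kill() and then check how the child ended. The function signal_child() is a hypothetical helper written for this illustration; it is not taken from birth_death.c.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Fork a child that just sleeps, send it signal `sig`, then wait for it.
// Returns the number of the signal that terminated the child,
// 0 if the child exited normally, or -1 on error.
// Intended for signals whose default action terminates the process.
int signal_child(int sig) {
  pid_t child = fork();
  if (child == -1) return -1;

  if (child == 0) {            // CHILD: idle until signaled
    sleep(30);
    _exit(0);                  // only reached if no terminating signal arrives
  }

  // PARENT: send the signal to the child's process ID
  if (kill(child, sig) == -1) return -1;

  int status;
  if (waitpid(child, &status, 0) == -1) return -1;

  if (WIFSIGNALED(status)) {   // true if a signal ended the child
    return WTERMSIG(status);   // which signal it was
  }
  return 0;
}
```

Calling signal_child(SIGTERM) mirrors what the shell command kill does by default, while signal_child(SIGKILL) corresponds to kill -KILL.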
1.1 Associated Reading
Stevens/Rago Ch 3 covers basic I/O functions like read() / write()
as well as lseek() in Ch 3.6. These functions work equally well
for text and binary data.
Stevens/Rago Ch 14 discusses advanced I/O techniques with Ch 14.8
covering mmap() for creating a memory mapped file. Optionally,
Bryant and O'Hallaron's "Computer Systems: A Programmer's Perspective"
also has some coverage of mmap() in section 9.8.4. This textbook is
mentioned as it is the required text for CSCI 2021, a prerequisite to
CSCI 4061.
Ch 10 of Stevens/Rago discusses basics of signals and signal handlers.
1.2 Grading Policy
Credit for this HW is earned by taking the associated Quiz, which is
linked under Gradescope. The quiz will ask questions similar to
those in the QUESTIONS.txt file, and students who complete all
answers in QUESTIONS.txt should have no trouble with the quiz.
See the full policy in the syllabus.
2 Codepack
The codepack for the HW contains the following files:
File | Description |
---|---|
QUESTIONS.txt | Questions to answer |
binfiles-mmap/ | Directory for Problems 1-2 |
Makefile | Makefile to build Problem 1-2 programs |
department.h | Header file for programs |
make_dept_directory.c | Problem 1-2 program to create data file |
cse_depts.dat.bk | Backup of data file created in Problem 1-2 |
print_department_read.c | Problem 1 program to analyze |
print_department_mmap.c | Problem 2 program to analyze |
signals/ | Problem 3 directory |
circle_of_life.c | Problem 3 code to analyze |
birth_death.c | Problem 3 code to analyze |
no_interruptions.c | File with signal handler for Problem 3 |
3 What to Understand
Ensure that you understand
- How data in files can be directly read() into arrays and structs
- Use of the lseek() system call to move to a desired byte position in a file
- Use of mmap() to create a memory mapped file for reading
- How to send signals to other processes with the kill() function
- How a process can detect whether a child was signaled and whether the signal terminated it
- How processes can set up simple signal handlers and which signals cannot be handled
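The last item can be illustrated with a small sketch, assuming a handler that merely counts the signals it receives. All names below (count_signal, install_handlers, can_handle, signals_seen_count) are hypothetical and written for this example; they are not the code in no_interruptions.c. The sketch also probes the fact that SIGKILL and SIGSTOP cannot be caught: signal() reports SIG_ERR for them.

```c
#include <signal.h>

// Counter updated from the handler; sig_atomic_t is safe to modify there
static volatile sig_atomic_t signals_seen = 0;

// Handler: runs in place of the default action for the signal
static void count_signal(int signum) {
  (void)signum;                // signal number unused in this sketch
  signals_seen++;
}

// Install the handler for SIGINT and SIGTERM.
// Returns 0 on success, -1 if either installation fails.
int install_handlers(void) {
  if (signal(SIGINT, count_signal) == SIG_ERR) return -1;
  if (signal(SIGTERM, count_signal) == SIG_ERR) return -1;
  return 0;
}

// Report whether a handler can be installed for `sig`.
// Returns 1 if so, 0 if signal() refuses (e.g. SIGKILL, SIGSTOP).
int can_handle(int sig) {
  void (*old)(int) = signal(sig, count_signal);
  if (old == SIG_ERR) return 0;
  signal(sig, old);            // restore the previous disposition
  return 1;
}

// Number of signals the handler has counted so far
int signals_seen_count(void) {
  return (int)signals_seen;
}
```

After install_handlers() succeeds, pressing Ctrl-C or running pkill on the process would increment the counter rather than end the program, which is why a stronger signal such as KILL is needed to stop such a process.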
4 Questions
                           _________________

                            HW 08 QUESTIONS
                           _________________


- Name: (FILL THIS in)
- NetID: (THE kauf0095 IN kauf0095@umn.edu)

Write your answers to the questions below directly in this text file.
HW quiz questions will be related to the questions in this file.


PROBLEM 1: Binary File Format w/ Read
=====================================

A
~

  Compile all programs in the directory `binfiles-mmap/' with the
  provided `Makefile'. Run the command
  ,----
  | ./make_dept_directory cse_depts.dat
  `----
  to create the `cse_depts.dat' binary file. Examine the source code for
  this program along with the header `department.h'.
  - What system calls are used in `make_dept_directory.c' to create this
    file?
  - How is the `sizeof()' operator used to simplify some of the
    computations in `make_dept_directory.c'?
  - What data is in `cse_depts.dat' and how is it ordered?


B
~

  Run the `print_department_read' program which takes a binary data file
  and a department code to print. Show a few examples of running this
  program with valid command line arguments. Include demo runs that
  - Use the `cse_depts.dat' file with known and unknown department codes
  - Use a file other than `cse_depts.dat'


C
~

  Study the source code for `print_department_read' and describe how it
  initially prints the table of offsets shown below.
  ,----
  | Dept Name: CS Offset: 104
  | Dept Name: EE Offset: 2152
  | Dept Name: IT Offset: 3688
  `----
  What specific sequence of calls leads to this information?


D
~

  What system call is used to skip immediately to the location in the
  file where desired contacts are located? What arguments does this
  system call take? Consult the manual entry for this function to find
  out how else it can be used.


PROBLEM 2: mmap() and binary files
==================================

  An alternative to using standard I/O functions is "memory mapped"
  files through the system call `mmap()'. The program
  `print_department_mmap.c' provides the same functionality as the
  previous `print_department_read.c' but uses a different mechanism.


(A)
~~~

  Early in `print_department_mmap.c' an `open()' call is used as in the
  previous program but it is followed shortly by a call to `mmap()' in
  the lines
  ,----
  | char *file_bytes =
  |   mmap(NULL, size, PROT_READ, MAP_SHARED,
  |        fd, 0);
  `----
  Look up reference documentation on `mmap()' and describe some of the
  arguments to it including the `NULL' and `size' arguments. Also
  describe its return value.


(B)
~~~

  The initial setup of the program uses `mmap()' to assign a pointer to
  variable `char *file_bytes'. This pointer will refer directly to the
  bytes of the binary file.

  Examine the lines
  ,----
  | ////////////////////////////////////////////////////////////////////////////////
  | // CHECK the file_header_t struct for integrity, size of department array
  | file_header_t *header = (file_header_t *) file_bytes; // binary header struct is first thing in the file
  `----

  Explain what is happening here: what value will the variable `header'
  get and how is it used in subsequent lines?


(C)
~~~

  After finishing with the file header, the next section of the program
  begins with the following.
  ,----
  | ////////////////////////////////////////////////////////////////////////////////
  | // SEARCH the array of department offsets for the department named
  | // on the command line
  |
  | dept_offset_t *offsets =           // after file header, array of dept_offset_t structures
  |   (dept_offset_t *) (file_bytes + sizeof(file_header_t));
  |
  `----

  Explain what value the `offsets' variable is assigned and how it is
  used in the remainder of the SEARCH section.


(D)
~~~

  The final phase of the program begins below
  ,----
  | ////////////////////////////////////////////////////////////////////////////////
  | // PRINT out all personnel in the specified department
  | ...
  | contact_t *dept_contacts = (contact_t *) (file_bytes + offset);
  `----
  Describe what value `dept_contacts' is assigned and how the final
  phase uses it.


PROBLEM 3: `birth_death.c'
==========================

A
~

  Compile `circle_of_life.c' to the program `circle_of_life' and run
  it. Examine the results and feel free to terminate execution
  early. Examine the source code if desired though it is merely a
  print/sleep loop.

  Compile `birth_death.c' to the program `birth_death'. This program is
  invoked with two arguments, another program name and a "lifetime"
  which is an integer number of seconds. Run it like
  ,----
  | $> ./birth_death ./circle_of_life 4
  `----
  and show the output below.


B
~

  Examine the source code for `birth_death.c' and determine the system
  call the parent program (`birth_death') uses to send signals to the
  child program. Paste this line below and explain which signal is being
  sent.


C
~

  `birth_death.c' waits for a child to finish then outputs what signal
  caused it to be terminated if that was the cause of death. Paste the
  lines of code which determine if a child was terminated due to a
  signal below and mention the macros used for this purpose.


D
~

  Compile the program `no_interruptions.c' and run it with
  `birth_death'. Show your results below.

  Note that you may need to send signals to `no_interruptions' to
  forcibly end it. The `pkill' command is useful for this as in
  ,----
  | pkill no_inter        # send TERM signal to proc name matching "no_inter"
  | pkill -KILL no_inter  # send KILL signal to proc name matching "no_inter"
  `----


E
~

  Examine the `no_interruptions.c' code and describe how it is able to
  avoid being killed when receiving the interrupt and TERM signals. Show
  the lines of code used to accomplish this signal handling.