CMSC216 HW11: Memory Mapping, pmap, Threads, and Worms
Homeworks are optional practice with no submission and no credit
CODE DISTRIBUTION: hw11-code.zip
CHANGELOG: Empty
Table of Contents
1 Rationale
On modern computing systems, virtual memory creates the illusion that
every program has a linear address space from 0 to some large
address. Mostly this happens behind the scenes and is managed by the
operating system but knowledge of presence of virtual addresses
provides insight into many aspects of practical programming. One can
inspect some of the OS information on the virtual address space of a
program using utilities such as pmap.
Basic working knowledge of a threading library is an essential element in any system programmer's toolkit. This HW is focused on the basics of the pthreads (POSIX Threads) library due to its widespread availability. The provided code introduces basic thread usage via a simple visual demo where threads run at different rates. Basic mechanics such as passing threads initialization arguments and utilizing mutexes to avoid race conditions are demonstrated. Completing the lab will give basic knowledge of the pthreads library and use of mutexes to coordinate multiple threads.
Associated Reading / Preparation
Bryant and O'Hallaron: Ch 9 on Virtual Memory is informative for it's
coverage virtual memory in general. The mmap() function is discussed
in section 9.8.4. The overview of virtual memory is useful to
understand the output of pmap.
Bryant and O'Hallaron: Ch 12.3-6 covers threaded programming. Ch 12.5 discusses coordinating threads with Semaphores which are nearly identical in their use there as the Mutexes that are used in the provided code.
While not the focus, the provided code also utilizes Software Signals to terminate cleanly, covered in B&O Ch 8.5.
Grading Policy
Homework provides optional extra practice for students looking for
it. No submission is collected and the HW is not worth any course
credit. Solutions will be posted some time after the HW is posted.
Students approaching course staff for additional practice problems
will be asked to explain their own written answers to the
QUESTIONS.txt documents in HWs before any additional problems will
be constructed.
2 Codepack
The codepack for the lab contains the following files:
| File | Description |
|---|---|
QUESTIONS.txt |
Questions to answer |
memory-parts/ |
Problem 1 Directory |
Makefile |
Makefile to build Problem 1 programs |
memory_parts.c |
Problem 1 program to analyze |
gettysburg.txt |
Problem 1 data file |
pthread-worms/ |
Problem 2 Directory |
worms_pthread.c |
Problem 2 Code to analyze |
worms.py |
Problem 2 Optional file to examine |
3 Questions
Analyze these files and answer the questions given in QUESTIONS.txt.
_________________
HW 11 QUESTIONS
_________________
HWs are optional practice and there is no submission.
PROBLEM 1: Virtual Memory and pmap
==================================
Code for this problem is in the `memory-parts' subdirectory.
(A) memory_parts memory areas
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Examine the source code for the provided `memory-parts/memory_parts.c'
program. Identify what region of program memory you expect the
following variables to be allocated into:
- global_arr[]
- stack_arr[]
- heap_arr
(B) Running memory_parts and pmap
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Compile the `memory_parts' using the provided Makefile.
,----
| >> make
| gcc -Wall -g -Og -o memory_parts memory_parts.c
`----
Run the program and note that it prints several pieces of information
- The addresses of several of the variables allocated
- Its Process ID (PID) which is a unique number used to identify the
running program. This is an integer.
For example, the output might be
,----
| >> ./memory-parts
| 0x556e124d31e9 : main()
| 0x556e124d60a0 : global_arr
| 0x556e2f3ca010 : heap_arr
| 0x600000000000 : mmap'd block1
| 0x600000001000 : mmap'd block2
| 0x7f75897b6000 : mmap'd block3
| 0x7f75897b5000 : mmap'd file
| 0x7ffe7b9c5d40 : stack_arr
| my pid is 2760199
| press any key to continue
`----
so the programs PID is 2760199
The program will also stop at this point until a key is pressed. DO
NOT PRESS A KEY YET.
Open another terminal and type the following command in that new
terminal.
,----
| >> pmap THE-PID-NUMBER-THAT-WAS-PRINTED-EARLIER
`----
Paste the output of pmap below.
(C) Program Addresses vs Mapped Addresses
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pmap prints out the virtual address space table for the program. The
leftmost column is a virtual address mapped by the OS for the program
to some physical location. The next column is the size of the area of
memory associated with that starting address. The 3rd column contains
permissions of the program has for the memory area: r for read, w for
read, x for execute. The final column is contains any identifying
information about the memory area that pmap can discern.
Compare the addresses of variables and functions from the paused
program to the output. Try to determine the virtual address space in
which each variable resides and what region of program memory that
virtual address must belong to (stack, heap, globals, text). In some
cases, the identifying information provided by pmap may make this
obvious.
(D) Min Size of Mapped Areas
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The minimum size of any virtual area of memory appears to be 4K. Why
is this the case?
(E) Additional Observations
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Notice that in addition to the "normal" variables that are mapped,
there are entries for
- mmap'd block1
- mmap'd block2
- mmap'd block3
- mmap'd file
in the memory_parts output. Each of these is created with the `mmap()'
system call.
Find the corresponding entries in the output of pmap and report them
below. Note anything that is shown in the code calls to `mmap()' that
affect the virtual addresses chosen for each of these and the size of
the area mapped.
The Many Uses of mmap()
~~~~~~~~~~~~~~~~~~~~~~~
The mmap() function is useful for a variety of purposes as it allows
direct mapping of virtual addresses for use by a program. It is used
here to
1. Obtain anonymous memory blocks
2. Get access a file's contents through a memory mapped pointer
Labs and projects will build on #1 above by showing how anonymous
blocks can be used to create expandable array data structures or
create the heap for malloc() implementations. #2 is also worth of
exploring: do some research on "memory mapped files" to turn up
tutorials on this (or chat up a certain friendly 216 professor if you
want some more problems on the topic).
PROBLEM 2: Setup
================
A
~
Compile and run `worm_pthread.c' using the provided Makefile. Make
sure to follow the program prompts. Describe what you observe as the
program runs on the terminal.
Note: when compiling code that uses PThreads library functions, link
against that library via `-lpthread'.
B
~
Examine the code and describe the types data associated with each of
the following elements in the program.
- The "board" which has several pieces of data
- The "worms" which have several pieces of data
C
~
Describe the area of code in `main()' which creates threads and awaits
them finishing.
- Describe the function which is used to start a thread running and
its arguments. Consult the manual for documentation on this function
if needed.
- Describe the function which is used to wait until a thread finishes.
PROBLEM 2: The Worms
====================
A
~
Examine the `worm_func()' function carefully. You may wish to
consider the specific data passed to these worms which are in the
array of `wormparam_t' structs near the top of the source code.
Describe the basic algorithm that worms follow to do their work.
B
~
Describe how worms avoid both claiming the same piece of
territory. What system call mechanism is used to ensure that two worms
cannot claim the same area of the board? Describe the function calls
that are used to accomplish this and what functions in `main()' are
used to set up this coordination mechanism.
C
~
Describe the logic that appears to cause worms to be 'fast' and
'slow': how is this artificial speed actually created in the
`worm_func()' code.
While the speed differences of worms is an intended creation of the
program, speculate as to what use threads can be when dealing with
entities of different speeds.
Optional: The Printer
=====================
The printing thread runs `print_func()'. This does not introduce any
new concurrency techniques. However it does use some special display
tricks, printing extremely strange output like
,----
| printf("\33[s"); // save cursor position
| printf("\33[?25l"); // hide cursor to avoid flicker
| printf("\33[u"); // restore cursor position
`----
This type of output also appears in some other places in the program.
The strange looking characters are special sequences for *terminal
control*. The program interpreting output can be manipulated in
various ways to change the cursor position, color of text, and so
forth. This is an extremely old set of conventions that harkens back
to a day when these special sequences would physically manipulate
aspects of the machine in which they appeared. Modern "terminal
emulators" such as Gnome Terminal or XTerm which show a command line
honor these conventions making it possible for programs to display
information in the same position as is done here or take over the
entire screen of the terminal such as vi and gdb do.
Discussion of terminal control is beyond the scope of our course but
does receive some treatment in Stevens/Rago and elsewhere. You can
read more about the history and insanity of terminal control and the
escape sequences that give some control over them in
<https://en.wikipedia.org/wiki/ANSI_escape_code>
To program a more fully-featured terminal program, use a robust
terminal control library such as ncurses.
<https://en.wikipedia.org/wiki/Ncurses> This library is used in the
below Python version of the worms program.
Problem Optional Enrichment: Threads in Python
==============================================
Threads are not unique to C and the Pthreads library. Most modern
programming environments include some version of them. The prototype
version of the worms program was written in Python and can be found in
`worms.py' and can be run via
,----
| > ./worms.py
`----
- No territory or step numbers are tracked in the Python version; it
runs until Ctrl-c is pressed
- The `curses' library is used to manipulate the terminal screen which
allows for boxes and color to be used but complicates the
implementation somewhat.
- Python features several types of mutual exclusion locks including
one used here that can be re-locked by the same thread without
negative effects.
- Note that Python threads are not quite like the PThreads common in
C; they will not have the same recognition by the operating system
so will not lead to speedup in number crunching endeavors; however
they are just fine for tracking concurrent tasks like the worms
which are not compute intensive
Time permitting, explore the code of `worms.py' and draw some
parallels to the C code.