CSCI 2021 HW13: pmap and Linking
- Due: 11:59pm Tue 13-Dec 2022
- Approximately 0.83% of total grade
- Homework and Quizzes are open resource/open collaboration. You must submit your own work but you may freely discuss HW topics with other members of the class.
CODE DISTRIBUTION: hw13-code.zip
CHANGELOG: Empty
1 Rationale
On modern computing systems, virtual memory creates the illusion that
every program has a linear address space from 0 to some large
address. Mostly this happens behind the scenes and is managed by the
operating system, but knowledge of the presence of virtual addresses
provides insight into many aspects of practical programming. One can
inspect some of the OS information on the virtual address space of a
program using utilities such as pmap.
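As a rough illustration of where this information comes from: on Linux, each process's mappings are listed in the text file /proc/<pid>/maps, one line per mapped area, beginning with a hexadecimal `start-end` address range. The sketch below assumes that Linux-specific format; `find_mapping` is a hypothetical helper written for this illustration and is not part of the HW codepack.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the codepack): scan a file in the Linux
   /proc/<pid>/maps format and copy the line whose address range
   contains addr into out.
   Returns 1 if a line was found, 0 if none matched, -1 if the file
   could not be opened. */
int find_mapping(const char *maps_path, const void *addr,
                 char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);

    FILE *fp = fopen(maps_path, "r");
    if (fp == NULL) {
        return -1;
    }

    uintptr_t target = (uintptr_t) addr;
    char line[4352];               /* room for a long path name */
    int found = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        unsigned long start, end;
        /* Each line begins with "start-end" in hexadecimal */
        if (sscanf(line, "%lx-%lx", &start, &end) != 2) {
            continue;
        }
        if (target >= start && target < end) {
            line[strcspn(line, "\n")] = '\0';   /* drop the newline */
            snprintf(out, out_len, "%s", line);
            found = 1;
            break;
        }
    }

    fclose(fp);
    return found;
}
```

Calling `find_mapping("/proc/self/maps", &some_local, buf, sizeof buf)` from a running program should fill `buf` with the mapping line that covers that local variable. The pmap utility presents the same kind of information in a friendlier layout.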
The linker is a little-discussed portion of a typical compiler chain, but it is a frequent source of frustrating compilation errors when dealing with code libraries. This HW covers the basics of linking to system libraries and how this affects the virtual memory image of the resulting program.
Associated Reading / Preparation
Bryant and O'Hallaron: Ch 9 on Virtual Memory is informative for
Problem 1. The mmap() function is discussed in section 9.8.4. Bryant
and O'Hallaron Ch 7 on Linking and ELF formats is pertinent to the
second problem.
Grading Policy
Credit for this HW is earned by taking the associated HW Quiz, which is
linked under Gradescope. The quiz will ask questions similar to those
in the QUESTIONS.txt file, and those who complete all answers in
QUESTIONS.txt should have no trouble with the quiz.
See the full policies in the course syllabus.
2 Codepack
The codepack for the HW contains the following files:
File | State | Description |
---|---|---|
QUESTIONS.txt | EDIT | Questions to answer |
Makefile | Provided | Makefile to build programs for the HW |
memory_parts.c | Provided | Problem 1 program to analyze |
gettysburg.txt | Provided | Problem 1 data file |
do_math.c | Provided | Problem 2 program to compile and link |
do_pthreads.c | Provided | Problem 2 program to compile and link |
3 What to Understand
Ensure that you understand
- That program regions like the stack and heap are composed of virtual addresses that the OS maps to physical locations
- Basic compiler options to link against standard libraries like the math library
- How the nm command can show defined and undefined symbols in an executable
- How to use ldd to show which libraries an executable dynamically depends upon
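When reading nm output, it can help to know that each line carries a one-letter symbol type: `U` marks an undefined symbol that must be supplied by another object file or library, while letters such as `T`, `D`, and `B` mark symbols defined in the text, data, and BSS sections. Lowercase letters generally indicate local symbols. The following is a minimal sketch of decoding that column. The function names `nm_type_letter` and `nm_symbol_kind` are made up for this illustration, and only a handful of the type letters are covered.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extract the type letter from one line of
   default nm output. Undefined symbols have no address column
   ("U name"); defined symbols look like "address T name".
   Returns '\0' if no type letter is recognized. */
char nm_type_letter(const char *line)
{
    assert(line != NULL);

    char first[256], second[256];
    int n = sscanf(line, "%255s %255s", first, second);

    if (n >= 1 && strlen(first) == 1) {
        return first[0];           /* no address: "U name" */
    }
    if (n == 2 && strlen(second) == 1) {
        return second[0];          /* "address T name" */
    }
    return '\0';
}

/* Hypothetical helper: describe a few common nm type letters.
   Case is ignored here, so local and global symbols are described
   the same way. */
const char *nm_symbol_kind(char type)
{
    switch (toupper((unsigned char) type)) {
    case 'U': return "undefined";
    case 'T': return "text (code)";
    case 'D': return "initialized data";
    case 'B': return "uninitialized data (BSS)";
    case 'R': return "read-only data";
    case 'W': return "weak";
    default:  return "other";
    }
}
```

A line reporting type `U` for a function such as `pow` indicates that the executable itself does not define that function and expects a library to provide it.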
4 Questions
Analyze the files in the provided codepack and answer the questions
given in QUESTIONS.txt.
                           _________________

                            HW 13 QUESTIONS
                           _________________


- Name: (FILL THIS in)
- NetID: (THE kauf0095 IN kauf0095@umn.edu)

Write your answers to the questions below directly in this text file.
Submit the whole text file while taking the associated quiz.


PROBLEM 1: Virtual Memory and pmap
==================================

(A) memory_parts memory areas
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Examine the source code for the provided `memory_parts.c'
  program. Identify what region of program memory you expect the
  following variables to be allocated into:
  - global_arr[]
  - stack_arr[]
  - heap_arr


(B) Running memory_parts and pmap
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Compile `memory_parts' using the provided Makefile.
  ,----
  | > make memory_parts
  `----
  Run the program and note that it prints several pieces of
  information
  - The addresses of several of the variables allocated
  - Its Process ID (PID) which is a unique number used to identify the
    running program. This is an integer.
  For example, the output might be
  ,----
  | > ./memory_parts
  | 0x5605a7c271e9 : main()
  | 0x5605a7c2a0c0 : global_arr
  | 0x7ffe5ff7d600 : stack_arr
  | 0x5605a92442a0 : heap_arr
  | 0x7f1fa7303000 : mmap'd file
  | 0x600000000000 : mmap'd block1
  | 0x600000001000 : mmap'd block2
  | my pid is 8406
  | press any key to continue
  `----
  so the program's PID is 8406

  The program will also stop at this point until a key is pressed. DO
  NOT PRESS A KEY YET.

  Open another terminal and type the following command in that new
  terminal.
  ,----
  | > pmap THE-PID-NUMBER-THAT-WAS-PRINTED-EARLIER
  `----
  Paste the output of pmap below.


(C) Program Addresses vs Mapped Addresses
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  pmap prints out the virtual address space table for the program. The
  leftmost column is a virtual address mapped by the OS for the
  program to some physical location. The next column is the size of
  the area of memory associated with that starting address. The 3rd
  column contains the permissions the program has for the memory area:
  r for read, w for write, x for execute. The final column contains
  any identifying information about the memory area that pmap can
  discern.

  Compare the addresses of variables and functions from the paused
  program to the output. Try to determine the virtual address space in
  which each variable resides and what region of program memory that
  virtual address must belong to (stack, heap, globals, text). In some
  cases, the identifying information provided by pmap may make this
  obvious.


(D) Min Size of Mapped Areas
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  The minimum size of any virtual area of memory appears to be 4K. Why
  is this the case?


(E) Additional Observations
~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Notice that in addition to the "normal" variables that are mapped,
  there is also an entry for the mmap()'d file 'gettysburg.txt' in the
  virtual address table. The mmap() function is explored in the next
  problem but note its calling sequence which involves use of a couple
  of system calls:
  1. `open()' which is a low level file opening call which returns a
     numeric file descriptor.
  2. `fstat()' which obtains information such as size for an open file
     based on its numeric file descriptor. The `stat() / fstat()'
     system calls are used to ask the Unix operating system for
     information about files such as their size, modification times,
     and access permissions. This system call is studied more in
     Operating System courses.

  Finally there are additional calls to `mmap()' which allocate memory
  to the program at a specific virtual address. Similar code to this
  is often used to allocate and expand the heap area of memory for
  programs in implementations of `malloc()'.


PROBLEM 2: Linking to System Libraries
======================================

(A)
~~~

  The file `do_math.c' contains some basic usage of the C library math
  functions like `pow()'. Compile this program using the command line
  ,----
  | > gcc do_math.c
  `----
  and show the results below which should be problematic. Describe why
  the linker complains about functions like `cos' and `pow'.

  *Note*: problems will arise on Linux systems with gcc: other
  OS/compiler combinations may not cause any problems.


(B)
~~~

  In order to fix this problem, one must link the program against the
  math library typically called `libm'. This can be done with the
  option `-l' for "library" and `m' for the math library as shown:
  ,----
  | > gcc do_math.c -lm
  `----
  Show a run of the resulting executable after a successful compile
  below.


(C)
~~~

  After successfully compiling `do_math.c', use the `ldd' command to
  examine which dynamically linked libraries it requires to
  run. Assuming the executable is named `a.out', invoke the command
  like this
  ,----
  | > ldd a.out
  `----
  Show the output for this command and note anything related to the
  math library that is reported.


(D)
~~~

  Run the program which should report its Process ID (pid) before
  pausing. In a separate terminal, while the program is still running,
  execute the pmap command to see the virtual address space for the
  program (command `pmap <pid>'). Paste the results below and describe
  any relation to the math library that is apparent.


(E)
~~~

  Repeat the general steps above with the C file `do_pthreads.c' which
  will require linking to the PThreads library with `-lpthread'.
  - Compile to show error messages
  - Compile successfully with proper linking and show output
  - Call `ldd' on the executable
  - While the program is paused, run `pmap' to see its virtual address
    space
  Show the output of these commands below.