CMSC216 Lab03: Basic File I/O and Valgrind Debugging
- Due: 11:59pm Sun 15-Feb-2026 on Gradescope
- Approximately 1.00% of total grade
CODE DISTRIBUTION: lab03-code.zip
CHANGELOG: Empty
Table of Contents
1 Rationale
Reading basic data from a file is essential in any programming
environment. C provides the standard fscanf() function to read
formatted data. This lab demonstrates its use and provides some
techniques to read simple, structured data from files into dynamically
allocated structures. This lab also demonstrates one type of struct
referring to an array of another type of struct which is a common
pattern that often appears in class projects. Mastering the syntax
associated with this situation is an essential C-programming skill.
Memory problems are inevitable in C programs and learning how to effectively identify and resolve them is an essential skill for low-level development. Valgrind is a valuable ally when diagnosing memory problems as it gives very good clues about the nature of the errors so long as one can interpret its error messages. Part of this Lab focuses on fixing existing code where error messages such as "Invalid Writes" and "120 bytes Definitely Lost" are abundant. Completing the lab will give solid practice on sleuthing out memory bugs using such error messages.
Associated Reading
Most good C references will cover basic file I/O and uses of functions
like fscanf(). K&R's "The C Programming Language" does so as do
many of the C references listed on the course Canvas page.
Valgrind will be discussed in lecture but those looking for a quick reference might consult Valgrind's Explanation of error messages for additional information on how to read its tea leaves.
Grading Policy
Credit for this lab is earned by completing the code/answers in the
Lab codepack and submitting a Zip of the work to Gradescope preferably
via running make submit. Students are responsible to check that the
results produced locally are reflected on Gradescope after submitting
their completed Zip.
Lab Exercises are Free Collaboration and students are encouraged to cooperate on labs. Students may submit work as groups of up to 5 to Gradescope: one person submits then adds the names of their group members to the submission.
No late submissions are accepted for Lab work but the lowest two lab scores for the semester will be dropped including zeros due to missing submissions. See the full policies in the course syllabus.
2 Codepack
The codepack for this lab is linked at the top of this document. Always download it and unzip/unpack it. It should contain the following files which are briefly described.
| File | Use | Description |
|---|---|---|
treasure_main.c |
EDIT | Problem 1 C file to edit and complete; TODO sections are marked in the code |
treasure.h |
Provided | Problem 1 header file declaring data types and prototypes |
map1.tm |
Data | Problem 1 Data to read in treasure_main.c |
map2.tm |
Data | Problem 1 Data to read in treasure_main.c |
test_treasure_main.org |
Testing | Problem 1 CODE tests, used in make test-prob1 |
mon_main.c |
EDIT | Problem 2 C file to edit and complete; contains bugs that must be fixed |
mon-top5.txt |
Data | Problem 2 Data to read in mon_main.c |
map-all.txt |
Data | Problem 2 Data to read in mon_main.c |
mon-*.txt |
Data | Problem 2 Additional data files |
test_mon_main.org |
Testing | Problem 2 CODE tests, used in make test-prob2 |
QUESTIONS.txt |
EDIT | Instructions and Questions to answer: fill in the multiple choice selections in this file. |
Makefile |
Build | Enables make test and make zip |
QUESTIONS.txt.bk |
Backup | Backup copy of the original file to help revert if needed |
QUESTIONS.md5 |
Testing | Checksum for answers in questions file |
test_quiz_filter |
Testing | Filter to extract answers from Questions file, used in testing |
test_lab03.org |
Testing | Tests for this lab |
testy |
Testing | Test running scripts |
gradescope-submit |
Misc | Allows submission to Gradescope from the command line |
3 QUESTIONS.txt File Contents
Below are the contents of the QUESTIONS.txt file for the lab.
Follow the instructions in it to complete the QUIZ and CODE questions
for the lab.
_________________
LAB03 QUESTIONS
_________________
Exercise Instructions
=====================
Follow the instructions below to experiment with topics related to
this exercise.
- For sections marked QUIZ, fill in an (X) for the appropriate
response in this file. Use the command `make test-quiz' to see if
all of your answers are correct.
- For sections marked CODE, complete the code indicated. Use the
command `make test-code' to check if your code is complete.
- DO NOT CHANGE any parts of this file except the QUIZ sections as it
may interfere with the tests otherwise.
- If your `QUESTIONS.txt' file seems corrupted, restore it by copying
over the `QUESTIONS.txt.bk' backup file.
- When you complete the exercises, check your answers with `make test'
and if all is well. Create a zip file and submit it to Gradescope
with `make submit'. Ensure that the Autograder there reflects your
local results.
- IF YOU WORK IN A GROUP only one member needs to submit and then add
the names of their group on Gradescope.
PROBLEM 1: Treasuremap I/O Program
==================================
Staff will discuss some aspects of the provided treasure_main.c
file. This file has some blanks marked with ???? which need to be
filled in. It also demonstrates several important techniques that are
common in C program.
When the application is complete, it will run as shown below on the
provided map1.tm and map2.tm files.
,----
| > make # compile
| gcc -Wall -Werror -g -c treasure_main.c
| gcc -Wall -Werror -g -o treasure_main treasure_main.o treasure.h
|
| > ./treasure_main # run with no file provided
| usage: ./treasure_main <treasure_file.tm>
|
| > ./treasure_main map1.tm # run on first treasuremap
| Loading treasure map from file 'map1.tm'
| Reading map from file 'map1.tm'
| Allocating map struct
| Map is 7 by 5
| 3 treasures on the map
| Allocating array of treasure locations
| Reading treasures
| Treasure at 0 2 called 'Death_Crystals'
| Treasure at 4 1 called 'Mega_Seeds'
| Treasure at 6 3 called 'Flurbo_stash'
| Completed file, closing
| Returning pointer to heap-allocated treasure_t
|
| ==TREASURE MAP==
| ..A..
| .....
| .....
| .....
| .B...
| .....
| ...C.
| ================
| A: Death_Crystals
| B: Mega_Seeds
| C: Flurbo_stash
|
| Deallocating map
|
|
| > ./treasure_main map2.tm # run on second treasuremap
| Loading treasure map from file 'map2.tm'
| Reading map from file 'map2.tm'
| Allocating map struct
| Map is 9 by 13
| 10 treasures on the map
| Allocating array of treasure locations
| Reading treasures
| Treasure at 5 2 called 'Goblet_of_Fire'
| Treasure at 3 8 called 'Invisibility_Cloak'
| Treasure at 4 11 called 'Elder_Wand'
| Treasure at 8 10 called 'Mirror_of_Erised'
| Treasure at 1 12 called 'Philosophers_Stone'
| Treasure at 7 9 called 'Marauders_Map'
| Treasure at 8 2 called 'Pensieve'
| Treasure at 3 9 called 'Sword_of_Gryffindor'
| Treasure at 7 0 called 'Tom_Riddles_Diary'
| Treasure at 0 11 called 'Time_Turner'
| Completed file, closing
| Returning pointer to heap-allocated treasure_t
|
| ==TREASURE MAP==
| ...........J.
| ............E
| .............
| ........BH...
| ...........C.
| ..A..........
| .............
| I........F...
| ..G.......D..
| ================
| A: Goblet_of_Fire
| B: Invisibility_Cloak
| C: Elder_Wand
| D: Mirror_of_Erised
| E: Philosophers_Stone
| F: Marauders_Map
| G: Pensieve
| H: Sword_of_Gryffindor
| I: Tom_Riddles_Diary
| J: Time_Turner
|
| Deallocating map
`----
PROBLEM 1 QUIZ: File Input in treasure_main.c
=============================================
Make sure to study the code and ask questions to resolve the following
queries about the treasure map application.
Command Line Arguments
----------------------
The `treasure_main.c' program is run with command line parameters like
`> ./treasure_main map1.tm'. How is the file name `map1.tm' made
available in the `main()' function?
- ( ) The `fscanf()' function is used to read the command line as the
user types it in.
- ( ) The global variable `args' will be an array with `args[0]' as
the string `map1.tm'
- ( ) The `argc' parameter will have value 2 and `argv[1]' will be a
pointer to the string `map1.tm'
- ( ) The `getenv()' function is used to extract the command line
arguments from the environment.
Struct Definition
-----------------
The `treasuremap_t' struct is defined in the following location which
shows its fields (parts):
- ( ) In the c file `treasure_main.c'
- ( ) In the header `treasure.h'
- ( ) In the header `stdio.h'
- ( ) Trick question: the definition is read from the user typing
File Format
-----------
One important part of completing the `treasuremap_load()' function
will be to understand the format of the treasuremap files. According
to the documentation for the function and your observations of the
`map1.tm' and `map2.tm' files, which of the best describes the format
of these files.
- ( ) The file looks like how treasure_main prints them, as something
like
,----
| ..A..
| .....
| .....
| .....
| .B...
| .....
| ...C.
`----
and the application should just read the strings in the file into
the struct.
- ( ) The first line has the the size of the treasuremap AND the
number of treasures in the map (N). There are N remaining lines
which have treasure positions/descriptions in them.
- ( ) The first line has the number of rows/columns in the map and
each remaining line is a row/col/description position of a
treasure. The number of treasures is not given and must be detected
using EOF as a return from `fscanf()'
- ( ) The file is stored in binary format and is not easily read by
humans. However, it can be read directly into the struct via
`fread()'.
Nested Struct Syntax
--------------------
During `treasuremap_load()' the following line of code is used to read
in the row for an individual treasure's location.
,----
| fscanf(file_handle, "%d", &tmap->locations[i].row);
`----
Which of the following best describes the syntax used based on the
`treasuremap_t' type?
- ( ) `tmap' is a pointer to a struct so `->' is used to dereference
it to a field; the `locations' field is a pointer to an array so
square braces are used to dereference it. `fscanf()' needs an
address so `&' is used.
- ( ) `tmap' is a normal struct so `->' is used to access its fields;
the `locations' field is a normal array so square braces are used to
dereference it. `fscanf()' needs an address so `&' is used.
- ( ) `tmap' is a pointer so `&' is used to dereference it with `->'
getting its field. The `locations' field is a normal array so
square braces are used to dereference it.
- ( ) Trick Question: The syntax for this line is actually incorrect
and won't compile. It is a TODO item to fix its syntax.
Nested Struct Memory Allocation
-------------------------------
During the program execution, two different memory allocations are
made. Which of the following best describes how allocated memory is
related to the nested nature of the structs?
- ( ) The first allocation creates space for a treasuremap_t
struct. The second creates an array for all treasureloc_t structs
The address of the array is stored in the locations field of the
treasuremap_t struct.
- ( ) The first allocation creates space for an array of treasureloc_t
structs. The second creates a treasuremap_t struct and then connects
it to the array.
- ( ) The first allocation creates space for a treasuremap_t struct
AND the array of treasure locations. The second allocation is
unnecessary and needs to be removed.
- ( ) The current implementation is an incorrect method to declare the
structures, and it is sufficient to create local variables for both
the treasuremap_t struct and an array of treasuremap_loc structs.
Two-Dimensional Array Allocation
--------------------------------
Analyze the beginning of the `treasuremap_print()' function where the
`printspace' variable is established. Which of the following code
patterns is used to create a two-dimensional table of characters that
is used later with syntax like `printspace[i][j] = ...;'?
- ( ) A function called malloc2D(row,col) is used which allocates a
two-dimensional table and assign it printspace
- ( ) A function called calloc(row,col) is used which allocates a
two-dimensional table and assign it printspace
- ( ) First one malloc() is used to allocate all row/column elements,
then a loop is used to assign pointers to positions within the
single large memory block
- ( ) First one malloc() is used to allocate an array of pointers
sized on the height, then a loop is used to malloc() rows sized on
the width with the first array pointing at each row
Character Tricks
----------------
Locate the function `treasuremap_print()' and find the code with the
comment
,----
| // an interesting line
`----
Analyze what is happening with this line and select the answer below
which best describes it.
- ( ) It is accessing a character from an array such as A, B, C... to
display on the treasure map for each treasure.
- ( ) It is using pointer arithmetic to find the location of a
character in memory to display for each treasure.
- ( ) It is adding an integer onto character 'A' to get a new
character to display on the treasure map for each treasure.
- ( ) It is reading the character to display for the treasure from the
input file.
Memory De-allocation
--------------------
malloc() allocates blocks of memory for structs and arrays and those
blocks persist throughout program execution outliving the function
which calls malloc(). In order not to exhaust available memory, these
blocks must eventually be de-allocated: for very malloc(), a free().
treasure_main uses is responsible for de-allocating memory at the end
of program execution. Which of the following best describes how it
works?
- ( ) treasuremap_free() only calls free(tmap) which will free all
allocated memory associated with that struct
- ( ) treasuremap_free() does nothing as C keeps tracks of all
malloc'd blocks and automatically de-allocates them before main()
finishes.
- ( ) treasuremap_free() first free()'s the locations array inside the
tmap struct and then free()'s the tmap struct itself. This order is
necessary as free()'ing tmap first would lose access to its fields.
- ( ) treasuremap_main() actually has some logic bugs where the
pointer to the treasuremap struct is lost so cannot be free()'d
properly. This is a bug that needs to be fixed in the program to
pass test cases.
PROBLEM 2 CODE: Complete treasure_main.c
========================================
Complete the TODO items in the `treasure_main.c' file so that the two
functions marked REQUIRED compile and run successfully. Correct output
will look like the demo shown at the top of this file.
Test that the code behaves correctly via the command
,----
| # Run all Problem 1 code tests
| >> make test-prob1
|
| # Run only Test #2 for Problem 1 code
| >> make test-prob1 testnum=2
`----
PROBLEM 2: A Bit of Debugging
=============================
This problem will focus on debugging an existing C program which was
written in a rather sloppy fashion leading to some serious flaws in it
including memory problems. This will be a chance to practice reading
Valgrind error messages and using them to locate problems in the code
and resolve them.
Context
~~~~~~~
Your current supervisor has left the following note with the code and
data files in the code pack.
,----
| TRAINER: We have most of the data we needed in the mon-*.txt files but
| we need to complete the analysis I promised Prof. Birch by the end of
| the week. We need the Median Base Points of the samples in the data
| files. I've written a program in mon_main.c to calculate this in a
| flexible way but it keeps crashing and I don't have time to debug
| it. I'm leaving that to you: Sorry! I've heard Valgrind may help to
| diagnose the memory problems but I never learned to use it
| myself. Please make corrections to the code to make it work so we can
| determine the median Base Points for the samples outlined in the
| tests.
|
| -Prof. Oak
`----
Intended Behavior
~~~~~~~~~~~~~~~~~
The intended behavior of the program `mon_main' is to read data files
formatted like `mon-top5.txt' which have 3 columns. It will store all
the data in an array, sort it, and then print the median (middle)
element.
,----
| >> make
| gcc -Wall -Werror -g -c mon_main.c
| gcc -Wall -Werror -g -o mon_main mon_main.o
|
| >> ./mon_main
| usage: ./mon_main <input_file> [filter]
|
| >> ./mon_main mon-top5.txt
| Sorted array:
| 0: Charmander 309 Fire
| 1: Squirtle 314 Water
| 2: Bulbasaur 318 Grass
| 3: Pikachu 320 Electric
| 4: Psyduck 320 Water
| Median BASE_POINTS is
| 2: Bulbasaur 318 Grass
|
| >> ./mon_main mon-top5.txt Water
| Sorted array:
| 0: Squirtle 314 Water
| 1: Psyduck 320 Water
| Median BASE_POINTS is
| 1: Psyduck 320 Water
|
| >> ./mon_main mon-top5.txt Ground
| No elements left after filtering on 'Ground'
`----
Unfortunately the behavior of the program at present is somewhat less
helpful.
,----
| >> ./mon_main mon-top5.txt
| Segmentation fault (core dumped)
`----
Instructions
~~~~~~~~~~~~
As is indicated (and is often the case with other people's code) the
program has some bugs which will need to be corrected. The ultimate
goals is to correct the code so that it passes the test cases
provided. The QUIZ questions in `QUESTIONS.txt' relate to some of
Error messages that will be produced in the tests and are a good way
to get started.
PROBLEM 2 QUIZ: Errors in mon_main.c
====================================
Study the code and error messages produced by test cases or command
line runs of the program to start determining where memory errors
occur and how they may be corrected. Answer the following QUIZ
questions.
First Test
~~~~~~~~~~
Run the first test via `make test-prob2' or `make test-prob2
testnum=1'. It will fail. Analyze its output, both the Side-by-Side
diff output and the Valgrind Report which is lower in the results.
Check all of the following that are relevant to the bugs reported.
- ( ) Valgrind messages indicate that this is likely a bug in Valgrind
itself which should be reported to the tool developers ASAP.
- ( ) The Valgrind messages indicate that there is likely a memory
problem with writing too far into a block of memory.
- ( ) The Valgrind Report indicates that a memory Write is being done
some distance BEFORE a malloc()'d block
- ( ) The Valgrind Report indicates that a memory Write is being done
some distance AFTER a malloc()'d block
- ( ) The memory Write is problematic because the wrong C syntax is
used in the program: no * or -> is used to dereference a pointer.
- ( ) The memory Write is problematic because it is assuming a pointer
to an array but not enough space has been allocated
The following Valgrind Error message often appears in the VALGRIND
REPORT towards the bottom of test results.
,----
| ==85164== Invalid write of size 8
| ==85164== at 0x109762: main (mon_main.c:99)
| ==85164== Address 0x4a9e374 is 0 bytes after a block of size 132 alloc'd
| ==85164== at 0x48447A8: malloc (vg_replace_malloc.c:446)
| ==85164== by 0x109661: main (mon_main.c:83)
`----
Check ALL of the following that correspond to information in this
message.
- ( ) It reports that some variable / array element is being assigned
(write-en to) but the pointer to the variable is out of bounds
- ( ) It gives a source code line where the out of bounds access is
occurring.
- ( ) It shows the exact memory address that is being accessed that is
out of bounds (in hexadecimal format)
- ( ) It gives the relative position of the memory access as just
after a block that was allocated using malloc() and the location in
source code where that allocation occurred
Make some code changes to correct this bug and check that any error
messages change.
First Test Again
~~~~~~~~~~~~~~~~
Run the first test again and check new error messages. Which part of
the code seems to be problematic now?
- ( ) The loop that reads data from the file into an array
- ( ) The loop that prints out the contents of the sorted array
- ( ) The insertion sort routine meant to sort the data
- ( ) The indexing to determine the median (it accesses an
out-of-bounds element)
Use any nearby hints to fix the code and reduce the errors / generate
new errors.
First Test AGAIN
~~~~~~~~~~~~~~~~
If all has gone well, the output for the first test will now appear
correct BUT there are additional problems reported by Valgrind. Check
ALL of the following problems appear in the Valgrind report.
- ( ) There are Invalid reads AFTER a heap-allocated block
- ( ) There are Invalid writes BEFORE a heap-allocated block
- ( ) Uninitialized memory used to determine a conditional jump/move
- ( ) Memory is lost / leaked as it is not de-allocated before the end
of the program
One of the problems identified in the Valgrind output is associated
with the following lines which illustrate a common beginner mistake
(something one wouldn't expect the seasoned Prof. Oak to do but
perhaps C isn't his strong suit...)
,----
| char *filter = malloc(sizeof(char));
| if(argc > 2){ // filter present
| filter = argv[2];
| }
| else{ // default to no filtering
| filter = NULL;
| }
`----
Why is the above code a memory error?
- ( ) malloc()'ing memory for filter then re-assigning it to point
elsewhere creates a memory leak: the malloc()'d block can never be
free()'d
- ( ) character pointers like filter should not be pointed to
heap-allocated space with malloc()
- ( ) When pointers should not be set to NULL as this will cause a
segmentation fault
- ( ) pointing filter at a string like argv[2] leads to problems later
as argv[2] is not heap-allocated so the code that free()'s it will
causes an error
Use the error messages provided in the Valgrind Report to fix the
remaining error messages associated with this test.
Hint: there are 3 changes that need to be made.
FIRST TEST!
~~~~~~~~~~~
The First test should now pass. In fact, a few more tests may pass.
Carry on with this pattern of inspecting test results, especially the
VALGRIND REPORT to continue diagnosing errors to correct them. A few
hints:
- Any attempt to access a NULL pointer will trigger a segmentation
fault; this is reported as an "Invalid Read" or "Invalid Write" of
the address 0x0. When working with pointers you know may be NULL,
check whether they are NULL before attempting to dereference them.
- All code paths of exiting must de-allocate all memory. If you are
exiting the program early, make sure to free() an malloc()'d blocks
and close any files.
- Opening files can fail, often due to the wrong filename being typed
in. Make sure to check for this condition and report appropriate
errors. Attempting to work with unopened file is likely to end in
crashes.
PROBLEM 2 CODE: Repair mon_main.c
=================================
As indicated above, continue debugging mon_main.c until all tests
pass. A completely corrected program should give the following results
on running tests:
,----
| >> make test-prob2
| gcc -Wall -Werror -g -fstack-protector-all -c mon_main.c
| gcc -Wall -Werror -g -fstack-protector-all -o mon_main mon_main.o
| ./testy -o md test_mon_main.org
| =================================================
| == test_mon_main.org : mon_main application tests
| == Running 8 / 8 tests
| 1) ./mon_main mon-lvl1.txt Fire : ok
| 2) ./mon_main mon-all.txt Grass : ok
| 3) ./mon_main mon-top5.txt : ok
| 4) ./mon_main mon-lvl1.txt : ok
| 5) ./mon_main mon-all.txt : ok
| 6) ./mon_main mon-all.txt Ground : ok
| 7) ./mon_main no-file.txt : ok
| 8) ./mon_main mon-lvl2.txt BadFilter : ok
| =================================================
| RESULTS: 8 / 8 tests passed
`----
4 Submission
Follow the instructions at the end of Lab01 if you need a refresher on how to upload your completed lab zip to Gradescope.