CMSC216 HW07: Assembly Debugging and Stack Security
- Due: 11:59pm Mon 21-Oct-2024
- Approximately 0.83% of total grade
- Homework and Quizzes are open resource/open collaboration. You must submit your own work but you may freely discuss HW topics with other members of the class.
CODE DISTRIBUTION: hw07-code.zip
- Download the code distribution
- See further setup instructions below
CHANGELOG:
- Thu Oct 17 03:59:02 PM EDT 2024
Post 938 reports a problem wherein the
smash1
andsmash2
programs are supposed to produce a "Stack Smashing" error but do not appear to do so on GRACE. This is due to bad configuration of the GCC install on GRACE: it does not allow stack protection to be added to programs.To see the stack smashing behavior (or lack thereof), students will need to use some newly-added pre-compiled versions of these programs that come from a Linux with a properly installed GCC. They are now part of the HW07 codepack or can be downloaded via
>> wget https://www.cs.umd.edu/~profk/216/smash1_static >> wget https://www.cs.umd.edu/~profk/216/smash2_static
1 Rationale
Code that deals with linked data structures such as lists, trees, and hash tables can be difficult to understand even in high-level programming languages and viewing such algorithms as assembly code only increases this difficulty. This HW covers a debugging scenario that shows some data structure processing as assembly code in a debugging exercise.
The HW also introduces a class of problems that can arise when working with data in the function call stack. Since return addresses for functions are also stored in the stack, modifying stack data incorrectly can "clobber" return addresses. Modern compilers like GCC insert some safety measures to detect this but they usually result in programs crashing. This HW studies such a case.
Associated Reading / Preparation
- Bryant and O'Hallaron: Ch 3.1-3.9 on assembly code instructions in x86-64 are essential for understanding how to follow assembly code in the debugger.
- Bryant and O'Hallaron: Ch 3.10 on data layout in the stack can
provide insight to the behavior in the
smash
programs of problem 2. - CSCI 2021 Quick Guide to gdb: Run Assembly Without Source Code Available: Tricks to deal with binary only files where no source code is available.
Grading Policy
Credit for this HW is earned by taking the associated HW Quiz which is
linked under Gradescope
. The quiz will ask similar questions as
those that are present in the QUESTIONS.txt
file and those that
complete all answers in QUESTIONS.txt
should have no trouble with
the quiz.
Homework and Quizzes are open resource/open collaboration. You must submit your own work but you may freely discuss HW topics with other members of the class.
See the full policies in the course syllabus.
2 Codepack
The codepack for the HW contains the following files:
File | Description |
---|---|
QUESTIONS.txt |
Questions to answer |
quote_main.c |
Problem 1 main() function to analyze |
quote_data.o |
Problem 1 binary file used in debugging |
smash1.c |
Problem 2 erroneous program to analyze |
smash2.c |
Problem 2 erroneous program to analyze |
smash1_static |
Precompiled version of the smash1 program to show stack smashing |
smash2_static |
Precompiled version of the smash2 program |
3 What to Understand
Ensure that you understand
- How to gather information about a compiled
a.out
or.o
file using binary utilities likenm
andobjdump -d
- How step through compiled code in the debugger to analyze what assembly code is doing even when no source code is available
- The basics of stack layout and how return addresses can get clobbered if when writing of the end of stack arrays
4 Questions
Analyze the files in the provided codepack and answer the questions
given in QUESTIONS.txt
.
_________________ HW 08 QUESTIONS _________________ Write your answers to the questions below directly in this text file to prepare for the associated quiz. Credit for the HW is earned by completing the associated online quiz on Gradescope. PROBLEM 1: Quote Binary Debugging ================================= The two files `quote_main.c' and `quote_data.o' can be compiled together to form an executable as in the following. ,---- | > gcc quote_main.c quote_data.o | > ./a.out | Complete this sentence by C++ creator Bjarne Stroustrup: | C makes it easy to shoot yourself in the foot; ... | | enter a number from 0 to 15: 2 | | 2: This is why most programmers are such poor dancers. | | Have a nice tall glass of ... NOPE. `---- As in a previous exercise, the intention is to use the debugger to detect the correct response. In this case however, the correct completion is present in `quote_main.c'. However, one must enter a number which selects from several responses in an attempt to match the correct completion. It is possible to "brute force" the solution by trying all solutions. However, the intent of the activity is to explore the running code with the debugger to answer the questions below. This will give insight into some stages of the binary bomb assignment. A ~ Use some binary utility programs to print out the symbols that are present in `quote_data.o'. Review the previous lab if you have forgotten programs like `nm' and `objdump' can print symbols in a binary object file. Speculate as to which data might pertain to where potential options are stored. B ~ The entry point into the assembly code in `quote_data.o' is the function `get_it'. Use either the debugger or a disassembly of the object to trace whether this functions performs the entire computation or if other functions are also used. Use this along with other observations of the content of the `quote_data.o' file to infer what kind of data structure the choices are stored in. C ~ Use the debugger to step through the functions in `quote_data.o'. Keep in mind that the parameters to the function follow the standard convention: 1st param in register `%rdi', second in `%rsi', and so forth. You should be able to identify a loop in a critical function in which the choices are present. Use `print' and `x' commands in gdb to examine data pointed to be registers to help identify where the correct response is located. Keep in mind that often the `testX' instruction is used to determine truthy/falsey qualities about a register. This takes several forms that were discussed in lecture - `testl %edx, %edx' may be used to check if `%edx' is 0 or negative - `testq %rax, %rax' may be used to check if `%rax' is a `NULL' pointer You are likely to see some of the above uses for the test instruction while examining the assembly code in `quote_data.o' Recall that you can examine memory addresses pointed to registers with gdb commands like the following. ,---- | (gdb) x/d $rax # print memory pointed to by rax as a decimal integer | (gdb) x/x $rax # print memory pointed to by rax as a hex number | (gdb) x/s $rax # print memory pointed to by rax as a string `---- Include what debugger techniques you used in your answer along with the index of the correct completion. PROBLEM 2: Stack Smashing ========================= A ~ Examine the obviously flawed code in `smash1.c'. ,---- | 1 // smash1.c: demonstrates problems with the function call stack | 2 #include <stdio.h> | 3 | 4 void fill_seq(int *a); | 5 void demo(); | 6 | 7 int main(){ | 8 printf("About to do the demo\n"); | 9 demo(); | 10 printf("Demo Complete\n"); | 11 return 0; | 12 } | 13 | 14 void demo(){ | 15 int arr[4]; | 16 | 17 fill_seq(arr); | 18 | 19 for(int i=0; i<4; i++){ | 20 printf("[%d]: %d\n",i,arr[i]); | 21 } | 22 } | 23 | 24 #define END 8 | 25 void fill_seq(int *a){ | 26 for(int i=0; i<END; i++){ | 27 a[i] = (i+1)*2; | 28 } | 29 } `---- Compiling and running this code with gcc on a Linux system (and perhaps in other compiler/OS configurations) will yield the following kind of behavior. ,---- | > gcc smash1.c | > ./a.out | About to do the demo | [0]: 2 | [1]: 4 | [2]: 6 | [3]: 8 | --- stack smashing detected ---: <unknown> terminated | Aborted (core dumped) `---- Describe the kind of error that is occurring in this code and why it is referred to as "stack smashing." Identify which part of the code is causing the problem. B ~ Consider the alternative file `smash2.c'. Note the similarities to the previous program but also slight changes in it. Compile and run this program and report what errors you see. Describe if the errors are similar or not and relate these to changes in the program. C ~ Recompile both smash programs as shown and use Valgrind to run them again. Show your output and comment on how much information Valgrind provides about each program run and the out-of-bounds array accesses that are performed. Based on your observations, mention which memory area Valgrind monitors more effectively. ,---- | > gcc -g smash1.c | > valgrind ./a.out | ... | | > gcc -g smash2.c | > valgrind ./a.out | ... `---- stack smashing detected ***: terminated --------------------------------------- `=8832=' `=8832=' Process terminating with default action of signal 6 (SIGABRT): dumping core `=8832=' at 0x48B1615: raise (in /usr/lib/libc-2.32.so) `=8832=' by 0x489A861: abort (in /usr/lib/libc-2.32.so) `=8832=' by 0x48F35E7: __libc_message (in /usr/lib/libc-2.32.so) `=8832=' by 0x4983A39: __fortify_fail (in /usr/lib/libc-2.32.so) `=8832=' by 0x4983A03: __stack_chk_fail (in /usr/lib/libc-2.32.so) `=8832=' by 0x1091EF: demo (smash1.c:22) `=8832=' by 0x109172: main (smash1.c:9) `=8832=' `=8832=' HEAP SUMMARY: `=8832=' in use at exit: 0 bytes in 0 blocks `=8832=' total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated `=8832=' `=8832=' All heap blocks were freed -- no leaks are possible `=8832=' `=8832=' For lists of detected and suppressed errors, rerun with: -s `=8832=' ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) Aborted (core dumped) > gcc -g smash2.c valgrind ./a.out `=8848=' Memcheck, a memory error > detector `=8848=' Copyright (C) 2002-2017, and GNU GPL'd, by Julian > Seward et al. `=8848=' Using Valgrind-3.16.1 and LibVEX; rerun with > -h for copyright info `=8848=' Command: a.out `=8848=' About to do > the demo `=8848=' Invalid write of size 4 `=8848=' at 0x109233: > fill_seq (smash2.c:28) `=8848=' by 0x1091B7: demo (smash2.c:17) > `=8848=' by 0x109182: main (smash2.c:9) `=8848=' Address 0x4a40490 > is 0 bytes after a block of size 16 alloc'd `=8848=' at 0x483A77F: > malloc (vg_replace_malloc.c:307) `=8848=' by 0x1091A7: demo > (smash2.c:15) `=8848=' by 0x109182: main (smash2.c:9) `=8848=' [0]: > 2 [1]: 4 [2]: 6 [3]: 8 Demo Complete `=8848=' `=8848=' HEAP SUMMARY: > `=8848=' in use at exit: 0 bytes in 0 blocks `=8848=' total heap > usage: 2 allocs, 2 frees, 1,040 bytes allocated `=8848=' `=8848=' > All heap blocks were freed -- no leaks are possible `=8848=' > `=8848=' For lists of detected and suppressed errors, rerun with: -s > `=8848=' ERROR SUMMARY: 4 errors from 1 contexts (suppressed: 0 from > 0) #+END_SRC ,---- | > gcc -g smash1.c | > valgrind ./a.out | ==8522== Memcheck, a memory error detector | ==8522== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. | ==8522== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info | ==8522== Command: ./a.out | ==8522== | [0]: 2 | [1]: 4 | [2]: 6 | [3]: 8 | *** stack smashing detected ***: terminated | ==8522== | ==8522== Process terminating with default action of signal 6 (SIGABRT): dumping core | ==8522== at 0x48B1615: raise (in /usr/lib/libc-2.32.so) | ==8522== by 0x489A861: abort (in /usr/lib/libc-2.32.so) | ==8522== by 0x48F35E7: __libc_message (in /usr/lib/libc-2.32.so) | ==8522== by 0x4983A39: __fortify_fail (in /usr/lib/libc-2.32.so) | ==8522== by 0x4983A03: __stack_chk_fail (in /usr/lib/libc-2.32.so) | ==8522== by 0x1091B6: main (smash1.c:16) | ==8522== | ==8522== HEAP SUMMARY: | ==8522== in use at exit: 0 bytes in 0 blocks | ==8522== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated | ==8522== | ==8522== All heap blocks were freed -- no leaks are possible | ==8522== | ==8522== For lists of detected and suppressed errors, rerun with: -s | ==8522== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) | Aborted (core dumped) | | | > gcc -g smash2.c | > valgrind ./a.out | ==8540== Memcheck, a memory error detector | ==8540== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al. | ==8540== Using Valgrind-3.16.1 and LibVEX; rerun with -h for copyright info | ==8540== Command: ./a.out | ==8540== | ==8540== Invalid write of size 4 | ==8540== at 0x1091FA: fill_seq (smash2.c:21) | ==8540== by 0x10917A: main (smash2.c:9) | ==8540== Address 0x4a40050 is 0 bytes after a block of size 16 alloc'd | ==8540== at 0x483A77F: malloc (vg_replace_malloc.c:307) | ==8540== by 0x10916A: main (smash2.c:7) | ==8540== | [0]: 2 | [1]: 4 | [2]: 6 | [3]: 8 | ==8540== | ==8540== HEAP SUMMARY: | ==8540== in use at exit: 0 bytes in 0 blocks | ==8540== total heap usage: 2 allocs, 2 frees, 1,040 bytes allocated | ==8540== | ==8540== All heap blocks were freed -- no leaks are possible | ==8540== | ==8540== For lists of detected and suppressed errors, rerun with: -s | ==8540== ERROR SUMMARY: 4 errors from 1 contexts (suppressed: 0 from 0) | `----