Last Updated: 2024-10-28 Mon 10:35

CMSC216 Project 3: Assembly Coding and Debugging

CODE/TEST DISTRIBUTION: p3-code.zip

VIDEO OVERVIEW: https://youtu.be/wxZU9UimmFc

CHANGELOG:

Mon Oct 28 10:33:14 AM EDT 2024

Post 1402 pointed out that the autograder total was listed incorrectly on Gradescope. The Manual inspection total was also listed incorrectly. This has been corrected. The table below summarizes the available / full points for each element of project 3.

Full  Avail  Element
  20     20  Prob 1 Autograder
  40     60  Prob 2 Autograder
  40     45  Prob 1 Manual inspection, 5pts MAKEUP available
 100    125  Total

Tue Oct 15 05:11:10 PM EDT 2024
A video overview for P3 has been posted at the following location. In addition to surveying content associated with the project, several terminal and debugger tricks are demonstrated that may be of use to folks as they work on the project. VIDEO: https://youtu.be/wxZU9UimmFc

1 Introduction

This project will feel somewhat familiar in that it is nearly identical to the preceding project: there is a coding problem and a puzzle-solving problem. The major change is that everything is at the assembly level:

  • Problem 1 re-works the Clock functions from the previous project in x86-64 Assembly rather than C
  • Problem 2 involves analyzing a binary executable to provide it with the correct input to "defuse" the executable much like the previous project's Puzzlebox problem

Working with assembly will get you much more acquainted with the low-level details of the x86-64 platform and give you a greater appreciation for "high-level" languages (like C).

2 Download Code and Setup

Download the code pack linked at the top of the page. Unzip it, which will create a project folder. Create new files in this folder. Ultimately you will re-zip this folder to submit it.

File                     State     Notes
Makefile                 Provided  Problem 1 Build file
clock.h                  Provided  Problem 1 header file
clock_main.c             Provided  Problem 1 main() function
clock_sim.c              Provided  Problem 1 clock simulator functions
clock_update_asm.s       CREATE    Problem 1 Assembly functions, re-code C in x86-64, main source file for problem 1
clock_update.c           CREATE    Problem 1 C functions, COPY from Project 2 or see a staff member to discuss

test_clock_update.c      Testing   Problem 1 testing program for clock_update_asm.s
test_clock_update_asm.s  Testing   Problem 1 testing program for clock_update_asm.s
test_clock_update.org    Testing   Problem 1 testing data file
testy                    Testing   Problem 1 test running script
puzzlebin                Provided  Problem 2 Executable for debugging
input.txt                EDIT      Problem 2 Input for puzzlebin, fill this in

3 Problem 1: Clock Display Assembly Functions

The functions in this problem are identical to a previous project in which code to support an LCD clock display was written. These functions are:

int set_tod_from_ports(tod_t *tod)
Retrieves the value of CLOCK_TIME_PORT and converts it to the number of seconds from the beginning of the day, with rounding, via bit shifts and masking. Then sets the fields of the struct pointed to by tod to have the correct hours, minutes, seconds, and AM/PM indication.
int set_display_from_tod(tod_t tod, int *display)
Given a tod_t struct, reset and alter the bits pointed to by display to cause a proper clock display.
int clock_update()
Update global CLOCK_DISPLAY_PORT using the previous two functions.

The big change in this iteration will be that the functions must be written in x86-64 assembly code. As C functions each of these is short, up to 85 lines maximum. The assembly versions will be somewhat longer as each C line typically needs 1-4 lines of assembly code to implement fully. Coding these functions in assembly gives you real experience writing working assembly code and working with it in combination with C.

The code setup and tests for this problem are mostly identical to those for the previous C version of the problem. Refer to the original Clock LCD Display Problem description for a broad overview of the simulator and files associated with it.

3.1 Hand-Code Your Assembly

As discussed in class, one can generate assembly code from C code with appropriate compiler flags. This can be useful for getting oriented and as a starting point for coding your assembly versions of the functions. However, this exercise is about writing assembly yourself to gain a deeper understanding of it.

Code that is clearly compiler-generated with no hand coding will receive 0 credit.

  • No credit will be given on manual inspection
  • Penalties will be assessed for Automated Tests which lower credit to 0

Do not let that dissuade you from looking at compiler-generated assembly code from your C solution to the functions. Make sure that you take the following steps which are part of the manual inspection criteria.

Base your Assembly code on your C code

The files to be submitted for this problem include

  • clock_update.c: C version of the functions
  • clock_update_asm.s: Assembly version of the functions

Graders may examine these for a correspondence between the algorithm used in the C version and the Assembly version. Compiler-generated assembly often does significant re-arrangements of assembly code with many intermediate labels that hand-written code will not have.

If you were not able to complete the C functions for Project 2 or were not confident in your solutions, see a course staff member who will help you get them up and running quickly.

Annotate your Assembly Thoroughly

Comment your assembly code A LOT. While good C code can be quite self-explanatory with descriptive variable names and clear control structures, assembly is rarely so easy to understand. Include clear commentary on your assembly. This should include

  • Subdividing functions into smaller blocks with comments describing what the blocks accomplish.
  • Descriptions of which "variables" from the C side are held in which registers.
  • Descriptions of most assembly lines and their effect on the variables held in the registers.
  • Descriptions of any data such as bitmasks stored in the assembly code.
  • Use informative label names like .ROUNDING_UP to convey further meaning about what goals certain positions in code are accomplishing.

Use Division

While it is a slow instruction that is cumbersome to set up, using the idivX division instruction is the most human-readable means to compute several results needed in the required functions. Compiler-generated code uses many tricks to avoid integer division, so a lack of idivX instructions will be a clear sign that little effort has been put into the assembly code.

3.2 General Cautions when coding Assembly

  1. Get your editor set up to make coding assembly easier. If you are using VS Code, the following video will show you how to install an extension to do syntax highlighting and block comment/uncomment operations in assembly: https://youtu.be/AgmXUFOEgIw
  2. Be disciplined about your register use: comment what "variables" are in which registers as it is up to you to keep track. The #1 advice from past students to future students is "Comment the Crap out of your assembly code" on this project.
  3. Be Careful with constants: forgetting a $ in constants will lead to a bare, absolute memory address which will likely segfault your program. Contrast:

       movq    $0,%rax                 # rax = 0
       movq    0, %rax                 # rax = *(0): segfault
                                       # bare 0 is memory address 0 - out of bounds
    

    Running your programs, assembly code included, in Valgrind can help to identify these problems. In Valgrind output, look for a line number in the assembly code which has absolute memory addresses or a register that has an invalid address.

  4. Recognize that in x86-64 function parameters are passed in registers for up to 6 arguments. These are arranged as follows

    1. rdi / edi / di (arg 1)
    2. rsi / esi / si (arg 2)
    3. rdx / edx / dx (arg 3)
    4. rcx / ecx / cx (arg 4)
    5. r8 / r8d / r8w (arg 5)
    6. r9 / r9d / r9w (arg 6)

    and the specific register name corresponds to the argument size (64-bit args in rdi, 32-bit in edi, etc.). The functions you will write have few arguments so they will all be in registers.

  5. Use registers sparingly. The following registers (64-bit names) are "scratch" registers or "caller save." Functions may alter them freely (though some may contain function arguments).

    rax rcx rdx rdi rsi r8 r9 r10 r11  # Caller save registers
    

    No special actions need to be taken at the end of the function regarding these registers except that rax should contain the function return value.

    Remaining registers are "callee save": if used, their original values must be restored before returning from the function.

    rbx rbp r12 r13 r14 r15            # Callee save registers
    

    This is typically done by pushing the callee save registers to be used onto the stack, using them, then popping them off the stack in reverse order. Avoid this if you can (and you probably can in our case).

  6. Be careful when adjusting the stack pointer with pushX/popX or subq/addq. Keep in mind the stack must be aligned to 16-byte boundaries for function calls to work correctly. Above all, don't treat rsp as a general purpose register.

3.3 Register Summary Diagram

For reference, here is a picture that appears in the lecture slides that summarizes the names and special uses for the registers in x86-64.

registers.png

Figure 1: Summary of general purpose register usages in x86-64.

3.4 Structure of clock_update_asm.s

Below is a rough outline of the structure of the required assembly file. Consider copying this file as you get started and commenting parts of it out as needed.

.text                           # IMPORTANT: subsequent stuff is executable
.global  set_tod_from_ports
        
## ENTRY POINT FOR REQUIRED FUNCTION
set_tod_from_ports:
        ## assembly instructions here

        ## a useful technique for this problem
        movX    SOME_GLOBAL_VAR(%rip), %reg
        # load global variable into register
        # Check the C type of the variable
        #    char / short / int / long
        # and use one of
        #    movb / movw / movl / movq 
        # and appropriately sized destination register                                            

        ## DON'T FORGET TO RETURN FROM FUNCTIONS

### Change to defining semi-global variables used with the next function
### via the '.data' directive
.data                           # IMPORTANT: use .data directive for data section
	
my_int:                         # declare location of a single int
        .int 1234               # value 1234

other_int:                      # declare another accessible via name 'other_int'
        .int 0b0101             # binary value as per C '0b' convention

my_array:                       # declare multiple ints sequentially starting at location
        .int 20                 # 'my_array' for an array. Each are spaced 4 bytes from the
        .int 0x00014            # next and can be given values using the same prefixes as 
        .int 0b11110            # are understood by gcc.


## WARNING: Don't forget to switch back to .text as below
## Otherwise you may get weird permission errors when executing 

.text
.global  set_display_from_tod

## ENTRY POINT FOR REQUIRED FUNCTION
set_display_from_tod:
        ## assembly instructions here

	## two useful techniques for this problem
        movl    my_int(%rip),%eax    # load my_int into register eax
        leaq    my_array(%rip),%rdx  # load pointer to beginning of my_array into rdx


.text
.global clock_update
        
## ENTRY POINT FOR REQUIRED FUNCTION
clock_update:
	## assembly instructions here

3.5 set_tod_from_ports()

int set_tod_from_ports(tod_t *tod);
// Reads the time of day from the CLOCK_TIME_PORT global variable. If
// the port's value is invalid (negative or larger than 16 times the
// number of seconds in a day) does nothing to tod and returns 1 to
// indicate an error. Otherwise, this function uses the port value to
// calculate the number of seconds from start of day (port value is
// 16*number of seconds from midnight). Rounds seconds up if at
// least 8/16 of a second have passed. Uses shifts and masks for this
// calculation to be efficient. Then uses division on the seconds
// since the beginning of the day to calculate the time of day broken
// into hours, minutes, seconds, and sets the AM/PM designation with 1
// for AM and 2 for PM. By the end, all fields of the `tod` struct are
// filled in and 0 is returned for success.
//
// CONSTRAINT: Uses only integer operations. No floating point
// operations are used as the target machine does not have a FPU.
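
The shift/mask rounding described in the comment above can be sketched in C as follows. This is only an illustration, not the reference solution: the helper name port_to_day_secs, the SECS_PER_DAY constant, and the -1 error value are assumptions made for this sketch.

```c
/* Sketch only. SECS_PER_DAY and the -1 error code are assumptions
   of this example, not names from the project code. */
#define SECS_PER_DAY 86400

/* Returns seconds since midnight, or -1 if the port value is out of range. */
int port_to_day_secs(int port) {
    if (port < 0 || port > 16 * SECS_PER_DAY) {
        return -1;                 /* invalid port value */
    }
    int secs = port >> 4;          /* divide by 16 with a right shift */
    int sixteenths = port & 0xF;   /* low 4 bits hold the remainder */
    if (sixteenths >= 8) {
        secs += 1;                 /* round up at 8/16 of a second or more */
    }
    return secs;
}
```

In assembly the same steps map onto a comparison and jump for the range check, a shift instruction, a mask with andl, and a conditional increment.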

Note that this function uses a tod_t struct which is defined in clock.h as shown here:

// Breaks time down into 12-hour format
typedef struct{
  int   day_secs;    // seconds from start of day
  short time_secs;   // seconds in current minute
  short time_mins;   // minutes in current hour
  short time_hours;  // current hour of day
  char  ampm;        // 1 for am, 2 for pm
} tod_t;
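
If you want to confirm how the compiler lays out this struct in memory, a small C check like the one below may help. It is a sketch that assumes gcc on x86-64 Linux (the System V ABI); the helper ampm_offset is hypothetical and exists only for illustration.

```c
#include <stddef.h>   /* offsetof, size_t */

/* Copy of the struct from clock.h as shown above. */
typedef struct {
    int   day_secs;
    short time_secs;
    short time_mins;
    short time_hours;
    char  ampm;
} tod_t;

/* Compile-time checks: compilation fails if the layout differs.
   These values are assumed to hold for gcc on x86-64 Linux. */
_Static_assert(offsetof(tod_t, day_secs)   == 0,  "day_secs offset");
_Static_assert(offsetof(tod_t, time_secs)  == 4,  "time_secs offset");
_Static_assert(offsetof(tod_t, time_mins)  == 6,  "time_mins offset");
_Static_assert(offsetof(tod_t, time_hours) == 8,  "time_hours offset");
_Static_assert(offsetof(tod_t, ampm)       == 10, "ampm offset");
_Static_assert(sizeof(tod_t) == 12, "12 bytes total including padding");

/* Hypothetical helper to inspect one offset at run time. */
size_t ampm_offset(void) {
    return offsetof(tod_t, ampm);
}
```

The byte offsets checked here are the ones your assembly code uses when it writes fields through a pointer to the struct.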

Assembly Implementation Notes set_tod_from_ports

  1. The function takes one argument: a pointer to a struct, which will be in register rdi.
  2. Return values of functions are to be placed in eax for 32-bit quantities, as is the case here (int).
  3. To access a global variable and copy it into a register, use the following assembly syntax

       movl  CLOCK_TIME_PORT(%rip), %ecx    # copy global var to reg ecx
    

    The function should not change CLOCK_TIME_PORT so copying it to a register is among the first steps to perform.

  4. Use comparisons and jump to a separate section of code that is clearly marked as "error" if you detect a bad port value.
  5. Make use of shift / mask operations to convert the port value to the number of seconds from the beginning of the day. Use the remainder value to determine if seconds should round up or not.
  6. Make use of division to "break down" the seconds since the beginning of the day (day_secs). Keep in mind that the idivl instruction requires the dividend in eax with edx holding its sign extension, which is 0 for the non-negative values here. The cltd instruction sets this up for 32-bit division; cqto, its 64-bit counterpart for idivq, also leaves edx at 0 when rax holds a small non-negative value. Any other 32-bit register can contain the divisor. After the instruction, eax will hold the quotient and edx the remainder. With cleverness, you'll only need to do a couple divisions.
  7. A pointer to a tod_t struct can access its fields using the following offset table, which assumes that %reg holds a pointer to the struct (substitute an actual register name).

                                 Destination  Assembly
    C Field Access   Offset      Size         Assign 5 to field
    tod->day_secs    0 bytes     4 bytes      movl $5, 0(%reg)
    tod->time_secs   4 bytes     2 bytes      movw $5, 4(%reg)
    tod->time_mins   6 bytes     2 bytes      movw $5, 6(%reg)
    tod->time_hours  8 bytes     2 bytes      movw $5, 8(%reg)
    tod->ampm        10 bytes    1 byte       movb $5,10(%reg)

    You will need to use these offsets to set the fields of the struct near the end of the routine.
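
As a rough guide to the division step, the C sketch below splits a seconds-since-midnight value using two divisions. The function name split_day_secs and its output parameters are made up for this example, and it stops at a 24-hour value: converting to 12-hour format and setting ampm is left out.

```c
/* Sketch: two divisions split seconds-since-midnight into parts.
   In assembly, each / and % pair below corresponds to one idivl:
   the quotient lands in eax and the remainder in edx. */
void split_day_secs(int day_secs, int *hours24, int *mins, int *secs) {
    int total_mins = day_secs / 60;    /* idivl #1: quotient  */
    *secs          = day_secs % 60;    /*           remainder */
    *hours24       = total_mins / 60;  /* idivl #2: quotient  */
    *mins          = total_mins % 60;  /*           remainder */
}
```

For example, 3661 seconds splits into 1 hour, 1 minute, and 1 second. Since one idivl yields both results, the C version's four operators need only two division instructions.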

3.6 set_display_from_tod

int set_display_from_tod(tod_t tod, int *display);
// Accepts a tod and alters the bits in the int pointed at by display
// to reflect how the LCD clock should appear. If any time_** fields
// of tod are negative or too large (e.g. bigger than 12 for hours,
// bigger than 59 for min/sec) or if the AM/PM is not 1 or 2, no
// change is made to display and 1 is returned to indicate an
// error. The display pattern is constructed via shifting bit patterns
// representing digits and using logical operations to combine them.
// May make use of an array of bit masks corresponding to the pattern
// for each digit of the clock to make the task easier.  Returns 0 to
// indicate success. This function DOES NOT modify any global
// variables
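
The table-lookup, shift, and OR technique mentioned in the comment above might look like the following in C. Treat this strictly as a sketch: the mask values are the common seven-segment encodings and are an assumption (use the masks from your own C solution), and the 7-bits-per-digit placement and the helper name two_digit_bits are hypothetical.

```c
/* ASSUMPTION: common seven-segment encodings, one bit per segment.
   Replace these with the masks required by the project. */
static const int digit_masks[10] = {
    0x3F, /* 0: 0b0111111 */
    0x06, /* 1: 0b0000110 */
    0x5B, /* 2: 0b1011011 */
    0x4F, /* 3: 0b1001111 */
    0x66, /* 4: 0b1100110 */
    0x6D, /* 5: 0b1101101 */
    0x7D, /* 6: 0b1111101 */
    0x07, /* 7: 0b0000111 */
    0x7F, /* 8: 0b1111111 */
    0x6F  /* 9: 0b1101111 */
};

/* ASSUMPTION: ones digit in bits 0-6, tens digit in bits 7-13.
   Expects 0 <= value <= 99. */
int two_digit_bits(int value) {
    int tens = value / 10;   /* quotient selects the tens mask  */
    int ones = value % 10;   /* remainder selects the ones mask */
    return (digit_masks[tens] << 7) | digit_masks[ones];
}
```

With these assumed masks, two_digit_bits(12) combines the pattern for 1 shifted left by 7 with the pattern for 2.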

Assembly Implementation Notes set_display_from_tod

  1. Arguments will be
    • a packed tod_t struct in %rdi and %rsi
    • an integer pointer in %rdx
  2. The packed tod_t struct is spread across two registers, %rdi and %rsi so will have the following layout.

                                Bits     Shift
    C Field Access   Register   in reg   Required      Size
    tod.day_secs     %rdi       0-31     None          4 bytes
    tod.time_secs    %rdi       32-47    Right by 32   2 bytes
    tod.time_mins    %rdi       48-63    Right by 48   2 bytes
    tod.time_hours   %rsi       0-15     None          2 bytes
    tod.ampm         %rsi       16-23    Right by 16   1 byte

    To access individual fields of the struct, you will need to do shifting and masking to extract the values from the %rdi / %rsi registers.

  3. Use comparisons and jump to a separate section of code that is clearly marked as "error" if you detect bad fields in the tod struct argument.
  4. As was the case in the C version of the problem, it is useful to create a table of bit masks corresponding to the bits that should be set for each clock digit (e.g. digit "1" has bit pattern 0b0000110). In assembly this is easiest to do by using a data section with successive integers. An example of how this can be done is below.

       .section .data
       array:                          # an array of 3 ints
               .int 200                # array[0] = 200
               .int 300                # array[1] = 300
               .int 400                # array[2] = 400
       const:
               .int 17                 # special constant
        
       .section .text
       .globl func
       func:
               leaq array(%rip),%r8    # r8 points to array, rip used to enable relocation
               movq $2,%r9             # r9 = 2, index into array
               movl (%r8,%r9,4),%r10d  # r10d = array[2], note 32-bit movl and dest reg
               movl const(%rip),%r11d  # r11d = 17 (const), rip used to enable relocation
    

    Adapt this example to create a table of useful bit masks for digits. The GCC assembler understands binary constants specified with the 0b0011011 style syntax.

  5. Make use of division again to compute "digits" for the ones and tens place of the hours and minutes for the clock. Use these digits to reference into the table of digit bit masks you create to progressively build up the correct bit pattern for the clock display.
  6. Use shifts and ORs to combine the digit bit patterns to create the final clock display bit pattern.
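
To see why the shifts in the packed-struct table work, the C sketch below imitates the packing with two 64-bit words and then extracts each field with a shift and a mask. It assumes a little-endian x86-64 machine; pack_tod and the unpack_* helpers are names invented for this example, not part of the project code.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    int   day_secs;
    short time_secs;
    short time_mins;
    short time_hours;
    char  ampm;
} tod_t;

/* Copy the 12-byte struct into two 64-bit words, imitating how it
   arrives in rdi and rsi (little-endian x86-64 assumed). */
void pack_tod(tod_t tod, uint64_t *rdi, uint64_t *rsi) {
    *rdi = 0;
    *rsi = 0;
    memcpy(rdi, &tod, 8);                                  /* bytes 0-7  */
    memcpy(rsi, (const char *)&tod + 8, sizeof(tod) - 8);  /* bytes 8-11 */
}

/* Each field is recovered with a right shift followed by a mask. */
int   unpack_day_secs(uint64_t rdi)   { return (int)(rdi & 0xFFFFFFFF); }
short unpack_time_secs(uint64_t rdi)  { return (short)((rdi >> 32) & 0xFFFF); }
short unpack_time_mins(uint64_t rdi)  { return (short)((rdi >> 48) & 0xFFFF); }
short unpack_time_hours(uint64_t rsi) { return (short)(rsi & 0xFFFF); }
char  unpack_ampm(uint64_t rsi)       { return (char)((rsi >> 16) & 0xFF); }
```

In assembly, copy the argument register to a scratch register before shifting so that the original packed value stays available for the other fields.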

3.7 clock_update

int clock_update();
// Examines the CLOCK_TIME_PORT global variable to determine hour,
// minute, and am/pm.  Sets the global variable CLOCK_DISPLAY_PORT bits
// to show the proper time.  If CLOCK_TIME_PORT appears to be in error
// (too large/small) makes no change to CLOCK_DISPLAY_PORT and returns 1
// to indicate an error. Otherwise returns 0 to indicate success.
//
// Makes use of the previous two functions: set_tod_from_ports() and
// set_display_from_tod().
// 
// CONSTRAINT: Does not allocate any heap memory as malloc() is NOT
// available on the target microcontroller.  Uses stack and global
// memory only.

Assembly Implementation Notes for clock_update

  1. No arguments come into the function.
  2. Call the two previous functions to create the struct and manipulate the bits of the display. Calling a function requires that the stack be aligned to 16 bytes; on entry to a function there is always an 8-byte quantity on the stack (the return address pushed by the call instruction). This means the stack must be extended with a pushq or subq instruction before any calls. A typical sequence is

       pushq/subq %rsp      # adjust the stack pointer to make space for local      
                           # values AND align to a 16-byte boundary
    
       call    some_func   # stack aligned, call function
       ## return val from func in rax or eax
    
       call    other_func  # stack still aligned, call other function
       ## return val from func in rax or eax
    
       popq/addq %rsp       # restore the stack pointer to its original value
    

    NOTE: the specific number of pushq instructions to use or subq values to decrease %rsp is dependent on the situation. Common total adjustments are 8 bytes, 24 bytes, and 40 bytes. Pick one that fits the situation here.

  3. In order to call the set_tod_from_ports() function, this function will need to allocate space on the stack for a tod_t. This struct is 12 bytes big so at least that amount of memory will need to be available on the stack. Since there are also function calls required, grow the stack by 16*N+8 bytes for a non-negative value of N.
  4. Similarly, to call the set_display_from_tod() function, one will need a packed tod_t in registers. If the preceding set_tod_from_ports() call succeeded, this packed struct can be read from memory into registers with several movq instructions. That stack space can be re-used if needed.
  5. Keep in mind that you will need to do error checking of the return values from the two functions: if they return non-zero values jump to a clearly marked "error" section and return a 1. If an error occurs, don't forget to restore any saved registers and the stack pointer before returning.
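
The 16*N+8 rule can be checked with a little arithmetic. The C sketch below computes the smallest adjustment of that form that leaves room for a given number of local bytes; frame_bytes is a hypothetical helper written only to illustrate the rule, on the assumption that 8 bytes are already on the stack at function entry.

```c
/* Smallest stack adjustment of the form 16*N + 8 (N >= 0) that
   provides at least `need` bytes of local space. Expects need >= 0. */
int frame_bytes(int need) {
    int n = (need + 7) / 16;   /* smallest N with 16*N + 8 >= need */
    return 16 * n + 8;
}
```

For instance, frame_bytes(12) gives 24, which is one of the common adjustments listed above, while needs of 0 and 25 bytes give 8 and 40 respectively.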

3.8 Notes on Partial Credit

Partial credit will be awarded in Manual Inspection for code that looks functional but did not pass tests. However, keep in mind that tests for clock_update() rely on the previous 2 functions working correctly and clock_main requires all functions to work correctly in conjunction. There is no partial credit available for the Automated Tests: tests that fail because an earlier function behaves incorrectly still count as failures.

3.9 Grading Criteria for Problem 1   grading 60

Weight Criteria
  AUTOMATED TESTS
20 make test-prob1 runs 40 tests for correctness, 0.5 points per test
  test_clock_update.c provides tests for functions in clock_update_asm.s
  There are also tests of clock_main which uses functions from clock_update_asm.s
  MANUAL INSPECTION CRITERIA
   
10 General Criteria for all Functions
  Clear signs of hand-crafted assembly are present.
  Reasonable indentation for assembly file which is mostly "flat": instructions line up, labels are offset from instructions
  Detailed documentation/comments are provided showing the algorithm used in the assembly
  Use of good label names to indicate jump targets: .NEG_PORT is good, .L32 is bad
  High-level variables and registers they occupy are documented in comments
  Error checking on the input values is done with a clear "Error" section/label for each function
  Any callee save registers used (rbx rbp r12 r13 r14 r15) are pushed at the top of functions and popped at the end
   
10 set_tod_from_ports()
  Clear use of shift / mask instructions to convert CLOCK_TIME_PORT to seconds with rounding
  Clear use of the division instruction to compute the seconds, minutes, hours
  Clear section or lines which write fields of tod_t struct to memory
  The idivX instruction is used to compute quotients and remainders that are needed.
   
15 set_display_from_tod()
  There is a clearly documented .data section in assembly setting up useful tables of bitmasks
  Struct fields are unpacked from an argument register using shift operations
  The idivX instruction is used to compute quotients and remainders that are needed.
   
10 clock_update()
  The stack is extended to make space available for local variables (tod_t struct)
  Function calls to the earlier two functions are made with appropriate arguments passed
  The stack is properly aligned at a 16-byte boundary for function calls, likely through a subq
  Changes to the stack for local variables / alignment are undone via a complementary addq instruction
  There is a clear sequence of instructions that load a memory address for the first function call
  There is a clear sequence of instructions that load a packed struct into registers for the second function call

NOTE: Passing all tests and earning all manual inspection criteria will earn up to 5 Points of Project Makeup Credit which will offset past and future loss of credit on projects.

4 Problem 2: The Puzzlebin

4.1 Overview

2021 GDB Quick Guide/Assembly https://www-users.cse.umn.edu/~kauffman/tutorials/gdb.html#gdb-assembly

The nature of this problem is similar to the previous project's puzzlebox: there is a program called puzzlebin which expects certain inputs from a parameter file. If the inputs are "correct", a phase will be "passed", earning points and allowing access to subsequent phases. The major change is that puzzlebin is in binary so must be debugged in assembly. The GDB guide above has a special section on debugging binaries which is worth reading. The typical startup regime is:

>> gdb -tui puzzlebin
(gdb) set args input.txt      # set the command line arguments
(gdb) layout asm              # show disassembled instructions
(gdb) layout regs             # show the register file
(gdb) break phase01           # break at the start of phase01
(gdb) run                     # get cracking

Below is a summary of useful information concerning the puzzlebin.

Input File
Data for input should be placed in the input.txt file. The first value in this file will be the userID (first part of your UMD email address) which is 8 or fewer characters.
UserID Randomization
Each phase has some randomization based on the UserID so that the specific answers of one student will not necessarily work for another student.
One Phase Input per Line
Place the input for each phase on its own line. Some input phases read a whole line and then dissect it for individual data. Putting each input on its own line ensures you won't confuse the input processing.
Passing Phases Earns Points
As with the earlier puzzlebox, points for this problem are earned based on how many phases are completed. Each phase that is completed will earn points.
Use GDB to work with Puzzlebin
The debugger is the best tool to work with running the given program. It may be tempting to try to brute force the puzzlebin by trying many possible inputs but in most cases, a little exploration will suffice to solve most phases.
The input line "SKIP" will skip a phase with a small penalty
Students woefully stuck on a phase may skip it to the next phase with the input line SKIP. Be aware that this applies a small penalty to the overall score.

4.2 Permission Denied Errors

In some cases, the process of zipping an executable like puzzlebin then unzipping it leads to its permissions being set incorrectly. Below is a common permissions error and how to fix it by changing the permissions on puzzlebin.

>> ./puzzlebin input.txt
bash: ./puzzlebin: Permission denied

# Permission error on puzzlebin: it is not set to be executable. The
# fix is:

>> chmod u+x puzzlebin
# Manually add the execute 'x' permission
# OR
>> make
chmod u+x puzzlebin
# use the provided Makefile to run that command

>> ./puzzlebin input.txt
========================================
Puzzlebin (release Tue 10-Oct-2024)
'YOUR_DIRECTORY_ID' is a userID and must be max 8 characters
# Now running normally albeit with the need to modify the input file

4.3 Puzzlebin Scoring   grading 40

Scoring is done according to the following table.

Pts  Phase    Notes
  5  Phase 1
  5  Phase 2
  7  Phase 3
  8  Phase 4
  7  Phase 5
  8  Phase 6
 10  Phase 7  Not Required
 10  ???      Additional Makeup credit if you can find it
 -1  SKIP     Penalty for using SKIP to bypass a phase
 40  60 Max   40 points for full credit, 20 MAKEUP Credit available

4.4 Advice and Hints

  • Most of the time you should run puzzlebin in gdb as in

      >> gdb -tui ./puzzlebin
    

    Refer to the Quick Guide to GDB if you have forgotten how to use gdb and pay particular attention to the sections on debugging assembly.

  • Most phases process input via calls to scanf()-style functions. Quick insight to the expected input for a phase comes from analyzing the format strings like "%d %d" and "%s %f %d" to those calls. Figure out how to do this early so you can determine the quantity and types of input to each phase.
  • It is worthwhile to look at the Failure Messages when a phase is not going to be passed. These are passed to the failure() function: printing them out may give you some hints.
  • Make use of other tools to analyze puzzlebin aside from the debugger. Some of these like strings are described at the end of the Quick Guide to GDB. They will allow you to search for "interesting" data in the executable puzzlebin.
  • Disassemble the executable to look at its entire source assembly code as a text file. The Quick Guide to GDB shows how to use objdump to do this. Looking at the whole source code reveals that one cannot hide secrets easily in programs.
  • Feel free to do some internet research. There is a well-known "Binary Bomb Lab" assignment by Bryant and O'Hallaron, our textbook authors, that inspired Puzzlebin. It has a long history and there are some useful guides out there that can help you through rough patches. Keep in mind that your code will differ from any online tutorials BUT the techniques to defuse it may be similar to what is required to solve puzzles.
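
To illustrate why format strings are so revealing, here is a small C sketch of a scanf()-style parser. It is entirely hypothetical and NOT taken from puzzlebin: the function parse_line and its format string are invented to show how the format dictates the number and types of inputs.

```c
#include <stdio.h>

/* Hypothetical phase-style parser, NOT from puzzlebin.
   The format "%d %31s" means the line must hold an int, then a word. */
int parse_line(const char *line, int *num, char word[32]) {
    /* sscanf returns the number of fields successfully converted */
    return sscanf(line, "%d %31s", num, word);
}
```

A line such as "42 hello" converts both fields and yields 2, while "oops" converts none and yields 0. Code like this typically compares that return value against the expected count and fails the phase on a mismatch, so spotting the format string in the debugger tells you what shape of input to supply.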

4.5 Compatibility

puzzlebin is a binary executable file and these are always a bit flaky to distribute as is. It has been tested on GRACE and is known to run normally there and is likely to run normally on most Linux systems. If you see strange behavior on a different Linux platform such as segmentation faults, revert to working on GRACE for immediate relief but also email Prof. Kauffman, as incompatibilities are of interest.


Author: Chris Kauffman (profk@umd.edu)
Date: 2024-10-28 Mon 10:35