Last Updated: 2026-04-20 Mon 15:59

CMSC216 Project 4: Make me a Bake

CODE/TEST DISTRIBUTION: p4-code.zip

VIDEO OVERVIEW: https://umd.instructure.com/courses/1398391/pages/week-12-videos

CHANGELOG: Empty

1 Overview

Build systems pervade software development. They automate repetitive tasks and raise the quality of software by making it easier to create and test software efficiently. Make and its Makefiles is a venerable example of a build system that has stood the test of time. While it has some quirks, it is well worth knowing how to use Makefiles as well as know some details of the internal implementation of such systems.

This project will build a simplified version of Make called Bake which uses Bakefiles that have a simplified syntax compared to Makefiles but implement the core principles of a build system. These principles include the following which are all related to systems programming:

  • Creating child processes to execute external commands
  • Examining files on the file system to determine attributes such as presence / absence and modification time
  • Recursively traversing a data structure representing dependencies and executing commands to update targets or detecting that no such updates are needed.

2 Download Code and Setup

Download the code pack linked at the top of the page. Unzip this which will create a project folder. Create new files in this folder. Ultimately you will re-zip this folder to submit it.

File State Notes
bake.h Provided Header file for shellac
bake_util.c Provided Utility functions provided
     
bake_funcs.c CREATE Functions that operate data in the Bake system
bake_main.c CREATE main() function for the Bake
Build/Testing Files    
Makefile Provided Build file to compile all programs
testy Testing Test running script
test_bake12.c Testing Unit tests for Problems 1 & 2
test_bake34.c Testing Unit tests for Problems 3 & 4
test_data12/ Testing Problem 1/2 testing data
test_data34/ Testing Problem 3/4 testing data
test_prob1.org Testing Tests for Problem 1
test_prob2.org Testing Tests for Problem 2
test_prob3.org Testing Tests for Problem 3
test_prob4.org Testing Tests for Problem 4
test_bake5.c Testing Problem 5 Optional unit tests forMakeup Credit
test_prob5.org Testing Problem 5 Optional tests for Makeup Credit
test_data5/ Testing Problem 5 testing data

3 Overview of Bake

3.1 Bakefiles

Bake/Bakefiles share many similiarities with Make/Makefiles. Readers are assumed to have some familiarity with Makefiles. If not, consult Lab09 materials that introduced the basics of make and/or review external manuals and documentation that describe the essence of Makefiles.

Below is a sample Bakefile provided in the test_data12/SampleBakefile

 1: # SampleBakefile: a sample Bakefile. Run this with
 2: # >> bake -f SampleBakefile
 3: 
 4: # lead target is 'demo' which will be used if
 5: # no target is specified on the command line.
 6: # Some of the commands have @ as their prefix which
 7: # silences the command.
 8: demo : hello bye
 9: 	@ echo Running programs
10: 	./hello
11: 	./bye
12: 	@ echo Done running programs
13: 
14: all : hello bye
15: 
16: # this is program 1 / 2
17: hello : hello.o
18: 	gcc -o hello hello.o
19: 
20: hello.o : hello.c
21: 	gcc -c hello.c
22: 
23: # this is program 2 / 2
24: bye : bye.o
25: 	gcc -o bye bye.o
26: 
27: bye.o : bye.c
28: 	gcc -c bye.c
29: 
30: # this target removes built files
31: clean :
32: 	rm -f hello.o bye.o hello bye
33: 
34: # END of SampleBakefile

A few notes on the features / non-features of Bakefiles.

  • Comments always start a line with #; comments are not allowed elsewhere (such as after other parts)
  • Blank lines are allowed and have no effect on builds
  • Any other line is presumed to start a Rules of the same type Makefiles use target : dependencies and the a list of space-indented commands
  • It is required that target and dependency are space separated by the colon. target : dep1 dep2 is fine, target: dep1 dep2 is not supported.
  • Commands that start with @ cmd will cause the command to be Silenced during builds. The Space after @ is required.
  • No variable substitution presently supported and no automatic variables such as $@ are supported

Generally Bakefiles are a subset of Makefiles so make should work for Bakefiles (e.g. make -f Bakefile should build) The output of bake will be a little different than make but accomplish the same task.

3.2 Data Structures

There are 3 central data structures used in Bake which are found in the header bake.h.

cmd_t
Encodes information about a single command like gcc -o myprog x.o y.o from a Bakefile.
rule_t
Encodes an entire rule from a Bakefile including its target, dependencies, and an array of cmd_t structs for each of the commands to run.
bake_t
Encodes the entire contents of a Bakefile. It's main feature is an array of rule_t structs and a String Table: a single large string with other parts of the data structure pointing into it for their own strings.

It is worthwhile spend some time acquainting yourself with some of the documentation on these structs provided in bake.h. Refer back to the header often as you need to recall parts of the data.

DESIGN NOTE: The data structures are designed as a compromise between minimizing the need to malloc() memory while still providing reasonable flexibility to handle Bakefiles. One notable tradeoff that will affect your code

  • The rules[] array in bake_t is malloc()'d so can grow via re-allocation. It therefore tracks both rule_capcity (size of the array) and rule_count, the number of elements in the array.
  • Other arrays are fixed length: deps[] and cmds[] in rule_t and tokens[] in cmd_t have a fixed size. In a production setting, one might reconsider this but using fixed sizes here simplifies much of the parsing difficulty by pre-allocating space for these arrays.

3.3 Outline of bake_funcs.c

The primary implementation files required are bake_funcs.c and bake_main.c. As the name suggests, bake_main.c will contain the main() function and is part of the final problem.

bake_funcs.c has a number of "service" functions which manipulate Bake data. Each of these is required and will be tested. The outline and some brief documentation for them is below.

// bake_funcs.c: required service functions for handling of Bakefiles

#include "bake.h"

////////////////////////////////////////////////////////////////////////////////
// PROBLEM 1
////////////////////////////////////////////////////////////////////////////////

char *slurp_file_efficient(char *fname);
// PROBLEM 1: Uses combination of stat() and read() to efficiently
// read in the entire contents of a file into a malloc()'d block of
// memory, null terminates it (\0). Returns a pointer to the file data
// or NULL if the file could not be opened. If the file can't be
// opened, use perror() to print "Couldn't open file" and why.

rule_t *bake_target_rule(bake_t *bake, char *targname);
// PROBLEM 1: Searches bake for a rule with a target that equals
// `targname`. If found, returns a pointer to that rule. Otherwise
// returns NULL if no rule with target `targname` exists.
//
// DESIGN NOTE: The curent design makes use of linear search for
// target location, O(rule_count) for each lookup. Maintainting a hash
// table of target->rule key/vals would allow O(1) lookup for rules
// with given target names.

void bake_print_cmd(cmd_t *cmd);
// PROBLEM 1: If the SILENCE_BIT is set, does nothing and immediately
// returns. Otherwise, prints the given command token by token. If I/O
// redirects are indicated by the fields of the command, prints the
// after all tokens with "< infile" for input followed by "> outfile"
// if present. This function is used to print out commands in
// bake_execute_cmd().

////////////////////////////////////////////////////////////////////////////////
// PROBLEM 2
////////////////////////////////////////////////////////////////////////////////

int bake_execute_cmd(cmd_t *cmd);
// PROBLEM 2: Called during bake_do_update(). Prints the command using
// bake_print_cmd() unless its SILENCE bit is set. fork()'s a child
// process which exec's the command specified in cmd->tokens.  Sets up
// I/O redirection in child process if indicated by cmd flags and
// fields. Uses a wait() call to wait until the child process is
// done. If the child completes normally, its exit code is passed up,
// 0 or non-zero. If the child process fails to complete normally or
// prints an error message with line number of command
// CMD_NONSTAND_EXIT. Non-zero returns from this function will usually
// be handled in a build by cancelling the build whil 0 return will
// cause the build to prceed and execute subsequent commands.
//
// During Input and Output redirection, if the requrested files cannot
// be opened, the child process exits with code CMD_FAIL_INPREDI or
// CMD_FAIL_OUTREDI. Additionally, if the child fails to exec() a
// command, CMD_FAIL_EXEC is used as its return code. All of these are
// non-zero and trigger builds to fail. In the above failure cases,
// prints one of the below error messages appropriate to the error.
// 
// perror("ERROR: can't open file for output")
// perror("ERROR: can't open file for output")
// perror("ERROR: job failed to exec");
//
// CLARIFICATION: When open()'ing files for output redirection,
// certain options should be passed to ensure that the created file
// adheres to conventions
// - For the second argument to open(), pass options
//     O_WRONLY|O_CREAT|O_TRUNC
//   which will open the file for writing, create it if not present,
//   and truncate the file if it already exists
// - For the third argument to open(), pass the arguments
//     S_IRUSR|S_IWUSR
//   which will set the "rw" permissions on the created file for the
//   owner (user) of the file

////////////////////////////////////////////////////////////////////////////////
// PROBLEM 3
////////////////////////////////////////////////////////////////////////////////

int bake_set_updates(bake_t *bake, char *targname);
// PROBLEM 3: Starting at `targname`, recurses down the
// target/dependency DAG setting the UPDATE bit in rule_flags to
// indicate whether a target requires updates. Makes use of the
// CLEAR_BIT(), CHECK_BIT(), and SET_BIT() macros when dealing with
// bit-manipulations. For a given rule, first visits all its
// dependencies to check if they require updates via recursion. Then
// determines if `targname` requires an update. The conditions that
// require an update are described in detail in the program
// specification and shoudl be consulted while implementing this
// function. Returns 1 if target needs to be updated so associated
// rule commands should be run later. Returns 0 if the named target
// does not need to be updated. Prints an error if targname is not
// found and returns -1. If any dependency returns -1, an error
// ocurred lower down the DAG. In case, nothing is printed nothing and
// -1 is returned up the nested recursion.

int bake_do_updates(bake_t *bake, char *targname);
// PROBLEM 3: Starting at `targname`, run commands if the UPDATE bit
// is set. Before running commands associated with `targname`,
// recursively visit dependencies and if their UPDATE bit is set, run
// their commands first. Returns number of rules that were updated or
// -1 to indicate an error ocurred while running commands.

////////////////////////////////////////////////////////////////////////////////
// PROBLEM 4
////////////////////////////////////////////////////////////////////////////////

int bake_cmd_postprocess(bake_t *bake, cmd_t *cmd);
// PROBLEM 4: Examines cmd->tokens to set cmd->flags and a few other
// fields. Used to post-process the raw tokens of a command. Likely
// uses the utility function array_shift() to ease the re-arrangment
// of arrays.
// 
// 1. If tokens[0] = "@", eliminates this token by shifting all tokens
// over and sets the SILENCED bit
// 
// 2. If tokens[i] = "<", input redirection is desired; if tokens[i+1]
// is not NULL, returns -1. Otherwise sets the INPREDI bit and
// input_redirect field to tokens[i+1]. Then shifts all tokens over to
// eliminate tokens[i] and tokens[i+1].
// 
// 3. If tokens[i] = ">", output redirection is desired and performs
// similar operations to #2 with the OUTREDI flag and output_redirect
// fields.
//
// Correctly handles any order of "< input" / "> output" and other
// tokens (ex: output redirection may appear first followed by normal
// tokens followed by input redirection)
// 
// If multiple "<" or ">" tokens appear, the behavior of this function
// is unspecified: it may return an error, segfault, or behave randomly.
// 
// Returns 0 on succesful completion.
//
// DESIGN NOTE: This function must be called prior to any commands
// being run. The ideal location is during parsing to modify commands
// as a bake_t structure is built. However, this functionality is
// separated out to enable future functionality such as variable
// substitution to be added to commands and tested separately from
// parsing. Any additional context needed for such things will be part
// of the 'bake' parameter (e.g. may contain a map of variable
// names/values for substitutions).

void bake_post_process(bake_t *bake);
// PROBLEM 4: Applies postprocessing operations to the bake after
// loading it from a file. Currently only iterates over all rules and
// applies bake_cmd_postprocess() to each of their commands.
//
// DESIGN NOTE: This function is where additional operations could be
// used to further modify the bakefile such as performing variable
// substitutions on rules / targets, including other bakefiles. etc.

////////////////////////////////////////////////////////////////////////////////
// PROBLEM 5 MAKEUP CREDIT (OPTIONAL)
////////////////////////////////////////////////////////////////////////////////

int bake_check_cycles(bake_t *bake, char *targname, int do_print);
// PROBLEM 5 MAKEUP: Analyzes `bake` starting at targname to
// deteremine if any cycles are present which would create problems
// completing a build: if targA depends on targB depends an targA, a
// cyclcil dependency exists and could not be fulfilled.  Returns 1 if
// a cycle is present, 0 if no cycle is present, and -1 if an error
// occurs such as not finding a needed target.
//
// If do_print is non-zero, prints out a message like:
// 
//   ERROR Cyclic Dependency detected:
//   fileA.txt -> fileB.txt -> fileC.txt -> fileA.txt
// 
// where the first line is always identical and the second line shows
// the path from the initial target through a cycle. Implementations
// assume that these messages will fit in a 2048-byte array and have
// undefined behavior if the message is larger.
//
// NOTES: This function needs to initiate a recursive descent into
// `bake` via a helper function which does the brunt of the work.
// This helper function may be defined in whatever way with whatever
// prototype is convenient. Global variables may be used to help
// generate cycle message BUT implementations with pride will find
// ways to avoid the use of global variables.

3.4 Provided Utility Functions

The emphasis of the project is on utilizing some system calls to achieve an interesting effect. To that end, some code is provided to focus attention on these aspects of the project. Most of these functions will be used by code in bake_funcs.c or the main() function so consult these enough to know how to use them.

Notable functions are:

Dprintf()
Print a debugging message which is only shown if an environment variable is set. Extremely useful during debugging and alleviates the need to remove debugging messages after the fact.
array_shift()
moves around elements in an array to "delete" one. Useful when post-processing the tokens in a cmd_t.
bake_create_from_file()
reads the contents of a file and create a bake_t from it. This is tricky code worth studying but students are not required to modify it in any way, just use it in main() to load a specified file then post-process the data structure before operating on it. The function makes use slurp_file_efficient() which students will need to write. Some other functions used in parsing are provided as well.

Below is an outline of the provided function. The full source can be viewed in the indicated source file.

// bake_util.c: provided functions for the bake program

#include "bake.h"

////////////////////////////////////////////////////////////////////////////////
// string processing
////////////////////////////////////////////////////////////////////////////////

void split_into_lines(char *text, char ***lines, int *nlines);
// Modifies `text` so that each line ends with a \0 (rather than
// \n). Sets lines to point to a malloc()'d array of the start of each
// line. Sets nlines to be the number of lines.
//
// EXAMPLE:
// char text[32] = "every good\nboy does fine\nFACEG\n\0";
// char **lines; int nlines;
// split_into_lines(text, &lines, &nlines);
// - text[] is now "every good\0boy does fine\0FACEG\0\0"
// - nlines is now 4
// - lines is now {"every good","boy does fine","FACEG",""}
// - lines[i] points into text[]

void split_into_tokens(char *line, char ***tokens, int *ntokens);
// Modifies line so that whitespace-separate characters are replaced
// with \0's to denote 'tokens'. Sets parameter `tokens` to be a
// malloc()'d array with pointers to each of the tokens. Sets ntokens
// to be the nubmer of tokens in the line.
//
// EXAMPLE:
// char line[32] = "hello  :  friendly world\0";
// char **tokens; int notkens;
// split_into_tokens(line, &tokens, &ntokens);
// - line[] is now "hello\0\0:\0\0friendly\0world\0"
// - ntokens is now 4
// - tokens is now {"hello",":","friendly","world"}
// - tokens[i] points into line[]

////////////////////////////////////////////////////////////////////////////////
// general utilities
////////////////////////////////////////////////////////////////////////////////

void array_shift(char *strs[], int delpos, int maxlen);
// shift the array of strings left by 1 starting at delpos; eliminates
// delpos from the array.

long diff_timespec(struct timespec a, struct timespec b);
// returns difference in times between a-b; positive return means a is
// newer than b; negative indicates b is newer than a

void Dprintf(const char* format, ...);
// Prints out a message if the environment variable DEBUG is set;
// Try running as `DEBUG=1 ./some_program`

void ind_printf(int ind, const char* format, ...);
// Like printf but indents 'ind' spaces; used for printing recursive
// data structures

////////////////////////////////////////////////////////////////////////////////
// bake-specific provided functions
////////////////////////////////////////////////////////////////////////////////

rule_t *bake_add_empty_rule(bake_t *bake);
// Modifies `bake` to add a new, empty rule to it and returns a
// pointer to that rule. If bake->rules[] is full (rule_capacity and
// rule_count are equal) doubles the size of rules[] via realloc() in
// order create room at its end for the new empty rule. Returns a
// pointer to the new empty rule.
//
// This function intitalizes all the data in the returned rule to NULL
// or 0 including the nested arrays. It uses of the memset() to
// quickly initialize the entire rule_t struct to 0's. This is
// possible due to the nested arrays being within the rule_t rather
// than pointers to other blocks of memory which means a single
// memset() call will suffice.
// 
// CAUTION: Calling this function MAY invalidate any pointers to
// existing rules as the array that houses the rules may move.
// This sequence is dangerous:
//   rule_t *rule  = bake_target_rule(bake, "sometarget");
//   rule_t *empty = bake_add_empty_rule(bake);
//   rule may now point to de-allocated memory

int bake_add_implicit_rules(bake_t *bake);
// Iterate over all rules appending implicit rules for any dependency
// that is not an explicit target. New rules added in this way are
// marked with the RULE_IMPLICIT_BIT. The number of new rules added is
// returned.  
//
// Situations where a program artifact has a source file dependency
// often occur such as in:
// ------------------------------------------------------------
// progA : src1.o src2.o           # EXPLICIT RULE 1
//         gcc src1.o src2.o
//
// src1.o : src1.c                 # EXPLICIT RULE 2
//         gcc -c src1.c
//
// src2.o : src2.c                 # EXPLICIT RULE 3
//         gcc -c src2.c
// ------------------------------------------------------------
// This would create an irregular rules array which would complicate
// the recursive traversal of the DAG data structure. Instead,
// implicit rules are added for the .c files here which need to exist
// to make the overall data structure look like this:
// ------------------------------------------------------------
// progA : src1.o src2.o           # EXPLICIT RULE 1
//         gcc src1.o src2.o
//
// src1.o : src1.c                 # EXPLICIT RULE 2
//         gcc -c src1.c
//
// src2.o : src2.c                 # EXPLICIT RULE 3
//         gcc -c src2.c
//
// src1.c :                        # IMPLICIT RULE 4
//
// src2.c :                        # IMPLICIT RULE 5
// ------------------------------------------------------------
// The Implicit rules are added at the end and are automatically
// satisfied so long as the Target files exist.
//
// CAUTION: Since bake->rules[] may expand, care must be taken when
// referencing rules in this function to avoid inadvertently derefing
// a pointer to free()'d memory.

bake_t *bake_create_from_file(char *fname);
// parse file and create a bakefile from it

void bake_free(bake_t *bake);
// De-allocates memory associated with a bake_t

void bake_show_cmd(cmd_t *cmd, int ind);
// print a formatted version of cmd_t for testing / debugging

void bake_show_rule(rule_t *rule, int ind);
// print a formatted version of a rule_t for testing / debugging

void bake_show_bake(bake_t *bake, int ind);
// print a formatted version of a bake_t for testing / debugging

3.5 Sample Usage

Below is an example use of the complete bake program to perform make-like activities in the provided test_data12/ directory.

  1: # Demonstration of bake 
  2: >> cd p4-code
  3: 
  4: >> make bake
  5: gcc -Wall -Wno-comment -Werror -g -Wno-format-security -c bake_funcs.c
  6: gcc -Wall -Wno-comment -Werror -g -Wno-format-security -c bake_main.c
  7: gcc -Wall -Wno-comment -Werror -g -Wno-format-security -c bake_util.c
  8: gcc -Wall -Wno-comment -Werror -g -Wno-format-security -o bake bake_funcs.o bake_main.o bake_util.o
  9: 
 10: >> cd test_data12
 11: 
 12: >> cat Bakefile1                          # show a sample Bakefile
 13: # this is a comment
 14: 
 15: hello : hello.c
 16:       gcc -o hello hello.c
 17: 
 18: # this is the end of the Bakefile
 19: 
 20: >> ../bake -f Bakefile1                   # use bake to build the targets
 21: bake: updating 'hello' via 1 command(s)
 22: gcc -o hello hello.c 
 23: bake complete, 1 update(s) performed
 24: 
 25: >> ./hello                                # run the built program
 26: Hello world!
 27: 
 28: >> ../bake -f Bakefile1                   # build again but there's no action needed
 29: bake: file 'hello' is up to date
 30: 
 31: >> rm hello                               # delete hello program
 32: 
 33: >> cp Bakefile1 Bakefile                  # copy to the default name 'Bakefile'
 34: 
 35: >> ../bake                                # build using default 'Bakefile'
 36: bake: updating 'hello' via 1 command(s)
 37: gcc -o hello hello.c 
 38: bake complete, 1 update(s) performed
 39: 
 40: >> ../bake                                # again, rebuilding need take no action
 41: bake: file 'hello' is up to date
 42: 
 43: >> cat Bakefile3                          # show a more complex Bakefile
 44: # this is a leading comment
 45: 
 46: # this target uses the program that is built
 47: # and runs it printing messages around it
 48: demo : hello bye
 49: 	@ echo Running programs
 50: 	./hello
 51: 	./bye
 52: 	@ echo Done running programs
 53: 
 54: all : hello bye
 55: 
 56: # this is program 1 / 2
 57: hello : hello.o
 58: 	gcc -o hello hello.o
 59: 
 60: hello.o : hello.c
 61: 	gcc -c hello.c
 62: 
 63: # this is program 2 / 2
 64: bye : bye.o
 65: 	gcc -o bye bye.o
 66: 
 67: bye.o : bye.c
 68: 	gcc -c bye.c
 69: 
 70: # this target removes built files
 71: clean :
 72: 	rm -f hello.o bye.o hello bye
 73: 
 74: # END OF THE Bakefile3
 75: 
 76: >> ../bake -f Bakefile3 clean             # clean up any targets that exist
 77: bake: updating 'clean' via 1 command(s)
 78: rm -f hello.o bye.o hello bye 
 79: bake complete, 1 update(s) performed
 80: 
 81: >> ../bake -f Bakefile3 all               # build with a named target 'all'
 82: bake: updating 'hello.o' via 1 command(s)
 83: gcc -c hello.c 
 84: bake: updating 'hello' via 1 command(s)
 85: gcc -o hello hello.o 
 86: bake: updating 'bye.o' via 1 command(s)
 87: gcc -c bye.c 
 88: bake: updating 'bye' via 1 command(s)
 89: gcc -o bye bye.o 
 90: bake: updating 'all' via 0 command(s)
 91: bake complete, 5 update(s) performed
 92: 
 93: >> ../bake -f Bakefile3                   # build with the default 1st target 'demo'
 94: bake: updating 'demo' via 4 command(s)
 95: Running programs
 96: ./hello 
 97: Hello world!
 98: ./bye 
 99: See you, space cowboy
100: Done running programs
101: bake complete, 1 update(s) performed
102: 
103: >> ../bake -f Bakefile3 clean             # clean up the build
104: bake: updating 'clean' via 1 command(s)
105: rm -f hello.o bye.o hello bye 
106: bake complete, 1 update(s) performed
107: 
108: >> ../bake -f Bakefile3 bye               # build the target 'bye'
109: bake: updating 'bye.o' via 1 command(s)
110: gcc -c bye.c 
111: bake: updating 'bye' via 1 command(s)
112: gcc -o bye bye.o 
113: bake complete, 2 update(s) performed
114: 
115: >> ./bye                                  # run bye program
116: See you, space cowboy
117: 
118: >> ../bake -f Bakefile3 no-target         # show the behavior on a missing target
119: No rule to create target 'no-target'
120: bake failed
121: 
122: >> ../bake -f Bakefile3 all               # build everything
123: bake: updating 'hello.o' via 1 command(s)
124: gcc -c hello.c 
125: bake: updating 'hello' via 1 command(s)
126: gcc -o hello hello.o 
127: bake: updating 'all' via 0 command(s)
128: bake complete, 3 update(s) performed
129: 
130: >> ../bake -f Bakefile3 all               # verify no updates are needed
131: bake: nothing to be done for target 'all'
132: 
133: >> touch -d '-1min' hello hello.o         # change timestamp on program / .o
134: >> touch hello.c                          # to be older than the source file
135: 
136: >> ../bake -f Bakefile3 all               # bake deteces the need to update hello but not bye
137: bake: updating 'hello.o' via 1 command(s)
138: gcc -c hello.c 
139: bake: updating 'hello' via 1 command(s)
140: gcc -o hello hello.o 
141: bake: updating 'all' via 0 command(s)
142: bake complete, 3 update(s) performed

4 Problem 1: Service Functions for Bake

The functions here are meant to acquaint students with some of the data types and functionalities required in Bake. None of the functions is terribly long and should be considered a "warm-up" for later work.

4.1 Slurping Files

Code is provided in bake_create_from_file() to parse a Bakefile and create a bake_t from its contents. This requires that the entire file be loaded into memory via a "slurp". This will be provided by the following function.

char *slurp_file_efficient(char *fname);
// PROBLEM 1: Uses combination of stat() and read() to efficiently
// read in the entire contents of a file into a malloc()'d block of
// memory, null terminates it (\0). Returns a pointer to the file data
// or NULL if the file could not be opened.

As the documentation suggests, the goal here is efficiency.

  • Make use of the stat() system call to determine the size of the file in bytes and use a single malloc() to allocate enough space for it
  • Use an open() call and as few read() calls as possible to read the entire contents of the file into the allocated memory

While in many cases it is not efficient to read the entire contents of a file into memory, Bakefiles tend to a have a memory footprint that is similar to the in-memory data structure reflecting the file contents so this approach is justified.

4.2 Looking up Rules

The main data of Bakefiles is stored in the bake_t rules[] array. In other parts of the code, one must find a rule_t struct in this array based on its target so the following function is reuired.

rule_t *bake_target_rule(bake_t *bake, char *targname);
// PROBLEM 1: Searches bake for a rule with a target that equals
// `targname`. If found, returns a pointer to that rule. Otherwise
// returns NULL if no rule with target `targname` exists.
//
// DESIGN NOTE: The curent design makes use of linear search for
// target location, O(rule_count) for each lookup. Maintainting a hash
// table of target->rule key/vals would allow O(1) lookup for rules
// with given target names.

4.3 Working with Bits in Flags Fields

Several structs like cmd_t and rule_t have flags fields that govern their behavior. "Flags" is a colloquial term for true/false options that change the behavior of some entity. In this case, the flags are stored as a series of 0's and 1's in a struct field which must be checked or changed in some situations.

When working with bits in the flag fields of structs, you are encouraged to use the following 3 macros which are provided in the bake.h header. This will make your code more readable and less error-prone.

// Macros to deal with bits in flags
#define SET_BIT(flag,   bitmask) (flag |=  bitmask)
#define CLEAR_BIT(flag, bitmask) (flag &= ~bitmask)
#define CHECK_BIT(flag, bitmask) (flag &   bitmask)

As an example, one might check whether a particular Command is to printed or is silenced using a block like this:

if( CHECK_BIT(cmd->cmd_flags, CMD_SILENCE_BIT) ){
  // code for True / is silenced
}
else{
  // code for False / is not silenced
}

4.4 Printing Commands

Aside from rule_t structs, the other major struct is cmd_t. This struct encodes a command to run including aspects such as

  • The Tokens for the command: a command like

    gcc -o program -g file1.c file2.c
    

    would have a tokens[] array that looks like

    tokens[] = {"gcc","-o","program","-g","file1.c","file2.c",NULL}
    

    Tokens will serve the role of argv[] when exec()'ing the command.

  • Whether or not the command is "silenced": started with the @ symbol in the source file which suppresses printing. This is in cmd_flags as presence/absence of CMD_SILENCE_BIT.
  • Whether to redirect input or output for the command. These attributes show up in the fields input_redirect / output_redirect and within cmd_flags.
  • The source file line the command came from, useful when reporting errors.

Commands are printed when the are run unless silenced and the following function is responsible for formatting them.

void bake_print_cmd(cmd_t *cmd);
// PROBLEM 1: If the SILENCE_BIT is set, does nothing and immediately
// returns. Otherwise, prints the given command token by token. If I/O
// redirects are indicated by the fields of the command, prints the
// after all tokens with "< infile" for input followed by "> outfile"
// if present. This function is used to print out commands in
// bake_execute_cmd().

5 Problem 2: Executing Commands

It is typical during builds to run commands that may compile code, transform inputs, or run tests. The cmd_t struct encodes a single command to run. It is used to execute the command and then check the results.

  • If the command is successful, further commands may be run and dependent targets can be updated
  • If the command fails, then it is likely the build cannot proceed. Any dependent targets should be notified that they NOT run their commands.

The central function of this problem executes commands.

int bake_execute_cmd(cmd_t *cmd);
// PROBLEM 2: Called during bake_do_update(). Prints the command using
// bake_print_cmd() unless its SILENCE bit is set. fork()'s a child
// process which exec's the command specified in cmd->tokens.  Sets up
// I/O redirection in child process if indicated by cmd flags and
// fields. Uses a wait() call to wait until the child process is
// done. If the child completes normally, its exit code is passed up,
// 0 or non-zero. If the child process fails to complete normally or
// prints an error message with line number of command
// CMD_NONSTAND_EXIT. Non-zero returns from this function will usually
// be handled in a build by cancelling the build whil 0 return will
// cause the build to prceed and execute subsequent commands.
//
// During Input and Output redirection, if the requrested files cannot
// be opened, the child process exits with code CMD_FAIL_INPREDI or
// CMD_FAIL_OUTREDI. Additionally, if the child fails to exec() a
// command, CMD_FAIL_EXEC is used as its return code. All of these are
// non-zero and trigger builds to fail. In the above failure cases,
// prints one of the below error messages appropriate to the error.
// 
// perror("ERROR: can't open file for output")
// perror("ERROR: can't open file for output")
// perror("ERROR: job failed to exec");

5.1 Fork / Exec / Wait

As the documentation indicates, the central idea of the function is to fork() a child process to execute the command.

The child will exec() the command described in the cmd_t tokens[] array. tokens[0] can be used as a command name like gcc with the entirety of tokens[] comprising an argv[] for the command to be executed.

The parent process bake will wait() for this child to finish its execution then check the status of its completion. If the child exists normally and has a 0-return code, it will have completed successfully and the function returns 0. If things go wrong, a non-zero value is returned.

5.2 Setting Up I/O Redirection

If the fields / flags for the command indicate as much, Input and/or Output redirection for the command should be set up. This needs to occur only in the child process and should happen prior to an exec() call.

  • Input redirection will alter Standard Input to read from a file named in cmd->input_redirect
  • Out redirection will alter Standard Output to write to a file named in cmd->output_redirect

The dup2() system call is useful for this and should be reviewed as it has been presented in lab and lecture.

5.3 Failure Detection

There are a number of things that can go wrong in a child process that need to be detected and reported. Specific error messages and return codes are required for each of the following situations. Each condition is something that the child process may detect. When these arise, the child should print the indicated message with perror() and call the exit(ERRCODE) function to immediately exit with the specified Error Code. The parent process will pass this Error Exit Code up as its return value. In the case that a child command has a non-standard exit (e.g. segfault) this will be detected in the parent which does not print anything but does return the indicated error code.

Situation In Error Code perror() Message
Input Redirection Fails Child CMD_FAIL_INPREDI ERROR: can't open file for input
       
Output Redirection Fails Child CMD_FAIL_OUTREDI ERROR: can't open file for output
       
Exec Fails Child CMD_FAIL_EXEC ERROR: command failed to exec
       
Nonstandard Child Exit Detected Parent CMD_NONSTAND_EXIT None

6 Problem 3: Walking the Bake DAG

This problem implements two functions responsible for traversing the structure within a Bakefile and detecting if targets need to be updated. The traversal is necessarily recursive: fileA may be the initial target but depends on fileB which in turn depends fileC. If fileC needs to be updated, then fileB and fileA will too. The technical term of the underlying data structure Bakefiles encode is a Directed Acyclic Graph (DAG) and is the abstract data type of choice for representing dependencies. The data type is

  • A mathematical Graph as it is comprised of vertices and edges (nodes and the lines between them)
  • Directed as the edges (lines) between vertices (nodes) have a direction: fileA depends on fileB, usually drawn as a line from fileB to fileA or "do fileB first then you can do fileA".
  • Acyclic in that there should be no cycles / circles on following edges. A cycle of dependencies like "fileA depends on fileB depends on fileC depends on fileA" would make it impossible to know which file should be updated first. Checking for cyclic dependencies will be part of the Makeup credit in the final release phase.

makefile-dag.png

Figure 1: A drawing of a Bakefile / Makefile DAG (Directed Acyclic Graph); DAG's are similar to trees BUT vertices/nodes may have multiple parents.

The two required functions for this problem traverse the DAG embedded in a Bakefile but perform two separate actions on it.

int bake_set_updates(bake_t *bake, char *targname);
// PROBLEM 3: Starting at `targname`, recurses down the
// target/dependency DAG setting the UPDATE bit in rule_flags to
// indicate whether a target requires updates. Makes use of the
// CLEAR_BIT(), CHECK_BIT(), and SET_BIT() macros when dealing with
// bit-manipulations. For a given rule, first visits all its
// dependencies to check if they require updates via recursion. Then
// determines if `targname` requires an update. The conditions that
// require an update are described in detail in the program
// specification and shoudl be consulted while implementing this
// function. Returns 1 if target needs to be updated so associated
// rule commands should be run later. Returns 0 if the named target
// does not need to be updated. Prints an error if targname is not
// found and returns -1. If any dependency returns -1, an error
// ocurred lower down the DAG. In case, nothing is printed nothing and
// -1 is returned up the nested recursion.

int bake_do_updates(bake_t *bake, char *targname);
// PROBLEM 3: Starting at `targname`, run commands if the UPDATE bit
// is set. Before running commands associated with `targname`,
// recursively visit dependencies and if their UPDATE bit is set, run
// their commands first. Returns number of rules that were updated or
// -1 to indicate an error ocurred while running commands.

6.1 Recursing on Bake Structs

The essence of both bake_set_updates() and bake_do_updates() is to start at some target in the Bake DAG, possibly visit all of its dependencies via recursion, then perform some actions on the current target before returning. This style of recursive code will be familiar to those who have programmed Tree Traversals before: it is a "post-order traversal" but adapted to a DAG rather than a tree.

  • bake_set_updates(bake,target): Determine if the RULE_UPDATE_BIT should be set for this target's rule as it needs its commands run to update it. If any dependency needs an update, this target also needs an update. If no decencies need updates, then this target may need to be updated based on a few other factors.
  • bake_do_updates(bake,target): If this target's rule has its RULE_UPDATE_BIT set, recurse on all its dependencies updating them, then run the commands on the current rule to update it.

In both cases, the general outline of the functions will be akin to the following:

bake_recursive_func(bake, target):
  check that target is in the rules[] array of bake
  if not, return print "No rule to create target XXX" and return -1 for an error

  Perform some initial checks / actions on the current rule

  Loop over all dependencies calling bake_recursive_func() on them
  Capture the return values of recursive call on each dependency
  Use those returns to potentially take further action on the current target

  return whether or not action was taken on the current target

There are details to be filled in for vagaries like "initial checks" and "take further action" but the overall structure of both function is similar and will follow the above template. Remaining sections provide details.

6.2 Setting Update Flags

bake_set_updates(bake,target) must determine whether to set the RULE_UPDATE_BIT in the rule_flags field of a rule to indicate that the rule's commands should be run to update it. Alternatively, the analysis may instead leave that bit clear in which case the target is already up to date and the commands need not be run. Here are the conditions that should be checked to determine whether to set RULE_UPDATE_BIT.

  1. By default, rules should NOT require an Update unless one of the following things is true.
  2. If the rule is Implicit (RULE_IMPLICIT_BIT is set), then it does not need to be updated. Implicit rules correspond to some file that is dependency of an Explicit rule. An exception to this is if the Implicit rule target file is missing: fileA.c is the target of the rule as it is needed by some other target but there is no fileA.c in the build directory. Check for the presence of the target file and if it is missing, print the error message

    No rule to create target 'XXX'
    

    with XXX replaced by the target file. then return -1.

  3. Recursively call bake_set_updates() on each dependency of the given target.
    • If any of these dependencies returns -1 for an error, pass this up the recursion
    • If any dependency returns a 1 to indicate it requires an update, then the current rule requires and update so set the RULE_UPDATE_BIT.
    • Finally, for each dependency, check whether the dependency is a file and the target is a file. If both are, compare their timestamps. If the dependency is Newer than the target, the target will require an update. See the additional information on modification times below.
  4. Check if the current target is a file. If not and there are commands to run, those commands might create the target or achieve some other effect so it requires an update. A typical example is a clean target

    clean : 
    	rm -f file1.o file2.o program
    

    There are no dependencies that require checking and no actual file called clean that is created. The intent is that typing bake clean will always remove any of the mentioned files so it always needs an update.

At the end of bake_set_updates(), return a 1 if the RULE_UPDATE_BIT was set for the current target or a 0 if it was not.

6.3 Dealing with Files and Modification Times

Checking Presence of Files

At several places in bake_set_updates() one must check whether a file exists. For example a rule like

x.o : x.c
	gcc -c x.c

will need to check for the presence of both x.o and x.c at some point.

The access() system call is useful for this. It has been demonstrated in a recent lab and should be used to check if a file is present. The return value of access() will be a 0 to indicate success in accessing an existing file or -1 if an error occurred and a file could not be accessed, usually because it does not exist.

Comparing File Modification Times

The stat() system call should be used to get information such as modification times for files. stat() fills in a struct stat sb struct with information on the file. The Modification time is present in the fields of the stat struct usually in two fields.

  • long st_mtime : this old-school field gives the modification time to 1-second resolution. While present, modern systems provide a higher resolution modification time.
  • struct timespec st_mtim : this struct provides full Nanosecond resolution of modification times. It is slightly less easy to use at a timespec struct has several fields but is favored because it gives time at a higher fidelity.

Extract the st_mtim field after a stat call as it is the higher-resolution timestamp. To compare two timestamps, make use of the diff_timespec(a,b) function provided in bake_util.c.

long diff_timespec(struct timespec a, struct timespec b);
// returns difference in times between a-b; positive return means a is
// newer than b; negative indicates b is newer than a

The function is similar in spirit to strcmp(): it gives a sorting order for the two complex pieces of data passed to it.

The curious coder may look at the body of this function which shows how the fields of the timespec are used to determine the ordering.

6.4 Outline of Set Updates Function

To assist with understanding the general structure of this function and the nature of recursing on a DAG like this, the following outline of bake_set_updates() is provided which can be copied in and used to speed construction.

// PROBLEM 3: Starting at `targname`, recurses down the
// target/dependency DAG setting the UPDATE bit in rule_flags to
// indicate whether a target requires updates. Makes use of the
// CLEAR_BIT(), CHECK_BIT(), and SET_BIT() macros when dealing with
// bit-manipulations. For a given rule, first visits all its
// dependencies to check if they require updates via recursion. Then
// determines if `targname` requires an update. The conditions that
// require an update are described in detail in the program
// specification and shoudl be consulted while implementing this
// function. Returns 1 if target needs to be updated so associated
// rule commands should be run later. Returns 0 if the named target
// does not need to be updated. Prints an error if targname is not
// found and returns -1. If any dependency returns -1, an error
// ocurred lower down the DAG. In case, nothing is printed nothing and
// -1 is returned up the nested recursion.
int bake_set_updates(bake_t *bake, char *targname){
  rule_t *targ_rule = ???;
  if(targ_rule == NULL){
    printf("No rule to create target '%s'\n", targname);
    return -1;
  }

  // clears the update bit to default to false
  CLEAR_BIT(???);

  // special case for implicit targets: enure that they have a file
  // present that is accessible
  if(CHECK_BIT(???)){
    if( access(???)==0 ){                        // use the access() system call to check file exists
      return 0;
    }
    else{                                        // does not exist
      printf("Implicit rule cannot find file '%s'\n", targname);
      return -1;
    }
  }

  // visit each dependency of the target; run ALL iterations for this
  // loop unless a dependency returns an error. ALL depencies should
  // be checked even if one indicates tha the current target needs an
  // update, other depencies may also need updates and should be visited.
  for(int i=0; targ_rule->deps[i]!=NULL; i++){
    int dep_update = ???;            // recursive call to set updates on deeper rules
    if(dep_update == -1){            // dep check failed
      return -1;                     // note: traditional make checks this case to include target/dep in error message but we won't bother here
    }
    else if(dep_update == 1){        // dep needs update so this target also needs an update
      SET_BIT(???);
    }
    else if( ??? ){                  // check both target and dependency file exist, are normal files
      // both target and dep are files that exist, compare timestamps
      // for them as if the dependency file is newer than the target
      // file, an update is required

      // Use stat() to compare timestamps of target file and
      // dependency file; if the dependency is newer than the target,
      // set the update bit for this rule; about 10 lines of code
      ???;
    }
  }

  // special case for rules like 'clean : (nothing)'
  if(access(targname,F_OK)!=0 &&     // no file exists
     targ_rule->cmd_count>0)         // there are commands to run
  {
    SET_BIT(targ_rule->rule_flags, RULE_UPDATE_BIT); // update target
  }

  // return 1 if the UPDATE bit is set and 0 otherwie
  return ???;
}

6.5 Performing Updates

After determining whether each rule in a DAG should be updated or not, the bake_do_updates(bake,function) can be used to recurse through the bake DAG and execute any commands that are required to update targets. This process is again recursive: before performing a given target's commands, its dependencies should checked as if they need to be updated, their commands should be run first. A set of rules like the following make this apparent.

program : prog.o
	gcc -o program prog.o

prog.o : prog.c
	gcc -c prog.c

If prog.o does not exist, then the first command gcc -o program prog.o would fail. Instead, the commands associated with the dependency should be run first to create this file. Then program can be created.

Calls to bake_do_updates() assume that the RULE_UPDATE_BIT in each rule_flag is set or clear (1 or 0) according to whether a rule needs its commands updated or not. Usually this would happen a main() function by first calling bake_set_updates() and then calling bake_do_updates().

Here is the general flow of bake_do_updates(bake,target):

  • Check if the RULE_UPDATE_BIT is set for target; if not, there is nothing to be done so return immediately
  • Recursively call bake_do_updates() on all dependencies of the current target
  • Print out the message

    bake: updating 'XXX' via NNN command(s)
    

    where XXX is the target being updated and NNN is the number of commands that will be run to update it

  • Finally, execute all commands associated with the target using bake_execute_cmd().
  • Any errors that occur in the above should trigger an immediate return of -1 for from the function. That includes:
    • target isn't found
    • Any dependency returns an error
    • Any Command that executes fails or results in a non-zero Exit Code

Before returning, Clear the RULE_UPDATE_BIT from this rule to indicate that the target for the rule is now up to date. By setting RULE_UPDATE_BIT to 0, if this rule is ever visited again, the commands will not be repeated. An example of this appears in the next paragraph.

The return value of bake_do_updates() is the number of targets are updated (i.e. rules that have their commands run). This can be accomplished by

  • Returning 0 if a target does not need to be updated
  • Summing the return values of all dependencies (but not -1 for errors)
  • Adding one to the sum of updates after the commands for the current rule are run and then returning that sum

Below is an example Bakefile to demonstrate with calls to bake to sho the idea of the returning the total number of updates and avoiding executing the same rule twice.

all : prog1 prog2


prog1 : x.o y.o
	gcc -o prog1 x.o y.o

prog2 : x.o z.o
	gcc -o prog2 x.o z.o

x.o : x.c
	gcc -c x.c

y.o : y.c
	gcc -c y.c

z.o : z.c
	gcc -c z.c

Here is the behavior of running bake on this Bakefile

>> ../bake
bake: updating 'x.o' via 1 command(s)
gcc -c x.c 
bake: updating 'y.o' via 1 command(s)
gcc -c y.c 
bake: updating 'prog1' via 1 command(s)
gcc -o prog1 x.o y.o 
bake: updating 'z.o' via 1 command(s)
gcc -c z.c 
bake: updating 'prog2' via 1 command(s)
gcc -o prog2 x.o z.o 
bake: updating 'all' via 0 command(s)
bake complete, 6 update(s) performed
  • 6 total updates are done, one for each rule
  • Each update corresponds to one rule and its target
  • prog1 is updated first which means x.o must be created.
  • Running gcc -c x.c creates x.o; having completed target x.o, the RULE_UPDATE_BIT in its rule_flags will be cleared to indicate it is up to date.
  • prog2 also needs x.o BUT since x.o is up to date, there is no need to re-run any commands to create it. On visiting the x.o rule, its RULE_UPDATE_BIT is NOT set so no commands are run
  • Even though the target all has no commands run, it still counts as an update.

Output and Error Messages

Below is a table of output messages for bake_set_updates() and bake_do_updates() along with the conditions under which they should be printed.

Message Condition
No rule to create target '%s' Printed if a target cannot be found in the rules array which returns an error
   
Implicit rule cannot find file 'XXX' Printed if an Implicit Rule has it's file missing which results in an error: example:. file.c is
  listed as a dependency but there is no file.c available.
   
bake: updating 'XXX' via NNN command(s) Printed before running commands to update target XXX, NNN is the number of commands to be run
   
FFF:LLL ERROR during target 'XXX', exit code NNN Printed when a command to update XXX fails,
  gives the file FFF and line number LLL of the command along with the exit code NNN

7 Problem 4: Postprocessing and Main

The final steps to completing the Bake application are to do a bit of post-processing on the Bakefile contents before tying all functionality together in a main() function.

7.1 Post Processing

To make parsing a Bakefile reasonable, only a minimal amount of processing of its raw contents are done before shuffling them into a bake_t struct. Then intent is then to post-process this data structure to modify its contents based on supported features. This could include a variety of things including

  • Translating automatic and named variables to their values
  • Instituting pattern-based rules
  • Inclusion of other source files
  • etc.

For the simplest version of Bake, only one item is addressed in post-processing: modify each Command so that if it includes I/O redirection or command silencing, the associated cmd_t struct is modified to reflect this. Post-processing of commands are handled by the following function.

int bake_cmd_postprocess(bake_t *bake, cmd_t *cmd);
// PROBLEM 4: Examines cmd->tokens to set cmd->flags and a few other
// fields. Used to post-process the raw tokens of a command. Likely
// uses the utility function array_shift() to ease the re-arrangment
// of arrays.
// 
// 1. If tokens[0] = "@", eliminates this token by shifting all tokens
// over and sets the SILENCED bit
// 
// 2. If tokens[i] = "<", input redirection is desired; if tokens[i+1]
// is not NULL, returns -1. Otherwise sets the INPREDI bit and
// input_redirect field to tokens[i+1]. Then shifts all tokens over to
// eliminate tokens[i] and tokens[i+1].
// 
// 3. If tokens[i] = ">", output redirection is desired and performs
// similar operations to #2 with the OUTREDI flag and output_redirect
// fields.
//
// Correctly handles any order of "< input" / "> output" and other
// tokens (ex: output redirection may appear first followed by normal
// tokens followed by input redirection)
// 
// If multiple "<" or ">" tokens appear, the behavior of this function
// is unspecified: it may return an error, segfault, or behave randomly.
// 
// Returns 0 on succesful completion.
//
// DESIGN NOTE: This function must be called prior to any commands
// being run. The ideal location is during parsing to modify commands
// as a bake_t structure is built. However, this functionality is
// separated out to enable future functionality such as variable
// substitution to be added to commands and tested separately from
// parsing. Any additional context needed for such things will be part
// of the 'bake' parameter (e.g. may contain a map of variable
// names/values for substitutions).

An example Bakefile is useful to understand the intent of this function.

prog : dep1.c dep2.c
	cat dep1.c dep2.c > deps.c        # cmd0
	@ echo Counting lines in deps.c   # cmd1
	wc -l < deps.c                    # cmd2
	gcc -o prog deps.c                # cmd3

The comments like # cmd0 are technically not legal in a Bakefile but make it easier to reference the necessary transformations.

cmd0 Example

cat dep1.c dep2.c > deps.c        # cmd0
  • This command indicates Output Redirection is needed
  • The raw cmd->tokens[] array will be {"cat","dep1.c","dep2.c",">","deps.c",NULL}
  • bake_cmd_postprocess() will detect the ">" token and set the cmd->output_redirect field to be deps.c as will as set the CMD_OUTREDI_BIT in cmd->cmd_flags.
  • The tokens[] array will then be altered remove the output redirection tokens so that it appears as {"cat","dep1.c","dep2.c",NULL}

cmd1 Example

@ echo Counting lines in deps.c   # cmd1
  • This command indicates command printing should be Silenced
  • The raw cmd->tokens[] array will be {"@", "echo", "Counting", "lines", "in", "deps.c", NULL}
  • bake_cmd_postprocess() will detect the "@" as the 0th token and set the set the CMD_SILENCE_BIT in cmd->cmd_flags.
  • The tokens[] array will then be altered remove the @ token so that it appears as {"echo", "Counting", "lines", "in", "deps.c", NULL}

cmd2 Example

wc -l < deps.c                    # cmd2
  • This command indicates Input Redirection is needed
  • The raw cmd->tokens[] array will be {"wc","-l","<","deps.c",NULL}
  • bake_cmd_postprocess() will detect the "<" token and set the cmd->input_redirect field to be deps.c as will as set the CMD_INPREDI_BIT in cmd->cmd_flags.
  • The tokens[] array will then be altered remove the input redirection tokens so that it appears as {"wc","-l",NULL}

cmd3 Example

gcc -o prog deps.c                # cmd3

No modifications need be made to this command as it does not have any I/O redirection or silencing.

Re-arranging Token Arrays

The array_shift() function from ~bakeutils.c~* is useful for re-arranging arrays to remove tokens in the . However, be careful when using this function. Try it in a limited setting to ensure that you understand how to use it properly.

void array_shift(char *strs[], int delpos, int maxlen);
// shift the array of strings left by 1 starting at delpos; eliminates
// delpos from the array.

Post-Processing All Commands

bake_cmd_postprocess() is called in a loop in the following function.

void bake_post_process(bake_t *bake);
// PROBLEM 4: Applies postprocessing operations to the bake after
// loading it from a file. Currently only iterates over all rules and
// applies bake_cmd_postprocess() to each of their commands.
//
// DESIGN NOTE: This function is where additional operations could be
// used to further modify the bakefile such as performing variable
// substitutions on rules / targets, including other bakefiles. etc.

As the design note indicates, other transformations of the bake_t may be added here as the functionality of Bake expands but for now it is a relatively short set of nested loops to post-process commands.

7.2 Main Function

The final step in constructing Bake is to provide a main() function in bake_main.c which sequences all the previous functionality properly.

Command Line options

For convenience, bake will look for a default Bakefile just as make looks for a file named Makefile. Alternatively, a command line option allows for the specification of a specific file to use. IN addition, Bake supports denoting a target to build the command line just as Make does.

The following are example usages that must be supported.

>> bake                         
# use default "Bakefile" and 1st target in that file

>> bake all
# use Bakefile but build target "all" rather than 1st target

>> bake -f Bakefile2         
# use Bakefile2 with 1st target in that file

>> bake -f Bakefile2 all
# use Bakefile2 but build target "all" rather than 1st target

>> bake all -f Bakefile2         
# NOT SUPPORTED and not tested

>> bake -f NoFile
Couldn't open file: No such file or directory
ERROR: unable to process file 'NoFile'
# Example error message for when a requested Bakefile cannot be found

Note that your main() function will need to support each of the cases above except the NOT SUPPORTED command line structure. Conveniently, each of the supported cases has 1, 2, 3, and 4 command line arguments respectively which can make the analyzing these cases easier.

Outline of main() Logic

Below is a suggested outline of how the main() function might flow and make use of the previous functions.

 1: main:
 2:   process the command line arguments to determine if the default Bakefile or some other 
 3:     file should be loaded
 4: 
 5:   load the requested Bakefile int a bake_t struct
 6:   if the load fails, 
 7:     print "ERROR: unable to process file 'XXX'" 
 8:     substituting in the filename and exit the program with EXIT_FAILURE
 9: 
10:   if the loaded file has no rules in it (empty)
11:     print "ERROR: 'XXX' has 0 rules
12:     and exit with EXIT_FAILURE
13: 
14:   Add implict rules to the bake_t
15:   Post-process the bake_t
16: 
17:   If a target like "all" or "clean" was specified on the command line
18:     set that as the target to update
19:     otherwise set the 0th rule target as the one to update
20: 
21:   Analyze the bake_t starting at the specified target to set which
22:     targets need to be updated
23: 
24:   If the update analysis fails
25:     print "bake failed" and exit with a failure
26: 
27:   If the update analysis indicates the target does not need to be updated
28:     Check if the target is an existing file
29:       If so, print the message "bake: file 'XXX' is up to date"
30:       If not, print the mssage "bake: nothing to be done for target 'XXX'"
31:     Then return/exit with a 0 exit code for a successful build
32: 
33:   Since some targets need to be updated, recursively run comands to update
34:     dependencies and targets counting the total number of rule updates
35: 
36:   If updates fail (perhaps due to commands failing)
37:     print the message "bake failed" and exit with EXIT_FAILURE
38: 
39:   Otherwise, on a successful update, print the message
40:     "bake complete, NNN update(s) performed" 
41:     substituting in the number of updates (rules with commands run)
42:   
43:   return 0 for success

A few notes on this overview

  • While somewhat long, many of the lines above just translate to calls to previously written functions like bake_add_implicit_rules() or bake_do_updates()
  • When exiting you may need to free memory to prevent a memory leak. bake_util.c provides bake_free() to de-allocate the memory associated with a bake_t and its nested data. Note carefully the Exit cases above as many of them will require freeing memory.
  • At various points when errors are detected in the Bake process, the overview indicates "exit with EXIT_FAILURE". This is accomplished in one of two ways.

    • By return EXIT_FAILURE; from main()
    • By calling the C standard exit(EXIT_FAILURE) function from anywhere

    The constants EXIT_FAILURE (usually defined to be 1) and EXIT_SUCCESS (usually defined to be 0) are defined in included headers and may be used to convey a bit more semantic meaning about failure/success than just writing constants 1/0.

Main Output and Error Messages

The following table lists some of the expected output and error messages that main() will produce and the conditions associated with them.

Message Conditions
ERROR: unable to process file 'XXX' Requested Bakefile failed to load properly
   
ERROR: '%s' has 0 rules Requested Bakefile had no rules in it
   
bake failed Printed when errors occurred during bake_set_updates() or bake_do_updates()
   
bake: file 'XXX' is up to date No updates needed for XXX and it is an accessible file
   
bake: nothing to be done for target 'XXX' No updates needed for XXX and it is NOT an accessible file
   
bake complete, NNN update(s) performed Successful completion, substitute NNN for the number of updates performed

8 Grading Criteria   grading 100

8.1 Points by Section

The following criteria will be checked. Some are Automated and available during development via command like make test while others are done Manually by graders after submission.

Weight Criteria
  AUTOMATED TESTS: 1 point per test
45 TOTAL
   
10 Problem 1: make test-prob1 runs tests from test_prob1.org for correctness
10 Problem 2: make test-prob2 runs tests from test_prob2.org for correctness
10 Problem 3: make test-prob3 runs tests from test_prob3.org for correctness
15 Problem 4: make test-prob3 runs tests from test_prob4.org for correctness
  MANUAL INSPECTION
55 TOTAL
   
15 Code Style: Functions adhere to CMSC 216 C Coding Style Guide which dictates
  reasonable indentation, appropriate commenting, consistency of curly usage, etc.
   
10 Appropriate code shape and flow: no over-complicated sections that with unneeded nested conditionals,
  loops, etc. when simpler solutions would suffice.
   
5 PROBLEM 1 bake_funcs.c
  slurp_file_efficient():
  - stat used for size of file
  - efficient memory allocation via single malloc()
  - error checking open() calls reporting appropriately on failures
  bake_target_rule(): Iteration over bake_t data structure, use of strcmp()
  bake_print_cmd(): CHECK_BIT() macro used to inspect SILENCE_BIT, clear code to print I/O redirects
   
10 PROBLEM 2 bake_funcs.c bake_execute_cmd()
  Use of bake_print_cmd() to output command if needed
  Use of fork() to create a child process, code split between parent/child
  Parent uses wait() / waitpid() to block until child is done, identifies normal/abnormal child exit
  Child sets up input/output redirection via dup2()
  Chile uses an exec()-family function to execute the given command
  Child does error checking for failure of I/O redirection or exec and returns appropriate Exit Codes
   
  PROBLEM 3 bake_funcs.c
  Use of bit macros like SET_BIT() / CHECK_BIT() for bit flag manipulations
  Utilize previous functions like bake_target_rule() to access bake_t data
  Uses clear structure to visit all dependencies recursively, processes "lower" targets before "higher" ones
5 CRITERIA FOR bake_set_updates()
  Uses updates on dependencies to dictate whether a given target needs an update
  Handles special case of Implicit rules: target file must be present, no updates required
  Utilizes access() system call check presences/absence of target files
  Utilizes stat() system call to get timestamp information on target/dependency files
5 CRITERIA FOR bake_do_updates()
  Iterates over all commands using bake_execute_cmd() to run them when an update is needed
  Checks the return code of each command and terminates with an error if commands fail
  Clears the UPDATE bit after updating a target to ensure that the rule is updated only once
   
  PROBLEM 4 bake_funcs.c and bake_main.c
5 CRITERIA FOR bake_cmd_post_process() in bake_funcs.c
  Has cases to handle each of input redirection, output redirection, and silencing commands
  Handles command array re-arrangement reasonably, likely through use of array_shift()
10 CRITERIA FOR main() in bake_main.c
  Clear command line handling logic to detect -f <bakefile> option and command line target like all
  Code to use the default file Bakefile if no command line file is specified
  Use of bake_create_from_file() to load a Bakefile with Error checking to ensure it was loaded
  Use of bake_funcs.c functions like bake_add_implicit_rules and bake_post_process() after loading
  Handles case of no target being specified on the command line and defaults to the 0th rule target
  Utilizes bake_set_updates() to analyze the desired target and detect if updates are needed
  Utilizes bake_do_updates() to update the targets when needed
  Checks for error returns (-1) from various functions, de-allocates memory if so and exits with EXIT_FAILURE
   
  WORK_DISCLOSURE.txt
  Loss of up to 10% of project credit for failure to complete and type signature in work disclosure text file.

8.2 Work Disclosure

In conjunction with the Free Collaboration policy for projects, all submissions must include a WORK_DISCLOSURE.txt file. This document outlines the resources that were utilized to complete the project. Each significant resource, be it course staff member, fellow student, website, textbook, AI, or other item should be named with at least a sentence describing how that item influenced the submission.

The rough format of these disclosures is provided in the template WORK_DISCLOSURE.txt file that is part of the project. This document will be checked for reasonable completeness by staff during Manual Inspection. The provided template document is below and should be edited and included in the project submission.

                           _________________

                            WORK DISCLOSURE
                           _________________


(A) HUMAN COLLABORATORS
=======================

  Aside from the person submitting this assignment, the following people
  contributed ideas and discussion to the completion of this work
  INCLUDING course staff members. Write NONE if no collaborators were
  involved.
  - Person 1 <person1@email.com> helped understand Problem X and the
    meaning of...
  - Person 2 <person2@email.com> helped debug Code for Problem Y...
  - etc.


(B) RESOURCE UTILIZATION
========================

  The following resources such as websites, course notes, artificial
  intelligence tools (LLMs/ChatBots/etc.) were utilized in the
  completion of this work. Include course materials such as textbooks
  and lecture slides as well. (Write NONE if no resources were used
  [which would be hard to believe]).
  - Resource 1 is here <https://some.resource.org/useful_stuff.html> and
    provided help for Problem Z to understand...
  - Resource 2 is the book "C Code for Dummies" by Boo Kauthor with
    chapter 8 helping a lot with the malloc()/free() usage on Problem W
  - Resource 3 is here <https://airegurgitator.com> and provided AI
    refinements for the algorithm used on problem Q and also helped
    debug code for Problem N.
  - etc.


(C) ADHERENCE TO THE PRIME DIRECTIVE
====================================

  PRIME DIRECTIVE: Be able to explain your own work including assignment
  answers, program code, and exam solutions. The work you submit should
  be the product of your own effort and reflect your personal
  understanding. (See the course syllabus for more information.)

  I submit this work in accordance with the PRIME DIRECTIVE. I affirm
  that I can explain the code and answers within as I created them and
  they reflect my personal understanding.

  Signed,

  <REPLACE WITH SUBMITTER NAME>

9 Optional Problem 5: Makeup Credit

The data structure that Bakefiles are meant to encode is a DAG with the "A" indicating "acyclic" - no cycles of target/dependencies. There is no way to enforce this in the text file format that constitutes Bakefiles so a sloppy writer may create the following:

# cycle of A->B->C->A
fileA.txt : fileB.txt
	@ echo completed fileA.txt

fileB.txt : fileC.txt
	@ echo completed fileB.txt

fileC.txt : fileA.txt
	@ echo completed fileC.txt

Such structure is sometimes referred to as a cyclic dependency. The best that programs like Bake can do is to analyze the data provided to detect such cycles and bail out before inadvertently entering an infinite recursion. Here is an example run of an augmented bake which performs such checks and bails out if they fail.

>> cd data5/bake_check_cycles01/

>> cat Bakefile 
# cycle of A->B->C->A
fileA.txt : fileB.txt
	@ echo completed fileA.txt

fileB.txt : fileC.txt
	@ echo completed fileB.txt

fileC.txt : fileA.txt
	@ echo completed fileC.txt

>> ../../bake
ERROR Cyclic Dependency detected:
fileA.txt -> fileB.txt -> fileC.txt -> fileA.txt
bake failed

For 10 points of MAKEUP CREDIT add cyclic dependency checks to your implementation of Bake. This will require 2 major changes:

  1. Implement a top-level service function in bake_funcs.c that detects cycles
  2. Call that function in main() before performing recursive operations on a loaded bake_t structure

Below are details on aspects of this Makeup problem.

Reminder on MAKEUP CREDIT

Makeup Credit are additional points that allow students to exceed the total of points possible for the project. These points go into the overall pool of points earned on projects and will make up for lost credit on this or other projects.

MAKEUP CREDIT is NOT EXTRA CREDIT: student scores on the project portion of the course will not exceed 100%. However, MAKEUP CREDIT allows for students to ensure they get full project credit even with some mistakes.

9.1 bake_check_cycles()

Add the following top-level function to bake.h and your own bake_funcs.c.

int bake_check_cycles(bake_t *bake, char *targname, int do_print);
// PROBLEM 5 MAKEUP: Analyzes `bake` starting at targname to
// deteremine if any cycles are present which would create problems
// completing a build: if targA depends on targB depends an targA, a
// cyclcil dependency exists and could not be fulfilled.  Returns 1 if
// a cycle is present, 0 if no cycle is present, and -1 if an error
// occurs such as not finding a needed target.
//
// If do_print is non-zero, prints out a message like:
// 
//   ERROR Cyclic Dependency detected:
//   fileA.txt -> fileB.txt -> fileC.txt -> fileA.txt
// 
// where the first line is always identical and the second line shows
// the path from the initial target through a cycle. Implementations
// assume that these messages will fit in a 2048-byte array and have
// undefined behavior if the message is larger.
//
// NOTES: This function needs to initiate a recursive descent into
// `bake` via a helper function which does the brunt of the work.
// This helper function may be defined in whatever way with whatever
// prototype is convenient. Global variables may be used to help
// generate cycle message BUT implementations with pride will find
// ways to avoid the use of global variables.
  • As NOTES suggestion you will need to add one ore more additional recursive "helper" functions that are called by bake_check_cycles(). You may add functions of whatever sort you wish to bake_funcs.c: defining them above bake_check_cycles() in the C file will make them available for use in that function.
  • Identifying when cycles occur is only a modestly complex recursive problem. Make use of the RULE_VISITED_BIT for rule_flags as you recursively descend through rules. Employ a similar strategy as your bake_set_updates() function though take care to adapt to the current usage: don't blindly copy all aspects of the previous pattern.
  • Creating the ERROR message with the detected cycle is the trickiest aspect of the problem. Typical strategies will grow a string with each recursive call and then "shrink" it in some fashion as the recursion unwinds. You may find the sprintf() function useful in this context and the fact that strings can be ended by placing a \0 character in it. You may use global variables for your implementation if you can't figure out a way around doing so BUT know it is not difficult to do without them.

9.2 Bake main() for Cycle Checking

After completing the bake_check_cycles() function, add a call to it in your bake_main.c code in main(). Place the call prior to any other recursive calls. If the cyclic dependency check fails, exit with EXIT_FAILURE.

9.3 Problem 5 Grading Criteria

Weight Criteria
5 AUTOMATED TESTS:
  make test-prob5 runs tests from test_prob5.org for correctness
  Several tests for bake_check_cycles() and main() augmented for cycle detection
  1 point per test
   
5 MANUAL INSPECTION
  bake_check_cycles() uses a recursive helper function that is present and documented
  Recursive helper function has a clear structure of base and recursive cases
  Recursive helper function contains clear creation of an ERROR message tracking a cyclic path
  main() contains an active call to bake_check_cycles() with early exit on a failure
   
10 TOTAL

10 Assignment Submission

10.1 Submit to Gradescope

Refer to the Project 1 instructions and adapt them for details of how to submit to Gradescope. In summary they are

Command Line Submission
Type make submit to create a zip file and upload it to Gradescope; enter your login information when prompted.
Manual Submission
Type make zip to create pX-complete.zip, transfer this file to a local device, then upload it to the appropriate assignment on Gradescope via the sites web interface.

10.2 Late Policies

You may wish to review the policy on late project submission which will cost 1 Engagement Point per day late. No projects will be accepted more than 48 hours after the deadline.

https://www.cs.umd.edu/~profk/216/syllabus.html#late-submission


Web Accessibility
Author: Chris Kauffman (profk@umd.edu)
Date: 2026-04-20 Mon 15:59