CSCI 2021 Lab10: Timing Code and Machine Speed
- Due: 11:59pm Tue 04-Apr-2023 on Gradescope
- Approximately 1.00% of total grade
CODE DISTRIBUTION: lab10-code.zip
CHANGELOG: Empty
1 Rationale
Differences and oddities in CPU architecture are often revealed through observations about the time certain programs take to run. This lab explores these issues through a program that is provided as part of the most recent HW. The code implements some "micro-benchmarks" which repeatedly perform arithmetic operations with slight variations. Timing these, along with some knowledge of CPU architecture, is an instructive way to observe some of the low-level implementation differences between CPUs.
Grading Policy
Credit for this Lab is earned by completing the exercises here and submitting a Zip of the work to Gradescope. Students are responsible for checking that the results produced locally via `make test` are reflected on Gradescope after submitting their completed Zip. Successful completion earns 1 Engagement Point.
Lab Exercises are open resource/open collaboration and students are encouraged to cooperate on labs. Students may submit work as groups of up to 5 to Gradescope: one person submits then adds the names of their group members to the submission.
See the full policies in the course syllabus.
2 Codepack
The codepack for the lab contains the following files:
File | State | Description
---|---|---
QUESTIONS.txt | EDIT | Questions to answer: fill in the multiple choice selections in this file.
superscalar_main.c | Provided | C code also in HW10 that is used to observe CPU timing differences
superscalar_funcs.c | Provided |
Makefile | Build | Enables `make test` and `make zip`
QUESTIONS.txt.bk | Backup | Backup copy of the original file to help revert if needed
QUESTIONS.md5 | Testing | Checksum for answers in questions file
test_quiz_filter | Testing | Filter to extract answers from Questions file, used in testing
test_lab10.org | Testing | Tests for this lab
testy | Testing | Test running scripts
3 Timing on Two Machines
HW10 has students time the execution speed of several different arithmetic functions. The `time` utility is used for this, and your lab leader will briefly discuss the utility and the information it provides about program runs. This information is covered in HW10, posted here:
https://www-users.cs.umn.edu/~kauffman/2021/hw10.html
Timing code runs is interesting as one fully expects the results to vary from one computer to the next. Lab leaders will demonstrate as much by showing timings of the same program runs on two different processors available through the CSE Labs system:
- csel-plate01.cselabs.umn.edu, which is a server in the Keller Hall machine room (SSH only)
- csel-kh1260-NN.cselabs.umn.edu machines such as csel-kh1260-01, which are desktop workstations in Keller 1-260 (physical access or SSH login)
Running the same programs on these two machines will lead to different times, sometimes in ways that are quite surprising.
Lab staff will focus their presentation on timing the `superscalar_main` program which is central to HW10. It is used to run small integer benchmarks that repeatedly add / multiply in different combinations. Timings of these operations reveal peculiarities of some processors.
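To make the idea of such micro-benchmarks concrete, here is a minimal sketch of two loops that each add twice per iteration. It is NOT the code in `superscalar_funcs.c`: the function names `sketch_add2_diff` and `sketch_add2_same` and their parameters are hypothetical, chosen only for illustration.

```c
/* Hypothetical sketch of two micro-benchmarks; NOT the course's
   superscalar_funcs.c. Names and parameters are assumptions. */

/* Two additions per iteration into DIFFERENT accumulators:
   neither addition needs the result of the other. */
unsigned long sketch_add2_diff(unsigned long iters, unsigned long delta) {
  unsigned long a = 0;
  unsigned long b = 0;
  for (unsigned long i = 0; i < iters; i++) {
    a += delta;
    b += delta;
  }
  return a + b;
}

/* Two additions per iteration into the SAME accumulator:
   the second addition uses the result of the first. */
unsigned long sketch_add2_same(unsigned long iters, unsigned long delta) {
  unsigned long a = 0;
  for (unsigned long i = 0; i < iters; i++) {
    a += delta;
    a += delta;
  }
  return a;
}
```

Both functions compute the same value, so any difference in their run times comes from how the additions are arranged rather than from the amount of arithmetic. An optimizing compiler may simplify loops like these, so timings depend on the compiler flags used.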
NOTE: Staff will show results on both the `csel-plate01` and `csel-kh1260-NN` machines during the lab, but for HW10 the focus is timing only on the `csel-kh1260-NN` machines.
4 Using the time and lscpu Utilities
Staff will briefly discuss the `time` utility and cover the 3 types of times it reports. They may demonstrate timing of programs like the following and explain the differing times for these.
    > time make                      # build a program using a makefile
    ...
    > time ls -lR /sys > /dev/null   # recursive listing of /sys system directory
    ...
    > time ping -c 3 google.com      # contact google 3 times to see if it is responding
    ...
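The three measures that `time` reports (real, user, and sys) can also be collected from inside a C program. The following is a minimal sketch, assuming a POSIX system that provides `clock_gettime()` and `getrusage()`; the names `run_times`, `measure`, and `spin` are hypothetical and do not come from the lab code.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/resource.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper type holding the three measures `time` prints. */
struct run_times {
  double real; /* wall-clock seconds elapsed */
  double user; /* CPU seconds spent running the program's own code */
  double sys;  /* CPU seconds spent in the kernel on the program's behalf */
};

/* Convert a struct timeval to seconds as a double. */
static double timeval_secs(struct timeval tv) {
  return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Run work() once and report the three measures for this process. */
struct run_times measure(void (*work)(void)) {
  struct timespec w0, w1;
  struct rusage r0, r1;
  clock_gettime(CLOCK_MONOTONIC, &w0);
  getrusage(RUSAGE_SELF, &r0);
  work();
  getrusage(RUSAGE_SELF, &r1);
  clock_gettime(CLOCK_MONOTONIC, &w1);

  struct run_times t;
  t.real = (double)(w1.tv_sec - w0.tv_sec)
         + (double)(w1.tv_nsec - w0.tv_nsec) / 1e9;
  t.user = timeval_secs(r1.ru_utime) - timeval_secs(r0.ru_utime);
  t.sys  = timeval_secs(r1.ru_stime) - timeval_secs(r0.ru_stime);
  return t;
}

/* Example workload: integer additions only, no system calls in the loop. */
void spin(void) {
  volatile unsigned long x = 0;
  for (unsigned long i = 0; i < 20000000UL; i++) {
    x += i;
  }
}
```

Calling `measure(spin)` and printing the three fields gives numbers comparable to what `time` shows for a whole program, which can help when relating each measure to what the program was actually doing.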
Staff will mention which of the measures that `time` reports is most important to evaluating integer arithmetic code like `superscalar_main`.
Make sure to download the `superscalar_main` application from the HW10 specification here:
https://www-users.cs.umn.edu/~kauffman/2021/hw10.html
Staff will compile and time runs of it as students will do in HW10, using commands such as
    > make
    ...
    > time ./superscalar_main 1 30 add1_diff
    ...
When timing programs, it is good to know something about the CPU on which the program is being run. This can be obtained via the `lscpu` utility on Linux systems. It can be run just by typing
    > lscpu
    ...
and reports a variety of information including BogoMIPS, a "crude measure of CPU speed" which can be used to roughly compare processor clock speed.
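When comparing several machines it can be convenient to pull the BogoMIPS value out of the `lscpu` output programmatically. The sketch below assumes the value appears on a line that begins with the label `BogoMIPS:` followed by a number; the function name `parse_bogomips` is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if `line` begins with the label "BogoMIPS:"
   followed by a number, store that number in *out and return 1.
   Otherwise leave *out untouched and return 0. */
int parse_bogomips(const char *line, double *out) {
  static const char label[] = "BogoMIPS:";
  size_t n = sizeof(label) - 1;      /* label length without the '\0' */
  if (strncmp(line, label, n) != 0) {
    return 0;                        /* some other field */
  }
  char *end;
  double value = strtod(line + n, &end); /* strtod skips leading spaces */
  if (end == line + n) {
    return 0;                        /* no number followed the label */
  }
  *out = value;
  return 1;
}
```

One way to use this would be to read the output of `lscpu` line by line, for example with `popen()` and `fgets()`, and pass each line to the function until it returns 1.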
5 QUESTIONS.txt File Contents
Below are the contents of the `QUESTIONS.txt` file for the lab. Follow the instructions in it to complete the QUIZ and CODE questions for the lab.
                           __________________

                            LAB 10 QUESTIONS
                           __________________


Lab Instructions
================

  Follow the instructions below to experiment with topics related to
  this lab.
  - For sections marked QUIZ, fill in an (X) for the appropriate
    response in this file. Use the command `make test-quiz' to see if
    all of your answers are correct.
  - For sections marked CODE, complete the code indicated. Use the
    command `make test-code' to check if your code is complete.
  - DO NOT CHANGE any parts of this file except the QUIZ sections as it
    may interfere with the tests otherwise.
  - If your `QUESTIONS.txt' file seems corrupted, restore it by copying
    over the `QUESTIONS.txt.bk' backup file.
  - When you complete the exercises, check your answers with `make test'
    and if all is well, create a zip file with `make zip' and upload it
    to Gradescope. Ensure that the Autograder there reflects your local
    results.
  - IF YOU WORK IN A GROUP only one member needs to submit and then add
    the names of their group.


QUIZ Timing Code
================

  Using the HW10 code pack which contains the `superscalar_main'
  benchmark program, answer the following questions concerning timing on
  several lab machines. You will need to SSH into several machines to
  complete the questions.


time utility
~~~~~~~~~~~~

  On a run of the program such as
  ,----
  | > time ./superscalar_main 1 30 add1_diff
  | ...
  `----
  which of the reported times is the most relevant to understanding
  processor speed?
  - ( ) The `real' time as it reports how many seconds the user has to
    wait for the program to complete
  - ( ) The `user' time which is the number of seconds that the CPU
    spends executing the code in the user's program
  - ( ) The `sys' time because it indicates how much time the program
    spends in OS system calls


Processor types
~~~~~~~~~~~~~~~

  Use the `lscpu' utility on these two machines:
  - csel-plate01.cselabs.umn.edu : a server machine
  - csel-kh1260-10.cselabs.umn.edu : a desktop lab machine

  Analyze the output to determine the types of processors and their
  relative processing speed according to the "BogoMIPS" measure.
  - ( ) `csel-plate01' and `csel-kh1260-NN' both have AMD processors and
    the BogoMIPS measure indicates `csel-plate01' is faster
  - ( ) `csel-plate01' and `csel-kh1260-NN' both have Intel processors
    and the BogoMIPS measure indicates `csel-kh1260-NN' is faster
  - ( ) `csel-plate01' has Intel processors and `csel-kh1260-NN' has AMD
    processors and the BogoMIPS measure indicates `csel-plate01' is
    faster
  - ( ) `csel-plate01' has AMD processors and `csel-kh1260-NN' has Intel
    processors and the BogoMIPS measure indicates `csel-kh1260-NN' is
    faster


Timings using `superscalar_main'
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Compile the `superscalar_main' program using the provided `Makefile'
  and time runs of it on both `csel-plate01' and `csel-kh1260-25' using
  the following commands:
  ,----
  | >> make
  | gcc -Wall -Werror -g -Og -o superscalar_main superscalar_main.c superscalar_funcs.c
  |
  | >> time ./superscalar_main 1 30 add1_diff
  `----

  According to what you observe for this, which of the following best
  reflects the outcome of the runs between the two machines?
  - ( ) `csel-plate01' takes about 0.91s to run while `csel-kh1260-NN'
    takes about 0.63s to run indicating `csel-kh1260-NN' is faster
  - ( ) `csel-plate01' takes about 0.50s to run while `csel-kh1260-NN'
    takes about 0.85s to run indicating `csel-plate01' is faster
  - ( ) `csel-plate01' takes about 1.99s to run while `csel-kh1260-NN'
    takes about 0.25s to run indicating `csel-kh1260-NN' is faster
  - ( ) `csel-plate01' takes about 0.10s to run while `csel-kh1260-NN'
    takes about 1.15s to run indicating `csel-plate01' is faster


Analysis of Benchmarks
~~~~~~~~~~~~~~~~~~~~~~

  Among the micro 'benchmarks' implemented in `superscalar_main' are the
  following two
  ,----
  | add2_diff : add 2 times in same loop; different destination variables
  | add2_same : add 2 times in same loop; same destination variable
  `----

  Find the code for the two functions that implement these benchmarks in
  the file `superscalar_funcs.c' (each benchmark has a function named
  for it). Analyze the code and CHECK ALL OF THE BELOW ITEMS that are
  true.

  - ( ) Both `add2_diff()' and `add2_same()' have loops that repeatedly
    perform arithmetic operations
  - ( ) `add2_diff()' will loop fewer times than `add2_same()' for the
    same function parameters / command line arguments
  - ( ) Both `add2_diff()' and `add2_same()' dereference pointers in
    their loops so interact with main memory every iteration
  - ( ) Both `add2_diff()' and `add2_same()' primarily work on registers
    in their loops as there are no memory references in the loop body
  - ( ) The biggest difference between them is that `add2_diff()' adds
    each iteration to different variables/registers while `add2_same()'
    adds to the same variable/register each iteration
  - ( ) The biggest difference between them is that `add2_diff()' adds
    twice each iteration while `add2_same()' adds once each iteration


Timing Mysteries
~~~~~~~~~~~~~~~~

  Time runs of the two benchmarks above by running these commands.
  ,----
  | time ./superscalar_main 1 30 add2_diff
  | time ./superscalar_main 1 30 add2_same
  `----

  Perform the timing on BOTH `csel-plate01' and `csel-kh1260-NN' and
  report on the relations below.

  On `csel-plate01'
  - ( ) csel-plate01: time for `add2_diff < add2_same'
  - ( ) csel-plate01: time for `add2_diff > add2_same'
  - ( ) csel-plate01: time for `add2_diff = add2_same'

  On `csel-kh1260-NN'
  - ( ) csel-kh1260-NN: time for `add2_diff < add2_same'
  - ( ) csel-kh1260-NN: time for `add2_diff > add2_same'
  - ( ) csel-kh1260-NN: time for `add2_diff = add2_same'

  These results should seem strange to you and require further
  discussion which will come in lecture.


CODE None
=========

  None for this lab : analyze the provided superscalar_main.c and
  superscalar_funcs.c to learn several interesting techniques such as
  how to create an array of *function pointers* and select one to run.
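The CODE section of the file mentions creating an array of function pointers and selecting one to run. As a minimal sketch of that general technique, the example below pairs each function pointer with a name in a table and looks one up by name. It is not necessarily how `superscalar_main.c` is organized, and the names `op_fn`, `named_op`, `OPS`, `find_op`, `op_add`, and `op_mul` are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Type of the functions stored in the table (hypothetical signature). */
typedef unsigned long (*op_fn)(unsigned long, unsigned long);

unsigned long op_add(unsigned long a, unsigned long b) { return a + b; }
unsigned long op_mul(unsigned long a, unsigned long b) { return a * b; }

/* Each table entry pairs a name with a pointer to a function. */
struct named_op {
  const char *name;
  op_fn fn;
};

/* The table itself: an array whose entries hold function pointers. */
static const struct named_op OPS[] = {
  {"add", op_add},
  {"mul", op_mul},
};

/* Return the function registered under `name`, or NULL if none matches. */
op_fn find_op(const char *name) {
  size_t count = sizeof(OPS) / sizeof(OPS[0]);
  for (size_t i = 0; i < count; i++) {
    if (strcmp(OPS[i].name, name) == 0) {
      return OPS[i].fn;
    }
  }
  return NULL;
}
```

A caller could pass a command line argument to `find_op()`, check the result against `NULL`, and then invoke the returned pointer like an ordinary function, which lets one program run any of several routines chosen at run time.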
6 Submission
Follow the instructions at the end of Lab01 if you need a refresher on how to upload your completed lab zip to Gradescope.