CSCI 2021 Lab10: Timing Code and Machine Speed
- Due: 11:59pm Tue 04-Apr-2023 on Gradescope
- Approximately 1.00% of total grade
CODE DISTRIBUTION: lab10-code.zip
CHANGELOG: Empty
1 Rationale
Differences and oddities in CPU architecture are often revealed by observing how long certain programs take to run. This lab explores these issues through a program that is provided as part of the most recent HW (HW10). The code implements some "micro-benchmarks" which repeatedly perform arithmetic operations with slight variations. Timing these, combined with some knowledge of CPU architecture, is instructive for observing some of the low-level differences between CPU implementations.
Grading Policy
Credit for this Lab is earned by completing the exercises here and
submitting a Zip of the work to Gradescope. Students are responsible
for checking that the results produced locally via `make test` are
reflected on Gradescope after submitting their completed
Zip. Successful completion earns 1 Engagement Point.
Lab Exercises are open resource/open collaboration and students are encouraged to cooperate on labs. Students may submit work as groups of up to 5 to Gradescope: one person submits then adds the names of their group members to the submission.
See the full policies in the course syllabus.
2 Codepack
The codepack for the lab contains the following files:
| File | Description | |
|---|---|---|
| QUESTIONS.txt | EDIT | Questions to answer: fill in the multiple choice selections in this file. |
| superscalar_main.c | Provided | C code also in HW10 that is used to observe CPU timing differences |
| superscalar_funcs.c | Provided | |
| Makefile | Build | Enables make test and make zip |
| QUESTIONS.txt.bk | Backup | Backup copy of the original file to help revert if needed |
| QUESTIONS.md5 | Testing | Checksum for answers in questions file |
| test_quiz_filter | Testing | Filter to extract answers from Questions file, used in testing |
| test_lab10.org | Testing | Tests for this lab |
| testy | Testing | Test running scripts |
3 Timing on Two Machines
HW10 has students timing the execution speed of several different
arithmetic functions. The time utility is used for this, and your
lab demoer will briefly discuss the time utility and the information
it provides about program runs. This information is covered in HW10
posted here:
https://www-users.cs.umn.edu/~kauffman/2021/hw10.html
Timing code runs is interesting as one fully expects the results to vary from one computer to the next. Lab leaders will demonstrate as much by showing timings of the same program runs on two different processors available through the CSE Labs system:
- csel-plate01.cselabs.umn.edu which is a server in the Keller Hall machine room (SSH only).
- csel-kh1260-NN.cselabs.umn.edu machines such as csel-kh1260-01 which are desktop workstations in Keller 1-260 (physical access or SSH login).
Running the same programs on these two machines will lead to different times, sometimes in ways that are quite surprising.
Lab staff will focus their presentation on timing the
superscalar_main program which is central to HW10. It is used to
run small integer benchmarks that repeatedly add / multiply in
different combinations. Timings of these operations reveal
peculiarities of some processors.
NOTE: Staff will show results on both the machines csel-plate01
and csel-kh1260-NN during the lab but for HW10, the focus is timing
only in csel-kh1260-NN machines.
4 Using the time and lscpu Utilities
Staff will briefly discuss the time utility and cover the 3 types of
times it reports. They may demonstrate timing programs like the
following and explain the differing times for these.
    > time make                      # build a program using a makefile
    ...
    > time ls -lR /sys > /dev/null   # recursive listing of /sys system directory
    ...
    > time ping -c 3 google.com      # contact google 3 times to see if it is responding
    ...
Staff will mention which of the measures that time reports is most
important to evaluating integer arithmetic code like
superscalar_main.
Make sure to download the superscalar_main application from the HW10 specification here:
https://www-users.cs.umn.edu/~kauffman/2021/hw10.html
Staff will compile and time runs of it as students will do in HW10, using commands such as the following.
    > make
    ...
    > time ./superscalar_main 1 30 add1_diff
    ...
When timing programs, it is good to know something about the CPU on which the program is being run. This can be obtained via the `lscpu` utility on Linux systems. It can be run just by typing
    > lscpu
    ...
and reports a variety of information including BogoMIPS, a "crude measure of CPU speed" which can be used to roughly compare processor clock speed.
5 QUESTIONS.txt File Contents
Below are the contents of the QUESTIONS.txt file for the lab.
Follow the instructions in it to complete the QUIZ and CODE questions
for the lab.
__________________
LAB 10 QUESTIONS
__________________
Lab Instructions
================
Follow the instructions below to experiment with topics related to
this lab.
- For sections marked QUIZ, fill in an (X) for the appropriate
response in this file. Use the command `make test-quiz' to see if
all of your answers are correct.
- For sections marked CODE, complete the code indicated. Use the
command `make test-code' to check if your code is complete.
- DO NOT CHANGE any parts of this file except the QUIZ sections as it
may interfere with the tests otherwise.
- If your `QUESTIONS.txt' file seems corrupted, restore it by copying
over the `QUESTIONS.txt.bk' backup file.
- When you complete the exercises, check your answers with `make test'
and if all is well, create a zip file with `make zip' and upload it
to Gradescope. Ensure that the Autograder there reflects your local
results.
- IF YOU WORK IN A GROUP only one member needs to submit and then add
the names of their group.
QUIZ Timing Code
================
Using the HW10 code pack which contains the `superscalar_main'
benchmark program, answer the following questions concerning timing on
several lab machines. You will need to SSH into several machines to
complete the questions.
time utility
~~~~~~~~~~~~
On a run of the program such as
,----
| > time ./superscalar_main 1 30 add1_diff
| ...
`----
which of the reported times is the most relevant to understanding
processor speed?
- ( ) The `real' time as it reports how many seconds the user has to
wait for the program to complete
- ( ) The `user' time which is the number of seconds that the CPU
spends executing the code in the user's program
- ( ) The `sys' time because it indicates how much time the program
spends in OS system calls
Processor types
~~~~~~~~~~~~~~~
Use the `lscpu' utility on these two machines:
- csel-plate01.cselabs.umn.edu : a server machine
- csel-kh1260-10.cselabs.umn.edu : a desktop lab machine
Analyze the output to determine the types of processors and their
relative processing speed according to the "BogoMIPS" measure.
- ( ) `csel-plate01' and `csel-kh1260-NN' both have AMD processors and
the BogoMIPS measure indicates `csel-plate01' is faster
- ( ) `csel-plate01' and `csel-kh1260-NN' both have Intel processors
and the BogoMIPS measure indicates `csel-kh1260-NN' is faster
- ( ) `csel-plate01' has Intel processors and `csel-kh1260-NN' has AMD
processors and the BogoMIPS measure indicates `csel-plate01' is
faster
- ( ) `csel-plate01' has AMD processors and `csel-kh1260-NN' has Intel
processors and the BogoMIPS measure indicates `csel-kh1260-NN' is
faster
Timings using `superscalar_main'
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Compile the `superscalar_main' program using the provided `Makefile'
and time runs of it on both `csel-plate01' and `csel-kh1260-25' using
the following commands:
,----
| >> make
| gcc -Wall -Werror -g -Og -o superscalar_main superscalar_main.c superscalar_funcs.c
|
| >> time ./superscalar_main 1 30 add1_diff
`----
According to what you observe for this, which of the following best
reflects the outcome of the runs between the two machines?
- ( ) `csel-plate01' takes about 0.91s to run while `csel-kh1260-NN'
takes about 0.63s to run indicating `csel-kh1260-NN' is faster
- ( ) `csel-plate01' takes about 0.50s to run while `csel-kh1260-NN'
takes about 0.85s to run indicating `csel-plate01' is faster
- ( ) `csel-plate01' takes about 1.99s to run while `csel-kh1260-NN'
takes about 0.25s to run indicating `csel-kh1260-NN' is faster
- ( ) `csel-plate01' takes about 0.10s to run while `csel-kh1260-NN'
takes about 1.15s to run indicating `csel-plate01' is faster
Analysis of Benchmarks
~~~~~~~~~~~~~~~~~~~~~~
Among the micro 'benchmarks' implemented in `superscalar_main' are the
following two
,----
| add2_diff : add 2 times in same loop; different destination variables
| add2_same : add 2 times in same loop; same destination variable
`----
Find the code for the two functions that implement these benchmarks in
the file `superscalar_funcs.c' (each benchmark has a function named
for it).
Analyze the code and CHECK ALL OF THE BELOW ITEMS that are true.
- ( ) Both `add2_diff()' and `add2_same()' have loops that repeatedly
perform arithmetic operations
- ( ) `add2_diff()' will loop fewer times than `add2_same()' for the
same function parameters / command line arguments
- ( ) Both `add2_diff()' and `add2_same()' dereference pointers in
their loops so interact with main memory every iteration
- ( ) Both `add2_diff()' and `add2_same()' primarily work on registers
in their loops as there are no memory references in the loop body
- ( ) The biggest difference between them is that `add2_diff()' adds
each iteration to different variables/registers while `add2_same()'
adds to the same variable/register each iteration
- ( ) The biggest difference between them is that `add2_diff()' adds
twice each iteration while `add2_same()' adds once each iteration
Timing Mysteries
~~~~~~~~~~~~~~~~
Time runs of the two benchmarks above by running these commands.
,----
| time ./superscalar_main 1 30 add2_diff
| time ./superscalar_main 1 30 add2_same
`----
Perform the timing on BOTH `csel-plate01' and `csel-kh1260-NN' and
report on the relations below.
On `csel-plate01'
- ( ) csel-plate01: time for `add2_diff < add2_same'
- ( ) csel-plate01: time for `add2_diff > add2_same'
- ( ) csel-plate01: time for `add2_diff = add2_same'
On `csel-kh1260-NN'
- ( ) csel-kh1260-NN: time for `add2_diff < add2_same'
- ( ) csel-kh1260-NN: time for `add2_diff > add2_same'
- ( ) csel-kh1260-NN: time for `add2_diff = add2_same'
These results should seem strange to you and require further
discussion which will come in lecture.
CODE None
=========
None for this lab : analyze the provided superscalar_main.c and
superscalar_funcs.c to learn several interesting techniques such as
how to create an array of *function pointers* and select one to run.
6 Submission
Follow the instructions at the end of Lab01 if you need a refresher on how to upload your completed lab zip to Gradescope.