1 Pre-Check

This section is designed as a conceptual check for you to determine if you conceptually understand and have any misconceptions about this topic. Please answer true/false to the following questions, and include an explanation:

1.1 The compiler may output pseudoinstructions.
   True. It is the job of the assembler to replace these pseudoinstructions.

1.2 The main job of the assembler is to perform optimizations on the assembly code.
   False. That’s the job of the compiler. The assembler is primarily responsible for replacing pseudoinstructions and resolving offsets.

1.3 The object files produced by the assembler are only moved, not edited, by the linker.
   False. The linker needs to relocate all absolute address references.

1.4 The destination of all jump instructions is completely determined after linking.
   False. Jumps relative to registers (i.e. from jalr instructions) are only known at run-time. Otherwise, you would not be able to call a function from different call sites.

2 Translation

2.1 In this question, we will be translating between RISC-V code and binary/hexadecimal values.

Translate the following RISC-V instructions into binary and hexadecimal notations.

1. addi $s1 $x0 -24

   $0b_0000000000000000000000000000000000000000000000000000000000000000_00000000 = 0x_0

2. sh $s1 $4($t1)

   $0b_0000000000000000000000000000000000000000000000000000000000000000_00000000 = 0x_0

For this question, use the reference sheet to get information about the instructions and convert them to binary representation. One thing that helps is splitting the parsing into parts. For question 1:

1. addi $s1 $x0 $x4:
   2. rd= $s1 = $0b011001
   3. rs1 = $x0 = $0b000000
   4. immediate = -24 = $0b1111 1110 1000
   5. opcode = 001 0011
   6. funct3 = 000
   7. Bringing it together - $0b1111 1110 1000 0000 0000 0100 1001 0011 = 0xFE800493
For question 2, with a similar method we get the answer: 0b0000 0000 1001 0011 0001 0010 0010 0011 = 0x00931223

2.2
In this question, we will be translating between RISC-V code and binary/hexadecimal values.

Translate the following hexadecimal values into the relevant RISC-V instruction.

1. \(0x234554B7 = \) ________________
2. \(0xFE050CE3 = \) ________________

For the reverse conversion, we want to first determine the instruction type. In order to do that, we first look at the opcode (and then func3/func7 if necessary). Let’s start with the first one:

1. \(0x234554B7 = 0b0010 0011 0100 0101 0101 0100 1011 0111\), the opcode is always the last 7 bits so opcode = 011 0111, which corresponds to the operation lui!

2. Looking at lui, we can see that the first 20 bits correspond to the immediate, and the next 5 ones are the register ones. So:

3. \(0b0010 0011 0100 0101 0101 = 0x23455\) So, the immediate input was indeed \(0x23455\).

4. Looking at the next 5 bits, they must be the rd register values. So, we have

5. \(rd = 0b01001\)

6. That is equal to 9, which is the register \(x9 = s1\). Thus, overall we have

7. lui \(s1 0x23455\)

For question 2, with a similar approach: beq a0, x0, -8

3 RISC-V Addressing

We have several addressing modes to access memory (immediate not listed):

1. Base displacement addressing adds an immediate to a register value to create a data memory address (used for lw, lb, sw, sb).

2. PC-relative addressing uses the PC and adds the immediate value of the instruction (multiplied by 2) to create an instruction address (used by branch and jump instructions).

3. Register Addressing uses the value in a register as an instruction address. For instance, jalr, jr, and ret, where jr and ret are just pseudoinstructions that get converted to jalr.

3.1 What is the range of 32-bit instructions that can be reached from the current PC using a branch instruction? Recall that RISC-V supports 16b instructions via an extension.

The B-format instruction encoding has space for a 12 bit immediate field. Since RISC-V supports 16b instructions, the byte offset needed to branch to any instruction will always be divisible by 2. Thus, we can assume an implicit 0 at bit 0, allowing a 13 bit byte offset (that’s why the reference sheet describes the immediate as \(imm[12:1]\)). Since the byte offset is signed, the branch immediate can move the PC in the range of \([-2^{12}, 2^{12} - 1]\) bytes. As we are looking for the range of 32-bit
instructions, we look for only offsets divisible by 4, leading to a total of \([-2^{10}, 2^{10} - 1]\) 32-bit instructions to branch to.

3.2 What is the maximum range of 32-bit instructions that can be reached from the current PC using a jump instruction?

The immediate field of the jal instruction is 20 bits, while that of the jalr instruction is only 12 bits, so jal can reach a wider range of instructions. Similar to above, this 20-bit immediate is multiplied by 2 and added to the PC to get the final address. Since the immediate is signed, we have a range of \([-2^{20}, 2^{20} - 2]\) bytes, or \([-2^{19}, 2^{19} - 1]\) 2-byte instructions. As we actually want the number of 4-byte instructions, we can reference those within \([-2^{18}, 2^{18} - 1]\) instructions of the current PC.

3.3 Given the following RISC-V code (and instruction addresses), fill in the blank fields for the following instructions (you’ll need your RISC-V reference sheet!). Each field refers to a different block of the instruction encoding.

```
1 0x002cff00: loop: add t1, t2, t0  |________|________|________|________|________|__0x33__|
2 0x002cff04: jal ra, foo          |__________________________|_________________|__0x6F__|
3 0x002cff08: bne t1, zero, loop   |________|________|________|________|________|__0x63__|
4 ...
5 0x002cff2c: foo: jr ra           ra = __________________________
```

```
1 0x002cff00: loop: add t1, t2, t0  | 0 | 5 | 7 | 0 | 6 | 0x33 | → 0x00538333
2 0x002cff04: jal ra, foo          | 0 | 0x14 | 0 | 0 | 1 | 0x6F | → 0x028000ef
3 0x002cff08: bne t1, zero, loop   | 1 | 0x3F | 0 | 6 | 1 | 0xC | 1 | 0x63 | → 0xfe031ce3
4 ...
5 0x002cff2c: foo: jr ra           ra = ____0x002cff08_____
```

4 CALL

The following is a diagram of the CALL stack detailing how C programs are built and executed by machines.
4.1 How many passes through the code does the Assembler have to make? Why?

Two: The first finds all the label addresses, and the second resolves forward references while using these label addresses.

4.2 Which step in CALL resolves relative addressing? Absolute addressing?

The assembler usually handles relative addressing. The linker handles absolute addressing, resolving the references to memory locations outside.

4.3 Describe the six main parts of the object files outputted by the Assembler (Header, Text, Data, Relocation Table, Symbol Table, Debugging Information).

- Header: Sizes and positions of the other parts
- Text: The machine code
- Data: Binary representation of any data in the source file
- Relocation Table: Identifies lines of code that need to be “handled” by the Linker (jumps to external labels (e.g. lib files), references to static data)
- Symbol Table: List of file labels and data that can be referenced across files
- Debugging Information: Additional information for debuggers
5 Boolean Logic

In digital electronics, it is often important to get certain outputs based on your inputs, as laid out by a truth table. Truth tables map directly to Boolean expressions, and Boolean expressions map directly to logic gates. However, in order to minimize the number of logic gates needed to implement a circuit, it is often useful to simplify long Boolean expressions.

We can simplify expressions using the nine key laws of Boolean algebra:

<table>
<thead>
<tr>
<th>Name</th>
<th>AND Form</th>
<th>OR form</th>
</tr>
</thead>
<tbody>
<tr>
<td>Commutative</td>
<td>AB = BA</td>
<td>A + B = B + A</td>
</tr>
<tr>
<td>Associative</td>
<td>AB(C) = A(BC)</td>
<td>A + (B + C) = (A + B) + C</td>
</tr>
<tr>
<td>Identity</td>
<td>1A = A</td>
<td>0 + A = A</td>
</tr>
<tr>
<td>Null</td>
<td>0A = 0</td>
<td>1 + A = 1</td>
</tr>
<tr>
<td>Absorption</td>
<td>A(A + B) = A</td>
<td>A + AB = A</td>
</tr>
<tr>
<td>Distributive</td>
<td>(A + B)(A + C) = A + BC</td>
<td>A(B + C) = AB + AC</td>
</tr>
<tr>
<td>Idempotent</td>
<td>A(A) = A</td>
<td>A + A = A</td>
</tr>
<tr>
<td>Inverse</td>
<td>A(\overline{A}) = 0</td>
<td>A + A = 1</td>
</tr>
<tr>
<td>De Morgan’s</td>
<td>\overline{AB} = \overline{A} + \overline{B}</td>
<td>\overline{A + B} = \overline{A}(\overline{B})</td>
</tr>
</tbody>
</table>

5.1 Simplify the following Boolean expressions:

(a) \((A + B)(A + \overline{B})C\)

\[
(A + B)(A + \overline{B})C = (A + B\overline{B})C = AC
\]

(b) \(\overline{A}\overline{B}\overline{C} + \overline{A}B\overline{C} + AB\overline{C} + ABC + A\overline{B}C\)

\[
\overline{A}\overline{C}(\overline{B} + B) + AC(B + \overline{B}) + AC(B + \overline{B}) = AC + AC
\]

(c) \(A(\overline{BC} + BC)\)

\[
A(\overline{BC} + BC) = A + \overline{B}\overline{C} + BC
\]

(d) \(\overline{A}(A + B) + (B + AA)(A + \overline{B})\)
\[
\overline{A}(A + B) + (B + AA)(A + \overline{B}) = (\overline{A}A + \overline{A}B) + (B + AA)(A + \overline{B})
\]
\[
= \overline{A}B + (B + A)(A + \overline{B})
\]
\[
= \overline{A}B + (BA + AA + B\overline{B} + A\overline{B})
\]
\[
= \overline{A}B + (BA + A + A\overline{B})
\]
\[
= \overline{A}B + A
\]
\[
= A + B
\]

Alternatively,
\[
\overline{A}(A + B) + (B + AA)(A + \overline{B}) = \overline{A}(A + B) + (A + B)(A + \overline{B})
\]
\[
= (A + B)(\overline{A} + A + \overline{B})
\]
\[
= A + B
\]

5.2 Use multiple iterations of De Morgan’s laws to prove the identity \( \overline{A} + AB = \overline{A} + B \).

\[
\overline{A} + AB = \overline{A}AB
\]
\[
= A(A + B)
\]
\[
= AA + AB
\]
\[
= AB
\]
\[
= \overline{A} + B
\]

5.3 Prove that De Morgan’s law can be generalized for the complement of any number of terms.

We will prove \( \overline{A} + B + C + ... = \overline{A}(B)(C) ... \). Treat \( B + C + ... \) as a single term, then apply De Morgan’s on both terms, resulting in \( \overline{A}(B + C + ...) \). Repeating this for all terms results in \( \overline{A}(B)(C) ... \).

A similar approach works for the other identity, as both AND and OR are associative.