# Strings, Arrays, and Addressing

---

CS 130 // 2021-10-13

## Administrivia

- Quiz 3 changes announced on Blackboard
    + Online, unlimited time, due tonight
- Assignment 3 is released
    + Extended deadline to Friday, October 22nd

# Questions

## ...about anything?

# Procedures

## Calling a Procedure

1. Put parameters in appropriate registers
    + `$a0`, `$a1`, `$a2`, `$a3`
2. Transfer control to the procedure
    + `jal ProcedureLabel`
3. Perform task
4. Place result in a location the caller can find
    + `$v0`, `$v1`
5. Return control to the caller
    + `jr $ra`

## Leaf Procedures

- A **leaf procedure** is a function that makes no other function calls

```c
int func(int x, int y) {
    return x + y;
}
```

## Leaf Procedures

```c
int func(int x, int y) {
    return x + y;
}

int main() {
    func(1, 2);
}
```

---

```mips
func:
    add  $v0, $a0, $a1
    jr   $ra

main:
    li   $a0, 1
    li   $a1, 2
    jal  func
```

## Non-Leaf Procedures

- Non-leaf procedures are the interesting cases

```c
int termial(int n) {
    if (n == 0) {
        return 0;
    } else {
        return n + termial(n - 1);
    }
}
```

- Before a procedure calls another procedure, it needs to **save** any data that might be overwritten by the procedure call **in memory**

## Memory Structure

![MIPS memory structure](/teaching/2021f/cs130/assets/images/COD/mips-memory.png)

## Termial Example

```c
int termial(int n) {
    if (n == 0) {
        return 0;
    } else {
        return n + termial(n - 1);
    }
}
```

## Termial Example

```mips
termial:
    beq  $a0, $zero, ret_zero   # Checks base case
    addi $sp, $sp, -8           # Allocates 8 bytes on the stack
    sw   $ra, 0($sp)            # Saves $ra
    sw   $a0, 4($sp)            # Saves n
    addi $a0, $a0, -1           # Loads n-1 as arg
    jal  termial                # Calls termial(n-1)
    lw   $a0, 4($sp)            # Restores n
    add  $v0, $v0, $a0          # Adds n to result
    lw   $ra, 0($sp)            # Restores $ra
    addi $sp, $sp, 8            # Deallocates memory
    jr   $ra                    # Returns result
ret_zero:
    li   $v0, 0                 # Returns 0
    jr   $ra
```

## Exercise

- Convert the following C code into MIPS

```c
int compute(int e) {
    return e + e;
}

int calc(int c) {
    int d = compute(7);
    return c + d;
}

int main() {
    int a = 5;
    int b = calc(a);
}
```

## Exercise: Compute

```c
int compute(int e) {
    return e + e;
}
```

```mips
compute:
    add  $v0, $a0, $a0      # returns e + e
    jr   $ra
```

## Exercise: Calc - Take 1

```c
int calc(int c) {
    int d = compute(7);
    return c + d;
}
```

```mips
calc:
    li   $a0, 7             # puts 7 in a0
    jal  compute            # calls compute(7)
    add  $v0, $v0, $a0      # adds argument to result
    jr   $ra
```

- Unfortunately this will fail. Why?
    + Both `$a0` and `$ra` get overwritten: `li` replaces `c` in `$a0`, and `jal` replaces `calc`'s return address in `$ra`

## Exercise: Calc - Take 2

```c
int calc(int c) {
    int d = compute(7);
    return c + d;
}
```

```mips
calc:
    addi $sp, $sp, -8       # allocates space
    sw   $a0, 0($sp)        # stores c on the stack
    sw   $ra, 4($sp)        # stores $ra on the stack
    li   $a0, 7             # puts 7 in a0
    jal  compute            # calls compute(7)
    lw   $ra, 4($sp)        # restores $ra
    lw   $a0, 0($sp)        # restores argument
    addi $sp, $sp, 8        # deallocates space
    add  $v0, $v0, $a0      # adds argument to result
    jr   $ra
```

## Exercise: Main

```c
int main() {
    int a = 5;
    int b = calc(a);
}
```

```mips
main:
    li   $s0, 5             # a = 5
    move $a0, $s0           # loads a into argument
    jal  calc               # calls calc(5)
    move $s1, $v0           # b = result
```

# Tail Calls

## Tail Calls

- A **tail call** is when a procedure immediately returns the result of another procedure call

```c
int foo(int x) {
    return x + x;
}

int bar(int x) {
    int y = x + x;
    return foo(y); // this is a tail call
}
```

- A tail call can be implemented **without** creating a new procedure frame

## Tail Calls

```c
int foo(int x) {
    return x + x;
}

int bar(int x) {
    int y = x + x;
    return foo(y);
}
```

```mips
foo:
    add  $v0, $a0, $a0
    jr   $ra

bar:
    add  $a0, $a0, $a0
    j    foo
```

## Tail Recursion

- If a function makes tail calls to itself, it is called **tail recursive**
- Is our implementation of `termial` tail recursive?

```c
int termial(int n) {
    if (n == 0) {
        return 0;
    } else {
        return n + termial(n - 1);
    }
}
```

- No! It still has to add `n` after the recursive call returns

## Tail Recursion

- Can we implement it tail recursively?

```c
int termial_helper(int n, int so_far);

int termial(int n) {
    return termial_helper(n, 0);
}

int termial_helper(int n, int so_far) {
    if (n == 0) {
        return so_far;
    } else {
        return termial_helper(n - 1, n + so_far);
    }
}
```

## Tail Recursion

- What does this look like in MIPS?

```mips
termial:
    li   $a1, 0
    j    termial_helper
```

```mips
termial_helper:
    beq  $a0, $zero, ret_so_far
    add  $a1, $a1, $a0      # so_far = so_far + n
    addi $a0, $a0, -1       # n = n - 1
    j    termial_helper
ret_so_far:
    move $v0, $a1
    jr   $ra
```

# Strings

## ASCII Encoding

- Many programming languages use **one byte** (8 bits) to represent a single character
- **ASCII** is the encoding most commonly adhered to
    + "American Standard Code for Information Interchange"

## ASCII Encoding

![ASCII table](/teaching/2021f/cs130/assets/images/COD/ascii.png)

## Unicode

- Because most of the world's languages use characters outside the English alphabet, a global standard for character encoding was created: **Unicode**
- Unicode provides multiple encodings:
    + **UTF-8:** Variable-width encoding where ASCII characters take 8 bits and other characters take up to 32 bits
    + **UTF-16:** Variable-width encoding where most common characters take 16 bits and the remaining characters take 32 bits
- UTF-8 is the most popular, in part because files encoded in ASCII are also valid UTF-8 files

## C Versus Java

- In C, the `char` type usually uses how many bits?
    + 8 bits (one byte) and uses ASCII/UTF-8 encoding
- In Java, the `char` type usually uses how many bits?
    + 16 bits (two bytes) and uses UTF-16 encoding

## C Versus Java

- A **string** is a sequence of characters
- In C, how many bytes of memory does `"abc"` take?
    + 4 bytes: one for each character, plus one for the terminating `'\0'`
- What about in Java?
    + 6 bytes, two for each character
    + 4 more bytes to store the length of the string
    + 4 more bytes to store an integer offset
    + 4 more bytes to store a cached hashcode
    + ...and sometimes more!
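
## C Versus Java

The byte counts on the C side can be checked directly. The sketch below is only illustrative: the helper names `storage_bytes`, `abc_literal_bytes`, and `e_acute_bytes` are hypothetical and not part of the course materials, and it assumes the usual case of an 8-bit `char`.

```c
#include <stddef.h>
#include <string.h>

/* Bytes needed to store a C string: one per char plus the '\0' terminator.
   (Hypothetical helper, for illustration only.) */
size_t storage_bytes(const char *s) {
    return strlen(s) + 1;
}

/* sizeof applied to a string literal also counts the terminator. */
size_t abc_literal_bytes(void) {
    return sizeof("abc");
}

/* "\xC3\xA9" is the UTF-8 encoding of U+00E9 (e with an acute accent):
   one character on screen, but two char values in memory. */
size_t e_acute_bytes(void) {
    return strlen("\xC3\xA9");
}
```

Calling `storage_bytes("abc")` or `abc_literal_bytes()` gives 4, matching the count above, while `e_acute_bytes()` gives 2, which shows why a UTF-8 string's length in bytes can exceed its length in characters.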

## MIPS Byte Instructions

- Since `lw` and `sw` fetch and store 4 bytes of memory at a time, MIPS provides other instructions to manipulate individual bytes
- `lbu $t0, 0($s0)`
    + Loads one byte from the memory address in `$s0` into `$t0`
- `sb $t0, 0($s0)`
    + Stores one byte from `$t0` to the memory address in `$s0`
- These instructions operate only on the rightmost 8 bits of the register: `lbu` zeroes the leftmost 24 bits and `sb` ignores them

## String Compare

- Consider the following C function, which returns 1 if two strings are equal and 0 otherwise

```c
int string_compare(char *s1, char *s2) {
    if (*s1 != *s2) {
        return 0;
    } else if (*s1 == '\0') {
        return 1;
    } else {
        return string_compare(s1 + 1, s2 + 1);
    }
}
```

- Convert it into MIPS assembly

## String Compare

```mips
string_compare:
    lbu  $t0, 0($a0)
    lbu  $t1, 0($a1)
    bne  $t0, $t1, return_zero
    beq  $t0, $zero, return_one
    addi $a0, $a0, 1
    addi $a1, $a1, 1
    j    string_compare
return_zero:
    li   $v0, 0
    jr   $ra
return_one:
    li   $v0, 1
    jr   $ra
```

## Regions of MIPS Code

- There can be multiple regions of a MIPS program
    + **.text**: Contains *program instructions*
    + **.data**: Contains *initial memory contents*

## Interaction with the OS

- We are used to **printing** characters and **reading** characters within our programs, which requires interaction with the operating system
- In MIPS, we need to use the **syscall** instruction to send or receive information from the OS
- To request a service, you must:
    + Put a **service code** in `$v0`
    + Put any arguments into `$a0`--`$a3`
    + Execute `syscall`

## Input/Output Example

```mips
    .data
S1: .asciiz "Type in an integer: "
S2: .asciiz "You entered the following: "

    .text
main:
    addi $v0, $zero, 4      # system code for print_str
    la   $a0, S1
    syscall
    addi $v0, $zero, 5      # system code for read_int
    syscall
    addi $t0, $v0, 0        # saves the integer that was read
    addi $v0, $zero, 4      # system code for print_str
    la   $a0, S2
    syscall
    addi $a0, $t0, 0        # passes the integer as the argument
    addi $v0, $zero, 1      # system code for print_int
    syscall
    addi $v0, $zero, 10     # system code for terminate
    syscall
```

# Addressing

## Addressing

- MIPS uses fixed-size 32-bit instructions
- What fields can an instruction consist of?
    + Operation code
    + Register sources/destination
    + Constants
- There's not enough room to store a 32-bit constant inside a 32-bit instruction
    + What are the implications of this?

## Addressing Modes

1. **Register**: Address is in a register
    + `jr $ra`
2. **Base/Displacement**: Address is the value of a register plus a constant C
    + `lw $t0, 0($sp)`
3. **Immediate**: Operand is a constant stored in the instruction
    + `addi $t0, $t1, 10`

## Addressing Modes

4. **PC Relative**: Address is `$pc` plus a constant C, where C counts words from the instruction after the branch
    + `bne $t0, $zero, label`
5. **Pseudodirect Addressing**:
    + Address is the upper 4 bits of `$pc` merged with the 26-bit constant shifted left by 2 bits
    + `j label`

## Addressing Modes

![Addressing Modes](/teaching/2021f/cs130/assets/images/COD/addressing-modes.png)
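
## Addressing Modes

The address arithmetic behind three of these modes can be sketched in C. This is an illustration rather than code from the course: the function names are hypothetical, and it assumes the standard MIPS32 conventions that branch offsets and jump fields count 4-byte words and are taken relative to the address of the following instruction (`$pc` + 4).

```c
#include <stdint.h>

/* Base/displacement (lw, sw): register value plus a sign-extended
   16-bit constant. */
uint32_t base_displacement(uint32_t base, int16_t offset) {
    return base + (uint32_t)(int32_t)offset;
}

/* PC-relative (beq, bne): the 16-bit constant is sign-extended,
   multiplied by 4, and added to the address of the next instruction. */
uint32_t branch_target(uint32_t pc, int16_t offset) {
    return pc + 4 + ((uint32_t)(int32_t)offset << 2);
}

/* Pseudodirect (j, jal): the upper 4 bits of pc + 4 are joined with
   the 26-bit field shifted left by 2 bits. */
uint32_t jump_target(uint32_t pc, uint32_t field26) {
    return ((pc + 4) & 0xF0000000u) | ((field26 & 0x03FFFFFFu) << 2);
}
```

For instance, a branch at address `0x00400000` with a constant of 3 targets `0x00400010`, and a constant of -1 makes a branch target itself. Because the jump field only supplies 28 of the 32 address bits, `j` can only reach addresses in the same 256 MB region as the jump.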