# Data Structure Selection

CS 65 // 2021-04-06

---

## Extra Credit Activities

- [Civic Action Week](https://www.drake.edu/community/learningservice/studentopportunities/civicactionweek/)
- Drake Relays

# Weekend Review Sessions

# Extra Credit Assignment

- Optional, individual assignment on images

# Questions

## ...about anything?

# TicTacToe Debrief

# Data Structure Selection

## Data Structure Selection

- We've seen several ways of structuring data:
- **Lists:** Great for storing *sequential* data
  + Can add and remove elements, sort, etc.
- **Dictionaries:** Great for storing *associative* data
  + Can map keys to values
- **Tuples:** Great for storing immutable sequences
  + Can be used as keys in dictionaries
- When faced with a problem, we often need to carefully consider which data structure to employ to help solve it (see the sketch below)
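For a concrete (if contrived) reminder, here is a minimal sketch of each structure in action; the variable names and values are made up for illustration:

```py
# Lists: sequential data we can grow, shrink, and sort (example values are made up)
scores = [88, 72, 95]
scores.append(80)
scores.sort()            # [72, 80, 88, 95]

# Dictionaries: associative data mapping keys to values
ages = {"Ada": 36, "Alan": 41}
ages["Grace"] = 85       # add a new key-value pair

# Tuples: immutable sequences, so they can serve as dictionary keys
origin = (0, 0)
distances = {origin: 0.0, (3, 4): 5.0}
```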
# Case Study

## Procedural Text Generation

- As a case study, let's implement an algorithm to randomly generate English sentences
- The technique we will employ is called **Markov analysis**, named after Andrei Markov
  + It involves randomly generating one word at a time
  + The probability of choosing a word is based on which words commonly follow the previous words

## Procedural Text Generation

- Consider the text from *Eric, the Half a Bee*:

> Half a bee, philosophically,
> Must, ipso facto, half not be.
> But half the bee has got to be
> Vis a vis, its entity. D’you see?
>
> But can a bee be said to be
> Or not to be an entire bee
> When half the bee is not a bee
> Due to some ancient injury?
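To get a feel for what Markov analysis records, here is a hand-worked fragment of the mapping this text produces when we track two-word prefixes and the words that follow them (an illustration only; the full mapping is built by the code on the following slides):

```py
# A few entries of the prefix -> following-words mapping for the text above,
# worked out by hand (two-word prefixes, words split on whitespace)
suffixes_fragment = {
    ("half", "the"): ["bee", "bee"],      # "half the bee has ...", "half the bee is ..."
    ("the", "bee"):  ["has", "is"],
    ("to", "be"):    ["Vis", "Or", "an"], # "got to be Vis", "said to be Or", "not to be an"
}
```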
## Procedural Text Generation

- We need a way of knowing what words commonly follow certain *prefixes* of words
- What data structures might we use to help us?
  + Use a dictionary to map prefixes of words to the words that commonly come next
  + Dictionary keys must be immutable, so we use strings or tuples for the prefixes
  + Since multiple words may follow a given prefix, we can use a list to accumulate them

## Parsing Words

```py
def parse_words(filename):
    """Read a file and return a list of its whitespace-separated words."""
    f = open(filename)
    words = []
    for line in f:
        words.extend(line.split())
    f.close()
    return words
```

## Markov Analysis

```py
def markov(words, order=2):
    """Map each prefix of `order` consecutive words to the list of words that follow it."""
    suffixes = {}
    prefix = tuple(words[:order])
    for next_word in words[order:]:
        if prefix not in suffixes:
            suffixes[prefix] = [next_word]
        else:
            suffixes[prefix].append(next_word)
        # Slide the prefix window forward by one word
        prefix = prefix[1:] + (next_word,)
    return suffixes
```

## Random Text

```py
import random

def random_text(suffixes, n=100):
    """Generate up to n words by repeatedly picking a random suffix of the current prefix."""
    text = ""
    prefix = random.choice(list(suffixes.keys()))
    for i in range(n):
        if prefix not in suffixes:
            break  # the final prefix of the source text may have no recorded suffixes
        next_word = random.choice(suffixes[prefix])
        text = text + " " + next_word
        prefix = prefix[1:] + (next_word,)
    return text
```
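Finally, a minimal sketch of how the three functions fit together, assuming the lyrics above are saved in a text file; the filename `half_a_bee.txt` is a placeholder:

```py
# Hypothetical driver -- "half_a_bee.txt" is a placeholder filename
words = parse_words("half_a_bee.txt")

# Map every two-word prefix to the words observed after it
suffixes = markov(words, order=2)

# Generate 20 words of random text in the style of the source
print(random_text(suffixes, n=20))
```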