# Data Structure Selection

CS 65 // 2021-04-06

---

## Extra Credit Activities

- [Civic Action Week](https://www.drake.edu/community/learningservice/studentopportunities/civicactionweek/)
- Drake Relays

# Weekend Review Sessions

# Extra Credit Assignment

- Optional, individual assignment on images

# Questions

## ...about anything?

# TicTacToe Debrief

# Data Structure Selection

## Data Structure Selection

- We've seen several ways of structuring data:
- **Lists:** Great for storing *sequential* data
  + Can add and remove elements, sort, etc.
- **Dictionaries:** Great for storing *associative* data
  + Can map keys to values
- **Tuples:** Great for storing immutable sequences
  + Can be used as keys in dictionaries
- When faced with a problem, we often need to carefully consider which data structure to employ to help solve it (see the sketch below)
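For a concrete (if contrived) reminder, here is a minimal sketch of each structure in action; the variable names and values are made up for illustration:

```py
# Lists: sequential data we can grow, shrink, and sort (example values are made up)
scores = [88, 72, 95]
scores.append(80)
scores.sort()            # [72, 80, 88, 95]

# Dictionaries: associative data mapping keys to values
ages = {"Ada": 36, "Alan": 41}
ages["Grace"] = 85       # add a new key-value pair

# Tuples: immutable sequences, so they can serve as dictionary keys
origin = (0, 0)
distances = {origin: 0.0, (3, 4): 5.0}
```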
# Case Study

## Procedural Text Generation

- As a case study, let's implement an algorithm to randomly generate English sentences
- The technique we will employ is called **Markov analysis**, named after Andrei Markov
  + It involves randomly generating one word at a time
  + The probability of choosing a word is based on which words commonly follow the previous words

## Procedural Text Generation

- Consider the text from *Eric, the Half a Bee*:

> Half a bee, philosophically,
> Must, ipso facto, half not be.
> But half the bee has got to be
> Vis a vis, its entity. D’you see?
>
> But can a bee be said to be
> Or not to be an entire bee
> When half the bee is not a bee
> Due to some ancient injury?
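To get a feel for what Markov analysis records, here is a hand-worked fragment of the mapping this text produces when we track two-word prefixes and the words that follow them (an illustration only; the full mapping is built by the code on the following slides):

```py
# A few entries of the prefix -> following-words mapping for the text above,
# worked out by hand (two-word prefixes, words split on whitespace)
suffixes_fragment = {
    ("half", "the"): ["bee", "bee"],      # "half the bee has ...", "half the bee is ..."
    ("the", "bee"):  ["has", "is"],
    ("to", "be"):    ["Vis", "Or", "an"], # "got to be Vis", "said to be Or", "not to be an"
}
```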
## Procedural Text Generation

- We need a way of knowing what words commonly follow certain *prefixes* of words
- What data structures might we use to help us?
  + Use a dictionary to map prefixes of words to the words that commonly come next
  + Dictionary keys must be immutable, so we use strings or tuples for the prefixes
  + Since multiple words may follow a given prefix, we can use a list to accumulate them

## Parsing Words

```py
def parse_words(filename):
    """Read a file and return a list of its whitespace-separated words."""
    f = open(filename)
    words = []
    for line in f:
        words.extend(line.split())
    f.close()
    return words
```

## Markov Analysis

```py
def markov(words, order=2):
    """Map each prefix of `order` consecutive words to the list of words that follow it."""
    suffixes = {}
    prefix = tuple(words[:order])
    for next_word in words[order:]:
        if prefix not in suffixes:
            suffixes[prefix] = [next_word]
        else:
            suffixes[prefix].append(next_word)
        # Slide the prefix window forward by one word
        prefix = prefix[1:] + (next_word,)
    return suffixes
```

## Random Text

```py
import random

def random_text(suffixes, n=100):
    """Generate up to n words by repeatedly picking a random suffix of the current prefix."""
    text = ""
    prefix = random.choice(list(suffixes.keys()))
    for i in range(n):
        if prefix not in suffixes:
            break  # the final prefix of the source text may have no recorded suffixes
        next_word = random.choice(suffixes[prefix])
        text = text + " " + next_word
        prefix = prefix[1:] + (next_word,)
    return text
```
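Finally, a minimal sketch of how the three functions fit together, assuming the lyrics above are saved in a text file; the filename `half_a_bee.txt` is a placeholder:

```py
# Hypothetical driver -- "half_a_bee.txt" is a placeholder filename
words = parse_words("half_a_bee.txt")

# Map every two-word prefix to the words observed after it
suffixes = markov(words, order=2)

# Generate 20 words of random text in the style of the source
print(random_text(suffixes, n=20))
```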