HW 5 (Caesar Ciphers)
Assignment overview
In this assignment, you'll write code to encode and decode messages using the Double Caesar Cipher method.
Goals
Get practice using functions, having good code organization, and working with strings.
Logistics
This is a partner assignment, which means you should complete all pieces of this assignment with your assigned partner. In particular, any coding must be completed side-by-side in a pair programming style. You are welcome to discuss the assignment with other classmates, course staff, or Anna. Make sure to cite any help you receive in the "acknowledgements" portion of the assignment.
You have a new partner for this assignment. Check the Moodle Gradebook to see who it is.
This assignment is due at 10PM on Monday, April 27.
Setup
Mount the COURSES drive and create a folder called hw5 in your STUWORK folder.
Open the new folder in VSCode.
If you need a refresher on how to complete these steps, refer back to the in-class
lab from the first day of class.
Next, create a file called caesar.py. You'll write all your code for this assignment in this file.
Background on Caesar Ciphers
Ciphers are procedures to encrypt and decrypt messages, which is important for ensuring that messages can be delivered securely. (Originally, ciphers were necessary when mailing or using a messenger to deliver sensitive information, but today, ciphers are more commonly used to encrypt information that is sent over the internet. For example, every time you enter a password or your credit card information on a website, it needs to be encrypted before being sent to the website's servers to avoid being intercepted by an attacker who wants to steal your data.)
One method of encryption that we discussed in class on Wednesday is the Caesar cipher. In this cipher, you pick a key between 1 and 25 (inclusive), then shift each letter in your message forward by the key, wrapping around at the end of the alphabet. For example, if your original message is "carleton" and your key is 2, your encrypted message will be "ectngvqp". Supposedly, Julius Caesar used this technique to communicate with his generals. Unfortunately, it's very easy to crack.
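To make the wrap-around concrete, here is a minimal sketch of the single-key Caesar cipher described above. The name caesar_shift is made up for this illustration (it is not one of the required functions), and the sketch assumes the message contains only lowercase letters.

```python
def caesar_shift(message, key):
    """Shift each lowercase letter in message forward by key, wrapping at 'z'."""
    result = ""
    for ch in message:
        position = ord(ch) - ord("a")      # 0 for "a", 25 for "z"
        shifted = (position + key) % 26    # % 26 wraps past the end of the alphabet
        result += chr(shifted + ord("a"))
    return result


print(caesar_shift("carleton", 2))  # ectngvqp
```

The % 26 is what sends "y" shifted by 2 back around to "a" instead of running off the end of the alphabet.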
A dramatically more effective variation is the double-Caesar cipher. It is just a little more complicated to implement than the original Caesar cipher. Instead of having a single numeric key, there is a second string of text that acts as the key. Each letter in the key tells you how many letters to advance: "a" is 0, "b" is 1, "c" is 2, and so on. For example, if your original message is "carleton" and your key is "dog", the encrypted string will be "foxoszrb". Specifically, index 0 gets shifted forward by 3 since "d" is 3, index 1 gets shifted forward by 14 since "o" is 14, and index 2 gets shifted forward by 6 since "g" is 6. Then we repeat the pattern: index 3 gets shifted forward by 3 (we loop back to the first letter of "dog"), and so on.
Just to make it clear, here's that same encoding again:
Original:     c a r l e t o n
Repeated key: d o g d o g d o
Encrypted:    f o x o s z r b
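If it helps to see the repeating key as numbers, the sketch below lists the shift applied at each index of the message. key_shifts is a hypothetical helper written just for this illustration, and it assumes the key contains only lowercase letters.

```python
def key_shifts(key, length):
    """Return the shift amounts used for the first `length` letters of a message."""
    shifts = []
    for i in range(length):
        key_letter = key[i % len(key)]             # loop back to the start of the key
        shifts.append(ord(key_letter) - ord("a"))  # "a" is 0, "b" is 1, ...
    return shifts


print(key_shifts("dog", 8))  # [3, 14, 6, 3, 14, 6, 3, 14]
```

Using i % len(key) is one way to repeat the key; you and your partner may prefer a different approach.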
Part 1: Encode and decode
For this part, you'll create functions that encode and decode messages using a Double Caesar Cipher. For both functions, I recommend that you and your partner work out a solution on paper (pseudocode the algorithm) before starting to code. Then, you should check that your code works after implementing each function. Some of the instructions in Part 2 may be helpful for thinking through how to structure your tests.
Recall that we discussed the ord and chr functions in class. ord converts a character into its Unicode code point
(e.g., ord("a") returns 97) and chr converts an integer into the corresponding character (e.g., chr(97) returns "a"). You should
feel free to reference Unicode tables, e.g., this one,
if they're helpful. In the table I linked, the "decimal" column is what will be referenced by ord and chr.
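As a quick refresher, the sketch below converts between letters and their positions in the alphabet using ord and chr. The names letter_to_index and index_to_letter are hypothetical and are not required by the assignment.

```python
def letter_to_index(letter):
    """Return 0 for "a" or "A", 1 for "b" or "B", and so on."""
    # Uppercase letters have different code points (ord("A") is 65),
    # so convert to lowercase before subtracting.
    return ord(letter.lower()) - ord("a")


def index_to_letter(index):
    """Return the lowercase letter at the given position (0 is "a")."""
    return chr(index + ord("a"))


print(letter_to_index("d"))  # 3
print(index_to_letter(25))   # z
```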
Encode function
Create a function called encode that takes in two parameters: (1) message, the original text to encode (stored as
a string), and (2) key, the key to use when encoding with a double-Caesar cipher (stored as a string). This function
should return a string that contains the encoded message. Your function definition should have the following format
(don't forget to add a docstring):
def encode(message, key):
    # put the body here
For now, we will remove any non-letter characters from the input string, as these characters will not be encrypted. You
should remove spaces, periods, apostrophes, and commas from message before doing the encryption. It is fine to assume that
the original message only contains letters and the characters listed above. All encrypted characters should be lowercase,
regardless of whether the original character was upper or lowercase.
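One possible way to do this cleanup step is sketched below. clean_message is a hypothetical helper name, and the sketch relies on the assumption stated above that the message contains only letters, spaces, periods, apostrophes, and commas.

```python
def clean_message(message):
    """Return message in lowercase with spaces, periods, apostrophes, and commas removed."""
    cleaned = ""
    for ch in message:
        if ch not in " .',":       # skip the four characters we are not encrypting
            cleaned += ch.lower()
    return cleaned


print(clean_message("Hello, it's me."))  # helloitsme
```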
Decode function
Create a function called decode that takes in two parameters: (1) message, the encrypted text that we want to decode
(stored as a string), and (2) key, the key to use when decoding (stored as a string). This function should return a
string containing the decrypted text, all in lowercase. The function definition should have the following format:
def decode(message, key):
    # put the body here
Part 2: Main and testing code
In this part, you'll write two functions: main and testing. They can be completed in either order.
Main function
Create a function main that takes no parameters. The function should do the following:
- Ask the user whether they want to encode or decode
- Ask the user to enter the message to either encode or decode
- Ask the user to enter a key
- Call the appropriate function (encode or decode)
- Print the results to the screen
You should also modify your file so that the main function will run if we type python3 caesar.py on the command
line, but won't run if we import the file in the interpreter (i.e., import caesar). Look back at the starter code
for HW3 or HW4 if you don't remember how to do this.
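As a reminder, the usual Python idiom for this looks like the sketch below. The body of main here is only a placeholder; yours will ask the user for input as described above.

```python
def main():
    # Placeholder body for illustration only.
    print("main is running")


# __name__ is "__main__" when the file is run directly (python3 caesar.py),
# but it is the module's name when the file is imported.
if __name__ == "__main__":
    main()
```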
Testing function
Create a function testing that takes no parameters. You should use this function to test encode, decode, and
the functions you'll write in Part 3.
Think carefully about what test cases to include and how you'll know whether your tests work. For example, this function might include code to encrypt a string and then decrypt it to make sure the original string is returned. You should include at least three different test cases.
You should also think about what are called "boundary" or "edge" cases. These tests evaluate how well your code handles unexpected, but still valid, input. In this setting, some examples of edge cases are very short keys and very long keys.
There's not a single format that your code in testing has to adhere to, but if you're stuck I would recommend looking back at the
Debugging lab from April 15 for an example of what tests can look like.
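If you want a starting point, here is one possible shape for a test helper. The name check and its output format are assumptions made for this sketch and may differ from what we did in the Debugging lab. The demonstration call at the bottom uses a built-in string method so that the sketch runs on its own; in testing you would pass in calls to your own functions.

```python
def check(description, actual, expected):
    """Print PASS or FAIL for one test case and return True if it passed."""
    passed = (actual == expected)
    if passed:
        print("PASS:", description)
    else:
        print("FAIL:", description)
        print("  expected:", repr(expected))
        print("  actual:  ", repr(actual))
    return passed


# Inside testing(), a call might look like this (once encode is written):
# check("carleton with key dog", encode("carleton", "dog"), "foxoszrb")
check("lowercasing works", "Carleton".lower(), "carleton")
```

Printing with repr makes spaces and newline characters visible in the output, which will matter once you reach Part 3.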
One final note on running testing(). You can either temporarily comment out your interactive code in main and instead
include a call to testing. Then, you can run python3 caesar.py to run your tests. Alternatively, you can open the
Python interpreter in terminal by typing python3, then run import caesar, then type caesar.testing() to run the tests.
Note that if you're testing in the interpreter, you need to restart the interpreter each time you make changes to the file.
Part 3: Adding in spaces and newlines
So far, we have excluded all characters that aren't letters. Now, we'll add in spaces and newline characters ("\n"). We won't deal with adding in the additional punctuation (commas, apostrophes, periods) that we excluded in Part 1, but the general process to add these symbols would be the same.
In this part, you won't modify your existing functions (other than testing), but you'll add two new functions:
- encodeSpace will behave similarly to encode, but will encode messages that contain spaces and newline characters
- decodeSpace will behave similarly to decode, but will decode messages that contain spaces and newline characters
encodeSpace and decodeSpace will handle
encoding and decoding when you have spaces and newlines. A bad way to handle spaces and newlines would be to include these
characters, unencrypted, in the encoded message. This is a bad idea because then an attacker could infer things about your message
based on word and sentence length. Instead, we want to treat spaces and newlines like any other character when encoding. And our
encoded message might have spaces and newlines, but they won't be in their original positions!
The basic approach we'll take is to imagine that we have an alphabet with 28 characters, instead of 26 (treat space
as letter 27 and newline as character 28). You can still assume that key contains only letters.
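One way to picture the 28-character alphabet is as a single string, as in the sketch below. ALPHABET and shift_in_alphabet are hypothetical names, and this is only one of several reasonable approaches. Because Python counts from 0, space (letter 27) sits at index 26 and newline (character 28) sits at index 27.

```python
# 26 letters, then space, then newline: 28 characters in total.
ALPHABET = "abcdefghijklmnopqrstuvwxyz \n"


def shift_in_alphabet(ch, amount):
    """Shift ch forward by amount positions in the 28-character alphabet."""
    index = ALPHABET.index(ch)
    return ALPHABET[(index + amount) % len(ALPHABET)]


print(repr(shift_in_alphabet("z", 1)))   # ' '
print(repr(shift_in_alphabet(" ", 1)))   # '\n'
print(repr(shift_in_alphabet("\n", 1)))  # 'a'
```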
After writing these functions, you may want to think about how to re-organize your code so that you don't have a lot of duplicated code. This process is a standard part of coding and can be very useful for writing better code (i.e., code that has fewer bugs and is easier to maintain or extend later).
A note on newlines: if you're using the input function to get a string from a user and the string contains
"\n", you will find that it is interpreted as a backslash followed by the letter n, rather than a newline character. To fix this,
you can use the following command:
myString = input("Enter a string ")
myString = myString.replace("\\n", "\n")
Optional extension
If you're encrypting a long message, it is easier to read it in from a text file than to type it all at the command line
prompt. Allow the user to choose whether they'd like to use the interactive input method (as you already implemented in main),
or if they'd like to read it from a text file (the filename must end with .txt). Then, if they choose to read from a text
file, use something similar to what was in HW3 to read the file line by line and return an encrypted version of the whole file.
Here's some syntax to get you started: in the example, filename is a variable that stores the name of the file to open.
The file needs to be in the same directory as your caesar.py code.
file = open(filename, "r")
for line in file:
    # Modify this code
    print(line)
file.close()
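One small piece of this extension is checking that the filename ends with .txt before trying to open it. A possible sketch, where is_text_file is a hypothetical helper name:

```python
def is_text_file(filename):
    """Return True if filename ends with the .txt extension."""
    return filename.endswith(".txt")


print(is_text_file("message.txt"))  # True
print(is_text_file("caesar.py"))    # False
```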
Wrap up
When you're finished, make sure to complete the usual documentation steps. This includes adding comments, writing function docstrings, and adding a top-level comment, acknowledgements, and a reflection to the header.
You should also think about coding style. Have you written everything in a consistent way that is easy to read? Does your code have any unnecessary print statements? (Remove them.) Is there any repetitive code that could be rewritten to use loops or functions? Review the style document on Moodle for the expectations for this assignment.
Assignment submission and misc. notes
You're welcome to add additional helper functions to organize your code, but please make sure that the functions listed in this assignment are included and behave as specified (same number of parameters, same return values). I ask this for two main reasons:
- Code should be written to be reusable elsewhere. For example, maybe I want to write a program to check if all the homework submissions produce the same encryptions. This is a much easier problem if everyone has used the same code specification! More generally, coding projects often need to fit into a larger system of code that expects functions to be defined in a certain way.
- When grading your homeworks, the graders and I run your code and also look at the code. It's easier to give helpful feedback when all the assignments follow the same specification.
Handing in the assignment
You need to hand in caesar.py on Gradescope.
Only one partner should submit the assignment to Gradescope (but make sure to add your partner to
your group after submitting).
Grading
This assignment is worth 40 points, broken up as follows:
- encode and decode functions - 8 points each
- main function - 4 points
- testing function - 6 points
- encodeSpace and decodeSpace functions - 4 points each
- Style (6 points): header, comments, following the style guidelines from Moodle.
Start early, ask lots of questions, and have fun!
Anna's acknowledgements
This assignment was adapted from assignments used by Andy Exley, Anna Rafferty, and Layla Oesper. Thanks for sharing!