Week 7
CS + X
On 7/06, I was able to meet the next group of Upward Bound students! There are four students in the class, two freshmen and two sophomores, and while they are excited to get started, two of them noted they were anxious about computer science, as they had tried it in middle school and found it difficult. With this in mind, Sophia and I spent a long time on ice-breakers and introducing computer science. We made sure to keep our explanations accessible and asked the students to brainstorm and share ways that they interact with computers and technology. They enjoyed most of the questions in the “Get To Know You” Python program, especially the one that asked to plan the perfect meal.
All of the students have Python programming experience, so we were able to move through the “Intro to Python” Google slides quicker than the previous section. A difference that we made from the previous session to this one is how we describe variables. The last session we used a more technical definition about objects and pointers, but we noticed that this lead to some doubts and confusions, so we opted for an analogy of a box used for moving. When you are moving, you pack items into a box and then label the box, so later you are able to find the items you need. The students were nodding along at this, so we felt this was a more appropriate first definition.
Lastly, I introduced and supported students through the “Introduction to Python” Google colab with practice and exploratory exercises. I drastically cut down on the content and rephrased the instructions to be more beginner-friendly than last session, and this proved to be incredibly successful. The students were able to interact with and explore the content with minimal assistance from Sophia and myself, and I felt that they are more confident than the last session due to this change.
Next week, we will introduce linguistics and the projects!
Study Skills and Belonging
Due to the Fourth of July holiday, the research team and I were unable to meet and share updates. I was able to code, here meaning identifying and flagging key words, two more transcripts. I felt like I was able to identify more key words than I was able to in the past due to the analysis I performed. In previous weeks, Deb has asked me to look into where our coding styles differed, and I found that I am more likely to use the “Self” code in places where the interviewee is not explicitly referencing themselves. I deduced that with my qualitative background, I have a tendency to read between the lines and draw conclusions; also, I remembered the voice of the interviewee from my time hand-correcting the A.I. generated transcripts, so I was drawing on the interviewee’s tone of voice and cadence.
I was able to perform my first draft of the sentiment analysis on the hand-corrected interview texts. For reference, the data that I have access to is in a .csv format with the reference text, a rolled up list of Deb’s codes, a rolled up list of my codes, the text Deb tagged, the text I tagged, and the “similarity” score. The similarity score is how Deb analyzed the differences in our codes and subcodes. Previously, I had uploaded, cleaned, and parsed the data into dataclasses called Code and Entry. Entry refers to each attribute of the data, and instead of a list of codes for Deb and I, I created a class called Code that hold the code, subcode, and note information for each tagged code.
With all the data neatly sectioned, I was able to add a Sentiment class which held the reference text, sentiment (Very Negative, Negative, None, Positive, and Very Positive), and polarity (a float from -1 to 1). Using the TextBlob package, I was able to scan each reference text, calculate a polarity, and then assign a sentiment. Each Entry class has the Sentiment as an attribute.
In addition, I needed to add a method to examine whether my code and Deb’s code were equal. My logic was that if we tagged the same code and subcode for a reference text, those two codes are “identical” and can be used in a sentiment analysis. To handle ‘nan’ values, I added a try/except clause.
def find_identical_code(self):
identical = dict()
for code in self.deb_codes:
for code1 in self.angela_codes:
try:
code.__eq__(code1)
identical[self.reference_text] = [code.code, code.subcode, self.sentiment.sentiment,
self.sentiment.polarity]
except:
continue
with open("sentiment.csv", "w") as csvfile:
w = csv.writer(csvfile)
for entry in entries:
# Build a dictionary of the member names and values...
same = entry.find_identical_code()
w.writerows(same.items())
After writing and exporting the data to a .csv, I was able to see that the code I wrote identified 84 common codes for analysis. Here is my priliminary analysis:
- Code breakdown: eight in Resources, 24 in Self, and 52 in Strategy
- Of the 8 in Resources, all 8 were ‘Very Negative’
- Of the 24 in Self, 8 were ‘Negative’, 2 were ‘None, and 14 were ‘Very Negative’
- Of the 52 in Strategy, 31 were ‘Very Negative’, 1 was ‘Positive’, 5 were ‘None’, and 15 were ‘Negative’
These results were confusing for me because there were so many ‘Very Negative’ and ‘Negative’ flagged text. For my next draft, I am going to see if I can learn a bit about how to train the TextBlob library as well as breaking down the larger reference text into more meaningful chunks that may give more insights.
General Reflections
I am happy that the changes that Sophia and I made to the introductory content were more accessible for the students. It is important to build confidence and success when you are first learning a new skill, or perhaps engaging with something you have not in awhile. I am looking to future content that the students will work with to add some more “everyday”, less technical definitions and explanations. While we would like the students to be familiar with the technical definitions, there is a time and a place for those. Instead we will scaffold and help the students build their working knowledge so they can more independently learn later.
For the Study Skills analysis, I am proud of the coding that I was able to do to compare individual Code objects and write to a .csv. I researched these techniques and asked friends for help, and after some debugging I am pleased with the program. I do feel that the results are not as accurate as they could be. I wonder if my program is not including “all” shared codes and instead is only saving the “first” code. I feel like we should have more shared codes in common, especially going forward after our coding reconcilliation meeting. I will do some more digging and data validation to ensure I have accurate results.