Over the past couple of weeks, I have continued testing for my research, with my last official test day being last Friday. Even though ChatGPT’s lack of consistency did not always allow for smooth testing and led to some long work hours, I am happy with how the testing turned out, and I am excited to look over the data and draw conclusions.
However, one problem I encountered throughout the testing process that I did not anticipate was building a standardized system to categorize the data. A common rule in many gendered languages is that if a person’s gender is unknown, one should default to the masculine form; a sentence using masculine spellings does not necessarily mean that the translator assumes the subject to be male, since the masculine may simply be used in a neutral way. Therefore, the translator’s intention is a crucial factor when looking at gender bias in translations. However, due to its lack of consistency, ChatGPT did not always make its assumptions explicit while translating, leaving me with the question of how conscious and deliberate ChatGPT is with its responses. Is it actively synthesizing the information it has been programmed with and gleaned from the internet, or is it merely spitting out information from the web without thought? While this would be a good topic for further investigation, it was not a question I could answer in this short timeline.
To make matters even more complicated, I also had to consider whether I wanted to categorize data purely by grammar or include cultural context. This concern was prominent in my Ukrainian and Russian translations. Although Russian has separate spellings for certain male and female occupations, it is common for people to use the male titles regardless of gender. Ukrainian as a language is also quite similar to Russian due to its history with the Soviet Union. However, with its independence and the current war between Russia and Ukraine, the Ukrainian government is actively encouraging its citizens to return to the Ukrainian dialect used before the Cold War, which had a more gendered structure than Russian. This has led the Ukrainian government to release statements in 2019 and 2021 that authorized a set of rules for forming feminine words, officially making words such as feminine occupational titles part of the formal Ukrainian language.
With all of this information and social context, I had to decide where to draw the line between grammar and culture. After going back and forth many times, as well as asking many people for their opinions, I was able to decide on a system that combines both. Since ChatGPT is familiar with information from before September 2021, it knew about the push for feminine words in Ukrainian culture when I asked it. Therefore, I inferred that it applied this information in its translations, and that its Ukrainian sentences containing masculine forms described a male subject. Since ChatGPT also knew about the tendency in Russian to use the masculine form of certain occupations for both women and men, I categorized its Russian sentences containing masculine forms as neutral. Though this system is not foolproof (for no one fully knows ChatGPT's intentions and capabilities), it has allowed me to standardize my research so that my data is consistent.
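For anyone who finds it easier to read as code, the categorization rule above could be sketched roughly like this. It is only an illustration: the function name, the labels, and the "unclassified" fallback are hypothetical, and the handling of feminine forms is my assumption here rather than something spelled out above. Only the two masculine-form rules come from the system I described.

```python
# Rough sketch of the categorization rule; names and labels are hypothetical.
# Only the two masculine-form rules below come from the system described above.
MASCULINE_READING = {
    "ukrainian": "male",    # masculine form read as describing a male subject
    "russian": "neutral",   # masculine title read as gender-neutral
}


def categorize(language: str, grammatical_form: str) -> str:
    """Return the category assigned to one translated sentence."""
    language = language.strip().lower()
    form = grammatical_form.strip().lower()

    if form == "feminine":
        # Assumption: a feminine form is taken to describe a female subject.
        return "female"
    if form == "masculine":
        # Languages outside the table are left for manual review.
        return MASCULINE_READING.get(language, "unclassified")
    # Anything else (e.g. an ambiguous form) is also left for manual review.
    return "unclassified"


if __name__ == "__main__":
    for lang in ("Ukrainian", "Russian"):
        print(lang, "masculine ->", categorize(lang, "masculine"))
```

Keeping the language-specific reading in a small lookup table makes the cultural assumption explicit and easy to change if the rule turns out to be wrong for a language.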
My goals for this week are to finish looking through my data and put the results into my tables. When I have finished drawing my conclusions, I also hope to begin working on my research poster and to have a good chunk of it done before I return home, so that I will have minimal work left before our July 28th benchmark.
I hope everyone has a good week, and I am looking forward to hearing about everyone else’s projects :)