The Value of 'Speed Bumps' in the Research Process
This summer, I completed my research project titled ‘Machine Learning Symmetries in Classical Mechanical Systems’. Before I get to the value of ‘speed bumps’ promised by this blog’s title, I want to explain what exactly I’ve been working on. To that end, here is a breakdown of the jargon in my project’s title:
- A mechanical system is any moving thing (or set of things) that a physicist tries to predict the motion of. Calling a mechanical system ‘classical’ is essentially just saying that quantum physics is being ignored.
- In physics, a symmetry is some change we can make to a system without changing the behaviour of the system. The example I like to give is of an apple falling from a tree; if the apple isn’t moved any further from or closer to the ground, the apple will fall in the same way no matter where the tree is placed. This is called horizontal spatial symmetry.
- To find the symmetry of a system, one must find the transformation of the system’s coordinates that causes the system’s Lagrangian (a quantity that has to do with the energy of the system) to be changed in a very particular way.
- Machine learning is the process whereby a model finds the best way to behave in a given situation by repeatedly adjusting its guesses based on a measure of how ‘wrong’ it is, called the loss function. Often, the model resulting from the learning process is applied to generate something (such as text) or classify things (such as images), but my project is only concerned with the learning process itself.
So, in summary, the goal of my project was to apply the machine learning process to find the changes of a given system’s coordinates that change its Lagrangian in the ‘correct’ way.
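To make the idea concrete, here is a minimal sketch using the falling-apple example. It is not the algorithm from my project, only an illustration under simplifying assumptions: the ‘model’ is a single angle `theta` describing the direction of a small shift, the loss is the average squared change in the Lagrangian under that shift, and the names `lagrangian`, `loss` and `learn_direction`, along with the values of the mass, step size and learning rate, are all made up for this example. It also demands that the Lagrangian not change at all, which is stricter than the ‘very particular’ change mentioned above.

```python
import math
import random

M, G = 1.0, 9.81  # assumed mass (kg) and gravitational acceleration (m/s^2)


def lagrangian(x, y, vx, vy):
    """Kinetic minus potential energy of an apple at height y."""
    return 0.5 * M * (vx**2 + vy**2) - M * G * y


def loss(theta, samples, eps=1e-1):
    """Mean squared change in the Lagrangian when every sample point is
    shifted a distance eps in direction theta (angle from the horizontal)."""
    dx, dy = eps * math.cos(theta), eps * math.sin(theta)
    total = 0.0
    for x, y, vx, vy in samples:
        diff = lagrangian(x + dx, y + dy, vx, vy) - lagrangian(x, y, vx, vy)
        total += diff**2
    return total / len(samples)


def learn_direction(theta0=1.0, lr=0.5, steps=200, seed=0):
    """Gradient descent on theta using a central-difference gradient."""
    rng = random.Random(seed)
    # Random positions and velocities at which the Lagrangian is compared.
    samples = [tuple(rng.uniform(-1, 1) for _ in range(4)) for _ in range(32)]
    theta, h = theta0, 1e-5
    for _ in range(steps):
        grad = (loss(theta + h, samples) - loss(theta - h, samples)) / (2 * h)
        theta -= lr * grad  # adjust the guess to make it less 'wrong'
    return theta, loss(theta, samples)


if __name__ == "__main__":
    theta, final_loss = learn_direction()
    print(f"learned angle: {theta:.6f} rad, final loss: {final_loss:.3e}")
```

Starting from a diagonal shift, the loss pushes `theta` towards the horizontal, which is the horizontal spatial symmetry described above: moving the apple sideways does not change how it falls.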
The planned progression of this project was as follows: Identify my approach to the problem → Apply this approach to finding the simplest class of symmetries → Extend the approach to identify more complicated symmetries → Evaluate results and success of the project.
In reality, my progress was not so linear. Within the first week I had identified my approach and successfully applied it to find the simplest class of symmetries. I was feeling great about my progress at this point, but I was soon to meet the first ‘speed bump’ in my journey. Two weeks later I had figured out a way to extend my approach to find more complicated symmetries. While this may sound like excellent progress, I wasn’t satisfied with what I had developed. ‘Why not?’ you may wonder. Well, although I had developed a process to do everything I had outlined at the outset, I felt that the extension of my original approach was sidestepping the purpose of the project in a manner that I will explain by way of analogy.
Consider my approach to machine learning the simplest class of symmetries as analogous to teaching a child to add 1 to any number by holding up that number of fingers, then putting up one more and counting how many fingers they are now holding up. My extended approach, by contrast, was like asking the child what 2+3 is, having somebody else tell them the answer is 5, then watching them hold up five fingers and agree that the answer is indeed 5 without ever having had to think about it.
When I first sensed that something about my approach wasn’t as I wanted it, I ignored the thought: the algorithm was working, and that was what mattered, right? I decided that my answer to that question was ‘no’. As unhappy as I was to acknowledge that my previous two weeks of work had produced an algorithm that was self-defeating (in my eyes at least), I scrapped the approach and went back to the drawing board. By deciding to address the flawed nature of my approach, I went from being weeks ahead of schedule to being weeks behind schedule.
By going back to the drawing board, I started a long journey of learning new concepts from machine learning and making several failed approaches to the problem at hand, at the end of which I landed on an approach that works and is not, in my opinion, self-defeating. Indeed, I think there is something rather satisfying about the approach I have ended up with because it involves identifying physical symmetry based on an unrelated notion of symmetry from calculus known as the symmetry of second derivatives. However, there is nothing about my final approach to the problem that I couldn’t have done in the first few weeks of working on the project. Consequently, it might seem like the time spent on unsuccessful ideas and non-functioning algorithms was wasted, but I really don’t think this is the case.
I think that through my numerous failed attempts, I gained a deeper understanding of the underlying physical theory. Furthermore, despite my limited initial experience in machine learning, I acquired knowledge of valuable techniques while exploring dead-end ideas. Most importantly, I think that although research findings have the potential to affect the rest of the world, it is primarily the research process that affects the researcher. The less predictable the process, the more it becomes a journey of discovery – both personally and intellectually – for the researcher. Knowing that such ‘speed bumps’ are formative for both the research and the researcher, I can’t lament the setbacks I experienced over the course of the project.
In the words of Robert Pirsig in Zen and the Art of Motorcycle Maintenance, ‘Stuckness shouldn’t be avoided. It’s the psychic predecessor of all real understanding.’
Photo credits: P.Ctnt, Public domain, via Wikimedia Commons