Recently, I submitted a paper titled “Learning Graphical State Transitions” to the ICLR conference! In it, I describe a new type of neural network architecture, called a Gated Graph Transformer Neural Network, that is designed to use graphs as an internal state. I demonstrate its performance on the bAbI tasks as well as on some other tasks with complex rules. While the main technical details are provided in the paper, I figured it would be worthwhile to describe the motivation and basic ideas here.
Note: Before I get too far into this post, if you have read my paper and are interested in replicating or extending my experiments, the code for my model is available on GitHub.
Another thing that I’ve noticed is that almost all of the papers on machine learning are about successes. This is an example of an overall trend in science to focus on the positive results, since they are the most interesting. But it can also be very useful to discuss the negative results. Learning what doesn’t work is in some ways just as important as learning what does, and can save others from repeating the same mistakes. During my development of the GGT-NN, I went through multiple earlier iterations of the model, all of which failed to learn anything interesting. The version of the model that worked was thus a product of an extended cycle of trial and error. In this post I will try to describe the failed models as well, and give my speculative theories for why they may not have been as successful.
This summer, I had the chance to do research at Mudd as part of the Intelligent Music Software team. Every year, under the advisement of Prof. Robert Keller, members of the Intelligent Music Software team work on computational jazz improvisation, usually in connection with the Impro-Visor software tool. Currently, Impro-Visor uses a grammar-based approach for most of its generation ability. The goal this summer was to try to integrate neural networks into the generation process.
It’s hard not to be blown away by the surprising power of neural networks these days. With enough training, so-called “deep neural networks”, with many nodes and hidden layers, can do impressively well at modeling and predicting all kinds of data. (If you don’t know what I’m talking about, I recommend reading about recurrent character-level language models, Google Deep Dream, and neural Turing machines. Very cool stuff!) Now seems like as good a time as any to experiment with what a neural network can do.
For a while now, I’ve been tossing around vague ideas about writing a program to compose music. My original idea was based on a fractal decomposition of time and some sort of repetition mechanism, but after reading more about neural networks, I decided that they would be a better fit. So a few weeks ago, I got to work designing my network. And after training for a while, I am happy to report remarkable success!
For the last couple of weeks, I have been working on creating a "whiteboard drawing bot", a Raspberry-Pi-powered contraption that can draw shapes and text on a whiteboard. After four redesigns and about a thousand lines of code, I'm finally finished. Tada!
Anyways, now that I have finished building it, I am going to write a bit about how I did so in a few posts. For this first post, I'm going to be talking about the hardware behind the bot and a little bit of the software on the Raspberry Pi. Then, later, I'll talk about the custom software I wrote to translate shapes and lines into commands for the Pi.