I've been wanting to experiment with functional programming in Python, specifically for machine learning and neural networks. Neural networks are, at their core, sequences of operations. I thought it'd be interesting to see if I could construct network architectures out of nothing but function calls for these operations.
I can't be purely functional here in the strict sense, since a pure function returns exactly the same output for the same input. A model's parameters are updated during training, so even if the model is a single function, its output for a fixed input changes over time. Technically you could return a new model after each update, but that would eat up a lot of memory.
I also had some fun messing with Python's operators. One problem with nesting function calls is that the first function you read is the last one called. For example, consider a network with one hidden layer trained on MNIST. If each operation is a function, then you'd calculate the digit probabilities like this:
log_softmax(linear(256, 10)(relu(linear(784, 256)(flatten(inputs)))))
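To make the nested style concrete, here's a minimal NumPy sketch of what such composable operations might look like. This is an illustration only, not XABY's actual implementation: the names match the example above, but the `linear` closure over randomly initialized weights and the specific shapes are my assumptions.

```python
import numpy as np

def flatten(x):
    # Collapse everything but the batch dimension
    return x.reshape(x.shape[0], -1)

def relu(x):
    return np.maximum(x, 0)

def linear(n_in, n_out):
    # Hypothetical layer constructor: returns a function
    # closing over randomly initialized weights
    w = np.random.randn(n_in, n_out) * 0.01
    b = np.zeros(n_out)
    return lambda x: x @ w + b

def log_softmax(x):
    # Numerically stable log-softmax over the last axis
    shifted = x - x.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

inputs = np.random.randn(32, 28, 28)  # a batch of MNIST-sized images
probs = log_softmax(linear(256, 10)(relu(linear(784, 256)(flatten(inputs)))))
print(probs.shape)  # (32, 10)
```

Reading this requires working from the innermost call outward, which is exactly the ergonomic problem described above.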
Instead, XABY uses the >> operator to call functions in succession:
inputs >> flatten >> linear(784, 256) >> relu >> linear(256, 10) >> log_softmax
This way the code reads in the same order as execution.
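One way to get this behavior in plain Python is to wrap each operation in a class that implements the reflected `__rrshift__` method, so `data >> op` calls the operation on the data. This is a sketch of the general technique, not XABY's internals; the `Op` class and all the operation definitions here are my own illustration (the `__array_ufunc__ = None` line is needed so NumPy arrays defer `>>` to the wrapper instead of attempting a bitwise shift).

```python
import numpy as np

class Op:
    """Wraps a function so it can be chained with >>."""
    # Tell NumPy to defer binary operators to this class,
    # so ndarray >> Op falls through to __rrshift__ below
    __array_ufunc__ = None

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, x):
        return self.fn(x)

    def __rrshift__(self, x):
        # Invoked for `x >> self` when x doesn't handle the operator
        return self.fn(x)

flatten = Op(lambda x: x.reshape(x.shape[0], -1))
relu = Op(lambda x: np.maximum(x, 0))

def linear(n_in, n_out):
    # Hypothetical layer constructor: an Op closing over random weights
    w = np.random.randn(n_in, n_out) * 0.01
    b = np.zeros(n_out)
    return Op(lambda x: x @ w + b)

def _log_softmax(x):
    shifted = x - x.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

log_softmax = Op(_log_softmax)

inputs = np.random.randn(32, 28, 28)
out = inputs >> flatten >> linear(784, 256) >> relu >> linear(256, 10) >> log_softmax
print(out.shape)  # (32, 10)
```

Because `>>` is left-associative, each operation receives the result of everything to its left, so the chain evaluates strictly left to right.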