|A neural network is a series of matrix multiplications that takes an input and produces an output. That's it!|
We do have to be able to represent our inputs and outputs numerically, but it turns out we can represent almost any kind of data numerically.
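To make "represent almost any kind of data numerically" a bit more concrete, here is a minimal sketch in Python with NumPy. The specific encodings (character codes for text, a one-hot row for a category, a flattened pixel grid for an image) are illustrative assumptions, not something taken from the spreadsheet.

```python
import numpy as np

# Text: each character becomes its Unicode code point.
word = "cat"
word_as_numbers = np.array([ord(ch) for ch in word])
print(word_as_numbers.tolist())  # [99, 97, 116]

# A category: a "one-hot" row with a 1 in the position of the chosen label.
labels = ["cat", "dog"]  # hypothetical label set
one_hot = np.eye(len(labels))[labels.index("dog")]
print(one_hot.tolist())  # [0.0, 1.0]

# A tiny grayscale image: a 2x2 grid of pixel intensities, flattened into one row.
image = np.array([[0, 255], [128, 64]])
image_as_row = image.reshape(1, -1)
print(image_as_row.shape)  # (1, 4)
```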
In this example, our input is an array of numbers. This might seem kind of boring and arbitrary, but it should also underscore just how flexible neural networks can be. We can come up with any combination of inputs and outputs we want, and our neural network will look for a way to define a relationship between them.
|Our Inputs||Our Desired Outputs|
|"Ok network, take these numbers and figure something out."||What we want our network to "figure out"|
|Matrix Multiplication #1|
|We call the values in the matrix the weights or coefficients of the network. Our weights, at the moment, are simply normally distributed random numbers with a mean of 0 and a standard deviation of 5. Their role is to amplify or dampen the inputs we give them, depending on how significant each input is to the task the network is trying to learn. If the network learns that an input is significant, it assigns a higher weight to that input.|
We multiply our input matrix with the weights matrix to get our first set of outputs, also called activations. Why are they called activations? I don't know!
|If we stopped right here, we would already have a neural network. It would just be a network with a single layer.|
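The same single-layer step can be sketched outside the spreadsheet in a few lines of NumPy. The input values and the matrix shapes here are assumptions made for illustration. Only the weight distribution (mean 0, standard deviation 5) comes from the description above.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only so the sketch is repeatable

# Assumed input: one row of four numbers (placeholder values).
inputs = np.array([[1.0, 2.0, 3.0, 4.0]])

# Weights: normally distributed random numbers, mean 0, standard deviation 5.
# The (4, 3) shape is an assumption: 4 rows to match the 4 inputs,
# 3 columns chosen arbitrarily.
weights_1 = rng.normal(loc=0.0, scale=5.0, size=(4, 3))

# Matrix multiplication #1: (1, 4) @ (4, 3) gives (1, 3) activations.
activations_1 = inputs @ weights_1
print(activations_1.shape)  # (1, 3)
```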
|Matrix Multiplication #2|
|We can have as few or as many layers as we like (the more layers we add, the more complex our model), but our final set of activations should have the same shape as our desired output - one row and two columns.|
|Matrix Multiplication #3|
|Alright! We have our final activations. |
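Chaining the three multiplications together might look something like the sketch below. The input values and the hidden layer sizes are assumptions. The only shape the walkthrough fixes is the final one: one row and two columns.

```python
import numpy as np

def forward(inputs, weights):
    """Multiply the inputs through each weight matrix in turn."""
    activations = inputs
    for w in weights:
        activations = activations @ w
    return activations

rng = np.random.default_rng(seed=0)
inputs = np.array([[1.0, 2.0, 3.0, 4.0]])  # assumed (1, 4) input

# Assumed layer sizes: 4 inputs -> 3 -> 3 -> 2 outputs.
layer_sizes = [4, 3, 3, 2]
weights = [
    rng.normal(0.0, 5.0, size=(n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]

final_activations = forward(inputs, weights)
print(final_activations.shape)  # (1, 2)
```

Adding or removing a layer only means changing `layer_sizes`, as long as each matrix has as many rows as the previous one has columns and the last entry stays at 2.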
How did our network do? The spreadsheet generates new matrices every time it loads, so I don't know what your results look like, but I'm guessing they don't look very good. For example, right now I'm looking at an output of (-8902, 4926), which isn't just a little bit different from the (0, 1) that we wanted - it's in a completely different neighborhood.
This shouldn't be all that surprising, because our network is literally made up of random numbers. We can try to improve our results by simply asking the network for new sets of numbers until we get something that looks about right. In case you were wondering, this is not the way most data scientists do it.
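For the curious, the "keep asking for new random numbers" approach could be sketched like this. The input values, the layer sizes, the number of attempts, and the use of summed squared difference as the measure of "about right" are all assumptions made for this sketch.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
inputs = np.array([[1.0, 2.0, 3.0, 4.0]])  # assumed input
target = np.array([[0.0, 1.0]])            # the (0, 1) we wanted
layer_sizes = [4, 3, 3, 2]                 # assumed layer sizes

def random_weights():
    """Draw a fresh set of weight matrices (mean 0, standard deviation 5)."""
    return [
        rng.normal(0.0, 5.0, size=(n_in, n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]

def forward(x, weights):
    for w in weights:
        x = x @ w
    return x

def error(output):
    """Summed squared difference from the target (an assumed measure)."""
    return float(np.sum((output - target) ** 2))

errors = []
best_error, best_output = float("inf"), None
for _ in range(1000):  # ask for 1000 brand-new networks
    output = forward(inputs, random_weights())
    errors.append(error(output))
    if errors[-1] < best_error:
        best_error, best_output = errors[-1], output

print(best_output, best_error)
```

Each attempt throws away everything from the previous one, so nothing is learned between guesses. That is a big part of why nobody trains networks this way.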
There are a couple of ways that we can systematically improve our activations:
1. Make changes to the way we initialize our weights (making the random numbers in our matrices less random) so our first guess comes out a lot closer to the answer that we want.
2. Build in some kind of optimization process so the network can adjust its weights in response to the difference between our activations and desired outputs. This is where we get to put all that high school calculus to use.
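As a rough preview of both ideas, here is a minimal sketch. It is not how the spreadsheet works: the smaller standard deviation of 0.1, the single-layer network used for the optimization part, the squared-difference loss, and the learning rate are all assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
inputs = np.array([[1.0, 2.0, 3.0, 4.0]])  # assumed input
target = np.array([[0.0, 1.0]])            # desired output

def forward(x, weights):
    for w in weights:
        x = x @ w
    return x

# 1. Less random weights: same random draws, smaller standard deviation.
layer_sizes = [4, 3, 3, 2]  # assumed layer sizes
unit_weights = [
    rng.standard_normal((n_in, n_out))
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
]
output_std_5 = forward(inputs, [5.0 * w for w in unit_weights])
output_std_small = forward(inputs, [0.1 * w for w in unit_weights])
print(output_std_5, output_std_small)  # the second lands much closer to zero

# 2. Optimization: gradient descent on a single layer of weights.
weights = rng.normal(0.0, 5.0, size=(4, 2))
learning_rate = 0.01  # assumed step size
loss_before = float(np.sum((inputs @ weights - target) ** 2))
for _ in range(50):
    difference = inputs @ weights - target  # activations minus desired outputs
    gradient = 2 * inputs.T @ difference    # derivative of the squared difference
    weights -= learning_rate * gradient     # nudge the weights downhill
loss_after = float(np.sum((inputs @ weights - target) ** 2))
print(loss_before, loss_after)
```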
We'll get into both of these things later on, but for now we're done with our explanation of what a neural network is. And we've implemented a (pretty terrible) neural network from scratch.
... That's it? Does this all feel suspiciously simple to you? It does to me too!
|Originally part of the One Data Science a Day series at theianchan.com/one-data-science-a-day/day-eleven|