I would not get that. So this prevents situations where you have individual neurons taking on more of the work than they should. Simply, the X and Y coordinates, the vertical and horizontal position of each data point, as our neural network is given a point to classify. So that's why alternative functions work a little bit better in practice. I mean, compare that to the billions of neurons that exist inside your head. Hello, and welcome to a deep learning with Python and PyTorch tutorial series, starting from the basics. So let's go ahead and run these two previous blocks (shift-enter, shift-enter), and at this point you're ready to actually train your neural network. It actually compares that to the known correct value. It can also be used with text data, for example. Okay, so in this case, we predicted that this was a number nine. Go ahead, hit shift-enter. It's not that complicated, right? It's also good. Imagine this scaled down to 224 by 224; there's really not going to be a lot of information there, but it still figured out that that's a rabbit. Next we need to define what's called an optimizer, and in this case we're going to use stochastic gradient descent. There's sort of a spooky aspect to how this stuff all works together. You wouldn't want your car to suddenly crash into a wall just because it lost its network connection to the cloud, now would you? So this turns out to be a pretty good little calculus trick. You can learn how to use Keras in a new video course on the freeCodeCamp.org YouTube channel. ...let alone a cluster of computers. TensorFlow will then go and figure out the optimal way to distribute and parallelize that work across your entire set of GPUs and computers in your cluster. So that is one-hot encoding. Hit shift-enter after selecting the appropriate blocks of code here. There's actually a description of the dataset in the names.txt file that goes along with that dataset. Try different learning rates.
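To make the stochastic gradient descent idea mentioned above concrete, here is a minimal sketch in plain Python. The single-weight model, the toy data, and the learning rate are illustrative assumptions, not something from the course itself:

```python
# Minimal sketch of stochastic gradient descent: fit y = w * x by
# repeatedly nudging w against the gradient of the squared error.
# The learning rate and toy data are illustrative assumptions.
def sgd_step(w, x, y, lr=0.1):
    y_pred = w * x                  # forward pass
    grad = 2 * (y_pred - y) * x     # derivative of (w*x - y)^2 w.r.t. w
    return w - lr * grad            # step against the gradient

w = 0.0
for x, y in [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)] * 50:
    w = sgd_step(w, x, y)
print(round(w, 3))  # converges toward 2.0
```

The same update rule is what an optimizer object applies to every weight in a real network; a framework just computes the gradients for you.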
By the time you watch this, they might even be a reality. That means that we're only going to train our neural network using that 60,000-sample training set, and we're holding aside 10,000 test samples so we can actually test how well our trained network works on data that it's never seen before. Neural networks have challenges. And I have said before that often a lot of the work in machine learning is not so much building your models and tuning them. And again, convolution is just breaking up that image into little subfields that overlap each other for individual processing. I mean, it's not even like a prominent piece of this image. We use the weights that we're currently using in our neural network to backpropagate that error to individual connections. Let's go ahead and execute that. If you're dealing with 3D volumetric data of some sort. So, for example, in this picture here, that stop sign could be anywhere in the image, and a CNN is able to find that stop sign no matter where it might be. So you take the sum of all the weights of the inputs coming into a neuron. You know, it's basically breaking up the image into different sections. They are everywhere now, ranging from audio processing to more advanced reinforcement learning (e.g., ResNets in AlphaZero). We can also have an output that is a time series or some sequence of data as well. So there we have it. So you know, sometimes you want to converge on the topology you want, and then go back and implement that at the TensorFlow layer. We're only going to run 10 epochs this time because, again, it takes a long time; more would be better. Is this a calculus trick for making gradient descent faster? You can also run TensorFlow on just about anything. They are very sensitive to the topologies that you choose and the choice of hyperparameters. When in London, one must eat. This takes a long time.
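The train/test holdout described above can be sketched in a few lines of plain Python; the 60,000 / 10,000 sizes follow the MNIST convention the text mentions, and the integer "samples" are just stand-ins:

```python
# Sketch of a holdout split: train on one slice of the data, keep a
# separate slice the model never sees for evaluation. The sizes follow
# the 60,000 / 10,000 MNIST split mentioned in the text.
data = list(range(70_000))            # stand-in for 70,000 samples
train, test = data[:60_000], data[60_000:]
print(len(train), len(test))          # 60000 10000
```

The important property is that no sample appears in both slices, so test accuracy measures generalization rather than memorization.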
But at the same time, it has a Python interface, so you can talk to it just like you would any other Python library. And again, Keras is just a layer on top of TensorFlow that makes deep learning a lot easier. Let's give it some more time; pretty firmly in the nineties at this point. And I've already done that, so it won't actually do anything for me. We would call that a sequence-to-vector. Think twice before you do something like that. So we start off by initializing all of our variables with random values, just to make sure that we have a set of initial random settings there for our weights. Okay, so we have all of our input data converted already to numerical format. If you want to join our Facebook group, that's totally optional; it's a place for students to hang out with each other. And our development environment for this course will be Anaconda, which is a scientific Python 3 environment. So we have a dataset of mammogram masses that were detected in real people, and we've had real... It models a neuron, which has a set of inputs, each of which is given a specific weight. So we have to talk about one-hot encoding at this point. I'm going to use the Pandas library. Using RNNs for sentiment analysis: what we're gonna do here is try to do sentiment analysis. And we're going to try to see if we can predict if a politician is Republican or Democrat, just based on how they voted on 17 different issues. And anyone entering the field, either as a researcher or practitioner. All right, so we start off by encoding that known label into a one-hot encoded array. That's the power of Keras. Try more neurons, fewer neurons.
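Since one-hot encoding comes up here, a minimal sketch in plain Python may help; the 10-class default mirrors the digit-classification example elsewhere in the course:

```python
# Sketch of one-hot encoding, as discussed above: the label index
# becomes a 1 in an otherwise all-zero vector, one slot per class.
def one_hot(label, num_classes=10):
    vec = [0] * num_classes
    vec[label] = 1
    return vec

print(one_hot(3))  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```

This is what utilities like Keras's `to_categorical` produce, and it matches the shape of a network's output layer with one neuron per class.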
If you're trying to find the slope of a curve and you're dealing with multiple parameters, we're talking about partial derivatives, right? The first partial derivatives figure out the slope in the direction we're heading. Now, it turns out that this is very mathematically intensive and inefficient for computers to do. So, for example, there's the LeNet-5 architecture that you can use that's suitable for handwriting recognition. We've talked about how this all works at a low level, and in TensorFlow 2 it's still possible to implement a complete neural network basically from scratch, but in TensorFlow 2 they have replaced much of that low-level functionality with a higher-level API called Keras. Hey, we didn't actually run it. So go forth and do more deep learning. Now think about the scale of your brain. Basically, we're going through 200 images that were known to be incorrect. I'm being sarcastic. ...across your entire cluster, and that will ultimately print the value three in the form of a new tensor. I mean, we could do better. You know, different topologies. Maybe you're trying to decide if images of people are pictures of males or females, or maybe trying to decide if someone's political party is Democrat or Republican. The same video cards you're using to play your video games can also be used to perform deep learning and create artificial neural networks. Like I said, CNNs are compute-intensive. But you know, you can see there's a wide variety of handwriting capabilities among the people who made this test data. I enjoyed working there. And finally, we will actually run it now. Now, Keras is slower than raw TensorFlow, and you know it's doing a little bit more work under the hood, so this will take more time, but you'll see that the results are really good. It's very easy to do cross-validation and, like, perform proper analysis and evaluation of this neural network.
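The partial-derivative idea above can be illustrated numerically with finite differences; the toy function of two parameters is an assumption for the sake of the example (real frameworks use autodiff, which is far more efficient, which is exactly the point the text makes):

```python
# Rough numerical sketch of a partial derivative: estimate the slope
# of f along one parameter by bumping that parameter a tiny amount.
def partial(f, params, i, h=1e-6):
    bumped = list(params)
    bumped[i] += h
    return (f(bumped) - f(params)) / h

f = lambda p: p[0] ** 2 + 3 * p[1]    # toy function of two parameters
grad = [partial(f, [2.0, 1.0], i) for i in range(2)]
print([round(g, 3) for g in grad])    # close to [4.0, 3.0]
```

Doing this bump-and-recompute for every weight in a large network is exactly the "intensive and inefficient" approach, which is why reverse-mode automatic differentiation exists.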
There are various applications of deep learning in industry; here are a few of the important ones that are present in our day-to-day tasks. Sounds very, very high-level, high-horse preachy, I know. It's just one of many that I offer in the fields of AI and big data, and I hope you want to continue your learning journey with me. You know, for the rest of this course we're just gonna be talking about ways of implementing something like this. It's benign or malignant. Product recommendations. Okay? So a little hint: there's preprocessing.StandardScaler out of scikit-learn that can make things very easy for you. Okay, so as we keep running this thing over and over again, we'll have some new data coming in that gets blended together with the output from the previous run through this neuron, and that just keeps happening over and over and over again. So we're going to reshape the training images to be 60,000 by 784. Again, we're going to still treat these as 1D images. A higher level might take those edges and recognize the shape of that stop sign and say, oh my gosh. So this is an example of doing sentiment classification using real user review data from IMDb. Call argmax on the resulting classification in one-hot format and see if that predicted classification matches the actual label for that data. Ultimately, we're gonna create a Sequential model, and we're just gonna follow the pattern that we showed earlier for doing a binary classification problem. So here's a little function that creates a model that can be used with scikit-learn. There's a saying that goes, cells that fire together wire together. One way of dealing with it is just trial and error. We can reinforce those weights over time and reward the connections that produced the behavior that we want. You know, it's just a matter of choosing the one that makes sense for what you're trying to do.
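Here's a minimal sketch of what scikit-learn's `preprocessing.StandardScaler`, hinted at above, does under the hood: subtract each feature's mean and divide by its standard deviation so features end up centered with unit scale. The three-value input is just an illustration:

```python
# Sketch of standard scaling: center the values on zero and divide by
# the standard deviation so the feature has unit variance.
def standardize(values):
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / var ** 0.5 for v in values]

scaled = standardize([2.0, 4.0, 6.0])
print([round(v, 3) for v in scaled])  # roughly [-1.225, 0.0, 1.225]
```

In practice you'd fit the scaler on the training data only and reuse the same mean and standard deviation on the test data.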
I mean, well, this was actually room service, but you could definitely imagine that's in a restaurant instead. The big difference is this little loop here. That's just a way of converting... step here. The more weight an input has, the more impact it will have on the neural network. And since we have to simulate things over time, and not just through, you know, the static topology of your network... Mathematically, a perceptron can be thought of as an equation of weights, inputs, and bias. This is incredible stuff. So when there are fewer things for you to screw up and more things that Keras can take on for you in terms of optimizing what you're really trying to do, often you can get better results without doing as much work, which is great. And then we can just compare that and compute the actual cross-entropy term by doing reduce_sum to go across the entire set of all values within this batch, using that logarithmic comparison, like we said, to actually compute cross-entropy. There was missing column-name information in the data file, and there were missing values in there. Another interesting application of RNNs is machine-generated music. It's kind of interesting to watch this, because the accuracy is kind of fluctuating a little bit as we go here. So you know, you can tell it's, like, maybe settling into a little local minimum here, and working its way out of those and finding better solutions over time. And also, there's a chain-link fence in there as well, for good measure. So instead of writing a big function that does each iteration of learning by hand like we did in TensorFlow, Keras does it all for us. And compare that to the one-hot encoded known value that we have for that label. That could be worse. And if we do that enough times, it should converge to a neural network that is capable of reliably classifying these things. So now we can just use it.
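The cross-entropy computation described above (a log-based comparison summed across the values) can be sketched in plain Python; the one-hot label and the softmax-style prediction are illustrative values:

```python
import math

# Sketch of cross-entropy for one sample: sum over classes of
# -(true probability) * log(predicted probability). With a one-hot
# label, only the true class's predicted probability contributes.
def cross_entropy(y_true, y_pred):
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

loss = cross_entropy([0, 1, 0], [0.1, 0.8, 0.1])
print(round(loss, 4))  # -log(0.8), about 0.2231
```

A confident correct prediction gives a loss near zero, while a confident wrong one blows the loss up, which is what makes this a useful training signal.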
Let's take that one step further, and we'll have a multi-layer perceptron. It's kind of a low-tech way of doing it, but it's effective. Okay, so those are the basic mathematical terms, or algorithmic terms, that you need to understand to talk about artificial neural networks. So imagine that the threshold for our neuron was that if you have two or more inputs active, you will in turn fire off a signal. We have emergent behavior here; an individual linear threshold unit is a pretty simple concept. All right, so these are some pretty messy examples in this example. And finally, we have the thing that we're trying to predict. In that respect, it sounds a lot like Apache Spark. You know, that can basically act like the bias term that we talked about earlier; that could help too. It's really that easy. Here, we will train our perceptron for 100 epochs. Again, it's 784; that's gonna flatten these two-dimensional arrays down to one-dimensional 784-element tensors. But it ends up getting twisted by other people into something that is destructive, and that's something else you need to think about. What is a tensor, anyway? See if you can improve upon things. 5. The activation function: we talked about not using a step function and using something else; of the other ones that are popular, ReLU is actually very popular right now, an activation function we haven't talked about yet. It's very hard to understand intuitively what's going on inside of a neural network, a deep learning network in particular, so sometimes you just have to... The output is 1 if any of the inputs is also 1. But either way, we will see if it's a channels-first format or not, and reshape the data accordingly. And what you need to know is that it can compute all the partial derivatives you need just by traversing your graph in the number of outputs plus one passes.
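The perceptron equation of weights, inputs, and bias described above can be sketched directly; the specific weights and bias here are hand-picked assumptions that happen to implement the OR gate behavior the text mentions ("the output is 1 if any of the inputs is also 1"):

```python
# Sketch of a perceptron: weighted sum of inputs plus a bias, passed
# through a step activation. The weights and bias are hand-picked
# assumptions that implement an OR gate.
def perceptron(inputs, weights, bias):
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0          # step activation function

or_gate = lambda a, b: perceptron([a, b], weights=[1, 1], bias=-0.5)
print([or_gate(0, 0), or_gate(0, 1), or_gate(1, 0), or_gate(1, 1)])  # [0, 1, 1, 1]
```

Geometrically, those weights and bias define the decision line that separates the (0, 0) input from the other three, which is the separator idea discussed later.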
Now, an individual neuron will fire, or send a signal to all the neurons it is connected to, when enough of its input signals are activated; so at the individual neuron level it's a very simple mechanism. You even saw in some of the examples that we ran in the TensorFlow Playground that sometimes we end up with neurons that are barely used at all, and by using dropout, we would have forced those neurons to be used more effectively. There's a tray containing my food. Let's play around some more. I mean, it is super, super easy to deploy AI. Convolutional neural networks: so far, we've seen the power of just using a simple multi-layer perceptron to solve a wide variety of problems. So we want to have an upper bound on how many time steps we need to backpropagate to. Basically, it converts each of the final weights that come out of your neural network into a probability. And that's just like a scikit-learn model. So at that point, it goes off and says, okay, we have this graph constructed of A and B; A contains one. The way that it actually works is that you might push the actual trained neural network down to the car itself and actually execute that neural network on the computer embedded within your car, because the heavy lifting of deep learning is training that network. So if your past training data had a bias toward, you know, white men in their twenties who are fresh out of college, your system is going to penalize more experienced candidates who might in fact be better candidates, who got passed over simply because they were viewed as being too old by human reviewers. In fact, it's just as quick. That's the one. So let me tell you a story, because this has actually happened to me more than once. Let's talk about this in more colloquial language, if you will. We'll get there. And you're seeing that in some of the APIs and libraries that are coming out.
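The step that "converts each of the final weights that come out of your neural network into a probability" is softmax; here is a minimal sketch with illustrative scores:

```python
import math

# Sketch of softmax: exponentiate the raw output scores and normalize
# so they become probabilities that sum to one.
def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(round(sum(probs), 6))  # 1.0
```

The largest raw score always gets the largest probability, so taking the argmax of the softmax output is the same as taking the argmax of the raw scores.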
And the rest here, for loading up the training and test data, is gonna look just like it did before. So this is the label, the severity: whether it's benign (zero) or malignant (one). So our neural network is going to be able to take every one of those rows of one-dimensional data and try to figure out what number that represents in two-dimensional space, so you can see that it's thinking about the world, or perceiving the world. But let's see if it works. I wasn't OK with that. It's built into TensorFlow. So no matter which receptive field picked up that stop sign, at some layer it will be recognized as a stop sign. One other thing. And as before, we will convert the label data into one-hot categorical format, because that will match up nicely with the output of our neural network; nothing different here. So how do you build a CNN with Keras? A 3D layer is available as well. And then the output of those neurons can then get fed back at the next step to every neuron in that layer. When we're done, we can evaluate the performance of this network using a test dataset that it's never seen before, and see if it can correctly classify data that it was not trained on. You can find a new job tomorrow, okay? If you have an either/or sort of problem, then that's what we call a binary classification problem, and you can see here there... But if you want to hit pause here, we can come back later. It's complicated, you know. Installing TensorFlow is really easy. Each image is a one-dimensional array, or vector, or tensor if you will, of 784 features: 784 pixels. Therefore, a perceptron can be used as a separator, or a decision line, that divides the input set of an OR gate into two classes. Class 1: inputs having output 0, which lie below the decision line. A group of parameters, you know, some set of ways that we have tuned the model, and we need to identify different values of those parameters that produce the optimal results.
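The flattening described above, turning a 28 by 28 image into one row of 784 features, can be sketched in plain Python; the image contents here are just stand-in pixel values:

```python
# Sketch of flattening a 28x28 image into a single 784-element vector,
# as described for feeding two-dimensional image data into a dense layer.
image = [[row * 28 + col for col in range(28)] for row in range(28)]  # stand-in pixels
flat = [pixel for row in image for pixel in row]
print(len(flat), flat[0], flat[-1])  # 784 0 783
```

Row order is preserved, so the network still receives every pixel; it just loses the explicit 2D neighborhood structure, which is exactly what convolutional layers later recover.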
I mean, obviously, making an actual feature for this model that includes age or sex or race or religion would be a pretty bad idea, right? So we call these local receptive fields; they're just groups of neurons that respond only to a part of what you're seeing. We'll give it a title, we'll show that two-dimensional image in grayscale, and just show it. We're using Dense, Dropout, and Sequential, and we're also going to use cross_val_score to actually evaluate our model and illustrate integrating Keras with scikit-learn, like we talked about as well. And that's mainly what I want to talk about in this lecture. Finally, we need to boil that down to a single output neuron with a sigmoid activation function, because we're dealing with a binary classification problem, and that's it. For example, maybe you misinterpreted a tumor that was measured by some, you know, biopsy taken from a breast sample as being malignant, and that false positive of a malignant, cancerous result could lead to real, unnecessary surgery for somebody. But these are very real concerns, and there are a lot of people out there who share my concern. At that point, you might apply a Flatten layer to actually be able to feed that data into a perceptron, and that's where a Dense layer might come into play. See if you can improve upon things. Yeah, I can't understand that. As a result of analyzing that sequence... We also want to display the actual accuracy at each stage, and all this accuracy metric does is say: let's compare the actual maximum argument from each output array, which is gonna correspond to our one-hot encoded value. We thought it was a six. So we've evaluated our neural networks by their ability to accurately classify something, and if we see, like, a 99.9% accuracy value, we congratulate ourselves and pat ourselves on the back, but often that's not enough to think about. And from there, open up your Anaconda prompt.
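The accuracy check described above, comparing the maximum argument of the network's output against the one-hot label, can be sketched like this; the prediction values are illustrative:

```python
# Sketch of an argmax-based accuracy check: the predicted class is the
# index of the largest output, and it's compared to the index of the 1
# in the one-hot known label.
def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)

pred = [0.05, 0.10, 0.85]     # network output for one sample (illustrative)
label = [0, 0, 1]             # one-hot encoded known value
correct = argmax(pred) == argmax(label)
print(correct)  # True
```

Averaging that boolean over every test sample gives the overall accuracy figure the text refers to.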
Like I said, later in the course we'll talk about some better approaches that we can use. Right? Let's actually use a CNN and see if we can do a better job at image classification than we've done before without one.