Creating Graphs We Already Know
Video Lecture Transcript

This transcript was automatically generated by Zoom, so there may be discrepancies between the video and the text.

16:16:57 Welcome back. In this video, we learn how we can use Plotly to make some charts that we already know how to make, 16:17:04 for instance, scatter plots. So let's go ahead. 16:17:09 You can notice here that we are now importing plotly.express as px up at the top, because we're going to use it. 16:17:16 So we'll start by learning how we can make scatter plots, which is a figure or chart type that we've talked about a lot. 16:17:23 You can do this with the px.scatter function from Plotly Express. 16:17:29 The documentation is linked here. You can make a scatter plot, first of all, just by providing the x and y elements as tuples. 16:17:39 So, for instance, x equals (1, 2, 3, 4) and y equals (2, -3, 7, 1). 16:17:46 And then I call x equals x and y equals y, and we'll also be including these width and height 16:17:56 arguments for all of our plots. For now, don't worry about those. 16:18:00 We'll talk about those in a later notebook, but they just set the width and the height of our figure. 16:18:04 Okay. And then the last part we have to add, which I forgot, is the px.scatter. 16:18:10 Okay, so this creates the figure object, stores it in fig, 16:18:16 and then it adds the trace of x and y, 16:18:20 so y plotted against x as scatter points. 16:18:23 Okay. Another thing we can do, just like with seaborn or with Bokeh, 16:18:33 is provide a data frame or a data dictionary as the first argument of our function, and then all of the other arguments can be given as columns of that data frame or dictionary. So here, I take that same x and y data, but now I store 16:18:51 it in a data frame called df. And I can actually go ahead and put this in its own code chunk.
16:18:59 You can see. Okay. So we've got our data frame. 16:19:01 And then, once you have a data frame, the first argument can be df, and the rest of the arguments like x and y can be provided as column names. 16:19:11 So same exact plot as before, but now we're using a data frame. 16:19:15 You don't have to make the data frame your first argument. 16:19:18 You can put the arguments in any order as long as you specify that it's the data frame by setting data_frame equal to df, or by default you can just make it the first argument, which is what I'll do 16:19:33 in these functions. You can also notice that these points are pretty small. 16:19:40 I believe their size is close to one pixel, maybe a little bit larger, 16:19:43 whatever the default size is for Plotly. 16:19:47 You can increase the size with the size argument. You'll also notice the size_max 16:19:52 argument, and I'll talk about that in a second. 16:19:55 You can see we can also set the color. We can set the marker symbol with the symbol argument, and, unlike the other packages we've learned up to this point, which use alpha for the opacity, Plotly uses opacity for the opacity. So here 16:20:13 I'm going to make lists of sizes and colors, and we're going to use those. Alternatively, 16:20:20 if my data frame had these as columns, I could use those as well. 16:20:23 But I made them as tuples. Oh, sorry, lists. So size is going to be set equal to sizes. I'm going to set size_max to 40. 16:20:33 We'll talk about that in just a second. I'm going to set my color equal to colors, and with Plotly your aesthetic arguments have to be either a column name or a list that is the same length as the number of observations, so I'm 16:20:48 going to do 1 times 4. Okay? So we can now notice the difference in opacity. 16:20:55 If I set opacity to, say, 0.1, now you can see they're very see-through.
16:21:00 I'll leave it as one for now, and now let's talk about size_max. So size_max will look at the largest size that you have, which for us is 40, and then 40 will get mapped to the maximum size, and the remaining points will get mapped as a 16:21:20 fraction. So 5 is 1/8 as large as 40, so the circle that has the size of 5 will be 1/8 the size of the circle that's the largest one, which is this one. 20 is half of 40, 16:21:35 so the circle with size 20 will be half the size. So we can set size_max to be 40, which will now set the sizes as we see them. 16:21:41 But alternatively, if we set it at 10, 16:21:45 the relative sizes of these will all be the same, but the absolute sizes, the ones that we see, will now get smaller. Okay? 16:21:53 Or we could make them even larger if we set size_max equal to 100. 16:21:57 So size sort of sets the relative size, where the maximum gets mapped to whatever size_max is, 16:22:05 and then the remaining points get mapped according to the fraction of the maximum that they are. 16:22:10 So let's go ahead and go back to 40. And now we know a little bit more about how sizes work in Plotly. 16:22:20 So another thing we can do is provide the color argument, and you might be able to notice here that the color was assigned both blue and red, 16:22:29 but it's also treating it like a categorical variable. 16:22:32 So maybe it's worthwhile to show an example. 16:22:36 So colors equals maybe apple, apple, 16:22:42 grape, grape, and so now it colors it blue and red again, but according to the categorical variable of apple and grape. So color takes in categorical variables. 16:22:56 It can also take in a continuous variable. So here is an example where colors is now a continuous variable, and when it does that it provides a color bar and a color mapping for you. Okay, so you can control which color bar and color
16:23:14 mapping gets used. Well, the color bar that gets used can be provided by the color_continuous_scale argument, using the built-in color scales, or color palettes, from Plotly. 16:23:29 You can see a list of all the available continuous color palettes 16:23:34 by going to this documentation link, and then for the categorical ones 16:23:37 you can go here. So we're going to go ahead and show an example where we change it to a gray color scale, or a grayscale. This is the argument that you'd provide for that, and for any other color scale 16:23:50 you want, the argument can be found at this dictionary, 16:23:53 or this link, not dictionary. So here you can see we've taken this plot with the same exact values for colors, and we've turned it into a grayscale plot where the color bar again is automatically provided on the right-hand side. 16:24:09 So in Matplotlib, seaborn, and Bokeh there was an argument to the scatter plot function that allowed you to change the color of the edges of the markers. That does not exist in Plotly, at least to my knowledge, so in order to change the edge color we have to do something 16:24:26 called update_traces after we call px.scatter. 16:24:30 So what we do to update traces is, well, I'm going to go ahead and just add this stuff. 16:24:38 So what you have to do in order to change the markers: 16:24:43 you have to do markers, and then you set it equal to a dictionary. 16:24:49 And then within that dictionary you put in for the keys the elements that you'd like to change. 16:24:56 So for me, I want to show an example of changing the face color to be gray. 16:25:02 And then I'm going to set the opacity, just to sort of give an example. 16:25:06 So here I'm going to set the opacity to one. 16:25:10 I want to set the width of my edges to 2, so I think it's line width that's going to be 2. 16:25:20 And then finally, I want to set the color of my edge to black.
16:25:24 Oh, I remember. So I have to add in line, and then this is also its own dictionary. 16:25:40 And so color 16:25:44 will be black, and then let's see if this runs without any errors. And then I figured that was in here. 16:25:51 Maybe it's a line. 16:25:53 Let's see what it says. 16:25:57 So it's not color. It's fill color. 16:26:02 Did you 16:26:07 I see, and then it probably still is color. 16:26:10 So let's go back to that. 16:26:37 Okay. So the error was that I had it as markers, but it should be marker. 16:26:42 So if you notice here, you can see I've got "invalid property specified: markers. 16:26:47 Did you mean marker?" So if I change it back to marker, I get what I would like. 16:26:53 So you can put in the marker argument, and you provide the marker argument a dictionary. Within that you can set the color of the marker, 16:27:02 things like the opacity, and then to set the edge color 16:27:05 you need to provide a line argument. So the line is the line around the edges, and within that you have to provide a dictionary as well. 16:27:15 Okay. And then, for instance, we can see another example where, if instead of black I want to set it to red, now you can see I've got red edges. 16:27:28 Okay. So another thing we learned in seaborn was that if you use either lmplot or regplot, you can make a scatter plot 16:27:37 that then has a regression trend line on top. The same exact thing can be done with px.scatter. 16:27:43 All you have to do is set a trendline argument, and if you want regular regression, you do "ols". 16:27:51 So this will do ordinary least squares regression. And then, if I want to set the color of the trend line, you can do trendline_color_override equal to black. 16:28:05 Did I spell it wrong? Probably. 16:28:17 I see. I see what I did now, so we can do 16:28:23 10 times. So it was just my sizes, which was set from above.
16:28:27 It needed to be set for the length of the array. So now we'll see 16:28:34 I've got my scatter points, and I've added an ordinary least squares trend line on top. 16:28:39 There are other options that you can find by going to the px.scatter 16:28:42 documentation provided above. So we talked about this a little bit in the last notebook when we introduced Plotly 16:28:51 Express: Plotly Express has the ability to take in multiple arguments for either the x or the y. 16:28:59 So let's demonstrate. And I want to point out that this is an exclusive or, so x or y, but not both. 16:29:06 Then we'll see what happens when we try and do both. 16:29:09 So here's an example where I've got this data frame 16:29:10 that's got 2 x variables and 2 y variables. 16:29:14 So then, if I call px.scatter and provide a list of column names for y, it's going to plot both of those, and it will color them for you. 16:29:23 So the blue points represent y1, and the red points represent y2. Now I can do this same exact thing where I provide a list of arguments for the x variable. 16:29:36 Now you can see red points represent those with x2, and blue points represent those with x1. 16:29:45 And now we can see what happens if I try and provide 16:29:52 a list of arguments for both x and y. And now you can see that I get an error that it cannot accept a list of column references or a list of columns for both x and y. Okay, so this behavior holds for a number of px plotting 16:30:11 functions, for example px.line, which we saw in the last notebook. px.line is a way that you can use Plotly Express to create a line plot, 16:30:21 and here's a link to the documentation. So in this code chunk, I make some data and then store it in random_walks, 16:30:29 and I'm going to use this data frame to demonstrate the functionality of px.line.
16:30:33 So here we can see how you can... I'm just going to do it. 16:30:37 Remember, this is a data set where I imagine I'm doing a bunch of random walks and then recording them, 16:30:43 perhaps for 2 different study groups, or maybe A is one intervention and B is the control or something. 16:30:49 So here's an example where I can just plot a single line, again by specifying the horizontal and the vertical axes. 16:30:58 So I just call px.line, and I put in the data that I'd like. 16:31:04 There are also a variety of options for line that you can see in the documentation. 16:31:08 So, for instance, we can set the color of the line to be a variable, 16:31:12 the dash pattern of the line to be a variable, 16:31:14 and we can turn markers on or off by setting markers equal to True. 16:31:19 So here's an example where I'm going to set the color to be the walk number, which is stored in the walk column, and then I'm going to set line_dash to be given by the study group. 16:31:33 And so now you can see that, according to this legend, I've got the study group and the walk. 16:31:41 So the study group is given by the solid or dashed line pattern, 16:31:48 whereas the color is given by the walk, so this blue solid line represents walk one from study group A, and so forth. 16:31:58 There are more aesthetic options that you can find by checking out the line documentation that I provided above here. 16:32:06 So before we close out this notebook, I want to give some limitations of Plotly 16:32:11 Express that we haven't encountered up to this point but that you may encounter if you'd like to use Plotly Express. So let's go back to that stocks example that we've used for the past 2 notebooks. 16:32:25 Remember I made this line chart where I've got Microsoft as a red line and Apple as a blue line. 16:32:31 Now a natural thing you might want to do is have one of these be a dotted line in addition to a different color.
16:32:38 Perhaps you know for a fact that your audience includes some people who experience red-blue color blindness, or just color blindness in general, and you want them to be able to distinguish the 2 lines. 16:32:47 So you want to make one of them a dashed line and one of them a solid line. 16:32:51 There doesn't appear to be a way to use this plotting function that we've provided here and make one of them dashed 16:33:00 and one of them plain, just with px.line. So what can we do in this situation, 16:33:04 where we know we want to make one of them a dotted line and one of them a solid line? 16:33:09 Well, one way we can do this is to use update_traces. 16:33:12 We saw this above, I believe. So you call update_traces, 16:33:18 and then the first thing we're going to do is specify which line we want, and the way we can do that is with the name. So when 16:33:26 we use px.line, the Microsoft line's name is MSFT, and Apple's name is AAPL. So I'm going to specify that I want the name of my line to be MSFT. 16:33:42 And now the next thing we want to do is, well, once we've selected that line, we want it to have a dash pattern. 16:33:47 So I'm going to set my dash argument just equal to 16:33:51 "dash". So this will provide "dash" as the string. 16:33:54 So after you create the line plot for your figure, you can go ahead and update the selected trace that you're interested in with the update_traces function. 16:34:04 So now that I've done this, my Microsoft line is both red and dashed, which is a little bit more considerate of those people who might not be able to tell the difference between red and blue. Another way to do this 16:34:18 we saw in the second notebook: you can just add the traces one at a time with add_trace. 16:34:24 So, instead of relying on Plotly Express, which is nice 16:34:29 but, as we've seen, has this limitation, you can get around this limitation just by using 16:34:33 add_trace.
16:34:36 So, instead of reviewing in depth all of the various chart types we've seen throughout this series of videos and lectures, I've provided links to the documentation and the tutorials from Plotly on how to make things like bar charts, pie charts, box and 16:34:53 whisker plots, histograms, and so forth. So if you're interested in any of those, and in making those within Plotly, we have provided a link to the documentation for you. 16:35:02 Go ahead and check it out. I encourage you to try and learn more about Plotly 16:35:06 just by trying it out on your own. So these are links to how to make all the charts that we've already covered in other videos, 16:35:14 but now just with the Plotly package. So we've introduced Plotly Express and discussed how it works, both in taking in data and in setting aesthetic options. In the next notebook, before we move on to other things like adding more interactivity and adjusting 16:35:32 how the figure layout looks, I want to introduce a few plot types that we haven't seen yet that you can make in Plotly.