Reading and Writing to File
Recorded Transcript

This transcript was created with speech-to-text software. Please excuse any typos or mismatches with the video.

Hi, everybody. Welcome back. In this video, we will continue our Python prep video series by talking about reading and writing to file using Python. Let me go ahead and get our Jupyter notebook ready to go.

All right. So in this notebook, we're gonna learn how we can use Python to create, read and write text to a text file on our computer. By the end of this notebook, you should know about file objects, how to write to a file, how to read a file using Python, as well as some preferred file object syntax in Python.

So sometimes you're gonna want to be able to read and write data from and to a file, perhaps a text file or a CSV file or a TSV file, if you've heard of those before. One way you can do this in Python is by using what's known as a file object. If you are interested, here's a link to the Python documentation on file objects. It will tell you all about how to read and write to files, or you can just watch the rest of this video to get the main points.

Before we move forward, I need to remind you that in Jupyter notebooks, the order in which you execute code chunks is very important. So we want to be very mindful of executing the code in the correct order. This can cause an issue when you are writing files in particular, because maybe you write something that needs to go first in one code chunk and something that needs to go after that in a second code chunk, and if you run them in the wrong order, you'll mess up your file. So just be mindful of that if you are reading and writing files in Jupyter notebooks.

So we're first gonna see how we can write a file object.
So the syntax for this is: you call open, which is a built-in function in Python. You put some parentheses, and inside you first put the name of the file that you're interested in reading from or writing to. For us, it's gonna be new underscore file dot txt. Then you put a comma, and after the file name and the comma, you put how you'd like to open your file. For us, we're going to write to this file, so we're gonna use a lowercase w, w for write, and then a plus. Write mode creates the file if it does not already exist on our computer, and the plus additionally opens it for reading as well as writing. So now we have opened up that file object and stored it in the variable file.

And we're going to be done with that. Just to show off what this did, we're gonna close the file. Once you open a file, it stays open on your computer and can be written to; if we had opened it in read mode, it could be read from. But once we're done with our file, whether we're done reading it or done writing it, it's good and proper to close the file, to make sure that, one, if we wrote something, the things we wrote were committed to the actual file on our computer, and two, the file is not corrupted by incorrectly running some code later.

OK. So we've created the file new underscore file dot txt. You should now be able to go to your Python prep folder and see a file, new underscore file dot txt, that was not there before you ran that code chunk for the first time. It may not be at the top. I have my list sorted according to the order in which the files were last updated or saved, but it should be there somewhere. Feel free to scroll through and look for new underscore file dot txt. This is the text file. We can open it in our Jupyter home, and we can see that, for us, it is empty now.
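As a rough sketch of the create-and-close step just described: the temporary directory is an assumption added here so the example does not touch any real folder, and the existence and size checks are illustrative extras that are not part of the video.

```python
import os
import tempfile

# Assumption for this sketch: work in a throwaway directory
# instead of the Python prep folder used in the video.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "new_file.txt")

# "w" opens for writing and creates the file if it is missing;
# the "+" additionally allows reading from the same file object.
file = open(path, "w+")

# Closing commits anything written and releases the file.
file.close()

# Illustrative checks: the file now exists and is empty.
file_exists = os.path.exists(path)
file_size = os.path.getsize(path)
print(file_exists, file_size)
```

Nothing was written before closing, so the new file exists but has no contents yet.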
But it will soon not be empty. So let's go ahead and go back to our Jupyter notebook. I'm gonna close the file as well. So we're gonna now go ahead and see how we can write to it. ...

So we will reopen it. We don't need the plus this time, because we only want to write, so we're gonna open it in write mode. And now, if we want to write something to our file, it has to be a string. To write something to the file, we just do file, which is our variable, dot write, parentheses, and then whatever string you'd like to write to the file inside the parentheses. So for us, we'll write the string "Now my file has a line in it". And then I want to go to a new line at the end, so I put that backslash n. And then I'll write another line, "This is the second line in the file", and then I'm done with it for now, so I'm gonna close it.

So now we can go back, open up new file, and we can see that those two lines that I wrote exist in the file. Now, the important thing is these might not show up right away. You have to close the file object in Python, in your Jupyter notebook, before you can be sure that these lines will show up. They may show up before that, depending upon your system, but to be 100% sure that the things you're trying to write are actually written to the file on your computer, you need to close it first.

OK. So let's go back. There's now going to be an exercise: open the file in write mode like we've been doing, try to write the line "A third line might be fine.", and then close it when you're done. So feel free to pause the video and try that, and then come back and see how I did it.

OK. So we call file equals open, new underscore file dot txt. And we're being told to reopen it in write mode, so we need a w. File dot write, "A third line might be fine", period, backslash n, and then we're done with writing to it.
So, file dot close. OK. So that was the answer, and we're being asked to open the file and see what happens. So let's go back, open up new underscore file dot txt, and you'll notice that, wait a minute, the original lines one and two that we wrote before are now gone. So what happened? What happened was, when you say w, this tells Python to open up the file and eliminate all the text that was in there before. So if we had a bunch of text in our file, opening it with the w is gonna make sure that all that text is erased and we have a blank file.

So if we want to write new text to a file that already has something written in it that we want to keep, we need to open it with an a instead of a w. The a stands for append, because we are appending new text onto our file. So we're gonna call file equals open, new underscore file dot txt, and now we have a instead of w. And we're gonna say file dot write, "A new line, we won't overwrite the old lines this time", and then we close it. And so I asked you to check. We can check our file: close the old version, open up the new version, and we can see here's our new line, and it did not overwrite "A third line might be fine."

OK. So now I want you to code and create a new CSV file called my first data dot CSV. And I want you to write a line of column names, x, y, z, separated by commas. Remember to put the backslash n at the end of it, because we're gonna do more in the next little code chunk as well. OK. So we do file equals open, my first data dot CSV, in w plus mode; the file does not exist yet, so write mode will create it. And we'll do file dot write, x, y, z, backslash n ... and then, well, it says do not close the file yet. So let's go ahead. Now in the next code chunk, we're being asked to do something else. So if you did this first code chunk and it matches mine, very good.
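The difference between write mode and append mode described above can be sketched roughly as follows. The temporary directory and the exact punctuation of the written strings are assumptions made for illustration, and the final read-back uses read mode, which the video covers later.

```python
import os
import tempfile

# Assumption for this sketch: a throwaway directory, not the real prep folder.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "new_file.txt")

# First pass: write two lines, then close to commit them.
file = open(path, "w")
file.write("Now my file has a line in it.\n")
file.write("This is the second line in the file.\n")
file.close()

# Reopening with "w" erases everything that was in the file before.
file = open(path, "w")
file.write("A third line might be fine.\n")
file.close()

# Reopening with "a" keeps the existing text and adds to the end.
file = open(path, "a")
file.write("A new line, we won't overwrite the old lines this time.\n")
file.close()

# Read the file back to see what survived (read mode is covered later).
file = open(path, "r")
contents = file.read()
file.close()
print(contents)
```

Only the third line and the appended line remain, because the second open in write mode discarded the first two lines.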
If you did close the file, make sure to reopen it in append mode for this next code chunk. We're gonna go through and write a loop that writes these lists to the corresponding columns. So all the values in x should be written in the x column, all the values in y in the y column, and all the values in z in the z column. My file is still open, so I don't have to open it again.

So, for i in range of... We want to go through all the values in x. We could loop over each list individually, but that's gonna get complicated trying to do it for the three without some other tricks. If you know those tricks, good for you, do it your way. What I'm gonna do is loop through the range of the length of the list. I don't remember if we went over this, but len gives you the length of whatever object you put inside of it. So for us, it should return four. And then we're gonna do file dot write, and we're gonna do string of x at i, plus a comma (remember, the plus is concatenating strings), plus string of y at i, plus a comma, plus string of z at i, plus a newline character. And then we'll close it at the end.

So what's going on here is we're gonna loop through all the indices for our three lists. The zeroth time through, we're gonna write the zeroth entry of x (remember, it has to be a string so I can write it to file), and then a comma, and then the zeroth item of y, plus a comma, plus the zeroth item of z, plus the newline. And then we do the same when we get to one, two and three.

OK. So now let's go ahead. We have closed the file, and we can check it out: my first data. We've got x, y, z, and remember x was 1, 2, 3, 4, y was 2, 4, 6, 8, and z was 1, 8, 27, 64. OK. So close that, close that, and back to our Jupyter notebook. ... And if you did that and you wanna check your output against the one that I got, there is a file in there called check data file dot CSV, and you can check your CSV against my CSV.

OK. So, reading a file object.
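Before moving on to reading, here is a minimal sketch of the CSV exercise just described. The file name spelling my_first_data.csv (with underscores), the lowercase header x,y,z, and the temporary directory are assumptions, since the transcript does not show the exact text typed in the notebook.

```python
import os
import tempfile

# Assumptions for this sketch: throwaway directory and underscore file name.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "my_first_data.csv")

# The three lists from the exercise.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
z = [1, 8, 27, 64]

# Write the header row first; the file stays open for the loop.
file = open(path, "w+")
file.write("x,y,z\n")

# Loop over the indices 0..3 and write one comma-separated row per index.
# str() is needed because write only accepts strings.
for i in range(len(x)):
    file.write(str(x[i]) + "," + str(y[i]) + "," + str(z[i]) + "\n")

file.close()

# Read the file back to inspect the result.
file = open(path, "r")
csv_text = file.read()
file.close()
print(csv_text)
```

One of the "other tricks" alluded to might be the built-in zip, which would let the loop read `for a, b, c in zip(x, y, z)` instead of indexing; the index-based loop is kept here because it is the approach the video walks through.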
So we're gonna suppose that we now have a file that contains data that we want to read in. This could be a data file of numeric values. This could be a text file of some writing, like a book or some essays that we'd like to read and analyze. Maybe it's a file full of tweets that we'd like to analyze. So now we're gonna see how you can read it using the open function in Python.

To open something in read mode, which means we would not be able to write to it, only read it, we say open, the name of the file, comma, r. The r stands for read. And now once this is open, we can call file dot read, which is going to return all of the text contained in the file. It returns it as a string, and I'm gonna print out that string. And so here we can see the text that was contained in new underscore file dot txt printed out.

OK. Now I want you to go ahead and try to reread that file's contents using dot read. So feel free to pause the video and try, and then come back and see me do it. OK. So let's just try what we did right above: file dot read. And you'll notice something weird is happening. No text is being returned. So why is this happening?

Well, the way that dot read works is that it goes through all the characters in the file, reading them one character at a time. And the current position in the file is kept track of as what's known as the cursor. We can think of the cursor as sort of like our eyeballs or, if you're blind and reading braille, maybe your fingers.
And so what you can imagine is that we read from left to right and then top to bottom, like we would in America. And we're going to say our cursor starts here: at the capital A the position for the cursor would be zero, and then the space would be one, and then this t would be two, and then the h is three, and then the i is four, and so forth, all the way down to the bottom. And so after you've called file dot read once, your cursor is left at the bottom of the text file, just like your eyes would be left at the bottom of a page when you were done reading it.

So if we want to be able to use read again to get the entire contents of the file, we have to move our cursor (think of the file's eyeballs) back from the bottom up to the top. And the way this is done is with the seek function. The seek function takes in a position for your cursor, which is the index in the file where you want to start reading. So we call file and then dot seek, and if we want to go back to the very beginning of the file, we put in a zero. Remember, our cursor is done reading, so it's down here to the right of the exclamation point. Calling dot seek and putting in a zero is gonna put it all the way back here, in front of the first capital A.

OK. So I'll run this, and now I want you to try and use file dot read again and then store it in a variable called file underscore text. So we've got file underscore text is equal to file dot read. And now we can print out file text: print, file text. And now we can see that, boom, that's what we wanted. We can even call it a second time: file text, still there. And this is it as a string. So now we're done with this file, so we're gonna close it ... and now do a little bit more of an exercise. Try and go through these on your own and then come back when you're ready to see me do it. So I'm gonna call file, set it equal to open ...
My first data dot CSV in read mode, and then we're gonna call file dot readlines. And we can open up my first data, so here are the lines: the first line is x, y, z, then 1, 2, 1, then 2, 4, 8, and so on. What readlines does is return a list where every entry of the list is the corresponding line of the file. So the first line here of the file was x, y, z, and then we've got that backslash n telling us to go down to the next line. The second line in the file is 1, 2, 1, and so forth. So that's what readlines does, and I find it very useful when I'm reading a file like this. And I'm done with the file, so I'm gonna close it. ...

So I'm gonna end this notebook by commenting on what, according to the Python documentation, the preferred syntax is for reading and doing something with a data file. We've been making a file variable, storing the file object in that variable, and then closing the file in a later code chunk. This is perfectly acceptable and it works just the same. But the Python documentation suggests the practice of using the with keyword when you deal with a file object.

So with the with keyword, you'll say with, then you open whatever file you're opening, in read or write or append mode or whatever, and then as file. Think of this like saying file equals open, my first data dot CSV. You're then gonna go ahead and do whatever you wanted to do with the file on an indented line. For us, we'll just print file dot readlines as an example of something to do. And then when you're done, you do not have to put in file dot close, because once you leave the indentation for the with, the file is automatically gonna close on its own. ... And we can check this by saying, all right, well, what happens if I type file? And the file object is still stored in that variable.
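The reading steps covered so far (read, seek, readlines, and the with statement) can be sketched together roughly as below. The temporary directory, the underscore file name, and the three-line sample contents are assumptions set up only so the example runs on its own; the closed attribute check is an illustrative extra not shown in the video.

```python
import os
import tempfile

# Assumptions for this sketch: throwaway directory and a small sample file.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "my_first_data.csv")
with open(path, "w") as setup:
    setup.write("x,y,z\n1,2,1\n2,4,8\n")

file = open(path, "r")
first_read = file.read()     # whole file as one string; cursor ends at the bottom
second_read = file.read()    # nothing left to read from the end of the file

file.seek(0)                 # move the cursor back to position 0, the very start
file_text = file.read()      # the full contents again

file.seek(0)
lines = file.readlines()     # list with one entry per line, newline included
file.close()

# Preferred syntax: the file closes on its own when the indented block ends.
with open(path, "r") as file:
    lines_again = file.readlines()

# The variable still refers to the file object, but that object is now closed.
closed_after_with = file.closed
print(lines_again, closed_after_with)
```

Calling close on a file that is already closed does nothing, so an extra file dot close after the with block is harmless.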
So according to the Python documentation, you don't need a dot close statement with the with syntax, but we'll do a dot close just to be safe. OK. All right. So that's it for this. You know about file objects, you know how to read a file object, you know how to write to a file object, and you know the preferred syntax with the with statement. You now have a strong foundation. Writing and reading files is very important in data science, because that's how data is stored. You're not always going to use this as a way to read data, but you may use it as a way to store your data, particularly if you're scraping it off of a webpage.

All right. So that's gonna be it for this video. I hope to see you in the next video, where we learn more about Python. And I hope you have a great rest of your day. It's been awesome having you here. Bye.