I have a problem accessing csv file in jupyter notebook with python

I have a problem accessing csv file in jupyter notebook with python - python

Hi I have a beginner problem. So I wanted to access csv file with jupyter notebook and I am using python. I am opening the jupyter notebook on visual studio code. So here is my code
import pandas as pd
df3 = pd.read_csv("D:/medali.csv")
imax = df3["bronze"].idxmax()
df3[imax:imax+1]
The thing is I kept stuck with the error
FileNotFoundError: [Errno 2] No such file or directory: 'D:/medali.csv'
I assume it is due to pathway problem so I've put the .ipynb file with the .csv file in one folder but it does not work. How to solve the error?

The easiest thing to do, assuming you're on windows os, is to go to the file, right click, select "copy as file path", and then put that in the place of "D:/medali.csv". That should fix the issue, but you may find that you also have to set the file path string as a raw string to keep it from being messed up by the \ or / characters that windows uses. To do this, type a single "r" in front of the file path string, without the quotations. Just the character "r".
Another thought to try is that you may need to actually "Open" the file first, and then try to read from it. Given that you're in python, I would recommend the following syntax:
import pandas as pd
with open(r"filepath.csv", "r") as f:
df3 = pd.read_csv("D:/medali.csv")
imax = df3["bronze"].idxmax()
df3[imax:imax+1]
This is best practice because when you open the file with the "with" keyword, it will close after the block under it has executed automatically.

Related

Python: Excel file (xlsx) export with variable as the file path, using pandas

I defined an .xlsx file path as the variable output:
print(output)
r'C:\Users\Kev\Documents\Python code.xlsx'
I want to export a pandas dataframe as an .xlxs file, but need to have the file path as the output variable.
I can get the code to work with the file path. I've tried about a dozen ways (copying and/or piecing code together from documentation, stack overflow, blogs, etc.) and getting a variety of errors. None worked. Here is one that worked with the file path:
df = pd.DataFrame(file_list)
df.to_excel(r'C:\Users\Kev\Documents\Python code.xlsx', index=False)
I would want something like:
df.to_excel(output, index=False)
In any form or package, as long as it produces the same xlsx file and won’t need to be edited to change the file path and name (that would be done where the variable output is defined.
I've attempted several iterations on the XlsxWriter site, the openpyxl site, the pandas site, etc.
(with the appropriate python packages). Working in Jupyter Notebook, Python 3.8.
Any resources, packages, or code that will help me to use a variable in place of a file path for an xlsx export from a pandas dataframe?
Why I want it like this is a long story, but basically I'll have several places at the top of the code where myself and other (inexperienced) coders can quickly put file paths in and search for keywords (rather than hunt through code to find where to replace paths). The data itself is file paths that I'll iteratively search through (this is the beginning of a larger project).

try to put the path this way
output = "C://Users//Kev//Documents//Python code.xlsx"
df.to_excel(output , index=False)
Always worked for me
or you can also do like
output = "C://Users//Kev//Documents//"
df.to_excel(output +"Python code.xlsx" , index=False)

os module would be the most useful here:
from os import path
output = path.abspath("your_excel_file.xlsx")
print(output)
this will return the current working directory path plus the file name you've put into the abspath function as a parameter. Also for those interested about why some people use backslash "\" and not forwardslash "/" while writing file paths here is a good stackoverflow answer for it So what IS the right direction of the path's slash (/ or \) under Windows?

You can use format strings with python3
import pandas as pd
df = pd.DataFrame({"a":"b"}, {"c": "d"})
file_name = "filename.xlsx"
df.to_excel(f"/your/path/to/file/{file_name}", index=False)

Assuming that OP's dataframe is df, that OP is using Windows and wants to store the file in the Desktop, OP's username is cowboykevin05, and the filename that one wants is 0001.xlsx, one can use os.path as follows
from os import path
df.to_excel(path.join('C:\\Users\\cowboykevin05\\Desktop', '0001.xlsx'), index=False)

Why does my code not work? I'm a novice at working with pandas on Python3

I'm using Google colab to write code and been trying to import an excel file (.xlsx) onto it using pandas
import pandas as pd
df = pd.ExcelFile('‪C:/Users/Ankit Gupta/Downloads/DS1.xlsx')
df.head()
Error:
FileNotFoundError Traceback (most recent call last)
<ipython-input-9-a8b5ec91cbb7> in <module>()
1 import pandas as pd
2
----> 3 df = pd.ExcelFile('‪C:/Users/Ankit Gupta/Downloads/DS1.xlsx')
4 df.head()
This keeps showing. My path is correct, and I'm new to working with pandas. Can anyone help?

r'‪C:\\Users\\Ankit Gupta\\Downloads\\DS1.xlsx should've been r'‪C:\Users\Ankit Gupta\Downloads\DS1.xlsx. Your current path has two backslashes (r'\\') instead of one.

This has nothing to do with pandas. You are using google.colab, which is in the cloud and does not see your local files. Use from google.colab import files followed by files.upload().

Use:
df = pd.ExcelFile('‪C:\Users\Ankit Gupta\Downloads\DS1.xlsx')
Basically "\" instead of "/" .

When I started working with pandas, even I used to face this problem. So here are the few check up you should do:
1.) Spelling errors and case sensitive nature
2.) Location. I would suggest as a beginner make a folder in C drive users, wherever your anaconda files are stored. And in that particular folder save all your datasets. I am using this method and till date never found the error again.
3.) While writing name, mention it along with your file format.
Solution to your question:
SOL1:
Possibly the path is different or changed, so Open Anaconda prompt or cmd, then change the path. Lets assume you are in "c" drive and file is in 'd' drive. So open the cmd and write "d:" and hit enter. Then you will se "D:>". Now you should write "cd. After running that, write "jupyter notebook" and run it. It will generate and open a new notebook in the same folder that has your file.
SOL2:
Other Solution is using backslash, two backslashes.The problem is with using backslashes "". You must avoid that. Backslash is reserved for something called escape characters, like new line being denoted with "\n" and stuff. You should use two backslashes "\" or one forward slash "/"/

Use Python to open an Excel workbook

I am learning Python through 'Automate the Boring Stuff With Python' First Edition. In chapter 12, pg 267, we are supposed to open a file called example.xlsx.
The author's code reads:
import openpyxl
wb = openpyxl.load_workbook('example.xlsx')
type(wb)
However, when I try to open this file, I get the following error (this is the last line of the error):
FileNotFoundError: [Errno 2] No such file or directory: 'example.xlsx'
I know this file exists, because I downloaded it myself and am looking at it right now.
I have tried moving it to the current location in which Python 3.8 is, I have tried saving it with my Automate the Boring Stuff files that I've been working on the desktop, and I have tried saving it in every conceivable location on my machine, but I continue getting this same message.
I have imported openpyxl without error, but when I enter the line
wb = openpyxl.load_workbook('example.xlsx')
I have entered the entire pathway for the example.xlsx in the parenthesis, and I continue to get the same error.
What am I doing wrong? How am I supposed to open an Excel workbook?
I still don't understand how I am doing wrong, but this one is incredibly infuriating, and I feel incredibly stupid, because it must be something simple.
Any insight/help is greatly appreciated.

Your error is unambigous — your file in a supposed directory don't exist. Believe me.
For Python is irrelevant, whether you see it. Python itself must see it.
Specify the full path, using forward slashes, for example:
wb = openpyxl.load_workbook('C:/users/John/example.xlsx')
Or find out your real current (working) directory — and not the one supposed by you — with commands
import os
print(os.getcwd())
then move your example.xlsx to it, and then use only the name of your file
wb = openpyxl.load_workbook('example.xlsx')
You may also verify its existence with commands — use copy/paste from your code to avoid typos in the file name / path
import os.path
print(os.path.exists('example.xlsx')) # True, if Python sees it
or
import os.path
print(os.path.exists('C:/users/John/example.xlsx')) # True, if exists
to be sure that I'm right, i.e. that the error is not in the function openpyxl.load_workbook() itself, but in its parameter (the path to the file) provided by you.

I notice that the extension of the example file is not the same as described in the book, is example.csv. I was facing the same frustration as you

Permission denied when pandas dataframe to tempfile csv

I'm trying to store a pandas dataframe to a tempfile in csv format (in windows), but am being hit by:
[Errno 13] Permission denied: 'C:\Users\Username\AppData\Local\Temp\tmpweymbkye'
import tempfile
import pandas
with tempfile.NamedTemporaryFile() as temp:
df.to_csv(temp.name)
Where df is the dataframe. I've also tried changing the temp directory to one I am sure I have write permissions:
tempfile.tempdir='D:/Username/Temp/'
This gives me the same error message
Edit:
The tempfile appears to be locked for editing as when I change the loop to:
with tempfile.NamedTemporaryFile() as temp:
df.to_csv(temp.name + '.csv')
I can write the file in the temp directory, but then it is not automatically deleted at the end of the loop, as it is no longer a temp file.
However, if I change the code to:
with tempfile.NamedTemporaryFile(suffix='.csv') as temp:
training_data.to_csv(temp.name)
I get the same error message as before. The file is not open anywhere else.

I encountered the same error message and the issue was resolved after adding "/df.csv" to file_path.
df.to_csv('C:/Users/../df.csv', index = False)

Check your permissions and, according to this post, you can run your program as an administrator by right click and run as administrator.
We can use the to_csv command to do export a DataFrame in CSV format. Note that the code below will by default save the data into the current working directory. We can save it to a different folder by adding the foldername and a slash to the file
verticalStack.to_csv('foldername/out.csv').
Check out your working directory to make sure the CSV wrote out properly, and that you can open it! If you want, try to bring it back into python to make sure it imports properly.
newOutput = pd.read_csv('out.csv', keep_default_na=False, na_values=[""])
ref
Unlike TemporaryFile(), the user of mkstemp() is responsible for deleting the temporary file when done with it.
With the use of this function may introduce a security hole in your program. By the time you get around to doing anything with the file name it returns, someone else may have beaten you to the punch. mktemp() usage can be replaced easily with NamedTemporaryFile(), passing it the delete=False paramete.
Read more.
After export to CSV you can close your file with temp.close().
with tempfile.NamedTemporaryFile(delete=False) as temp:
df.to_csv(temp.name + '.csv')
temp.close()

Sometimes，you need check the file path that if you have right permission to read and write file. Especially when you use relative path.
xxx.to_csv('%s/file.csv'%(file_path), index = False)

Sometimes, it gives that error simply because there is another file with the same name and it has no permission to delete the earlier file and replace it with the new file.
So either name the file differently while saving it,
or
If you are working on Jupyter Notebook or a other similar environment, delete the file after executing the cell that reads it into memory. So that when you execute the cell which writes it to the machine, there is no other file that exists with that name.

I encountered the same error. I simply had not yet saved my entire python file. Once I saved my python file in VS code as "insertyourfilenamehere".py to documents(which is in my path), I ran my code again and I was able to save my data frame as a csv file.

As per my knowledge, this error pops up when one attempt to save the file that have been saved already and currently open in the background.
You may try closing those files first and then rerun the code.

Just give a valid path and a file name
e.g:
final_df.to_csv('D:\Study\Data Science\data sets\MNIST\sample.csv')

pandas.read_csv FileNotFoundError: File b'\xe2\x80\xaa<etc>' despite correct path

I'm trying to load a .csv file using the pd.read_csv() function when I get an error despite the file path being correct and using raw strings.
import pandas as pd
df = pd.read_csv('‪C:\\Users\\user\\Desktop\\datafile.csv')
df = pd.read_csv(r'‪C:\Users\user\Desktop\datafile.csv')
df = pd.read_csv('C:/Users/user/Desktop/datafile.csv')
all gives the error below:
FileNotFoundError: File b'\xe2\x80\xaaC:/Users/user/Desktop/tutorial.csv' (or the relevant path) does not exist.
Only when i copy the file into the working directory will it load correct.
Is anyone aware of what might be causing the error?
I had previously loaded other datasets with full filepaths without any problems and I'm currently only encountering issues since I've re-installed my python (via Anaconda package installer).
Edit:
I've found the issue that was causing the problem.
When I was copying the filepath over from the file properties window, I unwittingly copied another character that seems invisible.
Assigning that copied string also gives an unicode error.
Deleting that invisible character made any of above code work.

Try this and see if it works. This is independent of the path you provide.
pd.read_csv(r'C:\Users\aiLab\Desktop\example.csv')
Here r is a special character and means raw string. So prefix it to your string literal.
https://www.journaldev.com/23598/python-raw-string:
Python raw string is created by prefixing a string literal with ‘r’
or ‘R’. Python raw string treats backslash () as a literal character.
This is useful when we want to have a string that contains backslash
and don’t want it to be treated as an escape character.

$10 says your file path is correct with respect to the location of the .py file, but incorrect with respect to the location from which you call python
For example, let's say script.py is located in ~/script/, and file.csv is located in ~/. Let's say script.py contains
import pandas
df = pandas.read_csv('../file.csv') # correct path from ~/script/ where script.py resides
If from ~/ you run python script/script.py, you will get the FileNotFound error. However, if from ~/script/ you run python script.py, it will work.

I know following is a silly mistake but it could be the problem with your file.
I've renamed the file manually from adfa123 to abc.csv. The extension of the file was hidden, after renaming, Actual File name became abc.csv.csv. I've then removed the extra .csv from the name and everything was fine.
Hope it could help anyone else.

import pandas as pd
path1 = 'C:\\Users\\Dell\\Desktop\\Data\\Train_SU63ISt.csv'
path2 = 'C:\\Users\\Dell\\Desktop\\Data\\Test_0qrQsBZ.csv'
df1 = pd.read_csv(path1)
df2 = pd.read_csv(path2)
print(df1)
print(df2)

On Windows systems you should try with os.path.normcase.
It normalize the case of a pathname. On Unix and Mac OS X, this returns the path unchanged; on case-insensitive filesystems, it converts the path to lowercase. On Windows, it also converts forward slashes to backward slashes. Raise a TypeError if the type of path is not str or bytes (directly or indirectly through the os.PathLike interface).
import os
import pandas as pd
script_dir = os.getcwd()
file = 'example_file.csv'
data = pd.read_csv(os.path.normcase(os.path.join(script_dir, file)))

If you are using windows machine. Try checking the file extension.
There is a high possibility of file being saved as fileName.csv.txt instead of fileName.csv
You can check this by selecting File name extension checkbox under folder options (Please find screenshot)
below code worked for me:
import pandas as pd
df = pd.read_csv(r"C:\Users\vj_sr\Desktop\VJS\PyLearn\DataFiles\weather_data.csv");
If fileName.csv.txt, rename/correct it to fileName.csv
windows 10 screen shot
Hope it works,
Good Luck

Try using os.path.join to create the filepath:
import os
f_path = os.path.join(*['C:', 'Users', 'user', 'Desktop', 'datafile.csv'])
df = pd.read_csv(f_path)

I was trying to read the csv file from the folder that was in my 'c:\'drive but, it raises the error of escape,type error, unicode......as such but this code works
just take an variable then add r to read it.
rank = pd.read_csv (r'C:\Users\DELL\Desktop\datasets\iris.csv')
df=pd.DataFrame(rank)

There is an another problem on how to delete the characters that seem invisible.
My solution is copying the filepath from the file windows instead of the property windows.
That is no problem except that you should fulfill the filepath.

Experienced the same issue. Path was correct.
Changing the file name seems to solve the problem.
Old file name: Season 2017/2018 Premier League.csv
New file name: test.csv
Possibly the whitespaces or "/"

I had the same problem when running the file with the interactive functionality provided by Visual studio. Switched to running on the native command line and it worked for me.

For my particular issue, the failure to load the file correctly was due to an "invisible" character that was introduced when I copied the filepath from the security tab of the file properties in windows.
This character is e2 80 aa, the UTF-8 encoding of U+202A, the left-to-right embedding symbol. It can be easily removed by erasing (hitting backspace or delete) when you've located the character (leftmost char in the string).
Note: I chose to answer because the answers here do not answer my question and I believe a few folks (as seen in the comments) might meet the same situation as I did. There also seems to be new answers every now and then since I did not mark this question as resolved.

I had similar problem when I was using JupyterLab + Anaconda, and used my browser to type stuff on IDE.
My problem was simpler - there was a very subtle typo error. I didn't have to use raw text - either escaping or using {{r}} string worked for me :). If you use Anaconda with Jupyter Lab, I believe the Kernel starts with where you open the notebook from i.e. the working directory is that top level folder.

data = pd.read_csv('C:\\Users\username\Python\mydata.csv')
This worked for me. Note the double "\\" in "C:\\" where the rest of the folders use only a single "\".

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.