Python - How to write a windows path to a json file? - python

I am working together with a colleague and he has Ubuntu while I have windows. We have a dataset of json files which have in them a "path" written. His paths look like this:
'C:/Users/krock/Desktop/FIIT/BP/Ubuntu/luadb/etc/luarocks_test/modules/30log/share/lua/5.3/30log.lua'
But this doesn't work on Windows, I was trying to do
some_string.replace('/', '\\')
But this results in strings written in json that look like this:
'C:\\Users\\krock\\Desktop\\FIIT\\BP\\Ubuntu\\luadb\\etc\\luarocks_test\\data_all'
On my windows machine, I can't read (the program) these paths as it give an error:
No such file or directory
Is there a solution to this?
EDIT: I tried using Path from pathlib, but I got another error saying:
TypeError: Object of type WindowsPath is not JSON serializable
I found the solution to this is to do str(Path(path_string)), but the result is again the path in double quotes.

Yes, the solution is to use Python's built in pathlib. Also, using string literals might help the clarity of your program.
https://docs.python.org/3/library/pathlib.html

This question is missing code samples, so can't be more specific, but generally speaking, doing this manually is error-prone. Consider using a library, such as pathlib. E.G:
>>> from pathlib import Path
>>> Path('luarocks_test/modules/30log/share/lua/5.3/30log.lua')
PosixPath('luarocks_test/modules/30log/share/lua/5.3/30log.lua')
On Windows, instantiating a Path would give you a WindowsPath. You'll also want to use relative, rather than absolute references, as the paths will be different on your workstations.

Related

How to deal with multiple dots in a file name with Python pathlib?

I am having an issue with pathlib when I try to construct a file path that has a "." in its name, the pathlib module ignores it.
Here are example lines (I tried multiple versions, all resulted the same issue)
The issue is that the original file name will be coming from another application, so it is not like I can edit the name myself. I also do not want to do string replacement work arounds, if possible.
path=r"c:\temp"
1
p=Path(path).joinpath("myfile.001").with_suffix(".bat")
2
p=Path(path, "myfile.001").with_suffix(".bat")
3
p=Path(path).with_name("myfile.001").with_suffix(".bat")
All these lines will yield to
WindowsPath('C:/temp/myfile.bat')
So how do I make pathlib.Path to construct this full path properly. The final path has to be
WindowsPath('C:/temp/myfile.001.bat')
Not
WindowsPath('C:/temp/myfile.bat')
Naturally I am looking for a way to do it through pathlib itself, otherwise I can just use os.
thanks
You are telling pathlib to replace the suffix .001 with the suffix .bat. pathlib complies.
Tell pathlib to add .bat to the existing suffix.
p = Path(path, 'myfile.001')
p = p.with_suffix(p.suffix+'.001')

pandas.read_csv FileNotFoundError: File b'\xe2\x80\xaa<etc>' despite correct path

I'm trying to load a .csv file using the pd.read_csv() function when I get an error despite the file path being correct and using raw strings.
import pandas as pd
df = pd.read_csv('‪C:\\Users\\user\\Desktop\\datafile.csv')
df = pd.read_csv(r'‪C:\Users\user\Desktop\datafile.csv')
df = pd.read_csv('C:/Users/user/Desktop/datafile.csv')
all gives the error below:
FileNotFoundError: File b'\xe2\x80\xaaC:/Users/user/Desktop/tutorial.csv' (or the relevant path) does not exist.
Only when i copy the file into the working directory will it load correct.
Is anyone aware of what might be causing the error?
I had previously loaded other datasets with full filepaths without any problems and I'm currently only encountering issues since I've re-installed my python (via Anaconda package installer).
Edit:
I've found the issue that was causing the problem.
When I was copying the filepath over from the file properties window, I unwittingly copied another character that seems invisible.
Assigning that copied string also gives an unicode error.
Deleting that invisible character made any of above code work.
Try this and see if it works. This is independent of the path you provide.
pd.read_csv(r'C:\Users\aiLab\Desktop\example.csv')
Here r is a special character and means raw string. So prefix it to your string literal.
https://www.journaldev.com/23598/python-raw-string:
Python raw string is created by prefixing a string literal with ‘r’
or ‘R’. Python raw string treats backslash () as a literal character.
This is useful when we want to have a string that contains backslash
and don’t want it to be treated as an escape character.
$10 says your file path is correct with respect to the location of the .py file, but incorrect with respect to the location from which you call python
For example, let's say script.py is located in ~/script/, and file.csv is located in ~/. Let's say script.py contains
import pandas
df = pandas.read_csv('../file.csv') # correct path from ~/script/ where script.py resides
If from ~/ you run python script/script.py, you will get the FileNotFound error. However, if from ~/script/ you run python script.py, it will work.
I know following is a silly mistake but it could be the problem with your file.
I've renamed the file manually from adfa123 to abc.csv. The extension of the file was hidden, after renaming, Actual File name became abc.csv.csv. I've then removed the extra .csv from the name and everything was fine.
Hope it could help anyone else.
import pandas as pd
path1 = 'C:\\Users\\Dell\\Desktop\\Data\\Train_SU63ISt.csv'
path2 = 'C:\\Users\\Dell\\Desktop\\Data\\Test_0qrQsBZ.csv'
df1 = pd.read_csv(path1)
df2 = pd.read_csv(path2)
print(df1)
print(df2)
On Windows systems you should try with os.path.normcase.
It normalize the case of a pathname. On Unix and Mac OS X, this returns the path unchanged; on case-insensitive filesystems, it converts the path to lowercase. On Windows, it also converts forward slashes to backward slashes. Raise a TypeError if the type of path is not str or bytes (directly or indirectly through the os.PathLike interface).
import os
import pandas as pd
script_dir = os.getcwd()
file = 'example_file.csv'
data = pd.read_csv(os.path.normcase(os.path.join(script_dir, file)))
If you are using windows machine. Try checking the file extension.
There is a high possibility of file being saved as fileName.csv.txt instead of fileName.csv
You can check this by selecting File name extension checkbox under folder options (Please find screenshot)
below code worked for me:
import pandas as pd
df = pd.read_csv(r"C:\Users\vj_sr\Desktop\VJS\PyLearn\DataFiles\weather_data.csv");
If fileName.csv.txt, rename/correct it to fileName.csv
windows 10 screen shot
Hope it works,
Good Luck
Try using os.path.join to create the filepath:
import os
f_path = os.path.join(*['C:', 'Users', 'user', 'Desktop', 'datafile.csv'])
df = pd.read_csv(f_path)
I was trying to read the csv file from the folder that was in my 'c:\'drive but, it raises the error of escape,type error, unicode......as such but this code works
just take an variable then add r to read it.
rank = pd.read_csv (r'C:\Users\DELL\Desktop\datasets\iris.csv')
df=pd.DataFrame(rank)
There is an another problem on how to delete the characters that seem invisible.
My solution is copying the filepath from the file windows instead of the property windows.
That is no problem except that you should fulfill the filepath.
Experienced the same issue. Path was correct.
Changing the file name seems to solve the problem.
Old file name: Season 2017/2018 Premier League.csv
New file name: test.csv
Possibly the whitespaces or "/"
I had the same problem when running the file with the interactive functionality provided by Visual studio. Switched to running on the native command line and it worked for me.
For my particular issue, the failure to load the file correctly was due to an "invisible" character that was introduced when I copied the filepath from the security tab of the file properties in windows.
This character is e2 80 aa, the UTF-8 encoding of U+202A, the left-to-right embedding symbol. It can be easily removed by erasing (hitting backspace or delete) when you've located the character (leftmost char in the string).
Note: I chose to answer because the answers here do not answer my question and I believe a few folks (as seen in the comments) might meet the same situation as I did. There also seems to be new answers every now and then since I did not mark this question as resolved.
I had similar problem when I was using JupyterLab + Anaconda, and used my browser to type stuff on IDE.
My problem was simpler - there was a very subtle typo error. I didn't have to use raw text - either escaping or using {{r}} string worked for me :). If you use Anaconda with Jupyter Lab, I believe the Kernel starts with where you open the notebook from i.e. the working directory is that top level folder.
data = pd.read_csv('C:\\Users\username\Python\mydata.csv')
This worked for me. Note the double "\\" in "C:\\" where the rest of the folders use only a single "\".

normalize non-existing path using pathlib only

python has recently added the pathlib module (which i like a lot!).
there is just one thing i'm struggling with: is it possible to normalize a path to a file or directory that does not exist? i can do that perfectly well with os.path.normpath. but wouldn't it be absurd to have to use something other than the library that should take care of path related stuff?
the functionality i would like to have is this:
from os.path import normpath
from pathlib import Path
pth = Path('/tmp/some_directory/../i_do_not_exist.txt')
pth = Path(normpath(str(pth)))
# -> /tmp/i_do_not_exist.txt
but without having to resort to os.path and without having to type-cast to str and back to Path. also pth.resolve() does not work for non-existing files.
is there a simple way to do that with just pathlib?
is it possible to normalize a path to a file or directory that does not exist?
Starting from 3.6, it's the default behavior. See https://docs.python.org/3.6/library/pathlib.html#pathlib.Path.resolve
Path.resolve(strict=False)
...
If strict is False, the path is resolved as far as possible and any remainder is appended without checking whether it exists
As of Python 3.5: No, there's not.
PEP 0428 states:
Path resolution
The resolve() method makes a path absolute, resolving
any symlink on the way (like the POSIX realpath() call). It is the
only operation which will remove " .. " path components. On Windows,
this method will also take care to return the canonical path (with the
right casing).
Since resolve() is the only operation to remove the ".." components, and it fails when the file doesn't exist, there won't be a simple means using just pathlib.
Also, the pathlib documentation gives a hint as to why:
Spurious slashes and single dots are collapsed, but double dots ('..')
are not, since this would change the meaning of a path in the face of
symbolic links:
PurePath('foo//bar') produces PurePosixPath('foo/bar')
PurePath('foo/./bar') produces PurePosixPath('foo/bar')
PurePath('foo/../bar') produces PurePosixPath('foo/../bar')
(a naïve approach would make PurePosixPath('foo/../bar') equivalent to PurePosixPath('bar'), which is wrong if foo is a symbolic link to another directory)
All that said, you could create a 0 byte file at the location of your path, and then it'd be possible to resolve the path (thus eliminating the ..). I'm not sure that's any simpler than your normpath approach, though.
If this fits you usecase(e.g. ifle's directory already exists) you might try to resolve path's parent and then re-append file name, e.g.:
from pathlib import Path
p = Path()/'hello.there'
print(p.parent.resolve()/p.name)
Old question, but here is another solution in particular if you want POSIX paths across the board (like nix paths on Windows too).
I found pathlib resolve() to be broken as of Python 3.10, and this method is not currently exposed by PurePosixPath.
What I found worked was to use posixpath.normpath(). Also found PurePosixPath.joinpath() to be broken. I.E. It will not join ".." with "myfile.txt" as expected. It will return just "myfile.txt". But posixpath.join() works perfectly; will return "../myfile.txt".
Note this is in path strings, but easily back to pathlib.Path(my_posix_path) et al for an OOP container.
And easily transpose to Windows platform paths too by just constructing this way, as the module takes care of the platform independence for you.
Might be the solution for others with Python file path woes..

Convert a str to path type?

I am trying to interface with some existing code that saves a configuration, and expects a file path that is of type path.path. The code is expecting that the file path is returned from a pygtk browser window (via another function). I want to call the save_config function elsewhere in my code with a file path based on different inputs, constructed from string elements.
When I try to run the code, I am able to construct the file path correctly, but it is a string type, and the save function expects a path.path type.
Is there a way to convert a string to a path type? I've tried searching, but could only find the reverse case (path to string). I also tried using os.path.join(), but that returns a string as well.
Edit: This is python 2.7, if that makes a difference.
Since python 3.4:
from pathlib import Path
str_path = "my_path"
path = Path(str_path)
https://docs.python.org/3/library/pathlib.html#module-pathlib
Maybe that answer worked for python 2.7, if you are on Python 3 I like:
import os
p = "my/path/to/file.py"
os.path.normpath(p)
'my\\path\\to\\file.py'
If path.path represents a type, you can probably create an instance of that type with something like:
string_path = "/path/to/some/file"
the_path = path.path(string_path)
save_config(the_path())

Check if GZIP file exists in Python

I would like to check for the existence of a .gz file on my linux machine while running Python. If I do this for a text file, the following code works:
import os.path
os.path.isfile('bob.asc')
However, if bob.asc is in gzip format (bob.asc.gz), Python does not recognize it. Ideally, I would like to use os.path.isfile or something very concise (without writing new functions). Is it possible to make the file recognizable either in Python or by changing something in my system configuration?
Unfortunately I can't change the data format or the file names as they are being given to me in batch by a corporation.
After fooling around for a bit, the most concise way I could get the job done was
subprocess.call(['ls','bob.asc.gz']) == 0
which returns True if the file exists in the directory. This is the behavior I would expect from
os.path.isfile('bob.asc.gz')
but for some reason Python won't accept files with extension .gz as files when passed to os.path.isfile.
I don't feel like my solution is very elegant, but it is concise. If someone has a more elegant solution, I'd love to see it. Thanks.
Of course it doesn't; they are completely different files. You need to test it separately:
os.path.isfile('bob.asc.gz')
This would return True if that exact file was present in the current working directory.
Although a workaround could be:
from os import listdir, getcwd
from os.path import splitext, basename
any(splitext(basename(f))[0] == 'bob.asc' for f in listdir(getcwd()))
You need to test each file. For example :
if any(map(os.path.isfile, ['bob.asc', 'bob.asc.gz'])):
print 'yay'

Categories

Resources