dots and whitespaces in filepath location, importing file in pandas

I have a file on a remote Windows server that I'm trying to read into a pandas DataFrame. The file path contains whitespace and dots. The following is what I'm trying, but it's not working for me.
file_location = '\\servername\foldername\folder name\5. folder_name\foldername - 0331v7\filename.txt'
df = pd.read_csv(file_location, sep='|')
I'm getting a "no such file exists" error. I tried prefixing the file_location string with r, and that's not working either. I would appreciate it if someone could help me with this.

Backslashes tell the parser that the next character should be interpreted as part of an escape sequence and not as a regular character.
For example if you want to print " you have to use the escape sequence \":
print("This is a quotation mark: \".")
Output:
This is a quotation mark: ".
If you use a single backslash, it is read together with the next character as an escape sequence whenever the two form one (for example, the \f in \foldername is a form feed), and that makes the path invalid. To get around this you can either use the escape sequence for a backslash, which is \\, or in most cases use a forward slash, as most libraries handle it as a path separator on Windows.
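As a rough illustration of the point above, here is a sketch that compares the literal from the question with its raw and escaped spellings. It only inspects the strings and never touches the file system, so the server and folder names are just the placeholders from the question.

```python
# The literal as written in the question: every \f is a form feed and
# \5 is an octal escape, so most of the intended backslashes are gone.
broken = '\\servername\foldername\folder name\5. folder_name\foldername - 0331v7\filename.txt'

# Raw string: backslashes are kept exactly as typed.
raw = r'\\servername\foldername\folder name\5. folder_name\foldername - 0331v7\filename.txt'

# Regular string with every backslash doubled.
escaped = '\\\\servername\\foldername\\folder name\\5. folder_name\\foldername - 0331v7\\filename.txt'

print(repr(broken))                 # shows the control characters that crept in
print(raw == escaped)               # True
print(broken.count('\\'), raw.count('\\'))  # 1 7
```

If the raw string still fails with a "no such file" error, the literal is probably no longer the culprit, and it may be worth checking that the share is reachable and that the variable passed to read_csv is the one that was assigned.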

Related

I have a file location stored in a reach file. Like \reach. It thinks it is a carriage return

My file location is detecting the \r in \reach as a carriage return.
There is nothing I could find online about the topic. I need it to list the file location as only a string.

Declare your string as a raw string by prefixing it with r. In a raw string, backslashes are kept as literal characters instead of starting escape sequences.
location = r'\reach'
Alternatively you could use double backslashes like so
location = '\\reach'
A third way would be to just use forward slashes instead
location = '/reach'
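A small sketch of what the first two options change, assuming nothing beyond the \reach example from the question:

```python
plain = '\reach'     # \r is read as a carriage return
raw = r'\reach'      # backslash kept literally
doubled = '\\reach'  # escaped backslash, same result as the raw string

print(len(plain), len(raw))   # 5 6
print(plain[0] == '\r')       # True
print(raw == doubled)         # True
```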

You need to escape the backslash with another backslash in the string: \\reach

It looks like you might be using Windows as your OS, which uses '\' as the path separator.
You could try defining your file path as a raw string, e.g.:
f = open(r'dir1\dir2\reach')

Trouble loading csv file into Jupyter Notebooks [duplicate]

I'm trying to read a CSV file into Python (Spyder), but I keep getting an error. My code:
import csv
data = open("C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
data = csv.reader(data)
print(data)
I get the following error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes
in position 2-3: truncated \UXXXXXXXX escape
I have tried replacing the \ with \\ or with /, and I've tried putting an r before "C.., but none of these worked.

This error occurs because you are using a normal string as a path. You can use one of the three following solutions to fix your problem:
1: Just put r before your normal string. It turns the literal into a raw string:
pandas.read_csv(r"C:\Users\DeePak\Desktop\myac.csv")
2: Use forward slashes:
pandas.read_csv("C:/Users/DeePak/Desktop/myac.csv")
3: Double every backslash:
pandas.read_csv("C:\\Users\\DeePak\\Desktop\\myac.csv")

The first backslash in your string is being interpreted as a special character. In fact, because it's followed by a "U", it's being interpreted as the start of a Unicode code point.
To fix this, you need to escape the backslashes in the string. The direct way to do this is by doubling the backslashes:
data = open("C:\\Users\\miche\\Documents\\school\\jaar2\\MIK\\2.6\\vektis_agb_zorgverlener")
If you don't want to escape backslashes in a string, and you don't have any need for escape codes or quotation marks in the string, you can instead use a "raw" string, using "r" just before it, like so:
data = open(r"C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
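To see the error in isolation, the sketch below feeds a one-line program to the built-in compile() and catches the SyntaxError. The shortened path C:\Users\miche is only a stand-in for the full path in the question.

```python
# Source text of a tiny program. It is written as a raw string here so the
# backslashes reach compile() unchanged.
source = r'path = "C:\Users\miche"'

try:
    compile(source, "<example>", "exec")
    message = None
except SyntaxError as exc:
    # \U must be followed by eight hex digits, so the literal is rejected.
    message = str(exc)

print(message)

# The same program with an r prefix on the literal compiles and runs.
fixed_source = r'path = r"C:\Users\miche"'
namespace = {}
exec(compile(fixed_source, "<example>", "exec"), namespace)
print(namespace["path"])  # C:\Users\miche
```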

You can just put r in front of the string with your actual path, which denotes a raw string. For example:
data = open(r"C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")

Consider it as a raw string. Just as a simple answer, add r before your Windows path.
import csv
data = open(r"C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
data = csv.reader(data)
print(data)

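Try writing the file path as "C:\\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener" i.e. with a double backslash after the drive, as opposed to "C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener". Note that this only removes the \U error: the \2 and \v further along the path are still read as escape sequences (an octal escape and a vertical tab), so doubling every backslash or using a raw string is the safer fix.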

Add r before your string. It converts a normal string to a raw string.

As per String literals:
String literals can be enclosed within single quotes (i.e. '...') or double quotes (i.e. "..."). They can also be enclosed in matching groups of three single or double quotes (these are generally referred to as triple-quoted strings).
The backslash character (i.e. \) is used to escape characters which otherwise will have a special meaning, such as newline, backslash itself, or the quote character. String literals may optionally be prefixed with a letter r or R. Such strings are called raw strings and use different rules for backslash escape sequences.
In triple-quoted strings, unescaped newlines and quotes are allowed, except that the three unescaped quotes in a row terminate the string.
Unless an r or R prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C.
So ideally you need to replace the line:
data = open("C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener")
with any one of the following:
Using raw prefix and single quotes (i.e. '...'):
data = open(r'C:\Users\miche\Documents\school\jaar2\MIK\2.6\vektis_agb_zorgverlener')
Using double quotes (i.e. "...") and escaping backslash character (i.e. \):
data = open("C:\\Users\\miche\\Documents\\school\\jaar2\\MIK\\2.6\\vektis_agb_zorgverlener")
Using double quotes (i.e. "...") and forwardslash character (i.e. /):
data = open("C:/Users/miche/Documents/school/jaar2/MIK/2.6/vektis_agb_zorgverlener")

Just putting an r in front works well.
eg:
white = pd.read_csv(r"C:\Users\hydro\a.csv")

It worked for me after escaping the '\' with a second one: f = open('F:\\file.csv')

The double \ should work for Windows, but you still need to take care of the folders you mention in your path. All of them (except the filename) must exist. Otherwise you will get an error.
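When the path is spelled correctly but opening it still fails, it can help to find out which folder is missing. The helper below is a hypothetical convenience function (first_missing_part is not part of any library), sketched with pathlib:

```python
from pathlib import Path


def first_missing_part(path):
    """Return the shallowest component of `path` that does not exist, or None."""
    p = Path(path)
    # Path.parents lists the deepest parent first, so reverse it to walk
    # from the root down towards the file itself.
    for candidate in [*reversed(p.parents), p]:
        if not candidate.exists():
            return candidate
    return None


# Usage: prints the first folder (or the file) that is missing, or None.
print(first_missing_part(r"C:\Users\hydro\a.csv"))
```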

Pywin32 working with excel application, how to save as the file in other location than default? [duplicate]

I am getting filename from an api in this format containing mix of / and \.
infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
When I try to parse the directory structure, \ followed by a character is converted into single character.
Is there a way around to get each component correctly?
What I already tried:
path.normpath didn't help.
infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
os.path.normpath(infilename)
out:
'c:\\mydir1\\mydir2\\mydir3\\mydir4Sxyz.csv'

Use r before the string to have it treated as a raw string (i.e. backslashes are not processed as escape sequences).
e.g.
infilename = r'C:/blah/blah/blah.csv'
More details here:
https://docs.python.org/3.6/reference/lexical_analysis.html#string-and-bytes-literals

It's easy to miss in your example, but writing this:
infilename = 'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
isn't a good idea because some lowercase (and a few uppercase) letters, as well as digits, are interpreted as escape sequences when they follow a backslash. Notorious examples are \t and \b; there are others. For instance:
infilename = 'c:/mydir1/mydir2\thedir3\bigdir4\123xyz.csv'
doubly fails because \t and \b are interpreted as "tab" and "backspace".
When dealing with literal Windows-style paths (or regexes), you have to use the raw prefix and, better, normalize your path to get rid of the forward slashes.
infilename = os.path.normpath(r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv')
However, the raw prefix only applies to literals. If the returned string appears, when printing repr(string), as 'the\terrible\\dir', then tab characters have already been put in the string, and there's nothing you can do except some lousy post-processing.
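For the literal case, one way to get each component is pathlib.PureWindowsPath, which accepts both separators and works on any operating system. This is a sketch that assumes the string really contains backslashes (here guaranteed by the r prefix); it does not repair a string whose escapes were already interpreted.

```python
from pathlib import PureWindowsPath

# Without the r prefix, \123 is an octal escape for the letter S.
print('\123' == 'S')  # True

infilename = r'c:/mydir1/mydir2\mydir3\mydir4\123xyz.csv'
parts = PureWindowsPath(infilename).parts

print(parts)
# ('c:\\', 'mydir1', 'mydir2', 'mydir3', 'mydir4', '123xyz.csv')
```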

Instead of splitting on \ try splitting on \\. In a string literal the backslash itself has to be escaped, so a single \ character is written as \\.

Numbers Value Changes When Directly After the Backslash os.path.normpath(string)

I am using os.path.normpath and the values change when there are numbers directly after a backslash.
fileString = os.path.normpath("server:\Projects\05 Project Name\Data\20151021\Master.xlsx")
print fileString
Returns: server:\Projects\Project\Data�51021\MASTER_LIST.XLSX
Notice the '\05' disappeared and the '\20' turned into �.
Why is this happening and how can I fix it?

The easiest way to solve this is to use a raw string literal:
fileString = os.path.normpath(r"server:\Projects\05 Project Name\Data\20151021\Master.xlsx")
#                             ^
The backslash character denotes an escape sequence in regular strings.
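The specific escapes at work in the question are octal ones: a backslash followed by up to three octal digits is replaced by the character with that code. A short sketch using fragments of the path from the question:

```python
bad = "Data\20151021"    # \201 is consumed as one octal escape (0x81)
good = r"Data\20151021"  # raw string keeps the backslash and all digits

print(repr(bad))             # 'Data\x8151021'
print(len(bad), len(good))   # 10 13
print("\05" == "\x05")       # True: \05 became a single control character
```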
The other way around this is to either use forward slashes as path delimiters, or double backslashes:
"server:/Projects/05 Project Name/Data/20151021/Master.xlsx"
or
"server:\\Projects\\05 Project Name\\Data\\20151021\\Master.xlsx"

String literals for file names

I am new to Python - but not to programming, and on a bit of a steep learning curve.
I have a programme that reads several input files - the first input file contains (amongst other things) the path and name of the other files.
I can open the file and read the name OK. If I print the string it looks like this
'Z:\\python\\rb_data.dat\n'
All my "\" become "\\". I think I can fix this by using the "r" prefix to convert it to a literal.
My question is how do I attach the prefix to a string variable?
This is what I want to do :
modat = open('z:\\python\mot1 input.txt') # first input file containing names of other file
rbfile = modat.readline() # get new file name
rbdat = open(rbfile) # open new file

The \\ is an escape sequence for the backslash character \. When you write a string literal, it is enclosed in either ' or ". Because there are some characters you might need in the string that you cannot type directly (for example the quotation marks themselves), escape sequences let you specify them. They usually look like \x, where x is something you want to enter. Now, because all escape sequences start with a backslash, the backslash itself also becomes a special character that you cannot write directly in a string literal. So you need to escape it too.
That means that the string literal '\\' actually represents a string with a single character: the backslash. Raw strings, which are string literals with an r in front of the opening quotation mark, ignore (most) escape sequences. So r'\\x' is actually the string where two backslashes are followed by an x, identical to the string described by the non-raw string literal '\\\\x'.
All this only applies to string literals though. The string itself holds no information about whether it was created with a raw string literal or not, or whether an escape sequence was needed or not. It just contains all the characters that make up the string.
That also means that as soon as you get a string from somewhere, for example by reading it from a file, then you don’t need to worry about escaping something in there to make sure that it’s a correct string. It just is.
So in your code, when you open the file at z:\python\mot1 input.txt, you need to specify that filename as a string first. So you have to use a string literal, either with escaping the backslashes, or by using a raw string.
Then, when you read the new filename from that file, you already have a real string, and don’t need to bother with anything more. Assuming that it was correctly written to the file, you can just use it like that.

The backslash \ in Python strings (and in code blocks on StackOverflow!) means, effectively, "treat the next character differently". As it is reserved for this purpose, when you actually have a backslash in your strings, it must be "escaped" by a preceding backslash:
>>> myString = "\\" # the first one "escapes" the second
>>> myString = "\" # no escape, so...
SyntaxError: EOL while scanning string literal
>>> print("\\") # when we actually print out the string
\
The short story is, you can basically ignore this in your strings. If you pass rbfile to open, Python will interpret it correctly.
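A sketch of that round trip, using a temporary folder instead of the Z: drive from the question (the file names are borrowed from it, the contents are made up). The one thing the string read from the file does need is to have its trailing newline removed, which is the \n visible in the printed value.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Second input file, whose name will be stored inside the first one.
    data_path = os.path.join(folder, "rb_data.dat")
    with open(data_path, "w") as f:
        f.write("payload")

    # First input file: one line holding the path of the second file.
    list_path = os.path.join(folder, "mot1 input.txt")
    with open(list_path, "w") as f:
        f.write(data_path + "\n")

    with open(list_path) as modat:
        # No r prefix or escaping needed, only the newline has to go.
        rbfile = modat.readline().strip()

    with open(rbfile) as rbdat:
        contents = rbdat.read()

print(contents)  # payload
```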

Why not use os.path.normcase, like this:
import os

with open(r'z:\python\mot1 input.txt') as f:
    for line in f:
        name = line.strip()  # drop the trailing newline
        if name and os.path.isfile(os.path.normcase(name)):
            with open(name) as f2:
                pass  # do something with f2
From the documentation of os.path.normcase:
Normalize the case of a pathname. On Unix and Mac OS X, this returns
the path unchanged; on case-insensitive filesystems, it converts the
path to lowercase. On Windows, it also converts forward slashes to
backward slashes.
