python csv files read and upload

I want to upload all the CSV files in a directory that meet a certain condition to a database, but I get an error at the very start of my code.
mypath = "D:\user\01367564\Project Coordinator\Database Trying\all data csv"
csv_name_reg = r'^[0-9]{11}_HKG_[0-9]{14}_v2-0.csv$'
The error is below
File "D:\user\01367564\Project Coordinator\Database Trying\Upload_CA_Manifest.py", line 9
mypath = "D:\user\01367564\Project Coordinator\Database Trying\all data csv"
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \uXXXX escape
Can you help me? Thank you.

Python reads the \u in "D:\user" as the start of a Unicode escape (\uXXXX), which is why it complains about a truncated escape. Note that on Windows you have three options for paths:
Raw strings
mypath = r"D:\user\01367564\Project Coordinator\Database Trying\all data csv"
Escaped backslashes
mypath = "D:\\user\\01367564\\Project Coordinator\\Database Trying\\all data csv"
Forward slashes
mypath = "D:/user/01367564/Project Coordinator/Database Trying/all data csv"
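As a quick check, the three spellings above can be compared directly. This is only a minimal sketch: it uses pathlib.PureWindowsPath so that it behaves the same on any operating system, and the variable names are illustrative.

```python
from pathlib import PureWindowsPath

raw = r"D:\user\01367564\Project Coordinator\Database Trying\all data csv"
escaped = "D:\\user\\01367564\\Project Coordinator\\Database Trying\\all data csv"
forward = "D:/user/01367564/Project Coordinator/Database Trying/all data csv"

# The raw and the escaped literal produce the identical string.
same_text = raw == escaped

# The forward-slash version is a different string, but Windows path
# handling treats it as the same path.
same_path = PureWindowsPath(forward) == PureWindowsPath(raw)

print(same_text, same_path)
```

Which spelling to use is mostly a matter of taste; raw strings have the advantage that a path can be pasted in unchanged.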

In Python, a backslash inside a string literal, followed by one or more characters, forms an escape sequence.
Some notable ones are "\n" and "\t", which are newline and tab. An unrecognized escape sequence is left in the string unchanged, backslash included (recent Python versions warn about it). "\\" turns into a single "\" in the final string, as you can see when you print it.
The escape Python thinks you're using is the Unicode escape "\uXXXX", which must be followed by exactly four hexadecimal digits; "\user" is not, hence the error. To fix this, replace each backslash with a double backslash, "\\". So this string will work: "D:\\user\\01367564\\Project Coordinator\\Database Trying\\all data csv"
For a full list of Python backslash escapes, look at the Python docs.
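To illustrate, here is a small sketch of how a few escapes collapse into single characters. It also shows two escapes that would have silently altered the path in the question even without the \u problem: "\013" (an octal escape) and "\a" (the bell character). The variable names are made up for the example.

```python
# Each recognized escape collapses to a single character.
newline = "\n"
tab = "\t"
backslash = "\\"

# "\013" is an octal escape (vertical tab) and "\a" is the bell character,
# so "\01367564" and "\all" in the original path would be changed as well.
octal = "\013"
bell = "\a"

print(len(newline), len(tab), len(backslash))
print(octal == "\x0b", bell == "\x07")
```

This is why doubling every backslash (or using a raw string) is safer than fixing only the one escape the interpreter complains about.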

Related

Trouble with opening docx file, seems to be Unicode issue

I am a novice to python & this is my first small project
I am having trouble inputting a file directory to open a Word document. I tried this by copying & pasting the directory from my command prompt, but this Error appears after plugging it in. How do I convert the command prompt to UTF-8 or find the directory in Unicode?
#After importing necessary modules for the project, I access the file
from docx import Document
import pandas as pd
import docx
doc = Document('C:\Users\trisy\OneDrive\Desktop\classes\SP_22_courses\CS1110\pye_files\kw_txt.docx')
#Error message
doc = Document('C:\Users\xxx\OneDrive\Desktop\classes\SP_22_courses\xxx\pye_files\kw_txt.docx')
^
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

The problem is caused by the backslashes in that pathname, combined with certain other characters.
In Python, putting \x in a string can have special behavior depending on what x is.
For example, \n does not mean "backslash n"; it means a newline character.
\U is one of these special cases.
To get around this, you have two options:
Use "raw strings". Put an r before the string. r'C:\Users\...' The r tells Python that backslashes should have no special meaning.
Use forward slashes in the file path. 'C:/Users/...' These will work even on Windows.
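The error is raised while Python compiles the source, before any line runs. The following sketch makes that visible by compiling two one-line programs held as text; "demo" is a made-up folder name and the variable names are assumptions for the example.

```python
# Source text for two one-line programs, held as raw strings so that this
# example itself compiles. "demo" is a made-up folder name.
bad_source = r"path = 'C:\Users\demo'"
good_source = r"path = r'C:\Users\demo'"

try:
    compile(bad_source, "<bad>", "exec")
    bad_error = None
except SyntaxError as exc:
    bad_error = exc  # raised at compile time, before any code runs

# The version with the r prefix compiles and runs normally.
namespace = {}
exec(compile(good_source, "<good>", "exec"), namespace)

print(type(bad_error).__name__)
print(namespace["path"])
```

Because it is a syntax error, wrapping the offending line of your own script in try/except will not help; the literal itself has to be fixed.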

Python : Issue is checking os.path.exists

I am trying to check whether a particular directory path exists.
Below is my code:
temp_path = '\\diwali\NSID-HYD-01\college'
meta_path = os.path.realpath(temp_path)
print(os.path.exists(meta_path))
When I try to execute this, it throws the error below:
temp_path = '\\diwali\NSID-HYD-01\college'
# ^
error
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 8-9: malformed \N character escape
Help me resolve this.

Python interprets backslashes (\) inside strings as leading characters for escape codes. For example \n is a line-feed character.
If you want it to treat them as simply backslashes, add an r before the string, like so:
temp_path = r'\\diwali\NSID-HYD-01\college'
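As a rough check that the raw literal keeps the UNC prefix intact, the path can be parsed with pathlib.PureWindowsPath, which works on any operating system. This is a sketch only; it inspects the string and does not touch the network share.

```python
from pathlib import PureWindowsPath

temp_path = r'\\diwali\NSID-HYD-01\college'

# The raw literal keeps both leading backslashes of the UNC prefix.
starts_with_two = temp_path[:2] == '\\\\'

# PureWindowsPath splits the path into the share and the folder.
unc = PureWindowsPath(temp_path)
print(starts_with_two)
print(unc.drive)
print(unc.name)
```

One caveat: a raw string literal cannot end in a single backslash, so leave off any trailing separator.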

Another method is to double every backslash in the literal. Because the UNC prefix itself consists of two backslashes, it becomes four:
temp_path = '\\\\diwali\\NSID-HYD-01\\college'
If you get the path from a UI (as you mentioned in the comments), no replacement is needed: escape sequences are only processed in string literals written in source code, so a string received at run time already contains plain backslashes.

Open multiple files from a file

I need to open a file that contains multiple absolute file paths.
Example:
Layer 1 = C:\User\Files\Menu\Menu.snt
Layer 2 = C:\User\Files\N0 - Vertical.snt
The problem is that when I try to open C:\User\Files\Menu\Menu.snt python doesn't like \U or \N
I could open using r"C:\User\Files\Menu\Menu.snt" but I can't automate this process.
file = open(config.txt, "r").read()
list = []
for line in file.split("\n"):
    list.append(open(line.split("=",1)[1]).read())
It prints out:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 33-34: malformed \N character escape

The backslash character \ is used as an escape character by the Python interpreter in order to provide special characters.
For example, \n is a "New Line" character, like you would get from pressing the Return key on your keyboard.
So if you are trying to read something like newFolder1\newFolder2, the interpreter reads it as:
newFolder1
ewFolder2
where the New Line character has been inserted between the two lines of text.
You already mentioned one workaround: using raw strings like r'my\folder\structure'. Note that escape processing only applies to string literals written in source code; paths read from a file at run time already contain plain backslashes and need neither an r prefix nor doubling.
If the paths do appear as literals in your code, replace every single backslash (\) with a double backslash (\\), or use forward slashes, and that should work.
Alternatively, you can look in the os module and build your paths dynamically using os.path.join(), which inserts os.sep (the platform's path separator) for you.
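One final point: you can save yourself some effort by looping over the file object directly instead of reading the whole file and splitting it on "\n". That is, replace:
for line in file.split("\n"):
by
for line in open("config.txt", "r"):
Each line then still ends with its newline, so strip the path before opening it.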

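Here is my solution:
with open("config.txt", "r") as config:
    lines = config.readlines()
# Split on the first "=" only, and strip the trailing newline from each path.
contents = [open(x.split("=", 1)[1].strip(), "r").read() for x in lines if "=" in x]
readlines creates a list that contains all lines in the file, so there is no need to split the whole string.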
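For a version that can be run end to end, here is a sketch that first writes a throwaway config file and then reads it back. The temporary directory, the file contents and all variable names are made up for the example; only the "Layer N = path" line format comes from the question.

```python
import os
import tempfile

# Build a throwaway directory with two layer files and a config file that
# lists them. All names here are made up for the example.
workdir = tempfile.mkdtemp()
layer_paths = [
    os.path.join(workdir, "Menu.snt"),
    os.path.join(workdir, "N0 - Vertical.snt"),
]
for number, path in enumerate(layer_paths, start=1):
    with open(path, "w") as handle:
        handle.write(f"contents of layer {number}")

config_path = os.path.join(workdir, "config.txt")
with open(config_path, "w") as handle:
    for number, path in enumerate(layer_paths, start=1):
        handle.write(f"Layer {number} = {path}\n")

# Read the config: split each line on the first "=" and strip whitespace.
contents = []
with open(config_path) as config:
    for line in config:
        if "=" not in line:
            continue  # skip blank or malformed lines
        layer_file = line.split("=", 1)[1].strip()
        with open(layer_file) as handle:
            contents.append(handle.read())

# A line as it would arrive from a file: the backslashes are ordinary
# characters at run time, so \U and \N need no special treatment.
sample_line = r"Layer 2 = C:\User\Files\N0 - Vertical.snt"
parsed = sample_line.split("=", 1)[1].strip()

print(contents)
print(parsed)
```

The with blocks close every file as soon as it has been read, which the one-line versions above do not do.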

Copying files raises a SyntaxError can't decode bytes

I'm trying to code a short program that makes backups of a folder whenever I run it. Currently it's like this:
import time
import shutil
import os
date = time.strftime("%d-%m-%Y")
print(date)
shutil.copy2("C:\Users\joaop\Desktop\VanillaServer\world","C:\Users\joaop\Desktop\VanillaServer\Backups")
for filename in os.listdir("C:\Users\joaop\Desktop\VanillaServer\Backups"):
    if filename == world:
        os.rename(filename, "Backup " + date)
However I get an error:
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
and I can't figure out why (according to documentation, I think my code is properly written)
How can I fix this/do it in a better way?

In Python, \uXXXX and \UXXXXXXXX denote Unicode escape sequences, so the \U in your \Users directory is interpreted as the start of a Unicode character -- not with very much success.
>>> "\u0061"
'a'
>>> "\users"
File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape
To fix it, you should escape the different \ as \\, or use r"..." to make it a raw string.
>>> "C:\\Users\\joaop\\Desktop\\VanillaServer\\world"
'C:\\Users\\joaop\\Desktop\\VanillaServer\\world'
>>> r"C:\Users\joaop\Desktop\VanillaServer\world"
'C:\\Users\\joaop\\Desktop\\VanillaServer\\world'
Don't do both, though, or else they will be escaped twice:
>>> r"C:\\Users\\joaop\\Desktop\\VanillaServer\\world"
'C:\\\\Users\\\\joaop\\\\Desktop\\\\VanillaServer\\\\world'
You only have to escape them when entering the paths directly in your source; if you read those paths from a file, from user input, or from some library function, they already contain plain backslashes and need no escaping.
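The doubled backslashes in the interpreter output above are only how repr() displays the string. A small sketch, with illustrative variable names, that counts the backslashes actually stored:

```python
path = r"C:\Users\joaop\Desktop\VanillaServer\world"

# repr() shows the string as a literal, so each backslash appears doubled...
shown = repr(path)

# ...but the string itself holds one backslash per separator.
single = path.count("\\")
doubled_in_repr = shown.count("\\")

print(path)
print(shown)
print(single, doubled_in_repr)
```

print() shows the real contents, while the interactive prompt shows the repr() form.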

Backslashes introduce escape sequences, so when the interpreter sees a \ in your file path string it tries to treat it and the following character as an escape (such as \n for a new line or \t for a tab).
There are two ways around this: use a raw string, or double each backslash in your file path so that no escape sequence is formed. Prefix the string with r to make it raw, or write \\ for each backslash. Which one you use is up to you, but personally I prefer raw strings.
# with raw strings
shutil.copy2(r"C:\Users\joaop\Desktop\VanillaServer\world", r"C:\Users\joaop\Desktop\VanillaServer\Backups")
for filename in os.listdir(r"C:\Users\joaop\Desktop\VanillaServer\Backups"):
    if filename == "world":  # compare against the name as a string
        os.rename(filename, "Backup " + date)
# with double backslashes
shutil.copy2("C:\\Users\\joaop\\Desktop\\VanillaServer\\world", "C:\\Users\\joaop\\Desktop\\VanillaServer\\Backups")
for filename in os.listdir("C:\\Users\\joaop\\Desktop\\VanillaServer\\Backups"):
    if filename == "world":  # compare against the name as a string
        os.rename(filename, "Backup " + date)
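As for doing it "in a better way", one possible approach is to copy the folder straight to its dated name. The sketch below assumes Python 3.8 or later (for dirs_exist_ok) and uses shutil.copytree, since shutil.copy2 copies single files rather than whole directories. The function name backup_folder and the temporary folders are made up for the example and stand in for the real paths.

```python
import shutil
import tempfile
import time
from pathlib import Path


def backup_folder(source: Path, backups: Path) -> Path:
    """Copy the `source` directory into `backups` under a dated name."""
    date = time.strftime("%d-%m-%Y")
    target = backups / f"Backup {date}"
    # copytree copies a whole directory tree; copy2 handles single files.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target


# Demonstration on temporary folders standing in for the real ones,
# e.g. Path(r"C:\Users\joaop\Desktop\VanillaServer\world").
root = Path(tempfile.mkdtemp())
world = root / "world"
world.mkdir()
(world / "level.dat").write_text("example data")

created = backup_folder(world, root / "Backups")
print(created.name, sorted(p.name for p in created.iterdir()))
```

Copying directly to the final name avoids the separate listdir and rename step. Running it twice on the same day overwrites that day's backup.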

Python - Must add r when opening a file

I have several .py files and I can open my file everywhere, except in my test.py file (I test scripts and functions there) instead of this:
file = open("C:\Users\User\Desktop\key_values.txt", "r")
I need to use this (with r) to avoid error:
file = open(r"C:\Users\User\Desktop\key_values.txt", "r")
I get this error: (when I try to open a file without r in my test.py script)
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
Any idea why is this happening ?

Backslash is an escape character, so you can include characters like "\n" (new line) and "\t" (tab). The r before the string means "my backslashes are not escape characters".
Interestingly, it looks like your string "C:\Users\User\Desktop\key_values.txt" works OK in Python 2 because none of the backslashes start a recognized escape sequence in a Python 2 byte string. But in Python 3, "\UXXXXXXXX" indicates a Unicode character. So maybe that is why some of your Python files can cope and some can't.

The other answers are OK, but here is a time-saving trick:
Try using forward slashes instead of backslashes:
file = open("C:/Users/User/Desktop/key_values.txt", "r")
It works in Windows. Tried with Python 2.7
Hope this helps
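To see that the forward-slash spelling names the same file, the path can be normalized with ntpath, the Windows flavour of os.path, which can be imported on any platform. This is a sketch with illustrative variable names.

```python
import ntpath  # the Windows flavour of os.path, importable on any platform

forward = "C:/Users/User/Desktop/key_values.txt"
raw = r"C:\Users\User\Desktop\key_values.txt"

# normpath rewrites forward slashes as backslashes under Windows rules.
normalised = ntpath.normpath(forward)
print(normalised == raw)
```

On Windows itself, os.path is ntpath, so os.path.normpath gives the same result there.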
