Why is glob ignoring some directories? - python

I'm trying to find all *.txt files in a directory with glob(). In some cases, glob.glob('some\path\*.txt') returns an empty list, even though matching files exist in the given directory. This seems to happen especially when the path is all lower-case or numeric.
As a minimal example I have two folders a and A on my C: drive both holding one Test.txt file.
import glob
files1 = glob.glob('C:\a\*.txt')
files2 = glob.glob('C:\A\*.txt')
yields
files1 = []
files2 = ['C:\\A\\Test.txt']
If this is by design, are there other directory names that lead to such unexpected behaviour?
(I'm working on win 7, with Python 2.7.10 (32bit))
EDIT: (2019) Added an answer for Python 3 using pathlib.

The problem is that \a has a special meaning in string literals (bell char).
Just double backslashes when inserting paths in string literals (i.e. use "C:\\a\\*.txt").
Python is different from C because when you use backslash with a character that doesn't have a special meaning (e.g. "\s") Python keeps both the backslash and the letter (in C instead you would get just the "s").
This sometimes hides the issue because things just work anyway even with a single backslash (depending on what is the first letter of the directory name) ...
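A small sketch of the difference described above, assuming Python 3. The variable names are made up for illustration. The unknown escape is parsed from a raw source string with ast.literal_eval so that the example itself contains no invalid escape sequences, since recent Python versions warn about those.

```python
import ast
import warnings

bell = "C:\a"        # \a is a recognised escape: the bell character, 0x07
doubled = "C:\\a"    # \\ is an escaped backslash, followed by a plain 'a'

# Parse the source text '\s' (backslash + s) as a string literal.
# Newer Python versions warn about unknown escapes, so silence that here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    unknown = ast.literal_eval(r"'\s'")

print(repr(bell))     # 'C:\x07'
print(len(doubled))   # 4
print(len(unknown))   # 2 (the backslash is kept next to the 's')
```

So 'C:\a\*.txt' silently asks glob for a folder whose name is the bell character, while 'C:\A\*.txt' happens to survive because \A is not a recognised escape.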

I personally avoid using double-backslashes in Windows and just use Python's handy raw-string format. Just change your code to the following and you won't have to escape the backslashes:
import glob
files1 = glob.glob(r'C:\a\*.txt')
files2 = glob.glob(r'C:\A\*.txt')
Notice the r at the beginning of the string.
As already mentioned, \a is an escape sequence in Python. Here's a link to the list of escape sequences in Python's string literals:
https://docs.python.org/2/reference/lexical_analysis.html#string-literals
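A short sketch of how the raw-string form relates to the doubled-backslash form, with one caveat that is easy to trip over: a raw string literal cannot end in a single backslash. The variable names are only for illustration.

```python
raw = r'C:\a\*.txt'
doubled = 'C:\\a\\*.txt'
print(raw == doubled)   # True

# r'C:\a\' would be a syntax error, because the final backslash
# still prevents the quote from closing the literal.
# One workaround is to append the trailing separator separately:
folder = r'C:\a' + '\\'
print(folder)           # C:\a\
```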

My original answer attracted more views than expected and some time has passed, so I wanted to add an answer that reliably solves this kind of problem and is also cross-platform compatible. It was written for Python 3 on Windows 10, but should also work on *nix systems.
from pathlib import Path
filepath = Path(r'C:\a')
filelist = list(filepath.glob('*.txt'))
--> [WindowsPath('C:/a/Test.txt')]
I like this solution better, as I can copy and paste paths directly from windows explorer, without the need to add or double backslashes etc.
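For readers without a C:\a folder at hand, here is a self-contained variant of the same idea. It is only a sketch: it builds a throwaway folder with tempfile (an assumption of this example, not part of the answer) and then globs it with pathlib.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "a"            # lower-case name, as in the question
    folder.mkdir()
    (folder / "Test.txt").write_text("hello")
    (folder / "notes.md").write_text("ignored by the pattern")

    # Path.glob matches relative to the folder, so no separators
    # have to be typed into the pattern at all.
    found = sorted(p.name for p in folder.glob("*.txt"))

print(found)   # ['Test.txt']
```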

Related

Opening files in python3 for windows10

Hi, I cannot open files in Python 3; I have a problem with the path and don't know how to write it. For example, I have a file (bazi.py) in a folder (w8) on drive F:. How should I write its path? Please help me, I'm an amateur.
In Windows, there are a couple of additional ways of referencing a file. That is because, natively, Windows file paths use the backslash "\" instead of the slash. Python allows using both on a Windows system, but there are a couple of pitfalls to watch out for. To sum them up:
Python lets you use OS-X/Linux style slashes "/" even in Windows. Therefore, you can refer to the file as 'C:/Users/narae/Desktop/alice.txt'. RECOMMENDED.
If using backslash, because it is a special character in Python, you must remember to escape every instance: 'C:\\Users\\narae\\Desktop\\alice.txt'
Alternatively, you can prefix the entire file name string with the rawstring marker "r": r'C:\Users\narae\Desktop\alice.txt'. That way, everything in the string is interpreted as a literal character, and you don't have to escape every backslash.
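The three spellings can be compared directly. This sketch uses pathlib.PureWindowsPath, which applies Windows path rules on any operating system, so it can be run outside Windows as well. The variable names are illustrative.

```python
from pathlib import PureWindowsPath

forward = 'C:/Users/narae/Desktop/alice.txt'
escaped = 'C:\\Users\\narae\\Desktop\\alice.txt'
raw = r'C:\Users\narae\Desktop\alice.txt'

# The escaped and raw literals produce the very same string.
print(escaped == raw)                                      # True

# The forward-slash form is a different string, but the same Windows path.
print(PureWindowsPath(forward) == PureWindowsPath(raw))    # True
print(PureWindowsPath(raw).as_posix())   # C:/Users/narae/Desktop/alice.txt
```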
File Name Shortcuts and CWD (Current Working Directory)
So, using the full directory path and file name always works; you should be using this method. However, you might have seen files called by their name only, e.g., 'alice.txt' in Python. How is it done?
The concept of Current Working Directory (CWD) is crucial here. You can think of it as the folder your Python is operating inside at the moment. So far we have been using the absolute path, which begins from the topmost directory. But if your file reference does not start from the top (e.g., 'alice.txt', 'ling1330/alice.txt'), Python assumes that it starts in the CWD (a "relative path").
Using the os.path.abspath function will turn the path into a normalized absolute path, with the separators appropriate for the operating system.
os.path.abspath(r'F:\w8\bazi.py')
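To see how a relative name is resolved against the current working directory, a minimal sketch (the file does not need to exist for this; 'alice.txt' is just the example name used above):

```python
import os

relative = 'alice.txt'
cwd = os.getcwd()                      # the folder Python is currently "in"
absolute = os.path.abspath(relative)   # CWD + relative name, normalized

print(cwd)
print(absolute)
print(os.path.isabs(absolute))         # True
print(absolute == os.path.normpath(os.path.join(cwd, relative)))   # True
```

If open('alice.txt') fails, printing os.getcwd() is usually the quickest way to check whether Python is looking in the folder you expect.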

Can't replace character "\"

I've been working on a program that reads a specific PDF and converts the data to an Excel file. The program itself already works, but while trying to refine some aspects I ran into a problem. The modules I'm working with read directories with forward slashes dividing each folder, such as:
"C:/Users/UserX"
While windows directories are divided by backslashes, such as:
"C:\Users\UserX"
I thought using a simple replace would work just fine:
directory.replace("\" ,"/")
But whenever I try to run the program, the \ isn't identified as a string. Instead it shows up as orange in the IDE I'm working with (PyCharm). Is there any way to remedy this? Or maybe another useful solution?
In general you should work with the os.path module here.
os.getcwd() gives you the current directory; with os.path.join you can append a subfolder via further arguments and put the filename last.
import os
path_to_file = os.path.join(os.getcwd(), "childFolder", filename)
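In Python, the '\' character is written as '\\' inside a string literal:
directory = directory.replace("\\", "/")
Just try adding another backslash. Note that replace() returns a new string, so the result has to be assigned.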
First of all you need to pass "C:\Users\UserX" as a raw string. Use
directory=r"C:\Users\UserX"
Secondly, escape the backslash with a second backslash.
directory = directory.replace("\\", "/")
All of this is required because in Python the backslash (\) is a special character known as the escape character.
Try this:
import os
path = "C:\\temp\\myFolder\\example\\"
# os.sep is '\\' on Windows only, so this replacement does nothing on other systems
newPath = path.replace(os.sep, '/')
print(newPath)
Output (on Windows): C:/temp/myFolder/example/
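Putting the answers together, here is a sketch of two ways to get forward slashes that behave the same on every platform. Replacing the literal "\\" does not depend on os.sep, and pathlib.PureWindowsPath offers as_posix() for the same purpose. The variable names are illustrative.

```python
from pathlib import PureWindowsPath

directory = r"C:\Users\UserX"

# Option 1: replace the escaped backslash and keep the returned string.
with_replace = directory.replace("\\", "/")

# Option 2: let pathlib parse the Windows path and render it with slashes.
with_pathlib = PureWindowsPath(directory).as_posix()

print(with_replace)   # C:/Users/UserX
print(with_pathlib)   # C:/Users/UserX
```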

Jupyter note book path not found

I have tried to locate the file. I used both forward slashes and backslashes, and both single and double quotes; nothing has changed.
This is the error I am getting
Windows is a bit trickier. This is just a hunch, but maybe try:
path = "C:\\Users\\BarbieA\\.... "
Windows paths are separated by \ but since that is used to escape special characters, you would need to escape it as well, so it becomes \\
Yeah. I recommend using pathlib to make your life easier, as sometimes, the spaces and the special symbols can be confusing when writing by hand.
from pathlib import PureWindowsPath
file = PureWindowsPath(r"C:\Users\Barbie..")
open(file)
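When a path is reported as not found, it can help to check which part of it actually exists before opening it. The helper below is a hypothetical sketch (explain_missing is not a library function, and the demo path is made up): it reports the nearest folder that does exist.

```python
from pathlib import Path


def explain_missing(path_text):
    """Return a short message saying whether the path exists and,
    if not, which parent folder is the nearest one that does."""
    path = Path(path_text)
    if path.exists():
        return f"found: {path.resolve()}"
    for parent in path.parents:
        if parent.exists():
            return f"missing: {path} (nearest existing folder: {parent})"
    return f"missing: {path} (no existing parent folder)"


# Hypothetical relative path, resolved against the notebook's working directory.
print(explain_missing("definitely-missing-folder/data.csv"))
```

In a notebook, the working directory is typically the folder the notebook was started from, so a relative path that works in one notebook may fail in another.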

saving/creating a file in a relative path with python 3

I'm new to Python and i'd like to build a script (Python 3) to test electronic modules and save log files.
I'd like to save the logfiles in the following format:
201410log.txt (yearmonthlog.txt)
This is done with the code:
import time
logfile=open(time.strftime('%Y%mlog.txt'), 'a')
logfile.write('This is a test\n\n\n')
This way, every month a new log file is created.
However, I'd like the logfiles to be in a subdirectory (\logs).
I tried approaches like
logfile=open(time.strftime('\logs\%Y%mlog.txt'), 'a')
and similar things, but I couldn't get any of them to work.
I searched through other questions on Stack Overflow (for example: Relative paths in Python) and elsewhere on the internet, but I couldn't find the right solution.
Could someone point me in the right direction?
(sorry for any mistakes/spelling errors, I'm not a native English speaker)
Remove the leading backslash; it makes the path absolute. Besides that, you need to escape the backslash.
logfile = open(time.strftime('logs\\%Y%mlog.txt'), 'a')
or use r'raw string literal':
logfile = open(time.strftime(r'logs\%Y%mlog.txt'), 'a')
For your current path string literal, this does not cause a problem. But paths like 'a\nb' will not work, because \n is interpreted as a newline instead of a literal backslash and n.
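One more thing that may bite here: open() in append mode creates the file but not the folder, so the logs directory has to exist first. A sketch that avoids typing separators altogether, assuming Python 3; open_monthly_log is a hypothetical helper name, and the demo runs in a temporary folder so it leaves nothing behind.

```python
import os
import tempfile
import time


def open_monthly_log(log_dir="logs"):
    """Open this month's log file for appending, creating log_dir if needed."""
    os.makedirs(log_dir, exist_ok=True)
    filename = time.strftime("%Y%mlog.txt")          # e.g. 201410log.txt
    return open(os.path.join(log_dir, filename), "a")


# Demonstration inside a throwaway directory.
with tempfile.TemporaryDirectory() as base:
    log_dir = os.path.join(base, "logs")
    with open_monthly_log(log_dir) as logfile:
        logfile.write("This is a test\n")
    created = os.listdir(log_dir)

print(created)
```

Called without arguments, open_monthly_log() writes to a logs folder relative to the current working directory, which matches the layout asked for in the question.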

Python. Error when trying to create a new directory using the filename if it has a special character in it

My script searches the directory that it's in, creates new directories named after the files it finds, and moves each file into its directory: John-doe-taxes.hrb -> John-doe/John-doe-taxes.hrb. It works fine until it runs into an umlaut character; then it creates the directory and returns an "Error 2" saying that it cannot find the file. I'm fairly new to programming, and the answers I've found have been to add a
coding: utf-8
line to the file, which doesn't work, I believe because I'm not using umlauts in my code; I'm dealing with file names that contain umlauts. One thing I was curious about: does this problem occur only with umlauts, or with other special characters as well? This is the code I'm using; I appreciate any advice provided.
import os
import re
from os.path import dirname, abspath, join
dir = dirname(abspath(__file__))
(root, dirs, files) = os.walk(dir).next()
p = re.compile('(.*)-taxes-')
count = 0
for file in files:
    match = p.search(file)
    if match:
        count = count + 1
        print("Files processed: " + str(count))
        dir_name = match.group(1)
        full_dir = join(dir, dir_name)
        if not os.access(full_dir, os.F_OK):
            os.mkdir(full_dir)
        os.rename(join(dir, file), join(full_dir, file))
raw_input()
I think your problem is passing strs to os.rename that aren't in the system encoding. As long as the filenames only use ascii characters this will work, however outside that range you're likely to run into problems.
The best solution is probably to work in unicode. The filesystem functions should return unicode strings if you give them unicode arguments. open should work fine on windows with unicode filenames.
If you do:
dir = dirname(abspath(unicode(__file__)))
Then you should be working with unicode strings the whole way.
One thing to consider would be to use Python 3, which uses Unicode strings by default. I'm not sure whether you would have to change anything in the above code for it to work, but Python has shipped a script (2to3) for transitioning Python 2 code to Python 3.
Sorry I can't help you with Python2, I had a similar problem and just transitioned my project to Python3--ended up just being a bit easier for me!
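For comparison, a rough Python 3 sketch of the same idea, where file names are Unicode strings throughout. It keeps the '(.*)-taxes-' pattern from the question. The function name sort_into_folders, the demo file name, and the use of pathlib and shutil are assumptions of this sketch, not the asker's code.

```python
import re
import shutil
import tempfile
from pathlib import Path

PATTERN = re.compile(r'(.*)-taxes-')


def sort_into_folders(folder):
    """Move every matching file into a subfolder named after its prefix."""
    folder = Path(folder)
    moved = []
    # sorted() takes a snapshot, so folders created below are not revisited.
    for entry in sorted(folder.iterdir()):
        if not entry.is_file():
            continue
        match = PATTERN.search(entry.name)
        if not match:
            continue
        target_dir = folder / match.group(1)
        target_dir.mkdir(exist_ok=True)
        target = target_dir / entry.name
        shutil.move(str(entry), str(target))
        moved.append(target)
    return moved


# Demo with a hypothetical file name containing an umlaut.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Jürgen-taxes-2014.hrb").write_text("demo")
    moved = sort_into_folders(tmp)
    moved_ok = all(p.is_file() for p in moved)
    relative = [p.relative_to(tmp).as_posix() for p in moved]

print(relative)
```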
