How can I check a user-supplied path is sanitised?
I want to ensure it has no wildcards nor any shenanigans. Right now, I'm checking that it is not escaping the correct folder so:
if os.path.commonprefix([os.path.abspath(path), os.getcwd()]) != os.getcwd():
    # raise error etc..
But like all self-written security check code, I want it held up to better scrutiny! It also doesn't check that the path is actually legal.
I will then be using the path to create assets and such.
You could use Werkzeug's secure_filename:
werkzeug.utils.secure_filename(filename)
Pass it a filename and it will return a secure version of it. This
filename can then safely be stored on a regular file system and passed
to os.path.join(). The filename returned is an ASCII only string for
maximum portability.
On windows system the function also makes sure that the file is not
named after one of the special device files.
>>> secure_filename("My cool movie.mov")
'My_cool_movie.mov'
>>> secure_filename("../../../etc/passwd")
'etc_passwd'
>>> secure_filename(u'i contain cool \xfcml\xe4uts.txt')
'i_contain_cool_umlauts.txt'
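secure_filename sanitises a single file name. For the containment check in the question, note that os.path.commonprefix compares strings character by character, so a sibling such as /srv/data2 passes a check against /srv/data. Below is a minimal sketch of a component-wise check. The name is_within_directory is hypothetical, and the sketch assumes symlinks should be resolved before comparing.

```python
import os


def is_within_directory(base_dir, user_path):
    """Return True if user_path, joined onto base_dir, stays inside base_dir."""
    # Resolve symlinks and ".." components on both sides before comparing.
    base = os.path.normcase(os.path.realpath(base_dir))
    target = os.path.normcase(os.path.realpath(os.path.join(base, user_path)))
    try:
        # commonpath compares whole path components, not characters.
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
```

A caller would reject the request when this returns False, for example is_within_directory(os.getcwd(), path).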
This is a screenshot of the execution:
As you see, the error says that the directory "JSONFiles/Apartment/Rent/dubizzleabudhabiproperty" is not there.
But look at my files, please:
The folder is definitely there.
Update 2
The code
self.file = open("JSONFiles/"+ item["category"]+"/" + item["action"]+"/"+ item['source']+"/"+fileName + '.json', 'wb') # Create a new JSON file with the name = fileName parameter
line = json.dumps(dict(item)) # Change the item to a JSON format in one line
self.file.write(line) # Write the item to the file
UPDATE
When I change the file name to a shorter one, it works, so the problem is the length of the path. What is the solution, please?
Regular DOS paths are limited to MAX_PATH (260) characters, including the string's terminating NUL character. You can exceed this limit by using an extended-length path that starts with the \\?\ prefix. This path must be a Unicode string, fully qualified, and only use backslash as the path separator. Per Microsoft's file system functionality comparison, the maximum extended path length is 32760 characters. An individual file or directory name can be up to 255 characters (127 for the UDF filesystem). Extended UNC paths are also supported as \\?\UNC\server\share.
For example:
import os

def winapi_path(dos_path, encoding=None):
    if (not isinstance(dos_path, unicode) and
        encoding is not None):
        dos_path = dos_path.decode(encoding)
    path = os.path.abspath(dos_path)
    if path.startswith(u"\\\\"):
        return u"\\\\?\\UNC\\" + path[2:]
    return u"\\\\?\\" + path

path = winapi_path(os.path.join(u"JSONFiles",
                                item["category"],
                                item["action"],
                                item["source"],
                                fileName + ".json"))
>>> path = winapi_path("C:\\Temp\\test.txt")
>>> print path
\\?\C:\Temp\test.txt
See the following pages on MSDN:
Naming Files, Paths, and Namespaces
Defining an MS-DOS Device Name
Kernel object namespaces
Background
Windows calls the NT runtime library function RtlDosPathNameToRelativeNtPathName_U_WithStatus to convert a DOS path to a native NT path. If we open (i.e. CreateFile) the above path with a breakpoint set on the latter function, we can see how it handles a path that starts with the \\?\ prefix.
Breakpoint 0 hit
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus:
00007ff9`d1fb5880 4883ec58 sub rsp,58h
0:000> du @rcx
000000b4`52fc0f60 "\\?\C:\Temp\test.txt"
0:000> r rdx
rdx=000000b450f9ec18
0:000> pt
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus+0x66:
00007ff9`d1fb58e6 c3 ret
The result replaces \\?\ with the NT DOS devices prefix \??\, and copies the string into a native UNICODE_STRING:
0:000> dS b450f9ec18
000000b4`536b7de0 "\??\C:\Temp\test.txt"
If you use //?/ instead of \\?\, then the path is still limited to MAX_PATH characters. If it's too long, then RtlDosPathNameToRelativeNtPathName returns the status code STATUS_NAME_TOO_LONG (0xC0000106).
If you use \\?\ for the prefix but use slash in the rest of the path, Windows will not translate the slash to backslash for you:
Breakpoint 0 hit
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus:
00007ff9`d1fb5880 4883ec58 sub rsp,58h
0:000> du @rcx
0000005b`c2ffbf30 "\\?\C:/Temp/test.txt"
0:000> r rdx
rdx=0000005bc0b3f068
0:000> pt
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus+0x66:
00007ff9`d1fb58e6 c3 ret
0:000> dS 5bc0b3f068
0000005b`c3066d30 "\??\C:/Temp/test.txt"
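Because the prefix disables slash translation, it can help to normalise separators before adding it. The following is a sketch only: to_extended_path is a hypothetical helper, it assumes the input is already an absolute DOS or UNC path without the prefix, and it uses ntpath (the Windows flavour of os.path) so the string handling can be tried on any platform.

```python
import ntpath


def to_extended_path(path):
    """Convert an absolute DOS or UNC path to its extended-length form."""
    # normpath turns forward slashes into backslashes and collapses "." and "..".
    full = ntpath.normpath(path)
    if full.startswith("\\\\"):
        # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
        return "\\\\?\\UNC\\" + full[2:]
    return "\\\\?\\" + full


print(to_extended_path("C:/Temp/test.txt"))
print(to_extended_path("//server/share/a.txt"))
```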
Forward slash is a valid object name character in the NT namespace. It's reserved by Microsoft filesystems, but you can use a forward slash in other named kernel objects, which get stored in \BaseNamedObjects or \Sessions\[session number]\BaseNamedObjects. Also, I don't think the I/O manager enforces the policy on reserved characters in device and filenames. It's up to the device. Maybe someone out there has a Windows device that implements a namespace that allows forward slash in names. At the very least you can create DOS device names that contain a forward slash. For example:
>>> kernel32 = ctypes.WinDLL('kernel32')
>>> kernel32.DefineDosDeviceW(0, u'My/Device', u'C:\\Temp')
>>> os.path.exists(u'\\\\?\\My/Device\\test.txt')
True
You may be wondering what \?? signifies. This used to be an actual directory for DOS device links in the object namespace, but starting with NT 5 (or NT 4 w/ Terminal Services) this became a virtual prefix. The object manager handles this prefix by first checking the logon session's DOS device links in the directory \Sessions\0\DosDevices\[LOGON_SESSION_ID] and then checking the system-wide DOS device links in the \Global?? directory.
Note that the former is a logon session, not a Windows session. The logon session directories are all under the DosDevices directory of Windows session 0 (i.e. the services session in Vista+). Thus if you have a mapped drive for a non-elevated logon, you'll discover that it's not available in an elevated command prompt, because your elevated token is actually for a different logon session.
An example of a DOS device link is \Global??\C: => \Device\HarddiskVolume2. In this case the DOS C: drive is actually a symbolic link to the HarddiskVolume2 device.
Here's a brief overview of how the system handles parsing a path to open a file. Given we're calling WinAPI CreateFile, it stores the translated NT UNICODE_STRING in an OBJECT_ATTRIBUTES structure and calls the system function NtCreateFile.
0:000> g
Breakpoint 1 hit
ntdll!NtCreateFile:
00007ff9`d2023d70 4c8bd1 mov r10,rcx
0:000> !obja @r8
Obja +000000b450f9ec58 at 000000b450f9ec58:
Name is \??\C:\Temp\test.txt
OBJ_CASE_INSENSITIVE
NtCreateFile calls the I/O manager function IoCreateFile, which in turn calls the undocumented object manager API ObOpenObjectByName. This does the work of parsing the path. The object manager starts with \??\C:\Temp\test.txt. Then it replaces that with \Global??\C:\Temp\test.txt. Next it parses up to the C: symbolic link and has to start over (reparse) the final path \Device\HarddiskVolume2\Temp\test.txt.
Once the object manager gets to the HarddiskVolume2 device object, parsing is handed off to the I/O manager, which implements the Device object type. The ParseProcedure of an I/O Device creates the File object and an I/O Request Packet (IRP) with the major function code IRP_MJ_CREATE (an open/create operation) to be processed by the device stack. This is sent to the device driver via IoCallDriver. If the device implements reparse points (e.g. junction mountpoints, symbolic links, etc) and the path contains a reparse point, then the resolved path has to be resubmitted to the object manager to be parsed from the start.
The device driver will use the SeChangeNotifyPrivilege (almost always present and enabled) of the process token (or thread if impersonating) to bypass access checks while traversing directories. However, ultimately access to the device and target file has to be allowed by a security descriptor, which is verified via SeAccessCheck. The exception is simple filesystems such as FAT32, which don't support file security.
Below is a Python 3 version of @Eryk Sun's solution.
import os

def winapi_path(dos_path, encoding=None):
    # In Python 3 only bytes need decoding; str is already Unicode.
    if not isinstance(dos_path, str) and encoding is not None:
        dos_path = dos_path.decode(encoding)
    path = os.path.abspath(dos_path)
    if path.startswith("\\\\"):
        return "\\\\?\\UNC\\" + path[2:]
    return "\\\\?\\" + path

# Python 3 renamed the unicode type to str, and the old str type was replaced by bytes.
# Using unicode in Python 3 raises: NameError: name 'unicode' is not defined
Adding the solution that helped me fix a similar issue:
Python version: 3.9. Windows version: 10 Pro.
I had an issue with the filename itself, as it was too long for Python's built-in open function. The error said that the path simply doesn't exist, although I used the 'w+' mode for open (which is supposed to create a new file regardless of whether it already exists).
I found this guide, which solved the problem with a quick change in the Windows Registry Editor (or, alternatively, the Group Policy editor). Scroll down to the 'Make Windows 10 Accept Long File Paths' headline.
Don't forget to refresh your OS group policy so the change takes effect immediately; a guide can be found here.
Hope this helps future searches as this post is quite old.
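The setting such guides usually change is the LongPathsEnabled value under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem. As a rough sketch, you can read it from Python to check whether the change was applied. The helper name long_paths_enabled is hypothetical, and it returns None when not running on Windows.

```python
import sys


def long_paths_enabled():
    """Return True/False for the LongPathsEnabled setting, or None off Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # standard library, available on Windows only

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except OSError:
        # The key or value is missing, which means the setting is not enabled.
        return False
    return value == 1


print(long_paths_enabled())
```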
There can be multiple reasons for you getting this error. Please make sure of the following:
The parent directory of the folder (JSONFiles) is the same as the directory of the Python script.
Even though the folder exists it does not mean the individual file does. Verify the same and make sure the exact file name matches the one that your Python code is trying to access.
If you still face an issue, share the result of "dir" command on the innermost folder you are trying to access.
It works for me:
import os

str1 = r"C:\Users\sandeepmkwana\Desktop\folder_structure\models\manual\demodfadsfljdskfjslkdsjfklaj\inner-2djfklsdfjsdklfj\inner3fadsfksdfjdklsfjksdgjl\inner4dfhasdjfhsdjfskfklsjdkjfleioreirueewdsfksdmv\anotherInnerfolder4aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\5qbbbbbbbbbbbccccccccccccccccccccccccsssssssssssssssss\tmp.txt"
print(len(str1))  # 346
path = os.path.abspath(str1)
if path.startswith("\\\\"):
    # UNC path: \\server\share becomes \\?\UNC\server\share
    path = "\\\\?\\UNC\\" + path[2:]
else:
    path = "\\\\?\\" + path
with open(path, "r+") as f:
    print(f.readline())
If you run into a long-path issue on Windows (a path of more than 258 characters), try this.
Hi, I cannot open files in Python 3; I have a problem with the path and don't know how to write it. For example, I have a file (bazi.py) in a folder (w8) on drive F. How should I write its path? Please help, I'm an amateur.
In Windows, there are a couple of additional ways of referencing a file. That is because, natively, Windows file paths use the backslash "\" instead of the slash. Python allows using both on a Windows system, but there are a couple of pitfalls to watch out for. To sum them up:
Python lets you use OS-X/Linux style slashes "/" even in Windows. Therefore, you can refer to the file as 'C:/Users/narae/Desktop/alice.txt'. RECOMMENDED.
If using backslash, because it is a special character in Python, you must remember to escape every instance: 'C:\\Users\\narae\\Desktop\\alice.txt'
Alternatively, you can prefix the entire file name string with the raw string marker "r": r'C:\Users\narae\Desktop\alice.txt'. That way, everything in the string is interpreted as a literal character, and you don't have to escape every backslash.
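As a quick check of the three spellings above, the sketch below compares them. It uses pathlib.PureWindowsPath, which only manipulates the string and does not touch the file system, so it can be tried on any platform; the variable names are just for illustration.

```python
from pathlib import PureWindowsPath

forward = 'C:/Users/narae/Desktop/alice.txt'
escaped = 'C:\\Users\\narae\\Desktop\\alice.txt'
raw = r'C:\Users\narae\Desktop\alice.txt'

# The escaped and raw literals produce exactly the same string.
print(escaped == raw)

# As Windows paths, the forward-slash spelling refers to the same location.
print(PureWindowsPath(forward) == PureWindowsPath(raw))
```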
File Name Shortcuts and CWD (Current Working Directory)
So, using the full directory path and file name always works; you should be using this method. However, you might have seen files called by their name only, e.g., 'alice.txt' in Python. How is it done?
The concept of Current Working Directory (CWD) is crucial here. You can think of it as the folder your Python is operating inside at the moment. So far we have been using the absolute path, which begins from the topmost directory. But if your file reference does not start from the top (e.g., 'alice.txt', 'ling1330/alice.txt'), Python assumes that it starts in the CWD (a "relative path").
The os.path.abspath function returns a normalized absolute version of the path, appropriate for the operating system:
os.path.abspath(r'F:\w8\bazi.py')
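To see how a relative name is resolved against the CWD, here is a small sketch. The file name alice.txt is only an illustration and does not need to exist, because abspath works on the string alone.

```python
import os

# The folder Python is currently "inside".
cwd = os.getcwd()
print(cwd)

# A relative name is resolved against the CWD to form an absolute path.
resolved = os.path.abspath('alice.txt')
print(resolved)
print(os.path.isabs(resolved))
```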
I have a question about correctly reading a Windows path in Python. I tried several methods I found online but didn't find a solution. I assigned the path (a folder directory) to a variable and would like to read it as raw. However, there is a '\' followed by a number, and Python doesn't read it correctly. Any suggestions? Thanks
fld_dic = 'D:TestData\20190917_DT19_HigherFlowRate_StdCooler\DM19_Data'
I would like to have:
r'D:TestData\20190917_DT19_HigherFlowRate_StdCooler\DM19_Data'
And I tried:
fr'{fld_dic}', which gives me: 'D:TestData\x8190917_DT19_HigherFlowRate_StdCooler\\DM19_Data'
which is not what I want. Any idea how to convert an already assigned variable containing '\' followed by a number into a raw string?
Thanks
The root cause is how the string is assigned. When you write path='c:\202\data', Python interprets \202 as an octal escape sequence, so the string no longer contains the characters you typed. You have to assign it as a raw string. Handling paths as plain strings like this is also not best practice and will keep causing problems.
You should not keep a path variable as a plain string. It throws away Python's cross-platform advantage.
You should use pathlib or os.path. I recommend pathlib. It has pure Windows and POSIX path classes. Use these when obtaining paths as well: if you get a path from user input, you can read it as raw text and convert it to a pathlib instance.
Check this link:
https://docs.python.org/3/library/pathlib.html
The following works but is not best practice: just assign the path as a raw string.
import os

def fcn(path=r'C:\202\data'):
    print(path)
    os.chdir(path)

fcn()
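To illustrate why the variable cannot be "converted to raw" afterwards, the sketch below uses a shortened form of the path from the question. In a normal literal, \201 is consumed as an octal escape when the source is parsed, so the backslash and digits are already gone by the time the variable exists. The variable names are only for illustration.

```python
from pathlib import PureWindowsPath

normal = 'D:TestData\20190917'   # \201 becomes the single character U+0081
raw = r'D:TestData\20190917'     # the backslash and all digits are kept

print(repr(normal))
print(repr(raw))

# The normal literal no longer contains any backslash to recover.
print('\\' in normal, '\\' in raw)

# A raw string can be handed to pathlib for further handling.
print(PureWindowsPath(raw).name)
```

The fix therefore has to happen where the literal is written (raw string, escaped backslashes, or forward slashes), or the path has to come from input or a file, where no escape processing takes place.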
I'm writing a generic backup application with the os module and pickle. So far I've tried the code below to see if something is a file or a directory (based on its string input and not its physical contents).
import os, re

def test(path):
    prog = re.compile("^[-\w,\s]+.[A-Za-z]{3}$")
    result = prog.match(path)
    if os.path.isfile(path) or result:
        print "is file"
    elif os.path.isdir(path):
        print "is directory"
    else: print "I dont know"
Problems
test("C:/treeOfFunFiles/")
is directory
test("/beach.jpg")
I dont know
test("beach.jpg")
I dont know
test("/directory/")
I dont know
Desired Output
test("C:/treeOfFunFiles/")
is directory
test("/beach.jpg")
is file
test("beach.jpg")
is file
test("/directory/")
is directory
Resources
Test filename with regular expression
Python RE library
Validating file types by regular expression
What regular expression should I be using to tell the difference between what might be a file and what might be a directory? Or is there a different way to go about this?
The os module provides methods to check whether or not a path is a file or a directory. It is advisable to use this module over regular expressions.
>>> import os
>>> print os.path.isfile(r'/Users')
False
>>> print os.path.isdir(r'/Users')
True
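For Python 3, the same checks can be wrapped in a small function. This is a sketch: the name classify is hypothetical, and a temporary directory is used so the example does not depend on any particular layout on disk.

```python
import os
import tempfile


def classify(path):
    """Ask the OS what an existing path is."""
    if os.path.isfile(path):
        return "file"
    if os.path.isdir(path):
        return "directory"
    return "unknown"


# Build a throwaway directory containing one empty file.
tmp_dir = tempfile.mkdtemp()
tmp_file = os.path.join(tmp_dir, "beach.jpg")
open(tmp_file, "w").close()

print(classify(tmp_dir))
print(classify(tmp_file))
print(classify(os.path.join(tmp_dir, "missing")))
```

Note that this only works for paths that exist; a name that is not on disk yet is reported as unknown.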
This might help someone, I had the exact same need and I used the following regular expression to test whether an input string is a directory, file or neither:
for generic file:
^(\/+\w{0,}){0,}\.\w{1,}$
for generic directory:
^(\/+\w{0,}){0,}$
So the generated python function looks like :
import os, re

def check_input(path):
    check_file = re.compile(r"^(\/+\w{0,}){0,}\.\w{1,}$")
    check_directory = re.compile(r"^(\/+\w{0,}){0,}$")
    if check_file.match(path):
        print("It is a file.")
    elif check_directory.match(path):
        print("It is a directory")
    else:
        print("It is neither")
Example:
check_input("/foo/bar/file.xyz") prints -> It is a file.
check_input("/foo/bar/directory") prints -> It is a directory
check_input("Random gibberish") prints -> It is neither
This layer of security of input may be reinforced later by the os.path.isfile() and os.path.isdir() built-in functions as Mr.Squig kindly showed but I'd bet this preliminary test may save you a few microseconds and boost your script performance.
PS: While using this piece of code, I noticed I missed a huge use case: paths that contain special characters such as the dash "-", which is widely used. To solve this I replaced \w{0,}, which only allows word characters (letters, digits and underscore), with .{0,}, which matches any character. This is more of a workaround than a solution. But that's all I have for now.
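If the decision really has to be made from the string alone, a regex is not strictly needed. The sketch below is a heuristic, not a reliable test: guess_kind is a hypothetical name, and it assumes that a trailing separator means a directory and that anything with an extension is a file.

```python
import os


def guess_kind(path):
    """Guess from the string alone; the path is never looked up on disk."""
    if path.endswith(("/", "\\")):
        return "directory"
    if os.path.splitext(path)[1]:
        # A non-empty extension such as ".jpg" suggests a file.
        return "file"
    return "unknown"


for candidate in ["C:/treeOfFunFiles/", "/beach.jpg", "beach.jpg", "/directory/"]:
    print(candidate, "->", guess_kind(candidate))
```

This reproduces the desired output from the question, but a directory named like "photos.old" or a file without an extension will be misjudged.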
In a character class, if present and meant as a hyphen, the - needs to either be the first/last character, or escaped \- so change "^[\w-,\s]+\.[A-Za-z]{3}$" to "^[-\w,\s]+\.[A-Za-z]{3}$" for instance.
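A short sketch of the corrected pattern in use, written as a raw string so the backslashes reach the regex engine unchanged. The sample file names are made up for illustration.

```python
import re

# The hyphen is placed first in the character class, so it is taken literally.
pattern = re.compile(r"^[-\w,\s]+\.[A-Za-z]{3}$")

print(bool(pattern.match("my-file, v2.txt")))
print(bool(pattern.match("beach.jpeg")))
print(bool(pattern.match("no_extension")))
```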
Otherwise, I think using regexes to determine if something looks like a filename/directory is pointless...
/dev/fd0 isn't a file or directory for instance
~/comm.pipe could look like a file but is a named pipe
~/images/test is a symbolic link to a file called '~/images/holiday/photo1.jpg'
Have a look at the os.path module, which has functions that ask the OS what something is...: