Opening filenames with colon (":") in Windows 7 - python

I am writing a Python app that should run in both Windows and Linux, but am having an issue with one of the file naming conventions. I need to load a JSON file that has a colon in its name. However, with Windows 7 it doesn't seem to be possible, at least not directly.
These files are stored on a NFS drive thus we are able to see it in Windows 7, but cannot open them.
Does anyone have a workaround as to how it may be possible to read the JSON file containing a colon in Windows 7 using Python? One possible workaround we have (that we'd like to avoid) is to SSH into a Linux box, echo the contents and send it back.
Obviously if anyone else has another method that would be great. Windows XP is able to open them and read them just fine - this is just an issue with Win 7.
As an update, we found out that we can access our NFS/AFS servers through the web. So we ended up using urllib2 urlopen for all the JSON files that contain invalid characters. Seems to be working well so far.

To quote from http://support.microsoft.com/kb/289627:
Windows and UNIX operating systems have restrictions on valid characters that can be used in a file name. The list of illegal characters for each operating system, however, is different. For example, a UNIX file name can use a colon (:), but a Windows file name cannot use a colon (:). ...
To enable file name character mapping, create a character translation file and add a registry entry.
For example, the following maps the UNIX colon (:) to a Windows dash (-):
0x3a : 0x2d ; replace client : with - on server
When you have created the file name character translation file, you must specify its name location in the system registry. To register the path and name of the file:
Use Registry Editor to locate the following registry key:
HKEY_LOCAL_MACHINE\Software\Microsoft\Server For NFS\CurrentVersion\Mapping
Edit the CharacterTranslation (REG_SZ) value.
Enter the fully qualified path name of the file name character translation file. For example, C:\Sfu\CTrans.txt.

Related

FileNotFoundError when writing txt file [duplicate]

This is a screenshot of the execution:
As you see, the error says that the directory "JSONFiles/Apartment/Rent/dubizzleabudhabiproperty" is not there.
But look at my files, please:
The folder is definitely there.
Update 2
The code
self.file = open("JSONFiles/"+ item["category"]+"/" + item["action"]+"/"+ item['source']+"/"+fileName + '.json', 'wb') # Create a new JSON file with the name = fileName parameter
line = json.dumps(dict(item)) # Change the item to a JSON format in one line
self.file.write(line) # Write the item to the file
UPDATE
When I change the file name to a smaller one, it works, so the problem is because of the length of the path. what is the solution please?
Regular DOS paths are limited to MAX_PATH (260) characters, including the string's terminating NUL character. You can exceed this limit by using an extended-length path that starts with the \\?\ prefix. This path must be a Unicode string, fully qualified, and only use backslash as the path separator. Per Microsoft's file system functionality comparison, the maximum extended path length is 32760 characters. A individual file or directory name can be up to 255 characters (127 for the UDF filesystem). Extended UNC paths are also supported as \\?\UNC\server\share.
For example:
import os
def winapi_path(dos_path, encoding=None):
if (not isinstance(dos_path, unicode) and
encoding is not None):
dos_path = dos_path.decode(encoding)
path = os.path.abspath(dos_path)
if path.startswith(u"\\\\"):
return u"\\\\?\\UNC\\" + path[2:]
return u"\\\\?\\" + path
path = winapi_path(os.path.join(u"JSONFiles",
item["category"],
item["action"],
item["source"],
fileName + ".json"))
>>> path = winapi_path("C:\\Temp\\test.txt")
>>> print path
\\?\C:\Temp\test.txt
See the following pages on MSDN:
Naming Files, Paths, and Namespaces
Defining an MS-DOS Device Name
Kernel object namespaces
Background
Windows calls the NT runtime library function RtlDosPathNameToRelativeNtPathName_U_WithStatus to convert a DOS path to a native NT path. If we open (i.e. CreateFile) the above path with a breakpoint set on the latter function, we can see how it handles a path that starts with the \\?\ prefix.
Breakpoint 0 hit
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus:
00007ff9`d1fb5880 4883ec58 sub rsp,58h
0:000> du #rcx
000000b4`52fc0f60 "\\?\C:\Temp\test.txt"
0:000> r rdx
rdx=000000b450f9ec18
0:000> pt
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus+0x66:
00007ff9`d1fb58e6 c3 ret
The result replaces \\?\ with the NT DOS devices prefix \??\, and copies the string into a native UNICODE_STRING:
0:000> dS b450f9ec18
000000b4`536b7de0 "\??\C:\Temp\test.txt"
If you use //?/ instead of \\?\, then the path is still limited to MAX_PATH characters. If it's too long, then RtlDosPathNameToRelativeNtPathName returns the status code STATUS_NAME_TOO_LONG (0xC0000106).
If you use \\?\ for the prefix but use slash in the rest of the path, Windows will not translate the slash to backslash for you:
Breakpoint 0 hit
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus:
00007ff9`d1fb5880 4883ec58 sub rsp,58h
0:000> du #rcx
0000005b`c2ffbf30 "\\?\C:/Temp/test.txt"
0:000> r rdx
rdx=0000005bc0b3f068
0:000> pt
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus+0x66:
00007ff9`d1fb58e6 c3 ret
0:000> dS 5bc0b3f068
0000005b`c3066d30 "\??\C:/Temp/test.txt"
Forward slash is a valid object name character in the NT namespace. It's reserved by Microsoft filesystems, but you can use a forward slash in other named kernel objects, which get stored in \BaseNamedObjects or \Sessions\[session number]\BaseNamedObjects. Also, I don't think the I/O manager enforces the policy on reserved characters in device and filenames. It's up to the device. Maybe someone out there has a Windows device that implements a namespace that allows forward slash in names. At the very least you can create DOS device names that contain a forward slash. For example:
>>> kernel32 = ctypes.WinDLL('kernel32')
>>> kernel32.DefineDosDeviceW(0, u'My/Device', u'C:\\Temp')
>>> os.path.exists(u'\\\\?\\My/Device\\test.txt')
True
You may be wondering what \?? signifies. This used to be an actual directory for DOS device links in the object namespace, but starting with NT 5 (or NT 4 w/ Terminal Services) this became a virtual prefix. The object manager handles this prefix by first checking the logon session's DOS device links in the directory \Sessions\0\DosDevices\[LOGON_SESSION_ID] and then checking the system-wide DOS device links in the \Global?? directory.
Note that the former is a logon session, not a Windows session. The logon session directories are all under the DosDevices directory of Windows session 0 (i.e. the services session in Vista+). Thus if you have a mapped drive for a non-elevated logon, you'll discover that it's not available in an elevated command prompt, because your elevated token is actually for a different logon session.
An example of a DOS device link is \Global??\C: => \Device\HarddiskVolume2. In this case the DOS C: drive is actually a symbolic link to the HarddiskVolume2 device.
Here's a brief overview of how the system handles parsing a path to open a file. Given we're calling WinAPI CreateFile, it stores the translated NT UNICODE_STRING in an OBJECT_ATTRIBUTES structure and calls the system function NtCreateFile.
0:000> g
Breakpoint 1 hit
ntdll!NtCreateFile:
00007ff9`d2023d70 4c8bd1 mov r10,rcx
0:000> !obja #r8
Obja +000000b450f9ec58 at 000000b450f9ec58:
Name is \??\C:\Temp\test.txt
OBJ_CASE_INSENSITIVE
NtCreateFile calls the I/O manager function IoCreateFile, which in turn calls the undocumented object manager API ObOpenObjectByName. This does the work of parsing the path. The object manager starts with \??\C:\Temp\test.txt. Then it replaces that with \Global??\C:Temp\test.txt. Next it parses up to the C: symbolic link and has to start over (reparse) the final path \Device\HarddiskVolume2\Temp\test.txt.
Once the object manager gets to the HarddiskVolume2 device object, parsing is handed off to the I/O manager, which implements the Device object type. The ParseProcedure of an I/O Device creates the File object and an I/O Request Packet (IRP) with the major function code IRP_MJ_CREATE (an open/create operation) to be processed by the device stack. This is sent to the device driver via IoCallDriver. If the device implements reparse points (e.g. junction mountpoints, symbolic links, etc) and the path contains a reparse point, then the resolved path has to be resubmitted to the object manager to be parsed from the start.
The device driver will use the SeChangeNotifyPrivilege (almost always present and enabled) of the process token (or thread if impersonating) to bypass access checks while traversing directories. However, ultimately access to the device and target file has to be allowed by a security descriptor, which is verified via SeAccessCheck. Except simple filesystems such as FAT32 don't support file security.
below is Python 3 version regarding #Eryk Sun's solution.
def winapi_path(dos_path, encoding=None):
if (not isinstance(dos_path, str) and encoding is not None):
dos_path = dos_path.decode(encoding)
path = os.path.abspath(dos_path)
if path.startswith(u"\\\\"):
return u"\\\\?\\UNC\\" + path[2:]
return u"\\\\?\\" + path
#Python 3 renamed the unicode type to str, the old str type has been replaced by bytes. NameError: global name 'unicode' is not defined - in Python 3
Adding the solution that helped me fix a similar issue:
python version = 3.9, windows version = 10 pro.
I had an issue with the filename itself as it was too long for python's open built-in method. The error I got is that the path simply doesn't exist, although I use the 'w+' mode for open (which is supposed to open a new file regardless whether it exists or not).
I found this guide which solved the problem with a quick change to window's Registry Editor (specifically the Group Policy). Scroll down to the 'Make Windows 10 Accept Long File Paths' headline.
Don't forget to update your OS group policy to take effect immediately, a guide can be found here.
Hope this helps future searches as this post is quite old.
There can be multiple reasons for you getting this error. Please make sure of the following:
The parent directory of the folder (JSONFiles) is the same as the directory of the Python script.
Even though the folder exists it does not mean the individual file does. Verify the same and make sure the exact file name matches the one that your Python code is trying to access.
If you still face an issue, share the result of "dir" command on the innermost folder you are trying to access.
it works for me
import os
str1=r"C:\Users\sandeepmkwana\Desktop\folder_structure\models\manual\demodfadsfljdskfjslkdsjfklaj\inner-2djfklsdfjsdklfj\inner3fadsfksdfjdklsfjksdgjl\inner4dfhasdjfhsdjfskfklsjdkjfleioreirueewdsfksdmv\anotherInnerfolder4aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\5qbbbbbbbbbbbccccccccccccccccccccccccsssssssssssssssss\tmp.txt"
print(len(str1)) #346
path = os.path.abspath(str1)
if path.startswith(u"\\\\"):
path=u"\\\\?\\UNC\\"+path[2:]
else:
path=u"\\\\?\\"+path
with open(path,"r+") as f:
print(f.readline())
If you get a long path (more than 258 characters) issue in Windows, then try this.

Opening files in python3 for windows10

Hi I cannot open files in python 3 actually I have a problem with the path. I don't know how to write the path for it.:/ For example I have a file(bazi.py) in folder(w8) in driver(F). How should i write it's path. Please help me im an amateur:/
In Windows, there are a couple additional ways of referencing a file. That is because natively, Windows file path employs the backslash "" instead of the slash. Python allows using both in a Windows system, but there are a couple of pitfalls to watch out for. To sum them up:
Python lets you use OS-X/Linux style slashes "/" even in Windows. Therefore, you can refer to the file as 'C:/Users/narae/Desktop/alice.txt'. RECOMMENDED.
If using backslash, because it is a special character in Python, you must remember to escape every instance: 'C:\Users\narae\Desktop\alice.txt'
Alternatively, you can prefix the entire file name string with the rawstring marker "r": r'C:\Users\narae\Desktop\alice.txt'. That way, everything in the string is interpreted as a literal character, and you don't have to escape every backslash.
File Name Shortcuts and CWD (Current Working Directory)
So, using the full directory path and file name always works; you should be using this method. However, you might have seen files called by their name only, e.g., 'alice.txt' in Python. How is it done?
The concept of Current Working Directory (CWD) is crucial here. You can think of it as the folder your Python is operating inside at the moment. So far we have been using the absolute path, which begins from the topmost directory. But if your file reference does not start from the top (e.g., 'alice.txt', 'ling1330/alice.txt'), Python assumes that it starts in the CWD (a "relative path").
using the os.path.abspath function will translate the path to a version appropriate for the operating system.
os.path.abspath(r'F:\w8\bazi.py')

How do I write to a file with spaces and special characters in file name?

I'm trying to write to a file called
data/STO: INVE-B-TIME_SERIES_DAILY_ADJUSTED-1.csv
using the following python code on windows
overwrite = "w" # or write new
f = open(OutputPath(copyNr), overwrite)
f.write(response.content.decode('utf-8'))
f.close()
print(f"Wrote to file: {OutputPath(copyNr)}")
The output in the console is correct, but the file written to results in only data/STO so the path seems to be clipped. I've tried to escape the characters using the method from this SO answer but that gave me an invalid_argument exception for the following filename:
data/STO\\:\\ INVE-B-TIME_SERIES_DAILY_ADJUSTED-1.csv
I get the same error when I remove the space and it still seems to clip at :. How do I include such characters in a filename?
You can't. Colons are only allowed on the volumeseperator - anywhere else there are illegal.
Allowed:
d:/somedir/somefile.txt
Illegal:
./some:dir/somefile.txt
and as seperators you can either use '\\' or '/' - both should be understood.
This is not a python limitation but a operating system limitation.
See f.e. What characters are forbidden in Windows and Linux directory names?
Windows API will not allow colon in filename, it is reserved for drive letter, use a different sign.

Can I use Python scripting to fill out fields on a program locally in my computer?

I'm trying to use AgentRansack, a search program, to find all files in a directory containing certain text. I could write a script to open each file in the directory and search the text. However, just entering the directory, the term to search and hitting the start button in agent ransack should do the job.
Is it possible to use scripts in order to fill out forms in programs locally (like AgentRansack)?
![AgentRansack program template]https://img.utdstc.com/screen/windows/thumb/agent-ransack.jpg!
I have had luck before with WinGUIAuto. It uses Sendkeys method to literally send keys to other Windows application. However, a simple focus grab between windows applications could mess with the whole process, so not a production grade stuff.
You can use python to call windows' findstr :
findstr /spin /c:"string" [files]
The parameters have the following meanings:
s = recursive
p = skip non-printable characters
i = case insensitive
n = print line numbers
And the string to search for is the bit you put in quotes after /c:

Pathname too long to open?

This is a screenshot of the execution:
As you see, the error says that the directory "JSONFiles/Apartment/Rent/dubizzleabudhabiproperty" is not there.
But look at my files, please:
The folder is definitely there.
Update 2
The code
self.file = open("JSONFiles/"+ item["category"]+"/" + item["action"]+"/"+ item['source']+"/"+fileName + '.json', 'wb') # Create a new JSON file with the name = fileName parameter
line = json.dumps(dict(item)) # Change the item to a JSON format in one line
self.file.write(line) # Write the item to the file
UPDATE
When I change the file name to a smaller one, it works, so the problem is because of the length of the path. what is the solution please?
Regular DOS paths are limited to MAX_PATH (260) characters, including the string's terminating NUL character. You can exceed this limit by using an extended-length path that starts with the \\?\ prefix. This path must be a Unicode string, fully qualified, and only use backslash as the path separator. Per Microsoft's file system functionality comparison, the maximum extended path length is 32760 characters. A individual file or directory name can be up to 255 characters (127 for the UDF filesystem). Extended UNC paths are also supported as \\?\UNC\server\share.
For example:
import os
def winapi_path(dos_path, encoding=None):
if (not isinstance(dos_path, unicode) and
encoding is not None):
dos_path = dos_path.decode(encoding)
path = os.path.abspath(dos_path)
if path.startswith(u"\\\\"):
return u"\\\\?\\UNC\\" + path[2:]
return u"\\\\?\\" + path
path = winapi_path(os.path.join(u"JSONFiles",
item["category"],
item["action"],
item["source"],
fileName + ".json"))
>>> path = winapi_path("C:\\Temp\\test.txt")
>>> print path
\\?\C:\Temp\test.txt
See the following pages on MSDN:
Naming Files, Paths, and Namespaces
Defining an MS-DOS Device Name
Kernel object namespaces
Background
Windows calls the NT runtime library function RtlDosPathNameToRelativeNtPathName_U_WithStatus to convert a DOS path to a native NT path. If we open (i.e. CreateFile) the above path with a breakpoint set on the latter function, we can see how it handles a path that starts with the \\?\ prefix.
Breakpoint 0 hit
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus:
00007ff9`d1fb5880 4883ec58 sub rsp,58h
0:000> du #rcx
000000b4`52fc0f60 "\\?\C:\Temp\test.txt"
0:000> r rdx
rdx=000000b450f9ec18
0:000> pt
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus+0x66:
00007ff9`d1fb58e6 c3 ret
The result replaces \\?\ with the NT DOS devices prefix \??\, and copies the string into a native UNICODE_STRING:
0:000> dS b450f9ec18
000000b4`536b7de0 "\??\C:\Temp\test.txt"
If you use //?/ instead of \\?\, then the path is still limited to MAX_PATH characters. If it's too long, then RtlDosPathNameToRelativeNtPathName returns the status code STATUS_NAME_TOO_LONG (0xC0000106).
If you use \\?\ for the prefix but use slash in the rest of the path, Windows will not translate the slash to backslash for you:
Breakpoint 0 hit
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus:
00007ff9`d1fb5880 4883ec58 sub rsp,58h
0:000> du #rcx
0000005b`c2ffbf30 "\\?\C:/Temp/test.txt"
0:000> r rdx
rdx=0000005bc0b3f068
0:000> pt
ntdll!RtlDosPathNameToRelativeNtPathName_U_WithStatus+0x66:
00007ff9`d1fb58e6 c3 ret
0:000> dS 5bc0b3f068
0000005b`c3066d30 "\??\C:/Temp/test.txt"
Forward slash is a valid object name character in the NT namespace. It's reserved by Microsoft filesystems, but you can use a forward slash in other named kernel objects, which get stored in \BaseNamedObjects or \Sessions\[session number]\BaseNamedObjects. Also, I don't think the I/O manager enforces the policy on reserved characters in device and filenames. It's up to the device. Maybe someone out there has a Windows device that implements a namespace that allows forward slash in names. At the very least you can create DOS device names that contain a forward slash. For example:
>>> kernel32 = ctypes.WinDLL('kernel32')
>>> kernel32.DefineDosDeviceW(0, u'My/Device', u'C:\\Temp')
>>> os.path.exists(u'\\\\?\\My/Device\\test.txt')
True
You may be wondering what \?? signifies. This used to be an actual directory for DOS device links in the object namespace, but starting with NT 5 (or NT 4 w/ Terminal Services) this became a virtual prefix. The object manager handles this prefix by first checking the logon session's DOS device links in the directory \Sessions\0\DosDevices\[LOGON_SESSION_ID] and then checking the system-wide DOS device links in the \Global?? directory.
Note that the former is a logon session, not a Windows session. The logon session directories are all under the DosDevices directory of Windows session 0 (i.e. the services session in Vista+). Thus if you have a mapped drive for a non-elevated logon, you'll discover that it's not available in an elevated command prompt, because your elevated token is actually for a different logon session.
An example of a DOS device link is \Global??\C: => \Device\HarddiskVolume2. In this case the DOS C: drive is actually a symbolic link to the HarddiskVolume2 device.
Here's a brief overview of how the system handles parsing a path to open a file. Given we're calling WinAPI CreateFile, it stores the translated NT UNICODE_STRING in an OBJECT_ATTRIBUTES structure and calls the system function NtCreateFile.
0:000> g
Breakpoint 1 hit
ntdll!NtCreateFile:
00007ff9`d2023d70 4c8bd1 mov r10,rcx
0:000> !obja #r8
Obja +000000b450f9ec58 at 000000b450f9ec58:
Name is \??\C:\Temp\test.txt
OBJ_CASE_INSENSITIVE
NtCreateFile calls the I/O manager function IoCreateFile, which in turn calls the undocumented object manager API ObOpenObjectByName. This does the work of parsing the path. The object manager starts with \??\C:\Temp\test.txt. Then it replaces that with \Global??\C:Temp\test.txt. Next it parses up to the C: symbolic link and has to start over (reparse) the final path \Device\HarddiskVolume2\Temp\test.txt.
Once the object manager gets to the HarddiskVolume2 device object, parsing is handed off to the I/O manager, which implements the Device object type. The ParseProcedure of an I/O Device creates the File object and an I/O Request Packet (IRP) with the major function code IRP_MJ_CREATE (an open/create operation) to be processed by the device stack. This is sent to the device driver via IoCallDriver. If the device implements reparse points (e.g. junction mountpoints, symbolic links, etc) and the path contains a reparse point, then the resolved path has to be resubmitted to the object manager to be parsed from the start.
The device driver will use the SeChangeNotifyPrivilege (almost always present and enabled) of the process token (or thread if impersonating) to bypass access checks while traversing directories. However, ultimately access to the device and target file has to be allowed by a security descriptor, which is verified via SeAccessCheck. Except simple filesystems such as FAT32 don't support file security.
below is Python 3 version regarding #Eryk Sun's solution.
def winapi_path(dos_path, encoding=None):
if (not isinstance(dos_path, str) and encoding is not None):
dos_path = dos_path.decode(encoding)
path = os.path.abspath(dos_path)
if path.startswith(u"\\\\"):
return u"\\\\?\\UNC\\" + path[2:]
return u"\\\\?\\" + path
#Python 3 renamed the unicode type to str, the old str type has been replaced by bytes. NameError: global name 'unicode' is not defined - in Python 3
Adding the solution that helped me fix a similar issue:
python version = 3.9, windows version = 10 pro.
I had an issue with the filename itself as it was too long for python's open built-in method. The error I got is that the path simply doesn't exist, although I use the 'w+' mode for open (which is supposed to open a new file regardless whether it exists or not).
I found this guide which solved the problem with a quick change to window's Registry Editor (specifically the Group Policy). Scroll down to the 'Make Windows 10 Accept Long File Paths' headline.
Don't forget to update your OS group policy to take effect immediately, a guide can be found here.
Hope this helps future searches as this post is quite old.
There can be multiple reasons for you getting this error. Please make sure of the following:
The parent directory of the folder (JSONFiles) is the same as the directory of the Python script.
Even though the folder exists it does not mean the individual file does. Verify the same and make sure the exact file name matches the one that your Python code is trying to access.
If you still face an issue, share the result of "dir" command on the innermost folder you are trying to access.
it works for me
import os
str1=r"C:\Users\sandeepmkwana\Desktop\folder_structure\models\manual\demodfadsfljdskfjslkdsjfklaj\inner-2djfklsdfjsdklfj\inner3fadsfksdfjdklsfjksdgjl\inner4dfhasdjfhsdjfskfklsjdkjfleioreirueewdsfksdmv\anotherInnerfolder4aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\5qbbbbbbbbbbbccccccccccccccccccccccccsssssssssssssssss\tmp.txt"
print(len(str1)) #346
path = os.path.abspath(str1)
if path.startswith(u"\\\\"):
path=u"\\\\?\\UNC\\"+path[2:]
else:
path=u"\\\\?\\"+path
with open(path,"r+") as f:
print(f.readline())
If you get a long path (more than 258 characters) issue in Windows, then try this.

Categories

Resources