I have two strings:
C:\Data
and another folder
Foo1
I need, the windows output to be
C:\Data\Foo1
and the Linux output to be
/data/foo1
assuming /data is in linux. Is there any constant separator that can be used in Python, that makes it easy to use irrespective of underlying OS?
Yes, Python provides os.sep, which is that character, but for your purpose the function os.path.join() is what you are looking for.
>>> import os
>>> os.path.join("data", "foo1")
'data/foo1'
os.path.normpath() will normalize a path correctly for Linux and Windows. FYI, Windows OS calls can use either slash, but paths should be normalized before being displayed to the user.
os.path.join() is always the better choice. As Mark Tolonen wrote (my +1 to him), you can also use a normal slash on Windows, and you should prefer it when you have to write a path explicitly. Avoid backslashes in Python paths altogether; otherwise you have to double them in strings or use r'raw strings' to suppress backslash interpretation. Without that, 'c:\for\a_path\like\this' actually contains the \f, \a, and \t escape sequences, which you may not notice at the time of writing and which may become a source of headaches later.
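A minimal sketch of the points above. It uses the ntpath and posixpath modules (the Windows and POSIX flavours behind os.path) so that the result is the same on any machine; the folder names come from the question, and "c:\table\new" is a made-up path chosen only because it hides two escape sequences.

```python
import ntpath      # Windows flavour of os.path, importable on any OS
import posixpath   # POSIX flavour of os.path, importable on any OS

# os.path is an alias for one of these two modules, chosen by platform.
windows_path = ntpath.join(r"C:\Data", "Foo1")
linux_path = posixpath.join("/data", "foo1")

# Forward slashes in a literal need no escaping; normpath converts them.
from_slashes = ntpath.normpath("C:/Data/Foo1")

# Unescaped backslashes can silently turn into control characters.
risky = "c:\table\new"   # contains a TAB (\t) and a newline (\n)
safe = r"c:\table\new"   # raw string keeps both backslashes

print(windows_path)
print(linux_path)
print(len(risky), len(safe))
```

In real code you would normally call os.path.join() and let Python pick the flavour for the platform it runs on.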
Related
using re.escape() on this directory:
C:\Users\admin\code
Should theoretically return this, right?
C:\\Users\\admin\\code
However, what I actually get is this:
C\:\\Users\\admin\\code
Notice the backslash immediately after C. This makes the string unusable, and trying to use directory.replace('\', '') just breaks Python, because it can't deal with a single-backslash string and treats everything after it as part of the string.
Any ideas?
Update
This was a dumb question :p
No, it should not. Its help says "Escape all the characters in pattern except ASCII letters, numbers and '_'"
What you are reporting is the output of calling the print function on the resulting string. In the console, if you type directory and press enter, it would give something like: C\\:\\\\Users\\\\admin\\\\code. Using directory.replace('\\','') would replace all backslashes. For example: directory.replace('\\','x') gives Cx:xxUsersxxadminxxcode. What might work in this case is replacing the backslash-colon pair with ':', i.e. directory.replace('\\:',':'). This will work.
However, I would suggest doing something else. A neat way to work with Windows directories in Python is to use forward slashes. Python and the OS will work out a way to understand your paths with forward slashes. Further, if you aren't using absolute paths, your code will be portable to Unix-style OSes as far as the paths are concerned.
It also seems to me that you are calling re.escape unnecessarily. If printing the directory gives you C:\Users\admin\code, then it's a perfectly fine directory to use already, and you don't need to escape it. It's already done. If it wasn't escaped, print('C:\Users\admin\code') would give something like C:\Usersdmin\code, since \a has a special meaning (beep).
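The difference between what the console echoes and what the string contains can be checked directly. This is only a sketch: the exact set of characters re.escape() escapes (the colon in particular) has varied between Python versions, so the code below avoids depending on it.

```python
import re

directory = r"C:\Users\admin\code"
escaped = re.escape(directory)

# repr() (what the console echoes) doubles every backslash;
# print() shows the characters the string really contains.
print(repr(escaped))
print(escaped)

# The escaped form is a regex pattern that matches the original text exactly.
matches = re.fullmatch(escaped, directory) is not None
print(matches)

# To use the directory as a path, skip re.escape() entirely.
print(directory)
```

re.escape() is meant for building regular expressions, not for preparing paths for the filesystem.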
I am using replace string method in Python and I am finding something that I cannot understand.
When converting the way a folder is written in Python to Windows notation, I find that the replace method changes each double / into a double \ instead of just one \ as intended.
folder_im_wdows = folder_im_wdows.replace("//","\\")
But the most surprising part is that when I try the following workaround
folder_im_wdows = folder_im_wdows.replace("//",chr(92))
Python does the same...
The original variable is: //xxxxx//xxxx//xxxx//xxxx//xxx//xxxxx
And I want to get -> \xxx\x\x\x
What's happening with replace method?
This is because Python's interactive interpreter echoes the repr of a string, in which backslashes are shown escaped.
Example from Python's interactive interpreter:
>>> s = "abc//def//fgh"
>>> s.replace("//", "\\")
'abc\\def\\fgh'
>>> print(s.replace("//", "\\"))
abc\def\fgh
Also, you do need to write \\ and not just \, because the backslash character itself has to be escaped in a string literal.
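A quick way to convince yourself that only one backslash is stored is to count characters instead of looking at the echoed repr. A small sketch, using a shortened version of the variable from the question:

```python
folder = "//xxxxx//xxxx//xxx"

# "\\" in source code is a single backslash character, the same as chr(92).
windows_style = folder.replace("//", "\\")

print(repr(windows_style))   # repr doubles the backslashes for display
print(windows_style)         # print shows the actual characters

single_backslash_count = windows_style.count("\\")
print(single_backslash_count)
```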
Use os.path for working with path names:
import os
os.path.normpath('C:/Users/Bob/My Documents')
os.path.abspath would do the job too (it uses os.path.normpath)
Note: this requires the host to be Windows; if that's not the case, you can use ntpath.normpath directly.
https://docs.python.org/library/os.path.html#os.path.normpath
Avoid regexes, replaces and all that. You're going to get it wrong in some subtle way.
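As a sketch of the ntpath route mentioned above, the following runs on any OS and always applies Windows rules. The second input path is invented purely to show what else normpath tidies up.

```python
import ntpath  # the Windows implementation of os.path, usable on any host

# Forward slashes become backslashes.
normalized = ntpath.normpath("C:/Users/Bob/My Documents")

# Repeated separators, "." and ".." segments are collapsed as well.
collapsed = ntpath.normpath("C:/Users//Bob/./Docs/../My Documents")

print(normalized)
print(collapsed)
```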
I am using python v3.6 on Windows 10. When specifying a string to represent a directory location, what is the difference between the 2 approaches below?
folder_location = 'C:\\Users\\username\\Dropbox\\Inv'
folder_location = 'C:/Users/username/Dropbox/Inv'
This is a follow-up question to another question I just posted. My problem was solved when I used \\ instead of /.
What is wrong with this selenium firefox profile to download file into customized folder?
On Unix systems, the folder separator is /, while on Windows systems, the separator is \. Unfortunately, this \ is also an escape character in most programming languages and text-based formats (including C, Python and many others). Strangely enough, a / character is not allowed in Windows file names.
So Python on Windows is designed to accept both / and \ as the folder separator when dealing with the filesystem, for convenience. But the \ must be escaped by another \ (unless of course you use raw strings like r'backslashes are now normal characters \\\ !')
Selenium, on the other hand, will write values into Firefox preferences, which, unlike Python, expects the appropriate kind of separator. That's why using forward slashes does not work in your example.
Windows uses backslashes as the file/folder separator by default; the \\ is an escaped \. The POSIX-compliant file/folder separator / is also supported by the Windows API. But the library you use (which is not recognizable in your example) also needs to support it.
The standard Windows path separator is the backslash \. But the backslash is also used for escape sequences in string literals, so, for example, \n is a newline.
For that reason, you would rather not use backslashes in your path: if a folder name starts with a letter that forms an escape sequence with the preceding backslash, you will run into trouble.
To use the native backslash separator on Windows you have two options. You can use a raw string, in which all special characters are read literally: path = r"C:\user\myFolder". Or you can escape each backslash with the escape character, which turns out to be the backslash too: path = "C:\\user\\myFolder".
But going back to DOS, Windows has accepted forward slashes in path strings too.
Python is able to accept both separators. It is advised to use the native formatting of your system.
If you want your script to work on both systems, try:
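The two spellings produce the same string, which a short check confirms. One caveat worth knowing, shown here as a sketch: a raw string literal cannot end in a single backslash, so a trailing separator has to be appended separately.

```python
raw = r"C:\user\myFolder"
escaped = "C:\\user\\myFolder"

# Both literals describe the same sequence of characters.
same = raw == escaped
print(same)

# r"C:\user\myFolder\" would be a syntax error, so add the last
# backslash with an ordinary (escaped) literal instead.
with_trailing = r"C:\user\myFolder" + "\\"
print(with_trailing)
```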
import os
if os.name == 'posix':
    path = '/net/myFolder/'
else:
    path = r'C:\Users\myFolder'
Windows inherited backslashes as a path separator from Microsoft DOS. DOS initially didn't support subdirectories and opted to use the (on US keyboards) easily typed slash / character for command line switches.
When they did introduce subdirectories in DOS 2, either slash / or backslash \ worked as a path separator, but to use slashes on the command line you had to reconfigure the switch character, a feature they later removed entirely.
Thus the command line for certain commands that look for switches without space in front (like dir/w) is the one place you can't use forward slashes (this has to do with the command line being passed as a single string, unlike POSIX which passes distinct arguments in a list). That, and poorly written code that tries things like splitting on backslash, not knowing that slash is also a path separator.
It's also sometimes complicated by either character having other meanings, such as \ being the escape character in string literals; that's why you use \\ unless you use a raw string r'foo\bar'.
The other path separator I know of is classic Mac OS, which uses colon :. Python handles these differences by including reasonable routines in os.path or pathlib.
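For the pathlib route, a minimal sketch using the "pure" path classes, which apply Windows or POSIX rules regardless of the machine the code runs on. The folder names are the ones from the first question.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The / operator joins segments using the flavour's own separator.
win = PureWindowsPath("C:/Data") / "Foo1"
posix = PurePosixPath("/data") / "foo1"

print(win)             # rendered with backslashes
print(win.as_posix())  # the same path rendered with forward slashes
print(posix)
print(win.parts)       # drive and root are kept together as the first part
```

In ordinary code, pathlib.Path picks the right flavour for the current platform automatically.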
Windows and Linux/macOS use different path separators: UNIX uses forward slashes (/) while Windows uses backslashes (\).
You should never type your own separators. Always use os.path.join or os.sep, which handle this for you based on the platform you're running on. Example:
import os
folder_location = os.path.join('C:\\', 'Users', 'username', 'Dropbox', 'Inv')
# or
folder_location = os.sep.join(['C:', 'Users', 'username', 'Dropbox', 'Inv'])  # os.sep supplies the separator after 'C:'
Also, with os.path.join you need to add the drive letter's trailing backslash yourself, as noted in the Python docs:
Note that on Windows, since there is a current directory for each drive, os.path.join("c:", "foo") represents a path relative to the current directory on drive C: (c:foo), not c:\foo.
Hard-coding a full path like this is usually useless, as C: will only work on Windows anyway. You will most likely want to use this later on using relative paths or paths that were fetched elsewhere and need to have segments added to them.
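The drive-letter behaviour quoted from the docs can be seen with ntpath, which applies Windows rules on any OS. A small sketch:

```python
import ntpath  # Windows path rules, regardless of the host OS

# Without the trailing backslash the result is relative to the
# current directory on drive C:.
drive_relative = ntpath.join("C:", "Users", "username")

# With it, the result is an absolute path.
absolute = ntpath.join("C:\\", "Users", "username")

print(drive_relative, ntpath.isabs(drive_relative))
print(absolute, ntpath.isabs(absolute))
```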
When I'm using Python 3 to launch a program via subprocess.call(), why do I need 4 backslashes in paths?
This is my code:
cmd = 'C:\\\\Windows\\\\System32\\\\cmd.exe'
cmd = shlex.split(cmd)
subprocess.call(cmd)
When I examine the command line of the launched cmd.exe instance with Task Manager, it shows the path correctly with only one backslash separating each path.
Because of this, I need this on Windows to make the paths work:
if platform.platform().startswith('Windows'):
    cmd = cmd.replace(os.sep, os.sep + os.sep)
is there a more elegant solution?
Part of the problem is that you're using shlex, which implements escaping rules used by Unix-ish shells. But you're running on Windows, whose command shells use different rules. That accounts for one level of needing to double backslashes (i.e., to worm around something shlex does that you didn't need to begin with).
That you're using a regular string instead of a raw string (r"...") accounts for the other level of needing to double backslashes, and 2*2 = 4. QED ;-)
This works fine on Windows:
cmd = subprocess.call(r"C:\Windows\System32\cmd.exe")
By the way, read the docs for subprocess.Popen() carefully: the Windows CreateProcess() API call requires a string for an argument. When you pass a sequence instead, Python tries to turn that sequence into a string, via rules explained in the docs. When feasible, it's better - on Windows - to pass the string you want directly.
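The two levels of doubling can be reproduced without launching anything. This sketch only calls shlex.split(); posix=False is an existing parameter that switches off the Unix-style backslash handling.

```python
import shlex

raw = r"C:\Windows\System32\cmd.exe"           # one backslash per separator
four = "C:\\\\Windows\\\\System32\\\\cmd.exe"  # the string holds two per separator

# Default (POSIX) mode treats each backslash as an escape and consumes it.
eaten = shlex.split(raw)
survives = shlex.split(four)

# Non-POSIX mode leaves backslashes alone.
untouched = shlex.split(raw, posix=False)

print(eaten)
print(survives)
print(untouched)
```

If the command is a single program path, passing the raw string straight to subprocess, as shown above, avoids shlex altogether.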
When you are creating the string, you need to double each backslash for escaping, and then when the string is passed to your shell, you need to double each backslash again. You can cut the number of backslashes in half by using a raw string:
cmd = r'C:\\Windows\\System32\\cmd.exe'
\ has special meaning - you're using it as part of an escape sequence. Double up the backslashes, and you have a literal backslash \.
The caveat is that, with only one pair of escaped backslashes, you still have only one literal backslash. You need to escape that backslash, too.
Alternatively, why not just use os.sep instead? You'll be able to ensure your code is more portable (since it'll use the system-specific separator), and you won't have to deal [directly] with escaping backslashes.
As John points out, 4 slashes aren't necessary when accessing files locally.
One place where 4 slashes are necessary is when connecting to (generally Windows) servers over SMB or CIFS.
Normally you would just use \\servername\share\
But each one of those backslashes needs to be escaped, hence the 4 backslashes before the server name.
You could also use subprocess.call() with a list:
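A short sketch of the UNC case. "servername" and "share" come from the text above; the "reports" folder is a made-up name for illustration.

```python
import ntpath  # Windows path rules, usable on any OS

# Four backslashes in an ordinary literal produce the two that UNC needs.
unc_escaped = "\\\\servername\\share\\reports"

# A raw string lets you write the path as it is normally typed.
unc_raw = r"\\servername\share\reports"

print(unc_escaped == unc_raw)

# ntpath treats \\server\share as the "drive" part of the path.
drive, rest = ntpath.splitdrive(unc_raw)
print(drive)
print(rest)
```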
import subprocess as sp
sp.call(['c:\\program files\\<path>'])
I have a program that includes an embedded Python 2.6 interpreter. When I invoke the interpreter, I call PySys_SetPath() to set the interpreter's import-path to the subdirectories installed next to my executable that contain my Python script files... like this:
PySys_SetPath("/path/to/my/program/scripts/type1:/path/to/my/program/scripts/type2");
(except that the path strings are dynamically generated based on the current location of my program's executable, not hard-coded as in the example above)
This works fine... except when the clever user decides to install my program underneath a folder that has a colon in its name. In that case, my PySys_SetPath() command ends up looking like this (note the presence of a folder named "path:to"):
PySys_SetPath("/path:to/my/program/scripts/type1:/path:to/my/program/scripts/type2");
... and this breaks all my Python scripts, because now Python looks for script files in "/path" and "to/my/program/scripts/type1" instead of in "/path:to/my/program/scripts/type1", and so none of the import statements work.
My question is, is there any fix for this issue, other than telling the user to avoid colons in his folder names?
I looked at the makepathobject() function in Python/sysmodule.c, and it doesn't appear to support any kind of quoting or escaping to handle literal colons.... but maybe I am missing some nuance.
The problem you're running into is that the PySys_SetPath function parses the string you pass using a colon as the delimiter. That parser sees each : character as delimiting a path, and there isn't a way around this (it can't be escaped).
However, you can bypass this by creating a list of the individual paths (each of which may contain colons) and use PySys_SetObject to set the sys.path:
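PyObject *path = PyList_New(0);
PyObject *entry = PyString_FromString("foo:bar");
PyList_Append(path, entry);     /* PyList_Append does not steal the reference */
Py_DECREF(entry);
PySys_SetObject("path", path);  /* sys now holds its own reference to the list */
Py_DECREF(path);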
Now the interpreter will see "foo:bar" as a distinct component of the sys.path.
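The same idea can be illustrated from the Python side. This is only a sketch of why the list form is safe, not a replacement for the C call: it splits a colon-joined string the way a colon-delimited parser would, and compares that with appending to a list. A copy of sys.path is used so the real import path is left untouched.

```python
import sys

folder_with_colon = "/path:to/my/program/scripts/type1"
second_folder = "/path:to/my/program/scripts/type2"

# Joining with ':' and splitting again cuts each folder name in two.
joined = ":".join([folder_with_colon, second_folder])
broken = joined.split(":")
print(broken)

# A list keeps every entry intact, colons included.
search_path = list(sys.path)          # copy, so sys.path itself is unchanged
search_path.append(folder_with_colon)
print(search_path[-1])
```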
Supporting colons in a file path opens up a huge can of worms on multiple operating systems; it is not a valid path character on Windows or Mac OS X, for example, and it doesn't seem like a particularly reasonable thing to support in the context of a scripting environment either for exactly this reason. I'm actually a bit surprised that Linux allows colon filenames too, especially since : is a very common path separator character.
You might try escaping the colon, i.e. converting /path:to/ to /path\:to/, and see if that works. Other than that, just tell the user to avoid using colons in their file names. They will run into all sorts of problems in quite a few different environments, and it's just a plain bad idea.