How to remove the last word and print the remaining path? - python

I want the value of the actual variable to be printed like this.
Variable value
rootdir = '/home/runner/TestP1'
Required value to be printed
/home/runner
I used the code like this
hello = rootdir.split("/")[-1]
print(hello)
Given value
TestP1
but I want to remove just the last word from that string and print the remaining path.

You should use the dirname function from os.path. From the documentation:
Return the directory name of pathname path. This is the first element of the pair returned by passing path to the function split().
>>> from os.path import dirname
>>> rootdir = '/home/runner/TestP1'
>>> dirname(rootdir)
'/home/runner'

split() gives a list of elements after separating the string by /.
So, you need to take all elements till the last and re-join:
hello = rootdir.split("/")[:-1]
hello = '/'.join(hello)
print(hello)
Alternatively, use rsplit() with a maximum of one split, so the string is split only at the last / from the right:
hello = rootdir.rsplit("/", 1)[0]
print(hello)
Further, if you are trying to extract the directory name, use @sayse's answer instead.
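The approaches above can be compared side by side. This is a minimal sketch, assuming POSIX-style paths; the extra value '/TestP1' is an illustrative path that is not in the question, chosen to show one case where plain string splitting and os.path.dirname differ.

```python
import os.path

rootdir = '/home/runner/TestP1'

# 1. Split on every "/" and re-join everything except the last element.
joined = '/'.join(rootdir.split('/')[:-1])

# 2. Split once from the right and keep the left-hand part.
rsplit_once = rootdir.rsplit('/', 1)[0]

# 3. Let os.path work out the parent directory.
via_dirname = os.path.dirname(rootdir)

print(joined, rsplit_once, via_dirname)

# Illustrative edge case (not from the question): a path directly under the root.
top_level = '/TestP1'
split_result = top_level.rsplit('/', 1)[0]     # string splitting loses the root
dirname_result = os.path.dirname(top_level)    # dirname keeps it
print(repr(split_result), repr(dirname_result))
```

For the path in the question all three give the same string; they only diverge on edge cases such as the one at the end, which is one reason to prefer the path functions.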

You can use rsplit (which splits from the right side) and set the maximum number of splits to 1, like this:
rootdir = '/home/runner/TestP1'
hello = rootdir.rsplit("/", 1)[0]
print(hello)
Output:
/home/runner

Related

Remove file name and extension from path and just keep path

Hi, I have a string like this, which is dynamic and can take any of the following forms.
'new/file.csv'
'new/mainfolder/file.csv'
'new/mainfolder/subfolder/file.csv'
'new/mainfolder/subfolder/secondsubfolder/file.csv'
Something like these. In any case, I just want to split this string into two parts: the path and the file name. The path should not include the file name.
End result expected
'new'
'new/mainfolder'
'new/mainfolder/subfolder'
'new/mainfolder/subfolder/secondsubfolder'
So far I have tried many things, including
path = 'new/mainfolder/file.csv'
final_path = path.split('/', 1)[-1]
and rstrip(), but nothing has worked so far.
You can use pathlib for this.
For example,
>>> import pathlib
>>> path = pathlib.Path('new/mainfolder/file.csv')
>>> path.name
'file.csv'
>>> str(path.parent)
'new/mainfolder'
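Applied to all four strings from the question, the pathlib approach might look like the sketch below. It assumes PurePosixPath rather than Path, so that the result uses forward slashes on any platform; the variable names are illustrative.

```python
from pathlib import PurePosixPath

paths = [
    'new/file.csv',
    'new/mainfolder/file.csv',
    'new/mainfolder/subfolder/file.csv',
    'new/mainfolder/subfolder/secondsubfolder/file.csv',
]

# .parent drops the last component, .name is the last component itself.
parents = [str(PurePosixPath(p).parent) for p in paths]
names = [PurePosixPath(p).name for p in paths]

print(parents)
print(names)
```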
paths = ['new/file.csv',
         'new/mainfolder/file.csv',
         'new/mainfolder/subfolder/file.csv',
         'new/mainfolder/subfolder/secondsubfolder/file.csv']
output = []
for p in paths:
    parts = p.split("/")[:-1]  # drop the file name
    output.append("/".join(parts))
print(output)
Output:
['new', 'new/mainfolder', 'new/mainfolder/subfolder', 'new/mainfolder/subfolder/secondsubfolder']
An alternative to pathlib is os.path:
import os
fullPath = 'new/mainfolder/file.csv'
parent = os.path.dirname(fullPath)  # get path only
file = os.path.basename(fullPath)  # get file name
print(parent)
Output:
new/mainfolder
os.path.dirname:
Return the directory name of pathname path. This is the first element of the pair returned by passing path to the function split(). Source: https://docs.python.org/3.3/library/os.path.html#os.path.dirname
os.path.basename:
Return the base name of pathname path. This is the second element of the pair returned by passing path to the function split(). Note that the result of this function is different from the Unix basename program; where basename for '/foo/bar/' returns 'bar', the basename() function returns an empty string ('').
Source: https://docs.python.org/3.3/library/os.path.html#os.path.basename
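The split() mentioned in those quotes is os.path.split(), not str.split(). A short sketch of how the three functions relate, including the trailing-slash case from the basename note (the variable names are illustrative):

```python
import os.path

full_path = 'new/mainfolder/file.csv'

# os.path.split returns the pair (head, tail).
head, tail = os.path.split(full_path)

# dirname and basename are the two halves of that pair.
same_head = head == os.path.dirname(full_path)
same_tail = tail == os.path.basename(full_path)
print(head, tail, same_head, same_tail)

# With a trailing slash the tail is empty, unlike the Unix basename program.
with_slash = '/foo/bar/'
slash_pair = os.path.split(with_slash)
print(slash_pair)
```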
You almost got it; just use rsplit, like this:
path = 'new/mainfolder/file.csv'
file_path, file_name = path.rsplit('/', 1)
print(f'{file_path=}\n{file_name=}')
Output:
file_path='new/mainfolder'
file_name='file.csv'

Python: split hard coded path

I need to split a path up in python and then remove the last two levels.
Here is an example, the path I want to parse. I want to parse it to level 6.
C:\Users\Me\level1\level2\level3\level4\level5\level6\level7\level8
Below is what I want the output to be. Currently, I can only go one level up.
C:\Users\Me\level1\level2\level3\level4\level5\level6\
import os
a = r"C:\Users\Me\level1\level2\level3\level4\level5\level6\level7\level8"
split_path = os.path.split(a)
print(split_path)
Output:
('C:\\Users\\Me\\level1\\level2\\level3\\level4\\level5\\level6\\level7', 'level8')
Split the path into all its parts, then join all the parts except the last two.
import os
separator = os.path.sep
parts = a.split(separator)
output = separator.join(parts[:-2])
You can either use the split function twice:
os.path.split(os.path.split(a)[0])[0]
This works since os.path.split() returns a tuple with two items, head and tail, and by taking [0] of that we'll get the head. Then just split again and take the first item again with [0].
Or join your path with the parent directory twice:
os.path.abspath(os.path.join(a, '..', '..'))
You can easily create a function that will step back as many steps as you want:
def path_split(path, steps):
    for i in range(steps):
        path = os.path.split(path)[0]
    return path
So
>>> path_split(r"C:\Users\Me\level1\level2\level3\level4\level5\level6\level7\level8", 2)
'C:\\Users\\Me\\level1\\level2\\level3\\level4\\level5\\level6'
os.path.split(path) returns a tuple containing the whole path except the last component, followed by the last component. So if you want to remove the last two:
os.path.split(os.path.split(your_path)[0])[0]
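The same idea can be sketched with pathlib. This assumes PureWindowsPath, which parses Windows-style paths on any operating system, and strip_levels is a hypothetical helper name, not something from the answers above.

```python
from pathlib import PureWindowsPath


def strip_levels(path, levels):
    """Hypothetical helper: drop the last `levels` components of a Windows path."""
    p = PureWindowsPath(path)
    for _ in range(levels):
        p = p.parent  # each .parent removes one trailing component
    return str(p)


# A raw string keeps the backslashes literal.
a = r"C:\Users\Me\level1\level2\level3\level4\level5\level6\level7\level8"
result = strip_levels(a, 2)
print(result)
```

Because the path is parsed rather than split as text, the drive letter is handled as part of the path and no manual separator handling is needed.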

Exclude the last delimited item from my list

I want to remove the last string in each path, i.e. the library name (delimited by '/'). The text string that I have contains the paths of libraries used at compilation time. These paths are delimited by spaces. I want to retain each path, but only up to the directory that contains the library, not the library name itself.
Example:
text = " /opt/gcc/4.4.4/snos/lib/gcc/x86_64-suse-linux/4.4.4/crtbeginT.o /opt/gcc/4.4.4/snos/lib/gcc/x86_64-suse-linux/4.4.4/crtfastmath.o /opt/cray/cce/8.2.5/craylibs/x86-64/no_mmap.o /opt/cray/cce/8.2.5/craylibs/x86-64/libcraymath.a /opt/cray/cce/8.2.5/craylibs/x86-64/libcraymp.a /opt/cray/atp/1.7.1/lib/libAtpSigHandler.a /opt/cray/atp/1.7.1/lib/libAtpSigHCommData.a "
I want my output to be like -
Output_list =
[/opt/gcc/4.4.4/snos/lib/gcc/x86_64-suse-linux/4.4.4,
/opt/gcc/4.4.4/snos/lib/gcc/x86_64-suse-linux/4.4.4,
/opt/cray/cce/8.2.5/craylibs/x86-64,
/opt/cray/cce/8.2.5/craylibs/x86-64,
/opt/cray/cce/8.2.5/craylibs/x86-64,
/opt/cray/atp/1.7.1/lib,
/opt/cray/atp/1.7.1/lib]
and finally I want to remove the duplicates in the output_list so that the list looks like.
New_output_list =
[/opt/gcc/4.4.4/snos/lib/gcc/x86_64-suse-linux/4.4.4,
/opt/cray/cce/8.2.5/craylibs/x86-64,
/opt/cray/atp/1.7.1/lib]
I am getting the results using split() function but I am struggling to discard the library names from the path.
Any help would be appreciated.
You seem to want (don't try and do string operations with paths, it's bound to end badly):
import os
New_output_List = list(set(os.path.dirname(pt) for pt in text.split()))
os.path.dirname gets the directory name from a path. We do this for every item in the text, which is split into a list on whitespace.
To remove the duplicates, we just convert the result to a set and then finally to a list. Note that a set does not preserve the original order.
Try this:
text = " /opt/gcc/4.4.4/snos/lib/gcc/x86_64-suse-linux/4.4.4/crtbeginT.o /opt/gcc/4.4.4/snos/lib/gcc/x86_64-suse-linux/4.4.4/crtfastmath.o /opt/cray/cce/8.2.5/craylibs/x86-64/no_mmap.o /opt/cray/cce/8.2.5/craylibs/x86-64/libcraymath.a /opt/cray/cce/8.2.5/craylibs/x86-64/libcraymp.a /opt/cray/atp/1.7.1/lib/libAtpSigHandler.a /opt/cray/atp/1.7.1/lib/libAtpSigHCommData.a "
New_output_list = []
for x in text.split():
    directory = "/".join(x.split("/")[:-1])  # drop the library name
    if directory not in New_output_list:
        New_output_list.append(directory)
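If the order of first appearance matters, as it does in the expected output in the question, one option is dict.fromkeys instead of set. This is a sketch that assumes Python 3.7 or later, where dictionaries keep insertion order; unique_dirs is an illustrative name.

```python
import os.path

text = (
    " /opt/gcc/4.4.4/snos/lib/gcc/x86_64-suse-linux/4.4.4/crtbeginT.o"
    " /opt/gcc/4.4.4/snos/lib/gcc/x86_64-suse-linux/4.4.4/crtfastmath.o"
    " /opt/cray/cce/8.2.5/craylibs/x86-64/no_mmap.o"
    " /opt/cray/cce/8.2.5/craylibs/x86-64/libcraymath.a"
    " /opt/cray/cce/8.2.5/craylibs/x86-64/libcraymp.a"
    " /opt/cray/atp/1.7.1/lib/libAtpSigHandler.a"
    " /opt/cray/atp/1.7.1/lib/libAtpSigHCommData.a "
)

# dict keys are unique and keep insertion order, so this removes
# duplicate directories while preserving their first-seen order.
unique_dirs = list(dict.fromkeys(os.path.dirname(p) for p in text.split()))

for d in unique_dirs:
    print(d)
```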

How to extract a string from another string without changing the characters' case

There are two variables.
A variable drive was assigned a drive path (string).
A variable filepath was assigned a complete path to a file (string).
drive = '/VOLUMES/TranSFER'
filepath = '/Volumes/transfer/Some Documents/The Doc.txt'
First I need to find out whether the string stored in the drive variable is in the string stored in the filepath variable.
If it is, then I need to remove the string stored in the drive variable from the string stored in the filepath variable without changing the character case of either variable (no converting to lowercase or uppercase; the case should stay the same).
So a final result should be:
result = '/Some Documents/The Doc.txt'
I could get it done with:
if drive.lower() in filepath.lower(): result = filepath.lower().split( drive.lower() )
But an approach like this messes up the letter case (everything is now lowercase).
Please advise, thanks in advance!
EDITED LATER:
I could be using my own approach. It appears that the in test in the if portion of the statement
if drive.lower() in filepath.lower():
is case-sensitive, and drive in filepath will return False if the case doesn't match.
So it makes sense to lower()-case everything while comparing. But a .split() method splits beautifully regardless of the letter case:
if drive.lower() in filepath.lower(): result = filepath.split( drive )
if filepath.lower().startswith(drive.lower() + '/'):
    result = filepath[len(drive):]
Using str.find:
>>> drive = '/VOLUMES/TranSFER'
>>> filepath = '/Volumes/transfer/Some Documents/The Doc.txt'
>>> i = filepath.lower().find(drive.lower())
>>> if i >= 0:
...     result = filepath[:i] + filepath[i+len(drive):]
...
>>> result
'/Some Documents/The Doc.txt'
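Both answers rely on the same idea: compare lowercased copies, but slice the original string so its case is untouched. A minimal sketch of that idea as a reusable function follows; strip_prefix_ignore_case is a hypothetical helper name, and it assumes the drive is expected at the start of the path.

```python
def strip_prefix_ignore_case(path, prefix):
    """Hypothetical helper: return `path` without a leading `prefix`,
    compared case-insensitively, or None if `path` does not start with it."""
    candidate = path[:len(prefix)]
    if candidate.lower() == prefix.lower():
        # Slice the original string, so the remaining text keeps its case.
        return path[len(prefix):]
    return None


drive = '/VOLUMES/TranSFER'
filepath = '/Volumes/transfer/Some Documents/The Doc.txt'

result = strip_prefix_ignore_case(filepath, drive)
print(result)

no_match = strip_prefix_ignore_case('/other/place/file.txt', drive)
print(no_match)
```

This sketch does not check that the match ends at a path separator, so a folder such as '/Volumes/transfers' would also match; comparing against the prefix plus '/' would tighten that.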

Reading files in a particular order in python

Let's say I have three files in a folder: file9.txt, file10.txt and file11.txt, and I want to read them in this particular order. Can anyone help me with this?
Right now I am using the code
import glob, os
for infile in glob.glob(os.path.join('*.txt')):
    print("Current File Being Processed is: " + infile)
and it reads first file10.txt, then file11.txt and then file9.txt.
How can I get the right order?
Files on the filesystem are not sorted. You can sort the resulting filenames yourself using the sorted() function:
for infile in sorted(glob.glob('*.txt')):
    print("Current File Being Processed is: " + infile)
Note that the os.path.join call in your code is a no-op; with only one argument it doesn't do anything but return that argument unaltered.
Note that your files will sort in alphabetical ordering, which puts 10 before 9. You can use a custom key function to improve the sorting:
import re
numbers = re.compile(r'(\d+)')
def numericalSort(value):
    parts = numbers.split(value)
    parts[1::2] = map(int, parts[1::2])  # odd positions hold the digit runs
    return parts
for infile in sorted(glob.glob('*.txt'), key=numericalSort):
    print("Current File Being Processed is: " + infile)
The numericalSort function splits out any digits in a filename, turns it into an actual number, and returns the result for sorting:
>>> files = ['file9.txt', 'file10.txt', 'file11.txt', '32foo9.txt', '32foo10.txt']
>>> sorted(files)
['32foo10.txt', '32foo9.txt', 'file10.txt', 'file11.txt', 'file9.txt']
>>> sorted(files, key=numericalSort)
['32foo9.txt', '32foo10.txt', 'file9.txt', 'file10.txt', 'file11.txt']
You can wrap your glob.glob( ... ) expression inside a sorted( ... ) call and sort the resulting list of files. Example:
for infile in sorted(glob.glob('*.txt')):
You can use the key= ... argument to give sorted a custom key that is used for sorting. (Python 2 also accepted a comparison function, but a key is the better choice.)
Example:
There are the following files:
x/blub01.txt
x/blub02.txt
x/blub10.txt
x/blub03.txt
y/blub05.txt
The following code will produce the following output:
for filename in sorted(glob.glob('[xy]/*.txt')):
    print(filename)
# x/blub01.txt
# x/blub02.txt
# x/blub03.txt
# x/blub10.txt
# y/blub05.txt
Now with key function:
def key_func(x):
    return os.path.split(x)[-1]  # sort on the file name only
for filename in sorted(glob.glob('[xy]/*.txt'), key=key_func):
    print(filename)
# x/blub01.txt
# x/blub02.txt
# x/blub03.txt
# y/blub05.txt
# x/blub10.txt
EDIT:
Possibly this key function can sort your files:
pat = re.compile(r"(\d+)\D*$")
...
def key_func(x):
    mat = pat.search(os.path.split(x)[-1])  # match last group of digits
    if mat is None:
        return x
    return "{:>10}".format(mat.group(1))  # right align to 10 characters.
It sure can be improved, but I think you get the point. Paths without numbers will be left alone, paths with numbers will be converted to a string that is 10 digits wide and contains the number.
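To see why right-aligning works, here is a small self-contained sketch of that key function applied to the file names from the question. The list of names is illustrative; the regular expression and format string are the ones from the answer.

```python
import os
import re

# Last run of digits in the name, followed only by non-digits.
pat = re.compile(r"(\d+)\D*$")


def key_func(x):
    mat = pat.search(os.path.basename(x))
    if mat is None:
        return x  # no number: sort on the path itself
    # Pad on the left with spaces so shorter numbers compare as smaller.
    return "{:>10}".format(mat.group(1))


files = ['file10.txt', 'file11.txt', 'file9.txt']
ordered = sorted(files, key=key_func)
print(ordered)
print(repr(key_func('file9.txt')), repr(key_func('file10.txt')))
```

The keys are still compared as strings, so this only behaves numerically when the numbers are written consistently; a space sorts before '0', so a name numbered 9 would sort ahead of one numbered 01. Converting the digits to int avoids that.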
You need to change the sort from 'ASCIIBetical' to numeric by isolating the number in the filename. You can do that like so:
import re
def keyFunc(afilename):
    nondigits = re.compile(r"\D")
    return int(nondigits.sub("", afilename))
filenames = ["file10.txt", "file11.txt", "file9.txt"]
for x in sorted(filenames, key=keyFunc):
    print(x)
Here you can set filenames to the result of glob.glob("*.txt").
Additionally, the keyFunc function assumes the filename will have a number in it, and that the number is only in the filename. You can change that function to be as complex as you need to isolate the number you need to sort on.
glob.glob(os.path.join( '*.txt'))
returns a list of strings, so you can easily sort the list using Python's sorted() function.
sorted(glob.glob(os.path.join( '*.txt')))
for fname in ['file9.txt', 'file10.txt', 'file11.txt']:
    with open(fname) as f:  # default open mode is for reading
        for line in f:
            pass  # do something with line
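Putting the pieces together, the sketch below globs, sorts numerically, and reads the files in that order. It creates the three files in a temporary directory so that it can be run anywhere; numeric_key and the file contents are illustrative, not taken from the answers.

```python
import glob
import os
import re
import tempfile


def numeric_key(path):
    """Hypothetical key: split the base name into text and integer chunks."""
    parts = re.split(r'(\d+)', os.path.basename(path))
    parts[1::2] = [int(p) for p in parts[1::2]]  # digit runs sit at odd positions
    return parts


contents = []
with tempfile.TemporaryDirectory() as folder:
    # Create the example files from the question.
    for name in ['file10.txt', 'file11.txt', 'file9.txt']:
        with open(os.path.join(folder, name), 'w') as f:
            f.write('contents of ' + name + '\n')

    # Read them back in numeric order.
    pattern = os.path.join(folder, '*.txt')
    for infile in sorted(glob.glob(pattern), key=numeric_key):
        with open(infile) as f:
            contents.append(f.read().strip())

print(contents)
```

The key is computed on the base name only, so digits in the directory name do not take part in the comparison.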
