How to get part of an image file name using split in Python

I have an image that I sliced into 16 tiles using image_slicer. The tiles are named 0804220001-5_01_01.png, 0804220001-5_01_02.png, 0804220001-5_01_03.png and so on. Using split, I need to extract the text 01_01, 01_02, ...
I tried split and rsplit but I am not getting the expected result. Below is my code.
#img1 is the path of the image which is
#Features/input/0804220001-5_02_04.png
name1 = img1.split('/')[-1]
patch = name1.rsplit('_')[1]
print(patch)
I am getting 01 as the output, but I need 01_01, 01_02, and so on.

You can use re.search with the pattern _(.*)\. :
import re
name = "0804220001-5_02_04.png"
print(re.search(r'_(.*)\.', name).group(1))
which prints 02_04.

We can try using re.findall here for a regex approach:
import re
images = ["0804220001-5_01_01.png", "0804220001-5_01_02.png", "0804220001-5_01_03.png"]
# form a single |-delimited string of all images
all_images = "|".join(images)
terms = re.findall(r'(\d+_\d+)\.png', all_images)
print(terms)
This prints:
['01_01', '01_02', '01_03']

import os
foo = 'some/path/0804220001-5_01_01.png'
# splitext returns (root, extension); keep the last two '_'-separated fields of the root
print('_'.join(os.path.splitext(foo)[0].split('_')[-2:]))
Output:
01_01
Of course, you can wrap this in a function.

If you want to stick with your approach and avoid os.path, replace rsplit() with split() and pass a maxsplit argument. The code below splits on '_' at most once, starting from the left, so everything after the first underscore stays together. The optional second argument of split() is the maximum number of splits; it defaults to -1 (no limit).
img1 = "Features/input/0804220001-5_02_04.png"
name1 = img1.split('/')[-1]
patch = name1.split('_', 1)[1] # '02_04.png'; append [:-4] to drop the '.png' extension
print(patch)
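For comparison, here is a minimal sketch of the same extraction with pathlib. It assumes every file name has the form <prefix>_<row>_<col>.<ext> and that the prefix itself contains no underscore. The function name tile_suffix is made up for this example.

```python
from pathlib import PurePosixPath


def tile_suffix(path):
    """Return the part of the file name after the first underscore, without the extension."""
    stem = PurePosixPath(path).stem      # '0804220001-5_02_04'
    return stem.split('_', 1)[1]         # split once, keep the right-hand piece


print(tile_suffix("Features/input/0804220001-5_02_04.png"))  # 02_04
```

Using the stem avoids hard-coding the length of the extension, which the [:-4] slice relies on.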

Related

How can I replace a specific text in string python?

I'm confused when trying to replace specific text in python
my code is:
Image = "/home/user/Picture/Image-1.jpg"
Image2 = Image.replace("-1", "_s", 1)
print(Image)
print(Image2)
Output:
/home/user/Picture/Image-1.jpg
/home/user/Picture/Image_s.jpg
The output I want for Image2 is:
/home/user/Picture/Image-1_s.jpg
You are replacing the -1 with _s
If you want to keep the -1 as well, you can just add it in the replacement
Image = "/home/user/Picture/Image-1.jpg"
Image2 = Image.replace("-1", "-1_s", 1)
print(Image)
print(Image2)
Output
/home/user/Picture/Image-1.jpg
/home/user/Picture/Image-1_s.jpg
If the digits can be variable, you can also use a pattern with for example 2 capture groups, and then use those capture groups in the replacement with _s in between
import re
pattern = r"(/home/user/Picture/Image-\d+)(\.jpg)\b"
s = "/home/user/Picture/Image-1.jpg\n/home/user/Picture/Image-2.jpg"
print(re.sub(pattern, r"\1_s\2", s))
Output
/home/user/Picture/Image-1_s.jpg
/home/user/Picture/Image-2_s.jpg
Or for example only taking the /Image- into account and then use the full match in the replacement instead of using capture groups:
import re
pattern = r"/Image-\d+(?=\.jpg)\b"
s = "/home/user/Picture/Image-1.jpg\n/home/user/Picture/Image-2.jpg"
print(re.sub(pattern, r"\g<0>_s", s))
Output
/home/user/Picture/Image-1_s.jpg
/home/user/Picture/Image-2_s.jpg
Your code behaves exactly as written: replace swaps -1 for _s. To get what you want, you don't necessarily need replace at all. What you are really doing is appending something to the end of the path, just before the extension.
We can make the code more generic so that it can append anything before the extension. The steps are as follows (for other readers: yes, there are more foolproof ways to do this, but this sticks to a simple example):
split the string at . so that you end up with a list containing:
["/home/user/Picture/Image-1", "jpg"]
Append to the first element what you need to the end of the string so you end up with:
"/home/user/Picture/Image-1_s"
Use join to re-craft your string, but use .:
".".join(["/home/user/Picture/Image-1_s", "jpg"])
You will finally get:
/home/user/Picture/Image-1_s.jpg
Coding the above, we can have it work as follows:
>>> Image1 = "/home/user/Picture/Image-1.jpg"
>>> img_split = Image1.split(".")
>>> img_split
['/home/user/Picture/Image-1', 'jpg']
>>> img_split[0] = img_split[0] + "_s"
>>> img_split
['/home/user/Picture/Image-1_s', 'jpg']
>>> final_path = ".".join(img_split)
>>> final_path
'/home/user/Picture/Image-1_s.jpg'
A more idiomatic approach is to use Python's pathlib module.
from pathlib import Path
Image1 = "/home/user/Picture/Image-1.jpg"
p = Path(Image1)
# you have access to all the parts you need. Like the path to the file:
p.parent # outputs PosixPath('/home/user/Picture')
# The name of the file without extension
p.stem # outputs 'Image-1'
# The extension of the file
p.suffix # outputs '.jpg'
# Finally, rename the file on disk using the rename method (the file must exist)
p.rename(p.parent / f"{p.stem}_s{p.suffix}")
# On Python 3.8+ this returns the new path:
# PosixPath('/home/user/Picture/Image-1_s.jpg')
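If only the new path string is needed and nothing should be renamed on disk, the same parts can be recombined without touching the filesystem. This is a minimal sketch, assuming POSIX-style paths; add_suffix_before_ext is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def add_suffix_before_ext(path, suffix="_s"):
    """Insert `suffix` between the file stem and its extension."""
    p = PurePosixPath(path)
    # with_name swaps only the final path component; no file is touched
    return str(p.with_name(f"{p.stem}{suffix}{p.suffix}"))


print(add_suffix_before_ext("/home/user/Picture/Image-1.jpg"))
# /home/user/Picture/Image-1_s.jpg
```

Because the digits are never mentioned, this works for Image-2.jpg, Image-15.jpg and so on without a regex.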
The replace function replaces "-1" with "_s".
If you want the output to be: /home/user/Picture/Image-1_s.jpg
You should replace "-1" with "-1_s".
Try:
Image = "/home/user/Picture/Image-1.jpg"
Image2 = Image.replace("-1", "-1_s")
print(Image)
print(Image2)
Try this. I think you should insert the string at a specific position rather than replace anything:
Image = "/home/user/Picture/Image-1.jpg"
Image2 = Image[:26] + '_s' + Image[26:]
print(Image2)
The output:
/home/user/Picture/Image-1_s.jpg

How to overlay multiple images onto certain original images using python

I am trying to create a program that overlays one image onto a base image using Python and OpenCV and stores the output image in another folder. However, the code I have written does not give the desired result.
import cv2
from os import listdir
from os.path import isfile, join
import numpy as np
path_wf = 'wf_flare'
path_fr = 'captured_flare'
files_wf = [ f for f in listdir(path_wf) if isfile(join(path_wf,f))]
files_fr = [ fl for fl in listdir(path_fr) if isfile(join(path_fr,fl))]
img_wf = np.empty(len(files_wf), dtype = object)
img_fr = np.empty(len(files_fr), dtype = object)
img = np.empty(len(files_wf), dtype = object)
k = 0
for n in range(0, len(files_wf)):
    img_wf[n] = cv2.imread(join(path_wf, files_wf[n]))
    img_fr[k] = cv2.imread(join(path_fr, files_fr[k]))
    print("Done Reading"+str(n))
    img_wf[n] = cv2.resize(img_wf[n], (1024,1024),interpolation = cv2.INTER_AREA)
    img[n] = 0.4*img_fr[k] + img_wf[n]
    fn = listdir(path_wf)
    name = 'C:\Flare\flare_img'+str(fn[n])
    print('Creating...'+ name + str(n+10991))
    cv2.imwrite(name,img[n])
    k += 1
    if(k%255 == 0):
        k = 0
    else:
        continue
the folder organization is pasted below:
I want the output images to come here:
There are two issues in the following line:
name = 'C:\Flare\flare_img'+str(fn[n])
In Python, special characters in strings are escaped with backslashes. Some examples are \n (newline), \t (tab), \f (form feed), etc. In your case, the \f is a special character that leads to a malformed path. One way to fix this is to use raw strings by adding an r before the first quote:
>>> 'C:\Flare\flare_img'
'C:\\Flare\x0clare_img'
>>> r'C:\Flare\flare_img'
'C:\\Flare\\flare_img'
Do not just concatenate strings when you create filesystem paths. Sooner or later you end up misplacing a path separator. In this case, it is missing because fn[n] does not start with one. Let's say that fn[n] = "spam.png". Then assuming you do
name = r'C:\Flare\flare_img'+str(fn[n])
your value for name will be
C:\\Flare\\flare_imgspam.png
which is not what you intended.
Use os.path.join or the modern pathlib.Path as previously suggested. It is also redundant to wrap fn[n] in the str function because os.listdir already returns a list of strings.
The changes you need to make are as follows:
# add to imports section
from pathlib import Path
# add before for-loop
out_path = Path(r'C:\Flare\flare_img')
# change inside for-loop; str() keeps the later string concatenation and cv2.imwrite call working
name = str(out_path / fn[n])
Documentation: Python Strings
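Both points can be checked in a few lines. This is only a sketch: PureWindowsPath is used so the result is the same on any operating system, and output_path is a hypothetical helper, not part of the original script.

```python
from pathlib import PureWindowsPath

# '\f' inside a normal string literal is a single form-feed character
assert len('\f') == 1
# in a raw string the backslash and the 'f' stay separate characters
assert len(r'\f') == 2


def output_path(out_dir, filename):
    """Join a directory and a file name with exactly one separator."""
    return str(PureWindowsPath(out_dir) / filename)


print(output_path(r'C:\Flare\flare_img', 'spam.png'))
# C:\Flare\flare_img\spam.png
```

The separator between flare_img and spam.png comes from the / operator, so it cannot be forgotten the way it can with string concatenation.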

Python modify text file by the name of arguments

I have a text file ("input.param") that serves as an input file for a package. I need to modify the value of one argument. The lines that need to change are the following:
param1 0.01
model_name run_param1
I need to find the argument param1 and replace its value 0.01 with each value in a range, and the model_name should change along with it. For example, if param1 is changed to 0.03, then model_name becomes 'run_param1_p03'. Below is my attempt:
import numpy as np
import os
param1_range = np.arange(0.01,0.5,0.01)
with open('input.param', 'r') as file :
    filedata = file.read()
for p_value in param1_range:
    filedata.replace('param1 0.01', 'param1 ' + str(p_value))
    filedata.replace('model_name run_param1', 'model_name run_param1' + '_p0' + str(int(round(p_value*100))))
    with open('input.param', 'w') as file:
        file.write(filedata)
    os.system('./bin/run_app param/input.param')
However, this is not working. I guess the main problem is that the replace command can not recognize the space. But I do not know how to search the argument param1 or model_name and change their values.
I'm editing this answer to more accurately answer the original question, which it did not adequately do.
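The stated problem is that "the replace command can not recognize the space". Note also that str.replace returns a new string rather than modifying filedata in place, so its result has to be assigned. To match an entry and its value regardless of how much whitespace separates them, the re (regular expression) module can be of great help. Your document is composed of entries and their values, separated by spaces: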
param1 0.01
model_name run_param1
In regex, a general capture would look like so:
import re
someline = 'param1 0.01'
pattern = re.match(r'^(\S+)\s+(\S+)$', someline)
pattern.groups()
# ('param1', '0.01')
The regex works as follows:
^ anchors the match at the start of the line
\S is any non-whitespace char, i.e. anything other than characters such as '\t', ' ', '\r', '\n'
+ means one or more, matched greedily (it keeps going until the pattern stops matching)
\s+ is one or more whitespace chars (\s is the opposite of \S, note the case)
$ anchors the match at the end of the line
() indicate groups, the parts of the match you want to capture
The groups make it fairly easy for you to unpack your arguments into variables if you so choose. To apply this to the code you have already:
import numpy as np
import re
param1_range = np.arange(0.01,0.5,0.01)
filedata = []
with open('input.param', 'r') as file:
    # This will put the lines in a list
    # so you can use ^ and $ in the regex
    for line in file:
        filedata.append(line.strip()) # get rid of trailing newlines
# filedata now looks like:
# ['param1 0.01', 'model_name run_param1']
# It might be easier to use a dictionary to keep all of your param vals
# since you aren't changing the names, just the values
groups = [re.match(r'^(\S+)\s+(\S+)$', x).groups() for x in filedata]
# Now you have a list of tuples which can be fed to dict()
my_params = dict(groups)
# {'param1': '0.01', 'model_name': 'run_param1'}
# Now just use that dict for setting your params
for p_value in param1_range:
    my_params['param1'] = str(p_value)
    my_params['model_name'] = 'run_param1_p0' + str(int(round(p_value*100)))
    # And for the formatting back into the file, you can do some quick padding to get the format you want
    with open('somefile.param', 'w') as fh:
        content = '\n'.join([k.ljust(20) + v.rjust(20) for k,v in my_params.items()])
        fh.write(content)
The padding is done using the str.ljust and str.rjust methods, so you get a format that looks like this:
for k, v in dict(groups).items():
    intstr = k.ljust(20) + v.rjust(20)
    print(intstr)
param1                              0.01
model_name                    run_param1
Though you could arguably leave out the rjust if you felt so inclined.
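An alternative that keeps the rest of the file untouched is to substitute the two values in place with re.sub. This is a minimal sketch, assuming each entry sits on its own line as name, spaces or tabs, then value. The name update_params and the two-digit _pNN suffix are assumptions based on the example in the question (0.03 gives run_param1_p03).

```python
import re


def update_params(text, p_value):
    """Return `text` with param1 and model_name rewritten for `p_value`."""
    p_value = float(p_value)                                # also accepts NumPy floats
    value = f"{p_value:.2f}"                                # 0.03 -> '0.03'
    model = f"run_param1_p{int(round(p_value * 100)):02d}"  # 0.03 -> 'run_param1_p03'
    # \g<1> re-inserts the captured name and whitespace; a plain \1 followed
    # by a digit would be read as a different group number
    text = re.sub(r'^(param1[ \t]+)\S+', r'\g<1>' + value, text, flags=re.M)
    text = re.sub(r'^(model_name[ \t]+)\S+', r'\g<1>' + model, text, flags=re.M)
    return text


original = "param1    0.01\nmodel_name    run_param1\n"
print(update_params(original, 0.03))
```

Calling this on the original text for each value, rather than on the already modified text, keeps the result independent of the previous iteration.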

how to cut the end of a string by some condition in python?

I have searched possible ways but I am unable to mix those up yet. I have a string that is a path to the image.
myString= "D:/Train/16_partitions_annotated/partition1/images/AAAAA/073-1_00191.jpeg"
What I want to do is replace images with IMAGES and cut off the 073-1_00191.jpeg part at the end. Thus, the new string should be
newString = "D:/Train/16_partitions_annotated/partition1/IMAGES/AAAAA/"
The chopped-off part (073-1_00191.jpeg) will be used separately as the name of the processed image. The function .replace() alone doesn't work here, as I need to provide the path and the filename as separate parameters.
The reason I want to do this is that I access images through their paths and process them; when saving the results, I need to create another directory (in this case IMAGES), while the directories after it (in this case AAAAA) and the image name should remain the same.
Note that the images may have different names and extensions.
If something is unclear, please ask and I will try to clarify.
As alluded to in the comments, os.path is useful for manipulating paths represented as strings.
>>> import os
>>> myString= "D:/Train/16_partitions_annotated/partition1/images/AAAAA/073-1_00191.jpeg"
>>> dirname, basename = os.path.split(myString)
>>> dirname
'D:/Train/16_partitions_annotated/partition1/images/AAAAA'
>>> basename
'073-1_00191.jpeg'
At this point, how you want to handle capitalizing "images" is a function of your broader goal. If you want to simply capitalize that specific word, dirname.replace('images', 'IMAGES') should suffice. But you seem to be asking for a more generalized way to capitalize the second to last directory in the absolute path:
>>> def cap_penultimate(dirname):
...     h, t = os.path.split(dirname)
...     hh, ht = os.path.split(h)
...     return os.path.join(hh, ht.upper(), t)
...
>>> cap_penultimate(dirname)
'D:/Train/16_partitions_annotated/partition1/IMAGES/AAAAA'
It's a game of slicing. You can try this:
myString= "D:/Train/16_partitions_annotated/partition1/images/AAAAA/073-1_00191.jpeg"
myString1=myString.split('/')
pre_data=myString1[:myString1.index('images')]
after_data=myString1[myString1.index('images'):]
after_data=['IMAGES'] + after_data[1:2]
print("/".join(pre_data+after_data))
output:
D:/Train/16_partitions_annotated/partition1/IMAGES/AAAAA
The simple way :
myString= "D:/Train/16_partitions_annotated/partition1/images/AAAAA/073-1_00191.jpeg"
a = myString.rfind('/')
filename = myString[a+1:]
restofstring = myString[0:a]
alteredstring = restofstring.replace('images', 'IMAGES')
print(alteredstring)
output:
D:/Train/16_partitions_annotated/partition1/IMAGES/AAAAA
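The pieces from the answers above can be combined into one helper that returns the directory and the file name separately, as the question asks. This is a minimal sketch, assuming forward-slash paths; split_for_output is a hypothetical name, and only a path component that is exactly equal to the given folder name is uppercased.

```python
import posixpath


def split_for_output(path, folder="images"):
    """Return (directory with `folder` uppercased and a trailing slash, file name)."""
    dirname, basename = posixpath.split(path)
    # compare whole components so that e.g. 'my_images' is left alone
    parts = [p.upper() if p == folder else p for p in dirname.split('/')]
    return '/'.join(parts) + '/', basename


new_dir, filename = split_for_output(
    "D:/Train/16_partitions_annotated/partition1/images/AAAAA/073-1_00191.jpeg"
)
print(new_dir)   # D:/Train/16_partitions_annotated/partition1/IMAGES/AAAAA/
print(filename)  # 073-1_00191.jpeg
```

Comparing whole components is a deliberate choice: a plain replace('images', 'IMAGES') would also rewrite any other directory or file name that happens to contain the word.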

split based on multiple numbers in python

Can you help me figure out how to split based on multiple/group of number as delimiter?
I have content in a file in below format:
data_file_10572_2018-02-15-12-57-29.file
header_file_13238_2018-02-15-12-57-48.file
sig_file1_17678_2018-02-15-12-57-14.file
Expected output:
data_file
header_file
sig_file1
I'm new to python and I'm not sure how to cut based on group of number. Thanks for the reply!!
I hope this helps. The function finds the first element that consists only of digits and returns the string up to that element.
data = ['data_file_10572_2018-02-15-12-57-29.file', 'header_file_13238_2018-02-15-12-57-48.file', 'sig_file1_17678_2018-02-15-12-57-14.file']
def split_before_int(elem):
    parts = elem.split('_')
    for i, part in enumerate(parts):
        if part.isdigit():  # first purely numeric field
            return '_'.join(parts[:i])
    return elem  # no numeric field found
for elem in data:
    print(split_before_int(elem))
Output:
data_file
header_file
sig_file1
Split on _, keep the first two pieces with list slicing (i.e. list[0:2]), and join them back with _. This gives the substring up to the second _, and it assumes the prefix always contains exactly one underscore.
files = ['data_file_10572_2018-02-15-12-57-29.file', 'header_file_13238_2018-02-15-12-57-48.file','sig_file1_17678_2018-02-15-12-57-14.file']
cleaned_files = list(map(lambda file: '_'.join(file.split('_')[0:2]), files))
This results in:
['data_file', 'header_file', 'sig_file1']
You can use a regex to match up to the first underscore followed by a digit, then split on "_" and join the elements, excluding the last:
Ex:
import re
a = "data_file_10572_2018-02-15-12-57-29.file"
print("_".join(re.match(r"(.*?)_\d", a).group().split("_")[:-1]))
output:
data_file
This code will work if all your filenames follow the pattern you described.
filename = 'data_file_10572_2018-02-15-12-57-29.file'
parts = filename.split('_')
new_filename = '_'.join(parts[:2])
If the alphabetical part of the file name has a variable number of underscores, it's better to use a regex.
import re
pattern = re.compile(r'_[0-9_-]{3,}\.file$')
print(re.sub(pattern, '', filename))
Output:
data_file
Essentially, it first creates a pattern that starts with _, is followed by 3 or more digits, _ or - characters, and ends with .file.
Then the longest substring of your string that matches this pattern is replaced with an empty string.
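Another way to express "cut at the first all-digit field" is to split once on that field. This is a minimal sketch, assuming the prefix is followed by an underscore-delimited run of digits; prefix_before_number is a hypothetical name.

```python
import re


def prefix_before_number(filename):
    """Return everything before the first '_<digits>_' field."""
    # maxsplit=1 stops after the first all-digit field
    return re.split(r'_\d+_', filename, maxsplit=1)[0]


names = [
    'data_file_10572_2018-02-15-12-57-29.file',
    'header_file_13238_2018-02-15-12-57-48.file',
    'sig_file1_17678_2018-02-15-12-57-14.file',
]
print([prefix_before_number(n) for n in names])
# ['data_file', 'header_file', 'sig_file1']
```

Since file1 contains a letter, it is not treated as the numeric field, so the prefix sig_file1 is kept whole.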
