How to remove a sub-path from a CSV file [closed] - python

In the 'id_path' column of my CSV file I want to remove the sub-path. The dataframe of the CSV file contains paths like the ones below, and I want to remove everything before the image file name:
./input/skin-cancer-malignant-vs-benign/data/test/benign/454.jpg
./input/skin-cancer-malignant-vs-benign/data/test/benign/90.jpg
./input/skin-cancer-malignant-vs-benign/data/test/benign/147.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/771.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/208.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/1383.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/1354.jpg
The output should be:
454.jpg
90.jpg
147.jpg
771.jpg
208.jpg
1383.jpg
1354.jpg

rsplit() splits the string from the right side, and the 1 tells Python to stop after the first split.
txt = "./input/skin-cancer-malignant-vs-benign/data/test/benign/454.jpg"
x = txt.rsplit("/",1)
#your answer
print(x[1])
On your dataframe you could do something like:
train_df['id_path'] = train_df['id_path'].apply(lambda x: x.rsplit('/',1)[1])
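Another option, not from the original answer, is pathlib; Path(...).name gives the final path component. A minimal sketch, reusing the train_df name from above:
from pathlib import Path

# keep only the file name, e.g. '454.jpg'
train_df['id_path'] = train_df['id_path'].apply(lambda p: Path(p).name)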

Using str.replace:
df["filename"] = df["path"].str.replace(r'^.*/', '')
We could also use str.extract here:
df["filename"] = df["path"].str.extract(r'([^/]+\.\S+$)')

Related

Compare 2 lists and output the similar matches [closed]

I have 2 Python lists:
prefixList = ["12","9"]
files = ["12-a.csv","12-b.csv","9-t.txt","8-a.txt"]
and I want to create a new list containing the files whose names start with one of the prefixes, so the output will be:
fileOutput = ["12-a.csv","12-b.csv","9-t.txt"]
You can use a regex to find the leading number in each file name and check whether that number is in prefixList.
import re

prefixList = ["12","9"]
files = ["12-a.csv","12-b.csv","9-t.txt","8-a.txt"]
new_files = [file
             for file in files
             if re.search(r'\d+', file).group(0) in prefixList]
print(new_files)
Output:
['12-a.csv', '12-b.csv', '9-t.txt']
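An alternative not shown in the answer: str.startswith accepts a tuple of prefixes, which avoids the regex (note the semantics differ slightly; a hypothetical "123-x.csv" would also match the "12" prefix):
prefixList = ["12", "9"]
files = ["12-a.csv", "12-b.csv", "9-t.txt", "8-a.txt"]

# keep files whose name starts with any of the prefixes
new_files = [f for f in files if f.startswith(tuple(prefixList))]
print(new_files)  # ['12-a.csv', '12-b.csv', '9-t.txt']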

pandas read_csv does not read column names from header=0 if separator is ; (i.e. non-standard separator) [closed]

I am trying to import a CSV into pandas whose fields are separated by ';'. When I add header=0 (or header='infer') the result is like this:
Whereas when I tested another file whose fields were separated by ',', all the column headers were imported correctly.
What is the issue?
Thanks for your help!
I don't know what's inside your CSV file, but after reading it with pandas.read_csv() you can do:
df.columns = df.loc[0]             # promote the first data row to column names
df.drop(index=0, inplace=True)     # remove that row from the data
df = df.reset_index(drop=True)     # renumber the remaining rows
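The usual fix, assuming the file really is ';'-separated, is to pass the separator to read_csv so the header row is parsed into columns directly; a minimal sketch with a placeholder file name:
import pandas as pd

# 'data.csv' is a placeholder; header=0 is the default and is shown only for clarity
df = pd.read_csv('data.csv', sep=';', header=0)
print(df.columns.tolist())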

Escape characters when joining strings [closed]

Trying to read a .csv file's content from a folder. Example code:
files_in_folder = os.listdir(r"\\folder1\folder2")
filename_list = []
for filename in files_in_folder:
    if "sometext" in filename:
        filename_list.append(filename)
read_this_file = "\\folder1\folder2" + max(filename_list)
data = pandas.read_csv(read_this_file, sep=',')
Fetching the max filename works, but the data variable fails with:
FileNotFoundError: no such file or directory.
I am able to access the folder, as the first line of code shows, but when I combine the two strings, putting the r in front doesn't work. Any ideas?
You need to escape the backslashes and add a trailing \ to your path when concatenating:
read_this_file = '\\\\folder1\\folder2\\' + max(filename_list)
But a better way to avoid that problem is to use
os.path.join(r'\\folder1\folder2', max(filename_list))
For working code, use this:
import os
import pandas as pd

files_in_folder = os.listdir("folder1/folder2/")
filename_list = []
for filename in files_in_folder:
    if "sometext" in filename:
        filename_list.append(filename)
read_this_file = "folder1/folder2/" + max(filename_list)
data = pd.read_csv(read_this_file, sep=',')
Explanation:
When you put r before a string, the character following a backslash is included in the string without change, and all backslashes are left in the string.
In your example, if you try to print "\folder1\folder2", Python reads the '\f' part as a special character (a form feed, just as '\n' is a newline).
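Not part of the original answers, but pathlib also sidesteps the escaping problem; a sketch assuming the share really is \\folder1\folder2:
import pandas as pd
from pathlib import Path

folder = Path(r"\\folder1\folder2")
# pick the lexicographically largest matching file name, as max() does in the question
matching = [p for p in folder.iterdir() if "sometext" in p.name]
data = pd.read_csv(max(matching, key=lambda p: p.name))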

How to save the following python output result to .csv format [closed]

How can I save the result from this code to .csv format?
import re
import csv

text = open('example.txt').read()
pattern = r'([0-9]+)[:]([0-9]+)[:](.*)'
regex = re.compile(pattern)
for match in regex.finditer(text):
    result = "{},{}".format(match.group(2), match.group(3))
If I understood your question correctly, you can generate the CSV as follows:
import re

text = open('example.txt').read()
pattern = r'([0-9]+)[:]([0-9]+)[:](.*)'
regex = re.compile(pattern)
with open('csv_file.csv', 'w') as csv_file:
    # Add header row with two columns
    csv_file.write('{},{}\n'.format('Id', 'Title'))
    for match in regex.finditer(text):
        result = "{},{}".format(match.group(2), match.group(3))
        csv_file.write('{}\n'.format(result))
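Since the question already imports csv, a sketch using csv.writer is another option; it also quotes fields that themselves contain commas. The 'Id'/'Title' header names come from the answer above, and example.txt is assumed to contain lines matching the question's regex:
import csv
import re

pattern = re.compile(r'([0-9]+)[:]([0-9]+)[:](.*)')
text = open('example.txt').read()

with open('csv_file.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['Id', 'Title'])
    for match in pattern.finditer(text):
        writer.writerow([match.group(2), match.group(3)])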

How to detect/filter empty string in a column [closed]

How do you filter empty cells in a column in the form
blank = ''
df = df[(df['Remarks'] == blank)]
Please give a suggestion in this form, because I want to add multiple conditions using & or |. I tried this and the output was an error.
It could be that your columns aren't actually blank, but are nan/null.
Try the following:
df = df[df['Remarks'].isnull()]
I've removed the parentheses from your example because you don't need to enclose a single condition; you only need them when passing multiple conditions, as such:
df = df[(df['Remarks'].isnull()) & (df['Remarks'] == 'Some Value')]
I don't know what type of data you have, as you haven't posted any here. But maybe you need to convert your column to string and then do the comparison.
Try this:
blank = ''
df = df[(df['Remarks'].astype(str) == blank)]
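A sketch, from neither answer, that treats both NaN and empty strings as blank and shows the multiple-condition syntax the question asks about (the 'Remarks' column name comes from the question):
import pandas as pd

df = pd.DataFrame({"Remarks": ["ok", "", None, "done"]})

# blank means either an empty string or a missing value
blank_mask = (df["Remarks"] == "") | (df["Remarks"].isnull())
print(df[blank_mask])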
