Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 2 years ago.
I am trying to read the contents of a .csv file from a folder. Example code:
files_in_folder = os.listdir(r"\\folder1\folder2")
filename_list = []
for filename in files_in_folder:
    if "sometext" in filename:
        filename_list.append(filename)
read_this_file = "\\folder1\folder2" + max(filename_list)
data = pandas.read_csv(read_this_file, sep=',')
Fetching the max filename works, but the pandas.read_csv call fails:
FileNotFoundError: no such file or directory.
I am able to access the folder, as the first line of my code shows, but when I combine the two strings, putting the r in front doesn't work. Any ideas?
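You need to add a backslash between the folder and the file name when concatenating, and keep the folder part a raw string so that \f is not read as an escape:
read_this_file = r'\\folder1\folder2' + '\\' + max(filename_list)
But a better way to avoid that problem is to use
os.path.join(r"\\folder1\folder2", max(filename_list))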
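For working code, use this (with a relative folder path and forward slashes):
import os
import pandas as pd

files_in_folder = os.listdir("folder1/folder2/")
filename_list = []
for filename in files_in_folder:
    if "sometext" in filename:
        filename_list.append(filename)
# The folder string ends with a separator, so plain concatenation is safe here.
read_this_file = "folder1/folder2/" + max(filename_list)
data = pd.read_csv(read_this_file, sep=',')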
Explanation:
When you put r before a string literal, backslashes are not treated as escape characters: every backslash stays in the string, together with the character that follows it.
In your example, without the r prefix, Python reads the '\f' in "\\folder1\folder2" as a special character (a form feed, just as \n is a newline), and the leading "\\" collapses to a single backslash.
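The difference can be seen with a small sketch. It only inspects the two string literals and joins a file name with ntpath (the Windows flavour of os.path, which can be imported on any platform); "report.csv" is a hypothetical file name, not one from the question.

```python
import ntpath  # the Windows flavour of os.path, importable on any platform

plain = "\\folder1\folder2"  # no r prefix: "\\" becomes one backslash, "\f" a form feed
raw = r"\\folder1\folder2"   # r prefix: every backslash is kept

print(repr(plain))
print(repr(raw))

# "report.csv" is a hypothetical file name used only for illustration.
joined = ntpath.join(raw, "report.csv")
print(joined)
```

On Windows, os.path.join behaves like ntpath.join, so the separator between the folder and the file name is inserted for you.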
Closed 1 year ago.
In the 'id_path' column of a CSV file, I want to remove everything in each path before the image file name:
./input/skin-cancer-malignant-vs-benign/data/test/benign/454.jpg
./input/skin-cancer-malignant-vs-benign/data/test/benign/90.jpg
./input/skin-cancer-malignant-vs-benign/data/test/benign/147.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/771.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/208.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/1383.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/1354.jpg
The output should be:
454.jpg
90.jpg
147.jpg
771.jpg
208.jpg
1383.jpg
1354.jpg
rsplit() splits the string starting from the right, and the 1 tells Python to stop after the first split, so only the final path component is separated.
txt = "./input/skin-cancer-malignant-vs-benign/data/test/benign/454.jpg"
x = txt.rsplit("/", 1)
# x[1] is the file name
print(x[1])
On your dataframe you could do something like:
train_df['id_path'] = train_df['id_path'].apply(lambda x: x.rsplit('/', 1)[-1])
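Using str.replace (regex=True is needed in pandas 2.0 and later, where str.replace treats the pattern as a literal string by default):
df["filename"] = df["path"].str.replace(r'^.*/', '', regex=True)
We could also use str.extract here (expand=False returns a Series instead of a one-column DataFrame):
df["filename"] = df["path"].str.extract(r'([^/]+\.\S+$)', expand=False)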
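For comparison, here is a sketch of the same idea using only the standard library. It assumes the paths always use forward slashes, as in the question; "no_directory.jpg" is a hypothetical entry added to show what happens when a path has no slash at all.

```python
import posixpath

paths = [
    "./input/skin-cancer-malignant-vs-benign/data/test/benign/454.jpg",
    "./input/skin-cancer-malignant-vs-benign/data/test/malignant/771.jpg",
    "no_directory.jpg",  # hypothetical entry without any directory part
]

# rsplit with [-1] keeps working when there is no "/" in the string.
by_rsplit = [p.rsplit("/", 1)[-1] for p in paths]

# posixpath.basename returns everything after the last "/".
by_basename = [posixpath.basename(p) for p in paths]

print(by_rsplit)
print(by_basename)
```

Both approaches return the file name unchanged when the path has no directory part, whereas indexing the rsplit result with [1] would raise an IndexError in that case.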
Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 3 years ago.
I am trying to make a function that takes typically copy-pasted text that very often includes \n characters. An example of such is as follows:
func('''This
is
some
text
that I entered''')
The problem with this function is that the text can sometimes be rather large, so going through it line by line to avoid ', " or ''' isn't feasible. A piece of text that can cause issues is as follows:
func('''This
is'''
some"
text'
that I entered''')
I wanted to know if there is any way I can take the text as seen in the second example and use it as a string regardless of what it is comprised of.
Thanks!
To my knowledge, you won't be able to paste the text directly into your source file. However, you could paste it into a text file and read it from there.
The example below uses a regex to find the quote characters (' and ", and therefore also ''') and escape each one with a backslash.
Example Python:
import re

def read_paste(file):
    # Read the pasted text from a file and backslash-escape every quote character.
    with open(file, 'r') as f:
        data = f.readlines()
    for i, line in enumerate(data):
        data[i] = re.sub('("|\')', r'\\\1', line)
    # Join the escaped lines back into a single string.
    return ''.join(data)
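As a rough sketch of why the text-file route sidesteps the quoting problem: text read from a file is already an ordinary string, so quotes inside it need no special handling. The file name "paste.txt" and the temporary directory are assumptions for illustration, and the pasted literal stands in for text the user would save by hand.

```python
import tempfile
from pathlib import Path

# Stand-in for the text a user would paste into the file by hand.
pasted = "This\nis'''\nsome\"\ntext'\nthat I entered"

with tempfile.TemporaryDirectory() as tmp:
    paste_file = Path(tmp) / "paste.txt"  # hypothetical file name
    paste_file.write_text(pasted, encoding="utf-8")

    # Reading the file gives back the text unchanged, quotes included.
    text = paste_file.read_text(encoding="utf-8")

print(repr(text))
```

Escaping, as in read_paste above, is only needed if the text is later going to be embedded in Python source code again.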
Closed 4 years ago.
Basically, I have a txt file which is full of numbers 1-5000, in no order. I am trying to import them into a python script to manipulate them and find info on averages, and whatnot.
I've tried many different methods of importing the list, but it always errors with "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte"
list = []
with open('numbers.txt', 'r') as f:
    content = f.readlines()
    for x in content:
        row = x.split()
        list.append(int(row[0]))
print(list)
The expected result is a list of numbers, in int format
However, I either get that error or in certain executions, I get a list filled with \x00 between every character.
The 0xff byte at position 0 suggests a UTF-16 byte order mark, so you can try decoding the file as UTF-16 and then splitting as in your code.
My code is below.
with open(path_to_file, 'rb') as f:
    contents = f.read()
# Decode first, then strip: calling rstrip("\n") on bytes raises a TypeError in Python 3.
contents = contents.decode("utf-16").rstrip("\n")
Hope it helps.
MV
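A minimal end-to-end sketch of the same idea, assuming the file really is UTF-16 with a byte order mark: passing encoding="utf-16" to open() lets the text be read and split directly. The helper name read_numbers and the sample file are hypothetical, and unlike the question's code it converts every whitespace-separated token rather than only the first one on each line.

```python
import tempfile
from pathlib import Path

def read_numbers(path, encoding="utf-16"):
    """Return every whitespace-separated integer in the file."""
    with open(path, "r", encoding=encoding) as f:
        return [int(token) for line in f for token in line.split()]

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample file, written as UTF-16 (with a byte order mark).
    sample = Path(tmp) / "numbers.txt"
    sample.write_text("42\n7\n4999\n", encoding="utf-16")
    numbers = read_numbers(sample)

print(numbers)
```

If the file turns out to use a different encoding, only the encoding argument needs to change.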
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 4 years ago.
In a Linux directory, I have several numbered files, such as "day1" and "day2". My goal is to write code that finds the biggest number among the file names, adds 1 to it, and creates a new file with that number. So, for example, if there are files 'day1', 'day2' and 'day3', the code should read the list of files and add 'day4'. To do so, I at least need to know how to retrieve the numbers from the file names.
I'd use os.listdir to get all the file names, remove the "day" prefix, convert the remaining characters to integers, and take the maximum.
From there, it's just a matter of incrementing the number and appending it to the same prefix:
import os
max_file = max([int(f[3:]) for f in os.listdir('some_directory')])
new_file = 'day' + str(max_file + 1)
Get all the files with the os module (I don't have the exact command handy) and then use the re module to extract the numbers. If you don't want to look into regex, you could remove the letters from the file name with replace() and convert the remaining string with int().
Glob would be good for this. It is kind of regex, but specially for file search and simpler. Basically you just use * as a wildcard, and you can select numbers too. Just google what it exactly is. It can be pretty powerful and is native to the bash shell for example.
from glob import glob
from pathlib import Path
pattern = "day"
last_file_number = max(map(lambda f: int(f[len(pattern):]), glob(pattern + "[0-9]*")))
Path("%s%d" % (pattern, last_file_number + 1)).touch()
You can also see that I use pathlib here. This is a library for dealing with the file system in an object-oriented manner. Some people like it, some don't.
So, a little disclaimer: glob is not as powerful as regex. With the pattern day[0-9]*, daydream, for example, won't be matched, but day0dream would still be matched (and int() would then fail on it). You can also try day*[0-9], but then daydream0 would still be matched. Of course, you can also use day[0-9] if you know you will stay below double digits. So, if your use case requires it, you can use glob and filter the results down with regex.
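The regex filtering mentioned above could look roughly like this. It is a sketch under the assumption that valid names are exactly "day" followed by digits; the helper name next_day_name and the sample file names are hypothetical, and the walrus operator requires Python 3.8 or later.

```python
import re
import tempfile
from pathlib import Path

# Assumption: a valid name is exactly "day" followed by one or more digits.
DAY_FILE = re.compile(r"day(\d+)")

def next_day_name(directory):
    """Return the name of the next dayN file for the given directory."""
    numbers = [
        int(m.group(1))
        for p in Path(directory).iterdir()
        if (m := DAY_FILE.fullmatch(p.name))
    ]
    return "day{}".format(max(numbers, default=0) + 1)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical directory contents, including names glob alone would mishandle.
    for name in ("day1", "day2", "day10", "daydream", "day0dream"):
        (Path(tmp) / name).touch()
    new_name = next_day_name(tmp)
    (Path(tmp) / new_name).touch()

print(new_name)
```

Using fullmatch means names such as daydream and day0dream are simply skipped instead of causing int() to fail.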
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions concerning problems with code you've written must describe the specific problem — and include valid code to reproduce it — in the question itself. See SSCCE.org for guidance.
Closed 9 years ago.
I'm new to Python and I want to do what I said above, but I don't have any ideas. How can I do it?
From the code in your comment (you should put this in your question), it is with reading the lines from a file that you're struggling.
The idiomatic way of doing this is like so:
with open("hello.txt") as f:
    for line in f:
        # Each line already ends with a newline, so suppress print's own.
        print(line, end='')
See File Objects in the official Python documentation.
Plugging this into your code (and removing the newline and any spaces from each line with str.strip()):
#!/usr/bin/env python
import mechanize

br = mechanize.Browser()
br.set_handle_redirect(False)
with open('urls.txt') as urls:
    for url in urls:
        # Remove the trailing newline and any surrounding spaces.
        stripped = url.strip()
        print('[{}]: '.format(stripped), end='')
        try:
            br.open_novisit(stripped)
            print('Funfando!')
        except Exception as e:
            print(e)
Note that URLs start with a scheme name (commonly called a protocol, such as http), followed by a colon and two slashes, hence:
[stackoverflow.com]: can't fetch relative reference: not viewing any document
but
[http://stackoverflow.com/]: Funfando!
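One way to catch a missing scheme before trying to open the URL is to parse it first. This is only a sketch using the standard library's urllib.parse, not part of mechanize; the helper name has_http_scheme and the sample lines are hypothetical.

```python
from urllib.parse import urlsplit

def has_http_scheme(url):
    """Return True if the URL has an http(s) scheme and a host part."""
    parts = urlsplit(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)

# Hypothetical lines as they might be read from urls.txt.
for url in ["stackoverflow.com\n", "http://stackoverflow.com/\n"]:
    stripped = url.strip()
    status = "ok" if has_http_scheme(stripped) else "missing scheme"
    print("[{}]: {}".format(stripped, status))
```

Lines that fail the check could be skipped, or reported to the user, instead of being passed on to the browser object.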
Open the file. Iterate through the lines. Fetch the files and check for errors.