I have a list as follows: a = ['abc', 'def'], and I have separate files with these names, each containing some text.
Now I have to read the text from those 2 files into 2 separate variables in Python using a for loop, so that c['abc'] = text1 and c['def'] = text2.
Here is my code:
for b in a:
    x = open('/Users/xyz/' + b + '.txt', 'r')
    c[b] = x.read()
But I am getting a NameError saying that c is not defined. Can anyone please help me out with this?
1. You have a list a which contains the names of text files.
2. You want to create separate variables to store the contents of each file.
But the approach you expect would take a lot of unnecessary memory.
3. My suggestion is to store the contents of your text files in a list and then iterate over the list to print the data:
a = ["reverse", "abc", "def"]
li = []
for b in a:
    x = open(r"C:\Users\akash\Desktop\dailyChallaneges\\" + b + ".txt", "r")
    c = x.read()
    li.append(c)
for ele in li:
    print(ele)
OUTPUT:
afzdvxdvdxxrfc
saddfgf
cmskjfnzsk.zdnc
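If you do want the per-file text in a dictionary as the question asks (c['abc'], c['def']), the only missing piece is initializing c before the loop. A minimal sketch, where two throwaway files in a temporary directory stand in for the real files under /Users/xyz/ (the file contents here are made up for illustration):

```python
import os
import tempfile

# Throwaway files standing in for the real ones in the question.
diri = tempfile.mkdtemp() + os.sep
a = ['abc', 'def']
for b in a:
    with open(diri + b + '.txt', 'w') as f:
        f.write('text of ' + b)

c = {}  # initializing c before the loop is what fixes the NameError
for b in a:
    with open(diri + b + '.txt') as x:
        c[b] = x.read()

print(c['abc'])  # text of abc
print(c['def'])  # text of def
```

Using with also makes sure each file handle is closed after reading.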
Hi, sorry if this is an obvious one; I've looked around online and can't seem to find what I'm doing wrong.
I am trying to compare the contents of two lists, in two separate csv files (file A and file B). Both csv files are of x rows, but only 1 column each. File A consists of rows with sentences in each, file B consists of single words. My goal is to search the rows in file B, and if any of these rows appear in file A, append the relevant rows from file A to a separate, empty list to be exported. The code I am running is as follows:
import pandas as pd
#Importing csv files
##File A is 2 rows of sentences in 1 column, e.g. "This list should be picked up for the word DISHWASHER" and "This sentence should not appear in list_AB"
file_A = pd.read_csv(r"C:\Users\User\Desktop\File\file_A.csv")
##File B is 2 rows of singular words that should appear in file A e.g. "DISHWASHER", "QWERTYXYZ123"
file_B = pd.read_csv(r"C:\Users\User\Desktop\File\file_B.csv", converters={i: str for i in range(10)})
#Convert csv files to lists
file_A2 = file_A.values.tolist()
file_B2 = file_B.values.tolist()
#Empty list
list_AB = []
#for loop supposed to filter file_A based on file_B
for x in file_A2:
    words = x[0].split(" ")
    #print(words)
    for y in file_B2:
        #print(y)
        if y in words:
            list_AB.append(x)
print(list_AB)
The problem is that print(list_AB) only returns an empty list ([]), not a filtered version of file_A. The reason I want to do it this way is because the actual csv files I want to read consist of 21600 (file A) and 50400 (file B) rows. Apologies in advance if this is a really basic question.
The problem is in the if-statement y in words.
Here y is a list: you're searching for a list inside a list of strings (not a list of lists), so the test is never true.
Using y[0] in words solves your problem.
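A minimal sketch of the fix, with two in-memory rows standing in for the .values.tolist() output of the two DataFrames (the sentences are the examples from the question):

```python
# Each row from pandas' .values.tolist() is itself a one-element list,
# e.g. ['DISHWASHER'], so the inner comparison must use y[0].
file_A2 = [["This list should be picked up for the word DISHWASHER"],
           ["This sentence should not appear in list_AB"]]
file_B2 = [["DISHWASHER"], ["QWERTYXYZ123"]]

list_AB = []
for x in file_A2:
    words = x[0].split(" ")
    for y in file_B2:
        if y[0] in words:   # y[0] is the string inside the row list
            list_AB.append(x)

print(list_AB)
```

With the original test y in words, list_AB would stay empty, which is exactly the symptom described above.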
This is my code:
files = open('clean.txt').readlines()
print files
finallist = []
for items in files:
    new = items.split()
    new.append(finallist)
And since the text file is too huge, here is an example of what "print files" shows:
files = ['chemistry leads outstanding another story \n', 'rhapsodic moments blow narrative prevent bohemian rhapsody']
I really need each line (separated by '\n') to be split into words and placed in a list of lists, just like the format below:
outcome = [['chemistry','leads','outstanding', 'another', 'story'],['rhapsodic','moments','blow', 'narrative', 'prevent', 'bohemian', 'rhapsody']]
I've tried methods just like the first code given, and it returns an empty list. Please help! Thanks in advance.
The last line of your code is backwards, it seems. Instead of
new.append(finallist)
it should be
finallist.append(new)
I changed the last line to the version above, and the result was a list (finallist) containing 2 sub-lists. Here is the code that seems to work:
files = open('clean.txt').readlines()
print files
finallist = []
for items in files:
    new = items.split()
    finallist.append(new)
You can use a list comprehension to shorten this:
finallist = [i.split() for i in files]
My purpose is to extract one certain column from the multiple data files.
So, I tried to use the glob module to read the files and to extract one column from each file with for statements like below:
filin = diri + '*_7.txt'
FileList = sorted(glob.glob(filin))
for INPUT in FileList:
    a = []
    b = []
    c = []
    T = []
    f = open(INPUT, 'r')
    f.seek(0, 0)
    for columns in (raw.strip().split() for raw in f):
        b.append(columns[11])
    t = np.array(b, float)
    print t
    t = list(t)
    T = T + [t]
    f.close()
print T
The number of data files I used is 32, so I expected the second 'for' statement to run only 32 times, generating only 32 arrays t. However, the result doesn't look like what I expected.
I assume it may be due to the influence of the first 'for' statement, but I am not sure.
Any idea or help would be really appreciated.
Thank you,
Isaac
You reset T = [] for every file. Move the T = [] line above the first loop.
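A self-contained sketch of the fix, where two tiny throwaway files stand in for the 32 real data files (their contents are made up so that column index 11 holds the value of interest):

```python
import glob
import os
import tempfile

import numpy as np

# Throwaway files standing in for the real *_7.txt data files.
diri = tempfile.mkdtemp() + os.sep
for name, row in [("a_7.txt", "0 1 2 3 4 5 6 7 8 9 10 1.5"),
                  ("b_7.txt", "0 1 2 3 4 5 6 7 8 9 10 2.5")]:
    with open(diri + name, "w") as f:
        f.write(row + "\n")

T = []  # initialized ONCE, outside the per-file loop, so it is never reset
for INPUT in sorted(glob.glob(diri + "*_7.txt")):
    b = []
    with open(INPUT) as f:
        for raw in f:
            b.append(raw.strip().split()[11])
    T.append(np.array(b, float).tolist())

print(T)  # one sub-list per file
```

With T inside the loop, only the last file's data would survive; moved out, T accumulates one sub-list per file, which is what the question expects.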
I am really new to Python, and now I am struggling with some problems while working on a student project. Basically, I try to read data from a text file which is formatted in columns. I store the data in a list of lists, sort and manipulate it, and write it to a file again. My problem is aligning the written data in proper columns. I found some approaches like
"%i, %f, %e" % (1000, 1000, 1000)
but I don't know how many columns there will be. So I wonder if there is a way to set all columns to a fixed width.
This is how the input data looks like:
2 232.248E-09 74.6825 2.5 5.00008 499.482
5 10. 74.6825 2.5 -16.4304 -12.3
This is how I store the data in a list of list:
filename = getInput('MyPath', workdir)
lines = []
f = open(filename, 'r')
while 1:
    line = f.readline()
    if line == '':
        break
    splitted = line.split()
    lines.append(splitted)
f.close()
To write the data, I first join all the row elements of the list of lists into one string, with a fixed amount of free space between the elements. But instead I need a fixed total width including the element, and I don't know the number of columns in the file either.
for k in xrange(len(lines)):
    stringlist = ""
    for i in lines[k]:
        stringlist = stringlist + str(i) + '   '
    lines[k] = stringlist + '\n'
f = open(workdir2, 'w')
for i in range(len(lines)):
    f.write(lines[i])
f.close()
This code works basically, but sadly the output isn't formatted properly.
Thank you very much in advance for any help on this issue!
You are absolutely right about being able to format widths as you did above using string formatting. But as you correctly point out, the tricky bit is doing this for a variable-sized output list. Instead, you could use the join() function:
output = ['a', 'b', 'c', 'd', 'e']
# format each column (len(output) of them) with a width of 10 spaces
width = [10] * len(output)
# write it out, using the join() function
with open('output_example', 'w') as f:
    f.write(''.join('%*s' % i for i in zip(width, output)))
will write out:
'         a         b         c         d         e'
As you can see, the length of the format array width is determined by the length of the output, len(output). This is flexible enough that you can generate it on the fly.
Hope this helps!
String formatting might be the way to go:
>>> print("%10s%9s" % ("test1", "test2"))
     test1    test2
Though you might want to first create strings from those numbers and then format them as I showed above.
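A small sketch of that idea, applied to the question's first data row: convert each value with str(), then pad every column to the same fixed total width (12 here, an arbitrary choice for illustration), no matter how many columns the row has:

```python
# One row from the question's input data, already parsed into values.
row = [2, 232.248e-09, 74.6825, 2.5, 5.00008, 499.482]

# str() turns each value into text; '%12s' then right-aligns it in a
# fixed 12-character field, so columns line up across rows.
line = "".join("%12s" % str(v) for v in row)
print(line)
```

Because the format is applied per element, the same one-liner works for any number of columns.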
I cannot fully comprehend your writing code, but try working on it somehow like this (note that enumerate is a built-in, so no import is needed, and the % operands must be grouped in a tuple):
with open(workdir2, 'w') as datei:
    for key, item in enumerate(zeilen):
        line = "%4i %12s\n" % (key, item)
        datei.write(line)
Going to re-word the question.
Basically I'm wondering what is the easiest way to manipulate a string formatted like this:
Safety/Report/Image/489
or
Safety/Report/Image/490
And sectioning off each word separated by a slash (/), and storing each section (token) in a store so I can call it later. (Reading in about 1200 cells from a CSV file.)
The answer to your question:
>>> mystring = "Safety/Report/Image/489"
>>> mystore = mystring.split('/')
>>> mystore
['Safety', 'Report', 'Image', '489']
>>> mystore[2]
'Image'
>>>
If you want to store data from more than one string, then you have several options depending on how do you want to organize it. For example:
liststring = ["Safety/Report/Image/489",
              "Safety/Report/Image/490",
              "Safety/Report/Image/491"]
dictstore = {}
for line, string in enumerate(liststring):
    dictstore[line] = string.split('/')
print dictstore[1][3]
print dictstore[2][3]
prints:
490
491
In this case you can use a dictionary or a list (a list of lists) for storage in the same way. In case each string has a special identifier (one better than the line number), the dictionary is the option to choose.
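For instance, if the trailing number in each string were a unique identifier (an assumption here, just for illustration), you could key the dictionary on it instead of the line number:

```python
liststring = ["Safety/Report/Image/489",
              "Safety/Report/Image/490"]

dictstore = {}
for string in liststring:
    parts = string.split('/')
    dictstore[parts[-1]] = parts   # key on the final token, assumed unique

print(dictstore["490"])
```

Lookups then work by identifier rather than by position, which survives reordering of the input.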
I don't quite understand your code and don't have too much time to study it, but I thought that the following might be helpful, at least if order isn't important ...
in_strings = ['Safety/Report/Image/489',
              'Safety/Report/Image/490',
              'Other/Misc/Text/500']
out_dict = {}
for in_str in in_strings:
    level1, level2, level3, level4 = in_str.split('/')
    out_dict.setdefault(level1, {}).setdefault(
        level2, {}).setdefault(
            level3, []).append(level4)
print out_dict
{'Other': {'Misc': {'Text': ['500']}}, 'Safety': {'Report': {'Image': ['489', '490']}}}
If your csv is line separated:
# do something to load the csv
split_lines = [x.strip() for x in csv_data.split('\n')]
for line_data in split_lines:
    split_parts = [x.strip() for x in line_data.split('/')]
    # do something with individual part data
    # such as some_variable = split_parts[1] etc
    # if using indexes, I'd be sure to catch index errors in case you
    # try to go to index 3 of something with only 2 parts
Check out the Python csv module for some importing help (I'm not too familiar with it).
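A small sketch of what that could look like with the csv module, where an io.StringIO stands in for the real 1200-cell file:

```python
import csv
import io

# In-memory stand-in for the real one-column CSV file.
csv_data = io.StringIO("Safety/Report/Image/489\nSafety/Report/Image/490\n")

store = []
for row in csv.reader(csv_data):
    # each row has a single cell; split that cell on '/'
    parts = row[0].split('/')
    store.append(parts)

print(store)
```

For a real file, replace the StringIO with open('yourfile.csv', newline=''); csv.reader handles the line splitting for you.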