I have to change the value of the res variable in the following code on every iteration of the loop.
txt = open(os.path.expanduser('~FOLDER\\numbers.txt'), 'r')
res = txt.read().splitlines()
u = [something]
for item in u:
    var['Number : ' + res[0]]
The txt variable holds an open text file. This text file contains lines of numbers in this format:
123
1234
125342
562546
I need to take a different value from res on each iteration of the loop. At the moment, with res[0], it uses the same number (e.g. 123) on every iteration. How can I solve this problem?
The index should be 0 on the first iteration, 1 on the second, and so on.
I think this should do the job:
with open(os.path.expanduser('~FOLDER\\numbers.txt'), 'r') as res:
    for line in res:
        # strip() drops the trailing newline from each line
        var['Number : ' + line.strip()]
txt = open(os.path.expanduser('~FOLDER\\numbers.txt'), 'r')
res = txt.read().splitlines()
u = [something]
for index, item in enumerate(u):
    var['Number : ' + res[index]]
More info about enumerate here: https://docs.python.org/2/library/functions.html#enumerate
I assume you want to iterate through u and res simultaneously. In this case, you can either use a plain index from a for loop over a range, or use enumerate as follows:
txt = open(os.path.expanduser('~FOLDER\\numbers.txt'), 'r')
res = txt.read().splitlines()
u = [something]
for index, item in enumerate(u):
    var['Number : ' + res[index]]
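If the goal is only to pair each item of u with the matching line of res, zip can do the pairing without an explicit index. The following is a minimal sketch, not the asker's actual code: the sample values for u and res are made up for illustration, and note that zip stops at the shorter of the two sequences.

```python
# Hypothetical stand-ins for the real data: `res` holds the lines of the
# numbers file, `u` is whatever sequence the loop runs over.
res = ['123', '1234', '125342', '562546']
u = ['a', 'b', 'c']

# zip pairs the first item with the first number, the second with the
# second, and so on; it stops as soon as the shorter sequence runs out.
labels = ['Number : ' + number for item, number in zip(u, res)]

for label in labels:
    print(label)
```

If u can be longer than res, the enumerate version raises an IndexError, whereas this version silently stops early, so which one fits depends on how mismatched lengths should be handled.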
I have a text file of car prices and their serial numbers; there are 50 lines in this file. I would like to find the maximum car price and its serial number for every 10 lines.
priceandserial.txt
102030 4000.30
102040 5000.40
102080 5500.40
102130 4000.30
102140 5000.50
102180 6000.50
102230 2000.60
102240 4000.30
102280 6000.30
102330 9000.70
102340 1000.30
102380 3000.30
102430 4000.80
102440 5000.30
102480 7000.30
When I tried NumPy's max function, I got 102480 as the max value.
x = np.loadtxt('carserial.txt', unpack=True)
print('Max:', np.max(x))
Desired result:
102330 9000.70
102480 7000.30
There are 50 lines in the file, so I should get a 5-line result with the serial number and maximum price for each block of 10 lines.
Respectfully, I think the first solution is over-engineered. You don't need numpy or math for this task, just a dictionary. As you loop through, you update the dictionary if the latest value is greater than the current maximum, and do nothing if it isn't. Every 10th item, you append the values from the dictionary to an output list and reset the buffer.
with open('filename.txt', 'r') as opened_file:
    data = opened_file.read()
rowsplitdata = data.splitlines()
# skip blank lines so a trailing newline does not break the parsing
colsplitdata = [u.split() for u in rowsplitdata if u.strip()]
x = [[int(j[0]), float(j[1])] for j in colsplitdata]
output = []
buffer = {"max": 0, "index": 0}
count = 0
# this assumes x is a list of lists, not a numpy array
for u in x:
    count += 1
    if u[1] > buffer["max"]:
        buffer["max"] = u[1]
        buffer["index"] = u[0]
    if count == 10:
        output.append([buffer["index"], buffer["max"]])
        buffer = {"max": 0, "index": 0}
        count = 0
# append the remainder of the buffer in case you didn't get to ten in the final pass
if count:
    output.append([buffer["index"], buffer["max"]])
output
[[102330, 9000.7], [102480, 7000.3]]
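The same block-by-block idea can also be written with list slices. This is only a sketch under the assumption that the data is already a list of [serial, price] rows; the function name max_per_chunk and its size parameter are made up for illustration, and the rows are the 15 sample lines from the question.

```python
def max_per_chunk(rows, size=10):
    """Return the [serial, price] row with the highest price in each chunk."""
    result = []
    for start in range(0, len(rows), size):
        # The slice stops at the end of the list, so the last chunk may be shorter.
        chunk = rows[start:start + size]
        result.append(max(chunk, key=lambda row: row[1]))
    return result


rows = [
    [102030, 4000.30], [102040, 5000.40], [102080, 5500.40],
    [102130, 4000.30], [102140, 5000.50], [102180, 6000.50],
    [102230, 2000.60], [102240, 4000.30], [102280, 6000.30],
    [102330, 9000.70], [102340, 1000.30], [102380, 3000.30],
    [102430, 4000.80], [102440, 5000.30], [102480, 7000.30],
]

print(max_per_chunk(rows))
```

Because range steps by size, an input whose length is an exact multiple of 10 produces no empty trailing chunk.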
You should iterate over it and, for each 10 lines, extract the maximum:
import math
# New empty list for collecting the results
max_list = []
# Iterate through x in blocks of 10, supposing x is a list of [serial, price] rows
for i in range(math.ceil(len(x) / 10)):
    # Take rows i*10 up to (i+1)*10; the slice simply stops at the end of x,
    # so the last block holds whatever rows remain
    block = x[i * 10:(i + 1) * 10]
    # list.append returns None, so do not assign its result back to max_list
    max_list.append(max(block, key=lambda row: row[1]))
This should do your job.
number_list = [[], []]
with open('filename.txt', 'r') as opened_file:
    for line in opened_file:
        if len(line.split()) == 0:
            continue
        else:
            a, b = line.split()
            # convert to numbers so max compares values, not strings
            number_list[0].append(int(a))
            number_list[1].append(float(b))
col1_max, col2_max = max(number_list[0]), max(number_list[1])
col1_max, col2_max
Just change the filename. col1_max, col2_max have the respective column's max value. You can edit the code to accommodate more columns.
You can transpose your input first, then use np.split and for each submatrix you calculate its max.
import numpy as np
x = np.genfromtxt('carserial.txt', unpack=True).T
print(x)
# this gives blocks of 10 rows only when len(x) is a multiple of 10
for submatrix in np.split(x, len(x) // 10):
    print(max(submatrix, key=lambda l: l[1]))
I'm a beginner.
I have a list and I need to replace the first ";" with "\n[" and the third ";" with "]".
I have this:
print(lista)
>A0A024;167;188;22;DiPPE
>A0AV;1;25;25;DiWC
>A0AV6;38;58;21;Diwc
>A0AV7;408;432;25;Diwc
I tried:
lista1=str(lista).replace(";","\n[",1)
but it only replaces the first one in the whole list:
>A0A024
[167;188;22;DiPPE
>A0AV;1;25;25;DiWC
>A0AV6;38;58;21;DiwC
>A0AV7;408;432;25;DiwC
It needs to be:
>A0A024
[167;188]22;DiPPE
>A0AV
[1;25]25;DiWC
>A0AV6
[38;58]21;DiwC
>A0AV7
[408;432]25;DiwC
Create the data
parts = """A0A024;167;188;22;DiPPE
A0AV;1;25;25;DiWC
A0AV6;38;58;21;Diwc
A0AV7;408;432;25;Diwc""".split("\n")
Go over the lines, split each one at ; and recombine as wanted:
for idx, line in enumerate(parts):
    # make it a list without any ;
    pp = line.split(";")
    # make it a string and reassign into parts
    parts[idx] = pp[0] + "\n[" + pp[1] + ";" + pp[2] + "]" + ";".join(pp[3:])
print(parts)
for p in parts:
    print(p)
Output:
# data as list
['A0A024\n[167;188]22;DiPPE', 'A0AV\n[1;25]25;DiWC',
'A0AV6\n[38;58]21;Diwc', 'A0AV7\n[408;432]25;Diwc']
# data linewise
A0A024
[167;188]22;DiPPE
A0AV
[1;25]25;DiWC
A0AV6
[38;58]21;Diwc
A0AV7
[408;432]25;Diwc
You can use str.replace and a list comprehension.
The first replace changes the first 3 ';' to ']'.
The second changes the first 2 ']' back to ';'.
And the last one changes the first ';' to '\n['.
data = [">A0A024;167;188;22;DiPPE",
        ">A0AV;1;25;25;DiWC",
        ">A0AV6;38;58;21;Diwc",
        ">A0AV7;408;432;25;Diwc"]
res = [s.replace(';', ']', 3).replace(']', ';', 2).replace(';', '\n[', 1) for s in data]
for s in res:
    print(s)
You can split each line on ; and create a new line by formatting the parts:
def format_line(line):
    return '{0}\n[{1};{2}]{3};{4}'.format(*line.split(';'))
Using this function, you can do:
data = """A0A024;167;188;22;DiPPE
A0AV;1;25;25;DiWC
A0AV6;38;58;21;Diwc
A0AV7;408;432;25;Diwc"""
lines = data.split('\n')
out = '\n'.join([format_line(line) for line in lines])
Output:
print(out)
A0A024
[167;188]22;DiPPE
A0AV
[1;25]25;DiWC
A0AV6
[38;58]21;Diwc
A0AV7
[408;432]25;Diwc
Here is a solution with re:
import re
# my guess on how lista looks like, it is more useful to show the actual variable, than the output btw.
lista = ">A0A024;167;188;22;DiPPE\n>A0AV;1;25;25;DiWC\n>A0AV6;38;58;21;Diwc\n>A0AV7;408;432;25;Diwc"
lista = lista.split("\n")
lista1 = []
for elem in lista:
    lista1.append(re.sub(r'(.+?);(.+?;.+?);', r'\1\n[\2]', elem))
print(*lista1, sep='\n')
For the first element in the list, r'(.+?);(.+?;.+?);' matches >A0A024;167;188;, and that match is replaced by its two captured groups (>A0A024 and 167;188) arranged according to the pattern r'\1\n[\2]'.
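Another way to express "only the first three separators matter" is the maxsplit argument of str.split. This is a sketch rather than any of the answers above; the function name bracket_fields is made up, and it assumes every line contains at least three ';' (otherwise the unpacking raises ValueError).

```python
def bracket_fields(line):
    # Split on the first three ';' only; everything after them stays in `rest`,
    # including any further ';' characters.
    head, start, end, rest = line.split(';', 3)
    return head + '\n[' + start + ';' + end + ']' + rest


data = [">A0A024;167;188;22;DiPPE",
        ">A0AV;1;25;25;DiWC"]

for line in data:
    print(bracket_fields(line))
```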
I'm new to programming and Python, and I'm looking for a way to distinguish between two input formats in the same input text file. For example, let's say I have an input file like this, where values are comma-separated:
5
Washington,A,10
New York,B,20
Seattle,C,30
Boston,B,20
Atlanta,D,50
2
New York,5
Boston,10
The format is N followed by N lines of Data1, then M followed by M lines of Data2. I tried opening the file, reading it line by line and storing it in one single list, but I'm not sure how to go about producing 2 lists for Data1 and Data2, such that I would get:
Data1 = ["Washington,A,10", "New York,B,20", "Seattle,C,30", "Boston,B,20", "Atlanta,D,50"]
Data2 = ["New York,5", "Boston,10"]
My initial idea was to iterate through the list until I found an integer i, remove the integer from the list and continue for the next i iterations all while storing the subsequent values in a separate list, until I found the next integer and then repeat. However, this would destroy my initial list. Is there a better way to separate the two data formats in different lists?
You could use itertools.islice and a list comprehension:
from itertools import islice
string = """
5
Washington,A,10
New York,B,20
Seattle,C,30
Boston,B,20
Atlanta,D,50
2
New York,5
Boston,10
"""
result = [[x for x in islice(parts, idx + 1, idx + 1 + int(line))]
          for parts in [string.split("\n")]
          for idx, line in enumerate(parts)
          if line.isdigit()]
print(result)
This yields
[['Washington,A,10', 'New York,B,20', 'Seattle,C,30', 'Boston,B,20', 'Atlanta,D,50'], ['New York,5', 'Boston,10']]
For a file, you need to change it to:
with open("testfile.txt", "r") as f:
    result = [[x for x in islice(parts, idx + 1, idx + 1 + int(line))]
              for parts in [f.read().split("\n")]
              for idx, line in enumerate(parts)
              if line.isdigit()]
print(result)
You're definitely on the right track.
If you want to preserve the original list here, you don't actually have to remove integer i; you can just go on to the next item.
Code:
originalData = []
formattedData = []
with open("data.txt", "r") as f:
    f = list(f)
originalData = f
i = 0
while i < len(f):  # Iterate through every line
    try:
        n = int(f[i])  # See if line can be cast to an integer
        originalData[i] = n  # Change string to int in original
        formattedData.append([])
        for j in range(n):
            i += 1
            item = f[i].replace('\n', '')
            originalData[i] = item  # Remove newline char in original
            formattedData[-1].append(item)
    except ValueError:
        print("File has incorrect format")
    i += 1
print(originalData)
print(formattedData)
The following code will produce a list results which is equal to [Data1, Data2].
The code assumes that the number of entries specified is exactly the amount that there is. That means that for a file like this, it will not work.
2
New York,5
Boston,10
Seattle,30
The code:
# get the data from the text file
with open('filename.txt', 'r') as file:
    lines = file.read().splitlines()
results = []
index = 0
while index < len(lines):
    # Find the start and end values.
    start = index + 1
    end = start + int(lines[index])
    # Everything from the start up to and excluding the end index gets added
    results.append(lines[start:end])
    # Update the index
    index = end
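If the counts in the file cannot be trusted, the same count-prefixed format can be read with a single iterator and checked along the way. The following is a sketch only: the function name split_blocks is made up, and it assumes the lines have already been read into a list with newlines removed.

```python
from itertools import islice


def split_blocks(lines):
    """Split count-prefixed lines into one list per block."""
    blocks = []
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines between blocks
        count = int(header)  # raises ValueError if the header is not a number
        # islice consumes the next `count` lines from the same iterator,
        # so the outer loop resumes at the following header.
        block = list(islice(it, count))
        if len(block) != count:
            raise ValueError('expected %d lines, got %d' % (count, len(block)))
        blocks.append(block)
    return blocks


lines = ['5', 'Washington,A,10', 'New York,B,20', 'Seattle,C,30',
         'Boston,B,20', 'Atlanta,D,50', '2', 'New York,5', 'Boston,10']
data1, data2 = split_blocks(lines)
print(data1)
print(data2)
```

The original list is never modified, and a file with more entries than its header announces fails with a ValueError when the extra line is read as a header.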
I am new to Python and am trying to write my dictionary values to a file using Python 2.7. Each value in my dictionary D is a list with at least 2 items.
The dictionary has TERM_ID as its key, and
each value has the format [[DOC42, POS10, POS22], [DOC32, POS45]].
This means the TERM_ID (key) occurs in DOC42 at positions POS10 and POS22, and also in DOC32 at POS45.
So I have to write to a new file in this format, with a new line for each TERM_ID:
TERM_ID (tab) DOC42:POS10 (tab) 0:POS22 (tab) DOC32:POS45
The following code should help you understand exactly what I am trying to do.
for key,valuelist in D.items():
    #first value in each list is an ID
    docID = valuelist[0][0]
    for lst in valuelist:
        file.write('\t' + lst[0] + ':' + lst[1])
        lst.pop(0)
        lst.pop(0)
        for n in range(len(lst)):
            file.write('\t0:' + lst[0])
            lst.pop(0)
The output I get is :
TERM_ID (tab) DOC42:POS10 (tab) 0:POS22
DOC32:POS45
I tried using the newline character, as well as commas, in a number of places to keep the file writing on the same line, but it did not work. I fail to understand how file writing really works.
Any kind of input will be helpful. Thanks!
@Falko I could not find a way to attach the text file, so here is my sample data:
879\t3\t1
162\t3\t1
405\t4\t1455
409\t5\t1
13\t6\t15
417\t6\t13
422\t57\t1
436\t4\t1
141\t8\t1
142\t4\t145
170\t8\t1
11\t4\t1
184\t4\t1
186\t8\t14
My sample running code is -
with open('sampledata.txt','r') as sample,open('result.txt','w') as file:
    d = {}
    #term= ''
    #docIndexLines = docIndex.readlines()
    #form a d with format [[doc a, pos 1, pos 2], [doc b, pos 3, pos 8]]
    for l in sample:
        tID = -1
        someLst = l.split('\\t')
        #if len(someLst) >= 2:
        tID = someLst[1]
        someLst.pop(1)
        #if term not in d:
        if not d.has_key(tID):
            d[tID] = [someLst]
        else:
            d[tID].append(someLst)
    #read the dictionary to generate result file
    docID = 0
    for key,valuelist in d.items():
        file.write(str(key))
        for lst in valuelist:
            file.write('\t' + lst[0] + ':' + lst[1])
            lst.pop(0)
            lst.pop(0)
            for n in range(len(lst)):
                file.write('\t0:' + lst[0])
                lst.pop(0)
My Output:
57 422:1
3 879:1
162:1
5 409:1
4 405:1455
436:1
142:145
11:1
184:1
6 13:15
417:13
8 141:1
170:1
186:14
Expected output:
57 422:1
3 879:1 162:1
5 409:1
4 405:1455 436:1 142:145 11:1 184:1
6 13:15 417:13
8 141:1 170:1 186:14
You probably don't get the result you're expecting because you didn't strip the newline characters \n while reading the input data. Try replacing
someLst = l.split('\\t')
with
someLst = l.strip().split('\\t')
To enforce the mentioned line breaks in your output file, add a
file.write('\n')
at the very end of your second outer for loop:
for key,valuelist in d.items():
    # ...
    file.write('\n')
Bottom line: write never adds a line break. If you do see one in your output file, it's in your data.
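Since write adds nothing on its own, one way to keep control of line breaks is to collect the fields of a line first and write them in a single call. The sketch below uses Python 3 syntax (the question uses Python 2.7), writes to an in-memory buffer instead of a real file, and the function name write_postings is made up; the dictionary holds the example value from the question.

```python
import io


def write_postings(d, out):
    """Write one tab-separated line per term to the file-like object `out`."""
    for term_id, doc_lists in d.items():
        fields = [str(term_id)]
        for doc_id, first_pos, *other_positions in doc_lists:
            # The first position is written with its document ID...
            fields.append('%s:%s' % (doc_id, first_pos))
            # ...and further positions in the same document with a 0 prefix.
            fields.extend('0:%s' % pos for pos in other_positions)
        # Exactly one newline per term, added explicitly.
        out.write('\t'.join(fields) + '\n')


d = {'TERM_ID': [['DOC42', 'POS10', 'POS22'], ['DOC32', 'POS45']]}
buf = io.StringIO()
write_postings(d, buf)
print(buf.getvalue(), end='')
```

Unlike the pop-based loop, this leaves the lists in the dictionary untouched, and any stray newline in the data would still need to be stripped when the input is read.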
I have a large list of lists like:
X = [['a','b','c','d','e','f'],['c','f','r'],['r','h','l','m'],['v'],['g','j']]
Each inner list is a sentence, and the members of these lists are the words of that sentence. I want to write this list to a file such that each sentence (inner list) is on a separate line, and each line starts with a number corresponding to the position of that inner list in the large list. In the case above, I want the output to look like this:
1. a b c d e f
2. c f r
3. r h l m
4. v
5. g j
I need them to be written in this format in a text file. Can anyone suggest code for this in Python?
Thanks
with open('somefile.txt', 'w') as fp:
    for i, s in enumerate(X):
        print >>fp, '%d. %s' % (i + 1, ' '.join(s))
with open('file.txt', 'w') as f:
    i = 1
    for row in X:
        # write does not add a line break, so append '\n' explicitly
        f.write('%d. %s\n' % (i, ' '.join(row)))
        i += 1
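For Python 3, where print >>fp is no longer valid, the same numbering can be done with enumerate and its start argument. This is a sketch that writes to an in-memory buffer so it can be run without touching the disk; the function name write_numbered is made up, and passing a real file object opened with open(..., 'w') works the same way.

```python
import io

X = [['a', 'b', 'c', 'd', 'e', 'f'], ['c', 'f', 'r'],
     ['r', 'h', 'l', 'm'], ['v'], ['g', 'j']]


def write_numbered(sentences, out):
    """Write each sentence on its own line, prefixed with its 1-based position."""
    for number, words in enumerate(sentences, start=1):
        out.write('%d. %s\n' % (number, ' '.join(words)))


buf = io.StringIO()
write_numbered(X, buf)
print(buf.getvalue(), end='')
```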