Python create lists conditionally from txt file


I have a txt file with this structure of data:
3
100 name1
200 name2
50 name3
2
1000 name1
2000 name2
0
The input contains several sets. Each set starts with a row containing one natural number N, the number of bids, 1 ≤ N ≤ 100. Next, there are N rows, each containing a player's price and name separated by a space. The price is an integer in the range 1 to 2*10^9.
Expected output is:
Name2
Name2
How can I find the highest price and name for each set of data?
I tried this (to find the highest price):
offer = []
name = []
with open("futbal_zoznam_hracov.txt", "r") as f:
    for line in f:
        maximum = []
        while not line.isdigit():
            price = line.strip().split()[0]
            offer.append(int(price))
            break
        maximum.append(max(offer[1:]))
print(offer)
print(maximum)
This creates a list of all sets but not one by one. Thank you for your advice.

You'll want to manually loop over each set using the counts, rather than running a single for loop over the whole file.
For example:
with open("futbal_zoznam_hracov.txt") as f:
    while True:
        try:  # until end of file
            bids = int(next(f).strip())
            if bids == 0:
                continue  # or break if this is guaranteed to be end of the file
            max_price = float("-inf")
            max_player = None
            for _ in range(bids):
                player = next(f).strip().split()
                price = int(player[0])
                if price > max_price:
                    max_price = price
                    max_player = player[1]
            print(max_player)
        except (StopIteration, ValueError):  # end of file, or a blank/malformed count line
            break

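EDITED:
The lines containing a single token only mark the boundary between sets, so their values can be ignored and this can be greatly simplified:
with open('futbal_zoznam_hracov.txt') as f:
    _set = []
    for line in f:
        p, *n = line.split()
        if n:
            # a bid line: price followed by name
            _set.append((float(p), n[0]))
        else:
            # a single-token line (a count or the final 0) closes the previous set
            if _set:
                print(max(_set)[1])
                _set = []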
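As a rough sketch of the count-driven approach in a form that is easy to test, the function below returns the winner of each set instead of printing it. The name `winners` and the use of `io.StringIO` in place of the real file are assumptions made for illustration; prices are parsed as `int` because the task says they are integers.

```python
import io


def winners(lines):
    """Return the name of the highest bidder for each set of bids."""
    result = []
    it = iter(lines)
    for line in it:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        count = int(line)
        if count == 0:
            break  # a lone 0 marks the end of the input
        bids = []
        for _ in range(count):
            price, name = next(it).split()
            bids.append((int(price), name))
        # pick the bid with the highest price
        result.append(max(bids, key=lambda bid: bid[0])[1])
    return result


sample = io.StringIO(
    "3\n100 name1\n200 name2\n50 name3\n2\n1000 name1\n2000 name2\n0\n"
)
print(winners(sample))  # ['name2', 'name2']
```

Returning a list rather than printing keeps the parsing logic separate from the output, which makes it simple to check against the expected result.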

Related

Writing a file to a dictionary

I'm working on a CSC 110 project. I am trying to use dictionaries for our assignment even though we haven't learned them yet.
I have a file of countries and how many medals they won, separated by newline characters. Example:
Afghanistan
0
0
0
Albania
0
0
0
Algeria
0
2
0
Each line after the country is the medals they earned starting with gold and working its way down to bronze.
I want to take these and store them in a dictionary with the structure looking something like this.
dict={Afghanistan: [0,0,0], Albania: [0,0,0]}
What I have :
olympic_stats = {}
fileIn = open('test.txt', 'r')
line = fileIn.readline()#Initialize Loop
counter = 0
while line != '':
    if counter == 4:
        counter = 0
    if counter%4 == 0:#First Pass, COUNTRY
        country_name = line.rstrip()
    elif counter%4 == 1:#Second Pass, GOLD
        gold_medals = int(line)
    elif counter%4 == 2:#Third Pass, SILVER
        silver_medals = int(line)
    else: #Fourth Pass, BRONZE
        bronze_medals = int(line)
    #update Counter
    counter += 1
    if counter == 4:
        olympic_stats[country_name] = [gold_medals, silver_medals, bronze_medals]
    line = fileIn.readline()#Update Loop
While this works, it is nasty and overcomplicated. I'm trying to come up with a better way to do this.

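While your code isn't super concise, it's not 'bad' per se. I might do something like this:
olympic_stats = {}
with open('test.txt') as file_in:
    for line in file_in:
        line_str = line.rstrip()
        if not line_str:
            continue  # skip blank lines
        if line_str[0].isalpha():
            # a country name starts a new entry
            country = line_str
            olympic_stats[country] = []
        else:
            # a medal count belongs to the most recent country
            olympic_stats[country].append(int(line_str))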

Your loop here is pretty clumsy - you can do better. You could, for example,
read the entire file at once into a list (using file.readlines())
count through the list four items at a time
which I have done here:
olympic_stats = {}
with open('test.txt', 'r') as fileIn:
    fileLines = fileIn.readlines()
counter = 0
while counter < len(fileLines):
    country_name = fileLines[counter].rstrip()  # drop the trailing newline
    gold_medals = int(fileLines[counter + 1])
    silver_medals = int(fileLines[counter + 2])
    bronze_medals = int(fileLines[counter + 3])
    olympic_stats[country_name] = [gold_medals, silver_medals, bronze_medals]
    counter += 4
There are more concise but more complicated ways of doing this, involving list comprehensions, numpy or itertools, but those are more advanced topics and this should suffice for the time being.
While implementing this you might run into errors when the number of lines in the file isn't evenly divisible by four - I'll leave you to figure out how to fix that issue on your own, as it's a valuable learning experience and not too hard.
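For readers who want to compare notes afterwards, here is one possible sketch of the "four items at a time" idea that also tolerates an incomplete record at the end. The function name `read_medals` is hypothetical, and silently skipping a trailing partial record is an assumption about the desired behaviour, not the only reasonable choice.

```python
def read_medals(lines):
    """Build {country: [gold, silver, bronze]} from groups of four lines."""
    # Drop blank lines and surrounding whitespace first.
    cleaned = [line.strip() for line in lines if line.strip()]
    stats = {}
    # Step through the list four items at a time; stopping at len - 3
    # skips an incomplete record at the end instead of raising an error.
    for i in range(0, len(cleaned) - 3, 4):
        country, gold, silver, bronze = cleaned[i:i + 4]
        stats[country] = [int(gold), int(silver), int(bronze)]
    return stats


sample = ["Afghanistan\n", "0\n", "0\n", "0\n", "Algeria\n", "0\n", "2\n", "0\n"]
print(read_medals(sample))  # {'Afghanistan': [0, 0, 0], 'Algeria': [0, 2, 0]}
```

With a real file, the same function could be called as `read_medals(open('test.txt'))`, since a file object yields its lines when iterated.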

Assigning list items with for loop causing infinite loop

I have a file with all zip codes in the US with their latitude and longitude. The file is in the format ZIP,LAT, LONG\n.
I plan on saving these to a database, so I loop through the file value by value with a counter variable. If counter == 1 the value should go to zip_codes[], if counter == 2 to latitude[], and if counter == 3 to longitude[]. But when I run the following code to test whether it added the zip code values properly, it becomes an infinite loop and I have to force quit IDLE.
(The file can be viewed here.)
zip_code_file = open('zip_codes.txt')
zip_codes=[]
latitude=[]
longitude = []
counter = 1
for s in zip_code_file.read().split(','):
    s = s.strip()
    if counter ==1:
        zip_codes.append(s)
        counter = counter +1
    elif counter == 2:
        latitude.append(s)
        counter = counter+1
    elif counter == 3:
        longitude.append(s)
        counter = 1
print(zip_codes)
Does anyone know what's going on here?

You need a little less loop:
zip_code_file = open('zip_codes.txt')
zip_codes = []
latitude = []
longitude = []
for line in zip_code_file:
    zipcode, lat, lng = line.strip().split(',')
    zip_codes.append(zipcode)
    latitude.append(lat)
    longitude.append(lng)
print(zip_codes)

Files of comma-separated values should be processed with the csv module:
import csv
with open('zip_codes.txt', newline='') as f:
    r = csv.reader(f)
    next(r) # skip the header
    # transpose the rows into three column tuples
    zip_codes, latitudes, longitudes = zip(*r)
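To make the `zip(*r)` transposition concrete, here is a small self-contained sketch. The two data rows are made-up sample values standing in for zip_codes.txt, and the header row is assumed to exist as in the answer above. `skipinitialspace=True` is used because the stated format has a space after the second comma.

```python
import csv
import io

# Hypothetical sample standing in for zip_codes.txt.
sample = io.StringIO("ZIP,LAT, LONG\n00501,40.81, -73.04\n90210,34.09, -118.41\n")

reader = csv.reader(sample, skipinitialspace=True)
next(reader)  # skip the header row
# zip(*rows) turns a sequence of rows into a sequence of columns
zip_codes, latitudes, longitudes = zip(*reader)

# Coordinates are numeric; zip codes stay strings so leading zeros survive.
latitudes = [float(x) for x in latitudes]
longitudes = [float(x) for x in longitudes]

print(zip_codes)   # ('00501', '90210')
print(latitudes)   # [40.81, 34.09]
print(longitudes)  # [-73.04, -118.41]
```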

Print data between positions within a loop

I have one file, File1, which has 3 tab-separated columns.
File1:
2 4 Apple
6 7 Samsung
Let's say I run a loop of 10 iterations. If the iteration number falls between column 1 and column 2 of a line in File1, print the corresponding 3rd column from File1; otherwise print "0".
The lines may or may not be sorted, but the 2nd column is always greater than the 1st. The ranges of values in the two columns do not overlap between lines.
The output should look like this.
Result:
0
Apple
Apple
Apple
0
Samsung
Samsung
0
0
0
My program in python is here:
chr5_1 = [[]]
for line in file:
    line = line.rstrip()
    line = line.split("\t")
    chr5_1.append([line[0],line[1],line[2]])
# Here I store all position information in chr5_1 list in list
chr5_1.pop(0)
for i in range (1,10):
    for listo in chr5_1:
        L1 = " ".join(str(x) for x in listo[:1])
        L2 = " ".join(str(x) for x in listo[1:2])
        L3 = " ".join(str(x) for x in listo[2:3])
        if int(L1) <= i and int(L2) >= i:
            print(L3)
            break
        else:
            print ("0")
            break
I am confused about the loop iteration and where it should break.

Try this:
chr5_1 = dict()
for line in file:
    line = line.rstrip()
    _from, _to, value = line.split("\t")
    # map every position in the range to its label
    for i in range(int(_from), int(_to) + 1):
        chr5_1[i] = value
for i in range(1, 11):  # positions 1 through 10
    print(chr5_1.get(i, "0"))

I think this is a job for the else clause of a for loop:
position_information = []
with open('file1') as f:
    for line in f:
        position_information.append(line.strip().split('\t'))
for i in range(1, 11):
    for start, through, value in position_information:
        if int(start) <= i <= int(through):
            print(value)
            # No need to continue searching for something to print on this line
            break
    else:
        # We never found anything to print on this line, so print 0 instead
        print(0)
This gives the result you're looking for:
0
Apple
Apple
Apple
0
Samsung
Samsung
0
0
0
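The for/else lookup can also be wrapped in a function that returns the labels, which makes the behaviour easy to check without a file. This is only a sketch: the name `labels_for` and the in-memory `ranges` list are assumptions for illustration, with the ranges already converted to integers.

```python
def labels_for(ranges, limit=10):
    """Return one label per position 1..limit, or "0" where no range matches."""
    out = []
    for i in range(1, limit + 1):
        for start, end, value in ranges:
            if start <= i <= end:
                out.append(value)
                break  # found the range covering i
        else:
            # the inner loop finished without break: no range covers i
            out.append("0")
    return out


ranges = [(2, 4, "Apple"), (6, 7, "Samsung")]
print("\n".join(labels_for(ranges)))
```

The else branch belongs to the inner for loop, not to the if, so it runs only when the loop was not ended by break.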

Setup:
import io
s = '''2 4 Apple
6 7 Samsung'''
# Python 2.x
f = io.BytesIO(s)
# Python 3.x
#f = io.StringIO(s)
If the lines of the file are not sorted by the first column:
import csv
reader = csv.reader(f, delimiter = ' ', skipinitialspace = True)
f = list(reader)
# sort numerically; sorting the raw strings would put '10' before '2'
f.sort(key = lambda row: int(row[0]))
Read each line; do some math to figure out what to print and how many of them to print; print stuff; iterate:
def print_stuff(thing, n):
    while n > 0:
        print(thing)
        n -= 1
limit = 10
prev_end = 0  # last position that has already been printed
for line in f:
    # if iterating over a file, separate the columns
    begin, end, text = line.strip().split()
    # if iterating over the sorted list of lines
    #begin, end, text = line
    begin, end = map(int, (begin, end))
    # don't exceed the limit
    if begin > limit:
        break
    # how many zeros? (positions between the previous range and this one)
    gap = begin - prev_end - 1
    print_stuff('0', gap)
    # don't exceed the limit
    end = end if end < limit else limit
    # how many words?
    span = (end - begin) + 1
    print_stuff(text, span)
    prev_end = end
# any more zeros?
gap = limit - prev_end
print_stuff('0', gap)

Python Greedy Algorithm

I am writing a greedy algorithm (Python 3.x.x) for a 'jewel heist'. Given a series of jewels and values, the program grabs the most valuable jewel that it can fit in its bag without going over the bag weight limit. I've got three test cases here, and it works perfectly for two of them.
Each test case is written in the same way: the first line is the bag weight limit, and all following lines are (weight, value) pairs.
Sample Case 1 (works):
10
3 4
2 3
1 1
Sample Case 2 (doesn't work):
575
125 3000
50 100
500 6000
25 30
Code:
def take_input(infile):
    f_open = open(infile, 'r')
    lines = []
    for line in f_open:
        lines.append(line.strip())
    f_open.close()
    return lines
def set_weight(weight):
    bag_weight = weight
    return bag_weight
def jewel_list(lines):
    jewels = []
    for item in lines:
        jewels.append(item.split())
    jewels = sorted(jewels, reverse= True)
    jewel_dict = {}
    for item in jewels:
        jewel_dict[item[1]] = item[0]
    return jewel_dict
def greedy_grab(weight_max, jewels):
    #first, we get a list of values
    values = []
    weights = []
    for keys in jewels:
        weights.append(jewels[keys])
    for item in jewels.keys():
        values.append(item)
    values = sorted(values, reverse= True)
    #then, we start working
    max = int(weight_max)
    running = 0
    i = 0
    grabbed_list = []
    string = ''
    total_haul = 0
    # pick the most valuable item first. Pick as many of them as you can.
    # Then, the next, all the way through.
    while running < max:
        next_add = int(jewels[values[i]])
        if (running + next_add) > max:
            i += 1
        else:
            running += next_add
            grabbed_list.append(values[i])
    for item in grabbed_list:
        total_haul += int(item)
    string = "The greedy approach would steal $" + str(total_haul) + " of jewels."
    return string
infile = "JT_test2.txt"
lines = take_input(infile)
#set the bag weight with the first line from the input
bag_max = set_weight(lines[0])
#once we set bag weight, we don't need it anymore
lines.pop(0)
#generate a list of jewels in a dictionary by weight, value
value_list = jewel_list(lines)
#run the greedy approach
print(greedy_grab(bag_max, value_list))
Does anyone have any clues why it wouldn't work for case 2? Your help is greatly appreciated.
EDIT: The expected outcome for case 2 is $6130. I seem to get $6090.

Your dictionary keys are strings, not integers, so they are compared as strings when you sort them. So you get:
['6000', '3000', '30', '100']
instead of the wanted:
['6000', '3000', '100', '30']
Change this function as follows so that it has integer keys:
def jewel_list(lines):
    jewels = []
    for item in lines:
        jewels.append(item.split())
    jewels = sorted(jewels, reverse= True)
    jewel_dict = {}
    for item in jewels:
        jewel_dict[int(item[1])] = item[0] # changed line
    return jewel_dict
When you change this it will give you:
The greedy approach would steal $6130 of jewels.
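The difference between the two orderings can be reproduced in isolation. The short sketch below uses the four values from Sample Case 2; the variable names are only for illustration.

```python
values = ['3000', '100', '6000', '30']

# Strings compare character by character, so '30' sorts above '100'.
as_strings = sorted(values, reverse=True)
# Comparing by integer value gives the numeric order the algorithm needs.
as_numbers = sorted(values, key=int, reverse=True)

print(as_strings)  # ['6000', '3000', '30', '100']
print(as_numbers)  # ['6000', '3000', '100', '30']
```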

In [237]: %paste
import operator

def greedy(infilepath):
    with open(infilepath) as infile:
        capacity = int(infile.readline().strip())
        # list(...) is needed on Python 3, where map() returns an iterator
        items = [list(map(int, line.strip().split())) for line in infile]
    bag = []
    # sort by value so the most valuable jewel is at the end of the list
    items.sort(key=operator.itemgetter(1))
    while capacity and items:
        if items[-1][0] <= capacity:
            bag.append(items[-1])
            capacity -= items[-1][0]
        items.pop()
    return bag
## -- End pasted text --
In [238]: sum(map(operator.itemgetter(1), greedy("JT_test1.txt")))
Out[238]: 8
In [239]: sum(map(operator.itemgetter(1), greedy("JT_test2.txt")))
Out[239]: 6130

I think in this piece of code i has to be incremented in the else branch too, and the loop also has to stop once every jewel has been considered:
while running < max and i < len(values):
    next_add = int(jewels[values[i]])
    if (running + next_add) > max:
        i += 1
    else:
        running += next_add
        grabbed_list.append(values[i])
        i += 1 #here
This and @iblazevic's answer explain why it behaves this way.
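Putting the two fixes together, a compact sketch of the greedy pass might look like the following. It assumes each jewel can be taken at most once and works on a list of (weight, value) pairs rather than a dictionary, so jewels with equal values do not overwrite each other; `greedy_haul` is a hypothetical name, not the asker's function.

```python
def greedy_haul(capacity, jewels):
    """Greedily take jewels by descending value; each jewel is taken at most once.

    jewels is a list of (weight, value) pairs of integers.
    """
    total_value = 0
    for weight, value in sorted(jewels, key=lambda j: j[1], reverse=True):
        if weight <= capacity:
            capacity -= weight
            total_value += value
    return total_value


case1 = [(3, 4), (2, 3), (1, 1)]
case2 = [(125, 3000), (50, 100), (500, 6000), (25, 30)]
print(greedy_haul(10, case1))   # 8
print(greedy_haul(575, case2))  # 6130
```

Iterating over the sorted list directly removes the manual index, so there is no way to run past the end of the list.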

Python: keep top Nth results for csv.reader

I am filtering a csv file where, for every title, there are many duplicate IDs with different prediction values, so column 2 (pythoniac) is different. I would like to keep only the 30 lowest values, each with a unique ID. I came up with this code, but I don't know how to keep the lowest 30 entries.
Can you please suggest how to obtain 30 entries that are unique by ID?
# title1 id1 100 7.78E-25 # example of the line
with open("test.txt") as fi:
    cmp = {}
    for R in csv.reader(fi, delimiter='\t'):
        for L in ligands:
            newR = R[0], R[1]
            if R[0] == L:
                if (int(R[2]) <= int(1000) and int(R[2]) != int(0) and float(R[3]) < float("1.0e-10")):
                    if newR in cmp:
                        if float(cmp[newR][3]) > float(R[3]):
                            cmp[newR] = R[:-2]
                    else:
                        cmp[newR] = R[:-2]

Maybe try something along this line...
from bisect import insort
nth_lowest = [very_high_value] * 30
for x in my_loop:
    do_stuff()
    ...
    if x < nth_lowest[-1]:
        insort(nth_lowest, x)
        nth_lowest.pop() # remove the highest element
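A runnable variation on this idea, swapping `bisect.insort` for the standard library's `heapq.nsmallest`, is sketched below. It first keeps the lowest value seen for each ID and then selects the n lowest of those. The function name `lowest_unique`, the (id, value) row shape, and the sample rows are assumptions for illustration; grouping per title is left out to keep the sketch short.

```python
import heapq


def lowest_unique(rows, n=30):
    """rows: iterable of (id, value). Keep the lowest value per id, then the n lowest."""
    best = {}
    for row_id, value in rows:
        if row_id not in best or value < best[row_id]:
            best[row_id] = value
    # nsmallest returns the n entries with the lowest values, in ascending order
    return heapq.nsmallest(n, best.items(), key=lambda item: item[1])


rows = [("id1", 7.78e-25), ("id2", 3.0e-12), ("id1", 1.0e-30), ("id3", 5.0e-20)]
print(lowest_unique(rows, n=2))  # [('id1', 1e-30), ('id3', 5e-20)]
```

Deduplicating before selecting guarantees that the result never contains the same ID twice, which a plain sorted list of values would not.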
