How can I extract elements from lists in Python?

I am trying to extract elements from a list.
I've searched a lot, but I can't work out how to do it.
This is my test.txt (text file):
[ left column = time, right column = value ]
0 81
1 78
2 76
3 74
4 81
5 79
6 80
7 81
8 83
9 83
10 83
11 82
.
.
22 81
23 80
If the current hour equals a time in the table, I want to extract the value for that time.
This is my demo.py (Python file):
import datetime
now = datetime.datetime.now()
current_hour = now.hour
with open('test.txt') as f:
    lines = f.readlines()
time = [int(line.split()[0]) for line in lines]
value = [int(line.split()[1]) for line in lines]
# time  is now [0,1,2,3,4,5,....,23]
# value is now [81,78,76,......,80]

You could loop over the list, comparing the current hour with the entry at each position.
Starting at position 0, the loop compares time[pos] with the current hour. If they are equal, it assigns the value at that same position in "value" to the variable extractedValue and then breaks out of the loop.
If they are not equal, it increases the pos variable by 1, which is the index used to look into both lists. So it keeps searching until the first if is True or the list ends.
pos=0
for i in time:
    if(current_hour==time[pos]):
        extractedValue=value[pos]
        break
    else:
        pos+=1
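The same search can also be written without a manual position counter. This is only a minimal sketch: the helper name `lookup_value` and the short sample lists are made up for illustration and stand in for the `time` and `value` lists built from test.txt.

```python
def lookup_value(times, values, hour):
    """Return the value paired with `hour`, or None if the hour is absent."""
    # zip pairs each time with the value stored at the same position
    for t, v in zip(times, values):
        if t == hour:
            return v
    return None


# Hypothetical sample data standing in for the lists read from test.txt
time = [0, 1, 2, 3]
value = [81, 78, 76, 74]

extracted_value = lookup_value(time, value, 2)
print(extracted_value)
```

Returning None for a missing hour is a design choice of this sketch; the loop above simply leaves extractedValue unassigned in that case.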
Feel free to ask if you don't understand something :)

Assuming unique values for the time column:
import datetime
with open('test.txt') as f:
    lines = f.readlines()
# this creates a dictionary with the time from test.txt as the key;
# both fields are converted to int so the key matches datetime's integer hour
time_data_dict = {int(l.split()[0]): int(l.split()[1]) for l in lines}
current_hour = datetime.datetime.now().hour
print(time_data_dict[current_hour])

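import datetime
import csv
data = {}
# assumes hour.csv holds the same two columns, separated by commas
with open('hour.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        k, v = row
        data[k] = v
# csv yields strings, so the hour is converted to a string before the lookup
hour = str(datetime.datetime.now().hour)
print(data[hour])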

Related

HTML tables not showing up after running Python code

My code is not running. I'm trying to create a table that shows the min, max, and average score for each of the 5 tests in my csv file (the tests are the columns; min, max, and average are the rows).
I'm also trying to make a table of the students' names, which are in the same csv file. The format I want is: last name, first name, average, min, and max.
Lastly, I want to determine which student has the highest average and print their name along with their score, outside of the tables.
Here's a sample of my csv file:
Jailen McQueen  89  30  70  71  26
Cay Phillip     90  10  86   3  50
Gerry Green     87  70  40  90  55
Here is my code so far:
import csv
import webbrowser
csvFile = open('testdata (1).csv') #opens csv file
csvReader = csv.reader(csvFile, delimiter=";")
test1 = []
test2 = []
test3 = []
test4 = []
test5 = []
n = {}
for i in csvReader:
    test1.append(int(i[1]))
    test2.append(int(i[2]))
    test3.append(int(i[3]))
    test4.append(int(i[4]))
    test5.append(int(i[5]))
    n[1[0]] = 1[1:]
csvFile.close()
testNames = {}
for v in n.keys():
    tests = n[x]
    tests = [int(a)for b in tests]
    minimum = min(tests)
    maximum = max(tests)
    mean = sum(tests)/float(len(tests))
    f = x.split()[0]
    l = x.split()[1]
    testNames[f+", "+l] = [mean, minimum, maximum]
html = open('lab3.html','w')
html.write('<html>\n<body>\n<h1>Test data</h1>\n<ul>\n') #starts creating the table
html.write('<table border=‘1’><tr><td><td>Test 1</td><td>Test 2</td><td>Test 3</td><td>Test 4</td><td>Test 5</td></tr>\n')
csvFile.close()
html.close
webbrowser.open_new_tab('lab3.html')
Thank you so much for the help!
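Several things in the posted code would stop it before any table appears: `n[1[0]] = 1[1:]` indexes the number 1 instead of the loop variable `i`, the second loop iterates with `v` but reads `x`, the list comprehension converts `a` while looping over `b`, no data rows are ever written to the file, and `html.close` lacks the parentheses that would actually call it. A minimal sketch of the per-student part follows. It assumes each row looks like `name;test1;...;test5` (inferred from the `delimiter=";"` in the question); the inline sample and the function names are hypothetical.

```python
import csv
import io

# Hypothetical sample in the assumed layout: name;test1;...;test5
SAMPLE = """Jailen McQueen;89;30;70;71;26
Cay Phillip;90;10;86;3;50
Gerry Green;87;70;40;90;55
"""


def read_scores(lines):
    """Map each student name to a list of integer scores."""
    scores = {}
    for row in csv.reader(lines, delimiter=";"):
        if row:  # skip blank lines
            scores[row[0]] = [int(s) for s in row[1:]]
    return scores


def student_stats(scores):
    """Map 'Last, First' to (average, minimum, maximum)."""
    stats = {}
    for name, tests in scores.items():
        # assumes every name has at least a first and a last word
        first, last = name.split()[:2]
        stats[f"{last}, {first}"] = (sum(tests) / len(tests), min(tests), max(tests))
    return stats


def stats_table(stats):
    """Render the per-student statistics as an HTML table."""
    rows = ["<tr><th>Name</th><th>Average</th><th>Min</th><th>Max</th></tr>"]
    for name, (mean, low, high) in stats.items():
        rows.append(
            f"<tr><td>{name}</td><td>{mean:.1f}</td><td>{low}</td><td>{high}</td></tr>"
        )
    return '<table border="1">\n' + "\n".join(rows) + "\n</table>"


scores = read_scores(io.StringIO(SAMPLE))
stats = student_stats(scores)
# the student whose average (first tuple element) is highest
best = max(stats, key=lambda name: stats[name][0])
html = "<html>\n<body>\n<h1>Test data</h1>\n" + stats_table(stats) + "\n</body>\n</html>\n"
print(html)
print(best, stats[best][0])
```

To use it with the real file, the `io.StringIO(SAMPLE)` argument could be replaced by a file opened with `open('testdata (1).csv', newline='')`, and `html` written to lab3.html inside a `with` block so the file is closed before the browser opens it.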

enumerate append error while creating list from csv

I'm stuck creating a list of columns. I tried to avoid using defaultdict.
Thanks for any help!
Here is my code:
# Read CSV file
with open('input.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    #-----------#
    row_list = []
    column_list = []
    year = []
    suburb = []
    for each in reader:
        row_list = row_list + [each]
        year = year + [each[0]]#create list of years
        suburb = suburb + [each[2]]#create list of suburb
        for (i,v) in enumerate(each[3:-1]):
            column_list[i].append(v)
            #print i,v
    #print column_list[0]
My error message:
19 suburb = suburb + [each[2]]#create list of suburb
20 for i,v in enumerate(each[3:-1]):
---> 21 column_list[i].append(v)
22 #print i,v
23 #print column_list[0]
IndexError: list index out of range
printed result of (i,v):
0 10027
1 14513
2 3896
3 23362
4 77966
5 5817
6 24699
7 9805
8 62692
9 33466
10 38792
0 0
1 122
2 0
3
4 137
5 0
6 0
7
8
9 77
10
Basically, I want to have lists to look like this.
column[0]=['10027','0']
column[1]=['14513','122']
A sample of my csv file was attached as an image (not reproduced here).
Yes, as Alex mentioned, the problem is that you access the index before creating it. As an alternative solution, you can create each inner list the first time its index comes up:
for (i,v) in enumerate(each[3:-1]):
    if len(column_list) < i+1:
        column_list.append([])
    column_list[i].append(v)
Hope it helps!
The error happens because column_list is empty and so you can't access column_list[i] because it doesn't exist. It doesn't matter that you want to append to it because you can't append to something nonexistent, and appending doesn't create it from scratch.
column_list = defaultdict(list) would indeed solve this but since you don't want to do that, the simplest is to make sure that column_list starts out with plenty of empty lists to append to. Like this:
column_list = [[] for _ in range(size)]
where size is the number of columns, the length of each[3:-1], which is apparently 11 according to your output.
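Pre-creating the inner lists can be sketched as follows, alongside a `zip`-based transpose that builds the same result in one step. The two sample rows are hypothetical and only mimic the shape implied by the question (data columns sit in `each[3:-1]`).

```python
# Hypothetical rows in the shape the question implies:
# year, an unused field, suburb, data columns..., trailing field
rows = [
    ["2001", "x", "Alpha", "10027", "14513", "3896", "end"],
    ["2002", "y", "Beta", "0", "122", "0", "end"],
]

# Option 1: pre-create one empty list per data column, then append
size = len(rows[0][3:-1])
column_list = [[] for _ in range(size)]
for each in rows:
    for i, v in enumerate(each[3:-1]):
        column_list[i].append(v)

# Option 2: transpose the sliced rows in one step with zip
transposed = [list(col) for col in zip(*(each[3:-1] for each in rows))]

print(column_list)
```

Note that `zip` stops at the shortest row, so option 2 assumes every row has the same number of columns.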

Python: Removing duplicates from col 1,2 and printing col 3 values on 1 line

I have a file with AA sequences in column 1, and in column two, the number of times they appear, which I created using Counter(). In column three I have numerical values, which are all different. The items in col 1 and col 2 can be identical.
Ex. Input file:
ADVAEDY 28 0.17805
ADVAEDY 28 0.17365
ADVAEDY 28 0.16951
...
ARYLGYNSNWYPFDY 23 4.16148
ARYLGYNSNWYPFDY 23 3.17716
ARYLGYNSNWYPFDY 23 1.74919
...
ARHLGYNSAWYPFDY 21 10.6038
ARHLGYNSAWYPFDY 21 2.3498
ARHLGYNSAWYPFDY 21 1.68818
...
AGIAFDY 20 0.457553
AGIAFDY 20 0.416321
AGIAFDY 20 0.286349
...
ATIEDH 4 2.45283
ATIEDH 4 0.553351
ATIEDH 4 0.441266
So there are 197 lines in this file, but only 48 unique AA sequences in col 1. The code that generated this file:
input_fh = sys.argv[1] # File containing all CDR(x)
cdr_spec = sys.argv[2] # File containing CDR(x) in one column and specificities in the second
with open(input_fh, "r") as f1:
    cdr = [line.strip() for line in f1]
with open(cdr_spec, "r") as f2:
    cdr_spec_list = [line.strip().split() for line in f2]
cdr_spec_out = open("CDR" + c + "_counts_spec.txt", "w")
counter_cdr = Counter(cdr)
countermc_cdr = counter_cdr.most_common()
print len(countermc_cdr)
#This one might work:
for k,v in countermc_cdr:
    for x,y in cdr_spec_list:
        if k == x:
            print >> cdr_spec_out, k, '\t', v, '\t', y
cdr_spec_out.close()
The output I want to generate, using the example above, removes the duplicates in col 1 and 2 but keeps all matching values from col 3 on one line:
ADVAEDY 28 0.17805, 0.17365, 0.16951
...
ARYLGYNSNWYPFDY 23 4.16148, 3.17716, 1.74919
...
ARHLGYNSAWYPFDY 21 10.6038, 2.3498, 1.68818
...
AGIAFDY 20 0.457553, 0.416321, 0.286349
...
ATIEDH 4 2.45283, 0.553351, 0.441266
Also, the comma-separated values in the "new" col 3 need to be ordered from largest to smallest. I would prefer to stay away from modules, as I'm still learning Python and the "pythonic" way of doing things.
Any help is appreciated.
What causes the same AA sequence to be printed multiple times is the second for loop:
for x,y in cdr_spec_list:
Instead, load cdr_spec into a dictionary from the start:
from collections import defaultdict
with open(cdr_spec, "r") as f2:
    cdr_spec_dic = defaultdict(list) #a dictionary with the default value of list
    for ln in f2:
        k,v = ln.strip().split()
        cdr_spec_dic[k].append(v)
Now you have a dictionary mapping each AA sequence to its numerical values.
So we no longer need the second for loop, and we can sort while we're there. Passing key=float compares the values as numbers rather than as strings, and reverse=True puts the largest first:
for k,v in countermc_cdr:
    print >> cdr_spec_out, k, '\t', v, '\t', ', '.join(sorted(cdr_spec_dic[k], key=float, reverse=True))
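Since the question prefers to avoid extra modules, the same grouping can be sketched in Python 3 with a plain dict and `setdefault`. The `records` list is hypothetical sample data taken from the example rows, already split into fields; this is an illustration, not the asker's actual pipeline.

```python
# Hypothetical sample rows: (sequence, count, value), already split into fields
records = [
    ("ADVAEDY", "28", "0.16951"),
    ("ADVAEDY", "28", "0.17805"),
    ("ADVAEDY", "28", "0.17365"),
    ("ATIEDH", "4", "0.553351"),
    ("ATIEDH", "4", "2.45283"),
]

grouped = {}  # (sequence, count) -> list of value strings
for seq, count, val in records:
    # setdefault creates the empty list the first time a key is seen
    grouped.setdefault((seq, count), []).append(val)

lines = []
for (seq, count), vals in grouped.items():
    # key=float compares numerically; reverse=True puts the largest first
    ordered = sorted(vals, key=float, reverse=True)
    lines.append(seq + "\t" + count + "\t" + ", ".join(ordered))

print("\n".join(lines))
```

The groups come out in first-seen order because dicts preserve insertion order in Python 3.7 and later.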

Sorting a hash table and printing key and value at the same time

I have written a program in Python that uses a hash table to read data from a file and sum the values in the last column for each distinct value in the 2nd column. That is, for all rows with the same value in column 2, the corresponding last-column values are added together.
I have implemented this successfully. Now I want to sort the table in descending order by the summed last-column values and print those sums with the corresponding 2nd-column (key) values. I can't figure out how to do this. Can anyone please help?
The pmt.txt file is of the form
0.418705 2 3 1985 20 0
0.420657 4 5 119 3849 5
0.430000 2 3 1985 20 500
and so on...
So, for example, for the number 2 in column 2, I add up the last-column values of all rows that have '2' in the 2nd column. The same process continues for the other numbers in column 2, like 4, 5, etc.
I'm using Python 3.
import math
source_ip = {}
f = open("pmt.txt","r",1)
lines = f.readlines()
for line in lines:
    s_ip = line.split()[1]
    bit_rate = int(line.split()[-1]) + 40
    if s_ip in source_ip.keys():
        source_ip[s_ip] = source_ip[s_ip] + bit_rate
        print (source_ip[s_ip])
    else:
        source_ip[s_ip] = bit_rate
f.close()
for k in source_ip.keys():
    print(str(k)+": "+str(source_ip[k]))
    print ("-----------")
It sounds like you want to use the sorted function with a key parameter that gets the value from the key/value tuple, plus reverse=True for descending order:
sorted_items = sorted(source_ip.items(), key=lambda x: x[1], reverse=True)
You could also use itemgetter from the operator module, rather than a lambda function:
import operator
sorted_items = sorted(source_ip.items(), key=operator.itemgetter(1), reverse=True)
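As a small worked example, the totals below are what the question's loop would produce for the three sample rows shown (each last-column value plus 40, summed per column-2 key). Because the question asks for descending order, `reverse=True` is passed to `sorted`; the variable names are otherwise illustrative.

```python
from operator import itemgetter

# Totals for the three sample rows: key "2" -> (0 + 40) + (500 + 40), key "4" -> 5 + 40
source_ip = {"4": 45, "2": 580}

# itemgetter(1) picks the value out of each (key, value) pair
sorted_items = sorted(source_ip.items(), key=itemgetter(1), reverse=True)

for key, total in sorted_items:
    print(f"{key}: {total}")
```

`sorted` returns a new list of tuples and leaves the dictionary itself unchanged.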
How about something like this?
#!/usr/local/cpython-3.4/bin/python
import collections
source_ip = collections.defaultdict(int)
with open("pmt.txt","r",1) as file_:
    for line in file_:
        fields = line.split()
        s_ip = fields[1]
        bit_rate = int(fields[-1]) + 40
        source_ip[s_ip] += bit_rate
        print (source_ip[s_ip])
# sort by the summed value, largest first
for key, value in sorted(source_ip.items(), key=lambda item: item[1], reverse=True):
    print('{}: {}'.format(key, value))
    print ("-----------")

How do I match a specific number against a number set efficiently?

I have a set of 2375013 unique numbers in a txt file. The data looks like this:
11009
900221
2
3
4930568
293
102
I want to match a number from each line of another data set against this number set, to extract the data I need. So I coded it like this:
def get_US_users_IDs(filepath, mode):
    IDs = []
    with open(filepath, mode) as f:
        for line in f:
            sp = line.strip()
            for id in sp:
                IDs.append(id.lower())
    return IDs
...
IDs = "|".join(get_US_users_IDs('/nas/USAuserlist.txt', 'r'))
matcher = re.compile(IDs)
if matcher.match(user_id):
    number_of_US_user += 1
    text = tweet.split('\t')[3]
But it takes a long time to run. Is there any way to reduce the run time?
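As I understand it, you have a huge number of IDs in a file and you want to know whether a specific user_id is in this file.
You can use a Python set, which tests membership in roughly constant time on average:
with open(filepath, mode) as fd:
    IDs = set(int(line) for line in fd)
...
# user_id must be an int here, since the set holds ints; convert it first if it is a string
if user_id in IDs:
    number_of_US_user += 1
...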
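A self-contained sketch of the set lookup is shown below. It keeps the IDs as stripped strings rather than ints, on the assumption that `user_id` is parsed from text; the in-memory file and the `user_ids` list are hypothetical stand-ins for the real ID file and tweet data.

```python
import io

# Hypothetical stand-in for the ID file: one number per line
id_file = io.StringIO("11009\n900221\n2\n3\n4930568\n293\n102\n")

# Keep IDs as stripped strings so they compare equal to IDs parsed from text;
# blank lines are skipped
ids = {line.strip() for line in id_file if line.strip()}

user_ids = ["293", "555", "2"]  # hypothetical IDs taken from tweet lines
number_of_us_users = sum(1 for user_id in user_ids if user_id in ids)
print(number_of_us_users)
```

Unlike a regex alternation of all IDs, the set compares whole values, so "2" does not accidentally match "293".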
