for synset in wn.synsets(wordstr):
    len_lemma_names = len(synset.lemma_names)
    #print len_lemma_names, synset.lemma_names
    count_lemma = count_lemma + len_lemma_names
for synset_scores in swn_senti_synset:
    count_synset = count_synset + 1
    #print count_synset, synset_scores
I am trying to print len_lemma_names next to count_synset, but it did not work. Is there a way to print them together? Thank you.
I think you want to iterate over the two together. If so, use zip, or, to avoid building one big list up front, itertools.izip.
from itertools import izip
for synset, synset_scores in izip(wn.synsets(wordstr), swn_senti_synset):
    # Now you can deal with both at once in this loop.
    len_lemma_names = len(synset.lemma_names)
    count_lemma += len_lemma_names
    count_synset += 1
    # Mix to taste.
    print len_lemma_names, count_synset
Note that the count_synset part may be better done with enumerate (I don't know its initial value or whether you're wanting to use it outside this code).
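A minimal sketch of the enumerate idea, written for Python 3 (where the built-in zip is already lazy, so izip is not needed). The two lists are hypothetical stand-ins for wn.synsets(wordstr) and swn_senti_synset, and it assumes count_synset is meant to start counting from 1 inside this loop.

```python
# Hypothetical stand-ins: each inner list plays the role of synset.lemma_names,
# each float the role of one entry in swn_senti_synset.
lemma_name_lists = [["dog", "domestic_dog"], ["frump"], ["cad", "bounder", "heel"]]
synset_scores_list = [0.125, 0.0, 0.5]

count_lemma = 0
rows = []
# enumerate(..., start=1) supplies the running synset count,
# while zip pairs up the two sequences item by item.
for count_synset, (lemma_names, synset_scores) in enumerate(
        zip(lemma_name_lists, synset_scores_list), start=1):
    len_lemma_names = len(lemma_names)
    count_lemma += len_lemma_names
    rows.append((len_lemma_names, count_synset))
    print(len_lemma_names, count_synset, synset_scores)
```

Note that zip stops at the shorter of the two inputs, so any extra items in the longer one are silently ignored.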
So I have a good one. I'm trying to build two lists (ku_coins and bin_coins) of crypto tickers from two different exchanges, but I don't want to double up, so if it appears on both exchanges I want to remove it from ku_coins.
A slight complication occurs as Kucoin symbols come in as AION-BTC, while Binance symbols come in as AIONBTC, but it's no problem.
So firstly, I create the two lists of symbols, which runs fine, no problem. What I then try and do is loop through the Kucoin symbols and convert them to the Binance style symbol, so AIONBTC instead of AION-BTC. Then if it appears in the Binance list I want to remove it from the Kucoin list. However, it appears to randomly refuse to remove a handful of symbols that match the requirement. For example AION.
It removes the majority of doubled-up symbols, but in AION's case, for example, it just won't delete it.
If I just do print(i) after this loop:
for i in ku_coins:
    if str(i[:-4] + 'BTC') in bin_coins:
        print(i)
It will happily print AION-BTC as one of the symbols, as it fits the requirement perfectly. However, when I put the ku_coins.remove(i) command in before the print, it suddenly decides not to print AION, suggesting it doesn't match the requirement. And it's doing my head in. Obviously the remove command is causing the problem, but I can't for the life of me figure out why. Any help really appreciated.
import requests
import json

ku_dict = json.loads(requests.get('https://api.kucoin.com/api/v1/market/allTickers').text)
ku_syms = ku_dict['data']['ticker']
ku_coins = []
for x in range(0, len(ku_syms)):
    if ku_syms[x]['symbol'][-3:] == 'BTC':
        ku_coins.append(ku_syms[x]['symbol'])

bin_syms = json.loads(requests.get('https://www.binance.com/api/v3/ticker/bookTicker').text)
bin_coins = []
for i in bin_syms:
    if i['symbol'][-3:] == 'BTC':
        bin_coins.append(i['symbol'])

ku_coins.sort()
bin_coins.sort()
for i in ku_coins:
    if str(i[:-4] + 'BTC') in bin_coins:
        ku_coins.remove(i)
@top bantz, @Fourier has already mentioned that you shouldn't modify a list while you're iterating over it. What you can do in this case is create a copy of ku_coins first, iterate over that copy, and remove from the original ku_coins each element that matches your if condition. See below:
ku_coins.sort()
bin_coins.sort()
# Create a copy
ku_coins_ = ku_coins[:]
# Then iterate over that copy
for i in ku_coins_:
    if str(i[:-4] + 'BTC') in bin_coins:
        ku_coins.remove(i)
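Why the original loop skips items: removing an element shifts everything after it one position to the left, while the loop's internal index still advances, so the element that slides into the freed slot is never examined. The sketch below reproduces that with made-up tickers and then shows an alternative that avoids remove() altogether by building a new list. It assumes every Kucoin symbol has the form BASE-BTC.

```python
# Made-up sample data standing in for the two API responses.
ku_coins = ["ADA-BTC", "AION-BTC", "KCS-BTC", "XRP-BTC"]
bin_coins = ["ADABTC", "AIONBTC", "XRPBTC"]

# A set makes each membership test fast instead of scanning the whole list.
bin_set = set(bin_coins)

# Reproduce the problem on a copy: AION-BTC slides into index 0 after
# ADA-BTC is removed, and the loop moves on to index 1 without checking it.
buggy = list(ku_coins)
for sym in buggy:
    if sym.replace("-", "") in bin_set:
        buggy.remove(sym)

# Alternative: keep only the symbols whose Binance-style name is not listed.
ku_only = [sym for sym in ku_coins if sym.replace("-", "") not in bin_set]

print(buggy)
print(ku_only)
```

The list comprehension never mutates the list it is reading from, so no element can be skipped.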
How about modifying the code to:
while ku_coins:
    i = ku_coins.pop()
    if str(i[:-4] + 'BTC') in bin_coins:
        pass
    else:
        # do something
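The pop() method removes the last item from ku_coins and returns it as i, so the loop ends once the list is empty. Anything you want to keep has to be stored somewhere else (for example, appended to a new list) in the else branch.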
I'm new to BioPython and I'm trying to import a fasta/fastq file and iterate through each sequence, while performing some operation on each sequence. I know this seems basic, but my code below for some reason is not printing correctly.
from Bio import SeqIO

newfile = open("new.txt", "w")
records = list(SeqIO.parse("rosalind_gc.txt", "fasta"))
i = 0
dna = records[i]
while i <= len(records):
    print (dna.name)
    i = i + 1
I'm basically trying to iterate through records and print each name. However, my code ends up only printing "records[0]", where I want it to print "records[1-10]". Can someone explain why it only prints "records[0]"?
The reason for your problem is here:
i = 0
dna = records[i]
Your object 'dna' is bound to index 0 of records, i.e., records[0]. Since you never reassign it, dna always refers to that first record. In the print statement inside your while loop, index records directly, like this:
# < rather than <=: the last valid index is len(records) - 1
while i < len(records):
    print (records[i].name)
    i = i + 1
If you would like dna to refer to each entry of records in turn, you need to reassign dna on every iteration inside your while loop, like this:
# < rather than <=: the last valid index is len(records) - 1
while i < len(records):
    dna = records[i]
    print (dna.name)
    i = i + 1
However, that's not the tidiest way. Finally, a much nicer approach than a while loop with i = i + 1 is a for loop, like this:
for i in range(0,len(records)):
    print (records[i].name)
A for loop does the iteration automatically, one item at a time. range(0, len(records)) gives the integers from 0 up to, but not including, len(records). There are also other ways, but I'm keeping it simple.
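One of those other ways, sketched here: loop over the records themselves rather than over index numbers. To keep the example runnable without Biopython, Record is a hypothetical stand-in for the objects returned by SeqIO.parse; only its name attribute is assumed.

```python
from collections import namedtuple

# Hypothetical stand-in for the record objects that SeqIO.parse yields.
Record = namedtuple("Record", "name seq")
records = [Record("Rosalind_0001", "ACGT"), Record("Rosalind_0002", "GGCC")]

# Iterate over the items directly; no index bookkeeping is needed.
names = []
for dna in records:
    names.append(dna.name)
    print(dna.name)

# enumerate supplies the position as well, if it is needed.
for i, dna in enumerate(records):
    print(i, dna.name)
```

Because there is no index to manage, the off-by-one problem of running past the end of the list cannot occur.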
EDIT: My question was answered on reddit. Here is the link if anyone is interested in the answer to this problem https://www.reddit.com/r/learnpython/comments/42ibhg/how_to_match_fields_from_two_lists_and_further/
I am attempting to get the pos and alt strings from file1 to match up with what is in
file2, fairly simple. However, file2 has values in the 17th split element/column to the
last element/column (340th) which contains string such as 1/1:1.2.2:51:12 which
I also want to filter for.
I want to extract the rows from file2 that contain/match the pos and alt from file1.
Thereafter, I want to further filter the matched results that only contain certain
values in the 17th split element/column onwards. But to do so the values would have to
be split by ":" so I can filter for split[0] = "1/1" and split[2] > 50. The problem is
I have no idea how to do this.
I imagine I will have to iterate over these and split but I am not sure how to do this
as the code is presently in a loop and the values I want to filter are in columns not rows.
Any advice would be greatly appreciated, I have sat with this problem since Friday and
have yet to find a solution.
import os,itertools,re

file1 = open("file1.txt","r")
file2 = open("file2.txt","r")
matched = []
for (x),(y) in itertools.product(file2,file1):
    if not x.startswith("#"):
        cells_y = y.split("\t")
        pos_y = cells[0]
        alt_y = cells[3]
        cells_x = x.split("\t")
        pos_x = cells_x[0]+":"+cells_x[1]
        alt_x = cells_x[4]
        if pos_y in pos_x and alt_y in alt_x:
            matched.append(x)
for z in matched:
    cells_z = z.split("\t")
    if cells_z[16:len(cells_z)]:
Your requirement is not clear, but you might mean this:
for (x),(y) in itertools.product(file2,file1):
    if x.startswith("#"):
        continue
    cells_y = y.split("\t")
    pos_y = cells_y[0]
    alt_y = cells_y[3]
    cells_x = x.split("\t")
    pos_x = cells_x[0]+":"+cells_x[1]
    alt_x = cells_x[4]
    if pos_y != pos_x: continue
    if alt_y != alt_x: continue
    extra_match = False
    # The 17th column is index 16, and the extra columns live in the file2 line (x).
    for field in cells_x[16:]:
        x_extra = field.split(':')
        if x_extra[0] != '1/1': continue
        # split() returns strings, so convert before comparing with 50
        if int(x_extra[2]) <= 50: continue
        extra_match = True
        break
    if not extra_match: continue
    xy = x + y
    matched.append(xy)
I chose to concatenate x and y into the matched array, since I wasn't sure whether or not you would want all the data. If not, feel free to go back to just appending x or y.
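The per-column check could also be pulled out into a small helper. This is only a sketch under the assumptions stated in the question: each sample field looks like 1/1:1.2.2:51:12, the first part is the value to match against "1/1", and the third part is an integer to compare against 50. The function name, its parameters and the sample row are made up for illustration.

```python
def has_passing_sample(cells, first_sample_index=16, genotype="1/1", min_value=50):
    """Return True if any field from first_sample_index onward has the
    wanted first part and a third part strictly greater than min_value."""
    for field in cells[first_sample_index:]:
        parts = field.split(":")
        if len(parts) < 3:
            continue  # too few parts to check, e.g. "./."
        if parts[0] != genotype:
            continue
        try:
            value = int(parts[2])
        except ValueError:
            continue  # third part is not a whole number
        if value > min_value:
            return True
    return False


# Made-up row: 16 leading columns, then three sample columns.
row = ["x"] * 16 + ["0/1:1.2.2:60:12", "1/1:1.2.2:51:12", "./."]
print(has_passing_sample(row))
```

Returning as soon as one field passes plays the same role as the extra_match flag and break in the loop above.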
You may want to look into the csv library, which can use tab as a delimiter. You can also use a generator and/or guards to make the code a bit more pythonic and efficient. I think your approach with indexes works pretty well, but it would be easy to break when trying to modify down the road, or to update if your file lines change shape. You may wish to create objects (I use NamedTuples in the last part) to represent your lines and make it much easier to read/refine down the road.
Lastly, remember that Python short-circuits the 'and' in an 'if' condition. For example:
if x_evaluation and y_evaluation:
    do some stuff
When x_evaluation returns False, Python will skip y_evaluation entirely. In your code, cells_x[0]+":"+cells_x[1] is evaluated every single time you iterate the loop. Instead of storing this value, I wait until the easier alt comparison evaluates to True before doing this (comparatively) heavier/uglier check.
import csv

def filter_matching_alt_and_pos(first_file, second_file):
    # 'rb' suits the Python 2 csv module; on Python 3 use open(path, newline='')
    for x in csv.reader(open(first_file, 'rb'), delimiter='\t'):
        for y in csv.reader(open(second_file, 'rb'), delimiter='\t'):
            # the cheap alt comparison comes first; thanks to short-circuiting,
            # the join is only evaluated when the alts already match
            # .. todo:: we could make a filter function and even use the filter() built-in depending on needs!
            if x[3] == y[4] and x[0] == ":".join(y[:2]):
                # y is the row from the second file, which carries the columns from the 17th onward
                yield y

def match_datestamp_and_alt_and_pos(first_file, second_file):
    for z in filter_matching_alt_and_pos(first_file, second_file):
        for element in z[16:]:
            # I am not sure I fully understood your filter needs for the 2nd half. Here, I split each element from the 17th onward and look for the two cases you mentioned. This seems like it might be very heavy, but at least we're using generators!
            chunk = element.split(":")
            # continue skips to the next element as soon as one check fails,
            # and the lighter string check comes before the int conversion
            if not chunk[0] == "1/1":
                continue
            # WARNING: if you aren't 100% sure the 3rd part is an int, this is very dangerous
            if not int(chunk[2]) > 50:
                continue
            yield z
            break  # one passing element is enough; don't yield the same row twice

if __name__ == '__main__':
    first_file = "first.txt"
    second_file = "second.txt"
    # match_datestamp_and_alt_and_pos returns a generator; loop through it for the lines which matched all 4 cases
    for row in match_datestamp_and_alt_and_pos(first_file=first_file, second_file=second_file):
        print(row)
Using namedtuples for the first part:
from collections import namedtuple

FirstFileElement = namedtuple("FirstFileElement", "pos unused1 unused2 alt")
SecondFileElement = namedtuple("SecondFileElement", "pos1 pos2 unused2 unused3 alt")

def filter_matching_alt_and_pos(first_file, second_file):
    for x in csv.reader(open(first_file, 'rb'), delimiter='\t'):
        for y in csv.reader(open(second_file, 'rb'), delimiter='\t'):
            # only the leading columns are named; slicing keeps longer rows from raising a TypeError
            x_element = FirstFileElement(*x[:4])
            y_element = SecondFileElement(*y[:5])
            if x_element.alt == y_element.alt and x_element.pos == ":".join([y_element.pos1, y_element.pos2]):
                # y is the row from the second file, which carries the columns from the 17th onward
                yield y
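Both versions above compare every line of one file with every line of the other. If the files are large, a sketch like the following may scale better: it reads the (pos, alt) pairs from the first file into a set and then streams the second file once. The column positions follow the question's code (pos in column 0 and alt in column 3 of file1; the two position parts and alt in columns 0, 1 and 4 of file2). Exact equality is assumed rather than substring matching, the function name is made up, and io.StringIO stands in for the real files (Python 3).

```python
import csv
import io


def matching_rows(file1, file2):
    """Yield rows of file2 whose 'part0:part1' position and alt appear in file1."""
    wanted = set()
    for cells in csv.reader(file1, delimiter="\t"):
        if len(cells) >= 4:
            wanted.add((cells[0], cells[3]))
    for cells in csv.reader(file2, delimiter="\t"):
        # Skip blank lines, comment lines and rows that are too short.
        if not cells or cells[0].startswith("#") or len(cells) < 5:
            continue
        if (cells[0] + ":" + cells[1], cells[4]) in wanted:
            yield cells


# In-memory stand-ins for file1.txt and file2.txt.
file1 = io.StringIO("1:100\t.\t.\tA\n2:200\t.\t.\tG\n")
file2 = io.StringIO("#header\n1\t100\t.\tC\tA\n1\t101\t.\tC\tA\n2\t200\t.\tT\tC\n")
matched = list(matching_rows(file1, file2))
print(matched)
```

Each file is read exactly once, and the set lookup replaces the inner loop.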
I've been learning Python for a couple of months, and wanted to understand a cleaner and more efficient way of writing this function. It's just a basic thing I use to look up bus times near me, then display the contents of mtodisplay on an LCD, but I'm not sure about the mtodisplay=mtodisplay+... line. There must be a better, smarter, more Pythonic way of concatenating a string, without resorting to lists (I want to output this string direct to LCD. Saves me time. Maybe that's my problem ... I'm taking shortcuts).
Similarly, my method of using countit and thebuslen seems a bit ridiculous! I'd really welcome some advice or pointers in making this better. Just wanna learn!
Thanks
json_string = requests.get(busurl)
the_data = json_string.json()
mtodisplay='220 buses:\n'
countit=0
for entry in the_data['departures']:
    for thebuses in the_data['departures'][entry]:
        if thebuses['line'] == '220':
            thebuslen=len(the_data['departures'][entry])
            print 'buslen',thebuslen
            countit += 1
            mtodisplay=mtodisplay+thebuses['expected_departure_time']
            if countit != thebuslen:
                mtodisplay=mtodisplay+','
return mtodisplay
Concatenating strings like this
mtodisplay = mtodisplay + thebuses['expected_departure_time']
used to be very inefficient, but for a long time now CPython has been able to reuse the string being concatenated to (as long as there are no other references to it), so performance is linear rather than the older quadratic behaviour, which should definitely be avoided.
In this case it looks like you already have a list of items that you want to put commas between, so
','.join(some_list)
is probably more appropriate (and automatically means you don't get an extra comma at the end).
So the next problem is to construct the list (it could also be a generator, etc.). @bgporter shows how to make the list, so I'll show the generator version:
def mtodisplay(busurl):
    json_string = requests.get(busurl)
    the_data = json_string.json()
    for entry in the_data['departures']:
        for thebuses in the_data['departures'][entry]:
            if thebuses['line'] == '220':
                thebuslen=len(the_data['departures'][entry])
                print 'buslen',thebuslen
                yield thebuses['expected_departure_time']

# This is where you would normally just call the function
result = '220 buses:\n' + ','.join(mtodisplay(busurl))
I'm not sure what you mean by 'resorting to lists', but something like this:
json_string = requests.get(busurl)
the_data = json_string.json()
mtodisplay= []
for entry in the_data['departures']:
    for thebuses in the_data['departures'][entry]:
        if thebuses['line'] == '220':
            thebuslen=len(the_data['departures'][entry])
            print 'buslen',thebuslen
            mtodisplay.append(thebuses['expected_departure_time'])
return '220 buses:\n' + ", ".join(mtodisplay)
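For comparison, a sketch of the same idea with a single generator expression passed to join (Python 3). The payload below is invented to mirror the shape the question's code implies (a 'departures' dict mapping keys to lists of bus dicts); the real API response may differ, and format_departures is a made-up name.

```python
# Invented sample payload; only the keys used in the question are assumed.
the_data = {
    "departures": {
        "220": [
            {"line": "220", "expected_departure_time": "10:05"},
            {"line": "220", "expected_departure_time": "10:20"},
        ],
        "295": [
            {"line": "295", "expected_departure_time": "10:11"},
        ],
    }
}


def format_departures(data, line="220"):
    """Build the LCD text: a header line, then comma-separated times."""
    times = (
        bus["expected_departure_time"]
        for buses in data["departures"].values()
        for bus in buses
        if bus["line"] == line
    )
    # join only puts separators between items, so no trailing comma appears.
    return "{} buses:\n{}".format(line, ",".join(times))


print(format_departures(the_data))
```

Because join handles the separators, the countit and thebuslen bookkeeping from the question is no longer needed.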
Hello, I'm facing a problem and I don't know how to fix it. All I know is that when I add an else statement to my if statement, execution always goes to the else branch, even when the if condition is true and the if branch could be entered.
Here is the script, without the else statement:
import re

f = open('C:\Users\Ziad\Desktop\Combination\MikrofullCombMaj.txt', 'r')
d = open('C:\Users\Ziad\Desktop\Combination\WhatsappResult.txt', 'r')
w = open('C:\Users\Ziad\Desktop\Combination\combination.txt','w')
s=""
av =0
b=""
filtred=[]
Mlines=f.readlines()
Wlines=d.readlines()
for line in Wlines:
    Wspl=line.split()
    for line2 in Mlines:
        Mspl=line2.replace('\n','').split("\t")
        if ((Mspl[0]).lower()==(Wspl[0])):
            Wspl.append(Mspl[1])
            if(len(Mspl)>=3):
                Wspl.append(Mspl[2])
            s="\t".join(Wspl)+"\n"
            if s not in filtred:
                filtred.append(s)
            break
for x in filtred:
    w.write(x)
f.close()
d.close()
w.close()
With the else statement (I want the else to belong to the if ((Mspl[0]).lower()==(Wspl[0])): check):
import re

f = open('C:\Users\Ziad\Desktop\Combination\MikrofullCombMaj.txt', 'r')
d = open('C:\Users\Ziad\Desktop\Combination\WhatsappResult.txt', 'r')
w = open('C:\Users\Ziad\Desktop\Combination\combination.txt','w')
s=""
av =0
b=""
filtred=[]
Mlines=f.readlines()
Wlines=d.readlines()
for line in Wlines:
    Wspl=line.split()
    for line2 in Mlines:
        Mspl=line2.replace('\n','').split("\t")
        if ((Mspl[0]).lower()==(Wspl[0])):
            Wspl.append(Mspl[1])
            if(len(Mspl)>=3):
                Wspl.append(Mspl[2])
            s="\t".join(Wspl)+"\n"
            if s not in filtred:
                filtred.append(s)
            break
        else:
            b="\t".join(Wspl)+"\n"
            if b not in filtred:
                filtred.append(b)
            break
for x in filtred:
    w.write(x)
f.close()
d.close()
w.close()
First of all, you're not using "re" at all in your code besides importing it (maybe in some later part?), so the title is a bit misleading.
Secondly, you are doing a lot of work for what is basically a filtering operation on two files. Remember, simple is better than complex, so for starters you want to clean up your code a bit:
You should use more descriptive names than 'd' or 'w'. This goes for 'Wspl', 's' and 'av' as well. Those names don't mean anything and are hard to understand (why is the result of d.readlines() named Wlines when there's another file named 'w'? It's really confusing).
If you choose to use single letters, they should still make sense (if you iterate over a list named 'results', it makes sense to use 'r'). 'line' and 'line2', however, are not recommended for anything.
You don't need parentheses around conditions.
You want to use as few variables as you can so as not to get confused. There are too many different variables in your code and it's easy to get lost. You don't even use some of them.
You want to use strip rather than replace, and you want the whole 'cleaning' process to come first, followed by code that deals only with the filtering logic on the two lists. If you split each line according to some logic and you don't use the original line anywhere in the iteration, then you can do all of that at the beginning.
Now, I'm really confused about what you're trying to achieve here, and while I don't understand why you're doing it that way, I can say that, looking at your logic, you are repeating yourself a lot. The action of checking against the filtered list should only happen once, and since it happens regardless of whether the 'if' checks out or not, I see absolutely no reason to use an 'else' clause at all.
Cleaning up like I mentioned, and re-building the logic, the script looks something like this:
# PART I - read and analyze the lines
# raw strings (r'...') keep the backslashes in the Windows paths literal
Wappresults = open(r'C:\Users\Ziad\Desktop\Combination\WhatsappResult.txt', 'r')
Mikrofull = open(r'C:\Users\Ziad\Desktop\Combination\MikrofullCombMaj.txt', 'r')
# list comprehensions give real lists, so Mikro can be looped over more than once
Wapp = [x.strip().split() for x in Wappresults.readlines()]
Mikro = [x.strip().split('\t') for x in Mikrofull.readlines()]
Wappresults.close()
Mikrofull.close()

# PART II - filter using some logic
filtred = []
for w in Wapp:
    res = w[:] # So as to copy the list instead of point to it
    for m in Mikro:
        if m[0].lower() == w[0]:
            res.append(m[1])
            if len(m) >= 3 :
                res.append(m[2])
    string = '\t'.join(res)+'\n' # this happens regardless of whether the 'if' statement changed 'res' or not
    if string not in filtred:
        filtred.append(string)

# PART III - write the filtered results into a file
combination = open(r'C:\Users\Ziad\Desktop\Combination\combination.txt','w')
for comb in filtred:
    combination.write(comb)
combination.close()
I can't promise it will work (because again, like I said, I don't know what you're trying to achieve), but this should be a lot easier to work with.
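If the goal of the else in the question was to keep a line unchanged only when no line of the other file matches it, the else probably belongs to the inner for loop rather than to the if. An else attached to the if runs for every non-matching line, so together with the break it fires on the very first comparison. A sketch using in-memory lists in place of the two files (the sample rows and the unmatched list are made up for illustration):

```python
# Made-up, already-split rows standing in for the two input files.
wapp_rows = [["alice", "10"], ["bob", "20"]]
mikro_rows = [["Carol", "x1"], ["Alice", "a1", "a2"]]

filtered = []
unmatched = []
for w in wapp_rows:
    res = list(w)  # copy so the source row is not modified
    for m in mikro_rows:
        if m[0].lower() == w[0]:
            res.extend(m[1:3])  # the second field and, if present, the third
            break
    else:
        # Runs only when the inner loop finished without hitting break,
        # i.e. no row in mikro_rows matched; res stays unmodified.
        unmatched.append(w[0])
    line = "\t".join(res) + "\n"
    if line not in filtered:
        filtered.append(line)

print(filtered)
print(unmatched)
```

The for/else form separates "no match anywhere" from "this particular line did not match", which an else on the if cannot do.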