I want to solve two problems with sorting my list in Python.
1) My list contains an element that starts with "noname" followed by a number, such as "noname3" or "noname4" (each list contains only one such element).
This element aggregates all the unnamed entries, and the number shows how many of them there are.
How can I send this noname+integer element to the end?
2) As you can see below, sorted() puts English first, then Korean. Is there any way to sort Korean first, then English? With 'noname' at the end, of course.
names = ['Z', 'C', 'A B', 'noname3', 'ㄴ', 'ㄱ', 'D A', 'A A' , 'ㄷ']
sorted(names)
# Output
['A A', 'A B', 'C', 'D A', 'Z', 'noname3', 'ㄱ', 'ㄴ', 'ㄷ']
# Desired Output
[ 'ㄱ', 'ㄴ', 'ㄷ', 'A A', 'A B', 'C', 'D A', 'Z', 'noname3']
Use a key function that sorts the noname items higher than the non-noname items.
sorted(names, key=lambda x: (x.startswith("noname"), x))
Without knowing exactly how Korean characters are alphabetized, here's my attempt (based on @kindall's start). Note that you can pass a custom function as the key parameter of sorted.
def sorter(char):
    # Place English characters after Korean; the full string breaks ties
    # between items that share a first character (e.g. 'A A' and 'A B')
    if ord(char[0]) > 122:
        return (ord(char[0]) - 12000, char)
    else:
        return (ord(char[0]) + 12000, char)
lst = ['Z', 'C', 'A B', 'noname3', 'ㄴ', 'ㄱ', 'D A', 'A A', 'ㄷ']
sorted(lst, key=lambda x: (x.startswith('noname'), sorter(x)))
['ㄱ', 'ㄴ', 'ㄷ', 'A A', 'A B', 'C', 'D A', 'Z', 'noname3']
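The two requirements can also be folded into a single tuple key. This is only a sketch: the helper name korean_first_key is made up for illustration, it assumes every non-ASCII name counts as Korean, and it orders Korean names by code point rather than by a proper Korean collation.

```python
def korean_first_key(name):
    """Sort key: non-ASCII (assumed Korean) names, then ASCII names, then 'noname...'."""
    is_noname = name.startswith("noname")
    is_ascii = name.isascii()  # str.isascii() needs Python 3.7+
    # False sorts before True, so Korean precedes English and noname goes last.
    return (is_noname, is_ascii, name)

names = ['Z', 'C', 'A B', 'noname3', 'ㄴ', 'ㄱ', 'D A', 'A A', 'ㄷ']
result = sorted(names, key=korean_first_key)
print(result)
# ['ㄱ', 'ㄴ', 'ㄷ', 'A A', 'A B', 'C', 'D A', 'Z', 'noname3']
```

Because the key checks for non-ASCII text instead of subtracting a fixed offset, it does not depend on where the Korean characters sit in the Unicode table.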
I want to extract lines in a list that contain carbons ('C').
The actual lines are:
propene_data = ['H -0.08677109049370 0.00000005322169 0.02324774260533\n', 'C -0.02236345244409 -0.00000001742911 1.09944502076327\n', 'C 1.14150994274008 0.00000000299501 1.72300489107368\n', 'H -0.95761218150040 -0.00000002374717 1.63257861279343\n', 'H 1.17043966864771 0.00000000845005 2.80466760537188\n', 'C 2.46626448549704 -0.00000000616665 1.02315746104893\n', 'H 3.28540550052797 0.00000001315434 1.73628424885091\n', 'H 2.55984407099540 -0.87855375749407 0.38655722260408\n', 'H 2.55984405602998 0.87855372701591 0.38655719488850\n']
I've tried to extract the carbon lines using the following solution:
car1 = propene_data[1].split()
car2 = propene_data[2].split()
car3 = propene_data[5].split()
propene_carbons = car1 + car2 + car3
This solution gives:
propene_carbons = ['C', '-0.02236345244409', '-0.00000001742911', '1.09944502076327', 'C', '1.14150994274008', '0.00000000299501', '1.72300489107368', 'C', '2.46626448549704', '-0.00000000616665', '1.02315746104893']
It gives what I want, but I would like to know if I could select the lines automatically instead of hard-coding each index (in case the list is much longer). How do I do that in this case?
What you need here is str.startswith:
result = text.startswith('C')
Applied to the whole list with a list comprehension:
result = [i for i in propene_data if i.startswith('C')]
Output:
['C -0.02236345244409 -0.00000001742911 1.09944502076327\n',
'C 1.14150994274008 0.00000000299501 1.72300489107368\n',
'C 2.46626448549704 -0.00000000616665 1.02315746104893\n']
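If the flat list of tokens from the question is the goal, the filter and the split can be combined. The following is a minimal sketch on a shortened copy of propene_data; it compares the first field to 'C' exactly, on the assumption that a longer file might also contain symbols such as 'Cl' that startswith('C') would match too.

```python
# Shortened copy of the list from the question (first four lines only).
propene_data = [
    'H -0.08677109049370 0.00000005322169 0.02324774260533\n',
    'C -0.02236345244409 -0.00000001742911 1.09944502076327\n',
    'C 1.14150994274008 0.00000000299501 1.72300489107368\n',
    'H -0.95761218150040 -0.00000002374717 1.63257861279343\n',
]

propene_carbons = []
for line in propene_data:
    fields = line.split()            # e.g. ['C', '-0.02...', '-0.00...', '1.09...']
    if fields and fields[0] == 'C':  # exact match on the element symbol
        propene_carbons.extend(fields)

print(propene_carbons)
```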
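You can also use NumPy:
import numpy as np
# One row per atom: [symbol, x, y, z], all stored as strings
propene_array = np.array([i.split() for i in propene_data])
# Row indices whose first column is 'C'
sub_array = np.where(propene_array[:, 0] == 'C')[0]
propene_carbon = []
for i in sub_array:
    # tolist() converts the NumPy strings back to plain Python str
    propene_carbon += propene_array[i].tolist()
Output: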
['C', '-0.02236345244409', '-0.00000001742911', '1.09944502076327', 'C',
'1.14150994274008', '0.00000000299501', '1.72300489107368', 'C',
'2.46626448549704', '-0.00000000616665', '1.02315746104893']
Say I have two Python lists containing strings that may or may not be of the same length.
list1 = ['a','b']
list2 = ['c','d','e']
I want to get the following result:
l = ['a c','a d','a e','b c','b d','b e']
The final list should contain all possible combinations of the two lists, with a space in between the items.
One method I've tried is with itertools
import itertools
for p in itertools.permutations(, 2):
    print(zip(*p))
But unfortunately this was not what I needed, as it did not return any combinations at all.
First make all possible combinations of the two lists, then use list comprehension to achieve the desired result:
list1 = ['a', 'b']
list2 = ['c', 'd', 'e']
com = [(x,y) for x in list1 for y in list2]
print([a + ' ' + b for (a, b) in com]) # ['a c', 'a d', 'a e', 'b c', 'b d', 'b e']
What you want is a cartesian product.
Code:
import itertools
list1 = ['a', 'b']
list2 = ['c', 'd', 'e']
l = ['%s %s' % (e[0], e[1]) for e in itertools.product(list1, list2)]
print(l)
result:
['a c', 'a d', 'a e', 'b c', 'b d', 'b e']
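The cartesian-product approach also extends to more than two lists. A small sketch, assuming every item is already a string so that ' '.join can be applied to each tuple (the names lists and pairs are only illustrative):

```python
from itertools import product

list1 = ['a', 'b']
list2 = ['c', 'd', 'e']

# product() yields tuples such as ('a', 'c'); join each tuple with a space.
pairs = [' '.join(combo) for combo in product(list1, list2)]
print(pairs)  # ['a c', 'a d', 'a e', 'b c', 'b d', 'b e']

# The same expression works for any number of lists via argument unpacking.
lists = [list1, list2, ['x']]
print([' '.join(combo) for combo in product(*lists)])
```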
This is another possible method:
list1=['a','b']
list2=['c','d','e']
list3=[]
for i in list1:
    for j in list2:
        list3.append(i+" "+j)
print(list3)
One-liner solution: use a list comprehension and concatenate the items with a space in between:
list1 = ['a','b']
list2 = ['c','d','e']
print([i + ' ' + j for i in list1 for j in list2])
I have the following list of strings:
a = ['a','the quick fox', 'b', 'c', 'hi there']
How can I transform it into:
'a','the quick fox', 'b', 'c', 'hi there'
I tried to:
"','".join(a)
However, it's returning this:
"a','the quick fox','b','c','hi there"
Instead of:
'a','the quick fox','b','c','hi there'
Instead of adding the quotes as part of the join, you can get the repr of each string, which will include the quotes, and then join those:
print(', '.join(map(repr, a)))
# 'a', 'the quick fox', 'b', 'c', 'hi there'
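One caveat with repr is that Python chooses the quote character per string, so an item that contains an apostrophe comes back in double quotes. If single quotes are always wanted, a sketch like the following adds them explicitly (it does not escape quotes inside the items, which is an assumption about the data):

```python
a = ['a', 'the quick fox', 'b', 'c', 'hi there']

# Wrap every item in single quotes explicitly instead of relying on repr().
quoted = ', '.join(f"'{s}'" for s in a)
print(quoted)  # 'a', 'the quick fox', 'b', 'c', 'hi there'

# repr() switches to double quotes when the string contains a single quote.
print(repr("it's"))  # "it's"
```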
Really stuck with this question in my homework assignment.
Everything works, except when there is a space (' ') in p: in that case I need to stop the process of creating can.
For example, if I submit:
rankedVote("21 4", [('AB', '132'), ('C D', ''), ('EFG', ''), ('HJ K', '2 1')])
I would like to have:
['C D', 'AB']
returned, rather than just [] like it is now.
Code as below:
def rankedVote(p,cs):
    candsplit = zip(*cs)
    cand = candsplit[0]
    vote = list(p)
    ppl = vote
    can = list(p)
    for i in range(len(vote)):
        if ' ' in vote[i-1]:
            return []
        else:
            vote[i] = int(vote[i])
            can[vote[i]-1] = cand[i]
    for i in range(len(vote)):
        for j in range(len(vote)):
            if i != j:
                if vote[i] == vote[j]:
                    return []
    return can
EDIT:
In the example:
rankedVote("21 4", [('AB', '132'), ('C D', ''), ('EFG', ''), ('HJ K', '2 1')])
This means that the 1st, AB becomes 2nd,
and the 2nd one C D becomes 1st,
and it should stop because 3rd does not exist.
Let's say that instead of 21 4, it was 2143.
It would mean that the 3rd one EFG would be 4th,
and the 4th HJ K would be 3rd.
I would say the code is doing exactly what you instructed. Look at the code block below:
if ' ' in vote[i-1]:
    return []
I know this question is old, but I found it interesting.
As the previous answer said, you aren't returning the list built up to that point; you are returning [].
What you should do is:
if ' ' in vote[i]:
    return can[:i]
Also, since you seemed to know how to use zip, you could have also done it this way:
def rankedVote(p,cs):
    # zip() returns an iterator in Python 3, so build a list before indexing
    cand = list(zip(*cs))[0]
    # get elements before ' '
    votes = p.split()[0] # '21'
    # map votes index order with corresponding list order
    # (number of `cands` is determined by length of `votes`)
    cans = zip(votes, cand) # [('2', 'AB'), ('1', 'C D')]
    # Sort the results and return only the cands
    result = [can for vote, can in sorted(cans)] # ['C D', 'AB']
    return result
Output:
>>> rankedVote("21 4", [('AB', '132'), ('C D', ''), ('EFG', ''), ('HJ K', '2 1')])
['C D', 'AB']
>>> rankedVote("2143", [('AB', '132'), ('C D', ''), ('EFG', ''), ('HJ K', '2 1')])
['C D', 'AB', 'HJ K', 'EFG']
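The question's code also returned [] when two candidates received the same rank. A possible way to keep that check in the zip-based style is sketched below; the name ranked_vote is hypothetical, and the sketch assumes every character before the first space is a digit (anything else raises ValueError).

```python
def ranked_vote(p, cs):
    """Return candidate names ordered by rank, using only the ranks before the first space."""
    names = [name for name, _ in cs]
    ranks = p.partition(' ')[0]        # '21 4' -> '21'
    if len(set(ranks)) != len(ranks):  # the same rank was given twice
        return []
    # zip() stops at the shorter input, so only ranked candidates are kept.
    ordered = sorted(zip((int(r) for r in ranks), names))
    return [name for _, name in ordered]

cs = [('AB', '132'), ('C D', ''), ('EFG', ''), ('HJ K', '2 1')]
print(ranked_vote("21 4", cs))  # ['C D', 'AB']
print(ranked_vote("2143", cs))  # ['C D', 'AB', 'HJ K', 'EFG']
print(ranked_vote("11", cs))    # []
```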
If I have a list of strings (e.g. 'blah 1', 'blah 2', 'xyz fg', 'xyz penguin'), what would be the best way of finding the unique starts of the strings ('xyz' and 'blah' in this case)? The starts of strings can be multiple words.
Your question is confusing, as it is not clear what you really want. So I'll give three answers and hope that one of them at least partially answers your question.
To get all unique prefixes of a given list of strings, you can do:
>>> l = ['blah 1', 'blah 2', 'xyz fg', 'xyz penguin']
>>> set(s[:i] for s in l for i in range(len(s) + 1))
{'', 'xyz pe', 'xyz penguin', 'b', 'xyz fg', 'xyz peng', 'xyz pengui', 'bl', 'blah 2', 'blah 1', 'blah', 'xyz f', 'xy', 'xyz pengu', 'xyz p', 'x', 'blah ', 'xyz pen', 'bla', 'xyz', 'xyz '}
This code generates all initial slices of every string in the list and passes these to a set to remove duplicates.
To get all largest initial word sequences smaller than the full string, you could go with:
>>> l = ['a b', 'a c', 'a b c', 'b c']
>>> set(s.rsplit(' ', 1)[0] for s in l)
{'a', 'a b', 'b'}
This code creates a set by splitting all strings at their rightmost space, if available (otherwise the whole string will be returned).
On the other hand, to get all unique initial word sequences without considering full strings, you could go for:
>>> l = ['a b', 'a c', 'a b c', 'b c']
>>> set(' '.join(w[:i]) for s in l for w in (s.split(),) for i in range(len(w)))
{'', 'a', 'b', 'a b'}
This code splits each string at any whitespace and joins all initial slices of the resulting word list, except the largest one. It has a pitfall: it will, for example, convert tabs to spaces. This may or may not be an issue in your case.
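A fourth reading of the question is "word prefixes that at least two strings have in common". That is only a guess at the intent, and the helper name shared_word_prefixes is made up for this sketch, which counts every initial word sequence and keeps the ones seen more than once.

```python
from collections import Counter

def shared_word_prefixes(strings):
    """Return the initial word sequences that occur in at least two of the strings."""
    counts = Counter()
    for s in strings:
        words = s.split()
        # Count 'w1', 'w1 w2', ... up to and including the full string.
        for i in range(1, len(words) + 1):
            counts[' '.join(words[:i])] += 1
    return {prefix for prefix, n in counts.items() if n >= 2}

l = ['blah 1', 'blah 2', 'xyz fg', 'xyz penguin']
print(shared_word_prefixes(l))  # a set containing 'blah' and 'xyz'
```

Nested prefixes are all reported, so ['a b c', 'a b d'] yields both 'a' and 'a b'; exact duplicate strings in the input also count as shared.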
If you mean unique first words of strings (words being separated by space), this would be:
arr=['blah 1', 'blah 2', 'xyz fg', 'xyz penguin']
unique=list(set([x.split(' ')[0] for x in arr]))
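A set does not keep the order in which the first words appeared. If that order matters, a dict can do the deduplication instead, since dicts preserve insertion order from Python 3.7 on. A minimal sketch:

```python
arr = ['blah 1', 'blah 2', 'xyz fg', 'xyz penguin']

# dict.fromkeys() keeps the first occurrence of each key, in insertion order.
unique = list(dict.fromkeys(s.split(' ', 1)[0] for s in arr))
print(unique)  # ['blah', 'xyz']
```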