I know I can do something like below to get number of occurrences of elements in the list:
from collections import Counter
words = ['a', 'b', 'c', 'a']
Counter(words).keys() # equals to list(set(words))
Counter(words).values() # counts the elements' frequency
Outputs:
['a', 'c', 'b']
[2, 1, 1]
But I want to get the count 2 for b and c as b and c occur exactly once in the list.
Is there any way to do this in concise / pythonic way without using Counter or even using above output from Counter?
You could just write the logic yourself. Here is a one-liner (thanks @d.b):
sum(x for x in Counter(words).values() if x == 1)
Or in more than one line, without Counter:
count = 0
for word in set(words):
    if words.count(word) == 1:  # the word occurs exactly once
        count += 1
I need help coding this WITHOUT external libraries such as pandas, and without imports, exceptions, or Counter.
lineList = [['Cat', 'c', 1, 'x'],['Cat', 'a', 2, 'x'],['Cat', 't', 3, 'x'],['Bat', 'b', 1, 3],['Bat', 'b', 1, 2],['Mat', 'm', 1, 1],['Fat', 'f', 1, 13]]
Words from the 2D list that appear at least 2 times are displayed in a numbered list
Eg:
1. Cat
2. Bat
How can I allow the user to select a word by inputting the list position number? So for example if the user enters 1 it will return the second and third elements for Cat in the nested list:
c = 1, a = 2, t = 3
I am a beginner to Python so not sure how to approach this.
You can use str.join, str.format, enumerate, and a generator expression:
word_counts = [['Cat', 2], ['Bat', 3], ['Fat', 1], ['Mat', 1]]
filtered = [p for p in word_counts if p[1] > 1]
print('\n'.join('{0}. {1}'.format(i, p[0]) for i, p in enumerate(filtered, 1)))
Output:
1. Cat
2. Bat
For a string in a specific position:
n = int(input('position: ')) # 1-indexed
print('{0}. {1}'.format(n, filtered[n - 1][0])) # 0-indexed (hence, n - 1)
Use a Counter to count the words and then use enumerate to make the numbers for your list:
from collections import Counter
lineList = [['Cat', 'c', 1, 2],['Cat', 'c', 1, 3],['Bat', 'b', 1, 4],['Bat', 'b', 1, 3],['Bat', 'b', 1, 2],['Mat', 'm', 1, 1],['Fat', 'f', 1, 13]]
counts = Counter(word for word, *other_stuff in lineList)
filtered = [word for word, count in counts.items() if count >= 2]
for number, word in enumerate(filtered, start=1):
    print("{: >2}.".format(number), word)
prints
1. Cat
2. Bat
If you can't import Counter you can write a basic replacement pretty easily:
def Counter(seq):
    d = {}
    for item in seq:
        d[item] = d.get(item, 0) + 1
    return d
(Counter has more features, but this is all we're using)
You can then select a word with:
def choose(filtered):
    while True:
        choice = input("Select a word: ")
        try:
            choice = int(choice)
            return filtered[choice-1]
        except (ValueError, IndexError):  # a tuple is needed to catch both exception types
            print("Please enter a number on the list")
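Once a word has been chosen, the second step from the question (showing the second and third elements of every row for that word) could be sketched roughly as below. The helper name `details_for` is hypothetical, and `lineList` is the question's sample data with the undefined `x` assumed to be the string 'x'.

```python
def details_for(line_list, word):
    """Collect 'letter = number' strings for every row whose first element is `word`."""
    return ["{} = {}".format(row[1], row[2]) for row in line_list if row[0] == word]

# Sample data from the question; 'x' is an assumed placeholder value.
lineList = [['Cat', 'c', 1, 'x'], ['Cat', 'a', 2, 'x'], ['Cat', 't', 3, 'x'],
            ['Bat', 'b', 1, 3], ['Bat', 'b', 1, 2], ['Mat', 'm', 1, 1], ['Fat', 'f', 1, 13]]

print(", ".join(details_for(lineList, 'Cat')))  # c = 1, a = 2, t = 3
```

In a full program the word passed to `details_for` would be whatever `choose(filtered)` returns.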
You are right there, just check to see if the value of the second item in the nested list is greater than 1.
list1 = [['Cat', 2], ['Bat', 3], ['Fat', 1], ['Mat', 1]]
index = 1
for i in range(len(list1)):
    if list1[i][1] > 1:
        print(str(index) + ". " + str(list1[i][0]))
        index += 1
This prints:
1. Cat
2. Bat
You changed the description a bit so I rewrote the answer to be more fitting
lineList = [['Cat', 'c', 1, 'x'],['Cat', 'a', 2, 'x'],['Cat', 't', 3, 'x'],['Bat', 'b', 1, 3],['Bat', 'b', 1, 2],['Mat', 'm', 1, 1],['Fat', 'f', 1, 13]]
# First we create a dictionary with the repeating words in the list you gave
nameList = []
frequencyDict = {}
for i in range(len(lineList)):
    if lineList[i][0] in frequencyDict.keys():
        frequencyDict[lineList[i][0]] += 1
    else:
        frequencyDict[lineList[i][0]] = 1
    # this will give you a list with the order
    # it will be useful to get the indices of the repeating word later
    nameList.append(lineList[i][0])
# Print a numbered list of the words that are repeated
index = 1
repeats = []
for i in frequencyDict.keys():
    if frequencyDict[i] > 1:  # This if statement checks if it was repeated or not
        print(str(index) + ". " + i)
        repeats.append(i)  # we also create yet another list so the user can call it with their input later
        index += 1
x = int(input("Which item on the list would you like to know more information about: \n")) - 1  # note that we are subtracting one from the input so it matches the index of the list
# Here I am getting all the indices of the rows that start with the word the user chose
indicesList = []
for i in range(len(nameList)):
    if nameList[i] == repeats[x]:
        indicesList.append(i)
# Here I am printing the values at index 1 and 2 of each matching nested list in lineList
for i in range(len(indicesList)):
    print(str(lineList[indicesList[i]][1]) +
          " = " +
          str(lineList[indicesList[i]][2]))
I'm doing a project for my school and for now I have the following code:
def conjunto_palavras_para_cadeia1(conjunto):
    acc = []
    conjunto = sorted(conjunto, key=lambda x: (len(x), x))
    def by_size(words, size):
        result = []
        for word in words:
            if len(word) == size:
                result.append(word)
        return result
    for i in range(0, len(conjunto)):
        if i > 0:
            acc.append(("{} ->".format(i)))
            acc.append(by_size(conjunto, i))
    acc = ('[%s]' % ', '.join(map(str, acc)))
    print( acc.replace(",", "") and acc.replace("'", "") )
conjunto_palavras_para_cadeia1(c)
I have this list: c = ['A', 'E', 'LA', 'ELA'], and I want to return a string in which the words go from shortest to longest, with words of the same length sorted alphabetically. I haven't been able to do that...
OUTPUT: [;1 ->, [A, E], ;2 ->, [LA], ;3 ->, [ELA]]
WANTED OUTPUT: '[1->[A, E];2->[LA];3->[ELA]]'
Taking a look at your program, the only issue appears to be when you are formatting your output for display. Note that you can use str.format to insert lists into strings, something like this:
'{}->{}'.format(i, sublist)
Here's my crack at your problem, using sorted + itertools.groupby.
from itertools import groupby
r = []
for i, g in groupby(sorted(c, key=len), key=len):
    r.append('{}->{}'.format(i, sorted(g)).replace("'", ''))
print('[{}]'.format(';'.join(r)))
[1->[A, E];2->[LA];3->[ELA]]
A breakdown of the algorithm stepwise is as follows -
sort elements by length
group consecutive elements by length
for each group, sort sub-lists alphabetically, and then format them as strings
at the end, join each group string and surround with square brackets []
Shortest solution (using pure Python):
c = ['A', 'E', 'LA', 'ELA']
result = {}
for item in c:
    result[len(item)] = [item] if len(item) not in result else result[len(item)] + [item]
str_result = ', '.join(['{0} -> {1}'.format(res, sorted(result[res])) for res in result])
I will explain:
We go through the items one by one in a loop and add each one to a dictionary of lists keyed by word length.
We have in result:
{1: ['A', 'E'], 2: ['LA'], 3: ['ELA']}
And in str_result:
1 -> ['A', 'E'], 2 -> ['LA'], 3 -> ['ELA']
Should you have questions - ask
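The dictionary built above can also be turned into the exact string the question asks for. This is only a sketch of one way to do it: the function name `format_by_length` is made up for illustration, and the lengths are sorted explicitly rather than relying on insertion order.

```python
def format_by_length(words):
    """Group words by length, sort each group, and format like '[1->[A, E];2->[LA]]'."""
    groups = {}
    for word in words:
        # setdefault creates the list the first time a length is seen
        groups.setdefault(len(word), []).append(word)
    parts = []
    for size in sorted(groups):
        parts.append('{}->[{}]'.format(size, ', '.join(sorted(groups[size]))))
    return '[{}]'.format(';'.join(parts))

print(format_by_length(['A', 'E', 'LA', 'ELA']))  # [1->[A, E];2->[LA];3->[ELA]]
```

Joining the words directly avoids having to strip quote characters from the list's string representation afterwards.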
lets say I have an array "array_1" with these items:
A b A c
I want to get a new array "array_2" which looks like this:
b A c A
I tried this:
array_1 = ['A','b','A','c' ]
array_2 = []
for item in array_1:
    if array_1[array_1.index(item)] == array_1[array_1.index(item)].upper():
        array_2.append(array_1[array_1.index(item)+1]+array_1[array_1.index(item)])
The problem: The result looks like this:
b A b A
Does anyone know how to fix this? This would be really great!
Thanks, Nico.
It's because you have two 'A's in your array. For both of them,
array_1[array_1.index(item)+1]
will equal 'b' because the index method returns the first index of 'A'.
To correct this behavior, I suggest using an integer that you increment for each item. That way you retrieve the n-th item of the array, and your program won't return the same 'A' twice.
Responding to your comment, let's take back your code and add the integer:
array_1 = ['A','b','A','c' ]
array_2 = []
i = 0
for item in array_1:
    if array_1[i] == array_1[i].upper():
        array_2.append(array_1[i+1]+array_1[i])
    i = i + 1
In that case it works, but be careful: you need to add an if statement for the case where the last item of your array is an 'A', for example, because then array_1[i+1] won't exist.
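One way to add that guard is sketched below, using enumerate in place of the manual counter. The function name `swap_pairs` is hypothetical. Two assumptions differ from the code above: the swapped items are appended separately (to match the 'b A c A' result asked for) instead of being concatenated, and a trailing upper-case item with nothing after it is kept unchanged.

```python
def swap_pairs(items):
    """Return a new list where each upper-case item trades places with the item after it."""
    result = []
    for i, item in enumerate(items):
        if item != item.upper():
            continue  # lower-case items are emitted together with the preceding upper-case item
        if i + 1 < len(items):
            result.extend([items[i + 1], item])
        else:
            result.append(item)  # no partner left, so keep the item as-is
    return result

print(swap_pairs(['A', 'b', 'A', 'c']))  # ['b', 'A', 'c', 'A']
```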
I think that a simple flat list is the wrong data structure for the job if each lower case letter is paired with the adjacent upper case letter. I would turn it into a list of two-tuples, i.e.:
['A', 'b', 'A', 'c'] becomes [('A', 'b'), ('A', 'c')]
Then if you are looping through the items in the list:
for item in output_list:
    print(item[0]) # prints 'A'
    print(item[1]) # prints 'b' (for first item)
To do this:
input_list = ['A', 'b', 'A', 'c']
output_list = []
i = 0
while i + 1 < len(input_list):  # stop before a trailing unpaired item
    output_list.append((input_list[i], input_list[i+1]))
    i = i + 2
Then you can swap the order of the upper case letters and the lower case letters really easily using a list comprehension:
swapped = [(item[1], item[0]) for item in output_list]
Edit:
As you might have more than one lower case letter for each upper case letter, you could use a list for each group, and then have a list of these groups.
def group_items(input_list):
    output_list = []
    current_group = []
    for current_item in input_list:
        if current_item == current_item.upper() and current_group:
            # Upper case letter, so close the previous group and start a new one
            output_list.append(current_group)
            current_group = []
        current_group.append(current_item)
    if current_group:
        output_list.append(current_group)  # don't lose the last group
    return output_list
Then you can reverse each of the internal lists really easily:
[list(reversed(group)) for group in group_items(input_list)]
According to your last comment, you can get what you want using this:
array_1 = "SMITH Mike SMITH Judy".split()
surnames = array_1[0::2]  # items at even positions
names = array_1[1::2]     # items at odd positions
print(array_1)
array_1[0::2] = names
array_1[1::2] = surnames
print(array_1)
You get:
['SMITH', 'Mike', 'SMITH', 'Judy']
['Mike', 'SMITH', 'Judy', 'SMITH']
If I understood your question correctly, then you can do this:
It will work for any array with an even number of items.
array_1 = ['A','b','A','c' ]
array_2 = []
for index,itm in enumerate(array_1):
    if index % 2 == 0:
        array_2.append(array_1[index+1])
        array_2.append(array_1[index])
print(array_2)
Output:
['b', 'A', 'c', 'A']
So I'm trying to make this program that will ask the user for input and store the values in an array / list.
Then when a blank line is entered it will tell the user how many of those values are unique.
I'm building this for real life reasons and not as a problem set.
enter: happy
enter: rofl
enter: happy
enter: mpg8
enter: Cpp
enter: Cpp
enter:
There are 4 unique words!
My code is as follows:
# ask for input
ipta = input("Word: ")
# create list
uniquewords = []
counter = 0
uniquewords.append(ipta)
a = 0 # loop thingy
# while loop to ask for input and append in list
while ipta:
    ipta = input("Word: ")
    new_words.append(input1)
    counter = counter + 1
for p in uniquewords:
..and that's about all I've gotten so far.
I'm not sure how to count the unique number of words in a list?
If someone can post the solution so I can learn from it, or at least show me how it would be great, thanks!
In addition, use collections.Counter to refactor your code:
from collections import Counter
words = ['a', 'b', 'c', 'a']
Counter(words).keys() # equals to list(set(words))
Counter(words).values() # counts the elements' frequency
Output:
['a', 'c', 'b']
[2, 1, 1]
You can use a set to remove duplicates, and then the len function to count the elements in the set:
len(set(new_words))
values, counts = np.unique(words, return_counts=True)
More Detail
import numpy as np
words = ['b', 'a', 'a', 'c', 'c', 'c']
values, counts = np.unique(words, return_counts=True)
The function numpy.unique returns sorted unique elements of the input list together with their counts:
['a', 'b', 'c']
[2, 1, 3]
Use a set:
words = ['a', 'b', 'c', 'a']
unique_words = set(words) # == set(['a', 'b', 'c'])
unique_word_count = len(unique_words) # == 3
Armed with this, your solution could be as simple as:
words = []
ipta = input("Word: ")
while ipta:
    words.append(ipta)
    ipta = input("Word: ")
unique_word_count = len(set(words))
print("There are %d unique words!" % unique_word_count)
aa="XXYYYSBAA"
bb=dict(zip(list(aa),[list(aa).count(i) for i in list(aa)]))
print(bb)
# output:
# {'X': 2, 'Y': 3, 'S': 1, 'B': 1, 'A': 2}
For ndarray there is a numpy method called unique:
np.unique(array_name)
Examples:
>>> np.unique([1, 1, 2, 2, 3, 3])
array([1, 2, 3])
>>> a = np.array([[1, 1], [2, 3]])
>>> np.unique(a)
array([1, 2, 3])
For a pandas Series there is a method called value_counts():
Series_name.value_counts()
If you would like a histogram of the unique values, here's a one-liner:
import numpy as np
unique_labels, unique_counts = np.unique(labels_list, return_counts=True)
labels_histogram = dict(zip(unique_labels, unique_counts))
How about:
import pandas as pd
#List with all words
words=[]
#Code for adding words
words.append('test')
#When Input equals blank:
pd.Series(words).nunique()
It returns how many unique values are in a list
You can use get method:
lst = ['a', 'b', 'c', 'c', 'c', 'd', 'd']
dictionary = {}
for item in lst:
    dictionary[item] = dictionary.get(item, 0) + 1
print(dictionary)
Output:
{'a': 1, 'b': 1, 'c': 3, 'd': 2}
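From a dictionary of counts like this one, both numbers discussed in these threads follow directly: the number of distinct items is the number of keys, and the number of items that occur exactly once is the number of values equal to 1. A small sketch, where the helper name `count_items` is hypothetical:

```python
def count_items(items):
    """Return a dict mapping each item to the number of times it appears."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts

counts = count_items(['a', 'b', 'c', 'c', 'c', 'd', 'd'])
distinct = len(counts)                               # number of different items
singles = sum(1 for n in counts.values() if n == 1)  # items that occur exactly once
print(distinct, singles)  # 4 2
```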
ipta = input("Word: ") ## asks for input
words = [] ## creates list
unique_words = set(words)
Although a set is the easiest way, you could also use a dict and a membership test (key in some_dict) to populate a dictionary with only unique keys.
Assuming you have already populated words with input from the user, create a dict mapping the unique words in the list to a number:
word_map = {}
i = 1
for j in range(len(words)):
    if words[j] not in word_map:
        word_map[words[j]] = i
        i += 1
num_unique_words = len(word_map) # or num_unique_words = i - 1, however you prefer
Another method, using pandas:
import pandas as pd
LIST = ["a","a","c","a","a","v","d"]
counts,values = pd.Series(LIST).value_counts().values, pd.Series(LIST).value_counts().index
df_results = pd.DataFrame(list(zip(values,counts)),columns=["value","count"])
You can then export results in any format you want
The following should work. The lambda function filters out the duplicated words.
from functools import reduce
inputs=[]
word = input("Word: ").strip()
while word:
    inputs.append(word)
    word = input("Word: ").strip()
uniques=reduce(lambda x,y: ((y in x) and x) or x+[y], inputs, [])
print('There are', len(uniques), 'unique words')
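The reduce call keeps the first occurrence of each word in input order, but the `y in x` test rescans the accumulated list for every word. On Python 3.7 or later, where dictionaries preserve insertion order, the same result can be sketched with `dict.fromkeys`. The function name `unique_in_order` is hypothetical, and the sample words are taken from the question's example session.

```python
def unique_in_order(words):
    """Return the distinct words in the order they first appeared."""
    # dict keys are unique and (in Python 3.7+) keep insertion order
    return list(dict.fromkeys(words))

inputs = ['happy', 'rofl', 'happy', 'mpg8', 'Cpp', 'Cpp']
uniques = unique_in_order(inputs)
print('There are', len(uniques), 'unique words')  # There are 4 unique words
```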
I'd use a set myself, but here's yet another way:
uniquewords = []
while True:
    ipta = input("Word: ")
    if ipta == "":
        break
    if ipta not in uniquewords:
        uniquewords.append(ipta)
print("There are", len(uniquewords), "unique words!")
ipta = input("Word: ") ## asks for input
words = [] ## creates list
while ipta: ## while loop to ask for input and append in list
    words.append(ipta)
    ipta = input("Word: ")
# Create a set, sets do not have repeats
unique_words = set(words)
print("There are " + str(len(unique_words)) + " unique words!")
This is my own version
def unique_elements():
    elem_list = []
    dict_unique_word = {}
    for i in range(5):  # say you want to check for unique words from five given words
        word_input = input('enter element: ')
        elem_list.append(word_input)
        if word_input not in dict_unique_word:
            dict_unique_word[word_input] = 1
        else:
            dict_unique_word[word_input] += 1
    return elem_list, dict_unique_word
result_1, result_2 = unique_elements()
# result_1 holds the list of all inputted elements
# result_2 contains unique words with their count
print(result_2)