I am having a logic error of sorts and I cannot seem to pick it out. Here is what I have:
Document = 'Sample1'
locationslist = []
thedictionary = []
userword = ['the', 'a']
filename = 'Sample1'
for inneritem in userword:
    thedictionary.append((inneritem,locationslist))
    for position, item in enumerate(file_contents):
        if item == inneritem:
            locationslist.append(position)
wordlist = (thedictionary, Document)
print wordlist
So basically I am trying to build a larger list (thedictionary) out of smaller lists (locationslist), each paired with its particular userword. I almost have it, except that the output puts the locations of all the words (there are only two, 'the' and 'a') into each of the lists. It seems like a simple logic problem, but I can't seem to spot it. The output is:
([('the', [5, 28, 41, 97, 107, 113, 120, 138, 141, 161, 2, 49, 57, 131, 167, 189, 194, 207, 215, 224]),
('a', [5, 28, 41, 97, 107, 113, 120, 138, 141, 161, 2, 49, 57, 131, 167, 189, 194, 207, 215, 224])],
'Sample1')
But should be:
([('the', [5, 28, 41, 97, 107, 113, 120, 138, 141, 161]),
('a', [2, 49, 57, 131, 167, 189, 194, 207, 215, 224])],
'Sample1')
See how both position lists end up in the output for each of the userwords 'the' and 'a'? I could use some advice on what I am doing wrong here.
You only create one locationslist, so you only have one. It is shared by both words. You need to create a new locationslist on each loop iteration:
for inneritem in userword:
    locationslist = []
    thedictionary.append((inneritem,locationslist))
    # etc.
You have only created the one locationslist, so all of the locationslist.append() calls modify that list. You append the same locationslist to as many tuples in thedictionary as you have elements in userword. You should create one location list for each element of userword.
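To see the sharing directly, here is a minimal sketch that compares reusing one list with creating a new list per word. The names `shared`, `pairs_shared` and `pairs_fresh` are made up for illustration and do not appear in the question.

```python
# One list object reused for every word: both tuples point at the same list.
shared = []
pairs_shared = [(word, shared) for word in ['the', 'a']]
pairs_shared[0][1].append(5)       # intended only for 'the'
print(pairs_shared)                # the 5 shows up under 'a' as well
print(pairs_shared[0][1] is pairs_shared[1][1])   # True: same object

# A fresh list per word: the lists are independent.
pairs_fresh = [(word, []) for word in ['the', 'a']]
pairs_fresh[0][1].append(5)
print(pairs_fresh)                 # only 'the' has the 5
```

The `is` check is the quickest way to confirm that two names refer to one list rather than to two equal lists.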
The algorithm you have could be written as a nested set of list comprehensions, which would lead to the correct lists being created:
user_word = ['the', 'a']
word_list = ([(uw,
               [position for position, item in enumerate(file_contents)
                if item == uw])
              for uw in user_word],
             'Sample1')
That would still call enumerate(file_contents) once for each item in user_word, which could be expensive if file_contents is large.
I suggest you rewrite this to pass over file_contents once, check the item at each position against the contents of user_word, and append the position to only the list for the particular user_word found at that position. I would suggest using a dictionary to keep the user_word lists separate and accessible:
document = 'Sample1'
temp_dict = dict((uw, []) for uw in user_word)
for position, item in enumerate(file_contents):
    if item in temp_dict:
        temp_dict[item].append(position)
wordlist = ([(uw, temp_dict[uw]) for uw in user_word], document)
Either solution will get you the positions of each user_word, in order of appearance, in the document being scanned. It will also return the list structure you're looking for.
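As a self-contained check of the single-pass approach, here is a sketch that assumes file_contents is a list of words. The question never shows how file_contents is built, so the sample sentence and the `.split()` call below are assumptions.

```python
# Assumption: file_contents is a list of words; this sample text is made up.
file_contents = 'a cat sat on the mat and the dog saw a bird'.split()
user_word = ['the', 'a']
document = 'Sample1'

# One independent list per word, filled in a single pass over the text.
temp_dict = dict((uw, []) for uw in user_word)
for position, item in enumerate(file_contents):
    if item in temp_dict:
        temp_dict[item].append(position)

# Rebuild the (word, positions) pairs in the order the words were given.
wordlist = ([(uw, temp_dict[uw]) for uw in user_word], document)
print(wordlist)
# ([('the', [4, 7]), ('a', [0, 10])], 'Sample1')
```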
Related
I have two copies of a list, one sorted, one isn't, in a dictionary which serve to find the index of any number in the list, starting with the largest. When I print the lists the output is as follows:
wealth_comp = {
'Wealth1': [131, 127, 125, 125, 123, 121, 121, 117, 115, 107, 105, 101],
'Wealth2': [127, 125, 121, 117, 105, 121, 107, 123, 131, 101, 115, 125]
}
but when I run
index = wealth_comp["Wealth2"].index([wealth_comp["Wealth1"][x]])
it gives me
ValueError: [131] is not in list
when it is in the list.
[131] is clearly not in the list. 131 is. So get rid of [] brackets.
index = wealth_comp["Wealth2"].index(wealth_comp["Wealth1"][x])
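The question leaves x undefined. Assuming it is meant to run over every position of Wealth1, a sketch of the full lookup could look like the following (the name `positions` is made up). Note that `list.index` returns the first match, so repeated values such as 125 and 121 map to the same index each time.

```python
wealth_comp = {
    'Wealth1': [131, 127, 125, 125, 123, 121, 121, 117, 115, 107, 105, 101],
    'Wealth2': [127, 125, 121, 117, 105, 121, 107, 123, 131, 101, 115, 125],
}

# Look up each sorted value itself (not a one-element list) in the unsorted list.
positions = [wealth_comp['Wealth2'].index(value)
             for value in wealth_comp['Wealth1']]
print(positions)
# [8, 0, 1, 1, 7, 2, 2, 3, 10, 6, 4, 9]
```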
>>> wealth_comp = {
...     'Wealth1': [131,127,125,125,123,121,121,117,115,107,105,101],
...     'Wealth2': [127,125,121,117,105,121,107,123,131,101,115,125]
... }
>>> index = wealth_comp["Wealth2"].index(131)
>>> index
8
Cheers
Let's suppose that I have 3 python two-dimensional lists (data_1, data_2, data_3) of 10x5 dimensions. I want to make one list (all_data) from them with 30x5 dimensions. For now, I am doing this by applying the following:
data_1 = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], ..., [46, 47, 48, 49, 50]]
data_2 = [[101, 102, 103, 104, 105], [106, 107, 108, 109, 110], ..., [146, 147, 148, 149, 150]]
data_3 = [[201, 202, 203, 204, 205], [206, 207, 208, 209, 210], ..., [246, 247, 248, 249, 250]]
all_data = []
for index, item in enumerate(data_1):
    all_data.append(item)
for index, item in enumerate(data_2):
    all_data.append(item)
for index, item in enumerate(data_3):
    all_data.append(item)
If I simply do something like this:
all_data.append(data_1)
all_data.append(data_2)
all_data.append(data_3)
then I get a list of 3x10x5 which is not what I want.
So can I create a 30x5 list by appending three 10x5 lists without using any for loop?
You can just extend your list.
data = []
data.extend(d1)
data.extend(d2)
data.extend(d3)
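The difference between append and extend is what decides the resulting shape. Here is a minimal sketch using made-up 2x2 stand-ins for the 10x5 lists, so the output stays short:

```python
# Small stand-ins for the 10x5 lists (2x2 here, values are made up).
d1 = [[1, 2], [3, 4]]
d2 = [[101, 102], [103, 104]]
d3 = [[201, 202], [203, 204]]

nested = []
nested.append(d1)   # append adds the whole list as ONE element
nested.append(d2)
nested.append(d3)

flat = []
flat.extend(d1)     # extend adds each row of the list individually
flat.extend(d2)
flat.extend(d3)

print(len(nested), len(nested[0]))   # 3 2  -> shape 3 x 2 x 2
print(len(flat), len(flat[0]))       # 6 2  -> shape 6 x 2
```

With the real 10x5 inputs the same calls would give 3x10x5 for append and 30x5 for extend.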
Simply write:
all_data = data_1 + data_2 + data_3
If you want to merge all the corresponding sublists of these lists, you can write:
# function that merges all sublists into a single list
def flatten(l):
    return [el for subl in l for el in subl]
# zipped is a sequence of tuples of corresponding items (our deep sublists) from all three lists
zipped = zip(data_1, data_2, data_3)
# merge the tuples to get a single list of rearranged sublists
all_data = flatten(zipped)
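To make the resulting row order concrete, here is a small sketch of the zip-and-flatten idea on made-up 2x2 stand-ins for the three lists. The rows come out interleaved (first rows of each list, then second rows, and so on) rather than concatenated.

```python
def flatten(l):
    # Merge one level of nesting into a single list.
    return [el for subl in l for el in subl]

# Made-up 2x2 stand-ins for the 10x5 lists.
data_1 = [[1, 2], [3, 4]]
data_2 = [[101, 102], [103, 104]]
data_3 = [[201, 202], [203, 204]]

interleaved = flatten(zip(data_1, data_2, data_3))
print(interleaved)
# [[1, 2], [101, 102], [201, 202], [3, 4], [103, 104], [203, 204]]
```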
How about:
all_data = data_1 + data_2 + data_3
This should work:
data=[]
data[len(data):] = data1
data[len(data):] = data2
data[len(data):] = data3
In case you don't mind it being lazy, you could also do
import itertools
all_data = itertools.chain(data1, data2, data3)
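"Lazy" here means that chain returns an iterator, not a list: nothing is copied until you consume it, and it can be consumed only once. A short sketch with made-up 2x2 inputs:

```python
import itertools

# Made-up 2x2 stand-ins for the 10x5 lists.
data1 = [[1, 2], [3, 4]]
data2 = [[101, 102], [103, 104]]
data3 = [[201, 202], [203, 204]]

lazy = itertools.chain(data1, data2, data3)   # no rows are copied yet
all_data = list(lazy)                         # materialize when a real list is needed
print(all_data)
print(list(lazy))                             # [] because the iterator is exhausted
```

If you need indexing or len(), wrap the result in list() as shown; if you only loop over the rows once, the iterator alone is enough.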
Below is the given list:
x = [[50,55,57],[50,55,58],[50,55,60],[50,57,58],[50,57,60],[50,58,60],[55,57,58],[55,57,60],[55,58,60],[57,58,60]]
What I need here is the sum of numbers of each nested list in a different list.
For e.g [162,163,.....]
>>> x = [[50,55,57],[50,55,58],[50,55,60],[50,57,58],[50,57,60],[50,58,60],[55,57,58],[55,57,60],[55,58,60],[57,58,60]]
>>> y = [sum(i) for i in x]
>>> y
[162, 163, 165, 165, 167, 168, 170, 172, 173, 175]
You can simply do :
x = [[50,55,57],[50,55,58],[50,55,60],[50,57,58],[50,57,60],[50,58,60],[55,57,58],[55,57,60],[55,58,60],[57,58,60]]
print(list(map(sum,x)))
output:
[162, 163, 165, 165, 167, 168, 170, 172, 173, 175]
Just loop through the items.
First you loop through the internal lists:
for lst in x:
Then you sum the items in each list using the built-in sum function:
total = sum(lst)
Together it looks like this:
new_list = list()
for lst in x:
    total = sum(lst)
    new_list.append(total)
print(new_list)
Hope this helps.
Edit: I haven't used the sum method before. Which is why the two downvotes despite the program working fine.
I'm trying to loop over a list and save some elements to a new list: the ones that are more than 50 greater than the previous value. My current code saves only the first value. I want [0, 76, 176, 262, 349].
list = [0, 76, 91, 99, 176, 192, 262, 290, 349]
new_list = []
for i in list:
    if ((i+1)-i) > 50:
        new_list.extend([i])
Solution
What I want is to save the values 0, 76, 176, 262, and 349.
So it sounds like you want to iterate over the list, and if the current element is greater than its preceding element by 50 then save it to the list and repeat. This solution assumes the original list is sorted. Let's turn this into some code.
lst = [0, 76, 91, 99, 176, 192, 262, 290, 349]
new_lst = []
for i, num in enumerate(lst):
    if i == 0 or num - lst[i-1] > 50:
        new_lst.append(num)
print(new_lst)
# [0, 76, 176, 262, 349]
The interesting part here is the conditional within the loop:
if i == 0 or num - lst[i-1] > 50:
The first part considers if we're at the first element, in which case I assume you want to add the element no matter what. Otherwise, get the difference between our current element and the previous element in the original list, and check if the difference is greater than 50. In both cases, we want to append the element to the list (hence the or).
Notes
You should avoid using list as a variable so you don't shadow the built-in Python type.
lst.extend([num]) is better written as lst.append(num).
enumerate(lst) allows you to easily get both the index and value of each element in a list (or any iterable).
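If the same filter is needed in more than one place, it could be wrapped in a small function. This is only a sketch: the name `keep_jumps` and the `threshold` parameter are made up here, and the empty-list behaviour is an assumption about what you would want.

```python
def keep_jumps(values, threshold=50):
    """Return the first value plus every value that exceeds its
    predecessor by more than `threshold`."""
    result = []
    for i, num in enumerate(values):
        # i == 0 keeps the first element; otherwise compare with the previous one.
        if i == 0 or num - values[i - 1] > threshold:
            result.append(num)
    return result

print(keep_jumps([0, 76, 91, 99, 176, 192, 262, 290, 349]))
# [0, 76, 176, 262, 349]
print(keep_jumps([]))
# []
```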
The statement if ((i+1)-i) > 50 will always evaluate to if (1 > 50) which is false. You're looking for the next element in list, but i is simply the value of the current element. Try something like the tee() function in the itertools library to get multiple values, or something like
list = [0, 76, 91, 99, 176, 192, 262, 290, 349]
new_list = []
for i in range(len(list)):
    print (i, list[i])
    if i == 0 or (list[i] - list[i-1]) > 50:
        new_list.append(list[i])
Keep in mind, I didn't add any checks for whether there is an i + 1 element, so you may need to do some error handling.
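For the tee() route mentioned above, one possible sketch is to make two iterators over the same data and advance one of them by a single step, so that zip() yields (previous, current) pairs. The variable names here are made up for illustration.

```python
import itertools

values = [0, 76, 91, 99, 176, 192, 262, 290, 349]

# Two independent iterators over the same data; advance the second by one
# so that zip() pairs each element with its successor.
prev_it, curr_it = itertools.tee(values)
next(curr_it, None)

# values[:1] keeps the first element and is safe for an empty list.
new_list = values[:1] + [curr for prev, curr in zip(prev_it, curr_it)
                         if curr - prev > 50]
print(new_list)
# [0, 76, 176, 262, 349]
```

Because zip() stops at the shorter iterator, no index ever runs off the end of the list, so no extra bounds check is needed.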
EDIT
This is my final edit. I would like to thank #Prune for enforcing standards :)
my solution
l = [0, 76, 91, 99, 176, 192, 262, 290, 349]
# [l[0]] keeps the first element; list(l[0]) would raise a TypeError on an int
o = [l[0]] + [a for a, b in zip(l[1:], l[:-1]) if a - b > 50]
output: [0, 76, 176, 262, 349]
new_list = [list[0]]  # the first element is always kept
for i in range(len(list)-1):
    if (list[i+1]-list[i]) > 50:
        new_list.append(list[i+1])  # keep the element that made the jump
Of course, the code can be made better. But this should work.
You can put a condition inside a list comprehension.
Also, do not give a variable the same name (list) as a built-in type.
my_list = [0, 76, 91, 99, 176, 192, 262, 290, 349]
# Start with the first element;
# then grab each element more than 50 greater than the previous one.
new_list = [my_list[0]]
new_list.extend([my_list[i] for i in range(1, len(my_list))
                 if my_list[i] - my_list[i-1] > 50])
print(new_list)
Output:
[0, 76, 176, 262, 349]
This is kind of hard to describe so I'll show it mainly in code. I'm taking a List of a List of numbers and appending it to a masterList.
The first list in the master list would be made of the first element of each list, with 0 inserted at its appropriate index. Then I would move on to the next list: I would take the first element of the 2nd list and append it to the second list in the master list and, since its index would be 1, insert 0 at index 1 of that list. This is WAY confusing, so please comment back if you have any questions about it. I'll answer back fast. This is really bugging me.
ex:
L = [[], [346], [113, 240], [2974, 1520, 684], [169, 1867, 41, 5795]]
What i want is this:
[[0,346,113,2974,169],[346,0,240,1520,1867],[113,240,0,684,41],[2974,1520,684,0,5795],[169,1867,41,5795,0]]
IIUC, you want something like
>>> L = [[], [346], [113, 240], [2974, 1520, 684], [169, 1867, 41, 5795]]
>>> [x+[0]+[L[j][i] for j in range(i+1, len(L))] for i, x in enumerate(L)]
[[0, 346, 113, 2974, 169], [346, 0, 240, 1520, 1867],
[113, 240, 0, 684, 41], [2974, 1520, 684, 0, 5795],
[169, 1867, 41, 5795, 0]]
which might be easier to read in expanded form:
combined = []
for i, x in enumerate(L):
    newlist = x + [0]
    for j in range(i+1, len(L)):
        newlist.append(L[j][i])
    combined.append(newlist)
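The result is a symmetric matrix with a zero diagonal, built from its lower triangle. As a sketch, the expanded loop can be wrapped in a function and checked for symmetry; the name `to_symmetric` is made up here.

```python
def to_symmetric(lower):
    """Build the full square matrix from its strict lower triangle."""
    combined = []
    for i, row in enumerate(lower):
        new_row = row + [0]                  # left part of the row plus the diagonal
        for j in range(i + 1, len(lower)):
            new_row.append(lower[j][i])      # mirror column i from the rows below
        combined.append(new_row)
    return combined

L = [[], [346], [113, 240], [2974, 1520, 684], [169, 1867, 41, 5795]]
matrix = to_symmetric(L)
n = len(matrix)
# Every entry should equal its mirror image across the diagonal.
print(all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n)))
# True
```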