is there a way to avoid nested loop - python

I'm given two different data sets, and I'm wondering if there is a way to get specific data without using a nested loop.
firstdata = [[["key"],["value"]],
[[2],["two"]],
[[3],["three"]]]
seconddata = [[["key"],["artimatic"]],
[[2],["0+2"]],
[[2],["1+1"]],
[[3],["0+3"]],
[[3],["1+2"]],
[[3],["2+1"]]]
# nested loop solution would look like this
for x in firstdata:
    for y in seconddata:
        print(x[1])
        if x[0] == y[0]:
            print(y)
Is there an alternative solution that lets me loop over seconddata without using a nested loop?

OK, I am assuming the data structure of firstdata and seconddata will be the same:
firstdata_dict = {x[0][0]: x[1][0] for x in firstdata}
seconddata_dict = {}
for data in seconddata:
    # dict.has_key() was removed in Python 3; use the `in` operator instead
    if data[0][0] not in seconddata_dict:
        seconddata_dict[data[0][0]] = []
    seconddata_dict[data[0][0]].append(data[1][0])
for key, value in firstdata_dict.items():
    if seconddata_dict.get(key):
        # key match add your algo
        print(seconddata_dict[key])
Output (Python 3.7+, where dicts keep insertion order):
['artimatic']
['0+2', '1+1']
['0+3', '1+2', '2+1']

Start by converting each list to a dictionary, as shown here. The keys are "key", 2 and 3, and each value is the list of strings associated with that key in the list.
def convert_to_dct(lst):
    dct = {}
    for x in lst:
        key = x[0][0]
        value = x[1][0]
        # create the list the first time a key is seen, then append to it
        dct.setdefault(key, []).append(value)
    return dct
This function converts the lists as follows
firstdata = [[["key"],["value"]],
[[2],["two"]],
[[3],["three"]]]
seconddata = [[["key"],["artimatic"]],
[[2],["0+2"]],
[[2],["1+1"]],
[[3],["0+3"]],
[[3],["1+2"]],
[[3],["2+1"]]]
firstdict = convert_to_dct(firstdata)
seconddict = convert_to_dct(seconddata)
print(firstdict)
print(seconddict)
#{'key': ['value'], 2: ['two'], 3: ['three']}
#{'key': ['artimatic'], 2: ['0+2', '1+1'], 3: ['0+3', '1+2', '2+1']}
Then to get your final result, do
for key in firstdict:
    if key in seconddict:
        print(seconddict[key])
#['artimatic']
#['0+2', '1+1']
#['0+3', '1+2', '2+1']

I don't think the two answers understand the question correctly, so here you go:
Create your data
firstData = [[["key"],["value"]],
[[2],["two"]],
[[3],["three"]]]
secondData = [[['key'],["artimatic"]],
[[2],["0+2"]],
[[2],["1+1"]],
[[3],["0+3"]],
[[3],["1+2"]],
[[3],["2+1"]]]
Then
firstdata_dict = {x[0][0]: x[1][0] for x in firstData} #Creates the dictionary of first Data
Then do the computation
for element in secondData:
    if element[0][0] in firstdata_dict:  # dict lookup uses a hash table, so it is O(1) on average
        print(element)
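The dictionary-based answers above share one idea: index one list by key once, then look each key up instead of rescanning. A minimal sketch of that idea, assuming the same nested [[key], [value]] layout as in the question; the helper name group_by_key is hypothetical, and collections.defaultdict is swapped in for the manual "is the key there yet" check.

```python
from collections import defaultdict

firstdata = [[["key"], ["value"]], [[2], ["two"]], [[3], ["three"]]]
seconddata = [[["key"], ["artimatic"]], [[2], ["0+2"]], [[2], ["1+1"]],
              [[3], ["0+3"]], [[3], ["1+2"]], [[3], ["2+1"]]]

def group_by_key(rows):
    """Map each key to the list of values that share it (single pass)."""
    grouped = defaultdict(list)
    for (key,), (value,) in rows:  # each row looks like [[key], [value]]
        grouped[key].append(value)
    return grouped

second_by_key = group_by_key(seconddata)

# One pass over firstdata; each lookup is a hash lookup, not a scan of seconddata.
matches = {key[0]: second_by_key.get(key[0], []) for key, _ in firstdata}
print(matches)
```

The total work is roughly len(firstdata) + len(seconddata) steps rather than their product.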

Related

Remove file name duplicates in a list

I have a list l:
l = ['Abc.xlsx', 'Wqe.csv', 'Abc.csv', 'Xyz.xlsx']
In this list, I need to remove duplicates without considering the extension. The expected output is below.
l = ['Wqe.csv', 'Abc.csv', 'Xyz.xlsx']
I tried:
l = list(set(x.split('.')[0] for x in l))
But this gives me only the unique file names, without their extensions.
How could I achieve it?
You can use a dictionary comprehension that uses the name part as key and the full file name as the value, exploiting the fact that dict keys must be unique:
>>> list({x.split(".")[0]: x for x in l}.values())
['Abc.csv', 'Wqe.csv', 'Xyz.xlsx']
If the file names can be in more sophisticated formats (such as with directory names, or in the foo.bar.xls format) you should use os.path.splitext:
>>> import os
>>> list({os.path.splitext(x)[0]: x for x in l}.values())
['Abc.csv', 'Wqe.csv', 'Xyz.xlsx']
If it doesn't matter which of the duplicates is kept, we could split each item on the period. We'll regard the part before the first period as the key and keep the item only if its key has not been seen yet (so 'Abc.xlsx' is kept here rather than 'Abc.csv').
oldList = l
setKeys = set()
l = []
for item in oldList:
    itemKey = item.split(".")[0]
    if itemKey not in setKeys:
        setKeys.add(itemKey)
        l.append(item)
Try this
l = ['Abc.xlsx', 'Wqe.csv', 'Abc.csv', 'Xyz.xlsx']
for x in l:
    name = x.split('.')[0]
    find = 0
    for index,d in enumerate(l, start=0):
        txt = d.split('.')[0]
        if name == txt:
            find += 1
            if find > 1:
                l.pop(index)
print(l)
@Selcuk Definitely the best solution; unfortunately I don't have enough reputation to upvote your answer.
But I would rather use x[:x.rfind('.')] as my dictionary key than os.path.splitext(x)[0], in order to handle names in more sophisticated formats. That gives something like this:
list({x[:x.rfind('.')]: x for x in l}.values())
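The three ways of extracting the key mentioned above (split, os.path.splitext, rfind) behave differently on names with several dots or with none. A small sketch comparing them; the sample names and the helper names key_split, key_splitext and key_rfind are made up for illustration.

```python
import os

names = ["foo.bar.xls", "README", "dir.v2/report.csv"]

def key_split(name):
    return name.split(".")[0]          # text before the FIRST dot

def key_splitext(name):
    return os.path.splitext(name)[0]   # strips only the final extension

def key_rfind(name):
    return name[:name.rfind(".")]      # rfind returns -1 when there is no dot

for name in names:
    print(name, "|", key_split(name), "|", key_splitext(name), "|", key_rfind(name))
```

Note the dot-less case: because rfind returns -1 there, the slice drops the last character of the name, so that variant probably needs a guard if such names can occur.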

How to convert this code based on lists to code based on numpy arrays?

I am working with Keras and there is always a problem with lists, so I guess everything has to be converted to NumPy arrays, which I find very unintuitive. I guess this is related to performance? I don't see any other reason. In any case, my problem is shown below. I have to convert this part of the code:
output_sentence = []
final_output_sentence = []
for key in row['o'].lower():
    temp_list = []
    if key in dictionary.keys():
        temp_list.append(dictionary[key])
        output_sentence.append(temp_list)
    else:
        dictionary[key] = len(dictionary)
        temp_list.append(dictionary[key])
        output_sentence.append(temp_list)
final_output_sentence.append(output_sentence)
to code based on NumPy arrays. I tried it this way:
output_sentence = np.array([], dtype=int)
final_output_sentence = np.array([], dtype=int)
for key in row['o'].lower():
    temp_list = np.array([], dtype=int)
    if key in dictionary.keys():
        temp_list = np.append(temp_list, dictionary[key])
        output_sentence = np.append(output_sentence, temp_list)
    else:
        dictionary[key] = len(dictionary)
        temp_list = np.append(temp_list, dictionary[key])
        output_sentence = np.append(output_sentence, temp_list)
final_output_sentence = np.append(final_output_sentence, output_sentence)
However, instead of [[[1], [2], [3], [2], [4]]] I get [1 2 3 2 4]. Any ideas how to solve this?
UPDATE
What do you think about solution shown below? Any tips for performance optimization?
output_sentence = []
for key in row['o'].lower():
    temp_list = []
    if key in dictionary.keys():
        temp_list.append(dictionary[key])
        output_sentence.append(temp_list)
    else:
        dictionary[key] = len(dictionary)
        temp_list.append(dictionary[key])
        output_sentence.append(temp_list)
final_output_sentence = np.array(output_sentence)
final_output_sentence = final_output_sentence.reshape(1, final_output_sentence.shape[0], 1)
output_sentence = []
for key in row['o'].lower():
    if key not in dictionary:
        dictionary[key] = len(dictionary)
    output_sentence.append(dictionary[key])
final_output_sentence = np.array(output_sentence).reshape(1,-1,1)
If the key does not exist in dictionary, add it with the next index (the current size of the dictionary).
Append the value corresponding to the key to output_sentence.
Finally, output_sentence is a list, but since you want a 3D array, convert it into a NumPy array and reshape it.
x.reshape(1,-1,1) => reshape x such that the size of axis 0 is 1, the size of axis 2 is 1, and the size of axis 1 equals the number of elements in x.
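The index-assignment step in the loop above can also be written with dict.setdefault. A minimal sketch without NumPy, assuming a hypothetical helper named encode and an initially empty dictionary (so indices start at 0 here, unlike the values in the question's expected output); the nested list it returns has the same (1, n, 1) layout that reshape(1,-1,1) produces.

```python
def encode(text, dictionary):
    """Return [[[i0], [i1], ...]], one index per character of text."""
    indices = []
    for ch in text.lower():
        # setdefault stores len(dictionary) only when ch is new,
        # and in either case returns the index stored for ch
        indices.append(dictionary.setdefault(ch, len(dictionary)))
    return [[[i] for i in indices]]

vocab = {}
encoded = encode("Hello", vocab)
print(encoded)  # [[[0], [1], [2], [2], [3]]]
print(vocab)    # {'h': 0, 'e': 1, 'l': 2, 'o': 3}
```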

Dictionary initialization syntax

def __init__(self, devices, queue):
    '''
    '''
    self.devices = devices
    self.queue = queue
    values = {k:0 for k in devices.keys()}
    values[0xbeef] = len(values) # the number of devices
    super(CallbackDataBlock, self).__init__(values)
Can someone help me explain the following two lines:
values = {k:0 for k in devices.keys()}
What does k:0 do?
values[0xbeef] = len(values) # the number of devices
Does this mean that a new item {0xbeef: length} is added to the dict?
k is a key of the dictionary. The set of all keys is returned by devices.keys() (a list in Python 2, a view object in Python 3); we loop through it, take each key, and initialize its value to zero.
Yes, you are right. The next statement adds a new key, 0xbeef, and sets its value to the number of entries in the dictionary at that point.
{k:0 for k in devices.keys()} creates a dictionary with all the keys of devices and 0 for all values. And your assessment is correct: it creates a new entry {0xbeef: number of keys in the dictionary}.
In the Python documentation you can read about list comprehensions.
This pattern is important:
[expression for item in iterable if condition]
or, for simple usage:
[expression for item in iterable]
With a list we can use:
numbers = [0,1,2,3,4,5]
a = [x for x in numbers]
print(a)
printed:
[0, 1, 2, 3, 4, 5]
and we have:
a = [x*2 for x in numbers]
print(a)
printed:
[0, 2, 4, 6, 8, 10]
And for a dictionary we have this syntax:
{key1:value1, key2:value2, . . .}
and an example:
numbers = [0,1,2,3,4,5]
d = {k:0 for k in numbers}
print(d)
In this example k:0 means: put 0 as the value of each key k
printed:
{0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0}
One more thing:
Python dictionaries have two helpful methods: dict.keys() and dict.values().
When we use dict.keys(), Python returns the dict's keys (a list in Python 2, a view object in Python 3):
d = {"name":"sam", "job":"programer", "salary":"25000"}
print(list(d.keys()))
print(list(d.values()))
printed:
['name', 'job', 'salary']
['sam', 'programer', '25000']
To add a new entry to a dictionary we use:
d[newkey] = newValue
for example:
d[10] = 'mary'
print(d[10])
printed:
mary
Now your answer:
In your code
1) k:0 means: put 0 as the value of each key k
2) 0xbeef is a hex literal == 48879 in decimal
values[48879] = len(values)
so that entry is filled with the number of entries in the dict.
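To make the two lines from the question concrete, here is a small sketch with a made-up devices dict (its keys and values are assumptions for illustration only, not taken from the original class).

```python
devices = {0x01: "pump", 0x02: "valve", 0x03: "sensor"}  # hypothetical devices

values = {k: 0 for k in devices}  # same keys as devices, every value 0
values[0xbeef] = len(values)      # len() runs before the assignment, so it counts only the devices

print(values)
```

Because the right-hand side is evaluated first, the stored count does not include the 0xbeef entry itself.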

Loop through entries in a list and create new list

I have a list of strings that looks like this:
name=['Jack','Sam','Terry','Sam','Henry',.......]
I want to create a newlist with the logic shown below. I want to go to every entry in name and assign it a number if the entry is seen for the first time. If it is being repeated(as in the case with 'Sam') I want to assign it the corresponding number, include it in my newlist and continue.
newlist = []
name[1] = 'Jack'
Jack = 1
newlist = ['Jack']
name[2] = 'Sam'
Sam = 2
newlist = ['Jack','Sam']
name[3] = 'Terry'
Terry = 3
newlist = ['Jack','Sam','Terry']
name[4] = 'Sam'
Sam = 2
newlist = ['Jack','Sam','Terry','Sam']
name[5] = 'Henry'
Henry = 5
newlist = ['Jack','Sam','Terry','Sam','Henry']
I know this can be done with something like
u,index = np.unique(name,return_inverse=True)
but for me it is important to loop through the individual entries of the list name and keep the logic above. Can someone help me with this?
Try using a dict and checking if keys are already paired to a value:
name = ['Jack','Sam','Terry','Sam','Henry']
vals = {}
i = 0
for entry in name:
    if entry not in vals:
        vals[entry] = i + 1
    i += 1
print(vals)
Result:
{'Jack': 1, 'Sam': 2, 'Terry': 3, 'Henry': 5}
Elements can be accessed by "index" (read: key) just like you would do for a list, except the "index" is whatever the key is; in this case, the keys are names.
>>> vals['Henry']
5
EDIT: If order is important, you can enter the items into the dict using the number as the key; that way you will know which name is which based on its number:
name = ['Jack','Sam','Terry','Sam','Henry']
vals = {}
i = 0
for entry in name:
    # Check if entry is a repeat
    if entry not in name[0:i]:
        vals[i + 1] = entry
    i += 1
print(vals)
print(vals[5])
This code uses the order in which they appear as the key. To make sure we don't overwrite or create duplicates, it checks if the current name has appeared before in the list (anywhere from 0 up to i, the current index in the name list).
In this way, it is still in the "sorted order" which you want. Instead of accessing items by the name of the owner you simply index by their number. This will give you the order you desire from your example.
Result:
>>> vals
{1: 'Jack', 2: 'Sam', 3: 'Terry', 5: 'Henry'}
>>> vals[5]
'Henry'
If you really want to create variables, you can use globals(); here I am creating global variables. locals() is the counterpart for local names, although writing to locals() inside a function is not guaranteed to create a variable.
globals() returns the dictionary that serves as the lookup table of global variable names and their values; by adding a key and a value you create a variable.
nl = ['Jack','Sam','Terry','Sam','Henry']
var = globals()
for i,n in enumerate(nl,1):
    if n not in var:
        var[n] = i
print({n: var[n] for n in nl})
{'Jack': 1, 'Sam': 2, 'Terry': 3, 'Henry': 5}
print(Jack)
1
If the order of the original list is key, may I suggest two data structures, a dictionary and a new list:
nl = ['Jack','Sam','Terry','Sam','Henry']
d = {}
newlist = []
for i,n in enumerate(nl):
    if n not in d:
        d[n] = [i+1]
    newlist.append({n: d[n]})
newlist will be
[{'Jack': [1]}, {'Sam': [2]}, {'Terry': [3]}, {'Sam': [2]}, {'Henry': [5]}]
To walk it:
for names in newlist:
    for k, v in names.items():
        print('{} is number {}'.format(k, v))
NOTE: This does not make it easy to look up the number based on the name, as others suggested above. That would require more data structure logic. It does, however, let you keep the original list order while keeping track of the position at which each name was first found.
Edit: Since order is important to you, use OrderedDict from the collections module.
Use a dictionary. Iterate over your list with a for loop and check with an if statement whether the name is already in the dictionary. enumerate gives you the index of each name, but keep in mind that indexes start from 0, so in accordance with your question we add 1 to the index to make the numbering start from 1.
import collections
nl = ['Jack','Sam','Terry','Sam','Henry']
d = collections.OrderedDict()
for i,n in enumerate(nl):
    if n not in d:
        d[n] = [i+1]
print(d)
Output:
OrderedDict([('Jack', [1]), ('Sam', [2]), ('Terry', [3]), ('Henry', [5])])
EDIT:
The ordered dict is still a dictionary, so you can use .items() to get the key-value pairs as tuples. The number is stored in a one-element list, so you can do this:
for i in d.items():
    # i[1] is the value in the (key, value) tuple; [0] takes the first element of that list
    print('{} = {}'.format(i[0], i[1][0]))
Output:
Jack = 1
Sam = 2
Terry = 3
Henry = 5
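If the goal is the number for every entry, repeats included, a plain dict plus a list covers both needs on Python 3.7+, where dicts keep insertion order. A sketch under that assumption; first_seen and numbers are hypothetical names, and the loop still visits the entries one by one as the question asks.

```python
name = ['Jack', 'Sam', 'Terry', 'Sam', 'Henry']

first_seen = {}  # name -> 1-based position of its first occurrence
numbers = []     # the number assigned to each entry, in list order

for position, entry in enumerate(name, start=1):
    if entry not in first_seen:
        first_seen[entry] = position
    numbers.append(first_seen[entry])

print(first_seen)
print(numbers)
```

Here numbers lines up index by index with name, so the repeated 'Sam' gets the same number as its first occurrence.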

Delete empty value from (sorted) dictionary

I've got an Arduino sending me serial data, which is stored in a dictionary.
However, not all entries have a value, because the data is sent at random.
Before sending the dictionary data to a CSV file I want to prune the empty values, or values that are 0, from the dict.
Incoming data would look like this (values only):
['','7','','49','','173','158']
I want that to become
['7','49','173','158'].
The script I currently use:
import serial
import time
def delete_Blanks(arrayName):
    # work on a copy of the argument rather than the global dict
    tempArray = arrayName.copy()
    for key, value in sorted(tempArray.items()):
        if not value:  # nothing was read for this key
            del tempArray[key]
        else:
            print("Value is not nil")
    return tempArray
array = {}
ser = serial.Serial('COM2', 9600, timeout=1)
key = 0
while 1:
    length = len(array)
    if len(array) in range(0,5):
        array.update({key:ser.read(1000)})
        key = key + 1
        print("key is ", key)
        print(array.values())
        length = len(array)
    else:
        newArray = delete_Blanks(array)
        print(newArray.items())
        break
from itertools import compress
l = ['','7','','49','','173','158']
ret = compress(l, map(bool, l))
print(list(ret))
will output:
['7', '49', '173', '158']
If you have long arrays of data, it's better to work with iterators to avoid holding a second full copy in memory. If you work with short lists, a list comprehension is just fine.
You can use a dictionary comprehension. This will remove all false values (such as '' and 0) from a dictionary d:
d = {key: value for key, value in d.items() if value}
If it's just a plain list you can do something like this
Mylist = list(filter(None, Mylist))
Before creating the dictionary you can filter the two lists, the one containing the keys and the one containing the values. Assuming both lists are the same length you can then use
mydict = dict(zip(l1, l2))
to create your new dictionary.
>>> li = ['','7','','49','','173','158']
>>> [e for e in li if e]
['7', '49', '173', '158']
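For the dictionary case in the question, where both empty strings and zero readings should be pruned, a sketch could look like this. The sample readings and the helper name is_blank are made up, and treating the string '0' as a zero reading is an assumption about how the serial data arrives.

```python
readings = {0: '', 1: '7', 2: '', 3: '49', 4: '0', 5: '173', 6: '158'}  # hypothetical data

def is_blank(value):
    """True for empty (or whitespace-only) strings and for the string '0'."""
    return value.strip() in ('', '0')

# Keep only the entries that carry a real reading; keys are preserved.
pruned = {k: v for k, v in readings.items() if not is_blank(v)}
print(list(pruned.values()))
```

A plain truthiness test (if v) would keep '0', since any non-empty string is truthy, which is why the check is spelled out here.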
