Array update anomaly in Python [duplicate] - python

This question already has answers here:
List of lists changes reflected across sublists unexpectedly
(17 answers)
Closed 9 years ago.
I wrote the following code in Python. Within checkd,
when I update d[ii][jj], it seems as if the interpreter takes its own liberties and sets all the following column entries to 1.
Code:
def checkd(d, level, a, b):
    i = len(b)
    j = len(a)
    print ['0'] + list(a)
    for ii in range(i):
        for jj in range(j):
            if a[jj] == b[ii]:
                #print a[jj] +" "+ b[ii] + " Matched."
                d[ii][jj] = 1
            print b[ii] + "\t" + a[jj] + "\t" + str(d[ii][jj])
        print [b[ii]] + [str(m) for m in d[ii]]
    return d
a = raw_input("First word:")
b = raw_input("Second word:")
w = input("Size of words to check:")
d = [[0]*len(a)]*len(b)
d = checkd(d, w, a, b)
print d
for x in d : print x
Output:
First word:ascend
Second word:nd
Size of words to check:2
['0', 'a', 's', 'c', 'e', 'n', 'd']
n a 0
n s 0
n c 0
n e 0
n n 1
n d 0
['n', '0', '0', '0', '0', '1', '0']
d a 0
d s 0
d c 0
d e 0
d n 1
d d 1
['d', '0', '0', '0', '0', '1', '1']
[[0, 0, 0, 0, 1, 1], [0, 0, 0, 0, 1, 1]]
[0, 0, 0, 0, 1, 1]
[0, 0, 0, 0, 1, 1]
As you can see, not only does this produce a spurious match (d, n, 1?!) in the "d" row,
but the returned 2D array is also just two copies of the last row of the one in the function.
I have some experience with Python. I am not looking for a workaround (not that I'd mind) so much as an explanation for this behaviour, if possible.
Thanks!

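This makes a list of len(b) references to the same list of len(a) zeros. Only two list objects are created in total: the outer list and a single inner list that every row shares, so writing to d[ii][jj] changes column jj in every row.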
d = [[0] * len(a)] * len(b)
What you want to do is:
d = [[0] * len(a) for _ in b]
ints are immutable, so the inner [0] * len(a) is safe to duplicate like that; only the outer multiplication, which repeats a mutable list, causes the aliasing.
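The aliasing can be checked directly with the is operator. A minimal sketch, using hypothetical names (shared, independent) and fixed sizes in place of len(a) and len(b):

```python
rows, cols = 2, 3

# Outer multiplication repeats a reference to ONE inner list.
shared = [[0] * cols] * rows
shared[0][1] = 1
print(shared)                  # [[0, 1, 0], [0, 1, 0]]
print(shared[0] is shared[1])  # True: both rows are the same object

# The comprehension evaluates [0] * cols once per row, giving distinct lists.
independent = [[0] * cols for _ in range(rows)]
independent[0][1] = 1
print(independent)                       # [[0, 1, 0], [0, 0, 0]]
print(independent[0] is independent[1])  # False
```

This is why the "d" row in the question already shows the 1 that was written while processing the "n" row.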

Related

Convert column to header in a pandas.DataFrame [duplicate]

This question already has answers here:
How can I pivot a dataframe?
(5 answers)
Closed 1 year ago.
There is a similar question whose solution does not fully fit my needs, and I do not understand all the details of the solution there, so I am not able to adapt it to my situation.
This is my initial dataframe where all unique values in the Y column should become a column.
   Y  P  v
0  A  X  0
1  A  Y  1
2  B  X  2
3  B  Y  3
4  C  X  4
5  C  Y  5
The result should look like this, where P is the first column (it could also be the index). P can be understood as the row heading, the values from Y become the column headings, and the values from v fill the cells.
   P  A  B  C
0  X  0  2  4
1  Y  1  3  5
Not working approach
This is based on https://stackoverflow.com/a/52082963/4865723
new_index = ['Y', df.groupby('Y').cumcount()]
final = df.set_index(new_index)
final = final['P'].unstack('Y')
print(final)
The problem here is that the index (or first column) does not contain the values from Y and the v column is totally gone.
Y  A  B  C
0  X  X  X
1  Y  Y  Y
My own unfinished idea
>>> df.groupby('Y').agg(list)
        P       v
Y
A  [X, Y]  [0, 1]
B  [X, Y]  [2, 3]
C  [X, Y]  [4, 5]
I do not know if this helps or how to go further from this point.
The full MWE
#!/usr/bin/env python3
import pandas as pd
# initial data
df = pd.DataFrame({
    'Y': ['A', 'A', 'B', 'B', 'C', 'C'],
    'P': list('XYXYXY'),
    'v': range(6)
})
print(df)
# final result I want
final = pd.DataFrame({
    'P': list('XY'),
    'A': [0, 1],
    'B': [2, 3],
    'C': [4, 5]
})
print(final)
# approach based on:
# https://stackoverflow.com/a/52082963/4865723
new_index = ['Y', df.groupby('Y').cumcount()]
final = df.set_index(new_index)
final = final['P'].unstack('Y')
print(final)
You don't need anything complex, this is a simple pivot:
df.pivot(index='P', columns='Y', values='v').reset_index()
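For reference, a self-contained sketch of that pivot applied to the question's data. Resetting columns.name is an optional extra, not part of the answer above; it only drops the leftover Y label from the column axis so the result matches the desired frame.

```python
import pandas as pd

df = pd.DataFrame({
    "Y": ["A", "A", "B", "B", "C", "C"],
    "P": list("XYXYXY"),
    "v": range(6),
})

# Rows come from P, column headings from Y, cell values from v.
final = df.pivot(index="P", columns="Y", values="v").reset_index()

# Optional: remove the "Y" name that pivot leaves on the column axis.
final.columns.name = None
print(final)
```

pivot raises a ValueError when a (P, Y) pair occurs more than once; in that case pivot_table with an aggregation function is the usual alternative.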

Create an adjacency matrix using a dictionary with letter values converted to numbers in python

So I have a dictionary with letter values and keys and I want to generate an adjacency matrix using digits (0 or 1). But I don't know how to do that.
Here is my dictionary:
g = { "a" : ["c","e","b"],
      "b" : ["f","a"]}
And I want an output like this :
import numpy as np
new_dic = {'a':[0,1,1,0,1,0],'b':(1,0,0,0,0,1)}
rows_names = ['a','b'] # I use a list because dictionaries don't memorize the positions
adj_matrix = np.array([new_dic[i] for i in rows_names])
print(adj_matrix)
Output :
[[0 1 1 0 1 0]
 [1 0 0 0 0 1]]
So it's an adjacency matrix: column/row 1 represents a, column/row 2 represents b, and so on.
Thank you !
I don't know if it helps but here is how I convert all letters to numbers using ascii :
for key, value in g.items():
    nums = [str(ord(x) - 96) for x in value if x.lower() >= 'a' and x.lower() <= 'z']
    g[key] = nums
print(g)
Output :
{'a': ['3', '5', '2'], 'b': ['6', '1']}
So a == 1 b == 2 ...
So my problem is: if I take the key a with its first value "e", how can I make the e end up in column 5 of line 1 rather than in column 2 of line 1, and replace the e with a 1?
Using comprehensions:
g = {'a': ['c', 'e', 'b'], 'b': ['f', 'a']}
vals = 'a b c d e f'.split() # Column values
new_dic = {k: [1 if x in v else 0 for x in vals] for k, v in g.items()}
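The column question can also be answered with the ord arithmetic from the question: ord(letter) - ord('a') is the zero-based column of a lowercase letter, so "e" lands in the fifth column regardless of its position in the list. A minimal sketch, assuming lowercase a-z node names; the helper name adjacency_row and the size of 6 are assumptions taken from the expected output:

```python
g = {"a": ["c", "e", "b"], "b": ["f", "a"]}
size = 6  # number of columns, a through f (assumed from the expected output)

def adjacency_row(neighbours, size):
    """Return a 0/1 row with a 1 in the column of each neighbouring letter."""
    row = [0] * size
    for letter in neighbours:
        # 'a' maps to index 0, 'e' to index 4 (the fifth column)
        row[ord(letter) - ord("a")] = 1
    return row

new_dic = {node: adjacency_row(neighbours, size) for node, neighbours in g.items()}
print(new_dic)  # {'a': [0, 1, 1, 0, 1, 0], 'b': [1, 0, 0, 0, 0, 1]}
```

The resulting new_dic can be passed to np.array exactly as in the question.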

Implementing k nearest neighbours from distance matrix?

I am trying to do the following:
Given a DataFrame of distances, I want to identify the k nearest neighbours of each element.
Example:
   A  B  C  D
A  0  1  3  2
B  5  0  2  2
C  3  2  0  1
D  2  3  4  0
If k=2, it should return:
A: B D
B: C D
C: D B
D: A B
Distances are not necessarily symmetric.
I am thinking there must be something somewhere that does this efficiently using pandas DataFrames, but I cannot find anything.
Homemade code is also very welcome! :)
Thank you!
The way I see it, I simply find the n + 1 smallest distances in each row and remove the 0, which leaves the n nearest neighbours. Keep in mind that the code will not work if any off-diagonal distance is zero! Only the diagonal entries are allowed to be 0.
import pandas as pd
import numpy as np
X = pd.DataFrame([[0, 1, 3, 2],[5, 0, 2, 2],[3, 2, 0, 1],[2, 3, 4, 0]])
X.columns = ['A', 'B', 'C', 'D']
X.index = ['A', 'B', 'C', 'D']
X = X.T
for i in X.index:
    Y = X.nsmallest(3, i)
    Y = Y.T
    Y = Y[Y.index.str.startswith(i)]
    Y = Y.loc[:, Y.any()]
    for j in Y.index:
        print(i + ": ", list(Y.columns))
This prints out:
A: ['B', 'D']
B: ['C', 'D']
C: ['D', 'B']
D: ['A', 'B']
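The zero-distance caveat can be avoided by excluding each element from its own neighbour list by position instead of by value. The following is a sketch using NumPy's argsort, not the answer's method; the helper name k_nearest is hypothetical, and it assumes a square DataFrame whose rows and columns are in the same order.

```python
import numpy as np
import pandas as pd

def k_nearest(dist, k):
    """Map each row label to the labels of its k nearest other elements."""
    # Work on a float copy so the diagonal can be overwritten with inf.
    values = dist.to_numpy(dtype=float, copy=True)
    np.fill_diagonal(values, np.inf)  # an element is never its own neighbour
    # A stable sort keeps the original column order among equal distances.
    order = np.argsort(values, axis=1, kind="stable")[:, :k]
    return {
        row: [dist.columns[j] for j in cols]
        for row, cols in zip(dist.index, order)
    }

labels = list("ABCD")
dist = pd.DataFrame(
    [[0, 1, 3, 2], [5, 0, 2, 2], [3, 2, 0, 1], [2, 3, 4, 0]],
    index=labels,
    columns=labels,
)
for label, neighbours in k_nearest(dist, 2).items():
    print(label + ":", neighbours)
```

Because the self-distance is masked rather than filtered out as "the zero", off-diagonal zeros are treated like any other distance.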

adding binary numbers without converting to decimal or using built in functions

I'm adding two binary numbers inputted as strings and outputting their sum as a string in binary using this method.
Any thoughts on getting my code to work?
def add(a,b):
    a = list(a)
    b = list(b)
    equalize(a,b)
    sum_check(a,b,ctr)
    out = " ".join(str(x) for x in sum_check(a,b,ctr))
    out.replace(" ","")
    print(out)
def equalize(a,b):
    if len(a) > len(b):
        for i in range(0, len(a)-len(b)):
            b.insert(0,'0')
    elif len(a) < len(b):
        for i in range(0, len(b)-len(a)):
            a.insert(0,'0')
def sum_check(a,b):
    out = []
    ctr = 0
    def sum(a, b):
        if ctr > 0:
            if a[-1] + b[-1] == 2:
                out.append('1')
            elif a[-1] + b[-1] == 0:
                out.append('1')
                ctr -= 1
            else: # a[-1] + b[-1] = 1
                out.append('0')
                ctr -= 1
        else: # ctr = 0
            if a[-1] + b[-1] == 2:
                out.append('1')
                ctr += 1
            elif a[-1] + b[-1] == 0:
                out.append('0')
            else: # a[-1] + b[-1] = 1
                out.append('1')
    for i in range(len(a)):
        if i == 0:
            sum(a,b)
        else:
            new_a = a[:-1]
            new_b = b[:-1]
            sum(new_a, new_b)
    return out
Your algorithm is not really simple. I would rewrite it completely, or at least give the functions and variables more explicit names so that the algorithm is quicker to understand later; ctr, for example, should explicitly relate to carrying.
Nevertheless, here it is, corrected and working. I have inserted comments where I changed things (either errors in the algorithm or Python programming errors).
I have put def sum(...) outside sum_check; it's clearer this way.
Note that sum is a Python builtin, so you should pick another name to avoid shadowing the builtin in your namespace. I guess your algorithm is only for training purposes, since you could replace all of this with an operation on binary numbers directly:
$ python3
Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = int('10110', 2)
>>> b = int('1011100', 2)
>>> bin(a + b)
'0b1110010'
>>>
Also, I have left in some debugging print statements to help show what happens, plus an example that goes through all six cases.
Last note: it first crashed because you used ctr before you had defined it. This was the first thing to correct.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
def equalize(a, b):
    if len(a) > len(b):
        for i in range(0, len(a) - len(b)):
            b.insert(0, 0)
    elif len(a) < len(b):
        for i in range(0, len(b) - len(a)):
            a.insert(0, 0)
    return a, b
def sum(a, b, ctr):
    out = ''
    print('\n'.join(['-------', 'Working on:', str(a), str(b)]))
    if ctr > 0:
        if a[-1] + b[-1] == 2:
            out = '1'
            print('Case 1')
        elif a[-1] + b[-1] == 0:
            out = '1'
            ctr -= 1
            print('Case 2')
        else: # a[-1] + b[-1] = 1
            out = '0'
            print('Case 3')
            # ctr -= 1 (wrong)
    else: # ctr = 0
        if a[-1] + b[-1] == 2:
            out = '0' # '1' was wrong
            ctr += 1
            print('Case 4')
        elif a[-1] + b[-1] == 0:
            out = '0'
            print('Case 5')
        else: # a[-1] + b[-1] = 1
            out = '1'
            print('Case 6')
    print('Sum will return: ' + str(out)
          + ' and carry: ' + str(ctr) + '\n-------')
    return out, ctr
def sum_check(a, b):
    ctr = 0
    out = []
    n = len(a)
    for i in range(n):
        if i != 0:
            # You were always giving the same a and b to feed sum().
            # In order to change them, as it's not advised to iterate over a
            # changing list (maybe not even possible), I stored the desired
            # length in a number n. Then, I assign new values to a and b.
            a = a[:-1]
            b = b[:-1]
        new_out, ctr = sum(a, b, ctr)
        out.append(new_out)
        print('Current out: ' + str(out) + ' and carry: ' + str(ctr))
    if ctr:
        # A carry left over after the leftmost digits becomes a new leading 1
        # (otherwise '1' + '1' would give '0').
        out.append('1')
    return out
def add(a, b):
    a = [int(x) for x in a]
    b = [int(x) for x in b]
    # You need to return the new contents of a and b, otherwise you'll keep
    # them as they were before the call to equalize()
    a, b = equalize(a, b)
    print('\n'.join(['Equalized: ', str(a), str(b)]))
    # On next line, [::-1] reverses the result (your algorithm returns a
    # result to be read from right to left)
    print('Result: ' + ''.join(sum_check(a, b)[::-1]))
add('10110', '1011100')
Output:
Equalized:
[0, 0, 1, 0, 1, 1, 0]
[1, 0, 1, 1, 1, 0, 0]
-------
Working on:
[0, 0, 1, 0, 1, 1, 0]
[1, 0, 1, 1, 1, 0, 0]
Case 5
Sum will return: 0 and carry: 0
-------
Current out: ['0'] and carry: 0
-------
Working on:
[0, 0, 1, 0, 1, 1]
[1, 0, 1, 1, 1, 0]
Case 6
Sum will return: 1 and carry: 0
-------
Current out: ['0', '1'] and carry: 0
-------
Working on:
[0, 0, 1, 0, 1]
[1, 0, 1, 1, 1]
Case 4
Sum will return: 0 and carry: 1
-------
Current out: ['0', '1', '0'] and carry: 1
-------
Working on:
[0, 0, 1, 0]
[1, 0, 1, 1]
Case 3
Sum will return: 0 and carry: 1
-------
Current out: ['0', '1', '0', '0'] and carry: 1
-------
Working on:
[0, 0, 1]
[1, 0, 1]
Case 1
Sum will return: 1 and carry: 1
-------
Current out: ['0', '1', '0', '0', '1'] and carry: 1
-------
Working on:
[0, 0]
[1, 0]
Case 2
Sum will return: 1 and carry: 0
-------
Current out: ['0', '1', '0', '0', '1', '1'] and carry: 0
-------
Working on:
[0]
[1]
Case 6
Sum will return: 1 and carry: 0
-------
Current out: ['0', '1', '0', '0', '1', '1', '1'] and carry: 0
Result: 1110010
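For comparison, the six cases collapse into one loop if the two bits and the carry are simply added as small integers. This is a sketch under the assumption that the restriction only rules out int(x, 2) and bin(); the function name add_binary is hypothetical and not part of the answer above.

```python
def add_binary(a, b):
    """Add two binary strings digit by digit and return the sum as a string."""
    width = max(len(a), len(b))
    a = a.rjust(width, "0")  # left-pad the shorter number with zeros
    b = b.rjust(width, "0")
    carry = 0
    digits = []  # collected from least to most significant
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = (bit_a == "1") + (bit_b == "1") + carry  # 0, 1, 2 or 3
        digits.append("1" if total % 2 else "0")
        carry = total // 2
    if carry:
        digits.append("1")  # carry out of the leftmost position
    return "".join(reversed(digits))

print(add_binary("10110", "1011100"))  # 1110010
```

The result for the example inputs agrees with the traced output above and with bin(a + b).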

A simple list simulation in python

lists = ['A', 'B', 'C', 'D']
nos = [4, 4, 1, 1]
for idx, ln in enumerate(zip(lists,nos)):
    l, n = ln[0], ln[1]
    in_nos = range(1, n+1)
    for indx, in_no in enumerate(in_nos):
        out_no = ??? ### **I need an expression to get out_no here**
        print out_no
Without modifying anything except the ??? after out_no, I need to print out the numbers from 1 to the sum of the numbers in nos i.e.:
1
2
3
4
5
6
7
8
9
10
I tried:
out_no = idx*n + indx + 1
which resulted in:
1
2
3
4
5
6
7
8
1
1
Which out_no would give me the correct result?
It depends on what you're allowed to change. The simple way would of course be to keep a running offset:
lists = ['A', 'B', 'C', 'D']
nos = [4, 4, 1, 1]
a = 0
for idx, ln in enumerate(zip(lists,nos)):
    l, n = ln[0], ln[1]
    in_nos = range(1, n+1)
    for indx, in_no in enumerate(in_nos):
        out_no = a+indx+1
        print out_no ##The result should be HERE
    a += n
Assuming you can only change out_no, you could do:
lists = ['A', 'B', 'C', 'D']
nos = [4, 4, 1, 1]
for idx, ln in enumerate(zip(lists,nos)):
    l, n = ln[0], ln[1]
    in_nos = range(1, n+1)
    for indx, in_no in enumerate(in_nos):
        out_no = sum(nos[:idx])+indx+1
        print out_no ##The result should be HERE
Ok, as IanAuld pointed out, if you can just scrap everything but nos there are simpler solutions, for instance:
nos = [4, 4, 1, 1]
for i in range(sum(nos)): print i+1
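The same numbering can be checked in Python 3, where print is a function. In this sketch the helper name running_numbers is hypothetical, and the itertools.accumulate offsets are a swapped-in alternative to calling sum(nos[:idx]) on every iteration, not something taken from the answers above.

```python
from itertools import accumulate

def running_numbers(counts):
    """Number the items of consecutive groups 1, 2, 3, ... across all groups."""
    # offsets[i] is the total size of all groups before group i
    offsets = [0] + list(accumulate(counts))[:-1]
    result = []
    for offset, n in zip(offsets, counts):
        for indx in range(n):
            result.append(offset + indx + 1)
    return result

nos = [4, 4, 1, 1]
print(running_numbers(nos))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

An expression such as idx*n + indx + 1 only gives this numbering when every earlier group has the same size n, which is why the offset has to come from the sizes of the preceding groups.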
