How to merge an array with its array elements in Python?

I have a list like this:
constants = ['(1,2)', '(1,5,1)', '1']
I would like to transform it into this:
constants = [(1,2), 1, 2, 3, 4, 5, 1]
To do this, I tried the following:
from ast import literal_eval
import numpy as np
constants = literal_eval(str(constants).replace("'",""))
constants = [(np.arange(*i) if len(i)==3 else i) if isinstance(i, tuple) else i for i in constants]
The output was:
constants = [(1, 2), array([1, 2, 3, 4]), 1]
This is not the expected result, and I'm stuck at this step. How can I merge the generated array into its parent list?

This is one approach.
Demo:
from ast import literal_eval
constants = ['(1,2)', '(1,5,1)', '1']
res = []
for i in constants:
    val = literal_eval(i)  # Convert to Python object
    if isinstance(val, tuple):  # Check if element is a tuple
        if len(val) == 3:  # Check if the tuple has exactly 3 elements
            val = list(val)
            val[1] += 1  # Make the end value inclusive
            res.extend(range(*val))
            continue
    res.append(val)
print(res)
Output:
[(1, 2), 1, 2, 3, 4, 5, 1]

I'm going to assume that this question is very literal, and that you always want to transform this:
constants = ['(a, b)', '(x, y, z)', 'i']
into this:
transformed = [(a,b), x, x+z, x+2*z, ..., y, i]
such that the second tuple is a range from x to y with step z. So your final transformed array is the first element, then the range defined by your second element, and then your last element. The easiest way to do this is simply step-by-step:
constants = ['(a, b)', '(x, y, z)', 'i']
literals = [eval(k) for k in constants] # get rid of the strings
part1 = [literals[0]] # individually make each of the three parts of your list
part2 = list(range(literals[1][0], literals[1][1] + 1, literals[1][2])) # or, if you don't need to include y, you could just use range(*literals[1])
part3 = [literals[2]]
transformed = part1 + part2 + part3

I propose the following:
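res = []
for cst in constants:  # assumes each element has already been parsed, e.g. with literal_eval
    if isinstance(cst, tuple) and (len(cst) == 3):
        # add the range to the list; the + 1 makes the end value inclusive
        res.extend(range(cst[0], cst[1] + 1, cst[2]))
    else:
        res.append(cst)
res has the result you want.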
There may be a more elegant way to solve it.
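One possibly more compact variant, sketched here for Python 3, parses each string with literal_eval and flattens the pieces with itertools.chain. The helper name expand is hypothetical, and the sketch assumes that every 3-tuple means (start, stop, step) with an inclusive stop and a positive step, as in the question's example.

```python
from ast import literal_eval
from itertools import chain

def expand(value):
    # Assumption: a 3-tuple is (start, stop, step) with an inclusive stop
    # and a positive step; anything else is kept as a single element.
    if isinstance(value, tuple) and len(value) == 3:
        start, stop, step = value
        return range(start, stop + 1, step)
    return [value]

constants = ['(1,2)', '(1,5,1)', '1']
flattened = list(chain.from_iterable(expand(literal_eval(c)) for c in constants))
print(flattened)  # [(1, 2), 1, 2, 3, 4, 5, 1]
```

Keeping the tuple test in one small function makes it easier to change the rule later, for example if 2-tuples should also become ranges.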

Please use the code below to handle the parsing described above:
from ast import literal_eval
constants = ['(1,2)', '(1,5,1)', '1']
processed = []
for index, c in enumerate(constants):
    parsed = literal_eval(c)
    if isinstance(parsed, (tuple, list)) and index != 0:
        processed.extend(range(1, max(parsed) + 1))
    else:
        processed.append(parsed)
print processed # [(1, 2), 1, 2, 3, 4, 5, 1]

Related

Don't understand Python Expression

I have some basic knowledge of Python, but I have no idea what's going on in the code below. Can someone explain it or 'translate' it into a more common form?
steps = len(t)
sa = [i for i in range(steps)]
sa.sort(key = lambda i: t[i:i + steps])#I know that sa is a list
for i in range(len(sa)):
    sf = t[sa[i] : sa[i] + steps]
't' is actually a string.
Thank you.
What I don't understand is this line: sa.sort(key = lambda i: t[i:i + steps])
sa.sort(key = lambda i: t[i:i + steps])
It sorts sa according to the natural ordering of the substrings t[i:i+len(t)]. Since i + steps is always greater than or equal to steps (which is len(t)), the slice always runs to the end of the string, so it could be written t[i:] instead (which makes the code simpler to understand).
This is easier to understand with the decorate/sort/undecorate pattern:
>>> t = "azerty"
>>> sa = range(len(t))
>>> print sa
[0, 1, 2, 3, 4, 5]
>>> decorated = [(t[i:], i) for i in sa]
>>> print decorated
[('azerty', 0), ('zerty', 1), ('erty', 2), ('rty', 3), ('ty', 4), ('y', 5)]
>>> decorated.sort()
>>> print decorated
[('azerty', 0), ('erty', 2), ('rty', 3), ('ty', 4), ('y', 5), ('zerty', 1)]
>>> sa = [i for (_dummy, i) in decorated]
>>> print sa
[0, 2, 3, 4, 5, 1]
As for sf = t[sa[i] : sa[i] + steps], this could also be written more simply:
for i in range(len(sa)):
    sf = t[sa[i] : sa[i] + steps]
=>
for x in sa:
    sf = t[x:]
    print sf
which yields:
azerty
erty
rty
ty
y
zerty
You'll notice that this is exactly the keys used (and then discarded)
in the decorate/sort/undecorate example above, so the whole thing could be rewritten as:
def foo(t):
    decorated = sorted((t[i:], i) for i in range(len(t)))
    for sf, index in decorated:
        print sf
        # do something with sf here
As to what all this is supposed to do, I'm totally at a loss, but at least you now have a much more pythonic (readable...) version of this code ;)
The lambda in sort defines the criteria according to which the list is going to be sorted.
In other words, the list will not be sorted simply according to its values, but according to the function applied to the values.
Have a look here for more details.
It looks like what you are doing is sorting the list according to the alphabetical ordering of the substrings of the input string t.
Here is what is happening:
t = 'hello' # EXAMPLE
steps = len(t)
sa = [i for i in range(steps)]
sort_func = lambda i: t[i:i + steps]
for el in sa:
    print sort_func(el)
#hello
#ello
#llo
#lo
#o
So these are the values that determine the sorting of the list.
transf_list = [sort_func(el) for el in sa]
sorted(transf_list)
# ['ello', 'hello', 'llo', 'lo', 'o']
Hence:
sa.sort(key = sort_func)#I know that sa is a list
# [1, 0, 2, 3, 4]
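For reference, the whole snippet can be condensed into a short Python 3 sketch. The function name suffix_order is hypothetical; it returns the start indices of t ordered by the suffix that begins at each index, which is what sa holds after the sort (a structure often called a suffix array).

```python
def suffix_order(t):
    # Sort the start indices 0..len(t)-1 by the suffix t[i:] that begins there.
    return sorted(range(len(t)), key=lambda i: t[i:])

t = "hello"
sa = suffix_order(t)
print(sa)                   # [1, 0, 2, 3, 4]
print([t[i:] for i in sa])  # ['ello', 'hello', 'llo', 'lo', 'o']
```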

Python - Create set from string

I have strings in the format "1-3 6:10-11 7-9" and from them I want to create number sets as follows {1,2,3,6,10,11,7,8,9}.
For creating the set from the range of numbers, I have the following code:
def create_set(src):
    lset = []
    if len(src) > 0:
        pos = src.find('-')
        if pos != -1:
            first = int(src[:pos])
            last = int(src[pos+1:])
        else:
            return [int(src)] # Only one number
        for j in range (first, last+1):
            lset.append(j)
    return set(lset)
But I cannot figure out how to correctly treat the ':' when it appears in the string. Can someone help me?
Thanks in advance!
EDIT: By the way, is there a more compact way of parsing such strings, perhaps using regular expressions?
Something like this might work for you:
s = '1-3 6:10-11 7-9'
s = s.replace(':', ' ')
lset = set()
fs = s.split()
for f in fs:
    r = f.split('-')
    if len(r)==1:
        # add a single number
        lset.add(int(r[0]))
    else:
        # add a range of numbers (inclusive of the endpoints)
        lset |= set(range(int(r[0]), int(r[1])+1))
print(lset)
EDIT: By the way, is there a more compact way of parsing such strings,
perhaps using regular expressions?
Perhaps a cleaner (and slightly more efficient) way:
import re
import itertools
allGroups = re.findall(r"(\d+)(?:-(\d+)|:)", s)
expanded = [range(int(x), (int(x) if y == '' else int(y)) + 1) for x, y in allGroups]
print {x for x in itertools.chain.from_iterable(expanded)}
Explanations:
Match all strings like 'a-b' or 'a:' and return a list of (a, b) and (a, '') pairs respectively:
allGroups = re.findall(r"(\d+)(?:-(\d+)|:)", s)
This produces:
[('1', '3'), ('6', ''), ('10', '11'), ('7', '9')]
Using a list comprehension, expand each (x, y) pair into the full list of numbers range(x, y + 1), taking care to handle the (x, '') case as range(x, x + 1):
expanded = [range(int(x), (int(x) if y == '' else int(y)) + 1) for x, y in allGroups]
This produces:
[[1, 2, 3], [6], [10, 11], [7, 8, 9]]
Use itertools.chain.from_iterable() to transform the list of lists into a single iterable which is iterated by a set comprehension into the final set:
print {x for x in itertools.chain.from_iterable(expanded)}
This produces:
set([1, 2, 3, 6, 7, 8, 9, 10, 11])
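The same idea can be sketched for Python 3 with a slightly different pattern that makes the "-end" part optional, so a number standing alone (not followed by ':' or '-') is also picked up. The function name parse_ranges is hypothetical, and the sketch assumes every token is either a single non-negative integer or an ascending "a-b" range.

```python
import re

def parse_ranges(text):
    numbers = set()
    # Each match is (start, end); end is '' when there is no "-end" part.
    for start, end in re.findall(r"(\d+)(?:-(\d+))?", text):
        first = int(start)
        last = int(end) if end else first
        numbers.update(range(first, last + 1))  # inclusive of both endpoints
    return numbers

result = parse_ranges("1-3 6:10-11 7-9")
print(result == {1, 2, 3, 6, 7, 8, 9, 10, 11})  # True
```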

Python random list

I'm new to Python, and have some problems with creating random lists.
I'm using random.sample(range(x, x), y).
I want to get 4 lists with unique numbers, from 1-4, so I have been using this
a = random.sample(range(1, 5), 4)
b = random.sample(range(1, 5), 4)
c = random.sample(range(1, 5), 4)
d = random.sample(range(1, 5), 4)
So I get for example
a = 1, 3, 2, 4
b = 1, 4, 3, 2
c = 2, 3, 1, 4
d = 4, 2, 3, 1
How can I make the columns unique as well?
Absent a clear mathematical theory, I distrust anything other than a somewhat hit-and-miss approach. In particular, backtracking approaches can introduce a subtle bias:
from random import shuffle
def isLatin(square):
    #assumes that square is an nxn list
    #where each row is a permutation of 1..n
    n = len(square[0])
    return all(len(set(col)) == n for col in zip(*square))
def randSquare(n):
    row = [i for i in range(1,1+n)]
    square = []
    for i in range(n):
        shuffle(row)
        square.append(row[:])
    return square
def randLatin(n):
    #uses a hit and miss approach
    while True:
        square = randSquare(n)
        if isLatin(square): return square
Typical output:
>>> s = randLatin(4)
>>> for r in s: print(r)
[4, 1, 3, 2]
[2, 3, 4, 1]
[1, 4, 2, 3]
[3, 2, 1, 4]
Totally random then:
import random
def gen_matrix():
    first_row = random.sample(range(1, 5), 4)
    tmp = first_row + first_row
    rows = []
    for i in range(4):
        rows.append(tmp[i:i+4])
    return random.sample(rows, 4)
Create a list of all the elements and, while filling each line, remove every element once it has been used.
import random
def fill_line(length):
    my_list = list(range(length))
    to_return = []
    for i in range(length):
        x = random.choice(my_list)
        to_return.append(x)
        my_list.remove(x)
    return to_return
x = [fill_line(4)
     for i in range(4)]
print(x)
Probably the simplest way is to create a valid matrix, and then shuffle the rows, and then shuffle the columns:
import random
def random_square(U):
    U = list(U)
    rows = [U[i:] + U[:i] for i in range(len(U))]
    random.shuffle(rows)
    rows_t = [list(i) for i in zip(*rows)]
    random.shuffle(rows_t)
    return rows_t
Usage:
>>> random_square(range(1, 1+4))
[[2, 3, 4, 1], [4, 1, 2, 3], [3, 4, 1, 2], [1, 2, 3, 4]]
This should be able to create any valid matrix with equal probability. After doing some reading it seems that this still has bias, although I don't fully comprehend why yet.
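One way to probe the bias for n = 4 is to enumerate every combination of row order and column order applied to the cyclic starting square and count how many distinct squares come out. This is only a sketch: the function name reachable_squares is hypothetical, and it assumes the symbols are 1..n in a fixed starting order, as in the usage above. There are 576 Latin squares of order 4 in total, so any smaller count means that some valid squares can never be produced this way.

```python
from itertools import permutations

def reachable_squares(n):
    # Cyclic base square: row i is the sequence 1..n rotated left by i places.
    base = [[(i + j) % n + 1 for j in range(n)] for i in range(n)]
    seen = set()
    for row_order in permutations(range(n)):
        for col_order in permutations(range(n)):
            square = tuple(tuple(base[r][c] for c in col_order) for r in row_order)
            seen.add(square)
    return seen

squares = reachable_squares(4)
print(len(squares))  # 144
```

Each reachable square arises from the same number of (row order, column order) pairs, so the method is uniform over the squares it can reach, but it never reaches the rest.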
I would build a random Latin square by 1) starting with a single random permutation, 2) populating the rows with rotations, 3) shuffling the rows, 4) transposing the square, and 5) shuffling the rows again:
from collections import deque
from random import shuffle
def random_latin_square(elements):
    elements = list(elements)
    shuffle(elements)
    square = []
    for i in range(len(elements)):
        square.append(list(elements))
        elements = elements[1:] + [elements[0]]
    shuffle(square)
    square[:] = zip(*square)
    shuffle(square)
    return square
if __name__ == '__main__':
    from pprint import pprint
    square = random_latin_square('ABCD')
    pprint(square)

Duplicates counting with order preserving in Python lists

Suppose I have the list
[7,7,7,7,3,1,5,5,1,4]
I would like to remove duplicates and count them while preserving the order of the list. To preserve the order while removing duplicates, I use the function
def unique(seq, idfun=None):
    # order preserving
    if idfun is None:
        def idfun(x): return x
    seen = {}
    result = []
    for item in seq:
        marker = idfun(item)
        if marker in seen: continue
        seen[marker] = 1
        result.append(item)
    return result
which gives me the output
[7,3,1,5,1,4]
but the output I actually want is:
[7,3,3,1,5,2,4]
7 is written because it is the first item in the list; each following item is then checked to see whether it differs from the previous one. If it does, the occurrences of that item are counted until a new one is found, and the procedure repeats. Could anyone more skilled than me give me a hint on how to get the desired output listed above? Thank you in advance.
Perhaps something like this?
>>> from itertools import groupby
>>> seen = set()
>>> out = []
>>> for k, g in groupby(lst):
...     if k not in seen:
...         length = sum(1 for _ in g)
...         if length > 1:
...             out.extend([k, length])
...         else:
...             out.append(k)
...         seen.add(k)
...
>>> out
[7, 4, 3, 1, 5, 2, 4]
Update:
As per your comment I guess you wanted something like this:
>>> out = []
>>> for k, g in groupby(lst):
...     length = sum(1 for _ in g)
...     if length > 1:
...         out.extend([k, length])
...     else:
...         out.append(k)
...
>>> out
[7, 4, 3, 1, 5, 2, 1, 4]
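If the flat list turns out to be ambiguous (a count is indistinguishable from a value), a less lossy alternative is to keep each run as a (value, count) pair. The sketch below is written for Python 3; the function name run_lengths is hypothetical.

```python
from itertools import groupby

def run_lengths(seq):
    # groupby yields one group per run of consecutive equal items.
    return [(key, sum(1 for _ in group)) for key, group in groupby(seq)]

lst = [7, 7, 7, 7, 3, 1, 5, 5, 1, 4]
pairs = run_lengths(lst)
print(pairs)  # [(7, 4), (3, 1), (1, 1), (5, 2), (1, 1), (4, 1)]
```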
Try this
import collections as c
lst = [7,7,7,7,3,1,5,5,1,4]
result = c.OrderedDict()
for el in lst:
    if el not in result.keys():
        result[el] = 1
    else:
        result[el] = result[el] + 1
print result
prints out: OrderedDict([(7, 4), (3, 1), (1, 2), (5, 2), (4, 1)])
It gives a dictionary though. For a list, use:
lstresult = []
for el in result:
    # print k, v
    lstresult.append(el)
    if result[el] > 1:
        lstresult.append(result[el] - 1)
This doesn't match your desired output, but your desired output also looks like a somewhat mangled representation of the counts.
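On Python 3.7 and later, where dictionaries keep insertion order, the counting step can be sketched more briefly with collections.Counter, which is a dict subclass and therefore lists keys in first-seen order. This counts total occurrences, not consecutive runs, just like the OrderedDict version.

```python
from collections import Counter

lst = [7, 7, 7, 7, 3, 1, 5, 5, 1, 4]
counts = Counter(lst)          # keys appear in the order they were first seen
ordered = list(counts.items())
print(ordered)  # [(7, 4), (3, 1), (1, 2), (5, 2), (4, 1)]
```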

Reverse indices of a sorted list

I want to return the 'reverse' indices of a sorted list. What I mean by that is: I have an unsorted list U and I sort it via S=sorted(U). Now, I can get the sort indices such that U(idx)=S - but I want S(Ridx) = U.
Here is a small example:
U=[5,2,3,1,4]
S=sorted(U)
idx = [U.index(S[i]) for i in range(len(U))]
>>> idx
[3, 1, 2, 4, 0]
Ridx = [S.index(U[i]) for i in range(len(U))]
>>> Ridx
[4, 1, 2, 0, 3]
>>> [U[idx[i]] for i in range(len(U))] == S
True
>>> [S[Ridx[i]] for i in range(len(U))] == U
True
What I need is an efficient way to get Ridx.
Thanks!
Edit:
All right! I ran a small speed test of the two solutions that answered the question (by @Jon Clements and @Whatang).
The script:
import datetime as DT
import random
U=[int(1000*random.random()) for i in xrange(pow(10,8))]
S=sorted(U)
idx = sorted(xrange(len(U)), key=U.__getitem__)
T0 = DT.datetime.now()
ridx = sorted(xrange(len(U)), key=idx.__getitem__)
print [S[ridx[i]] for i in range(len(U))]==U
elapsed = DT.datetime.now()-T0
print str(elapsed)
print '==============='
T0 = DT.datetime.now()
ridx = [ y for (x,y) in sorted(zip(idx, range(len(idx)))) ]
print [S[ridx[i]] for i in range(len(U))]==U
elapsed = DT.datetime.now()-T0
print str(elapsed)
And the results:
True
0:02:45.278000
===============
True
0:06:48.889000
Thank you all for the quick and meaningful help!
The most efficient I can think of (short of possibly looking to numpy) that gets rid of the .index and can be used for both idx and ridx:
U=[5,2,3,1,4]
idx = sorted(xrange(len(U)), key=U.__getitem__)
ridx = sorted(xrange(len(U)), key=idx.__getitem__)
# [3, 1, 2, 4, 0] [4, 1, 2, 0, 3]
Not quite the data structure you asked for, but I think this gets the info you want:
>>> sorted(x[::-1] for x in enumerate(['z', 'a', 'c', 'x', 'm']))
[('a', 1), ('c', 2), ('m', 4), ('x', 3), ('z', 0)]
With numpy you can do
>>> import numpy as np
>>> U = [5, 2, 3, 1, 4]
>>> np.array(U).argsort().argsort()
array([4, 1, 2, 0, 3])
Assuming you already have the list idx, you can do
ridx = [ y for (x,y) in sorted(zip(idx, range(len(idx)))) ]
Then for all i from 0 to len(U)
S[ridx[i]] == U[i]
You can avoid the sort if you use a dictionary:
ridx_dict = dict(zip(idx, range(len(idx))))
which can then be converted to a list:
ridx = [ ridx_dict[k] for k in range(len(idx)) ]
Thinking about permutations is the key to this problem. One way of writing down a permutation is to write all the indexes in order on one line, then on the line below write the new index of the element with that index. e.g., for your example
0 1 2 3 4
3 1 2 4 0
This second line is your idx list. You read down the columns, so the element at index 0 of S comes from index 3 of U, the element at index 1 stays at index 1, and so on.
The inverse permutation is the ridx you're looking for. To find it, sort the lower line of your permutation, keeping the columns together, then write down the new top line. So the example becomes:
4 1 2 0 3
0 1 2 3 4
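The column-swapping description above can be sketched directly in Python 3 without any second sort: walk idx once and record, for each source index, the position it was found at. The function name inverse_permutation is hypothetical, and the sketch assumes idx is a valid permutation of 0..n-1.

```python
def inverse_permutation(idx):
    ridx = [0] * len(idx)
    for position, source in enumerate(idx):
        # idx[position] == source, so the inverse maps source back to position.
        ridx[source] = position
    return ridx

U = [5, 2, 3, 1, 4]
S = sorted(U)
idx = sorted(range(len(U)), key=U.__getitem__)
ridx = inverse_permutation(idx)
print(idx)                         # [3, 1, 2, 4, 0]
print(ridx)                        # [4, 1, 2, 0, 3]
print([S[r] for r in ridx] == U)   # True
```

This runs in linear time once idx is known, whereas sorting a second time costs O(n log n).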
If I understand the question correctly (which I didn't) I think U.index(S[i]) is what you are looking for
EDIT: so I guess you could save a dictionary of the original indices and keep the retrieval syntax pretty simple
OIDX = {U[i]: i for i in range(0, len(U))}
S = sorted(U)
OIDX[S[i]]
