Python array combining - python

I'm trying to port some Python code to Java. I'm not familiar with Python and have never seen this in any language before:
return [c,] + s
What exactly does this line mean? Specifically the [c,] part. Is it combining two arrays or something? s is an array of integers and c is an integer. The full function is below (from Wikipedia: http://en.wikipedia.org/wiki/Ring_signature )
def sign(self,m,z):
    self.permut(m)
    s,u = [None]*self.n,random.randint(0,self.q)
    c = v = self.E(u)
    for i in range(z+1,self.n)+range(z):
        s[i] = random.randint(0,self.q)
        v = self.E(v^self.g(s[i],self.k[i].e,self.k[i].n))
        if (i+1)%self.n == 0: c = v
    s[z] = self.g(v^u,self.k[z].d,self.k[z].n)
    return [c,] + s
Thanks so much!

The comma is redundant. It's just creating a one-element list:
>>> [1,]
[1]
>>> [1] == [1,]
True
The practice comes from creating tuples in Python; a one-element tuple requires a comma:
>>> (1)
1
>>> (1,)
(1,)
>>> (1) == (1,)
False
The [c,] + s statement creates a new list with the value of c as the first element.
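As a small illustration (the values of c and s here are made up, not taken from the ring-signature code), prepending a one-element list builds a new list and leaves the original untouched. When porting, that roughly corresponds to creating a new list that holds c followed by every element of s.

```python
# Hypothetical stand-ins for the values in the question.
c = 7
s = [10, 20, 30]

result = [c,] + s      # same as [c] + s
print(result)          # [7, 10, 20, 30]
print(s)               # [10, 20, 30]; + built a new list, s is unchanged
```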

[c,] is exactly the same as [c], i.e. a single-item list.
(See this answer for why this syntax is needed)

For a list, the extra comma is redundant and can be ignored. It would only make a difference if this had been a tuple instead of a list, so
[c,] and [c] are the same, but
(c,) and (c) are different: the former is a tuple and the latter is just a parenthesized expression.

To answer both your questions: the line concatenates two lists, the first of which is a one-element list, since the trailing comma is simply ignored by Python.

I believe you are correct, it is combining two "arrays" (lists in python). If I'm not mistaken, the trailing comma is unnecessary in this instance.
x = [1,2,3]
y = [1] + x
#is equivalent to
x = [1,2,3]
y = [1,] + x
The reason Python allows the use of trailing commas in lists has to do with another data type called a tuple and ease of use with multi-line list declaration in code.
Why does Python allow a trailing comma in list?

Related

Why Python gives TypeError when trying to call a function that read text?

So I have a code to give a position of a character in a text file, using my function code that looks like this
#this is defined function
def return_position(arr, search_for=[''], base=1):
    dim = ()
    arr = [str(arr) for arr in arr]
    for search in search_for:
        for sr in range(len(arr)):
            s = arr[sr]
            for cr in range(len(s)):
                c = s[cr]
                if c == search:
                    dim += [(cr+base, sr+base)]
    return dim
To get the file's contents as a list of lines, I used .readlines(), which returns a list and should give the expected result, so I did
#open the file and read it as a list
textlist = open('testfile.text', 'r').readlines()
#textlist = ['Build a machine\n', 'For the next generation']
#print the return_position as a lines
print(''.join(return_position(textlist, search_for=['a', 'b'])))
In testfile.txt
Build a machine
For the next generation
Expected result
(1, 1)
(7, 1)
(10, 1)
(19, 2)
But why is it returning
TypeError: can only concatenate tuple (not "list") to tuple
dim += [(cr+base, sr+base)]
This line won't work if dim is a tuple.
Turn this
dim = ()
into
dim = []
A tuple is an immutable ordered sequence. You can't append stuff to it. So use a list.
Ref: https://docs.python.org/3/library/stdtypes.html#typesseq
In your code dim is a tuple and tuples are immutable (i.e. cannot be modified after they're constructed).
You'll need to turn dim into a list to be able to append to it:
dim = []
As the other answers point out already a tuple is immutable and can not be changed.
That, however, does not answer your question because you are not trying to change a tuple, you are trying to concatenate a tuple and a list. Python does not know whether you would want a tuple or a list as a result therefore it raises the TypeError.
Concatenating two tuples works:
(1,2) + (3,4) # result: (1, 2, 3, 4)
Concatenating two lists works as well:
[1,2] + [3,4] # result: [1, 2, 3, 4]
Only the mixture of tuple and list causes the problem:
[1,2] + (3,4) # raises TypeError
Therefore (as the other answers point out already) changing the data type of either operand would solve the problem.
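A minimal sketch of that fix, using made-up values: converting either operand makes both sides the same type, and the conversion you choose decides the type of the result.

```python
dim = (1, 2)
extra = [3, 4]

as_tuple = dim + tuple(extra)    # tuple + tuple gives (1, 2, 3, 4)
as_list = list(dim) + extra      # list + list gives [1, 2, 3, 4]
print(as_tuple, as_list)
```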
I would suggest using a list and appending to it rather than concatenating, for efficiency reasons:
def return_position(arr, search_for=[''], base=1):
    dim = [] # changed this line
    arr = [str(arr) for arr in arr]
    for search in search_for:
        for sr in range(len(arr)):
            s = arr[sr]
            for cr in range(len(s)):
                c = s[cr]
                if c == search:
                    dim.append((cr+base, sr+base)) # changed this line
    return dim
textlist = ['Build a machine\n', 'For the next generation']
#print the return_position as a line
print(''.join(str(return_position(textlist, search_for=['a', 'b'])))) # added explicit conversion to str
Your print was raising a TypeError as well, because all elements of the iterable passed to str.join must be strings. Unlike the print function, str.join does not convert its arguments to strings implicitly.
EDIT:
Please note that the algorithm, although it does not raise an error anymore, is missing one of your expected results because the algorithm is case sensitive.
If you want it to be case insensitive use str.casefold or str.lower, see this answer.
I would avoid casefold here, so that an "ß" is not replaced with "ss", which could turn the word into a different word with a different pronunciation and meaning, as in the example in the linked answer (e.g. "busse" should not equal "buße").
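For what it's worth, here is a sketch of a variant that yields the four positions from the question. It compares lowercased characters with str.lower and prints one tuple per line. The name find_positions and the use of enumerate are my own choices, not part of the original code.

```python
def find_positions(lines, search_for=("",), base=1):
    """Return (column, line) positions of the searched characters, ignoring case."""
    wanted = {ch.lower() for ch in search_for}
    positions = []
    for line_no, line in enumerate(lines, start=base):
        for col_no, ch in enumerate(line, start=base):
            if ch.lower() in wanted:
                positions.append((col_no, line_no))
    return positions


textlist = ['Build a machine\n', 'For the next generation']
found = find_positions(textlist, search_for=['a', 'b'])
# str.join needs strings, so each tuple is converted explicitly.
print('\n'.join(str(pos) for pos in found))
```

Because the lines are scanned in order, the results come out sorted by line and then by column, which matches the order of the expected output.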

Why does b+=(4,) work and b = b + (4,) doesn't work when b is a list?

If we take b = [1,2,3] and if we try doing: b+=(4,)
It returns b = [1,2,3,4], but if we try doing b = b + (4,) it doesn't work.
b = [1,2,3]
b+=(4,) # Prints out b = [1,2,3,4]
b = b + (4,) # Gives an error saying you can't add tuples and lists
I expected b+=(4,) to fail as you can't add a list and a tuple, but it worked. So I tried b = b + (4,) expecting to get the same result, but it didn't work.
The problem with "why" questions is that usually they can mean multiple different things. I will try to answer each one I think you might have in mind.
"Why is it possible for it to work differently?" which is answered by e.g. this. Basically, += tries to use different methods of the object: __iadd__ (which is only checked on the left-hand side), vs. __add__ and __radd__ ("reverse add", checked on the right-hand side if the left-hand side doesn't have __add__ or its __add__ returns NotImplemented) for +.
"What exactly does each version do?" In short, the list.__iadd__ method does the same thing as list.extend (but because of the language design, there is still an assignment back).
This also means for example that
>>> a = [1,2,3]
>>> b = a
>>> a += [4] # uses the .extend logic, so it is still the same object
>>> b # therefore a and b are still the same list, and b has the `4` added
[1, 2, 3, 4]
>>> b = b + [5] # makes a new list and assigns back to b
>>> a # so now a is a separate list and does not have the `5`
[1, 2, 3, 4]
+, of course, creates a new object, but explicitly requires another list instead of trying to pull elements out of a different sequence.
"Why is it useful for += to do this?" It's more efficient; the extend method doesn't have to create a new object. Of course, this has some surprising effects sometimes (like above), and generally Python is not really about efficiency, but these decisions were made a long time ago.
"What is the reason not to allow adding lists and tuples with +?" See here (thanks, @splash58); one idea is that (tuple + list) should produce the same type as (list + tuple), and it's not clear which type the result should be. += doesn't have this problem, because a += b obviously should not change the type of a.
They are not equivalent:
b += (4,)
is shorthand for:
b.extend((4,))
while + concatenates lists, so by:
b = b + (4,)
you're trying to concatenate a tuple to a list
When you do this:
b += (4,)
it is converted to this:
b = b.__iadd__((4,))
Under the hood, that calls b.extend((4,)). extend accepts any iterable, which is why this also works:
b = [1,2,3]
b += range(2) # b is now [1, 2, 3, 0, 1]
But when you do this:
b = b + (4,)
it is converted to this:
b = b.__add__((4,))
and list.__add__ accepts only another list.
From the official docs, for mutable sequence types both:
s += t
s.extend(t)
are defined as:
extends s with the contents of t
Which is different than being defined as:
s = s + t # not equivalent in Python!
This also means any sequence type will work for t, including a tuple like in your example.
But it also works for ranges and generators! For instance, you can also do:
s += range(3)
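A quick check of that claim, with illustrative values of my own: += consumes any iterable, including a generator expression, while + still insists on a list.

```python
s = [1, 2, 3]
s += range(3)                    # a range object
s += (x * x for x in (4, 5))     # a generator expression
print(s)                         # [1, 2, 3, 0, 1, 2, 16, 25]

try:
    s = s + (9,)                 # list + tuple is rejected
except TypeError as exc:
    print("TypeError:", exc)
```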
The "augmented" assignment operators like += were introduced in Python 2.0, which was released in October 2000. The design and rationale are described in PEP 203. One of the declared goals of these operators was the support of in-place operations. Writing
a = [1, 2, 3]
a += [4, 5, 6]
is supposed to update the list a in place. This matters if there are other references to the list a, e.g. when a was received as a function argument.
However, the operation can't always happen in place, since many Python types, including integers and strings, are immutable, so e.g. i += 1 for an integer i can't possibly operate in place.
In summary, augmented assignment operators were supposed to work in place when possible, and create a new object otherwise. To facilitate these design goals, the expression x += y was specified to behave as follows:
If x.__iadd__ is defined, x.__iadd__(y) is evaluated.
Otherwise, if x.__add__ is implemented x.__add__(y) is evaluated.
Otherwise, if y.__radd__ is implemented y.__radd__(x) is evaluated.
Otherwise raise an error.
The first result obtained by this process will be assigned back to x (unless that result is the NotImplemented singleton, in which case the lookup continues with the next step).
This process allows types that support in-place modification to implement __iadd__(). Types that don't support in-place modification don't need to add any new magic methods, since Python will automatically fall back to essentially x = x + y.
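The lookup order above can be observed with two toy classes. AddOnly and InPlace are hypothetical names invented for this sketch: the first defines only __add__, the second only __iadd__.

```python
class AddOnly:
    """Defines only __add__, so += falls back to x = x + y."""
    def __init__(self, items):
        self.items = tuple(items)

    def __add__(self, other):
        return AddOnly(self.items + tuple(other))


class InPlace:
    """Defines __iadd__, so += modifies the existing object."""
    def __init__(self, items):
        self.items = list(items)

    def __iadd__(self, other):
        self.items.extend(other)
        return self          # this result is assigned back to the left-hand name


x = AddOnly([1, 2])
x_before = x
x += [3]
print(x is x_before)         # False: __add__ created a new object

y = InPlace([1, 2])
y_before = y
y += [3]
print(y is y_before)         # True: same object, updated in place
```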
So let's finally come to your actual question – why you can add a tuple to a list with an augmented assignment operator. From memory, the history of this was roughly like this: The list.__iadd__() method was implemented to simply call the already existing list.extend() method in Python 2.0. When iterators were introduced in Python 2.1, the list.extend() method was updated to accept arbitrary iterators. The end result of these changes was that my_list += my_tuple worked starting from Python 2.1. The list.__add__() method, however, was never supposed to support arbitrary iterators as the right-hand argument – this was considered inappropriate for a strongly typed language.
I personally think the implementation of augmented operators ended up being a bit too complex in Python. It has many surprising side effects, e.g. this code:
t = ([42], [43])
t[0] += [44]
The second line raises TypeError: 'tuple' object does not support item assignment, but the operation is successfully performed anyway – t will be ([42, 44], [43]) after executing the line that raises the error.
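The effect is easy to reproduce. In this sketch the exception is caught so that the tuple can be inspected afterwards; calling extend directly is one way to avoid the failing assignment.

```python
t = ([42], [43])
try:
    t[0] += [44]        # list.__iadd__ succeeds, then assigning back to t[0] fails
except TypeError as exc:
    print("TypeError:", exc)
print(t)                # ([42, 44], [43])

# extend mutates the inner list without assigning anything back into the tuple.
t[1].extend([45])
print(t)                # ([42, 44], [43, 45])
```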
Most people would expect X += Y to be equivalent to X = X + Y. Indeed, the Python Pocket Reference (4th ed) by Mark Lutz says on page 57 "The following two formats are roughly equivalent: X = X + Y , X += Y". However, the people who specified Python did not make them equivalent. Possibly that was a mistake which will result in hours of debugging time by frustrated programmers for as long as Python remains in use, but it's now just the way Python is. If X is a mutable sequence type, X += Y is equivalent to X.extend( Y ) and not to X = X + Y.
As explained here, if list did not implement the __iadd__ method, b += (4,) would just be shorthand for b = b + (4,). It obviously is not, so list does implement __iadd__. Apparently the implementation is something like this:
def __iadd__(self, x):
    self.extend(x)
    return self
This is not the actual implementation of __iadd__, but we can assume there is something like the extend method in it, which accepts tuple inputs.

What's the difference between a += number and a += number, (with a trailing comma) when a is a list?

I am reading a snippet of Python code and there is one thing I can't understand. a is a list, num is an integer
a += num,
works but
a += num
won't work. Can anyone explain this to me?
First of all, it is important to note that a += 1, works differently from a = a + 1, in this case. (a = a + 1, and a = a + (1,) both raise a TypeError because you can't concatenate a list and a tuple, but you can extend a list with a tuple.)
+= calls the list's __iadd__ method, which calls list.extend and then returns the original list itself.
1, is a tuple of length one, so what you are doing is
>>> a = []
>>> a.extend((1,))
>>> a
[1]
which just looks weird because of the length one tuple. But it works just like extending a list with a tuple of any length:
>>> a.extend((2,3,4))
>>> a
[1, 2, 3, 4]
The trailing comma makes the right side of the assignment into a tuple, not an integer. A tuple is a container structure similar to a list (with some differences). For example, these two are equivalent:
a += num,
a += (num, )
Python allows you to add a tuple to a list and will append each element of the tuple to the list. It doesn't allow you to add a single integer to a list, you have to use append for that.
using
num,
declares a tuple of length one, and not an integer.
Thus, if a = [0,1] and num = 2
a+=num,
is equivalent to
a.extend((num,))
or
a.extend((2,)) # a is now [0, 1, 2]
while
a+=num
is equivalent to
a.extend(num)
or
a.extend(2)
which gives an error, because you can extend a list with a tuple, but not with an integer (an integer is not iterable). Thus the first formulation works while the second gives you an error.
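Putting the variants side by side, with values chosen for illustration only: the trailing-comma form and append both add a single element, while the bare integer fails and leaves the list as it was.

```python
a = [0, 1]
num = 2

a += num,          # the right-hand side is the tuple (2,)
a.append(num)      # the usual way to add a single element

try:
    a += num       # an int is not iterable, so this raises
except TypeError as exc:
    print("TypeError:", exc)

print(a)           # [0, 1, 2, 2]
```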

Assigning empty list

I don't really know how I stumbled upon this, and I don't know what to think about it, but apparently [] = [] is a legal operation in python, so is [] = '', but '' = [] is not allowed. It doesn't seem to have any effect though, but I'm wondering: what the hell ?
This is related to Python's multiple assignment (sequence unpacking):
a, b, c = 1, 2, 3
works the same as:
[a, b, c] = 1, 2, 3
Since strings are sequences of characters, you can also do:
a, b, c = "abc" # assign each character to a variable
What you've discovered is the degenerate case: empty sequences on both sides. It is syntactically valid because it's a list on the left rather than a tuple. Nice find; never thought to try that before!
Interestingly, if you try that with an empty tuple on the left, Python complains (at least in the versions current when this was written; recent Python 3 releases accept it):
() = () # SyntaxError: can't assign to ()
Looks like the Python developers forgot to close a little loophole!
Do some search on packing/unpacking on python and you will find your answer.
This is basically for assigning multiple variables in a single go.
>>> [a,v] = [2,4]
>>> print a
2
>>> print v
4
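To see that the empty case really is ordinary unpacking, here is a short sketch (assuming Python 3). The number of targets on the left must match the number of items on the right, and zero matches zero.

```python
[a, b, c] = "xyz"        # a list target unpacks any iterable
print(a, b, c)           # x y z

[] = ""                  # zero targets, zero items: nothing happens
[] = []

try:
    [] = "x"             # zero targets but one item
except ValueError as exc:
    print("ValueError:", exc)
```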

Swapping every second character in a string in Python

I have the following problem: I would like to write a function in Python which, given a string, returns a string where every group of two characters is swapped.
For example given "ABCDEF" it returns "BADCFE".
The length of the string would be guaranteed to be an even number.
Can you help me how to do it in Python?
To add another option:
>>> s = 'abcdefghijkl'
>>> ''.join([c[1] + c[0] for c in zip(s[::2], s[1::2])])
'badcfehgjilk'
import re
print(re.sub(r'(.)(.)', r'\2\1', "ABCDEF"))
from itertools import chain, izip_longest  # Python 2; in Python 3 the function is named zip_longest
''.join(chain.from_iterable(izip_longest(s[1::2], s[::2], fillvalue = '')))
You can also use islices instead of regular slices if you have very large strings or just want to avoid the copying.
Works for odd length strings even though that's not a requirement of the question.
While the above solutions do work, there is a very simple solution in, shall we say, "layman's" terms. Someone still learning Python and strings can use the other answers without really understanding how they work or what each part of the code is doing, since they come without a full explanation from the poster beyond "this works". The following swaps every second character in a string and is easy for beginners to follow.
It simply iterates through the string (any length) in steps of two (starting from 0) and builds a new string (swapped_pair) by adding the character at the current index + 1 (second character) and then the character at the current index (first character), e.g., index 1 is put at index 0 and then index 0 is put at index 1, and this repeats for each pair in the string.
It also includes code to ensure the string is of even length, since the swap only works on complete pairs.
string = "abcdefghijklmnopqrstuvwxyz123"
# option 1: use this prior to the iteration below if the string needs to be even but is possibly odd
if len(string) % 2 != 0:
    string = string[:-1]
# iteration to swap every second character in string
# (stops at len(string) - 1 so a trailing unpaired character is never indexed)
swapped_pair = ""
for i in range(0, len(string) - 1, 2):
    swapped_pair += (string[i + 1] + string[i])
# option 2: use this after the iteration above (instead of option 1) for any even or odd length of string
if len(string) % 2 != 0:
    swapped_pair += string[-1]
print(swapped_pair)
badcfehgjilknmporqtsvuxwzy21 # output if the "needs to be even" code used
badcfehgjilknmporqtsvuxwzy213 # output if the "even or odd" code used
Here's a nifty solution:
def swapem (s):
    if len(s) < 2: return s
    return "%s%s%s"%(s[1], s[0], swapem (s[2:]))
for str in ("", "a", "ab", "abcdefgh", "abcdefghi"):
    print("[%s] -> [%s]"%(str, swapem (str)))
though possibly not suitable for large strings :-)
Output is:
[] -> []
[a] -> [a]
[ab] -> [ba]
[abcdefgh] -> [badcfehg]
[abcdefghi] -> [badcfehgi]
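If recursion depth is a concern for long inputs, an iterative take on the same idea might look like the sketch below. The name swap_pairs is my own. It reverses each two-character slice, so a trailing unpaired character is left in place, just as in the recursive version.

```python
def swap_pairs(s):
    """Swap each pair of adjacent characters; a trailing odd character stays put."""
    return "".join(s[i:i + 2][::-1] for i in range(0, len(s), 2))


print(swap_pairs("ABCDEF"))      # BADCFE
print(swap_pairs("abcdefghi"))   # badcfehgi
```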
If you prefer one-liners (in Python 3, reduce must first be imported from functools):
''.join(reduce(lambda x,y: x+y,[[s[1+(x<<1)],s[x<<1]] for x in range(0,len(s)>>1)]))
Here's another simple solution:
"".join([(s[i:i+2])[::-1] for i in range(0,len(s),2)])
