'0424242' * -5
I understand how multiplying strings works fundamentally, but I just stumbled on the strange fact that multiplying by a negative number yields an empty string, and thought it was interesting. I wanted to know the deeper why beneath the surface.
Anyone have a good explanation for this?
The docs on s * n say:
Values of n less than 0 are treated as 0 (which yields an empty
sequence of the same type as s).
What would you expect from multiplying a string by a negative integer?
On the other hand
# Display results in nice table
print(keyword1, " "*(60-len(keyword1)), value1)
print(keyword2, " "*(60-len(keyword2)), value2)
without having to worry that keyword? might be longer than 60 is very handy.
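A small sketch of that padding trick, wrapped in a hypothetical helper named padded_row (not from the post above) so the clamping is easy to see. When the keyword is longer than the column, the negative repeat count should simply yield no padding instead of raising an error.

```python
def padded_row(keyword, value, width=60):
    # " " * n is the empty string whenever n <= 0,
    # so an over-long keyword needs no special case.
    padding = " " * (width - len(keyword))
    return keyword + padding + " " + str(value)


# Sample values chosen only for illustration.
short = padded_row("alpha", 1, width=10)
long_ = padded_row("a_very_long_keyword", 2, width=10)
print(repr(short))
print(repr(long_))
```

If clamping is not what you want (for example, you would rather truncate long keywords), that has to be handled separately; the sketch only shows the default behavior.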
This behavior is probably defined to be consistent with range(-5) being empty ([] in Python 2). In fact, the latter may be exactly what underlies the behavior you observe.
That's literally part of the definition of the operation:
The * (multiplication) operator yields the product of its arguments. The arguments must either both be numbers, or one argument must be an integer and the other must be a sequence. In the former case, the numbers are converted to a common type and then multiplied together. In the latter case, sequence repetition is performed; a negative repetition factor yields an empty sequence.
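To make the quoted rule concrete, here is a quick check one could run. The sample values are only for illustration; the point is that the clamping applies to the built-in sequence types in general, not just to str.

```python
s = '0424242'

# Repetition counts below zero behave exactly like zero.
by_count = {n: s * n for n in (-5, -1, 0, 1, 2)}
for n, repeated in by_count.items():
    print(n, repr(repeated))

# The same holds for other built-in sequence types.
empty_list = [1, 2, 3] * -2
empty_tuple = (1, 2) * -1
empty_bytes = b'ab' * -3
print(empty_list, empty_tuple, empty_bytes)
```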
I'd expect bin(~0b111000) to return the value 0b000111 because to my understanding the NOT operation would return the opposite bit as output.
I keep reading that "~x: Returns the complement of x - the number you get by switching each 1 for a 0 and each 0 for a 1" so I don't exactly know where my logic breaks down.
Why does it show -(x + 1) instead of just literally flipping all bits?
It is flipping all the bits!
You seem to think of 0b111000 as a 6-bit value. That is not the case. All integers in Python 3 have (at least conceptually) infinitely many bits. So imagine 0b111000 to be shorthand for 0b[...]00000111000.
Now, flipping all the bits results in 0b[...]11111000111. Notice how in this case the [...] stands for infinitely many ones, so mathematically this gets interesting. And we simply cannot print infinitely many ones, so there is no way to directly print this number.
However, since this is two's complement, this simply means: the number which, if we add 0b111001 to it, becomes 0. And that is why you see -0b111001.
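If what you actually want is the 6-bit view from the question, one common approach is to mask the result down to the width you care about. A minimal sketch, assuming a fixed width of 6 bits (the mask value is my choice for this example, not something Python infers):

```python
x = 0b111000

flipped = ~x                  # all (conceptually infinite) bits flipped
masked = ~x & 0b111111        # keep only the lowest 6 bits

print(bin(flipped))           # -0b111001
print(flipped == -(x + 1))    # True
print(format(masked, '06b'))  # 000111
```

Note that bin() drops leading zeros, so format(masked, '06b') is used to show all six bits.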
I need to represent a string as a number. However, it is 8928313 characters long; note that this string can contain more than just alphabet letters, and I have to be able to convert it back efficiently too. My current (too slow) code looks like this:
alpha = 'abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ,.?!#()+-=[]/*1234567890^*{}\'"$\\&#;|%<>:`~_'
alphaLeng = len(alpha)
def letterNumber(letters):
    letters = str(letters)
    cof = 1
    nr = 0
    for i in range(len(letters)):
        nr += cof*alpha.find(letters[i])
        cof *= alphaLeng
        print(i,' ',len(letters))
    return str(nr)
Ok, since other people are giving awful answers, I'm going to step in.
You shouldn't do this.
You shouldn't do this.
An integer and an array of characters are ultimately the same thing: bytes. You can access the values in the same way.
Most number representations cap out at 8 bytes (64-bits). You're looking at 8 MB, or 1 million times the largest integer representation. You shouldn't do this. Really.
You shouldn't do this. Your number will just be a custom, gigantic number type that would be identical under the hood.
If you really want to do this, despite all the reasons above, here's how...
Code
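def lshift(a, b):
    # bitwise left shift of a by b bytes (8 * b bits)
    return a << (8 * b)

def string_to_int(data):
    # data must be a byte string (a Python 2 str, or bytes in Python 3)
    sum_ = 0
    # positions in reverse order: the first byte is shifted the furthest
    r = range(len(data) - 1, -1, -1)
    for a, b in zip(bytearray(data), r):
        sum_ += lshift(a, b)
    return sum_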
DONT DO THIS
Explanation
Characters are essentially bytes: they can be encoded in different ways, but ultimately you can treat them within a given encoding as a sequence of bytes. In order to convert them to a number, we can shift each one left by 8 bits per position in the sequence, creating a unique number. r, the range value, is the position in reverse order: the 4th element from the end needs to go left 24 bits (3*8), etc.
After getting the range and converting our data to 8-bit integers, we can then transform the data and take the sum, giving us our unique identifier. It will be identical byte-wise (or in reverse byte-order) to the original data, but just "as a number". This is entirely futile. Don't do it.
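For what it's worth, Python 3 has this conversion built in as int.from_bytes and int.to_bytes, which should produce the same big-endian number as the shift-and-sum approach described above. A minimal sketch, assuming Python 3 and UTF-8 encoding; the function names text_to_int and int_to_text are made up for this example:

```python
def text_to_int(text, encoding="utf-8"):
    # Interpret the encoded bytes as one big-endian unsigned integer.
    return int.from_bytes(text.encode(encoding), "big")


def int_to_text(number, encoding="utf-8"):
    # Minimum number of bytes needed to hold the integer.
    # Caveat: leading NUL characters in the original text are not recoverable.
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big").decode(encoding)


sample = "hello, world"
as_int = text_to_int(sample)
print(as_int)
print(int_to_text(as_int))
```

All the earlier objections still apply: the integer holds exactly the same bytes as the encoded string.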
Performance
Any performance is going to be outweighed by the fact that you're creating an identical object for no valid reason, but this solution is decently performant.
1,000 elements take ~486 microseconds, 10,000 elements take ~20.5 ms, while 100,000 elements take about 1.5 seconds. It would work, but you shouldn't do it. This means it scales as O(n**2), which is likely due to the memory overhead of reallocating the data each time the integer size gets larger. This might take ~4 hours to process all 8e6 elements (14365 seconds, calculated by fitting the lower-order data to ax**2+bx+c). Remember, this is all to get the identical byte representation as the original data.
Futility
Remember, there are ~1e78 to 1e82 atoms in the entire universe, on current estimates. This is ~2^275. Your value will be able to represent 2^71426504, or about 260,000 times as many bits as you need to represent every atom in the universe. You don't need such a number. You never will.
If there are only ASCII characters, you can use the built-in functions ord() and chr().
There are several optimizations you can perform. For example, the find method requires searching through your string for the corresponding letter. A dictionary would be faster. Even faster might be (benchmark!) the chr function (if you're not too picky about the letter ordering) and the ord function to reverse the chr. But if you're not picky about ordering, it might be better if you just left-NULL-padded your string and treated it as a big binary number in memory if you don't need to display the value in any particular format.
You might get some speedup by iterating over characters instead of character indices. If you're using Python 2, a large range will be slow since a list needs to be generated (use xrange instead for Python 2); in Python 3, range is a lazy sequence that produces its values on demand, so it's better.
Your print function is going to slow down output a fair bit, especially if you're outputting to a tty.
A big number library may also buy you speed-up: Handling big numbers in code
Your alpha.find() function needs to iterate through alpha on each loop.
You can probably speed things up by using a dict, as dictionary lookups are O(1):
alpha = 'abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ,.?!#()+-=[]/*1234567890^*{}\'"$\\&#;|%<>:`~_'
alpha_dict = { letter: index for index, letter in enumerate(alpha)}
print(alpha.find('$'))
# 83
print(alpha_dict['$'])
# 83
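Putting the dict lookup to work, here is a rough sketch of the encoding loop plus a decoder. It assumes a short, duplicate-free alphabet (ALPHA below is a stand-in, not the one from the question), and the function names are made up for illustration. One behavioral difference to be aware of: find() returns -1 for a missing character, while the dict lookup raises KeyError.

```python
ALPHA = "abcdefghijklmnopqrstuvwxyz "          # hypothetical alphabet
INDEX = {ch: i for i, ch in enumerate(ALPHA)}  # average O(1) lookups
BASE = len(ALPHA)


def letters_to_number(letters):
    nr = 0
    # Walk backwards so the first character ends up least significant,
    # matching the digit order of the loop in the question.
    for ch in reversed(letters):
        nr = nr * BASE + INDEX[ch]
    return nr


def number_to_letters(nr, length):
    # The length is needed because trailing characters with index 0
    # contribute nothing to the number and would otherwise be lost.
    chars = []
    for _ in range(length):
        nr, digit = divmod(nr, BASE)
        chars.append(ALPHA[digit])
    return "".join(chars)


text = "hello world"
encoded = letters_to_number(text)
print(encoded)
print(number_to_letters(encoded, len(text)))
```

This only removes the per-character search cost; the big-integer arithmetic itself still gets slower as the number grows.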
Store your strings in an array of distinct values; i.e. a string table. In your dataset, use a reference number. A reference number of n corresponds to the nth element of the string table array.
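A minimal sketch of such a string table, assuming the strings fit in memory; build_string_table and the sample data are invented for this example:

```python
def build_string_table(strings):
    table = []      # reference number n -> nth distinct string
    index_of = {}   # string -> its reference number
    refs = []       # the dataset, rewritten as reference numbers
    for s in strings:
        if s not in index_of:
            index_of[s] = len(table)
            table.append(s)
        refs.append(index_of[s])
    return table, refs


table, refs = build_string_table(["red", "green", "red", "blue", "green"])
print(table)
print(refs)
print([table[n] for n in refs])  # converting back is a plain list lookup
```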
Why does print(type(-1**0.5)) in Python return float instead of complex?
The square root of a negative integer or float is, mathematically, always a complex number. How does Python's exponent operator support getting a complex number?
print(type(-1**0.5))
<type 'float'>
In the mathematical order of operations, exponentiation comes before multiplication and unary minus counts as multiplication (by -1). So your expression is the same as -(1**0.5), which doesn't involve any imaginary numbers.
If you do (-1)**0.5 you'll get an error in Python 2 because the answer isn't a real number. If you want a complex answer, you need to use a complex input by doing (-1+0j)**0.5. (In Python 3, (-1)**0.5 will return a complex result.)
Try (-1)**0.5 instead.
-1**0.5 is parsed as -(1**0.5), which is equal to -1.0.
>>> -1**0.5
-1.0
>>> (-1)**0.5
(6.123e-17+1j)
The exponentiation is being carried out first, and then its sign is inverted. To get the result you want, use parentheses to ensure that the - sign stays with the 1:
>>> -1**0.5
-1.0
>>> (-1)**0.5
(6.123233995736766e-17+1j)
Python is correct as -1**0.5 is different from (-1)**0.5.
The first one raises one to the power of 0.5 and negates the result.
The second one raises -1 to the same power and returns a complex number as expected.
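A short side-by-side of the cases discussed in this thread, assuming Python 3 (in Python 2 the second line raises an error, as noted above). cmath.sqrt is included as one way to ask for the complex square root explicitly:

```python
import cmath

a = -1 ** 0.5        # parsed as -(1 ** 0.5)
b = (-1) ** 0.5      # negative base, fractional exponent
c = cmath.sqrt(-1)   # explicit complex square root

print(type(a).__name__, a)
print(type(b).__name__, b)
print(type(c).__name__, c)
```

The tiny real part in b is floating-point rounding noise; cmath.sqrt(-1) gives 1j.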
There is something extremely strange happening when I do some ordinary calculations in Python. If I do a multiplication without brackets, it gives the right result, but if I put some things in brackets, the total multiplication becomes equal to zero.
For those who don't believe (I know that it sounds strange):
>>> print( 1.1*1.15*0.8*171*15625*24*(60/368*0.75)/1000000 )
0.0
>>> print( 1.1*1.15*0.8*171*15625*24*60/368*0.75/1000000 )
7.93546875
as shown in this Jupyter screenshot.
The only difference between both multiplications is that in the first there are brackets around 60/368*0.75.
How is this possible and what can I do against it? I have no idea how this is even possible.
If you divide integers a, b in Python 2, the result is the floor of the division, so if 0 <= a < b the result is 0.
With brackets you have the operation 60/368 which gives 0.
But without brackets the number 60 is first multiplied by everything before it, which results in some floating-point value, so dividing this value by 368 does not yield 0.
Parentheses change the order of evaluation, and the expression inside them is evaluated first. Here, since 60 and 368 are both integer literals, they are divided using integer division, meaning only the "whole" part is kept. Since 60 is smaller than 368, their integer division is 0. From there on, the result is obvious: you've got a series of multiplications and divisions where one of the factors is 0, so the end result would also be 0.
To prevent this you could express the numbers as floating point literals - 60.0 and 368.0. (Well, technically, just using 60.0 would be sufficient here, but for consistency's sake I recommend representing all the numbers as floating point literals).
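The effect can be reproduced in Python 3, where / on two integers is already true division, by writing the floor division explicitly with //. This is only a sketch of the two cases; the variable names are mine:

```python
prefix = 1.1 * 1.15 * 0.8 * 171 * 15625 * 24

# What Python 2 computes for (60/368*0.75): floor division gives 0.
floored = prefix * (60 // 368 * 0.75) / 1000000

# With float literals, the division keeps its fractional part.
true_div = prefix * (60.0 / 368.0 * 0.75) / 1000000

print(floored)
print(true_div)
```

In Python 2, putting from __future__ import division at the top of the module makes / behave like the second case for integers as well.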
I feel like this is a simple question, but it keeps escaping me...
If I had a string, say, "1010101", how would I refer to the first digit in the string by its index?
You can get the first element of any sequence with [0]. Since a string is a sequence of characters, you're looking for s[0]:
>>> s = "1010101"
>>> s[0]
'1'
For a detailed explanation, refer to the Python tutorial on strings.
Negative indexes count from the right side.
digit = mystring[-1]
In Python, a string is what's called subscriptable. That means that you can access its different parts using square brackets, just like you can with a list.
If you want to get the first character of the string, then you can simply use my_string[0].
If you need to get the last (character) in a string (the final 1 in the string you provided), then use my_string[-1].
If you originally have an int (or a long) and you are looking for the last digit, you are best off using % (modulo) (10101 % 10 => 1).
If you have a float, on the other hand, you are best off using str(my_float)[-1]
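The options from this thread, side by side. The values are just examples; the abs() call is an extra precaution of mine, since % with a negative left operand does not give the last written digit in Python.

```python
s = "1010101"
first = s[0]     # first character
last = s[-1]     # last character

n = 10101
last_digit = n % 10            # last digit of a non-negative int

m = -10102
raw = m % 10                   # 8, not 2: the result takes the sign of the divisor
neg_last_digit = abs(m) % 10   # 2

f = 2.5
float_last = str(f)[-1]        # last character of the float's string form

print(first, last, last_digit, raw, neg_last_digit, float_last)
```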