Project Euler #13 Python. Incorrect carry over - python

This problem asks to sum up 100 numbers, each 50 digits long. http://code.jasonbhill.com/python/project-euler-problem-13/
We can replace \n with "\n+" in Notepad++ yielding
a=37107287533902102798797998220837590246510135740250
+46376937677490009712648124896970078050417018260538
...
+20849603980134001723930671666823555245252804609722
+53503534226472524250874054075591789781264330331690
print(a)
>>37107287533902102798797998220837590246510135740250 (incorrect)
We can as well replace \n with \na+= yielding
a=37107287533902102798797998220837590246510135740250
a+=46376937677490009712648124896970078050417018260538
...
a+=20849603980134001723930671666823555245252804609722
a+=53503534226472524250874054075591789781264330331690
print(a)
>>553... (correct)
This seems to be a feature of BigInteger arithmetic. Under which conditions does a sum of all numbers (Method 1) yield a different result from an iterative increment (Method 2)?

As you can see from the result, the first set of instructions does not compute the sum; it only keeps the first assignment. Since +N on its own is a valid statement, the lines after the assignment do nothing. Thus
a=42
+1
print(a)
prints 42.
To continue a statement onto the next line, you need to escape the trailing newline with a backslash:
a=42\
+1
print(a)
prints 43.

Python source code lines are terminated by newline characters. The subsequent lines in the first example are separate expression statements consisting of a single integer with a unary plus operator in front, but they don't do anything. They evaluate the expression (resulting in the integer constant itself), and then ignore the result. If you put all numbers on a single line, or use parentheses around the addition, the simple sum will work as well.
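A minimal sketch of the two behaviours described above, using small stand-in values instead of the 100 fifty-digit numbers. The names `unchanged`, `b`, `text` and `total` are illustrative assumptions, and `text` holds only the first two numbers from the question.

```python
# Method 1 as written: each "+N" line is its own expression statement.
a = 42
+1          # evaluated, then discarded
unchanged = a

# Parentheses let a single expression span several lines.
b = (42
     + 1)

# Reading the numbers from text avoids editing the source at all.
# `text` is a stand-in holding two of the numbers from the question.
text = """37107287533902102798797998220837590246510135740250
46376937677490009712648124896970078050417018260538"""
total = sum(int(line) for line in text.split())

print(unchanged, b)      # 42 43
print(str(total)[:10])   # leading ten digits of this partial sum
```

Python integers have arbitrary precision, so neither approach loses a carry; the only difference is whether the additions are part of the assignment statement.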

Related

Declaring and Looping over a variable in one line

Just for fun, I am trying to compress a programming problem into one line. I know this is typically a bad practice, but it is a fun challenge that I am asking for your help on.
I have a piece of code which declares the variables and in the second line which loops over a list created in the first line, until a number is not found anymore. Finally it returns that value.
The programming question is as follows. Given a sentence, convert each character to its ASCII representation. Then convert that ASCII value to binary (filling the remaining spaces with 0 if the binary number is less than 8 digits), and combine the numbers into one string. Starting from the number 0, convert it to binary and check if it is in the string. If it is, add one to the number and check again. Return the last consecutive binary number that is in the string.
Ex)
string = "0000010"
0 in string: add 1
1 in string: add 1
10 in string: add 1
11 not in string: the last consecutive binary number was 10 in base 2, which is 2 in base 10. Return 2
You can see my code below
def findLastBinary(s: str):
    string, n = ''.join(['0'*(10-len(bin(ord(char))))+bin(ord(char))[2:] for char in s]), 0
    while bin(n)[2:] in string: n+=1
    return n-1
It would also be nice if I could combine the return statement and loop into one line as well.
EDIT
Fixed the code (it should work now). Also below, you will see a sample test case. Hope this helps with answering this question.
Sample test case
Input:
s="Roses and thorns"
Below you will see the steps my code follows to get the correct answer (obviously made more readable)
Organized into columns in the following order:
Character-Ascii-Binary Representation of ascii value:
R - 82 - 01010010
o - 111 - 01101111
s - 115 - 01110011
etc.
Keep in mind that if the binary number has less than 8 digits, zeros should be added to the beginning of the number until it is 8 digits.
Each binary integer is then concatenated into a single string (I added spaces for readability only):
01010010 01101111 01110011 01100101 01110011 00100000 01100001 01101110 01100100 00100000 01110100 01101000 01101111 01110010 01101110 01110011
Now we start from the binary number 0 and check whether it is in the string. It is, so we move on to 1. 1 is in the string, so we move on to 10. 10 is in the string. We continue until we find that the binary string 11111 is not in our string. 11111 in base 2 is 31 in base 10. Since 31 was the first number whose binary representation was not in the string, we return the last number whose binary representation was in the string: namely, 31-1=30. 30 is what the function should return.
The problem statement has changed. See the bottom of this answer for the updated solution.
The function can be defined this way, thanks to @treuss' observation (this applies to the original problem: find the largest base 10 integer which, when converted to binary, is in the string):
def largest_binary_number(sentence: str):
    return int(''.join([bin(ord(char))[2:].zfill(8) for char in sentence]), 2)
But suppose that the problem was to "find the smallest base 10 integer larger than 1000 whose binary representation is in the string." Then we have something like this:
def find(sentence: str):
    return list(iter(lambda: globals().__setitem__('_c', globals().get('_c', 1000-1) + 1) or bin(globals().get('_c'))[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in sentence]), True)) is type or globals().get('_c')
Let's break this down into four parts:
globals().__setitem__('_c', globals().get('_c', 1000-1) + 1) - initialize and increment a counter
... or bin(globals().get('_c'))[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in sentence]) - check if the binary representation of the counter is in the binary representation of the sentence
list(iter(lambda: ..., True)) - inline while loop using black magic
... is type or globals().get('_c') - get the final value of the counter, which satisfies our condition
Part 1: globals().__setitem__('_c', globals().get('_c', 1000-1) + 1)
Since we are confined to do everything in one line, we don't have the luxury of defining variables. This is where globals comes in: we can store and use arbitrary variables as dictionary entries using the __setitem__ and get methods. Here we name our counter variable _c, calling get to initialize and fetch the value, then immediately increment it by one and save the value with __setitem__. Now we have a counter variable.
Part 2: ... or bin(globals().get('_c'))[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in sentence])
bin(globals().get('_c'))[2:] converts the counter to binary and removes the 0b prefix. ''.join([bin(ord(c))[2:].zfill(8) for c in sentence]), as before, converts the input sentence to binary. We use in to check if the binary counter is a substring of the binary sentence. Because the __setitem__ call from part 1 returns None, we use or here to ignore that and execute this part.
Part 3: list(iter(lambda: ..., True))
This is the bread and butter, allowing us to perform inline iteration. iter is usually passed an iterable to create an iterator, but it actually has a second form that takes two arguments: a callable and a sentinel. When iterating over an iterator created using this two-argument form, the callable is called repeatedly until it returns the sentinel value (beware infinite loops!). So we define a lambda function that returns True when the condition is satisfied, and set the sentinel to True. Finally we use the list constructor to begin iterating.
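The two-argument form of iter can be tried in isolation. This is a small sketch, not part of the one-liner; `counter` and `step` are hypothetical names used only for the illustration.

```python
# Hypothetical counter used only for this illustration.
counter = {'v': 0}

def step():
    """Advance the counter and report whether it has reached 5."""
    counter['v'] += 1
    return counter['v'] >= 5

# iter(callable, sentinel) calls step() until it returns True;
# the sentinel itself is not included in the results.
seen = list(iter(step, True))

print(seen)          # [False, False, False, False]
print(counter['v'])  # 5
```

The list holds the values returned before the sentinel appeared, which is why the one-liner discards it and reads the counter instead.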
Part 4: ... is type or globals().get('_c')
Once the list constructor finishes iterating, we need to fetch and return the final value of the counter. We follow list(...) with is type to make an expression that always evaluates to False, then chain it with or globals().get('_c') at the end of this one-liner to return the counter. Et voilà!
Part 5:
Of course, what we had before was a two-liner.
find = lambda sentence: list(iter(lambda: globals().__setitem__('_c', globals().get('_c', 1000-1) + 1) or bin(globals().get('_c'))[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in sentence]), True)) is type or globals().get('_c')
Now we have a one-liner.
Note: In hindsight, maybe the walrus := could be used to make the counter, instead of having to call globals() every time. However, replacing globals with locals doesn't work for some reason.
Note 2: Using these techniques, we can make one-liners that satisfy various conditions.
Update: Here's another version using the walrus
find = lambda sentence: (_c := {'v': 1000-1}) and list(iter(lambda: _c.__setitem__('v', _c['v'] + 1) or bin(_c['v'])[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in sentence]), True)) is type or _c['v']
We initialize the counter at the top level and simply use _c everywhere else. Note how it is a dict instead of an int because outer variables cannot be assigned within the inner lambda (but mutating outer variables is fine).
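As a rough illustration of the point about mutating rather than assigning, the sketch below keeps state in a dict that a lambda updates through `__setitem__`. `make_counter` and `bump` are hypothetical names, not part of the answer's code.

```python
def make_counter():
    state = {'v': 0}  # mutable container shared with the lambda
    # __setitem__ returns None, so `or` falls through to the new value.
    return lambda: state.__setitem__('v', state['v'] + 1) or state['v']

bump = make_counter()
first, second = bump(), bump()
print(first, second)  # 1 2
```

Because the dict is created inside the enclosing call, each call to `make_counter` starts from a fresh counter.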
Update 2: OP has updated the problem statement, so here's the new solution:
find = lambda s: (_c := {'v': 0-1}) and list(iter(lambda: _c.__setitem__('v', _c['v'] + 1) or bin(_c['v'])[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in s]), False)) is type or _c['v'] - 1
The techniques are the same, but now we start the counter from -1 (the first iteration increments it to 0 before anything else), the sentinel becomes False (because we stop the loop when the binary counter is not in the binary string), and decrement the return value by 1 to get the last number satisfying the condition.
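For comparison, here is a plain multi-line sketch of the same logic, which may help when checking what the one-liner is supposed to compute. The name `find_last_binary` and the use of `format` are choices made for this sketch, not taken from the answer.

```python
def find_last_binary(s: str) -> int:
    # Eight-bit, zero-padded binary for each character, joined together.
    bits = ''.join(format(ord(ch), '08b') for ch in s)
    n = 0
    # Count upward while the binary form of n is still a substring.
    while format(n, 'b') in bits:
        n += 1
    return n - 1

# "a" is 01100001: 0, 1, 10, 11 and 100 occur, 101 does not.
print(find_last_binary("a"))  # 4
```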

Why am I getting a different output when the multiplication of the digit is above 10?

#card number
card = input('Number: ')
j = int(card[::2]) # this will jump character by 1
# multiplying each other number by 2
j *= 2
print(j)
Whenever I run this code and input e.g. 1230404,
the output is 2688, which is correct.
But when I input, for example, 1230909, the output is 2798; I expected 261818.
Let's look at what your code is doing.
You slice every second character from your input string, so '1230909' becomes '1399'.
You convert that to a single int, 1399.
You multiply that number by 2, producing 2798. I assure you that the computer did the math correctly.
It appears what you expected was for each digit to be doubled individually. To do that, you need to convert each digit, double it, and combine the results back into a string. Python has great facilities for this; I'd suggest a generator expression inside a join call.
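One way the suggestion could look, as a minimal sketch with the input hard-coded instead of read via input(); the variable name `doubled` is an assumption.

```python
card = '1230909'
# Double every second digit individually, then join the pieces back together.
doubled = ''.join(str(int(digit) * 2) for digit in card[::2])
print(doubled)  # 261818
```

The result is a string, since the doubled digits are concatenated rather than added.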

Why compare two strings via calculating xor of their characters?

Some time ago I found this function (unfortunately, I don't remember where it came from, most likely some Python framework) that compares two strings and returns a bool value. It's quite simple to understand what's going on here.
The XOR of two characters is non-zero (truthy) if they do not match.
def cmp_strings(str1, str2):
    return len(str1) == len(str2) and sum(ord(x)^ord(y) for x, y in zip(str1, str2)) == 0
But why is this function used? Isn't it the same as str1==str2?
It takes a similar amount of time to compare any strings that have the same length. It's used for security when the strings are sensitive. Usually it's used to compare password hashes.
If == is used, Python stops comparing characters when the first one not matching is found. This is bad for hashes because it could reveal how close a hash was to matching. This would help an attacker to brute force a password.
This is how hmac.compare_digest works.
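A short sketch of calling the standard-library function directly. The two digests are hypothetical values made up for the example, and SHA-256 is used here only to produce equal-length strings; it is not meant as a recommendation for password storage.

```python
import hashlib
import hmac

# Hypothetical stored and submitted values, for illustration only.
stored_hash = hashlib.sha256(b"correct horse").hexdigest()
guess_hash = hashlib.sha256(b"wrong guess").hexdigest()

matches = hmac.compare_digest(stored_hash, stored_hash)
differs = hmac.compare_digest(stored_hash, guess_hash)
print(matches, differs)  # True False
```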
The security issue that is being addressed by XOR comparison is known as a Timing Attack. ...This is where you observe how much time it takes the Compare function to succeed|fail, and use that knowledge to gain an advantage over the system.
There are 95 printable ASCII characters. If you have an 8 character password, there are 95^8 (6,634,204,312,890,625) possible combinations ...If the correct password is the last one in your list, and you can try 1 billion passwords per second, it will take you about 77 days to Brute Force the password ...That's too long - so we need a shortcut!
There are an infinite number of ways to store a string - and probably a dozen in popular use {length-prefixed, nul-terminated, ...}{Unicode, UTF-8, ASCII, ,...}. For this working example, I will use the ubiquitous 'NUL-terminated array of bytes using ASCII encoding' ...IE. "ABC" will be stored as "ABC"NUL, or {65, 66, 67, 0} ...but whatever storage/encoding standard you use, the problem is essentially the same.
Syntactically, there are as many ways to compare two strings as there are languages, eg. if str1 == str2 or if (strcmp(str1, str2) == 0) etc. ...but when you look at how they work internally, they are all pretty-much the same. Here is some simple (but realistic) pseudo-code to perform a classic (non-security) string compare:
index = 0
LOOP FOREVER {
    IF ( (str1[index] == 0) AND (str2[index] == 0) ) THEN return 'same'
    IF (str1[index] != str2[index]) THEN return 'different'
    index = index + 1
}
Assuming the secret password is "BY3"NUL ...Let's try some passwords, and notice how many operations the Compare function has to do to establish success|fail.
1. "A"NUL ... returns 'different' when 1st char is checked (A) [zero chars are correct]
2. "B"NUL ... returns 'different' when 2nd char is checked (NUL) [first char must be correct]
3. "BX"NUL ... returns 'different' when 2nd char is checked (X) [first char must be correct]
4. "BY"NUL ... returns 'different' when 3rd char is checked (NUL) [first two chars must be correct]
5. "BY1"NUL ... returns 'different' when 3rd char is checked (1) [first two chars must be correct]
6. "BY2"NUL ... returns 'different' when 3rd char is checked (2) [first two chars must be correct]
7. "BY3"NUL ... returns 'same' when the 4th character is checked (NUL) [all three chars are correct]
You can see that guess 1 fails the 1st time around the loop, guesses 2 & 3 fail the 2nd time around the loop ...guesses 4, 5, 6 fail the 3rd time around the loop ...and guess 7 succeeds the 4th time around the loop.
By observing how much time it takes the Compare function to fail, we can tell which character is wrong! This means we can actually guess the password one character at a time.
Again, let's assume an 8 character password made up of the 95 printable characters, and our last guess will be correct ...Because we can now guess the password one character at a time, it will take 95*8 (760) guesses. At 1 billion guesses per second, it will take about 0.76 microseconds to find the password [it takes about 100 ms to blink] ...which is a significant advantage over 77 days ...For a laugh, work out the advantage for a 20 character password (95^20 vs 95 * 20).
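The figures can be checked with a few lines of arithmetic. This is only a sketch; the constant names are chosen for the example.

```python
ALPHABET = 95              # printable ASCII characters
LENGTH = 8                 # password length
RATE = 1_000_000_000       # guesses per second

brute_force_guesses = ALPHABET ** LENGTH
timing_attack_guesses = ALPHABET * LENGTH

brute_force_days = brute_force_guesses / RATE / 86_400
timing_attack_seconds = timing_attack_guesses / RATE

print(brute_force_guesses)      # 6634204312890625
print(round(brute_force_days))  # 77
print(timing_attack_seconds)    # 7.6e-07
```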
So how do we stop an attacker from using a Timing Attack? [Spoiler: XOR]
The first thing we need to do is to make both strings the same length; and secondly, we must ALWAYS check EVERY character before returning 'same' or 'different' ...This is surprisingly difficult to do without introducing a new Timing Attack. But rather than show you lots of ways to get it wrong, let's see a way to do it right.
Passwords should (where possible) be stored as Hashes ...{DES, MD5, SHA-1, ...} have now been shown to have cryptographic flaws, {SHA-256, SHA-3, Whirlpool, ...} are still in good favour [Oct 2021] ...You may know that ALL Hashes (generated by a given algorithm) are the same length ...So if we Hash the guess and compare the Guess-Hash against the Stored-Hash, we have solved the first problem - the 'strings' (array of bytes) we need to compare are now ALWAYS the same length.
Secondly. How to make sure our Compare function ALWAYS takes the same amount of time to reach its decision ...There are probably a lot of ways to do this, but the most common solution is to use XOR like this:
result = 0
index = 0
LOOP WHILE (index < hashLength) {
    result = result OR ( secretHash[index] XOR guessHash[index] )
    index = index + 1
}
IF result == 0 THEN return 'same' ELSE return 'different'
And this way ALL calls to the compare function take the same length of time to run ...No more Timing Attack!
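The pseudo-code translates to Python roughly as follows. This is a sketch of the technique only: the function name is an assumption, and pure Python gives no real timing guarantees, so production code would normally call hmac.compare_digest instead.

```python
def constant_time_equal(secret_hash: bytes, guess_hash: bytes) -> bool:
    # Hashes from one algorithm have equal length; reject anything else.
    if len(secret_hash) != len(guess_hash):
        return False
    result = 0
    # Visit every byte; any difference sets bits in result that never clear.
    for x, y in zip(secret_hash, guess_hash):
        result |= x ^ y
    return result == 0

print(constant_time_equal(b"abc", b"abc"))  # True
print(constant_time_equal(b"abc", b"abd"))  # False
```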
Footnote:
For readers not familiar with Boolean Logic - go and read up; but the essence here is:
If A and B are the same, (A XOR B) gives a result of 0
If A and B are different, (A XOR B) gives a non-0 result
If A and B are both 0, (A OR B) gives a result of 0
If either A or B is non-0, (A OR B) gives a non-0 result
So (looking at the second code block) the first time the XOR returns non-0 (different), the result becomes non-0 (different) and can never return to 0 (same).
A search for "cve timing attack" will provide you with a list of real-life examples.
It appears to be doing a correlation (XOR sum) character-wise between the strings, given they are of the same length. It could be required in situations where you need to know 'similarity' and not equality. Maybe that was the plan. The author might have wanted to extend this function further.

Formatting Integers

I want to understand the logic behind the outputs of the below print statements.
x = 345
print ("%06d"%x)
print ("%-06d"%x)
The first statement, as expected, prefixes the number with the zeroes required to make the total length 6. The output is 000345, which I understand.
But the output of the second print statement is 345. How come? What is the purpose of the "-" flag?
minus means align to left.
You will see it when you add another element in print -
print("%06d"%x, 'a')
print("%-06d"%x, 'a')
Result
000345 a
345    a
See: PyFormat.info
Basically, the minus and the leading zero specify conflicting requirements. Python resolves the conflict in favour of the minus: the '-' flag is documented as overriding the '0' flag when both are given.
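Printing the repr of each result makes the padding visible. A minimal sketch; the variable names are chosen for the example.

```python
x = 345
zero_padded = "%06d" % x    # zero flag: pad on the left with zeros
left_aligned = "%-06d" % x  # minus flag wins: pad on the right with spaces

print(repr(zero_padded))   # '000345'
print(repr(left_aligned))  # '345   '
```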

How come a string multiplied by negative integer results in empty string?

'0424242' * -5
I understand how multiplying strings works fundamentally, but I just stumbled on this strange fact that multiplying by a negative number yields an empty string and thought it was interesting. I wanted to know the deeper why beneath the surface.
Anyone have a good explanation for this?
The docs on s * n say:
Values of n less than 0 are treated as 0 (which yields an empty
sequence of the same type as s).
What would you expect multiplying a string by a negative integer?
On the other hand
# Display results in nice table
print(keyword1, " "*(60-len(keyword1)), value1)
print(keyword2, " "*(60-len(keyword2)), value2)
being able to write this without worrying whether a keyword is longer than 60 characters is very handy.
This behavior is probably defined to be consistent with range(-5) being empty (list(range(-5)) is []). In fact, the latter may be exactly what underlies the behavior you observe.
That's literally part of the definition of the operation:
The * (multiplication) operator yields the product of its arguments. The arguments must either both be numbers, or one argument must be an integer and the other must be a sequence. In the former case, the numbers are converted to a common type and then multiplied together. In the latter case, sequence repetition is performed; a negative repetition factor yields an empty sequence.
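A few quick checks of the rule quoted above, including the padding case where the computed width goes negative. The `keyword` value is a made-up example.

```python
print(repr('0424242' * -5))  # ''
print([1, 2] * -1)           # []
print(repr('ab' * 0))        # ''

# A label longer than the column width simply gets no padding.
keyword = 'x' * 70
padding = ' ' * (60 - len(keyword))
print(repr(padding))         # ''
```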
