How to change numbers in a number - python

I'm currently trying to learn python.
Suppose there was a number n = 12345.
How would one go about changing each digit in turn, cycling the first digit through 1-9 and every digit after it through 0-9?
I'm still learning Python, so I apologize for any syntax errors that follow.
Here are my last few attempts/ideas for a skeleton of the code.
define the function
turn n into string
start with a for loop that for i in n range(0,9) for i[1]
else range(10)
Basically, how does one hold some digits of a number fixed while changing the others?
Please don't give a solution, just hints; I enjoy the thinking process.
For example, if n = 29 the program could check
19,39,49,59,69,79,89,99
and
21,22,23,24,25,26,27,28

Although you are new, the process is far easier than you think.
You want to apply that change to every digit of the number (let's say n = 7382). But you cannot iterate over a number, nor change specific digits of it directly; you can only iterate over iterables (like lists). A string is an iterable. Once you find the way to convert between int and str, you can iterate over every digit and then print each new number.
But how do you change only the digit you are currently on? Again, the way is the conversion (doing it once and saving it in a variable before the loop keeps the code DRY) plus building a new string from everything except the current digit. There are two ways to locate that digit:
1. You search for that specific value and get its index (bad).
2. You enumerate the loop (good).
Why is 2 good? Because it gives you the real position of the digit being changed (think of 75487: searching for the index of 7 always finds the first 7, so it goes wrong when you reach the last one). Search for a way to iterate over the items in a loop while also getting each item's index.
The easiest way to get a substring in Python is slicing. You slice twice: once to get all the digits before the current one, and once to get all the digits after it. Then you just join those two strings around the replacement digit and you are done.
I hope I didn't make it too easy for you, but it is hard to give only hints for a task as simple as this.
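For readers who would rather check their own attempt against a worked version, the hints above (convert once, enumerate, slice twice, join) could be sketched roughly as follows. The function name digit_variants is made up for illustration, and the sketch assumes n is a positive integer.

```python
def digit_variants(n):
    """Return every number obtained by changing exactly one digit of n."""
    s = str(n)  # convert once, before the loop
    variants = []
    for i, current in enumerate(s):
        # the leading digit cycles through 1-9, every other digit through 0-9
        candidates = "123456789" if i == 0 else "0123456789"
        for d in candidates:
            if d != current:
                # slice before and after position i, then join around the new digit
                variants.append(int(s[:i] + d + s[i + 1:]))
    return variants


print(digit_variants(29))
```

For n = 29 this yields 19, 39, ..., 99 followed by 20, 21, ..., 28. Note that 20 is included because the question allows 0 in every position after the first.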

Related

Assign a unique value to a string s based on its lexicographic order

I want to come up with a function that assigns a unique value to a string based on its lexicographic order. For instance, if my function is called get_key(s), it should take a string s as input and return a unique integer, so that I can compare two strings in O(1) time by comparing the integers I get.
Some code for clarity:
get_key('aaa')
#Returns some integer
get_key('b')
#Returns another integer > output of get_key('aaa') since 'b' > 'aaa'
Any help would be highly appreciated.
Note: Cannot use python built in function id()
It's impossible.
Why? No matter what integers you return for two strings such as 'a' and 'b', infinitely many strings sort between them ('aa', 'aaa', 'aaaa', ...), but only finitely many integers lie between the two keys.
You would need an unlimited number of values between any two keys, because there's an infinite number of strings.
If I understand your problem correctly, one idea that comes to mind is to convert the input to hex and then from hex to int. I believe this would solve the problem; however, I think it is impossible to do in O(1). The solution I provided (and every possible solution I can think of) needs O(n), since you give no bound on the input length and the function's running time depends on the length of the input.
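A small sketch may help compare the two answers. It assumes ASCII input without NUL characters and, for the second function, an upper bound MAX_LEN on the string length; that bound is my assumption and is not part of the original question. The names naive_key and MAX_LEN are hypothetical.

```python
def naive_key(s):
    # the hex-then-int idea: read the string's bytes as one big integer
    return int(s.encode("ascii").hex(), 16)


MAX_LEN = 8  # assumed upper bound on string length (not in the original question)


def get_key(s):
    # pad on the right with NUL bytes so every key has the same width;
    # equal-width big-endian integers compare like the strings themselves
    if len(s) > MAX_LEN:
        raise ValueError("string longer than the assumed bound")
    padded = s.encode("ascii").ljust(MAX_LEN, b"\x00")
    return int.from_bytes(padded, "big")


print(naive_key("aaa") > naive_key("b"))  # True, although 'aaa' < 'b'
print(get_key("aaa") < get_key("b"))      # True, matching 'aaa' < 'b'
```

The unpadded conversion orders longer strings after shorter ones, so it does not follow lexicographic order. Padding repairs that only because the length is bounded, which is consistent with the impossibility argument for unbounded strings; building each key still takes time proportional to the string length.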

Recursively search word in a matrix of characters

I'm trying to write a program for homework that uses recursion to search for a word in a matrix (2x2 or larger). The word can run from left to right or from top to bottom (no other directions). For example, if I am searching for ab in the matrix [['a','b'],['c','d']], the program should return the direction in which the word is written (across), the starting index (0), the ending index (2), and the index of the row or column (0).
My problem is that I have the idea of the recursion but I can't implement it. I tried to break the problem down into smaller problems, like searching for the word in a given row. I started by thinking of the smallest case, a 2x2 matrix: at the first row and column, I need to look one cell to the right and one cell below the first character, check whether they match my given string, and then give my recursive function a smaller problem with index+1. However, I can't think of what my function should return at the base case of the recursion. I have been trying to solve it for two days, and I can't turn what I think or draw into code.
Note that I can't use any loops. I would really appreciate it if someone could push me in the right direction. Thanks in advance.
Edit: more examples: for input of matrix : [['a','b','c'],['d','e','f'],['g','h','i']] the outputs are:
with the string ab : across,0,0,2
with the string be : down,1,0,2
with the string ghi: across,2,0,3
I assume that the word we are looking for could be found starting from any place but we can move up to down or left to right only.
In that case, you should have a function that takes a start index and a direction. Starting from the given index, it keeps moving in the given direction until it finds a mismatch or reaches the end of the word, and it returns True or False depending on whether the whole string matched.
Now you need to call this function for every index of the matrix with both directions, top to bottom and left to right. If the function returns True at any index, you have found your answer.
This is only the basic idea; how you optimize it from here is up to you.
Update:
To avoid using loops:
The other way I can think of is that the function now takes the row, the column, and the string to find. At each call, you first check whether the character at the given row and column matches the first character of the given string. If so, it makes two more calls, one in the right direction and one in the down direction, passing the string with the first character removed.
To cover every other starting cell of the matrix, you also call the function in the down and right directions with the exact same, unshortened string.
The base case is reaching the end of the string: then you have found the answer and return True; otherwise you return False.
One more thing to notice is that if any of the 4 function calls (two with the shortened string, two with the full string) returns True, then the call for the current row/column also returns True.
Cheers!
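As a rough, loop-free sketch of the first idea above (a matcher that walks in one fixed direction, tried from every cell), something like the following might work. The function names and the (direction, row or column index, start, end) tuple layout are my reading of the examples in the question, not a confirmed specification. Keeping the direction fixed inside matches means a word cannot bend around a corner.

```python
def matches(matrix, word, row, col, d_row, d_col):
    """True if word is spelled from (row, col) moving by (d_row, d_col)."""
    if not word:  # base case: every character has been matched
        return True
    if row >= len(matrix) or col >= len(matrix[0]):
        return False  # ran off the grid before the word ended
    if matrix[row][col] != word[0]:
        return False
    return matches(matrix, word[1:], row + d_row, col + d_col, d_row, d_col)


def search(matrix, word, pos=0):
    """Try each cell in row-major order, recursing on pos instead of looping."""
    cols = len(matrix[0])
    if pos == len(matrix) * cols:
        return None  # every cell tried, word not found
    row, col = divmod(pos, cols)
    if matches(matrix, word, row, col, 0, 1):
        return ("across", row, col, col + len(word))
    if matches(matrix, word, row, col, 1, 0):
        return ("down", col, row, row + len(word))
    return search(matrix, word, pos + 1)


grid = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]
print(search(grid, "ab"), search(grid, "be"), search(grid, "ghi"))
```

On the 3x3 example from the question this gives across,0,0,2 for ab, down,1,0,2 for be, and across,2,0,3 for ghi. For large matrices the recursion depth grows with the number of cells, so Python's recursion limit may become a concern.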

trying to generate math questions, comparisons not working

Can anyone help me? I'm pretty new to Python and I'm trying to generate 10 files, each with increasingly harder questions. This code is for difficulty 2. I don't want the answers in difficulty 2 to be negative, so whenever I get a second number bigger than the first I swap the two. For some reason, some of them still come out with the second number bigger than the first. I added the "its less than" print statements for testing, and it will detect the fact that it's less than but won't do anything about it.
Your issue is that you're casting your random numbers to a string before comparing their mathematical values. You need to compare them as integers then cast them to strings.
I believe this is because you are comparing two strings, not two integers. Strings are compared character by character, which gives wrong results for this kind of program.
num1 = str(r.choice(numbers))
num2 = str(r.choice(numbers))
Here you are storing strings, not integers,
and then below this you are checking if num1 <= num2.
Compare them as integers first, convert them to strings afterwards, and your code should work.
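The original code is not shown in full, so the following is only a sketch of the suggested fix under assumptions: the pool numbers and the function make_question are made up for illustration. It first shows why the string comparison misbehaves, then compares as integers and converts to strings only when building the question text.

```python
import random as r

# Strings compare character by character, so "9" sorts after "10".
print("9" <= "10")  # False
print(9 <= 10)      # True

numbers = list(range(1, 21))  # hypothetical pool of operands


def make_question():
    # keep the operands as integers while comparing and swapping
    num1 = r.choice(numbers)
    num2 = r.choice(numbers)
    if num1 < num2:
        num1, num2 = num2, num1
    # convert to strings only when building the text
    return str(num1) + " - " + str(num2) + " = ", num1 - num2


print(make_question())
```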

Why can't a long integer be converted to an integer when inside a Python list?

I have seen many posts here that give ways of removing the trailing L from a list of Python long integers.
The most proposed way is
print map(int,list)
However this seems not to work always.
Example---
A=[4198400644L, 3764083286L, 2895448686L, 1158179486, 2316359001L]
print map(int,A)
The above code gives the same result as the input.
I have noticed that the map method doesn't work whenever the number preceding the L is large, and only when the numbers are in a list. For example, applying int() to 4198400644L outside a list does give the number without the L.
Why does this occur and, more importantly, how can I overcome it?
I think I really need to remove this L, because this is a small part of a program where I need to multiply some integer from this list A by some integer from a list of non-long integers, and this L is getting in the way. I could of course convert the long integers into strings, remove the L and convert them back to integers. But is there another way?
I am still using the now outdated Python 2.7.
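Python 2 has two different kinds of integers. The int type is used for those that fit into a C long, which on your platform is 32 bits, or -0x80000000 to 0x7fffffff. The long type is for anything outside that range, as most of your examples are (1158179486 fits, which is why it carries no L). Calling int() on a value outside that range simply returns a long again, so map(int, A) leaves the list unchanged. The difference is marked with the L appended to the number, but only when you use repr(n), as is done automatically when the number is part of a list; printing a single number uses str(n), which omits the L. The L is only part of the display: arithmetic that mixes int and long values works as usual, so you don't need to remove it before multiplying.
In Python 3 they realized that this difference was arbitrary and unnecessary. Any int can be as large as you want, and long is no longer a type. You won't see repr put the trailing L on any numbers no matter how large, and adding it yourself on a constant is a syntax error.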
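To illustrate the Python 3 behaviour described above, here is a brief sketch using the values from the question (written without the L suffix, which Python 3 rejects). The second list B is a made-up stand-in for the asker's list of small integers.

```python
A = [4198400644, 3764083286, 2895448686, 1158179486, 2316359001]
B = [2, 3, 5]  # hypothetical list of small integers

# every value is a plain int, however large, and repr() adds no suffix
print(all(type(a) is int for a in A))
print(repr(A))

# large and small integers multiply without any conversion step
products = [a * b for a in A for b in B]
print(products[:3])
```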

How can you parallelize a regex search of one long string? [duplicate]

This question already has answers here:
How can I tell if a string repeats itself in Python?
(13 answers)
Closed 7 years ago.
I'm testing the output of a simulation to see if it enters a loop at some point, so I need to know if the output repeats itself. For example, there may be 400 digits, followed by a 400000 digit cycle. The output consists only of digits from 0-9. I have the following regex function that I'm using to match repetitions in a single long string:
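import re

def repetitions(s):
    # (.+?) lazily captures a unit; \1+ requires it to repeat right after itself
    r = re.compile(r"(.+?)\1+")
    for match in r.finditer(s):
        # keep units longer than one character that occur more than 4 times in a row
        if len(match.group(1)) > 1 and len(match.group(0))/len(match.group(1)) > 4:
            yield (match.group(1), len(match.group(0))/len(match.group(1)))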
This function works fantastically, but it takes far too long. My most recent test was 4 million digits, and it took 4.5 hours to search. It found no repetitions, so I now need to increase the search space. The code only concerns itself with subsequences that repeat themselves more than 4 times because I'm considering 5 repetitions to give a set that can be checked manually: the simulation will generate subsequences that will repeat hundreds of times. I'm running on a four core machine, and the digits to be checked are generated in real time. How can I increase the speed of the search?
Based on information given by nhahtdh in one of the other answers, some things have come to light.
First, the problem you are posing is called finding "tandem repeats" or "squares".
Second, the algorithm given in http://csiflabs.cs.ucdavis.edu/~gusfield/lineartime.pdf finds z tandem repeats in O(n log n + z) time and is "optimal" in the sense that there can be that many answers. You may be able to parallelize the tandem searches, but I'd first time the simple-minded approach and divide by 4 to see if that is in the speed range you expect.
Also, in order to use this approach you are going to need O(n) space to store the suffix tree it builds. So if you have on the order of 400,000 digits, you are going to need on the order of 400,000 steps to build and on the order of 400,000 bytes to store this suffix tree.
I am not totally sure what is meant by searching in "real time"; I usually think of it as a hard limit on how long an operation can take. If that's the case, then that's not going to happen here. This algorithm needs to read in the entire input string and process it before you start to get results. In that sense, it is what's called an "off-line" algorithm.
http://web.cs.ucdavis.edu/~gusfield/strmat.html has C code that you can download. (In tar file strmat.tar.gz look for repeats_tandem.c and repeats_tandem.h).
In light of the above, if that algorithm isn't sufficiently fast or space efficient, I'd look for ways to change or narrow the problem. Maybe you only need a fixed number of answers (e.g. up to 5)? If the cycles are a result of executing statements in a program, given that programming languages (other than assembler) don't have arbitrary "goto" statements, it's possible that this narrows the kinds of cycles that can occur, and making use of that structure might offer a way to speed things up.
When one algorithm is too slow, switch algorithms.
If you are looking for repeating strings, you might consider using a suffix tree scheme: https://en.wikipedia.org/wiki/Suffix_tree
This will find common substrings for you in linear time.
EDIT: @nhahtdh in a comment below has referenced a paper that tells you how to pick out all z tandem repeats very quickly. If somebody upvotes my answer, @nhahtdh should logically get some of the credit.
I haven't tried it, but I'd guess that you might be able to parallelize the construction of the suffix tree itself.
I'm sure there's room for optimization, but test this algorithm on shorter strings to see how it compares to your current solution:
def partial_repeat(string):
    l = len(string)
    for i in range(2, l//2+1):
        s = string[0:i]
        multi = l//i-1
        factor = l//(i-1)
        ls = len(s)
        if s*(multi) == string[:ls*(multi)] and len(string)-len(string[:ls*factor]) <= ls and s*2 in string:
            return s
>>> test_string
'abc1231231231231'
>>> results = {x for x in (partial_repeat(test_string[i:]) for i in range(len(test_string))) if x}
>>> sorted(sorted(results, key=test_string.index), key=test_string.count, reverse=True)[0]
'123'
In this test string, it's unclear whether the non-repeating initial characters are 'abc' or 'abc1', so the repeating string could be either '123' or '231'. The above sorts each found substring by its earliest appearance in the test string, sorts again (sorted() is a stable sort) by the highest frequency, and takes the top result.
With standard loops and min() instead of comprehensions and sorted():
>>> g = {partial_repeat(test_string[i:]) for i in range(len(test_string))}
>>> results = set()
>>> for x in g:
...     if x and (not results or test_string.count(x) >= min(map(test_string.count, results))):
...         results.add(x)
...
>>> min(results, key=test_string.index)
'123'
I tested these solutions with the test string 'abc123123a' multiplied by (n for n in range(100, 10101, 500)) to get some timing data. I entered these data into Excel and used its FORECAST() function to estimate the processing time of a 4-million character string at 430 seconds, or about seven minutes.
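The same kind of measurement and estimate can be done without a spreadsheet. The sketch below is an assumption-laden illustration, not the procedure actually used for the figures above: the helper names are made up, and the straight-line fit (which is what Excel's FORECAST() performs) only gives a sensible estimate if the run time really grows about linearly with the input length.

```python
import time


def time_on_sizes(func, base, multipliers):
    """Return (input_length, seconds) pairs for func(base * n)."""
    timings = []
    for n in multipliers:
        data = base * n
        start = time.perf_counter()
        func(data)
        timings.append((len(data), time.perf_counter() - start))
    return timings


def fit_line(points):
    """Least-squares line through (x, y) points; returns (slope, intercept)."""
    count = len(points)
    mean_x = sum(x for x, _ in points) / count
    mean_y = sum(y for _, y in points) / count
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return slope, mean_y - slope * mean_x


def forecast(points, x):
    """Linear extrapolation of the fitted line to x."""
    slope, intercept = fit_line(points)
    return intercept + slope * x


# usage sketch: pass the function under test instead of sorted
samples = time_on_sizes(sorted, 'abc123123a', range(100, 1101, 500))
print(forecast(samples, 4000000))
```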
