Recovering a linked list - Python

I have a linked list that has been stored out of order in an array; the information about the original order is preserved by storing, with each element, the index of the next element.
For example,
[c;3][b;0][a;1][d;4]
Here [c;3] means that c is followed by d (stored at index 3); [b;0] means that b is followed by c (stored at index 0), and so on. The out-of-bounds index 4 in [d;4] means that d is the last element.
I am looking for an algorithm to extract the original linked list order, abcd in the example, from such an array.
The last element always comes last (already at the correct place) and the algorithm may use this fact.
To clarify the question, let me reformulate it in terms of Python data structures.
I have a list of 2-tuples where the second element in each tuple is an integer that defines the traversal order through the list. The value of the integer is the index of the next tuple to be traversed. For example, given a list
[('c', 3), ('b', 0), ('a', 1), ('d', 4)]
the traversal order is
[2, 1, 0, 3]
or ('a', 1) -> ('b', 0) -> ('c', 3) -> ('d', 4). How can I write a function that, given a list of 2-tuples as described above, finds the traversal order?
Here is a possible Python solution:
def order(x):
    nexts = [n for _, n in x]
    prevs = [-1] * (len(x) + 1)  # prevs[n] = index of the element whose successor is stored at n
    for i, n in enumerate(nexts):
        prevs[n] = i
    trav = []
    i = prevs[-1]  # the element pointing one past the end is the tail
    for _ in x:
        trav.append(i)
        i = prevs[i]
    return trav[::-1]
Given the example data
>>> a = [('c', 3), ('b', 0), ('a', 1), ('d', 4)]
this function produces the expected result
>>> order(a)
[2, 1, 0, 3]
>>> [a[i] for i in order(a)]
[('a', 1), ('b', 0), ('c', 3), ('d', 4)]
Is there a better solution?

The head is the element that has nothing pointing to it. You can walk through once and keep track of what has nothing pointing to it, then walk through again to print them out in linear time.
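A sketch of that two-pass idea (the helper name is made up for illustration): the first pass collects every index that is pointed to, which identifies the head; the second pass follows the chain.

```python
def traversal_order(pairs):
    # Pass 1: collect every index some element points to; the head is the
    # one storage index that nothing points to.
    pointed_to = {nxt for _, nxt in pairs}
    head = next(i for i in range(len(pairs)) if i not in pointed_to)
    # Pass 2: follow the next-index chain until it runs off the end.
    order = []
    i = head
    while 0 <= i < len(pairs):
        order.append(i)
        i = pairs[i][1]
    return order
```

On the question's example this yields [2, 1, 0, 3], matching the expected traversal order.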

What about sorting the array itself?
c is followed by the element stored at index 3, so swap c into index 2 (the slot just before index 3), and continue the same way for each element. But you'll have to keep track of each element's original index and its swapped/latest index.

Find the index of the head of the linked list in O(n) time and space by looking for whatever element isn't pointed at.
Alternatively, sort by the index pointed at, in ascending order, treating NULL as greater than any index. This can be done in O(n log n) time and O(1) space; you could also use a linear-time sort, since the indices are well bounded. There may be in-place variants of linear sorts; that's worth checking.

So you have this [c;3][b;0][a;1][d;NULL] and you want to rearrange to [a][b][c][d].
It's given that the last item in the array contains the tail of the list. So you could build the list backward using an O(n^2) algorithm that's similar to selection sort.
The general idea, in pseudo code, is:
new_list = new array[a.length];
int pos = a.length - 1;     // position to fill in new_list
int target = a.length - 1;  // original index of the element just placed
new_list[pos] = a[a.length - 1];
while (pos > 0)
{
    for (j = 0; j < a.length - 1; j++)
    {
        if (a[j].index == target)  // a[j] is the predecessor of the element just placed
        {
            --pos;
            new_list[pos] = a[j];
            target = j;
            break;
        }
    }
}
That should work, although it's not very efficient. But it'd be fine for small lists if you don't call it very often.
There's a faster way that uses a dictionary and is just slightly more difficult to code. The idea is that you store the item's position as the key, and the position of its predecessor as the value. If you don't know the item's predecessor yet, you store -1. So for this example, you look at the first item, [c;3]. You don't know its predecessor, so you store {0: -1} in the dictionary. Then look in the dictionary to see if you already have an entry for 3. You don't, so you skip forward to the item at index 3 in the array, which is [d;NULL]. Its predecessor is 0, so you add {3: 0} to the dictionary.
At this point, d has no successor, so you go back to the last place you were in your sequential scan and go to the next item: [b;0]. You don't know its predecessor, so you store {1: -1} in the dictionary. This item's successor is at 0, which you already have in the dictionary. So you update that entry to {0: 1}, and proceed with your forward scan. You don't know the predecessor of [a;1], so you add {2: -1} to your dictionary. You already have an entry for 1 in your dictionary, so you update it to {1: 2}. You move forward to the last entry, see that 3 is already in your dictionary, and you're done.
Your dictionary contains:
{0: 1}, {3: 0}, {1: 2}, {2: -1}
Since you know that 3 is the end of the list, you start there in the dictionary and build the ordered sequence backwards by following the predecessor links. You know you're done when you reach the entry whose predecessor is -1.
Worst case, this touches each item three times: twice during the scan of the array and once when following the predecessor links in the dictionary, so it runs in O(n).
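A compact Python sketch of this predecessor-dictionary idea (the function name is made up; `pairs` is the `[(value, next_index), ...]` list from the question):

```python
def order_via_predecessors(pairs):
    # Build a position -> predecessor-position map in one pass.
    n = len(pairs)
    pred = {}
    for i, (_, nxt) in enumerate(pairs):
        pred.setdefault(i, -1)  # ensure every position has an entry (head stays -1)
        if nxt < n:
            pred[nxt] = i       # i is the predecessor of position nxt
    # Walk backwards from the tail (always the last array slot) to the head.
    order = []
    i = n - 1
    while i != -1:
        order.append(i)
        i = pred[i]
    return order[::-1]
```

For the example it returns [2, 1, 0, 3], the same traversal order as the question's `order` function.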

You have to first sort the letters and then find the numbers that go with them:
def order(tuples, index_of_value):
    aim = []
    # Loop through sorted letters
    l = [i[index_of_value] for i in tuples] # Find the nth item of each tuple
    for i in sorted(l):
        # Find the tuple that goes with it and append
        correct = [j for j in tuples if j[0] == i]
        first_num = correct[0][0]
        second_num = correct[0][1]
        aim.append((first_num, second_num)) # Turn back into a tuple and append
    return aim

a = [('c', 3), ('b', 0), ('a', 1), ('d', 4)]
print(order(a, 0))
>>> [('a', 1), ('b', 0), ('c', 3), ('d', 4)]

In order to have a linked list, you must have a pointer to its head (beginning). To find the head in this situation, iterate over the entire array once until you know which node is not pointed to by any other node. Then all you need to do is follow the pointer at each node until you reach NULL.

Related

Python, Make variable equal to the second column of an array

I realise that there's a fair chance this has been asked somewhere else, but to be honest I'm not sure exactly what terminology I should be using to search for it.
But basically I've got a list with a varying number of elements. Each element contains 3 values: a string, another list, and an integer, e.g.:
First element = ('A', [], 0)
so
ListofElements[0] = ('A', [], 0)
And what I am trying to do is make a new list that consists of all of the integers (the third value in each element) given in ListofElements.
I can do this already by stepping through each element of ListofElements and then appending the integer onto the new list shown here:
NewList = []
for element in ListofElements:
    NewList.append(element[2])
But using a for loop seems like the most basic way of doing it, is there a way that uses less code? Maybe a list comprehension or something such as that. It seems like something that should be able to be done on a single line.
That is just a step in my ultimate goal, which is to find out the index of the element in ListofElements that has the minimum integer value. So my process so far is to make a new list, and then find the integer index of that new list using:
index=NewList.index(min(NewList))
Is there a way that I can just avoid making the new list entirely and generate the index straight away from the original ListofElements? I got stuck on what I would need to fill in here, or how I would iterate through:
min(ListofElements[?][2])
You can use a list comprehension:
[x[2] for x in ListOfElements]
This is generally considered a "Pythonic" approach.
You can also find the minimum in a rather stylish manner using:
minimum = min(ListOfElements, key=lambda x: x[2])
index = ListOfElements.index(minimum)
Some general notes:
In Python, underscores (snake_case) are the standard rather than CamelCase.
In Python you rarely need an explicit for loop. Instead, prefer comprehensions or functional patterns (map, filter, etc.).
You can map your list with itemgetter:
>>> from operator import itemgetter
>>> l = [(1, 2, 3), (1, 2, 3), (1, 2, 3), (1, 2, 3), (1, 2, 3)]
>>> map(itemgetter(2), l)
[3, 3, 3, 3, 3]
Then you can go with your approach to find the position of minimum value.
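Putting the two ideas together, a sketch that gets the index of the minimum directly, without building the intermediate list (the sample data here is hypothetical):

```python
list_of_elements = [('A', [], 5), ('B', [], 2), ('C', [], 9)]  # hypothetical data

# min over (index, element) pairs, keyed by each element's third field,
# yields both the winning index and the element in a single pass.
index, element = min(enumerate(list_of_elements), key=lambda pair: pair[1][2])
```

Here `index` is 1 and `element` is `('B', [], 2)`, since 2 is the smallest third value.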

What is the inverse function of itertools.izip in python?

I saw this and this question, and I'd like to have the same effect, only done efficiently with itertools.izip.
From itertools.izip's documentation:
Like zip() except that it returns an iterator instead of a list
I need an iterator because I can't fit all values to memory so instead I'm using a generator and iterating over the values.
More specifically, I have a generator that generates three-value tuples, and instead of iterating over it I'd like to feed three lists of values to three functions, where each list represents a single position in the tuple.
Of those three tuple values, only one has big items in it (memory-consumption-wise; let's call it data), while the other two contain only values that require little memory to hold, so iterating over the data value's "list of values" first should work for me by consuming the data values one by one and caching the small ones.
I can't think of a smart way to generate one "list of values" at a time, because I might decide to remove instances of a three-value tuple occasionally, depending on the big value of the tuple.
Using the widely suggested zip solution, similar to:
>>> zip(*[('a', 1), ('b', 2), ('c', 3), ('d', 4)])
[('a', 'b', 'c', 'd'), (1, 2, 3, 4)]
Results in the "unpacking argument list" part (*[...]) triggering a full iteration over the entire iterator and (I assume) caching all results in memory, which is, as I said, an issue for me.
I can build a mask list (True/False for small values to keep), but I'm looking for a cleaner more pythonic way. If all else fails, I'll do that.
What's wrong with a traditional loop?
>>> def gen():
...     yield 'first', 0, 1
...     yield 'second', 2, 3
...     yield 'third', 4, 5
...
>>> numbers = []
>>> for data, num1, num2 in gen():
...     print data
...     numbers.append((num1, num2))
...
first
second
third
>>> numbers
[(0, 1), (2, 3), (4, 5)]
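If you really want izip-inverse-style per-position iterators rather than a loop, one sketch uses itertools.tee. Note that tee buffers whatever one iterator has consumed ahead of the others, so the memory savings only hold if the three streams are consumed roughly in step:

```python
import itertools

def gen():
    yield 'first', 0, 1
    yield 'second', 2, 3
    yield 'third', 4, 5

# Three independent iterators over the same generator, one per tuple position.
it1, it2, it3 = itertools.tee(gen(), 3)
col1 = (t[0] for t in it1)
col2 = (t[1] for t in it2)
col3 = (t[2] for t in it3)
```

Each of col1, col2, col3 can then be fed to its own consumer as a lazy "list of values".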

Python cartesian product of n lists with n unknown at coding time

Question
what is the best way to generate a cartesian product of some lists, not knowing in advance how many lists there are?
You can stop reading here if you like.
Background
I don't have money for school so I am trying to teach myself some programming
using the Internet whilst working night shifts at a highway tollbooth. I have
decided to try to solve some "programming challenge" problems as an exercise.
Programming assignment
Here's the problem I am trying to tackle, property of TopCoder:
http://community.topcoder.com/stat?c=problem_statement&pm=3496
I will not copy and paste the full description to respect their copyright notice
but I am assuming I can summarise it, provided I don't use pieces of it verbatim
(IANAL though).
Summary
If a "weighted sum" of historical stock prices is the sum of addenda obtained
by multiplying a subset of these prices by an equal number of "weighting"
factors, provided the latter add up to 1.0 and are chosen from the given set
of valid values [-1.0, -0.9, ..., 0.9, 1.0], use this formula on all
historical data supplied as an argument to your function, examining 5 prices
at a time, predicting the next price and returning the permutation of "weighting
factors" that yields the lowest average prediction error. There will be at least
6 stock prices in each run so at least one prediction is guaranteed, final
results should be accurate within 1E-9.
Test data
Format:
One row for input data, in list format
One row for the expected result
One empty row as a spacer
Download from:
http://paste.ubuntu.com/1229857/
My solution
import itertools

# For a permutation of factors to be used in a weighted sum, it should be chosen
# such that the sum of all factors is 1.
WEIGHTED_SUM_TOTAL = 1.0
FACTORS_CAN_BE_USED_IN_WEIGHTED_SUM = lambda x: sum(x) == WEIGHTED_SUM_TOTAL

# Historical stock price data should be examined using a sliding window of width
# 5 when making predictions about the next price.
N_RECENT_PRICES = 5

# Valid values for weighting factors are: [-1.0, -0.9, ..., 0.9, 1.0]
VALID_WEIGHTS = [x / 10. for x in range(-10, 11)]

# A pre-calculated list of valid weightings to consider. This is the cartesian
# product of the set of valid weights, considering only the combinations which
# are valid as components of a weighted sum.
CARTESIAN_PRODUCT_FACTORS = [VALID_WEIGHTS] * N_RECENT_PRICES
ALL_PERMUTATIONS_OF_WEIGHTS = itertools.product(*CARTESIAN_PRODUCT_FACTORS)
WEIGHTED_SUM_WEIGHTS = filter(FACTORS_CAN_BE_USED_IN_WEIGHTED_SUM,
                              ALL_PERMUTATIONS_OF_WEIGHTS)

# Generator function to get sliding windows of a given width from a data set
def sliding_windows(data, window_width):
    for i in range(len(data) - window_width):
        yield data[i:i + window_width], data[i + window_width]

def avg_error(data):
    # The supplied data will guarantee at least one iteration
    n_iterations = len(data) - 5
    best_average_error = None
    # Consider each valid weighting (i.e. permutation of weights)
    for weighting in WEIGHTED_SUM_WEIGHTS:
        # Keep track of the prediction errors for this weighting
        errors_for_this_weighting = []
        for historical_data, next_to_predict in sliding_windows(data,
                                                                N_RECENT_PRICES):
            prediction = sum([a * b for a, b in zip(weighting, historical_data)])
            errors_for_this_weighting.append(abs(next_to_predict - prediction))
        average_error = sum(errors_for_this_weighting) / n_iterations
        if average_error == 0: return average_error
        best_average_error = (average_error if not best_average_error else
                              min(average_error, best_average_error))
    return best_average_error

def main():
    with open('data.txt') as input_file:
        while True:
            data = eval(input_file.readline())
            expected_result = eval(input_file.readline())
            spacer = input_file.readline()
            if not spacer:
                break
            result = avg_error(data)
            print expected_result, result, (expected_result - result) < 1e-9

if __name__ == '__main__':
    main()
My question
I am not asking for a code review of my solution because this would be the wrong StackExchange forum for that. I would post my solution to "Code Review" in that case.
My question instead is small, precise and unambiguous, fitting this site's format (hopefully).
In my code I generate a cartesian product of lists using itertools. In essence, I do not solve the crux of the problem myself but delegate the solution to a library that does that for me. I think this is the wrong approach to take if I want to learn from doing these exercises. I should be doing the hard part myself, otherwise why do the exercise at all? So I would like to ask you:
what is the best way to generate a cartesian product of some lists, not knowing in advance how many lists there are?
That's all I'd like to know, you can critique my code if you like. That's welcome, even if it passes all the tests (there is always a better way of doing things, especially if you are a beginner like me) but for this question to be "just right" for SO, I am focussing on just one aspect of the code, a concrete problem I have and something I am not happy with. Let me tell you more, I'll also share the canonical "what have you tried already"...
Clearly if I knew the number of lists I could just type in some nested for loops, like the top solvers of this exercise did in the competition. I tried writing a function that does this for an unknown number of lists but I was not sure which approach to take. The first approach was to write a recursive function: from list 1, take element 1 and combine it with element 1 of list 2, then element 1 of list 3, etc. I would push the elements from each "layer" onto a stack and pop them upon reaching the desired depth. I imagine I would not need to fear a "stack overflow" because the reachable depth would be reasonable.

I then struggled to choose a data structure to do this in the most efficient (memory/space) way possible, without passing too many parameters to the recursive calls. Should the data structure exist outside of the calls? Be passed around in the calls? Could I achieve any level of parallelism? How?

With so many questions and so few answers I realised I needed to just know more to solve this problem, and I could use a nudge in the right direction. You could provide a code snippet and I would study it, or just explain to me what the right "Computer Science" way of handling this type of problem is. I am sure there is something I am not considering.
Finally, the thing that I did consider in my solution above is that thankfully filter filters a generator so the full cartesian product is never kept in memory (as it would if I did a list(ALL_PERMUTATIONS_OF_WEIGHTS) at any time in the code) so I am occupying space in memory only for those combinations which can actually be used as a weighted sum. A similar caution would be nice if applied to whatever system allows me to generate the cartesian product without using itertools.
Think of how numbers are written (in the decimal system, or in any other system). Include zero's even if you don't need them:
00000
00001
00002
...
00009
00010
00011
00012
...
99998
99999
You can see how this looks like a cartesian product of 5 lists list(range(10)) (in this particular case). You can generate this output very easily by incrementing the "lowest" digit, and when it reaches the last one in the list, setting it to the first element and incrementing the "next highest" digit. You would still need for loops, of course, but a very small number. Use a similar approach when you work with arbitrary number of arbitrary lists.
For example, if you have 3 lists: ['a', 'b', 'c'], ['x', 'y'], ['1', '2'], you'll get:
ax1
ax2
ay1
ay2
bx1
bx2
by1
by2
cx1
cx2
cy1
cy2
Good luck!
EDIT:
If you like, here's sample code that does this. I deliberately don't use recursion, just to show how simple this is. Recursion, of course, is also a great approach.
def lex_gen(bounds):
    elem = [0] * len(bounds)
    while True:
        yield elem
        i = 0
        while elem[i] == bounds[i] - 1:
            elem[i] = 0
            i += 1
            if i == len(bounds):
                return  # every digit rolled over, so we're done
                        # (PEP 479: don't raise StopIteration inside a generator)
        elem[i] += 1

def cart_product(lists):
    bounds = [len(lst) for lst in lists]
    for elem in lex_gen(bounds):
        yield [lists[i][elem[i]] for i in range(len(lists))]

for k in cart_product([['1', '2'], ['x', 'y'], ['a', 'b', 'c']]):
    print(k)
First, consider an n-list cartesian product. Let's take the first list, which we'll call L. Then we'll take the rest of the lists, which we'll call R. Then, for each item in L, prepend that to the beginning of each tuple yielded by the cartesian product of R.
With that, you can solve the problem just by implementing the cartesian product of no lists.
Here's a Haskell implementation, in case it helps you understand what I'm saying:
cartesian :: [[a]] -> [[a]]
cartesian [] = [[]]
cartesian (xs:yss) = [x : ys | x <- xs, ys <- cartesian yss]
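For reference, a direct Python translation of that Haskell (a sketch; the function name is just for illustration):

```python
def cartesian(lists):
    # Base case: the product of zero lists is a single empty combination.
    if not lists:
        return [[]]
    # Prepend each item of the first list (L) to every combination
    # produced by the cartesian product of the rest (R).
    return [[x] + rest for x in lists[0] for rest in cartesian(lists[1:])]
```

For example, `cartesian([[1, 2], [3, 4]])` gives `[[1, 3], [1, 4], [2, 3], [2, 4]]`.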
Classically, Cartesian coordinates are (x,y) in the plane or (x,y,z) in 3D-space (for x, y and z in the real numbers):
[ (x,y) for x in reals for y in reals ]
More generally they are tuples (as a Python list comprehension):
[ (x1, x2, x3, ...) for x1 in X1 for x2 in X2 for x3 in X3 ...]
For objects (iterables in our case) X1, X2, X3, ..., what we would like is a function:
def cartesian_product(X1, X2, X3, ...):
    return # the above list
One way to do this is to use recursion, taking care to always return tuples:
def cartesian_product(*X):
    if len(X) == 1: # special case, only X1
        return [ (x0,) for x0 in X[0] ]
    else:
        return [ (x0,) + t1 for x0 in X[0] for t1 in cartesian_product(*X[1:]) ]

cartesian_product([1,2],[3,4],[5,6])
# [(1, 3, 5), (1, 3, 6), (1, 4, 5), (1, 4, 6), (2, 3, 5), (2, 3, 6), (2, 4, 5), (2, 4, 6)]
Here's a favorite (and pedagogically decent, I hope) way of implementing the cartesian product in terms of reduce, translated from a Perl version that I wrote some time ago:
from functools import reduce  # needed on Python 3, where reduce is no longer a builtin

def cartesian_product(*X):
    return reduce(
        lambda accum, lst:
            [ tup + (item,) for tup in accum for item in lst ],
        X,
        [()]
    )
It's similar to hayden's answer, except that it uses reduce instead of explicit recursion, which I think makes the base case a lot clearer.
What we're reducing here is a list of tuples (the accumulated output, accum) against a list of items (lst). For every item in the list of items, we concatenate it to the end of all of the accumulated tuples, and repeat this process for as many lists (X) as there are. The reduce initializer is [()], a list containing one empty tuple, which ensures that if X[0] is [1, 2, 3] the accumulator will become [(1,), (2,), (3,)] after the first step (one-tuples because we want each item in X[0] once, each concatenated onto the empty tuple, i.e. onto nothing). This corresponds to the "nullary product" mentioned by senderle in a comment to icktoofay's answer.
Given this function definition, if you print cartesian_product([1,2], [3,4], [5,6]) it will print:
[(1, 3, 5), (1, 3, 6), (1, 4, 5), (1, 4, 6), (2, 3, 5), (2, 3, 6), (2, 4, 5), (2, 4, 6)]
which are the 8 tuples we expected.
Itertools to the rescue. The following will create combinations as they are used one-by-one:
import itertools
combs=itertools.product(*lists)
E.g. using command-line Python, and assuming you have a list of lists of variable length:
>>> c=[['3', '5', '7'], ['100'], ['1', '2', '3']]
>>> z=itertools.product(*c)
>>> for ii in z:
...     print ii
...
('3', '100', '1')
('3', '100', '2')
('3', '100', '3')
('5', '100', '1')
('5', '100', '2')
('5', '100', '3')
('7', '100', '1')
('7', '100', '2')
('7', '100', '3')

Flattening nested loops / decreasing complexity - complementary pairs counting algorithm

I was recently trying to solve a task in Python, and I found a solution that seems to have a complexity of O(n log n), but I believe it is very inefficient for some inputs (such as the first parameter being 0 and the values being a very long list of zeros).
It also has three levels of for loops. I believe it can be optimized, but at the moment I cannot optimize it further; I am probably just missing something obvious ;)
So, basically, the problem is as follows:
Given list of integers (values), the function needs to return the number of indexes' pairs that meet the following criteria:
let's assume a single index pair is a tuple like (index1, index2);
then the pair counts if values[index1] == complementary_diff - values[index2] is true.
Example:
If given a list like [1, 3, -4, 0, -3, 5] as values and 1 as complementary_diff, the function should return 4 (which is the length of the following list of indexes' pairs: [(0, 3), (2, 5), (3, 0), (5, 2)]).
This is what I have so far; it should work perfectly most of the time, but - as I said - in some cases it can run very slowly, despite its approximate complexity of O(n log n) (the pessimistic complexity looks like O(n^2)).
def complementary_pairs_number(complementary_diff, values):
    value_key = {} # dictionary storing indexes indexed by values
    for index, item in enumerate(values):
        try:
            value_key[item].append(index)
        except KeyError: # the item has not been found in value_key's keys
            value_key[item] = [index]
    key_pairs = set() # key pairs are unique by nature
    for pos_value in value_key: # iterate through keys of value_key dictionary
        sym_value = complementary_diff - pos_value
        if sym_value in value_key: # checks if the symmetric value has been found
            for i1 in value_key[pos_value]: # iterate through pos_values' indexes
                for i2 in value_key[sym_value]: # as above, through sym_values
                    # add indexes' pairs or ignore if already added to the set
                    key_pairs.add((i1, i2))
                    key_pairs.add((i2, i1))
    return len(key_pairs)
For the given example it behaves like that:
>>> complementary_pairs_number(1, [1, 3, -4, 0, -3, 5])
4
If you see how the code could be "flattened" or "simplified", please let me know.
I am not sure if just checking for complementary_diff == 0 etc. is the best approach - if you think it is, please let me know.
EDIT: I have corrected the example (thanks, unutbu!).
I think this improves the complexity to O(n):

value_key.setdefault(item, []).append(index) is faster than using the try..except blocks. It is also faster than using a collections.defaultdict(list). (I tested this with ipython %timeit.)

The original code visits every solution twice. For each pos_value in value_key, there is a unique sym_value associated with pos_value. There are solutions when sym_value is also in value_key. But when we iterate over the keys in value_key, pos_value is eventually assigned the value of sym_value, which makes the code repeat a calculation it has already done. So you can cut the work in half if you can stop pos_value from equaling the old sym_value. I implemented that with a seen = set() to keep track of seen sym_values.

The code only cares about len(key_pairs), not the key_pairs themselves. So instead of keeping track of the pairs (with a set), we can simply keep track of the count (with num_pairs). So we can replace the two inner for-loops with

num_pairs += 2*len(value_key[pos_value])*len(value_key[sym_value])

or half that in the "unique diagonal" case, pos_value == sym_value.
def complementary_pairs_number(complementary_diff, values):
    value_key = {} # dictionary storing indexes indexed by values
    for index, item in enumerate(values):
        value_key.setdefault(item, []).append(index)
    # print(value_key)
    num_pairs = 0
    seen = set()
    for pos_value in value_key:
        if pos_value in seen: continue
        sym_value = complementary_diff - pos_value
        seen.add(sym_value)
        if sym_value in value_key:
            # print(pos_value, sym_value, value_key[pos_value], value_key[sym_value])
            n = len(value_key[pos_value]) * len(value_key[sym_value])
            if pos_value == sym_value:
                num_pairs += n
            else:
                num_pairs += 2*n
    return num_pairs
You may want to look into functional programming idioms, such as reduce, etc.
Oftentimes, nested array logic can be simplified by using functions like reduce, map, reject, etc.
For an example (in javascript) check out underscore js. I'm not terribly smart at Python, so I don't know which libraries they have available.
I think (some or all of) these would help, but I'm not sure how I would prove it yet.
1) Take values and reduce it to a distinct set of values, recording the count of each element (O(n))
2) Sort the resulting array (O(n log n)).
3) If you can allocate lots of memory, I guess you might be able to populate a sparse array with the values - so if the range of values is -100 : +100, allocate an array of [201] and any value that exists in the reduced set pops a one at the value index in the large sparse array.
4) Any value that you want to check if it meets your condition now has to look at the index in the sparse array according to the x - y relationship and see if a value exists there.
5) as unutbu pointed out, it's trivially symmetric, so if {a,b} is a pair, so is {b,a}.
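A rough sketch of steps 1, 2 and 5 above, using binary search over the sorted distinct values in place of the sparse array (this is an interpretation of the ideas, not the answerer's code; it counts ordered pairs, including self-pairs, to match the question's convention):

```python
from bisect import bisect_left
from collections import Counter

def complementary_pairs_number(diff, values):
    counts = Counter(values)       # step 1: distinct values with their counts, O(n)
    distinct = sorted(counts)      # step 2: sort the k distinct values, O(k log k)
    total = 0
    for v in distinct:
        w = diff - v
        i = bisect_left(distinct, w)               # step 4: binary search for complement
        if i < len(distinct) and distinct[i] == w:
            total += counts[v] * counts[w]         # step 5: symmetry gives ordered pairs
    return total
```

On the question's example, `complementary_pairs_number(1, [1, 3, -4, 0, -3, 5])` returns 4, matching the original function.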
I think you can improve this by separating out the algebra part from the search and using smarter data structures.
Go through the list and, for each item, subtract it from the complementary diff:
resultlist[index] = complementary_diff - originallist[index]
You can use either a map or a simple loop. This takes O(n) time.
See if the number in the resulting list exists in the original list.
Here, with a naive list, you would actually get O(n^2), because you can end up searching for the whole original list per item in the resulting list.
However, there are smarter ways to organize your data than this. If you have the original list sorted, your search time reduces to O(n log n + n log n) = O(n log n): n log n for the sort, and log n per element for the binary searches.
If you wanted to be even smarter you can turn your list into a dictionary (or hash table), and then this step becomes O(n + n) = O(n): n to build the dictionary and O(1) * n to search each element in the dictionary. (EDIT: since you cannot assume the uniqueness of each value in the original list, you might want to keep count of how many times each value appears in the original list.)
So with this now you get O(n) total runtime.
Using your example:
1, [1, 3, -4, 0, -3, 5],
Generate the result list:
>>> resultlist
[0, -2, 5, 1, 4, -4].
Now we search:
Flatten out the original list into a dictionary. I chose to use the original list's index as the value as that seems like a side data you're interested in.
>>> original_table
{1: 0, 3: 1, -4: 2, 0: 3, -3: 4, 5: 5}
For each element in the result list, search in the hash table and make the tuple:
(resultlist_index, original_table[resultlist[resultlist_index]])
This should look like the example solution you had.
Now you just find the length of the resulting list of tuples.
Now here's the code:
example_diff = 1
example_values = [1, 3, -4, 0, -3, 5]
example2_diff = 1
example2_values = [1, 0, 1]

def complementary_pairs_number(complementary_diff, values):
    """
    Given an integer complement and a list of values, count how many
    complementary pairs there are in the list.
    """
    print "Input:", complementary_diff, values
    # Step 1. Result list
    resultlist = [complementary_diff - value for value in values]
    print "Result List:", resultlist
    # Step 2. Flatten into dictionary
    original_table = {}
    for original_index in xrange(len(values)):
        if values[original_index] in original_table:
            original_table[values[original_index]].append(original_index)
        else:
            original_table[values[original_index]] = [original_index]
    print "Flattened dictionary:", original_table
    # Step 2.5 Search through dictionary and count up the resulting pairs.
    pair_count = 0
    for resultlist_index in xrange(len(resultlist)):
        if resultlist[resultlist_index] in original_table:
            pair_count += len(original_table[resultlist[resultlist_index]])
    print "Complementary Pair Count:", pair_count
    # (Optional) Step 2.5 Search through dictionary and create the complementary
    # pairs themselves. Adds O(n^2) complexity in the worst case.
    pairs = []
    for resultlist_index in xrange(len(resultlist)):
        if resultlist[resultlist_index] in original_table:
            pairs += [(resultlist_index, original_index) for original_index in
                      original_table[resultlist[resultlist_index]]]
    print "Complementary Pair Indices:", pairs
    # Step 3
    return pair_count

if __name__ == "__main__":
    complementary_pairs_number(example_diff, example_values)
    complementary_pairs_number(example2_diff, example2_values)
Output:
$ python complementary.py
Input: 1 [1, 3, -4, 0, -3, 5]
Result List: [0, -2, 5, 1, 4, -4]
Flattened dictionary: {0: [3], 1: [0], 3: [1], 5: [5], -4: [2], -3: [4]}
Complementary Pair Count: 4
Complementary Pair Indices: [(0, 3), (2, 5), (3, 0), (5, 2)]
Input: 1 [1, 0, 1]
Result List: [0, 1, 0]
Flattened dictionary: {0: [1], 1: [0, 2]}
Complementary Pair Count: 4
Complementary Pair Indices: [(0, 1), (1, 0), (1, 2), (2, 1)]
Thanks!
Modified from the solution provided by @unutbu:
The problem can be reduced to comparing these 2 dictionaries:
values
pre-computed dictionary for (complementary_diff - values[i])
def complementary_pairs_number(complementary_diff, values):
    value_key = {} # dictionary storing indexes indexed by values
    for index, item in enumerate(values):
        value_key.setdefault(item, []).append(index)
    answer_key = {} # dictionary storing indexes indexed by (complementary_diff - values)
    for index, item in enumerate(values):
        answer_key.setdefault(complementary_diff - item, []).append(index)
    num_pairs = 0
    print(value_key)
    print(answer_key)
    for pos_value in value_key:
        if pos_value in answer_key:
            num_pairs += len(value_key[pos_value]) * len(answer_key[pos_value])
    return num_pairs

Populate a list in python

I have a series of Python tuples representing coordinates:
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
I want to create the following list:
l = []
for t in tuples:
    l[ t[0] ][ t[1] ] = something
I get an IndexError: list index out of range.
My background is in PHP and I expected that in Python you can create lists that start with index > 0, i.e. make gaps and then fill them up, but it seems you can't.
The idea is to have the lists sorted afterwards. I know I can do this with a dictionary, but as far as I know dictionaries cannot be sorted by keys.
Update: I now know they can - see the accepted solution.
Edit:
What I want to do is to create a 2D array that will represent the matrix described with the tuple coordinates, then iterate it in order.
If I use a dictionary, i have no guarantee that iterating over the keys will be in order -> (0,0) (0,1) (0,2) (1,0) (1,1) (1,2) (2,0) (2,1) (2,2)
Can anyone help?
No, you cannot create a list with gaps. But you can create a dictionary with tuple keys:
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
l = {}
for t in tuples:
l[t] = something
Update:
Try using NumPy, it provides a wide range of operations over matrices and arrays. Cite from the free PDF on NumPy available on the site (3.4.3 Flat Iterator indexing): "As mentioned previously, X.flat returns an iterator that will iterate over the entire array (in C-contiguous style with the last index varying the fastest)". Looks like what you need.
You should look at dicts for something like that.
l = {}
for t in tuples:
    if t[0] not in l:
        l[t[0]] = {}
    l[t[0]][t[1]] = something
Iterating over the dict is a bit different than iterating over a list, though. You'll have the keys(), values() and items() functions to help with that.
EDIT: try something like this for ordering:
for x in sorted(l.keys()):
    for y in sorted(l[x].keys()):
        print(l[x][y])
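Put together with the question's data, a runnable sketch of this nested-dict approach (using a placeholder string for something, which the question leaves unspecified):

```python
tuples = [(1, 1), (0, 1), (1, 0), (0, 0), (2, 1)]
something = "something"  # placeholder; the question doesn't say what this is

l = {}
for t in tuples:
    if t[0] not in l:  # modern spelling of has_key
        l[t[0]] = {}
    l[t[0]][t[1]] = something

# Visit the cells in row-major order, skipping gaps entirely.
for x in sorted(l):
    for y in sorted(l[x]):
        print(x, y, l[x][y])  # visits (0,0), (0,1), (1,0), (1,1), (2,1)
```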
You create a one-dimensional list l and want to use it as a two-dimensional list.
That's why you get an index error.
You have the following options:
create a map and use the tuple t as index:
l = {}
l[t] = something
and you will get entries in l as:
{(1, 1): something}
If you want a traditional array structure, I'd advise you to look at NumPy. With NumPy you get n-dimensional arrays with "traditional" indexing.
As mentioned, use NumPy:
with NumPy you can create a 2-dimensional array filled with zeros or ones or anything else.
Then you can set any desired value by indexing [x, y] as you wish.
Of course you can iterate over rows and columns, or the whole array as a list.
If you know the size beforehand, you can make a list of lists like this:
>>> x = 3
>>> y = 3
>>> l = [[None] * x for i in range(y)]
>>> l
[[None, None, None], [None, None, None], [None, None, None]]
Which you can then iterate like you originally suggested.
What do you mean exactly by "but as far as I know dictionaries cannot be sorted by keys"?
While this is not strictly the same as a "sorted dictionary", you can easily turn a dictionary into a list, sorted by the key, which seems to be what you're after:
>>> tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
>>> l = {}
>>> for t in tuples:
... l[t] = "something"
>>> sorted(l) # equivalent to sorted(l.keys())
[(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)]
>>> sorted(l.items()) # make a list of (key, value) tuples, and sort by key
[((0, 0), 'something'), ((0, 1), 'something'), ((1, 0), 'something'), ((1, 1), 'something'), ((2, 1), 'something')]
(I turned something into the string "something" just to make the code work)
To make use of this for your case however (if I understand it correctly, that is), you would still need to fill the dictionary with None values (or similar) for every "empty" coordinate tuple.
Extending Nathan's answer,
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
x = max(tuples, key = lambda z : z[0])[0] + 1
y = max(tuples, key = lambda z : z[1])[1] + 1
l = [[None] * y for i in range(x)]
And then you can do whatever you want
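For instance (a sketch, with a placeholder value since something is not defined in the question), filling the sized grid from the tuples and printing it:

```python
tuples = [(1, 1), (0, 1), (1, 0), (0, 0), (2, 1)]
x = max(tuples, key=lambda z: z[0])[0] + 1  # 3 rows
y = max(tuples, key=lambda z: z[1])[1] + 1  # 2 columns
l = [[None] * y for i in range(x)]

something = "something"  # placeholder value
for a, b in tuples:
    l[a][b] = something

print(l)
# [['something', 'something'], ['something', 'something'], [None, 'something']]
```

Cells never mentioned in tuples keep their None filler, which marks the gaps explicitly.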
As mentioned earlier, you can't make lists with gaps, and dictionaries may be the better choice here. The trick is to make sure that l[t[0]] exists when you put something in position t[1]. For this, I'd use a defaultdict.
import collections
tuples = [(1,1), (0,1), (1,0), (0,0), (2,1)]
l = collections.defaultdict(dict)
for t in tuples:
    l[t[0]][t[1]] = something
Since l is a defaultdict, if l[t[0]] doesn't exist, it will create an empty dict for you to put your something in at position t[1].
Note: this ends up being the same as #unwesen's answer, without the minor tedium of hand-checking for existence of the inner dict. Chalk it up to concurrent answering.
The dict solutions given are probably best for most purposes. For your issue of iterating over the keys in order, generally you would instead iterate over the coordinate space, not the dict keys, exactly the same way you would have for your list of lists. Use .get and you can specify the default value to use for the blank cells, or alternatively use collections.defaultdict to define a default at dict creation time, e.g.:
for y in range(10):
    for x in range(10):
        value = mydict.get((x, y), some_default_value)
        # or just "value = mydict[x, y]" if using a defaultdict
If you do need an actual list of lists, you can construct it directly as below:
max_x, max_y = map(max, zip(*tuples))
l = [[something if (x, y) in tuples else 0 for y in range(max_y + 1)]
     for x in range(max_x + 1)]
If the list of tuples is likely to be long, then for performance reasons you may want to use a set for the lookup, as "(x, y) in tuples" performs a scan of the list rather than a fast lookup by hash. That is, build the set first and test membership against it:
tuple_set = set(tuples)
l = [[something if (x, y) in tuple_set else 0 for y in range(max_y + 1)]
     for x in range(max_x + 1)]
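A quick run of the set-based version with the question's tuples, using Python 3's range and 1 as a stand-in for something (which the question leaves undefined):

```python
tuples = [(1, 1), (0, 1), (1, 0), (0, 0), (2, 1)]
something = 1  # stand-in value

max_x, max_y = map(max, zip(*tuples))  # max_x == 2, max_y == 1
tuple_set = set(tuples)                # O(1) membership tests
l = [[something if (x, y) in tuple_set else 0 for y in range(max_y + 1)]
     for x in range(max_x + 1)]

print(l)  # [[1, 1], [1, 1], [0, 1]]
```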
I think you have only declared a one-dimensional list, but are indexing it as if it were two-dimensional.
Perhaps you declared it as
l = [][]
Edit: That's a syntax error
>>> l = [][]
File "<stdin>", line 1
l = [][]
^
SyntaxError: invalid syntax
>>>
