I have a dictionary A and a possible key foo. I know that A[foo] should be equal to x, but I don't know whether A[foo] has already been defined. In any case, if A[foo] has been defined, it already has the correct value.
Is it faster to execute:
if foo not in A.keys():
A[foo]=x
or simply to update:
A[foo]=x
My reasoning is that by the time the interpreter has located the foo entry, it might as well update it, whereas with the membership test I would have to hit the hash table twice.
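To make the comparison concrete, here are the two alternatives as a minimal sketch (A, foo, and x stand in for the names above):
A = {}            # the dictionary in question
foo = "some_key"  # a placeholder key
x = 42            # the value that foo should map to
# option 1: check first, insert only if missing
if foo not in A:
    A[foo] = x
# option 2: assign unconditionally (overwrites with the same value)
A[foo] = x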
Thanks.
Just add items to the dictionary without checking for their existence. I added 100,000 items to a dictionary using 3 different methods and timed it with the timeit module.
if k not in d: d[k] = v
d.setdefault(k, v)
d[k] = v
Option 3 was the fastest, but not by much.
[ Actually, I also tried if k not in d.keys(): d[k] = v, but that was slower by a factor of 300 (each iteration built a list of keys and performed a linear search). It made my tests so slow that I left it out here. ]
Here's my code:
import timeit
setup = """
import random
random.seed(0)
item_count = 100000
# divide key range by 5 to ensure lots of duplicates
items = [(random.randint(0, item_count/5), 0) for i in xrange(item_count)]
"""
in_dict = """
d = {}
for k, v in items:
if k not in d:
d[k] = v
"""
set_default = """
d = {}
for k, v in items:
d.setdefault(k, v)
"""
straight_add = """
d = {}
for k, v in items:
d[k] = v
"""
print 'in_dict ', timeit.Timer(in_dict, setup).timeit(1000)
print 'set_default ', timeit.Timer(set_default, setup).timeit(1000)
print 'straight_add ', timeit.Timer(straight_add, setup).timeit(1000)
And the results:
in_dict 13.090878085
set_default 21.1309413091
straight_add 11.4781760635
Note: This is all pretty pointless. We get many questions daily about the fastest way to do x or y in Python. In most cases it is clear that the question was asked before any performance issue was actually encountered. My advice? Focus on writing the clearest program you can, and if it's too slow, profile it and optimize where needed. In my experience, I almost never get to the profile-and-optimize step. From the description of the problem, it seems dictionary storage will not be the major bottleneck in your program.
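If you do reach the profile-and-optimize step, a minimal sketch using the standard library's cProfile might look like this (build_table is a hypothetical stand-in for your dictionary-heavy code):
import cProfile
import pstats

def build_table(n):
    # hypothetical dictionary-heavy workload
    d = {}
    for i in range(n):
        d[i % (n // 5)] = i
    return d

profiler = cProfile.Profile()
profiler.enable()
build_table(100000)
profiler.disable()
# show the ten most expensive calls by cumulative time
pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)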
Using the built-in update() method is even faster. I tweaked Steven Rumbalski's example above a bit, and it shows that update() is the fastest. There are at least two ways to use it (with a list of tuples or with another dictionary); the former (shown below as update_method1) is the fastest. Note that I also changed a couple of other things about Steven Rumbalski's example: my dictionaries each have exactly 100,000 keys, and the new values have a 10% chance of not needing to be updated. This chance of redundancy will depend on the nature of the data you're updating your dictionary with. In all cases on my machine, update_method1 was the fastest.
import timeit
setup = """
import random
random.seed(0)
item_count = 100000
existing_dict = dict([(str(i), random.randint(1, 10)) for i in xrange(item_count)])
items = [(str(i), random.randint(1, 10)) for i in xrange(item_count)]
items_dict = dict(items)
"""
in_dict = """
for k, v in items:
if k not in existing_dict:
existing_dict[k] = v
"""
set_default = """
for k, v in items:
existing_dict.setdefault(k, v)
"""
straight_add = """
for k, v in items:
existing_dict[k] = v
"""
update_method1 = """
existing_dict.update(items)
"""
update_method2 = """
existing_dict.update(items_dict)
"""
print 'in_dict ', timeit.Timer(in_dict, setup).timeit(1000)
print 'set_default ', timeit.Timer(set_default, setup).timeit(1000)
print 'straight_add ', timeit.Timer(straight_add, setup).timeit(1000)
print 'update_method1 ', timeit.Timer(update_method1, setup).timeit(1000)
print 'update_method2 ', timeit.Timer(update_method2, setup).timeit(1000)
This code produced the following results:
in_dict 10.6597309113
set_default 19.3389420509
straight_add 11.5891621113
update_method1 7.52693581581
update_method2 9.10132408142
if foo not in A.keys():
A[foo] = x
is very slow, because A.keys() (a list in Python 2) has to be searched in O(N).
if foo not in A:
A[foo] = x
is faster, because it takes O(1) to check whether foo exists in A.
A[foo] = x
is better still, because you already have the object x and you simply store a reference to it in A (whether or not the key already exists).
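A small illustration of that last point (the names are placeholders): the assignment stores a reference to x, not a copy of it.
x = ["some", "value"]   # any object
A = {}
A["foo"] = x            # stores a reference, not a copy
print(A["foo"] is x)    # True: both refer to the same object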
There are certainly faster ways than your first example. But I suspect the straight update will be faster than any test.
foo not in A.keys()
will, in Python 2, create a new list with the keys and then perform a linear search on it. This is guaranteed to be slower (although I mainly object to it because there are faster and more elegant/idiomatic alternatives).
A[foo] = x
and
if foo not in A:
A[foo] = x
are different if A[foo] already exists but is not x. But since you "know" A[foo] will be x, it doesn't matter semantically. Anyway, both will be fine performance-wise (hard to tell without benchmarking, although intuitively I'd say the if takes much more time than copying a pointer).
So the answer is clear anyway: Choose the one that is much shorter code-wise and just as clear (the first one).
If you "know" that A[foo] "should be" equal to x, then I would just do:
assert A[foo] == x
which will tell you if your assumption is wrong!
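One caveat: if foo has never been added, the assert above raises KeyError rather than AssertionError. A variant that tolerates a missing key could use dict.get (a sketch reusing the question's names):
assert A.get(foo, x) == x   # passes if foo is missing or already maps to x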
A.setdefault(foo, x), but I'm not sure it is faster than if not A.has_key(foo): A[foo] = x. It should be tested.
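A quick way to test it, in the same style as the timeit snippets above (Python 2 only, since has_key() was removed in Python 3; the sizes are arbitrary):
import timeit
setup = "A = dict.fromkeys(range(100000)); foo = 50000; x = 1"
print timeit.timeit("A.setdefault(foo, x)", setup, number=1000000)
print timeit.timeit("if not A.has_key(foo): A[foo] = x", setup, number=1000000)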
I have this piece of code:
import time
d = dict()
for i in range(200000):
d[i] = "DUMMY"
start_time = time.time()
for i in range(200000):
for key in d:
if len(d) > 1 or -1 not in d:
break
del d[i]
print("--- {} seconds ---".format(time.time() - start_time))
Why does this take ~15 seconds to run?
But, if I comment out del d[i] or the inner loop, it runs in ~0.1 seconds.
The issue you have is caused by iterating over even one element (e.g. next(iter(d))) of a dictionary that was once large but has been shrunk a great deal. This can be nearly as slow as iterating over all of the dictionary items if you get unlucky with your hash values. And this code is very "unlucky" (predictably so, due to Python's hash design).
The reason for the issue is that Python does not rebuild a dictionary's hash table when you remove items. So the hash table for a dictionary that used to hold 200,000 items but now has only one left still has more than 200,000 slots in it (and probably more, since it was probably not entirely full at its peak).
When you iterate over the dictionary while it still has all its values, finding the first one is easy: it will be in one of the first few table slots. But as you empty out the table, more and more empty slots accumulate at its start, and the search for the first value that still exists takes longer and longer.
This is likely even worse because you're using integer keys, which (mostly) hash to themselves (only -1 hashes to something else). This means that the first key in the "full" dictionary will usually be 0, the next 1, and so on. As you delete the values in increasing order, you remove precisely the earliest keys in the table first, making the searches maximally worse.
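A small sketch that isolates the effect: time next(iter(d)) on a full dictionary and again after deleting all but the last key, so the lookup has to skip every slot the deleted keys left behind (sizes are arbitrary; exact numbers depend on the Python version):
import timeit
d = dict((i, None) for i in range(200000))
print("full dict:   %f" % timeit.timeit(lambda: next(iter(d)), number=1000))
# delete the low integer keys, which occupy the early table slots
for i in range(199999):
    del d[i]
print("shrunk dict: %f" % timeit.timeit(lambda: next(iter(d)), number=1000))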
It's because this
for key in d:
if len(d) > 1 or -1 not in d:
break
will break on the first iteration, so your inner loop is basically a no-op.
Adding del d[i] makes it do some real work, which takes time.
Update: well, the above is obviously way too simplistic :-)
The following version of your code shows the same characteristic:
import time
import gc
n = 140000
def main(d):
for i in range(n):
del d[i] # A
for key in d: # B
break # B
import dis
d = dict()
for i in range(n):
d[i] = "DUMMY"
print dis.dis(main)
start_time = time.time()
main(d)
print("--- {} seconds ---".format(time.time() - start_time))
Using iterkeys doesn't make a difference.
If we plot the run time for different sizes of n (n on the x-axis, seconds on the y-axis), clearly something exponential is going on.
Deleting line (A) or lines (B) removes the exponential component, although I'm not sure why.
Update 2: Based on @Blckknght's answer, we can regain some of the speed by infrequently rehashing the items:
def main(d):
for i in range(n):
del d[i]
if i % 5000 == 0:
d = {k:v for k, v in d.items()}
for key in d:
break
or this:
def main(d):
for i in range(n):
del d[i]
if i % 6000 == 0:
d = {k:v for k, v in d.items()}
try:
iter(d).next()
except StopIteration:
pass
takes under half the time of the original for large n (the bump at 130,000 is consistent over 4 runs).
There seems to be some performance cost to accessing the keys as a whole after deleting an item. This cost is not incurred when you do direct accesses, so my guess is that the dictionary flags its key list as dirty when an item is removed and waits for a reference to the key list before updating/rebuilding it.
This explains why you don't get a performance hit when you remove the inner loop (you're not causing the key list to be rebuilt). It also explains why the loop is fast when you remove the del d[i] line (you're not flagging the key list for rebuilding).
Given a mutable set of objects,
A = {1, 2, 3, 4, 5, 6}
I can construct a new set containing only those objects that don't satisfy a predicate ...
B = set(x for x in A if not (x % 2 == 0))
... but how do I modify A in place to contain only those objects? If possible, do this in linear time, without constructing O(n)-sized scratch objects, and without removing anything from A, even temporarily, that doesn't satisfy the predicate.
(Integers are used here only to simplify the example. In the actual code they are Future objects and I'm trying to pull out those that have already completed, which is expected to be a small fraction of the total.)
Note that it is not, in general, safe in Python to mutate an object that you are iterating over. I'm not sure of the precise rules for sets (the documentation doesn't make any guarantee either way).
I only need an answer for 3.4+, but will take more general answers.
(Not actually O(1) due to implementation details, but I'm loath to delete it as it's quite clean.)
Use symmetric_difference_update.
>>> A = {1,2,3,4,5,6}
>>> A.symmetric_difference_update(x for x in A if not (x % 2))
>>> A
{1, 3, 5}
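The same trick works with an arbitrary predicate; a sketch (remove_if and the lambda are my own names, and the list comprehension is itself a small scratch object holding the matching elements):
def remove_if(s, predicate):
    # remove, in place, every element of s that satisfies predicate
    s.symmetric_difference_update([x for x in s if predicate(x)])

A = {1, 2, 3, 4, 5, 6}
remove_if(A, lambda x: x % 2 == 0)
print(A)   # {1, 3, 5}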
With a horrible time complexity (quadratic), but in O(1) space:
>>> A = {1,2,3,4,5,6}
>>> modified = True
>>> while modified:
... modified = False
... for x in A:
... if not x%2:
... A.remove(x)
... modified = True
... break
...
>>> A
{1, 3, 5}
For the very specific use case you showed, there is a way to do this in O(1) space, but it doesn't generalize well to sets containing anything other than int objects:
A = {1, 2, 3, 4, 5, 6}
for i in range(min(A), max(A) + 1):
if i % 2 == 0:
A.discard(i)
It also wastes time since it will check numbers that aren't even in the set. For anything other than int objects, I can't yet think of a way to do this without creating an intermediate set or container of some sort.
For a more general solution, it would be better to simply initially construct your set using the predicate (if you don't need to use the set for anything else first). Something like this:
def items():
# maybe this is a file or a stream or something,
# where ever your initial values are coming from.
for thing in source:
yield thing
def predicate(item):
return bool(item)
A = set(item for item in items() if predicate(item))
To keep memory use constant, this is the only thing that comes to my mind:
def filter_Set(predicate, origen: set) -> set:
    # build a new set with the elements that satisfy the predicate,
    # consuming origen in the process
    resul = set()
    while origen:
        elem = origen.pop()
        if predicate(elem):
            resul.add(elem)
    return resul

def filter_Set_inplace(predicate, origen: set):
    # same idea, but the surviving elements are put back into origen
    resul = set()
    while origen:
        elem = origen.pop()
        if predicate(elem):
            resul.add(elem)
    while resul:
        origen.add(resul.pop())
With these functions I move elements from one set to the other, keeping only those that satisfy the predicate.
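A quick usage sketch, assuming the filter_Set_inplace definition above (is_even is a placeholder predicate):
def is_even(n):
    return n % 2 == 0

origen = {1, 2, 3, 4, 5, 6}
filter_Set_inplace(is_even, origen)
print(origen)   # {2, 4, 6}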
I am curious why removing a line in my code results in a significant increase in performance. The function itself takes a dictionary and removes all keys which are substrings of other keys.
The line which slows my code down is:
if sub in reduced_dict and sub2 in reduced_dict:
Here's my function:
def reduced(dictionary):
reduced_dict = dictionary.copy()
len_dict = defaultdict(list)
for key in dictionary:
len_dict[len(key)].append(key)
start_time = time.time()
for key, subs in len_dict.items():
for key2, subs2 in len_dict.items():
if key2 > key:
for sub in subs:
for sub2 in subs2:
if sub in reduced_dict and sub2 in reduced_dict: # Removing this line gives a significant performance boost
if sub in sub2:
reduced_dict.pop(sub, 0)
print time.time() - start_time
return reduced_dict
The function checks whether sub is in sub2 many times. I assumed that by checking whether this comparison had already been made, I would save myself time. That doesn't seem to be the case. Why does the constant-time dictionary lookup slow me down?
I am a beginner, so I'm interested in the concepts.
When I tested whether the line in question ever evaluates to False, it appears that it does. I tested this with the following:
def reduced(dictionary):
reduced_dict = dictionary.copy()
len_dict = defaultdict(list)
for key in dictionary:
len_dict[len(key)].append(key)
start_time = time.time()
for key, subs in len_dict.items():
for key2, subs2 in len_dict.items():
if key2 > key:
for sub in subs:
for sub2 in subs2:
if sub not in reduced_dict or sub2 not in reduced_dict:
print 'not present' # This line prints many thousands of times
if sub in sub2:
reduced_dict.pop(sub, 0)
print time.time() - start_time
return reduced_dict
For 14,805 keys in the function's input dictionary:
19.6360001564 sec. without the line
33.1449999809 sec. with the line
Here are three sample dictionaries: the biggest sample dictionary with 14,805 keys, a medium sample dictionary, and a smaller sample dictionary.
I have graphed time in seconds (Y) vs. input size in number of keys (X) for the first 14,000 keys of the biggest example dictionary. It appears all these functions have exponential complexity. The series plotted are:
John Zwinck: answer to this question
Matt: my algorithm for this question, without the dictionary comparison
Matt exponential: my first attempt at this problem; this took 76 s
Matt compare: the algorithm in this question, with the dict-comparison line
tdelaney: solutions for this question (algorithms 1 & 2, in order)
georg: solution from a related question I asked
The accepted answer executes in apparently linear time.
I'm surprised to find that a magic input-size ratio exists at which the run time of a dict lookup equals that of a string search.
For the sample corpus, or any corpus in which most keys are small, it's much faster to test all possible subkeys:
def reduced(dictionary):
keys = set(dictionary.iterkeys())
subkeys = set()
for key in keys:
for n in range(1, len(key)):
for i in range(len(key) + 1 - n):
subkey = key[i:i+n]
if subkey in keys:
subkeys.add(subkey)
return {k: v
for (k, v) in dictionary.iteritems()
if k not in subkeys}
This takes about 0.2s on my system (i7-3720QM 2.6GHz).
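For reference, a tiny usage sketch of the function above (Python 2, to match the iterkeys/iteritems calls):
d = {'a': 1, 'ab': 2, 'abc': 3, 'xyz': 4}
print reduced(d)   # keeps 'abc' and 'xyz'; 'a' and 'ab' are dropped as substrings of 'abc'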
I would do it a bit differently. Here's a generator function which gives you the "good" keys only. This avoids creating a dict which may be largely destroyed key-by-key. I also have just two levels of "for" loops and some simple optimizations to try to find matches more quickly and avoid searching for impossible matches.
def reduced_keys(dictionary):
keys = dictionary.keys()
keys.sort(key=len, reverse=True) # longest first for max hit chance
for key1 in keys:
found_in_key2 = False
for key2 in keys:
if len(key2) <= len(key1): # no more keys are long enough to match
break
if key1 in key2:
found_in_key2 = True
break
if not found_in_key2:
yield key1
If you want to make an actual dict using this, you can:
{ key: d[key] for key in reduced_keys(d) }
You create len_dict, but even though it groups keys of equal size, you still have to traverse everything multiple times to compare. Your basic plan is right: sort by size and only compare what's the same size or bigger. But there are other ways to do that. Below, I just created a regular list sorted by key size and then iterated backwards so that I could trim the dict as I went. I'm curious how its execution time compares to yours. It handled your little dict example in 0.049 seconds.
(I hope it actually worked!)
def myfilter(d):
items = d.items()
items.sort(key=lambda x: len(x[0]))
for i in range(len(items)-2,-1,-1):
k = items[i][0]
for k_fwd,v_fwd in items[i+1:]:
if k in k_fwd:
del items[i]
break
return dict(items)
EDIT
A significant speed increase comes from not unpacking k_fwd, v_fwd (after running both a few times, this wasn't really a speed-up; something else must have been eating time on my PC for a while).
def myfilter(d):
items = d.items()
items.sort(key=lambda x: len(x[0]))
for i in range(len(items)-2,-1,-1):
k = items[i][0]
for kv_fwd in items[i+1:]:
if k in kv_fwd[0]:
del items[i]
break
return dict(items)
def heapSort(lst):
heap = arrayHeap.mkHeap(len(lst), arrayHeap.less)
alst = list(lst)
while alst != []:
v = alst.pop(0)
arrayHeap.add (heap, v)
while heap.size != 0:
w = arrayHeap.removeMin(heap)
alst.append(w)
return alst
is this a valid heap sort function?
Assuming your arrayHeap provides the same guarantees as the stdlib's heapq or any other reasonable heap implementation, this is a valid heap sort, but it's a very silly one.
By copying the original sequence into a list and then popping from the left side, you're doing O(N^2) preparation for your O(N log N) sort.
If you change this to pop from the right side, then you're only doing O(N) preparation, so the whole thing takes O(N log N), as a heapsort should.
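That change is a one-line edit; here is a sketch of the right-popping variant, assuming the same arrayHeap API as in the question:
def heapSort(lst):
    heap = arrayHeap.mkHeap(len(lst), arrayHeap.less)
    alst = list(lst)
    while alst:
        v = alst.pop()             # pop from the right: O(1) per element
        arrayHeap.add(heap, v)
    while heap.size != 0:
        w = arrayHeap.removeMin(heap)
        alst.append(w)
    return alst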
That being said, I can't understand why you want to pop off the list instead of just iterating over it. Or, for that matter, why you want to copy the original sequence into a list instead of just iterating over it directly. If you do that, it will be faster, and use only half the memory, and be much simpler code. Like this:
def heapSort(lst):
heap = arrayHeap.mkHeap(len(lst), arrayHeap.less)
for v in lst:
arrayHeap.add(heap, v)
alst = []
while heap.size:
w = arrayHeap.removeMin(heap)
alst.append(w)
return alst
With a slightly nicer API, like the one in the stdlib's heapq module (is there a reason you're not using it, by the way?), you can make this even simpler:
def heapSort(lst):
alst = []
for v in lst:
heapq.heappush(alst, v)
return [heapq.heappop(alst) for i in range(len(alst))]
… or, if you're sure lst is a list and you don't mind mutating it:
def heapSort(lst):
heapq.heapify(lst)
return [heapq.heappop(lst) for i in range(len(lst))]
… or, of course, you can copy lst and then mutate the copy:
def heapSort(lst):
alst = lst[:]
heapq.heapify(alst)
return [heapq.heappop(alst) for i in range(len(alst))]
You may notice that the first one is the first of the Basic Examples in the heapq docs.
Suppose the following:
>>> s = set([1, 2, 3])
How do I get a value (any value) out of s without doing s.pop()? I want to leave the item in the set until I am sure I can remove it - something I can only be sure of after an asynchronous call to another host.
Quick and dirty:
>>> elem = s.pop()
>>> s.add(elem)
But do you know of a better way? Ideally in constant time.
Two options that don't require copying the whole set:
for e in s:
break
# e is now an element from s
Or...
e = next(iter(s))
But in general, sets don't support indexing or slicing.
Least code would be:
>>> s = set([1, 2, 3])
>>> list(s)[0]
1
Obviously this would create a new list which contains each member of the set, so not great if your set is very large.
I wondered how these functions would perform for sets of different sizes, so I ran a benchmark:
from random import sample
def ForLoop(s):
for e in s:
break
return e
def IterNext(s):
return next(iter(s))
def ListIndex(s):
return list(s)[0]
def PopAdd(s):
e = s.pop()
s.add(e)
return e
def RandomSample(s):
return sample(s, 1)
def SetUnpacking(s):
e, *_ = s
return e
from simple_benchmark import benchmark
from iteration_utilities import first
b = benchmark([ForLoop, IterNext, ListIndex, PopAdd, RandomSample, SetUnpacking, first],
              {2**i: set(range(2**i)) for i in range(1, 20)},
              argument_name='set size',
              function_aliases={first: 'First'})
b.plot()
This plot clearly shows that some approaches (RandomSample, SetUnpacking, and ListIndex) depend on the size of the set and should be avoided in the general case (at least if performance might be important). As already shown by the other answers, the fastest way is ForLoop.
However, as long as one of the constant-time approaches is used, the performance difference will be negligible.
iteration_utilities (Disclaimer: I'm the author) contains a convenience function for this use-case: first:
>>> from iteration_utilities import first
>>> first({1,2,3,4})
1
I also included it in the benchmark above. It can compete with the other two "fast" solutions but the difference isn't much either way.
tl;dr
for first_item in muh_set: break remains the optimal approach in Python 3.x. Curse you, Guido.
y u do this
Welcome to yet another set of Python 3.x timings, extrapolated from wr.'s excellent Python 2.x-specific response. Unlike AChampion's equally helpful Python 3.x-specific response, the timings below also time outlier solutions suggested above – including:
list(s)[0], John's novel sequence-based solution.
random.sample(s, 1), dF.'s eclectic RNG-based solution.
Code Snippets for Great Joy
Turn on, tune in, time it:
from timeit import Timer
stats = [
"for i in range(1000): \n\tfor x in s: \n\t\tbreak",
"for i in range(1000): next(iter(s))",
"for i in range(1000): s.add(s.pop())",
"for i in range(1000): list(s)[0]",
"for i in range(1000): random.sample(s, 1)",
]
for stat in stats:
t = Timer(stat, setup="import random\ns=set(range(100))")
try:
print("Time for %s:\t %f"%(stat, t.timeit(number=1000)))
except:
t.print_exc()
Quickly Obsoleted Timeless Timings
Behold! Ordered by fastest to slowest snippets:
$ ./test_get.py
Time for for i in range(1000):
for x in s:
break: 0.249871
Time for for i in range(1000): next(iter(s)): 0.526266
Time for for i in range(1000): s.add(s.pop()): 0.658832
Time for for i in range(1000): list(s)[0]: 4.117106
Time for for i in range(1000): random.sample(s, 1): 21.851104
Faceplants for the Whole Family
Unsurprisingly, manual iteration remains at least twice as fast as the next fastest solution. Although the gap has decreased from the Bad Old Python 2.x days (in which manual iteration was at least four times as fast), it disappoints the PEP 20 zealot in me that the most verbose solution is the best. At least converting a set into a list just to extract the first element of the set is as horrible as expected. Thank Guido, may his light continue to guide us.
Surprisingly, the RNG-based solution is absolutely horrible. List conversion is bad, but random really takes the awful-sauce cake. So much for the Random Number God.
I just wish the amorphous They would PEP up a set.get_first() method for us already. If you're reading this, They: "Please. Do something."
To provide some timing figures behind the different approaches, consider the following code.
The get() is my custom addition to Python's setobject.c, being just a pop() without removing the element.
from timeit import *
stats = ["for i in xrange(1000): iter(s).next() ",
"for i in xrange(1000): \n\tfor x in s: \n\t\tbreak",
"for i in xrange(1000): s.add(s.pop()) ",
"for i in xrange(1000): s.get() "]
for stat in stats:
t = Timer(stat, setup="s=set(range(100))")
try:
print "Time for %s:\t %f"%(stat, t.timeit(number=1000))
except:
t.print_exc()
The output is:
$ ./test_get.py
Time for for i in xrange(1000): iter(s).next() : 0.433080
Time for for i in xrange(1000):
for x in s:
break: 0.148695
Time for for i in xrange(1000): s.add(s.pop()) : 0.317418
Time for for i in xrange(1000): s.get() : 0.146673
This means that the for/break solution is the fastest (sometimes faster than the custom get() solution).
Since you want a random element, this will also work:
>>> import random
>>> s = set([1,2,3])
>>> random.sample(s, 1)
[2]
The documentation doesn't seem to mention performance of random.sample. From a really quick empirical test with a huge list and a huge set, it seems to be constant time for a list but not for the set. Also, iteration over a set isn't random; the order is undefined but predictable:
>>> list(set(range(10))) == range(10)
True
If randomness is important and you need a bunch of elements in constant time (large sets), I'd use random.sample and convert to a list first:
>>> lst = list(s) # once, O(len(s))?
...
>>> e = random.sample(lst, 1)[0] # constant time
Yet another way in Python 3:
next(iter(s))
or
s.__iter__().__next__()
Seemingly the most compact (6 symbols) though very slow way to get a set element (made possible by PEP 3132):
e,*_=s
With Python 3.5+ you can also use this 7-symbol expression (thanks to PEP 448):
[*s][0]
Both options are roughly 1000 times slower on my machine than the for-loop method.
I use a utility function I wrote. Its name is somewhat misleading because it kind of implies it might be a random item or something like that.
def anyitem(iterable):
try:
return iter(iterable).next()
except StopIteration:
return None
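On Python 3, where iterators no longer have a .next() method, the same utility can be written with next()'s default argument; a minimal sketch:
def anyitem(iterable):
    # return some element, or None if the iterable is empty
    return next(iter(iterable), None)

print(anyitem({1, 2, 3}))   # e.g. 1
print(anyitem(set()))       # None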
Following @wr.'s post, I get similar results (for Python 3.5):
from timeit import *
stats = ["for i in range(1000): next(iter(s))",
"for i in range(1000): \n\tfor x in s: \n\t\tbreak",
"for i in range(1000): s.add(s.pop())"]
for stat in stats:
t = Timer(stat, setup="s=set(range(100000))")
try:
print("Time for %s:\t %f"%(stat, t.timeit(number=1000)))
except:
t.print_exc()
Output:
Time for for i in range(1000): next(iter(s)): 0.205888
Time for for i in range(1000):
for x in s:
break: 0.083397
Time for for i in range(1000): s.add(s.pop()): 0.226570
However, when changing the underlying set (e.g. a call to remove()), things go badly for the iterable examples (for, iter):
from timeit import *
stats = ["while s:\n\ta = next(iter(s))\n\ts.remove(a)",
"while s:\n\tfor x in s: break\n\ts.remove(x)",
"while s:\n\tx=s.pop()\n\ts.add(x)\n\ts.remove(x)"]
for stat in stats:
t = Timer(stat, setup="s=set(range(100000))")
try:
print("Time for %s:\t %f"%(stat, t.timeit(number=1000)))
except:
t.print_exc()
Results in:
Time for while s:
a = next(iter(s))
s.remove(a): 2.938494
Time for while s:
for x in s: break
s.remove(x): 2.728367
Time for while s:
x=s.pop()
s.add(x)
s.remove(x): 0.030272
What I usually do for small collections is to create a kind of parser/converter method like this:
def convertSetToList(setName):
return list(setName)
Then I can use the new list and access items by index number:
userFields = convertSetToList(user)
name = request.json[userFields[0]]
As a list, you will have all the other list methods that you may need to work with.
You can unpack the values to access the elements:
s = set([1, 2, 3])
v1, v2, v3 = s
print(v1,v2,v3)
#1 2 3
If you want just the first element, try this (note that a - set() copies the whole set before popping):
b = (a-set()).pop()
How about s.copy().pop()? I haven't timed it, but it should work and it's simple. It works best for small sets, however, as it copies the whole set.
Another option is to use a dictionary with values you don't care about. E.g.,
poor_man_set = {}
poor_man_set[1] = None
poor_man_set[2] = None
poor_man_set[3] = None
...
You can treat the keys as a set, except that they're really just a list:
keys = poor_man_set.keys()
print "Some key = %s" % keys[0]
A side effect of this choice is that your code will be backwards compatible with older, pre-set versions of Python. It's maybe not the best answer but it's another option.
Edit: You can even do something like this to hide the fact that you used a dict instead of an array or set:
poor_man_set = {}
poor_man_set[1] = None
poor_man_set[2] = None
poor_man_set[3] = None
poor_man_set = poor_man_set.keys()