I am trying to use a dictionary (created by reading the content of a first file) to modify the content of an array (created by reading the content of a second file), in order to return as many modified arrays as there are keys in the dictionary. For each key, the values give the positions in the original array to change and the new value to put there.
As a minimal example, if my dictionary and my list are:
change={'change 1': [(1, 'B'), (3, 'B'), (5, 'B'), (7, 'B')], 'change 2': [(2, 'B'), (4, 'B'), (6, 'B')]}
s=['A', 'A', 'A', 'A', 'A', 'A', 'A']
Then, I would like my code to return two lists:
['B', 'A', 'B', 'A', 'B', 'A', 'B']
['A', 'B', 'A', 'B', 'A', 'B', 'A']
I tried to code this in python3:
change={'change 1': [(1, 'B'), (3, 'B'), (5, 'B'), (7, 'B')], 'change 2': [(2, 'B'), (4, 'B'), (6, 'B')]}
s=['A', 'A', 'A', 'A', 'A', 'A', 'A']
for key in change:
    s1=s
    for n in change[key]:
        s1[n[0]-1] = n[1]
    print(key, s1)
    print(s)
However, it seems that even though I am only modifying the list s1, which is a copy of s, I am nonetheless modifying s as well, as it returns:
change 1 ['B', 'A', 'B', 'A', 'B', 'A', 'B']
['B', 'A', 'B', 'A', 'B', 'A', 'B']
change 2 ['B', 'B', 'B', 'B', 'B', 'B', 'B']
['B', 'B', 'B', 'B', 'B', 'B', 'B']
So although I get the first list right, the second isn't and I don't understand why.
Could you help me see what is wrong in my code, and how to make it work?
Many thanks!
In your code s1 isn't a copy of s, it's just another name for s.
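A quick way to see the aliasing, as a minimal sketch (the names `alias` and `clone` are only for illustration and are not in your code): plain assignment binds a second name to the same list object, while `list.copy()` builds a new list.

```python
s = ['A', 'A', 'A']

alias = s          # no copy: both names refer to the same list object
clone = s.copy()   # shallow copy: a new list holding the same elements

alias[0] = 'B'     # mutating through one name is visible through the other

print(alias is s)  # True
print(clone is s)  # False
print(s)           # ['B', 'A', 'A']
print(clone)       # ['A', 'A', 'A']
```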
I'd suggest doing this by just building a new list in a comprehension, e.g.:
>>> change={'change 1': [(1, 'B'), (3, 'B'), (5, 'B'), (7, 'B')], 'change 2': [(2, 'B'), (4, 'B'), (6, 'B')]}
>>> s=['A', 'A', 'A', 'A', 'A', 'A', 'A']
>>> for t in change.values():
...     d = dict(t)
...     print([d.get(i, x) for i, x in enumerate(s, 1)])
...
['B', 'A', 'B', 'A', 'B', 'A', 'B']
['A', 'B', 'A', 'B', 'A', 'B', 'A']
But if you change your original code to initialize s1 to s.copy() it seems to work:
>>> for key in change:
...     s1 = s.copy()
...     for n in change[key]:
...         s1[n[0]-1] = n[1]
...     print(s1)
...
['B', 'A', 'B', 'A', 'B', 'A', 'B']
['A', 'B', 'A', 'B', 'A', 'B', 'A']
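If you need the modified lists later rather than just printed, the same loop could be wrapped in a small helper. This is only a sketch: the function name `apply_changes` and the choice to return a dict keyed by the change name are my assumptions, not something from the question.

```python
def apply_changes(base, changes):
    """Return {name: modified copy of base}; base itself is left untouched.

    Positions in `changes` are 1-based, as in the question.
    """
    results = {}
    for name, edits in changes.items():
        modified = list(base)              # fresh shallow copy for each key
        for position, value in edits:
            modified[position - 1] = value
        results[name] = modified
    return results


change = {'change 1': [(1, 'B'), (3, 'B'), (5, 'B'), (7, 'B')],
          'change 2': [(2, 'B'), (4, 'B'), (6, 'B')]}
s = ['A', 'A', 'A', 'A', 'A', 'A', 'A']

out = apply_changes(s, change)
print(out['change 1'])  # ['B', 'A', 'B', 'A', 'B', 'A', 'B']
print(out['change 2'])  # ['A', 'B', 'A', 'B', 'A', 'B', 'A']
print(s)                # ['A', 'A', 'A', 'A', 'A', 'A', 'A']
```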
If you want to make sure that an object is copied completely, including any nested objects it contains, use deepcopy as follows (for the flat list of strings here, s.copy() is already sufficient):
import copy
change={'change 1': [(1, 'B'), (3, 'B'), (5, 'B'), (7, 'B')], 'change 2': [(2, 'B'), (4, 'B'), (6, 'B')]}
s=['A', 'A', 'A', 'A', 'A', 'A', 'A']
for key in change:
    s1=copy.deepcopy(s)
    for n in change[key]:
        s1[n[0]-1] = n[1]
    print(key, s1)
    print(s)
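To show where deepcopy actually makes a difference, here is a minimal sketch with a nested list (the `nested` data is made up for illustration; the list in the question is flat): a shallow copy shares the inner lists, a deep copy does not.

```python
import copy

nested = [['A', 'A'], ['A', 'A']]

shallow = nested.copy()        # new outer list, but the same inner lists
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0][0] = 'B'             # mutate an inner list of the original

print(shallow[0][0])  # B  (the inner list is shared with nested)
print(deep[0][0])     # A  (the inner list was copied)
```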
Can someone help me in sorting below tuple in Python?
({'b', 'c', 'a'}, {('b', 'c'), ('a', 'b')})
Expected output:
({'a', 'b', 'c'}, {('a', 'b'), ('b', 'c')})
Your title and example are conflicting. You should consult the Python documentation on set, tuple, and list.
Some Examples:
a_list = ['b', 'c', 'a']
a_tuple = ('b', 'c', 'a')
a_set = {'b', 'c', 'a'}
a_list_of_tuples = [('b', 'c', 'a'), ('b', 'c'), ('a', 'b')]
a_list_of_tuples_and_lists = [('b', 'c', 'a'), [('b', 'c'), ('a', 'b')]]
This example works for both a list of lists/tuples and a tuple of sets/tuples, however it returns an actual list of tuples not a tuple of sets as provided in your expected output example.
my_list = [('b', 'c', 'a'), [('b', 'c'), ('a', 'b')]]
print(tuple((sorted(item) for item in my_list)))
-> (['a', 'b', 'c'], [('a', 'b'), ('b', 'c')])
my_tuple = ({'b', 'c', 'a'}, {('b', 'c'), ('a', 'b')})
print(tuple((sorted(item) for item in my_tuple)))
-> (['a', 'b', 'c'], [('a', 'b'), ('b', 'c')])
I'm assuming you're talking about a list of lists and that you want to first sort each list and then the whole list of lists.
You can do it as follows:
arr = [['b','c','a'],['b','c'],['a','b']]
for i in arr:
    i.sort()
arr.sort(key=lambda x:x[0])
print(arr)
[['a', 'b', 'c'], ['a', 'b'], ['b', 'c']]
Remember the solution will be totally different if you have a tuple of sets or vice versa.
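For the tuple-of-sets case specifically, a minimal sketch of what I mean (the variable names are mine): sets have no order, so the input and the "expected output" from the question already compare equal. If you need a fixed order, you have to convert each set to an ordered type such as a tuple.

```python
data = ({'b', 'c', 'a'}, {('b', 'c'), ('a', 'b')})
expected = ({'a', 'b', 'c'}, {('a', 'b'), ('b', 'c')})

# Sets compare by membership, so element order in the literal is irrelevant.
print(data == expected)  # True

# Convert each set to a sorted tuple to get a deterministic order.
ordered = tuple(tuple(sorted(group)) for group in data)
print(ordered)  # (('a', 'b', 'c'), (('a', 'b'), ('b', 'c')))
```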
If I understand itertools "combinatoric iterators" doc correctly, the idea is to provide a set of standard functions for every common combinatory iteration.
But I miss one today. I need to iterate over every ordered combination of items with repetitions.
combinations_with_replacement('abcd', 4) yields
('a', 'a', 'a', 'a')
('a', 'a', 'a', 'b')
('a', 'a', 'a', 'c')
('a', 'a', 'a', 'd')
('a', 'a', 'b', 'b')
('a', 'a', 'b', 'c')
('a', 'a', 'b', 'd')
('a', 'a', 'c', 'c')
('a', 'a', 'c', 'd')
... etc
but (even though the results are sorted tuples), these combinations are not ordered.
I expect more results from an ideal ordered_combination_with_replacement('abcd', 4) for I need to distinguish between
('a', 'a', 'a', 'a')
('a', 'a', 'a', 'b')
('a', 'a', 'b', 'a')
('a', 'b', 'a', 'a')
('b', 'a', 'a', 'a')
('a', 'a', 'a', 'c')
('a', 'a', 'c', 'a')
... etc
In other words: order matters today.
Does itertools provide such an iteration? Why not, or why have I missed it?
What's a standard way to iterate over those?
Do I need to write this generic iterator myself?
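To wrap up some of the comments, there are two candidates in itertools:
itertools.combinations_with_replacement("abcd", 4)
and
itertools.product("abcd", repeat=4)
Only the second treats order as significant. combinations_with_replacement yields each selection once, in sorted position order, as shown in the question, whereas product yields every ordered tuple, which is the required result: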
[('a', 'a', 'a', 'a'),
('a', 'a', 'a', 'b'),
('a', 'a', 'a', 'c'),
('a', 'a', 'a', 'd'),
('a', 'a', 'b', 'a'),
...
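As a quick check of how the two functions differ, here is a minimal sketch that counts the results and looks for one of the tuples from the question (the variable names are mine):

```python
from itertools import combinations_with_replacement, product

unordered = list(combinations_with_replacement("abcd", 4))
ordered = list(product("abcd", repeat=4))

print(len(unordered))  # 35  (selections where order does not matter)
print(len(ordered))    # 256 (4 ** 4 ordered tuples)

target = ('a', 'a', 'b', 'a')
print(target in unordered)  # False
print(target in ordered)    # True
```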
I have these Python lists:
x = [('D', 'F'), ('A', 'D'), ('B', 'G'), ('B', 'C'), ('A', 'B')]
priority_list = ['A', 'B', 'C', 'D', 'F', 'G'] # Ordered from highest to lowest priority
How can I, for each tuple in my list, keep the value with the highest priority according to priority_list? The result would be:
['D', 'A', 'B', 'B', 'A']
More examples:
x = [('B', 'D'), ('E', 'A'), ('B', 'A'), ('D', 'F'), ('E', 'C')]
priority_list = ['A', 'B', 'C', 'D', 'E', 'F']
# Result:
['B', 'A', 'A', 'D', 'C']
x = [('B', 'C'), ('F', 'E'), ('B', 'A'), ('D', 'F'), ('E', 'C')]
priority_list = ['F', 'E', 'D', 'C', 'B', 'A'] # Notice the change in priorities
# Result:
['C', 'F', 'B', 'F', 'E']
Thanks in advance, I might be over complicating this.
You can try
[sorted(i, key=priority_list.index)[0] for i in x]
though it will throw an exception if you find a value not in the priority list.
You can try:
def get_priority_val(data, priority_list):
    for single_val in priority_list:
        if single_val in data:
            return single_val
x = [('D', 'F'), ('A', 'D'), ('B', 'G'), ('B', 'C'), ('A', 'B')]
priority_list = ['A', 'B', 'C', 'D', 'F', 'G']
final_data = []
for data in x:
    final_data.append(get_priority_val(data, priority_list))
print(final_data)
Output:
['D', 'A', 'B', 'B', 'A']
You can try using a list comprehension:
ans = [d[0] if priority_list.index(d[0]) < priority_list.index(d[1]) else d[1] for d in x]
Output:
['D', 'A', 'B', 'B', 'A']
You can do it in one line using a list comprehension:
[y[0] if priority_list.index(y[0]) < priority_list.index(y[1]) else y[1] for y in x]
Output:
['D', 'A', 'B', 'B', 'A']
You can create a dict containing priority values and just use min with a custom key
>>> priority_dict = {k:i for i,k in enumerate(priority_list)}
>>> [min(t, key=priority_dict.get) for t in x]
['D', 'A', 'B', 'B', 'A']
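One caveat with `priority_dict.get` as the key: for a value that is not in the priority list it returns None, and in Python 3 comparing None with an int raises a TypeError. A possible variant, sketched here under the assumption that unknown values should simply rank last (the helper name `pick_highest_priority` is mine):

```python
def pick_highest_priority(pairs, priority_list):
    """For each tuple, keep the value that comes first in priority_list."""
    rank = {value: i for i, value in enumerate(priority_list)}
    unknown = len(priority_list)   # assumed rank for values not in the list
    return [min(pair, key=lambda v: rank.get(v, unknown)) for pair in pairs]


x = [('D', 'F'), ('A', 'D'), ('B', 'G'), ('B', 'C'), ('A', 'B')]
priority_list = ['A', 'B', 'C', 'D', 'F', 'G']
print(pick_highest_priority(x, priority_list))  # ['D', 'A', 'B', 'B', 'A']

# A value missing from the list ('Z') loses against any listed value.
print(pick_highest_priority([('Z', 'B')], priority_list))  # ['B']
```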
I have a list of URLs for crawling: ['http://domain1.com','http://domain1.com/page1','http://domain2.com']
Code:
prev_domain = ''
while urls:
    url = urls.pop()
    if base_url(url) == prev_domain: # base_url is a custom function that returns the domain of a URL
        urls.append(url) # is this possible?
        continue
    else:
        crawl(url)
Basically, I don't want to crawl pages of the same domain back to back. Crawling one domain continuously returns HTTP status code 429: Too Many Requests (the user has sent too many requests in a given amount of time, i.e. "rate limiting"). To work around this, I'm planning to use the logic below.
Loop through all items in the list and compare the base URL of the current element with the base URL of the previously processed element.
If the base URLs are different, process the element; otherwise do not process it and just append it back to the same list.
Note: if all URLs in the list are of the same domain, add a delay before processing each element.
Please provide your thoughts.
Your algorithm is almost correct, but not the implementation:
>>> L = [1,2,3]
>>> L.pop()
3
>>> L.append(3)
>>> L
[1, 2, 3]
That's why your program loops forever: if the domain is the same as the previous domain, you just append then pop then append, then.... What you need is not a stack, it's a round robin:
>>> L.pop()
3
>>> L.insert(0, 3)
>>> L
[3, 1, 2]
Let's take a shuffled list of permutations of "abcd":
>>> L = [('b', 'c', 'd', 'a'), ('d', 'c', 'b', 'a'), ('a', 'c', 'd', 'b'), ('c', 'd', 'a', 'b'), ('b', 'd', 'a', 'c'), ('b', 'a', 'd', 'c'), ('b', 'c', 'a', 'd'), ('a', 'b', 'd', 'c'), ('d', 'a', 'b', 'c'), ('a', 'b', 'c', 'd'), ('d', 'c', 'a', 'b'), ('a', 'd', 'c', 'b'), ('d', 'a', 'c', 'b'), ('c', 'd', 'b', 'a'), ('d', 'b', 'c', 'a'), ('d', 'b', 'a', 'c'), ('a', 'd', 'b', 'c'), ('b', 'd', 'c', 'a'), ('c', 'b', 'd', 'a'), ('c', 'a', 'b', 'd'), ('b', 'a', 'c', 'd')]
The first letter is the domain. Here's a slightly modified version of your code:
>>> prev = None
>>> while L:
...     e = L.pop()
...     if L and e[0] == prev:
...         L.insert(0, e)
...     else:
...         print(e)
...         prev = e[0]
('b', 'a', 'c', 'd')
('c', 'a', 'b', 'd')
('b', 'd', 'c', 'a')
('a', 'd', 'b', 'c')
('d', 'b', 'a', 'c')
('c', 'd', 'b', 'a')
('d', 'a', 'c', 'b')
('a', 'd', 'c', 'b')
('d', 'c', 'a', 'b')
('a', 'b', 'c', 'd')
('d', 'a', 'b', 'c')
('a', 'b', 'd', 'c')
('b', 'c', 'a', 'd')
('c', 'd', 'a', 'b')
('a', 'c', 'd', 'b')
('d', 'c', 'b', 'a')
('b', 'c', 'd', 'a')
('c', 'b', 'd', 'a')
('d', 'b', 'c', 'a')
('b', 'a', 'd', 'c')
('b', 'd', 'a', 'c')
The modification is the "if L and" test: if the only remaining element has the same domain as prev, then without that test you would loop forever on a one-element list: pop, same as prev, insert, pop, ... (as with pop/append).
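Note that the `if L and` guard only covers the one-element case: if several remaining entries all share the previous domain, the loop can still spin. Here is a rough sketch of one way to handle that, and the delay mentioned in the question, with a `collections.deque`. The function name `crawl_round_robin`, the `deferred` counter, the `delay` parameter and the use of `urlparse(...).netloc` in place of your `base_url` are all my assumptions.

```python
import time
from collections import deque
from urllib.parse import urlparse


def crawl_round_robin(urls, crawl, delay=0.0):
    """Call crawl(url) for every URL, avoiding back-to-back hits on one
    domain when another domain is still waiting in the queue."""
    queue = deque(urls)
    prev_domain = None
    deferred = 0  # URLs pushed back in a row since the last crawl
    while queue:
        url = queue.pop()
        domain = urlparse(url).netloc
        if domain == prev_domain and deferred < len(queue):
            queue.appendleft(url)   # round robin: retry it later
            deferred += 1
            continue
        if domain == prev_domain:
            time.sleep(delay)       # only same-domain URLs remain: wait
        crawl(url)
        prev_domain = domain
        deferred = 0


visited = []
crawl_round_robin(
    ['http://domain1.com', 'http://domain1.com/page1', 'http://domain2.com'],
    crawl=visited.append,
)
print(visited)
# ['http://domain2.com', 'http://domain1.com/page1', 'http://domain1.com']
```

`deque.appendleft` is O(1), whereas `list.insert(0, ...)` has to shift every element, which may matter for long URL lists.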
Here's another option: create a dict domain -> list of urls:
>>> d = {}
>>> for e in L:
...     d.setdefault(e[0], []).append(e)
>>> d
{'b': [('b', 'c', 'd', 'a'), ('b', 'd', 'a', 'c'), ('b', 'a', 'd', 'c'), ('b', 'c', 'a', 'd'), ('b', 'd', 'c', 'a'), ('b', 'a', 'c', 'd')], 'd': [('d', 'c', 'b', 'a'), ('d', 'a', 'b', 'c'), ('d', 'c', 'a', 'b'), ('d', 'a', 'c', 'b'), ('d', 'b', 'c', 'a'), ('d', 'b', 'a', 'c')], 'a': [('a', 'c', 'd', 'b'), ('a', 'b', 'd', 'c'), ('a', 'b', 'c', 'd'), ('a', 'd', 'c', 'b'), ('a', 'd', 'b', 'c')], 'c': [('c', 'd', 'a', 'b'), ('c', 'd', 'b', 'a'), ('c', 'b', 'd', 'a'), ('c', 'a', 'b', 'd')]}
Now, take one element from every domain, drop the domains that have no elements left, and repeat until the dict is empty:
>>> while d:
...     for k, vs in d.items():
...         e = vs.pop()
...         print (e)
...     d = {k: vs for k, vs in d.items() if vs} # drop the domains that are now empty
...
('b', 'a', 'c', 'd')
('d', 'b', 'a', 'c')
('a', 'd', 'b', 'c')
('c', 'a', 'b', 'd')
('b', 'd', 'c', 'a')
('d', 'b', 'c', 'a')
('a', 'd', 'c', 'b')
('c', 'b', 'd', 'a')
('b', 'c', 'a', 'd')
('d', 'a', 'c', 'b')
('a', 'b', 'c', 'd')
('c', 'd', 'b', 'a')
('b', 'a', 'd', 'c')
('d', 'c', 'a', 'b')
('a', 'b', 'd', 'c')
('c', 'd', 'a', 'b')
('b', 'd', 'a', 'c')
('d', 'a', 'b', 'c')
('a', 'c', 'd', 'b')
('b', 'c', 'd', 'a')
('d', 'c', 'b', 'a')
The output is more uniform.
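The same grouping idea can also be written with `itertools.zip_longest`, which takes one item per group in each round. This is only a sketch: the name `interleave_by_key` and the sample data are mine, and like the loop above it does not prevent repeats once a single group is left.

```python
from collections import defaultdict
from itertools import zip_longest

_MISSING = object()  # filler for groups that have run out of items


def interleave_by_key(items, key):
    """Yield items round-robin across the groups defined by key(item)."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    # Each batch holds at most one item per group.
    for batch in zip_longest(*groups.values(), fillvalue=_MISSING):
        for item in batch:
            if item is not _MISSING:
                yield item


sample = ['a1', 'a2', 'a3', 'b1', 'c1', 'c2']
print(list(interleave_by_key(sample, key=lambda s: s[0])))
# ['a1', 'b1', 'c1', 'a2', 'c2', 'a3']
```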
Check the following code snippet:
urls = ['http://domain1.com','http://domain1.com/page1','http://domain2.com']
crawl_for_urls = {}
for url in urls:
    domain = base_url(url)
    if domain not in crawl_for_urls:
        crawl_for_urls.update({domain:url})
        crawl(url)
crawl() will be called only once per unique domain.
Or you can use:
urls = ['http://domain1.com','http://domain1.com/page1','http://domain2.com']
crawl_for_urls = {}
for url in urls:
    domain = base_url(url)
    if domain not in crawl_for_urls:
        crawl_for_urls.update({domain:[url]})
        crawl(url)
    else:
        crawl_for_urls.get(domain, []).append(url)
This way you can categorize the URLs by domain and still call crawl() only once per unique domain.
Given a dict of vocabulary: {'A': 3, 'B': 4, 'C': 5, 'AB':6} and a sentence, which should be segmented: ABCAB.
I need to create all possible combinations of this sentence such as
[['A', 'B', 'C', 'A', 'B'], ['A', 'B', 'C', 'AB'], ['AB', 'C', 'AB'], ['AB', 'C', 'A', 'B']]
That's what I have:
def find_words(sentence):
    for i in range(len(sentence)):
        for word_length in range(1, max_word_length + 1):
            word = sentence[i:i+word_length]
            print(word)
            if word not in test_dict:
                continue
            if i + word_length <= len(sentence):
                if word.startswith(sentence[0]) and word not in words and word not in ''.join(words):
                    words.append(word)
                else:
                    continue
                next_position = i + word_length
                if next_position >= len(sentence):
                    continue
                else:
                    find_ngrams(sentence[next_position:])
    return words
But it returns me only one list.
I was also looking for something useful in itertools but I couldn't find anything obviously useful. Might've missed it, though.
Try all possible prefixes and recursively do the same for the rest of the sentence.
VOC = {'A', 'B', 'C', 'AB'} # could be a dict
def parse(snt):
    if snt == '':
        yield []
    for w in VOC:
        if snt.startswith(w):
            for rest in parse(snt[len(w):]):
                yield [w] + rest
print(list(parse('ABCAB')))
# The order of the results may vary between runs, since VOC is a set:
# [['AB', 'C', 'AB'], ['AB', 'C', 'A', 'B'],
# ['A', 'B', 'C', 'AB'], ['A', 'B', 'C', 'A', 'B']]
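For longer sentences the same suffixes get parsed again and again. One option is to memoize the recursion with `functools.lru_cache`; this is a sketch under my own assumptions (the names `VOCAB` and `segmentations` are mine, and results are tuples so they can be cached). Iterating over `sorted(VOCAB)` also makes the output order deterministic.

```python
from functools import lru_cache

VOCAB = frozenset({'A', 'B', 'C', 'AB'})


@lru_cache(maxsize=None)
def segmentations(sentence):
    """Return every way to split sentence into words from VOCAB."""
    if not sentence:
        return ((),)               # exactly one way to split the empty string
    results = []
    for word in sorted(VOCAB):     # sorted for a reproducible order
        if sentence.startswith(word):
            for rest in segmentations(sentence[len(word):]):
                results.append((word,) + rest)
    return tuple(results)


for split in segmentations('ABCAB'):
    print(split)
# ('A', 'B', 'C', 'A', 'B')
# ('A', 'B', 'C', 'AB')
# ('AB', 'C', 'A', 'B')
# ('AB', 'C', 'AB')
```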
Although not the most efficient solution, this should work:
from itertools import product
dic = {'A': 3, 'B': 4, 'C': 5, 'AB': 6}
choices = list(dic.keys())
prod = []
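for a in range(1, len(choices)+2):
    prod = prod + list(product(choices, repeat=a))
# Note: ''.join(choices) happens to equal the sentence 'ABCAB' here; for other inputs, compare against the sentence itself
result = list(filter(lambda x: ''.join(x) == ''.join(choices), prod))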
print(result)
# prints [('AB', 'C', 'AB'), ('A', 'B', 'C', 'AB'), ('AB', 'C', 'A', 'B'), ('A', 'B', 'C', 'A', 'B')]
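Use itertools.permutations to get every ordering of the vocabulary keys (note that this lists orderings of the keys, not segmentations of the sentence):
import itertools
d = {'A': 3, 'B': 4, 'C': 5, 'AB': 6}
l = [k for k, v in d.items()]
print(list(itertools.permutations(l)))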
[('A', 'B', 'C', 'AB'), ('A', 'B', 'AB', 'C'), ('A', 'C', 'B', 'AB'), ('A', 'C', 'AB', 'B'), ('A', 'AB', 'B', 'C'), ('A', 'AB', 'C', 'B'), ('B', 'A', 'C', 'AB'), ('B', 'A', 'AB', 'C'), ('B', 'C', 'A', 'AB'), ('B', 'C', 'AB', 'A'), ('B', 'AB', 'A', 'C'), ('B', 'AB', 'C', 'A'), ('C', 'A', 'B', 'AB'), ('C', 'A', 'AB', 'B'), ('C', 'B', 'A', 'AB'), ('C', 'B', 'AB', 'A'), ('C', 'AB', 'A', 'B'), ('C', 'AB', 'B', 'A'), ('AB', 'A', 'B', 'C'), ('AB', 'A', 'C', 'B'), ('AB', 'B', 'A', 'C'), ('AB', 'B', 'C', 'A'), ('AB', 'C', 'A', 'B'), ('AB', 'C', 'B', 'A')]