Deleting rows from a dataframe with nested for loops - python

Some background: the MB column contains one of two values (M or B), while the INDENT column contains integers. The numbers don't necessarily follow a pattern, but when they increment, they increment by one; they can decrement by any amount. The rows are sorted in a specific order.
The goal is to drop every row whose INDENT value is higher than the INDENT value of a preceding row that contains a "B" in the MB column; dropping stops once a row's indent value is equal to or less than that of the row containing the "B". The sample data below demonstrates which rows should be dropped.
Sample data:
import pandas as pd
d = {'INDENT': {'0': 0, '1': 1, '2': 1, '3': 2, '4': 3, '5': 3, '6': 4, '7': 2, '8': 3}, 'MB': {'0': 'M', '1': 'B', '2': 'M', '3': 'B', '4': 'B', '5': 'M', '6': 'M', '7': 'B', '8': 'M'}}
df = pd.DataFrame(d)
Code:
My current code has issues: I can't drop the rows found by the inner for loop, since it isn't using iterrows. I am aware of dropping based on a conditional expression, but I am unsure how to nest this correctly.
for index, row in df.iterrows():
    for row in range(index-1, 0, -1):
        if df.loc[row].at["INDENT"] <= df.loc[index].at["INDENT"] - 1:
            if df.loc[row].at["MB"] == "B":
                df.drop(df.index[index], inplace=True)
                break
            else:
                break
Edit 1:
This problem can be represented graphically: it is effectively scanning a hierarchy for an attribute and deleting everything below it. The example I provided is misleading, since all the rows that need to be dropped here happen to be indent 3 or higher, but this can occur at any indent level.
Edit 2: We are going to cheat on this problem a bit: I won't have to generate an edge graph from scratch, since I already have the prerequisite data. I have an updated table and sample data.
Updated Sample Data
import pandas as pd
d = {
    'INDENT': {'0': 0, '1': 1, '2': 1, '3': 2, '4': 3, '5': 3, '6': 4, '7': 2, '8': 3},
    'MB': {'0': 'M', '1': 'B', '2': 'M', '3': 'B', '4': 'B', '5': 'M', '6': 'M', '7': 'B', '8': 'M'},
    'a': {'0': -1, '1': 5000, '2': 5000, '3': 5322, '4': 5449, '5': 5449, '6': 5621, '7': 5322, '8': 4666},
    'c': {'0': 5000, '1': 5222, '2': 5322, '3': 5449, '4': 5923, '5': 5621, '6': 5109, '7': 4666, '8': 5219}
}
df = pd.DataFrame(d)
Updated Code
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd
d = {
    'INDENT': {'0': 0, '1': 1, '2': 1, '3': 2, '4': 3, '5': 3, '6': 4, '7': 2, '8': 3},
    'MB': {'0': 'M', '1': 'B', '2': 'M', '3': 'B', '4': 'B', '5': 'M', '6': 'M', '7': 'B', '8': 'M'},
    'a': {'0': -1, '1': 5000, '2': 5000, '3': 5322, '4': 5449, '5': 5449, '6': 5621, '7': 5322, '8': 4666},
    'c': {'0': 5000, '1': 5222, '2': 5322, '3': 5449, '4': 5923, '5': 5621, '6': 5109, '7': 4666, '8': 5219}
}
df = pd.DataFrame(d)
G = nx.Graph()
G = nx.from_pandas_edgelist(df, 'a', 'c', create_using=nx.DiGraph())
T = nx.dfs_tree(G, source=-1).reverse()
print([x for x in T])
nx.draw(G, with_labels=True)
plt.show()
I am unsure how to use the edges from here to identify the rows that need to be dropped from the dataframe.
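One possible way to use them, sketched under the assumption that 'a' is a row's parent id and 'c' its own id (as the sample data suggests): every node reachable from the 'c' of a "B" row marks a row to drop.
b_ids = df.loc[df['MB'] == 'B', 'c']
to_drop = set()
for node in b_ids:
    to_drop |= nx.descendants(G, node)  # all ids reachable below this "B" row
print(df[~df['c'].isin(to_drop)])  # keeps rows 0, 1, 2, 3 and 7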

Not an answer, but too long for a comment:
import pandas as pd
d = {'INDENT': {'0': 0, '1': 1, '2': 1, '3': 2, '4': 3, '5': 3, '6': 4, '7': 2, '8': 3}, 'MB': {'0': 'M', '1': 'B', '2': 'M', '3': 'B', '4': 'B', '5': 'M', '6': 'M', '7': 'B', '8': 'M'}}
df = pd.DataFrame(d)
df['i'] = df['INDENT']+1
df = df.reset_index()
df = df.merge(df[['INDENT', 'index', 'MB']].rename(columns={'INDENT':'target', 'index':'ix', 'MB': 'MBt'}), left_on=['i'], right_on=['target'], how='left')
import networkx as nx
G = nx.Graph()
G = nx.from_pandas_edgelist(df, 'index', 'ix', create_using=nx.DiGraph())
T = nx.dfs_tree(G, source='0').reverse()
print([x for x in T])
nx.draw(G, with_labels=True)
This demonstrates the problem. You actually want to apply graph theory, and the networkx library can help you with that. A first step is to construct the connection between each node, as in the example above. From there you can apply logic to filter out the edges you don't want.
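Note the merge above links every row to all rows one indent deeper, so the edges still need filtering. A minimal sketch of building only the true parent-to-child edges (assuming a row's parent is the nearest earlier row with a smaller indent), then dropping everything under a "B" row:
import networkx as nx
import pandas as pd
d = {'INDENT': {'0': 0, '1': 1, '2': 1, '3': 2, '4': 3, '5': 3, '6': 4, '7': 2, '8': 3},
     'MB': {'0': 'M', '1': 'B', '2': 'M', '3': 'B', '4': 'B', '5': 'M', '6': 'M', '7': 'B', '8': 'M'}}
df = pd.DataFrame(d)
G = nx.DiGraph()
last_at_level = {}  # most recent row index seen at each indent level
for idx, row in df.iterrows():
    level = row['INDENT']
    if level > 0:
        G.add_edge(last_at_level[level - 1], idx)  # parent -> child
    last_at_level[level] = idx
to_drop = set()
for idx in df.index[df['MB'] == 'B']:
    to_drop |= nx.descendants(G, idx)  # everything below a "B" row
print(df.drop(sorted(to_drop)))  # drops rows 4, 5, 6 and 8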

Not sure I fully understand the question you're asking, however this is my attempt. It selects a row only if Index >= indent and MB == 'B'. In this case I'm just selecting the subset we want instead of dropping the subset we don't.
import numpy as np
import pandas as pd
x = np.transpose(np.array([[0,1,2,3,4,5,6,7,8], [0,1,1,2,3,3,4,2,3], ['M','B','M','B','B','M','M','B','M']]))
df = pd.DataFrame(x, columns=['Index', 'indent', 'MB'])
df1 = df[(df['Index'] >= df['indent']) & (df['MB'] == 'B')]
print(df1)

random.choice appears to sometimes return keys that are not there

The following code
import random
cards = {'A': 1, '2': 2, '3': 3, '4': 4, '5': 5, '6': 6, '7': 7, '8': 8, '9': 9, '10': 10, 'J': 10, 'Q': 10, 'K': 10}
def random_cards(n):
    dealt_cards = []
    for i in range(n):
        dealt_cards += random.choice(list(cards.keys()))
    print(dealt_cards)
produces this result
>>> random_cards(4)
['A', '4', '1', '0', '3']
>>>
The 1 isn't present in cards.keys(), and it doesn't come from the value of the A key: I tested this by setting that value to 100, and the stray 1 still appeared.
Is random.choice broken?
If not, what is broken in my program?
Your error stems from the fact that, to add a single element to a list, you don't use +=; you use append. The correct code is:
def random_cards(n):
    dealt_cards = []
    for _ in range(n):
        dealt_cards.append(random.choice(list(cards.keys())))
    print(dealt_cards)
This is because += on a list is an in-place extend: it accepts any iterable and appends its elements one by one. A string is an iterable of its characters, so dealt_cards += '10' adds '1' and '0' as separate elements, i.e. it behaves like dealt_cards.extend(list('10')).
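A quick demonstration of the difference, using the '10' card from the question:
dealt = []
dealt += '10'  # += extends with each character of the string
print(dealt)  # ['1', '0']
dealt = []
dealt.append('10')  # append adds the whole string as one element
print(dealt)  # ['10']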

python - how to iterate through values of dict

I have a dict I want to iterate through to find all values that contain a given key. My output would be a separate dict that contains, for each key of interest, the numbers gathered from every dict value that contains that key, with the key itself left out.
dict_in =
{'6': ['2,9,8,10'], '1': ['3,5,8,9,10,12'], '4': ['2,5,7,8,9'], '2': ['3,4,7,6,13'], '12': ['1,7,5,9'], '3': ['9,11,10,1,2,13'], '10': ['1,3,6,11'], '5': ['4,1,7,11,12'], '13': ['2,3'], '8': ['1,6,4,11'], '7': ['5,2,4,9,12'], '11': ['3,5,10,8'], '9': ['12,1,3,6,4,7']}
so the output would be like this:
{'6':['3,4,7,13,1,3,11,1,4,11,12,1,3,4,7'] , '4':['3,4,6,13,1,11,12,1,6,11,12,12,1,3,6'],'13': ['4,7,6,9,11,10,1']}
I am a beginner and I do not even know where to start. Would it be easier to convert it to a list of lists?
There is one thing about your problem that is an extra challenge; that is for you or some other good samaritan to solve, and this is just a nudge in your direction. The value for each key is actually a list holding a single string. If the values were actual integers, the problem would not be too complicated. Also note, the expected output you wrote based on your requirement is actually typed wrong: you missed a few values.
In case of having integers instead of a string, I can show you one approach that as a beginner you can hopefully understand:
dict_in = {'6': [2,9,8,10], '1': [3,5,8,9,10,12], '4': [2,5,7,8,9], '2': [3,4,7,6,13], '12': [1,7,5,9], '3': [9,11,10,1,2,13], '10': [1,3,6,11], '5': [4,1,7,11,12], '13': [2,3], '8': [1,6,4,11], '7': [5,2,4,9,12], '11': [3,5,10,8], '9': [12,1,3,6,4,7]}
dict_out = {}
for key in dict_in:
    if key == "6" or key == "4" or key == "13":
        for k, v in dict_in.items():
            for y in v:
                if int(key) in v and y != int(key):
                    dict_out.setdefault(key, []).append(y)
Output:
{'6': [3, 4, 7, 13, 1, 3, 11, 1, 4, 11, 12, 1, 3, 4, 7], '4': [3, 7, 6, 13, 1, 7, 11, 12, 1, 6, 11, 5, 2, 9, 12, 12, 1, 3, 6, 7], '13': [3, 4, 7, 6, 9, 11, 10, 1, 2]}
Last note: I have no clue why on Earth you wanted the only keys left to be 6, 4 and 13.
In any case, do not consider this a full answer.
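For the extra challenge mentioned above, a sketch that first converts each one-element list like ['2,9,8,10'] into a real list of ints, so the approach shown can run on the original dict_in:
# split the single comma-separated string and convert each piece to int
dict_in = {k: [int(n) for n in v[0].split(',')] for k, v in dict_in.items()}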
Not sure what you mean, so I just created something that will cover most of it. Change it to what you want.
dict_in = {'6': ['2,9,8,10'], '1': ['3,5,8,9,10,12'], '4': ['2,5,7,8,9'], '2': ['3,4,7,6,13'], '12': ['1,7,5,9'], '3': ['9,11,10,1,2,13'], '10': ['1,3,6,11'], '5': ['4,1,7,11,12'], '13': ['2,3'], '8': ['1,6,4,11'], '7': ['5,2,4,9,12'], '11': ['3,5,10,8'], '9': ['12,1,3,6,4,7']}
dict_out = {}
for key, values in dict_in.items():
    for item in values:
        if True:  # your condition
            if key in dict_out.keys():
                dict_out[key].append(item)
            else:
                dict_out[key] = [item]
print(dict_out)

How to compare XLS file to json file?

Hi, I have a task to compare some XLS files to JSON files. I have working code, but the loop compares only the first row from the XLS; it does not go on to the next one.
def verify_xlsx_file_content_against_json(json_definition, xlsx_path):
    found_configs_ct = 0
    xlsx_reader = XLSXReader()
    data_frame = xlsx_reader.read_xlsx(xlsx_path, "Configuration Set")
    for config in json_definition:
        items_compared = 0
        for item in config["content"]["items"]:
            for index, row in data_frame.iterrows():
                if config["segAlias"] == row["Seg Alias"] and item["itemsOrder"] == row["Items Order"]:
                    found_configs_ct += 1
                    if index == 0:
                        def_platforms = config["platforms"]
                        platforms = ["gp", "ios", "win32", "fb", "amazon"]
                        for platform in platforms:
                            if platform in def_platforms:
                                assert_equal_values(row[convert_def_platform_to_excel_platform_name(platform)], "Y", "{" + platform + "} should be True! and is not.")
                            else:
                                assert_equal_values(row[convert_def_platform_to_excel_platform_name(platform)], "N", "{" + platform + "} should be False! and is not.")
                        name_from_definition = config["content"]["items"][0]["name"]  # all items have the same name thus index 0 is sufficient
                        assert_equal_values(row["Name"], name_from_definition, "Content name is invalid")
                    Logger.LogInfo(str(item["identifier"]) + ' identifier')
                    Logger.LogInfo(str(row["Identifier"]) + ' Identifier')
                    Logger.LogInfo(str(item["itemsOrder"]) + ' itemsOrder')
                    Logger.LogInfo(str(row["Items Order"]) + ' Items Order')
                    Logger.LogInfo(str(item["dValue"]) + ' dValue')
                    Logger.LogInfo(str(row["D Value, $"]) + ' D Value, $')
                    Logger.LogInfo(str(row["Pack Image"]) + ' PackImage from row <<<<<<<<<<<<<<<')
                    Logger.LogInfo(str(item["packImage"]) + ' packImage second from item-<<<<<<<<')
                    assert_equal_values(str(item["identifier"]), str(row["Identifier"]), "Identifier is not match!")
                    assert_equal_values(str(item["itemsOrder"]), str(row["Items Order"]), "Items Order is not match!")
                    assert_equal_values(str(item["packImage"]), str(row["Pack Image"]), "packImage is not match!")
                    assert_equal_values(str(item["dValue"]), str(row["D Value, $"]), "dValue not match!")
                    # assert_equal_values(str(item["remarks"]), str(row["Remarks"]), "Remarks is not match!")
                    items_compared += 1
                    break
        assert_equal_values(items_compared, len(config["content"]["items"]), "number of items compared is less than actual items")
        return False
    logging.info("verification done. ---no problems detected--")
    return True  # if code got here, no errors on the way, return True
what I get here is
INFO:root:[2020-02-14 09:33:24.543957] 0 identifier
INFO:root:[2020-02-14 09:33:24.543957] -1 Identifier
INFO:root:[2020-02-14 09:33:24.543957] 1 itemsOrder
INFO:root:[2020-02-14 09:33:24.543957] 1 Items Order
INFO:root:[2020-02-14 09:33:24.543957] 1.99 dValue
INFO:root:[2020-02-14 09:33:24.543957] 1.99 D Value, $
INFO:root:[2020-02-14 09:33:24.544953] 1 identifier
INFO:root:[2020-02-14 09:33:24.544953] -1 Identifier
INFO:root:[2020-02-14 09:33:24.544953] 2 itemsOrder
INFO:root:[2020-02-14 09:33:24.544953] 1 Items Order
INFO:root:[2020-02-14 09:33:24.544953] 2.99 dValue
INFO:root:[2020-02-14 09:33:24.544953] 1.99 D Value,
INFO:root:[2020-02-14 09:33:24.544953] 2 identifier
INFO:root:[2020-02-14 09:33:24.545949] -1 Identifier
So Identifier is always -1, while it should change incrementally (0, 1, 2, ...) like in the JSON file.
How do I change this loop to make this code work, and later assert those files, JSON vs XLS?
What I did
df = pd.read_excel("download1.xlsx") #xlsx to json
json_from_excel = df.to_json() #json string
print(type(json_from_excel))
print(json_from_excel)
final_dictionary = json.loads(json_from_excel) #dict
print(type(final_dictionary))
print(final_dictionary)
with open('basic.json', 'r') as jason_file:
    data = json.loads(jason_file.read())
    for item in data:
        print(type(item))
and the result is
<class 'str'>
{"Name":{"0":"default_config","1":"default_config","2":"default_config","3":"default_config","4":"default_config","5":"gems1","6":"gems1","7":"gems1","8":"gems1","9":"gems1","10":"gems2","11":"gems2","12":"gems2","13":"gems2","14":"gems2","15":"gems3","16":"gems3","17":"gems3","18":"gems3","19":"gems3"},"Identifier":{"0":11294,"1":11294,"2":11294,"3":11294,"4":11294,"5":11295,"6":11295,"7":11295,"8":11295,"9":11295,"10":11296,"11":11296,"12":11296,"13":11296,"14":11296,"15":11297,"16":11297,"17":11297,"18":11297,"19":11297},"Segment Alias":{"0":"default","1":"default","2":"default","3":"default","4":"default","5":"rhinoQA_1","6":"rhinoQA_1","7":"rhinoQA_1","8":"rhinoQA_1","9":"rhinoQA_1","10":"rhinoQA_2","11":"rhinoQA_2","12":"rhinoQA_2","13":"rhinoQA_2","14":"rhinoQA_2","15":"rhinoQA_3","16":"rhinoQA_3","17":"rhinoQA_3","18":"rhinoQA_3","19":"rhinoQA_3"},"Segment ID":{"0":-1,"1":-1,"2":-1,"3":-1,"4":-1,"5":95365,"6":95365,"7":95365,"8":95365,"9":95365,"10":95366,"11":95366,"12":95366,"13":95366,"14":95366,"15":95367,"16":95367,"17":95367,"18":95367,"19":95367},"GP":{"0":"Y","1":"Y","2":"Y","3":"Y","4":"Y","5":"N","6":"N","7":"N","8":"N","9":"N","10":"N","11":"N","12":"N","13":"N","14":"N","15":"N","16":"N","17":"N","18":"N","19":"N"},"iOS":{"0":"Y","1":"Y","2":"Y","3":"Y","4":"Y","5":"Y","6":"Y","7":"Y","8":"Y","9":"Y","10":"N","11":"N","12":"N","13":"N","14":"N","15":"N","16":"N","17":"N","18":"N","19":"N"},"FB":{"0":"Y","1":"Y","2":"Y","3":"Y","4":"Y","5":"N","6":"N","7":"N","8":"N","9":"N","10":"Y","11":"Y","12":"Y","13":"Y","14":"Y","15":"N","16":"N","17":"N","18":"N","19":"N"},"Amazon":{"0":"Y","1":"Y","2":"Y","3":"Y","4":"Y","5":"N","6":"N","7":"N","8":"N","9":"N","10":"N","11":"N","12":"N","13":"N","14":"N","15":"Y","16":"Y","17":"Y","18":"Y","19":"Y"},"Win32":{"0":"Y","1":"Y","2":"Y","3":"Y","4":"Y","5":"N","6":"N","7":"N","8":"N","9":"N","10":"N","11":"N","12":"N","13":"N","14":"N","15":"N","16":"N","17":"N","18":"N","19":"N"},"Remarks":{"0":"default;A.request cfa for test user and any non existing platform to get default B. request user 520000000101987 (has segment in hbi redis,but not in config), any platform","1":"default;A.request cfa for test user and any non existing platform to get default B. request user 520000000101987 (has segment in hbi redis,but not in config), any platform","2":"default;A.request cfa for test user and any non existing platform to get default B. request user 520000000101987 (has segment in hbi redis,but not in config), any platform","3":"default;A.request cfa for test user and any non existing platform to get default B. request user 520000000101987 (has segment in hbi redis,but not in config), any platform","4":"default;A.request cfa for test user and any non existing platform to get default B. 
request user 520000000101987 (has segment in hbi redis,but not in config), any platform","5":"*inserted_by_automated_test*","6":"*inserted_by_automated_test*","7":"*inserted_by_automated_test*","8":"*inserted_by_automated_test*","9":"*inserted_by_automated_test*","10":"x1","11":"x1","12":"x1","13":"x1","14":"x1","15":"x3","16":"x3","17":"x3","18":"x3","19":"x3"},"Pack Name":{"0":"test_bingo","1":"test_bingo","2":"test_bingo","3":"test_bingo","4":"test_bingo","5":"test_bingo","6":"test_bingo","7":"test_bingo","8":"test_bingo","9":"test_bingo","10":"test_bingo","11":"test_bingo","12":"test_bingo","13":"test_bingo","14":"test_bingo","15":"test_bingo","16":"test_bingo","17":"test_bingo","18":"test_bingo","19":"test_bingo"},"SKU":{"0":"gems_99","1":"gems_99","2":"gems_99","3":"gems_99","4":"gems_99","5":"gems_99","6":"gems_99","7":"gems_99","8":"gems_99","9":"gems_99","10":"gems_99","11":"gems_99","12":"gems_99","13":"gems_99","14":"gems_99","15":"gems_99","16":"gems_99","17":"gems_99","18":"gems_99","19":"gems_99"},"Sold Item":{"0":"Gems","1":"Gems","2":"Gems","3":"Gems","4":"Gems","5":"Gems","6":"Gems","7":"Gems","8":"Gems","9":"Gems","10":"Gems","11":"Gems","12":"Gems","13":"Gems","14":"Gems","15":"Gems","16":"Gems","17":"Gems","18":"Gems","19":"Gems"},"Items Order":{"0":1,"1":2,"2":3,"3":4,"4":5,"5":1,"6":2,"7":3,"8":4,"9":5,"10":1,"11":2,"12":3,"13":4,"14":5,"15":1,"16":2,"17":3,"18":4,"19":5},"Dollar Value, $":{"0":1.99,"1":2.99,"2":3.99,"3":4.99,"4":5.99,"5":1.99,"6":2.99,"7":3.99,"8":4.99,"9":5.99,"10":1.99,"11":11.99,"12":13.99,"13":14.99,"14":15.99,"15":1.99,"16":2.99,"17":3.99,"18":4.99,"19":5.99},"Item Quantity":{"0":5,"1":15,"2":25,"3":35,"4":45,"5":5,"6":15,"7":25,"8":55,"9":65,"10":5,"11":15,"12":45,"13":55,"14":75,"15":5,"16":15,"17":55,"18":95,"19":85},"Is Best":{"0":"Y","1":"Y","2":"Y","3":"Y","4":"Y","5":"Y","6":"Y","7":"Y","8":"Y","9":"Y","10":"Y","11":"Y","12":"Y","13":"Y","14":"Y","15":"Y","16":"Y","17":"Y","18":"Y","19":"Y"},"Is Popular":{"0":"N","1":"N","2":"N","3":"N","4":"N","5":"N","6":"N","7":"N","8":"N","9":"N","10":"N","11":"N","12":"N","13":"N","14":"N","15":"N","16":"N","17":"N","18":"N","19":"N"},"Footer Type":{"0":1,"1":1,"2":1,"3":1,"4":1,"5":1,"6":1,"7":1,"8":1,"9":1,"10":1,"11":1,"12":1,"13":1,"14":1,"15":1,"16":1,"17":1,"18":1,"19":1},"Footer Text":{"0":"test_text","1":"test_text","2":"test_text","3":"test_text","4":"test_text","5":"test_text","6":"test_text","7":"test_text","8":"test_text","9":"test_text","10":"test_text","11":"test_text","12":"test_text","13":"test_text","14":"test_text","15":"test_text","16":"test_text","17":"test_text","18":"test_text","19":"test_text"},"Pack Image":{"0":"gems_1.png","1":"gems_2.png","2":"gems_3.png","3":"gems_4.png","4":"gems_5.png","5":"gems_1.png","6":"gems_2.png","7":"gems_3.png","8":"gems_4.png","9":"gems_5.png","10":"gems_1.png","11":"gems_2.png","12":"gems_3.png","13":"gems_4.png","14":"gems_5.png","15":"gems_1.png","16":"gems_2.png","17":"gems_3.png","18":"gems_4.png","19":"gems_5.png"}}
<class 'dict'>
{'Name': {'0': 'default_config', '1': 'default_config', '2': 'default_config', '3': 'default_config', '4': 'default_config', '5': 'gems1', '6': 'gems1', '7': 'gems1', '8': 'gems1', '9': 'gems1', '10': 'gems2', '11': 'gems2', '12': 'gems2', '13': 'gems2', '14': 'gems2', '15': 'gems3', '16': 'gems3', '17': 'gems3', '18': 'gems3', '19': 'gems3'}, 'Identifier': {'0': 11294, '1': 11294, '2': 11294, '3': 11294, '4': 11294, '5': 11295, '6': 11295, '7': 11295, '8': 11295, '9': 11295, '10': 11296, '11': 11296, '12': 11296, '13': 11296, '14': 11296, '15': 11297, '16': 11297, '17': 11297, '18': 11297, '19': 11297}, 'Segment Alias': {'0': 'default', '1': 'default', '2': 'default', '3': 'default', '4': 'default', '5': 'rhinoQA_1', '6': 'rhinoQA_1', '7': 'rhinoQA_1', '8': 'rhinoQA_1', '9': 'rhinoQA_1', '10': 'rhinoQA_2', '11': 'rhinoQA_2', '12': 'rhinoQA_2', '13': 'rhinoQA_2', '14': 'rhinoQA_2', '15': 'rhinoQA_3', '16': 'rhinoQA_3', '17': 'rhinoQA_3', '18': 'rhinoQA_3', '19': 'rhinoQA_3'}, 'Segment ID': {'0': -1, '1': -1, '2': -1, '3': -1, '4': -1, '5': 95365, '6': 95365, '7': 95365, '8': 95365, '9': 95365, '10': 95366, '11': 95366, '12': 95366, '13': 95366, '14': 95366, '15': 95367, '16': 95367, '17': 95367, '18': 95367, '19': 95367}, 'GP': {'0': 'Y', '1': 'Y', '2': 'Y', '3': 'Y', '4': 'Y', '5': 'N', '6': 'N', '7': 'N', '8': 'N', '9': 'N', '10': 'N', '11': 'N', '12': 'N', '13': 'N', '14': 'N', '15': 'N', '16': 'N', '17': 'N', '18': 'N', '19': 'N'}, 'iOS': {'0': 'Y', '1': 'Y', '2': 'Y', '3': 'Y', '4': 'Y', '5': 'Y', '6': 'Y', '7': 'Y', '8': 'Y', '9': 'Y', '10': 'N', '11': 'N', '12': 'N', '13': 'N', '14': 'N', '15': 'N', '16': 'N', '17': 'N', '18': 'N', '19': 'N'}, 'FB': {'0': 'Y', '1': 'Y', '2': 'Y', '3': 'Y', '4': 'Y', '5': 'N', '6': 'N', '7': 'N', '8': 'N', '9': 'N', '10': 'Y', '11': 'Y', '12': 'Y', '13': 'Y', '14': 'Y', '15': 'N', '16': 'N', '17': 'N', '18': 'N', '19': 'N'}, 'Amazon': {'0': 'Y', '1': 'Y', '2': 'Y', '3': 'Y', '4': 'Y', '5': 'N', '6': 'N', '7': 'N', '8': 'N', '9': 'N', '10': 'N', '11': 'N', '12': 'N', '13': 'N', '14': 'N', '15': 'Y', '16': 'Y', '17': 'Y', '18': 'Y', '19': 'Y'}, 'Win32': {'0': 'Y', '1': 'Y', '2': 'Y', '3': 'Y', '4': 'Y', '5': 'N', '6': 'N', '7': 'N', '8': 'N', '9': 'N', '10': 'N', '11': 'N', '12': 'N', '13': 'N', '14': 'N', '15': 'N', '16': 'N', '17': 'N', '18': 'N', '19': 'N'}, 'Remarks': {'0': 'default;A.request cfa for test user and any non existing platform to get default B. request user 520000000101987 (has segment in hbi redis,but not in config), any platform', '1': 'default;A.request cfa for test user and any non existing platform to get default B. request user 520000000101987 (has segment in hbi redis,but not in config), any platform', '2': 'default;A.request cfa for test user and any non existing platform to get default B. request user 520000000101987 (has segment in hbi redis,but not in config), any platform', '3': 'default;A.request cfa for test user and any non existing platform to get default B. request user 520000000101987 (has segment in hbi redis,but not in config), any platform', '4': 'default;A.request cfa for test user and any non existing platform to get default B. 
request user 520000000101987 (has segment in hbi redis,but not in config), any platform', '5': '*inserted_by_automated_test*', '6': '*inserted_by_automated_test*', '7': '*inserted_by_automated_test*', '8': '*inserted_by_automated_test*', '9': '*inserted_by_automated_test*', '10': 'x1', '11': 'x1', '12': 'x1', '13': 'x1', '14': 'x1', '15': 'x3', '16': 'x3', '17': 'x3', '18': 'x3', '19': 'x3'}, 'Pack Name': {'0': 'test_bingo', '1': 'test_bingo', '2': 'test_bingo', '3': 'test_bingo', '4': 'test_bingo', '5': 'test_bingo', '6': 'test_bingo', '7': 'test_bingo', '8': 'test_bingo', '9': 'test_bingo', '10': 'test_bingo', '11': 'test_bingo', '12': 'test_bingo', '13': 'test_bingo', '14': 'test_bingo', '15': 'test_bingo', '16': 'test_bingo', '17': 'test_bingo', '18': 'test_bingo', '19': 'test_bingo'}, 'SKU': {'0': 'gems_99', '1': 'gems_99', '2': 'gems_99', '3': 'gems_99', '4': 'gems_99', '5': 'gems_99', '6': 'gems_99', '7': 'gems_99', '8': 'gems_99', '9': 'gems_99', '10': 'gems_99', '11': 'gems_99', '12': 'gems_99', '13': 'gems_99', '14': 'gems_99', '15': 'gems_99', '16': 'gems_99', '17': 'gems_99', '18': 'gems_99', '19': 'gems_99'}, 'Sold Item': {'0': 'Gems', '1': 'Gems', '2': 'Gems', '3': 'Gems', '4': 'Gems', '5': 'Gems', '6': 'Gems', '7': 'Gems', '8': 'Gems', '9': 'Gems', '10': 'Gems', '11': 'Gems', '12': 'Gems', '13': 'Gems', '14': 'Gems', '15': 'Gems', '16': 'Gems', '17': 'Gems', '18': 'Gems', '19': 'Gems'}, 'Items Order': {'0': 1, '1': 2, '2': 3, '3': 4, '4': 5, '5': 1, '6': 2, '7': 3, '8': 4, '9': 5, '10': 1, '11': 2, '12': 3, '13': 4, '14': 5, '15': 1, '16': 2, '17': 3, '18': 4, '19': 5}, 'Dollar Value, $': {'0': 1.99, '1': 2.99, '2': 3.99, '3': 4.99, '4': 5.99, '5': 1.99, '6': 2.99, '7': 3.99, '8': 4.99, '9': 5.99, '10': 1.99, '11': 11.99, '12': 13.99, '13': 14.99, '14': 15.99, '15': 1.99, '16': 2.99, '17': 3.99, '18': 4.99, '19': 5.99}, 'Item Quantity': {'0': 5, '1': 15, '2': 25, '3': 35, '4': 45, '5': 5, '6': 15, '7': 25, '8': 55, '9': 65, '10': 5, '11': 15, '12': 45, '13': 55, '14': 75, '15': 5, '16': 15, '17': 55, '18': 95, '19': 85}, 'Is Best': {'0': 'Y', '1': 'Y', '2': 'Y', '3': 'Y', '4': 'Y', '5': 'Y', '6': 'Y', '7': 'Y', '8': 'Y', '9': 'Y', '10': 'Y', '11': 'Y', '12': 'Y', '13': 'Y', '14': 'Y', '15': 'Y', '16': 'Y', '17': 'Y', '18': 'Y', '19': 'Y'}, 'Is Popular': {'0': 'N', '1': 'N', '2': 'N', '3': 'N', '4': 'N', '5': 'N', '6': 'N', '7': 'N', '8': 'N', '9': 'N', '10': 'N', '11': 'N', '12': 'N', '13': 'N', '14': 'N', '15': 'N', '16': 'N', '17': 'N', '18': 'N', '19': 'N'}, 'Footer Type': {'0': 1, '1': 1, '2': 1, '3': 1, '4': 1, '5': 1, '6': 1, '7': 1, '8': 1, '9': 1, '10': 1, '11': 1, '12': 1, '13': 1, '14': 1, '15': 1, '16': 1, '17': 1, '18': 1, '19': 1}, 'Footer Text': {'0': 'test_text', '1': 'test_text', '2': 'test_text', '3': 'test_text', '4': 'test_text', '5': 'test_text', '6': 'test_text', '7': 'test_text', '8': 'test_text', '9': 'test_text', '10': 'test_text', '11': 'test_text', '12': 'test_text', '13': 'test_text', '14': 'test_text', '15': 'test_text', '16': 'test_text', '17': 'test_text', '18': 'test_text', '19': 'test_text'}, 'Pack Image': {'0': 'gems_1.png', '1': 'gems_2.png', '2': 'gems_3.png', '3': 'gems_4.png', '4': 'gems_5.png', '5': 'gems_1.png', '6': 'gems_2.png', '7': 'gems_3.png', '8': 'gems_4.png', '9': 'gems_5.png', '10': 'gems_1.png', '11': 'gems_2.png', '12': 'gems_3.png', '13': 'gems_4.png', '14': 'gems_5.png', '15': 'gems_1.png', '16': 'gems_2.png', '17': 'gems_3.png', '18': 'gems_4.png', '19': 'gems_5.png'}}
{'id': 0, 'segmentAlias': 'default', 'userIds': [-1], 'platforms': ['gp'], 'content': {'items': [{'name': 'default_config', 'identifier': 0, 'itemsOrder': 1, 'dollarValue': 1.99, 'remarks': 'default!', 'itemsQuantity': 5, 'is_best': 'Y', 'is_popular': 'N', 'packName': 'test_bingo', 'footerType': 1, 'footerText': 'test_text', 'soldItem': 'Gems', 'packImage': 'gems_1.png', 'SKU': 'gems_99'}, {'name': 'default_config', 'identifier': 1, 'itemsOrder': 2, 'dollarValue': 2.99, 'remarks': 'default!', 'itemsQuantity': 15, 'is_best': 'Y', 'is_popular': 'N', 'packName': 'test_bingo', 'footerType': 1, 'footerText': 'test_text', 'soldItem': 'Gems', 'packImage': 'gems_2.png', 'SKU': 'gems_99'}, {'name': 'default_config', 'identifier': 2, 'itemsOrder': 3, 'dollarValue': 3.99, 'remarks': 'default!', 'itemsQuantity': 25, 'is_best': 'Y', 'is_popular': 'N', 'packName': 'test_bingo', 'footerType': 1, 'footerText': 'test_text', 'soldItem': 'Gems', 'packImage': 'gems_3.png', 'SKU': 'gems_99'}, {'name': 'default_config', 'identifier': 3, 'itemsOrder': 4, 'dollarValue': 4.99, 'remarks': 'default!', 'itemsQuantity': 35, 'is_best': 'Y', 'is_popular': 'N', 'packName': 'test_bingo', 'footerType': 1, 'footerText': 'test_text', 'soldItem': 'Gems', 'packImage': 'gems_4.png', 'SKU': 'gems_99'}, {'name': 'default_config', 'identifier': 4, 'itemsOrder': 5, 'dollarValue': 5.99, 'remarks': 'default!', 'itemsQuantity': 45, 'is_best': 'Y', 'is_popular': 'N', 'packName': 'test_bingo', 'footerType': 1, 'footerText': 'test_text', 'soldItem': 'Gems', 'packImage': 'gems_5.png', 'SKU': 'gems_99'}]}}
<class 'dict'>
{'id': 1, 'segmentAlias': 's1', 'userIds': [-1], 'platforms': ['ios'], 'content': {'items': [{'name': 'gems1', 'identifier': 0, 'itemsOrder': 1, 'dollarValue': 1.99, 'remarks': 'x', 'itemsQuantity': 5,
I cannot compare those dicts.
You will need to read the Excel file and convert it to JSON with pandas. Also check which format your JSON file is in when converting from pandas; maybe you need to change the orientation.
df = pd.read_excel(xlsx, sheet_name=sheet)
json_from_excel = df.to_json(orient='records')
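To see why the orientation matters, compare the two layouts on a small frame (a sketch; the column names here are just examples):
import pandas as pd
toy = pd.DataFrame({'Name': ['gems1', 'gems2'], 'Items Order': [1, 2]})
print(toy.to_json())  # column-oriented: {"Name":{"0":"gems1","1":"gems2"},...}
print(toy.to_json(orient='records'))  # row-oriented: [{"Name":"gems1","Items Order":1},...]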
Next you will need to order your JSON data; here is an example function.
def ordered(obj):
    if isinstance(obj, dict):
        return sorted((k, ordered(v)) for k, v in obj.items())
    if isinstance(obj, list):
        return sorted(ordered(x) for x in obj)
    else:
        return obj
And finally you can do the comparison.
if ordered(json_from_excel) == ordered(json_data):
    # do something
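Putting it together with the file names from the question (a sketch; it assumes basic.json holds the list of config dicts shown above):
import json
import pandas as pd
df = pd.read_excel('download1.xlsx')
json_from_excel = json.loads(df.to_json(orient='records'))  # list of row dicts
with open('basic.json') as f:
    json_data = json.load(f)
if ordered(json_from_excel) == ordered(json_data):
    print('contents match')
else:
    print('contents differ')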

How to return key for column with smallest values

I have this dictionary:
d= {'1': { '2': 1, '3': 0, '4': 0, '5': 1, '6': 29 }
,'2': {'1': 13, '3': 1, '4': 0, '5': 21, '6': 0 }
,'3': {'1': 0, '2': 0, '4': 1, '5': 0, '6': 1 }
,'4': {'1': 1, '2': 17, '3': 1, '5': 2, '6': 0 }
,'5': {'1': 39, '2': 1, '3': 0, '4': 0, '6': 14 }
,'6': {'1': 0, '2': 0, '3': 43, '4': 1, '5': 0 }
}
I want to write a function that returns the column where all the values are <2 (less than 2).
So far I have turned the dictionary into a list and then tried a lot of things that didn't work... I know that the answer is column number 4.
This was my latest attempt:
def findFirstRead(overlaps):
    e = [[d[str(i)].get(str(j), '-') for j in range(1, 7)] for i in range(1, 7)]
    nested_list = e
    for i in map(itemgetter(x), nested_list):
        if i < 2:
            return x + 1
        else:
            continue
...and it was very wrong
The following set and list comprehension lists the columns where every value in the column is less than 2:
columns = {c for row in d.values() for c in row}
[c for c in columns if max(v.get(c, -1) for v in d.values()) < 2]
This returns ['4'].
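The same logic spelled out with explicit loops, in case the comprehensions are hard to follow (the -1 default covers the key missing from its own row):
columns = set()
for row in d.values():
    columns.update(row)  # collect every column key that appears
result = []
for c in columns:
    if all(row.get(c, -1) < 2 for row in d.values()):
        result.append(c)  # every value present in this column is < 2
print(result)  # ['4']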

Making a dictionary of overlaps from a dictionary

This problem is teasing me:
I have 6 different sequences that each overlap; they are named 1-6.
I have made a function that represents the sequences in a dictionary, and a function that gives me the part of the sequences that overlap.
Now I should use those two functions to construct a dictionary that holds the number of overlapping positions in both the right-to-left and the left-to-right order.
The dictionary I have made looks like:
{'1': 'GGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTCGTCCAGACCCCTAGC',
'2': 'CTTTACCCGGAAGAGCGGGACGCTGCCCTGCGCGATTCCAGGCTCCCCACGGG',
'3': 'GTCTTCAGTAGAAAATTGTTTTTTTCTTCCAAGAGGTCGGAGTCGTGAACACATCAGT',
'4': 'TGCGAGGGAAGTGAAGTATTTGACCCTTTACCCGGAAGAGCG',
'5': 'CGATTCCAGGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTC',
'6': 'TGACAGTAGATCTCGTCCAGACCCCTAGCTGGTACGTCTTCAGTAGAAAATTGTTTTTTTCTTCCAAGAGGTCGGAGT'}
I should end up with a result like:
{'1': {'3': 0, '2': 1, '5': 1, '4': 0, '6': 29},
'3': {'1': 0, '2': 0, '5': 0, '4': 1, '6': 1},
'2': {'1': 13, '3': 1, '5': 21, '4': 0, '6': 0},
'5': {'1': 39, '3': 0, '2': 1, '4': 0, '6': 14},
'4': {'1': 1, '3': 1, '2': 17, '5': 2, '6': 0},
'6': {'1': 0, '3': 43, '2': 0, '5': 0, '4': 1}}
It seems impossible.
I guess it's not, so if somebody could (not do it, but) push me in the right direction, it would be great.
This is a bit of a complicated one-liner, but it should work. Using find_overlaps() as the function that finds overlaps and seq_dict as the original dictionary of sequences:
overlaps = {seq: {other_seq: find_overlaps(seq_dict[seq], seq_dict[other_seq])
                  for other_seq in seq_dict if other_seq != seq}
            for seq in seq_dict}
Here it is with a bit nicer spacing:
overlaps = \
    {seq:
        {other_seq:
            find_overlaps(seq_dict[seq], seq_dict[other_seq])
         for other_seq in seq_dict if other_seq != seq}
     for seq in seq_dict}
The clean way:
dna = {
    '1': 'GGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTCGTCCAGACCCCTAGC',
    '2': 'CTTTACCCGGAAGAGCGGGACGCTGCCCTGCGCGATTCCAGGCTCCCCACGGG',
    '3': 'GTCTTCAGTAGAAAATTGTTTTTTTCTTCCAAGAGGTCGGAGTCGTGAACACATCAGT',
    '4': 'TGCGAGGGAAGTGAAGTATTTGACCCTTTACCCGGAAGAGCG',
    '5': 'CGATTCCAGGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTC',
    '6': 'TGACAGTAGATCTCGTCCAGACCCCTAGCTGGTACGTCTTCAGTAGAAAATTG'
         'TTTTTTTCTTCCAAGAGGTCGGAGT'
}
def overlap(a, b):
    l = min(len(a), len(b))
    while True:
        if a[-l:] == b[:l] or l == 0:
            return l
        l -= 1

def all_overlaps(d):
    result = {}
    for k1, v1 in d.items():
        overlaps = {}
        for k2, v2 in d.items():
            if k1 == k2:
                continue
            overlaps[k2] = overlap(v1, v2)
        result[k1] = overlaps
    return result

print(all_overlaps(dna))
(By the way, you could've provided overlap yourself in the question to make it easier for everyone to answer.)
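As a quick sanity check of overlap against the expected output in the question, the 29-character suffix of sequence 1 is exactly the prefix of sequence 6:
print(overlap(dna['1'], dna['6']))  # 29 ('TGACAGTAGATCTCGTCCAGACCCCTAGC')
print(overlap(dna['6'], dna['1']))  # 0: no suffix of 6 starts sequence 1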
