Okay, so I'm having an issue with JSON, keys, and strings. I'm using json.dump in Python to save my game dictionaries, and that works. The problem is on load: the game uses int values as keys in the world dictionary, but JSON stores keys as strings. Here's a random generation I did:
worldmap = {
'Regions': {
1: 'Zelbridge', 2: 'Forest Path', 3: 'Baronbell', 4: 'Old Path', 5: 'Cariva', 6: 'Prairie Path'},
'Zelbridge': {1: 'Field', 2: 'Prairie Path', 3: 'School', 4: 'Mountain Path', 5: 'Graveyard',
6: 'Old Path', 7: 'Blacksmith', 8: 'Forest Path', 9: 'Doctor', 0: 'Zelbridge'},
'Forest Path': {1: 'Trees', 2: 'Bushes', 3: 'Path', 4: 'Cariva', 5: 'Path',
6: 'Baronbell', 7: 'Path', 8: 'Zelbridge', 9: 'Path', 0: 'Forest Path'},
'Baronbell': {1: 'House', 2: 'Mountain Path', 3: 'Graveyard', 4: 'Old Path', 5: 'Field',
6: 'Forest Path', 7: 'Church', 8: 'Prairie Path', 9: 'Shop', 0: 'Baronbell'},
'Old Path': {1: 'Path', 2: 'Trees', 3: 'Bushes', 4: 'Cariva', 5: 'Path',
6: 'Zelbridge', 7: 'Trees', 8: 'Baronbell', 9: 'Trees', 0: 'Old Path'},
'Cariva': {1: 'Cellar', 2: 'Old Path', 3: 'Graveyard', 4: 'Mountain Path', 5: 'Town Hall',
6: 'Prairie Path', 7: 'School', 8: 'Forest Path', 9: 'Blacksmith', 0: 'Cariva'},
'Prairie Path': {1: 'Bushes', 2: 'Path', 3: 'Path', 4: 'Zelbridge', 5: 'Trees',
6: 'Cariva', 7: 'Trees', 8: 'Baronbell', 9: 'Path', 0: 'Prairie Path'}
}
So when I use the load function I get KeyErrors because the ints were converted to strings. I tried a for loop to iterate over the keys and change them back, but I get an error about the dictionary changing size. Here's an example of me trying to load a different (and also random) world. It's set to print after each loop iteration, showing that it partly works:
What was your hero's name? #Input hero name
Loading...
{'2': 'Prairie Path', '3': 'Cariva', '4': 'Old Path', '5': 'Baronbell', '6': 'Mountain Path', 1: 'Zelbridge'} #Region number 1 no longer string
{'3': 'Cariva', '4': 'Old Path', '5': 'Baronbell', '6': 'Mountain Path', 1: 'Zelbridge', 2: 'Prairie Path'} #Region numbers 1 and 2 no longer string
{'4': 'Old Path', '5': 'Baronbell', '6': 'Mountain Path', 1: 'Zelbridge', 2: 'Prairie Path', 3: 'Cariva'} # etc.
{'5': 'Baronbell', '6': 'Mountain Path', 1: 'Zelbridge', 2: 'Prairie Path', 3: 'Cariva', 4: 'Old Path'} # etc.
{'6': 'Mountain Path', 1: 'Zelbridge', 2: 'Prairie Path', 3: 'Cariva', 4: 'Old Path', 5: 'Baronbell'} # Region numbers 1, 2, 3, 4, and 5 no longer string
{'6': 'Mountain Path', 1: 'Zelbridge', 2: 'Prairie Path', 3: 'Cariva', 5: 'Baronbell'}
Traceback (most recent call last):
File "C:/Users/crazy/PycharmProjects/rpg/main.py", line 581, in load
for key in worldmap["Regions"]:
RuntimeError: dictionary changed size during iteration
Process finished with exit code 1
I'm not exactly sure what's wrong with it, and sadly I'll also have to do this for each location within a region. Any and all help is appreciated; I've looked all over SO and Google to no avail.
with open(world_file) as infile:
    worldmap = json.load(infile)
copy = worldmap.copy()
regions = worldmap["Regions"]
locations = copy.pop("Regions")
for key in worldmap["Regions"]:
    value = worldmap["Regions"][key]
    new_key = int(key)
    worldmap["Regions"].update({new_key: value})
    worldmap["Regions"].pop(key)
    print(str(worldmap["Regions"]) + "\n")
Iterate over a snapshot of the items (via list(...)) so you can update keys inside the loop without a RuntimeError:
mydict = {1: 'a', 2: 'b'}
for index, (key, value) in enumerate(list(mydict.items())):
    print("index: {}, key: {}, value: {}".format(index, key, value))
    mydict[index] = mydict.pop(key)
Or you can use list() to force a copy of the keys to be made:
mydict = {1: 'a', 2: 'b'}
for index, key in enumerate(list(mydict)):
    mydict[index] = mydict.pop(key)
# which will give output like:
# ---------------------------
# {0: 'a', 1: 'b'}
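That said, for the original problem a dict comprehension avoids mutation-during-iteration entirely: build a new dict with the keys converted back to int right after json.load. A sketch (the int_keys helper name and the one-level structure are assumptions, not code from the question):

```python
import json

def int_keys(d):
    """Return a copy of d with digit-string keys converted back to int."""
    return {int(k) if isinstance(k, str) and k.lstrip('-').isdigit() else k: v
            for k, v in d.items()}

raw = json.loads('{"Regions": {"1": "Zelbridge", "2": "Forest Path"}}')
worldmap = {name: int_keys(sub) if isinstance(sub, dict) else sub
            for name, sub in raw.items()}
print(worldmap)  # {'Regions': {1: 'Zelbridge', 2: 'Forest Path'}}
```

json.load/json.loads also accept an object_pairs_hook callable, which would apply the same conversion to every nested dict as it is parsed, covering the per-region dicts too.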
Here's the data frame; the original has a million rows, so the solution needs to be efficient.
Code:
import pandas as pd
df_temp = pd.DataFrame({'Download Button Clicked Time': {0: '2021-10-24 12:39:27.189629',
1: '2021-10-24 12:42:06.346536',
2: '2021-10-24 12:42:06.369056',
3: '2021-10-24 12:42:11.551610',
4: '2021-10-24 12:44:38.475047',
5: '2021-10-24 12:46:33.331920',
6: '2021-10-24 12:46:33.346536',
7: '2021-10-24 12:46:33.369056',
8: '2021-10-24 12:46:33.421520',
9: '2021-10-24 12:46:33.404641'},
'Install Verified Time': {0: '2021-10-24 12:41:04.669589',
1: '2021-10-24 12:43:14.032023',
2: '2021-10-24 12:43:14.033913',
3: '2021-10-24 12:44:08.667666',
4: '2021-10-24 12:46:11.161883',
5: '2021-10-24 12:46:34.976129',
6: '2021-10-24 12:46:35.032023',
7: '2021-10-24 12:46:35.033913',
8: '2021-10-24 12:46:35.065320',
9: '2021-10-24 12:46:35.125156'},
'App ID': {0: 'com.foxbytecode.captureintruder',
1: 'in.onecode.app',
2: 'com.payworld.phoneapp',
3: 'messenger.messenger.videocall.messenger',
4: 'imagito.image.search',
5: 'reward.earn.talktime.sixer',
6: 'com.hivoco.app',
7: 'messenger.social.productivity.notifire',
8: 'com.foxbytecode.exiftool',
9: 'com.fliplearn.app'},
'Email ID': {0: 'mandeepsharma38276atwehoo.com',
1: 'luckychauhan1199atwehoo.com',
2: 'mandeepsharma38276atwehoo.com',
3: 'chettanmon40atwehoo.com',
4: 'kaliapradhan1413atwehoo.com',
5: 'pinkydevi69784atwehoo.com',
6: 'pinkydevi69784atwehoo.com',
7: 'pinkydevi69784atwehoo.com',
8: 'pinkydevi69784atwehoo.com',
9: 'pinkydevi69784atwehoo.com'},
'install_time': {0: 97.47996,
1: 68.29827800000001,
2: 120.708813,
3: 117.116056,
4: 92.686836,
5: 1.644209,
6: 1.6854870000000002,
7: 1.664857,
8: 1.6438000000000001,
9: 1.720515},
'fraud': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0, 8: 0, 9: 0}})
df_temp
The output should mark only the last FIVE rows with fraud = 1, but the current output is this:
The code I'm using to detect fraud and get this output is this:
df_temp['Download Button Clicked Time'] = df_temp['Download Button Clicked Time'].astype('datetime64[ns]')
df_temp['Install Verified Time'] = df_temp['Install Verified Time'].astype('datetime64[ns]')
df_temp['install_time'] = df_temp['Install Verified Time'] - df_temp['Download Button Clicked Time']
df_temp['install_time'] = df_temp['install_time'].dt.total_seconds()
df_temp['diff'] = df_temp.install_time.diff().abs()
def fraud_time(row):
    fraud = 0
    if row['install_time'] < 0.5:
        fraud = 1
    elif row['diff'] < 0.1:
        fraud = 1
    return fraud
df_temp['fraud'] = df_temp.apply(fraud_time, axis=1)
df_temp
I'm using Install Verified Time, which seems more sensible than Download Button Clicked Time. As you can see, the third row should not be marked 1, since the second and third rows have different emails. Also, the last five rows, not four, should be marked 1.
TL;DR
Detect fraud using pandas.DataFrame.diff, applied only within the same email address.
Edit: Frauds will have a very small time difference (say 0.02 seconds) for the SAME email, not different ones. Two different users installing two different apps within 2 milliseconds makes sense; the same user doing so doesn't.
The key is to group entries by email before taking successive differences:
# I assume you have done this already
df['Install Verified Time'] = pd.to_datetime(df['Install Verified Time'])
df['fraud'] = (df['install_time'] < 0.5) | (
    df.groupby('Email ID')['Install Verified Time'].diff() < pd.Timedelta(seconds=0.1)
)
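As a sketch of why the grouping matters (column names shortened and data invented for the example): diff() computed per email leaves NaT at each user's first install, and NaT compares False, so only a same-user gap under the threshold is flagged.

```python
import pandas as pd

df = pd.DataFrame({
    'email': ['a', 'b', 'a', 'a'],
    'verified': pd.to_datetime([
        '2021-10-24 12:00:00.00',
        '2021-10-24 12:00:00.05',  # close to the previous row, but a different user
        '2021-10-24 12:00:01.00',
        '2021-10-24 12:00:01.02',  # 0.02 s after the same user's previous install
    ]),
})
# diff() within each email group: NaT for a user's first install
gap = df.groupby('email')['verified'].diff()
df['fraud'] = gap < pd.Timedelta(seconds=0.1)
print(df['fraud'].tolist())  # [False, False, False, True]
```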
I've got a data frame:
json = {'contexts_ru_andata_master_cookies_1': {0: [{'_ym_uid': '1664978572350562652'}],
1: [{'_ym_uid': '1664978577951178500'}],
2: [{'_ym_uid': '1631015476823239589'}],
3: [{'_ym_uid': '1664945479855475653'}],
4: [{'_ym_uid': '1663327749550707020'}],
6: [{'_ym_uid': '1664978547593809275'}],
7: [{'_ym_uid': '16649783691007078342'}],
8: [{'_ym_uid': '1662551949642530804'}]}}
pd.DataFrame.from_dict(json)
I need to get the numeric value from each cell, like 1664978577951178500, etc. Any help will be appreciated.
It's not a correct JSON string, so you can use the re module to match the values:
import re
json = '''{'contexts_ru_andata_master_cookies_1': {0: [{'_ym_uid': '1664978572350562652'}],
1: [{'_ym_uid': '1664978577951178500'}],
2: [{'_ym_uid': '1631015476823239589'}],
3: [{'_ym_uid': '1664945479855475653'}],
4: [{'_ym_uid': '1663327749550707020'}],
6: [{'_ym_uid': '1664978547593809275'}],
7: [{'_ym_uid': '16649783691007078342'}],
8: [{'_ym_uid': '1662551949642530804'}]}}'''
res = re.finditer(r'\'[0-9]*\'', json)
cookies_l = []
for i in res:
    cookies_l.append(i.group()[1:-1])
print(cookies_l)
This is output:
['1664978572350562652', '1664978577951178500', '1631015476823239589', '1664945479855475653', '1663327749550707020', '1664978547593809275', '16649783691007078342', '1662551949642530804']
df['_ym_uid'] = df['contexts_ru_andata_master_cookies_1'].str[0].str['_ym_uid']
That was the answer: .str[0] takes element 0 of each list, and .str['_ym_uid'] looks up the dict key.
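Pandas can drill into list/dict cells with the .str accessor. A runnable sketch on a two-row reconstruction of the frame (values taken from the question): .str[0] extracts element 0 of each list, then .str['_ym_uid'] extracts that key from each dict.

```python
import pandas as pd

df = pd.DataFrame({'contexts_ru_andata_master_cookies_1': {
    0: [{'_ym_uid': '1664978572350562652'}],
    1: [{'_ym_uid': '1664978577951178500'}],
}})
# .str[0] takes element 0 of each list; .str['_ym_uid'] looks up the dict key.
df['_ym_uid'] = df['contexts_ru_andata_master_cookies_1'].str[0].str['_ym_uid']
print(df['_ym_uid'].tolist())  # ['1664978572350562652', '1664978577951178500']
```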
I have this dictionary:
{
0: array([-0.16638531, -0.11749843]),
1: array([-0.2318372 , 0.00917023]),
2: array([-0.42934129, -0.0675385 ]),
3: array([-0.63377579, -0.02102854]),
4: array([-0.26648222, -0.42038916]),
5: array([-0.17250316, -0.73490218]),
6: array([-0.42774336, -0.61259704]),
7: array([-0.55420825, -0.77304496]),
8: array([0.13900166, 0.07800885]),
9: array([0.42223986, 0.16563338]),
10: array([ 0.39895669, -0.09198566]),
12: array([0.24324618, 0.44829616]),
11: array([ 0.55394714, -0.17960723]),
13: array([0.192127 , 0.5988793]),
14: array([0.39554203, 0.7186038 ]),
15: array([0.53721604, 1. ])
}
I want to convert those numpy.ndarray values to tuples, and have something like this:
{
0: (-0.16638531, -0.11749843),
1: (-0.2318372 , 0.00917023),
...
}
From this answer, it looks like for each value in the dictionary you can do:
tuple(arr)
So for the whole dictionary you can probably do something like:
new_dict = {key: tuple(arr) for key, arr in old_dict.items()}
Or easier to understand:
new_dict = {}
for key, arr in old_dict.items():
    new_dict.update({key: tuple(arr)})
You can use a dictionary comprehension.
Python dictionaries have an .items() method that yields a (key, value) tuple for each key-value pair.
The comprehension recreates a new mapping with the original key and the array cast as a tuple.
from numpy import array
data = {
0: array([-0.16638531, -0.11749843]),
1: array([-0.2318372 , 0.00917023]),
2: array([-0.42934129, -0.0675385 ]),
3: array([-0.63377579, -0.02102854]),
4: array([-0.26648222, -0.42038916]),
5: array([-0.17250316, -0.73490218]),
6: array([-0.42774336, -0.61259704]),
7: array([-0.55420825, -0.77304496]),
8: array([0.13900166, 0.07800885]),
9: array([0.42223986, 0.16563338]),
10: array([ 0.39895669, -0.09198566]),
12: array([0.24324618, 0.44829616]),
11: array([ 0.55394714, -0.17960723]),
13: array([0.192127 , 0.5988793]),
14: array([0.39554203, 0.7186038 ]),
15: array([0.53721604, 1. ])
}
print({key: tuple(value) for key, value in data.items()})
OUTPUT:
{0: (-0.16638531, -0.11749843), 1: (-0.2318372, 0.00917023), 2: (-0.42934129, -0.0675385), 3: (-0.63377579, -0.02102854), 4: (-0.26648222, -0.42038916), 5: (-0.17250316, -0.73490218), 6: (-0.42774336, -0.61259704), 7: (-0.55420825, -0.77304496), 8: (0.13900166, 0.07800885), 9: (0.42223986, 0.16563338), 10: (0.39895669, -0.09198566), 12: (0.24324618, 0.44829616), 11: (0.55394714, -0.17960723), 13: (0.192127, 0.5988793), 14: (0.39554203, 0.7186038), 15: (0.53721604, 1.0)}
mapping = { key: (item[0], item[1]) for key, item in your_dict.items() }
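One caveat with any of these: the resulting tuple elements are still numpy.float64, not builtin floats. If plain Python floats matter (e.g. for JSON serialization), map float over each array first; a sketch:

```python
import numpy as np

old_dict = {0: np.array([-0.16638531, -0.11749843])}
new_dict = {key: tuple(map(float, arr)) for key, arr in old_dict.items()}
print(new_dict)            # {0: (-0.16638531, -0.11749843)}
print(type(new_dict[0][0]))  # <class 'float'>
```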
From this dictionary:
{0: (u'Donald', u'PERSON'), 1: (u'John', u'PERSON'), 2: (u'Trump', u'PERSON'), 14: (u'Barack', u'PERSON'), 15: (u'Obama', u'PERSON'), 17: (u'Michelle', u'PERSON'), 18: (u'Obama', u'PERSON'), 30: (u'Donald', u'PERSON'), 31: (u'John', u'PERSON'), 32: (u'Trump', u'PERSON')}
I'd like to create another dictionary as follows:
{u'Donald John Trump': 2, u'Barack Obama': 1, u'Michelle Obama': 1}
Here keys 0, 1, 2 and 30, 31, 32 each increase by 1, and that name sequence occurred twice; 14, 15 and 17, 18 occurred once each. Is there any way to create such a dict?
I think the main problem you need to solve is to identify persons by grouping keys denoting an increasing int sequence, as you described it.
Fortunately, Python has a recipe for this.
from itertools import groupby
from operator import itemgetter
from collections import defaultdict
dct = {
0: ('Donald', 'PERSON'),
1: ('John', 'PERSON'),
2: ('Trump', 'PERSON'),
14: ('Barack', 'PERSON'),
15: ('Obama', 'PERSON'),
17: ('Michelle', 'PERSON'),
18: ('Obama', 'PERSON'),
30: ('Donald', 'PERSON'),
31: ('John', 'PERSON'),
32: ('Trump', 'PERSON')
}
persons = defaultdict(int)  # Used for convenience
keys = sorted(dct.keys())   # So groupby() can recognize sequences
for k, g in groupby(enumerate(keys), lambda d: d[0] - d[1]):
    ids = map(itemgetter(1), g)                 # [0, 1, 2], [14, 15], etc.
    person = ' '.join(dct[i][0] for i in ids)   # "Donald John Trump", "Barack Obama", etc.
    persons[person] += 1
print(persons)
# defaultdict(<class 'int'>,
# {'Barack Obama': 1,
# 'Donald John Trump': 2,
# 'Michelle Obama': 1})
def add_name(d, consecutive_keys, result):
    result_key = ' '.join(d[k][0] for k in consecutive_keys)
    if result_key in result:
        result[result_key] += 1
    else:
        result[result_key] = 1
d = {0: (u'Donald', u'PERSON'), 1: (u'John', u'PERSON'), 2: (u'Trump', u'PERSON'),
14: (u'Barack', u'PERSON'), 15: (u'Obama', u'PERSON'),
17: (u'Michelle', u'PERSON'), 18: (u'Obama', u'PERSON'),
30: (u'Donald', u'PERSON'), 31: (u'John', u'PERSON'), 32: (u'Trump', u'PERSON')}
sorted_keys = sorted(d.keys())
last_key = sorted_keys[0]
consecutive_keys = [last_key]
result = {}
for i in sorted_keys[1:]:
    if i == last_key + 1:
        consecutive_keys.append(i)
    else:
        add_name(d, consecutive_keys, result)
        consecutive_keys = [i]
    last_key = i
add_name(d, consecutive_keys, result)
print(result)
Output
{'Donald John Trump': 2, 'Barack Obama': 1, 'Michelle Obama': 1}
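The consecutive-key grouping both answers rely on can be sketched in isolation: within a sorted run of consecutive integers, value minus position is constant, which is exactly what the groupby key exploits.

```python
from itertools import groupby

keys = [0, 1, 2, 14, 15, 17, 18, 30, 31, 32]
# p is an (index, key) pair; p[1] - p[0] is constant within each consecutive run
runs = [[k for _, k in g]
        for _, g in groupby(enumerate(keys), lambda p: p[1] - p[0])]
print(runs)  # [[0, 1, 2], [14, 15], [17, 18], [30, 31, 32]]
```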
The following code plots an interactive figure where I can toggle specific lines on/off. This works perfectly when I'm working in an IPython notebook:
import pandas as pd
import numpy as np
from itertools import cycle
import matplotlib.pyplot as plt, mpld3
from matplotlib.widgets import CheckButtons
import matplotlib.patches
import seaborn as sns
%matplotlib nbagg
sns.set(style="whitegrid")
df = pd.DataFrame({'freq': {0: 0.01, 1: 0.02, 2: 0.029999999999999999, 3: 0.040000000000000001, 4: 0.050000000000000003, 5: 0.059999999999999998, 6: 0.070000000000000007, 7: 0.080000000000000002, 8: 0.089999999999999997, 9: 0.10000000000000001, 10: 0.01, 11: 0.02, 12: 0.029999999999999999, 13: 0.040000000000000001, 14: 0.050000000000000003, 15: 0.059999999999999998, 16: 0.070000000000000007, 17: 0.080000000000000002, 18: 0.089999999999999997, 19: 0.10000000000000001, 20: 0.01, 21: 0.02, 22: 0.029999999999999999, 23: 0.040000000000000001, 24: 0.050000000000000003, 25: 0.059999999999999998, 26: 0.070000000000000007, 27: 0.080000000000000002, 28: 0.089999999999999997, 29: 0.10000000000000001}, 'kit': {0: 'B', 1: 'B', 2: 'B', 3: 'B', 4: 'B', 5: 'B', 6: 'B', 7: 'B', 8: 'B', 9: 'B', 10: 'A', 11: 'A', 12: 'A', 13: 'A', 14: 'A', 15: 'A', 16: 'A', 17: 'A', 18: 'A', 19: 'A', 20: 'C', 21: 'C', 22: 'C', 23: 'C', 24: 'C', 25: 'C', 26: 'C', 27: 'C', 28: 'C', 29: 'C'}, 'SNS': {0: 91.198979591799997, 1: 90.263605442199989, 2: 88.818027210899999, 3: 85.671768707499993, 4: 76.23299319729999, 5: 61.0969387755, 6: 45.1530612245, 7: 36.267006802700003, 8: 33.0782312925, 9: 30.739795918400002, 10: 90.646258503400006, 11: 90.306122449, 12: 90.178571428600009, 13: 89.498299319699996, 14: 88.435374149599994, 15: 83.588435374200003, 16: 75.212585034, 17: 60.969387755100001, 18: 47.278911564600001, 19: 37.627551020399999, 20: 90.986394557800011, 21: 90.136054421799997, 22: 89.540816326499993, 23: 88.690476190499993, 24: 86.479591836799997, 25: 82.397959183699996, 26: 73.809523809499993, 27: 63.180272108800004, 28: 50.935374149700003, 29: 41.241496598699996}, 'FPR': {0: 1.0953616823100001, 1: 0.24489252678500001, 2: 0.15106142277199999, 3: 0.104478605177, 4: 0.089172822253300005, 5: 0.079856258734300009, 6: 0.065881413455800009, 7: 0.059892194050699996, 8: 0.059892194050699996, 9: 0.0578957875824, 10: 0.94097291541899997, 11: 0.208291741532, 12: 0.14773407865800001, 13: 0.107805949291, 14: 
0.093165635189999998, 15: 0.082518134025399995, 16: 0.074532508152000007, 17: 0.065881413455800009, 18: 0.062554069341799995, 19: 0.061888600519100001, 20: 0.85313103081100006, 21: 0.18899314567100001, 22: 0.14107939043000001, 23: 0.110467824582, 24: 0.099820323417899995, 25: 0.085180009316599997, 26: 0.078525321088700001, 27: 0.073201570506399985, 28: 0.071870632860800004, 29: 0.0705396952153}})
tableau20 = ["#6C6C6C", "#92D050", "#FFC000"]
tableau20 = cycle(tableau20)
kits = ["A","B", "C"]
color = iter(["#6C6C6C", "#92D050", "#FFC000"])
fig = plt.figure(figsize=(12,8))
for kit in kits:
    colour = next(color)
    for i in df.groupby('kit'):
        grouped_df = pd.DataFrame(np.array(i[1]),
                                  columns=['freq', 'SNS', 'FPR', 'kit'])
        if grouped_df.kit.tolist()[1] == kit:
            x = [float(value) for i, value in enumerate(grouped_df.FPR)]
            y = [float(value) for i, value in enumerate(grouped_df.SNS)]
            x, y = (list(x) for x in zip(*sorted(zip(x, y))))
            label = grouped_df['kit'].tolist()[1]
            p = plt.plot(x, y, "-o", label=label, color=colour)
labels = [label.get_text() for label in plt.legend().texts]
plt.legend().set_visible(False)
for i, value in enumerate(labels):
    exec('label%s="%s"' % (i, value))
for i in range(len(labels)):
    exec('l%s=fig.axes[0].lines[i]' % (i))
rax = plt.axes([0.92, 0.7, 0.2, 0.2], frameon=False)
check = CheckButtons(rax, labels, [True] * len(labels))  # actives must be bools
for i, rec in enumerate(check.rectangles):
    rec.set_facecolor(next(tableau20))  # tableau20.next() is Python 2 syntax
def func(label):
    for i in range(len(labels)):
        if label == eval('label%s' % (i)):
            eval('l%s.set_visible(not l%s.get_visible())' % (i, i))
    plt.draw()
check.on_clicked(func)
plt.show()
Problem is, I need to export the notebook as HTML to share with colleagues who know nothing about Python. How can I export the notebook to HTML while keeping the interactive toggle functionality (which it currently loses)? Thanks!
Maybe you don't need to export the Jupyter notebook to HTML at all; instead, share the notebook's URL and other people can visit it in their browser.
A Jupyter notebook plugin can help you do this more efficiently: jupyter/dashboards. It's maintained by the official Jupyter team and lets you share your notebook like a report, controlling which cells are displayed and where each cell appears. Worth a try!