Separating "objects" within a series into separate columns in Python

I have some data that I have managed to put into a series in Python. There are 369 elements in the series, and each element contains two dictionaries holding the starting x and y co-ordinates and the ending x and y co-ordinates. I would like to restructure this series into a simple data table with 369 rows and 4 columns.
The first 10 elements of the series are:
0 [{'y': 52, 'x': 50}, {'y': 44, 'x': 40}]
1 [{'y': 44, 'x': 40}, {'y': 75, 'x': 33}]
2 [{'y': 75, 'x': 33}, {'y': 76, 'x': 42}]
3 [{'y': 76, 'x': 42}, {'y': 36, 'x': 28}]
4 [{'y': 36, 'x': 28}, {'y': 12, 'x': 34}]
5 [{'y': 12, 'x': 34}, {'y': 30, 'x': 32}]
6 [{'y': 30, 'x': 32}, {'y': 70, 'x': 30}]
7 [{'y': 70, 'x': 30}, {'y': 35, 'x': 28}]
8 [{'y': 35, 'x': 28}, {'y': 23, 'x': 33}]
9 [{'y': 83, 'x': 46}, {'y': 87, 'x': 48}]
Name: list, dtype: object
Indexing like this gives me a single element of the series, but ideally I want to be able to access each individual 'y' and 'x' value within these elements:
passinglocations[1]
[{'y': 44, 'x': 40}, {'y': 75, 'x': 33}]
I cannot find any further information that I understand on how to get this into the usable form I want.
Any insights?
Thanks

Assuming that your four columns are your y, x, y, x values, this should work:
passinglocations = [
[{'y': 44, 'x': 40}, {'y': 75, 'x': 33}],
[{'y': 23, 'x': 15}, {'y': 25, 'x': 37}]
]
def transform(passinglocations):
    return [(loc[0]['y'], loc[0]['x'], loc[1]['y'], loc[1]['x']) for loc in passinglocations]
print(transform(passinglocations))
output:
[(44, 40, 75, 33), (23, 15, 25, 37)]
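Building on the answer above, here is a minimal sketch that labels the four values instead of returning bare tuples. The helper name `to_rows` and the column names `start_x`, `start_y`, `end_x`, `end_y` are assumptions for illustration. It also assumes the first dictionary in each element is the start point and the second is the end point, as the question describes.

```python
def to_rows(passinglocations):
    """Flatten [start, end] pairs of {'x': ..., 'y': ...} dicts into flat rows."""
    rows = []
    for start, end in passinglocations:
        rows.append({
            'start_x': start['x'],  # hypothetical column names
            'start_y': start['y'],
            'end_x': end['x'],
            'end_y': end['y'],
        })
    return rows


sample = [
    [{'y': 52, 'x': 50}, {'y': 44, 'x': 40}],
    [{'y': 44, 'x': 40}, {'y': 75, 'x': 33}],
]
rows = to_rows(sample)
print(rows[0])
```

Iterating over a pandas Series yields its values, so the same function should accept the original series directly, and `pd.DataFrame(to_rows(passinglocations))` should then give a table with one row per element and four columns.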

Related

Nested dictionary from a txt with the dictionary

I have a txt file with the dictionary like this:
{'origin': {'Ukraine': 50, 'Portugal': 20, 'others': 10}, 'native language': {'ucranian': 50; 'english': 45, 'russian': 30, 'others': 10}, 'second language': {'ucranian': 50; 'english': 45, 'russian': 30, 'others': 10, 'none': 0}, 'profession': {'medical doctor': 50, 'healthcare professional': 40, 'cooker': 30, 'others': 10, 'spy': 0}, 'first aid skills': {'yes': 50, 'no': 0}, 'driving skills': {'yes': 40, 'no': 0}, 'cooking skills': {'yes': 50, 'some': 30, 'no': 0}, 'IT skills': {'yes': 50, 'little': 35, 'no': 0}}
And I want to create a dictionary from this
I tried using ast.literal_eval but it gives me the following error:
SyntaxError: expression expected after dictionary key and ':'
This is my code :
import ast
def helpersSkills(helpersFile, skillsFile):
    """
    """
    helpers = open(helpersFile, 'r')
    skills = open(skillsFile, 'r')
    skillsLines = skills.read()
    dictionary = ast.literal_eval(skillsLines)
    ...
helpersSkills('helpersArrived2.txt', 'skills.txt')
As @ThierryLathuille pointed out, there were just some typos in the txt file (semicolons where commas should be, after 'ucranian': 50).
With those fixed, it works:
{'origin': {'Ukraine': 50, 'Portugal': 20, 'others': 10}, 'native language': {'ucranian': 50, 'english': 45, 'russian': 30, 'others': 10}, 'second language': {'ucranian': 50, 'english': 45, 'russian': 30, 'others': 10, 'none': 0}, 'profession': {'medical doctor': 50, 'healthcare professional': 40, 'cooker': 30, 'others': 10, 'spy': 0}, 'first aid skills': {'yes': 50, 'no': 0}, 'driving skills': {'yes': 40, 'no': 0}, 'cooking skills': {'yes': 50, 'some': 30, 'no': 0}, 'IT skills': {'yes': 50, 'little': 35, 'no': 0}}
This is the code :
import ast
def helpersSkills(helpersFile, skillsFile):
    """
    """
    helpers = open(helpersFile, 'r')
    skills = open(skillsFile, 'r')
    skillsLines = skills.read()
    dictionary = ast.literal_eval(skillsLines)
    ...
helpersSkills('helpersArrived2.txt', 'skills.txt')
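To make this kind of typo easier to find, the `SyntaxError` raised by `ast.literal_eval` can be caught and its position reported. The following is only a sketch: `parse_dict_literal` is a hypothetical helper, and the two short strings stand in for the contents of the txt file.

```python
import ast


def parse_dict_literal(text):
    """Parse a Python dict literal, reporting where the syntax breaks."""
    try:
        value = ast.literal_eval(text)
    except SyntaxError as err:
        # err.lineno and err.offset point at the place the parser rejected
        raise ValueError(
            f"invalid literal at line {err.lineno}, column {err.offset}: {err.msg}"
        ) from err
    if not isinstance(value, dict):
        raise ValueError(f"expected a dict, got {type(value).__name__}")
    return value


good = "{'native language': {'ucranian': 50, 'english': 45}}"
bad = "{'native language': {'ucranian': 50; 'english': 45}}"  # ';' instead of ','

print(parse_dict_literal(good))
try:
    parse_dict_literal(bad)
except ValueError as err:
    print(err)
```

The exact wording of the message depends on the Python version, so the sketch reports whatever the parser provides and does not match on the text.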

Converting a list of dictionaries to a list by using certain keys

I have a list of dictionaries in which the dictionary size (number of keys) is not constant. Here is an example:
data = [{'t': 1633098324445950024,
'y': 1633098324445929497,
'q': 1636226,
'i': '57337',
'x': 12,
's': 15,
'c': [14, 37, 41],
'p': 139.55,
'z': 3},
{'t': 1633098324445958000,
'y': 1633098324445929497,
'q': 1636229,
'i': '57340',
'x': 12,
's': 100,
'c': [14, 41],
'p': 139.55,
'z': 3},
{'t': 1633098324445958498,
'y': 1633098324445594112,
'q': 1636230,
'i': '31895',
'x': 11,
's': 60,
'c': [14, 37, 41],
'p': 139.55,
'z': 3},
{'t': 1633098324446013523,
'y': 1633098324445649152,
'q': 1636231,
'i': '31896',
'x': 11,
's': 52,
'c': [14, 37, 41],
'p': 139.55,
'z': 3},
{'t': 1633098324472392943,
'y': 1633098324472133407,
'q': 1636256,
'i': '3417',
'x': 15,
's': 100,
'p': 139.555,
'z': 3},
{'t': 1633098324478972256,
'y': 1633098324478000000,
'f': 1633098324478949693,
'q': 1636260,
'i': '58051',
'x': 4,
'r': 12,
's': 100,
'p': 139.555,
'z': 3}]
As the sample shows, the dictionaries have different lengths and do not necessarily share the same keys. I need to extract certain elements by key. I am using [(d['t'],d['p'],d['s']) for d in data] and the result looks like this:
[(1633098324445950024, 139.55, 15),
(1633098324445958000, 139.55, 100),
(1633098324445958498, 139.55, 60),
(1633098324446013523, 139.55, 52),
(1633098324472392943, 139.555, 100),
(1633098324478972256, 139.555, 100)]
But I also need the values for the 'c' key, and when I run the following I get a KeyError:
[(d['t'],d['p'],d['s'],d['c']) for d in data]
Traceback (most recent call last):
File "<ipython-input-108-8533763f150e>", line 1, in <module>
[(d['t'],d['p'],d['s'],d['c']) for d in data]
File "<ipython-input-108-8533763f150e>", line 1, in <listcomp>
[(d['t'],d['p'],d['s'],d['c']) for d in data]
KeyError: 'c'
One approach:
import pprint
res = [(d['t'], d['p'], d['s']) + ((d['c'],) if 'c' in d else tuple()) for d in data]
pprint.pprint(res)
Output
[(1633098324445950024, 139.55, 15, [14, 37, 41]),
(1633098324445958000, 139.55, 100, [14, 41]),
(1633098324445958498, 139.55, 60, [14, 37, 41]),
(1633098324446013523, 139.55, 52, [14, 37, 41]),
(1633098324472392943, 139.555, 100),
(1633098324478972256, 139.555, 100)]
Why don't you try this:
value_list = [(d.get('t', ''), d.get('p', ''), d.get('s', ''), d.get('c', [])) for d in data]
print(value_list)
Output:
[(1633098324445950024, 139.55, 15, [14, 37, 41]),
(1633098324445958000, 139.55, 100, [14, 41]),
(1633098324445958498, 139.55, 60, [14, 37, 41]),
(1633098324446013523, 139.55, 52, [14, 37, 41]),
(1633098324472392943, 139.555, 100, []),
(1633098324478972256, 139.555, 100, [])]
Alternatively, as a reusable function:
def convert_to_list(list_of_dicts, keys):
    """
    Convert a list of dictionaries to a list by using certain keys.
    :param list_of_dicts: list of dictionaries
    :param keys: list of keys
    :return: list of dictionaries restricted to the given keys
    """
    # Skip keys a dictionary does not have, so a missing 'c' does not raise KeyError
    return [{key: dic[key] for key in keys if key in dic} for dic in list_of_dicts]
if __name__ == '__main__':
    list_of_dicts = [{'name': 'John', 'age': 21}, {'name': 'Mark', 'age': 25}]
    keys = ['name', 'age']
    print(convert_to_list(list_of_dicts, keys))
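The list-comprehension answers above differ in how they treat a missing key: one leaves the value out, the other pads with a default for every key. A possible middle ground, sketched below, is to fail loudly on keys that must exist and pad only the ones that may be absent. The function `pick` and its `required`, `optional` and `default` parameters are hypothetical names, and the two records are a shortened version of the question's data.

```python
def pick(records, required, optional=(), default=None):
    """Return one tuple per record: required keys must exist, optional ones may not."""
    rows = []
    for rec in records:
        values = [rec[key] for key in required]  # raises KeyError if truly missing
        values += [rec.get(key, default) for key in optional]
        rows.append(tuple(values))
    return rows


data = [
    {'t': 1633098324445950024, 'p': 139.55, 's': 15, 'c': [14, 37, 41]},
    {'t': 1633098324472392943, 'p': 139.555, 's': 100},
]
result = pick(data, required=('t', 'p', 's'), optional=('c',))
print(result)
```

Every tuple then has the same length, which tends to matter if the result is later loaded into a table.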

Writing a list of dictionaries in CSV

The problem: you have a list of dictionaries of the form
[{'a': 10, 'b': 11, 'c': 12, 'd': 13, 'e': 14},
{'a': 20, 'b': 21, 'c': 22, 'd': 23, 'e': 24},
{'a': 30, 'b': 31, 'c': 32, 'd': 33, 'e': 34},
{'a': 40, 'b': 41, 'c': 42, 'd': 43, 'e': 44}]
which you want to write to a CSV file that looks like
"a","b","c","d","e"
10,11,12,13,14
20,21,22,23,24
30,31,32,33,34
40,41,42,43,44
The problem is that when you run this code:
def write_csv_from_list_dict(filename, table, fieldnames, separator, quote):
    table = []
    for dit in table:
        a_row = []
        for fieldname in fieldnames:
            a_row.append(dit[fieldname])
        table.append(a_row)
    file_handle = open(filename, 'wt', newline='')
    csv_write = csv.writer(file_handle,
                           delimiter=separator,
                           quotechar=quote,
                           quoting=csv.QUOTE_NONNUMERIC)
    csv_write.writerow(fieldnames)
    for row in table:
        csv_write.writerow(row)
    file_handler.close()
it raises the error
(Exception: AttributeError) "'list' object has no attribute 'keys'"
at line 148, in _dict_to_list wrong_fields = rowdict.keys() - self.fieldnames
Why is it so hard to say explicitly that a file should be closed, not a string?
The code below should work:
data = [{'a': 10, 'b': 11, 'c': 12, 'd': 13, 'e': 14},
        {'a': 20, 'b': 21, 'c': 22, 'd': 23, 'e': 24},
        {'a': 30, 'b': 31, 'c': 32, 'd': 33, 'e': 34},
        {'a': 40, 'b': 41, 'c': 42, 'd': 43, 'e': 44}]
keys = data[0].keys()
with open('data.csv', 'w') as f:
    f.write(','.join(keys) + '\n')
    for entry in data:
        f.write(','.join([str(v) for v in entry.values()]) + '\n')
data.csv
a,b,c,d,e
10,11,12,13,14
20,21,22,23,24
30,31,32,33,34
40,41,42,43,44
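The answer above writes an unquoted header, while the question asks for a quoted one and for a configurable separator and quote character. A minimal sketch using `csv.writer` is shown below. It keeps the function name and parameters from the question and is only one possible way to fill in that function. The file name `data.csv` and the two sample rows are for illustration only.

```python
import csv


def write_csv_from_list_dict(filename, table, fieldnames, separator, quote):
    """Write a list of dicts to `filename`, one row per dict, in `fieldnames` order."""
    # newline='' lets the csv module control line endings itself
    with open(filename, 'w', newline='') as file_handle:
        writer = csv.writer(file_handle,
                            delimiter=separator,
                            quotechar=quote,
                            quoting=csv.QUOTE_NONNUMERIC)
        writer.writerow(fieldnames)  # header fields are strings, so they get quoted
        for row in table:
            # numeric values are written unquoted under QUOTE_NONNUMERIC
            writer.writerow([row[name] for name in fieldnames])
    # the with block closes the file, so no explicit close() call is needed


data = [{'a': 10, 'b': 11, 'c': 12, 'd': 13, 'e': 14},
        {'a': 20, 'b': 21, 'c': 22, 'd': 23, 'e': 24}]
write_csv_from_list_dict('data.csv', data, ['a', 'b', 'c', 'd', 'e'], ',', '"')
```

Using a `with` block avoids the mismatch between `file_handle` and `file_handler` in the question's code, and iterating over `fieldnames` fixes the column order regardless of the key order inside each dictionary.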

Scatter plot line to out of order data

I am tracking the movements of an avian animal. I have detection points on an xy plot. I want to connect the previous detected point to the next detection, regardless of direction. This will assist with removing extraneous detections.
Data Sample:
Sample input
The goal is to have a line from the previous data point to the next point.
Sample output
Unsuccessful method 1:
plt.figure('Frame',figsize=(16,12))
plt.imshow(frame)
plt.plot(x, y, '-ro', 'd',markersize=2.5, color='orange')
Method 1 output
Unsuccessful method 2:
plt.plot(np.sort(x), y[np.argsort(x)], '-bo', ms = 2)
Method 2 output
I used your sample data and made a plot with method 1 (but with pandas), and the output was what you expected. I don't understand why you got an unsuccessful result.
import pandas as pd
data = [{'frame': 1, 'x': 5, 'y': 15},
{'frame': 4, 'x': 10, 'y': 15},
{'frame': 5, 'x': 15, 'y': 15},
{'frame': 6, 'x': 20, 'y': 15},
{'frame': 7, 'x': 23, 'y': 20},
{'frame': 8, 'x': 25, 'y': 25},
{'frame': 11, 'x': 20, 'y': 23},
{'frame': 15, 'x': 15, 'y': 20},
{'frame': 18, 'x': 8, 'y': 18},
{'frame': 19, 'x': 8, 'y': 10},
{'frame': 20, 'x': 12, 'y': 7}]
df = pd.DataFrame(data).sort_values('frame')
df.plot(x='x', y='y')
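The important step in the answer above is sorting by `frame`, not by `x`. The same ordering can be sketched without pandas. The helper name `ordered_path` is hypothetical, and the four detections are a shuffled subset of the sample data.

```python
def ordered_path(detections):
    """Return x and y lists ordered by frame number, not by x position."""
    in_time_order = sorted(detections, key=lambda d: d['frame'])
    xs = [d['x'] for d in in_time_order]
    ys = [d['y'] for d in in_time_order]
    return xs, ys


detections = [
    {'frame': 8, 'x': 25, 'y': 25},
    {'frame': 1, 'x': 5, 'y': 15},
    {'frame': 11, 'x': 20, 'y': 23},
    {'frame': 4, 'x': 10, 'y': 15},
]
xs, ys = ordered_path(detections)
print(xs, ys)
```

Matplotlib joins points in the order they are given, so passing these lists to `plt.plot(xs, ys, '-o')` should draw each segment from one detection to the next in time. Sorting by `x`, as in method 2, reorders the points and therefore changes which ones are connected.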

Remove dictionaries from list where there is more than one with the same hour and minute?

I have a list of dictionaries, each containing a date string. Where two entries share the same hour and minute, I would like to remove one of them.
Here is some sample data. As you can see, the first two dictionaries both have 14:21 in them; I would like to keep only one of those dictionaries and remove the other.
I'm not sure how to even start with this one. Is it possible?
[{'x': '2018-06-19 14:21:22', 'y': 80},
{'x': '2018-06-19 14:21:26', 'y': 86},
{'x': '2018-06-19 14:24:02', 'y': 89},
{'x': '2018-06-19 14:24:07', 'y': 95},
{'x': '2018-06-19 14:25:10', 'y': 127}]
This is one approach using a simple iteration and a check list.
Demo:
checkVal = set()
data = [{'x': '2018-06-19 14:21:22', 'y': 80}, {'x': '2018-06-19 14:21:26', 'y': 86}, {'x': '2018-06-19 14:24:02', 'y': 89}, {'x': '2018-06-19 14:24:07', 'y': 95}, {'x': '2018-06-19 14:25:10', 'y': 127}, {'x': '2018-06-19 14:25:14', 'y': 138}, {'x': '2018-06-19 14:28:04', 'y': 91}, {'x': '2018-06-19 14:28:08', 'y': 83}, {'x': '2018-06-19 14:30:11', 'y': 92}, {'x': '2018-06-19 14:30:16', 'y': 99}, {'x': '2018-06-19 14:31:21', 'y': 80}, {'x': '2018-06-19 14:31:26', 'y': 90}, {'x': '2018-06-19 14:34:03', 'y': 131}, {'x': '2018-06-19 14:34:07', 'y': 137}, {'x': '2018-06-19 14:35:28', 'y': 98}, {'x': '2018-06-19 14:35:32', 'y': 91}, {'x': '2018-06-19 14:37:11', 'y': 86}, {'x': '2018-06-19 14:37:16', 'y': 92}, {'x': '2018-06-19 14:39:02', 'y': 111}, {'x': '2018-06-19 14:39:06', 'y': 118}, {'x': '2018-06-19 14:42:03', 'y': 95}, {'x': '2018-06-19 14:42:08', 'y': 104}, {'x': '2018-06-19 14:43:04', 'y': 165}, {'x': '2018-06-19 14:43:09', 'y': 168}, {'x': '2018-06-19 14:45:11', 'y': 89}, {'x': '2018-06-19 14:45:15', 'y': 94}, {'x': '2018-06-19 14:47:11', 'y': 133}, {'x': '2018-06-19 14:47:16', 'y': 146}, {'x': '2018-06-19 14:49:16', 'y': 134}, {'x': '2018-06-19 14:49:21', 'y': 146}, {'x': '2018-06-19 14:52:05', 'y': 157}, {'x': '2018-06-19 14:52:09', 'y': 169}, {'x': '2018-06-19 14:54:13', 'y': 66}, {'x': '2018-06-19 14:54:17', 'y': 63}, {'x': '2018-06-19 14:55:09', 'y': 95}, {'x': '2018-06-19 14:55:14', 'y': 90}, {'x': '2018-06-19 14:58:02', 'y': 112}, {'x': '2018-06-19 14:58:07', 'y': 119}, {'x': '2018-06-19 14:59:09', 'y': 98}, {'x': '2018-06-19 14:59:13', 'y': 91}]
res = []
for i in data:
    if i["x"][:-3] not in checkVal:
        res.append(i)
        checkVal.add(i["x"][:-3])
print(res)
Output:
[{'y': 80, 'x': '2018-06-19 14:21:22'}, {'y': 89, 'x': '2018-06-19 14:24:02'}, {'y': 127, 'x': '2018-06-19 14:25:10'}, {'y': 91, 'x': '2018-06-19 14:28:04'}, {'y': 92, 'x': '2018-06-19 14:30:11'}, {'y': 80, 'x': '2018-06-19 14:31:21'}, {'y': 131, 'x': '2018-06-19 14:34:03'}, {'y': 98, 'x': '2018-06-19 14:35:28'}, {'y': 86, 'x': '2018-06-19 14:37:11'}, {'y': 111, 'x': '2018-06-19 14:39:02'}, {'y': 95, 'x': '2018-06-19 14:42:03'}, {'y': 165, 'x': '2018-06-19 14:43:04'}, {'y': 89, 'x': '2018-06-19 14:45:11'}, {'y': 133, 'x': '2018-06-19 14:47:11'}, {'y': 134, 'x': '2018-06-19 14:49:16'}, {'y': 157, 'x': '2018-06-19 14:52:05'}, {'y': 66, 'x': '2018-06-19 14:54:13'}, {'y': 95, 'x': '2018-06-19 14:55:09'}, {'y': 112, 'x': '2018-06-19 14:58:02'}, {'y': 98, 'x': '2018-06-19 14:59:09'}]
You already have an answer, but for a very efficient solution use the itertools unique_everseen recipe. It's also safer since it will throw a useful error if the input date isn't valid.
from datetime import datetime
from itertools import filterfalse
input_ = [{'x': '2018-06-19 14:21:22', 'y': 80},
{'x': '2018-06-19 14:21:26', 'y': 86},
{'x': '2018-06-19 14:24:02', 'y': 89},
{'x': '2018-06-19 14:24:07', 'y': 95},
{'x': '2018-06-19 14:25:10', 'y': 127}]
def unique_everseen(iterable, key=None):
    """List unique elements, preserving order. Remember all elements ever seen.
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    """
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element
def hour_and_min(dct):
    fmt = '%Y-%m-%d %H:%M:%S'
    d = datetime.strptime(dct['x'], fmt)
    return d.hour, d.minute  # add `, d.year, d.month, d.day` if you care about these
output = list(unique_everseen(input_, key=hour_and_min))
And output is:
[{'x': '2018-06-19 14:21:22', 'y': 80},
{'x': '2018-06-19 14:24:02', 'y': 89},
{'x': '2018-06-19 14:25:10', 'y': 127}]
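A shorter variant of the same idea is sketched below, using a dictionary keyed on the timestamp with its seconds dropped. The function name `first_per_minute` is hypothetical. Unlike `hour_and_min` above, this key includes the date, so the same minute on two different days counts as two entries. It relies on dictionaries preserving insertion order (Python 3.7+).

```python
from datetime import datetime


def first_per_minute(records, fmt='%Y-%m-%d %H:%M:%S'):
    """Keep the first record seen for each calendar minute."""
    kept = {}
    for rec in records:
        # Drop the seconds so records in the same minute share a key
        minute = datetime.strptime(rec['x'], fmt).replace(second=0)
        kept.setdefault(minute, rec)  # later records in that minute are ignored
    return list(kept.values())


records = [{'x': '2018-06-19 14:21:22', 'y': 80},
           {'x': '2018-06-19 14:21:26', 'y': 86},
           {'x': '2018-06-19 14:24:02', 'y': 89},
           {'x': '2018-06-19 14:24:07', 'y': 95},
           {'x': '2018-06-19 14:25:10', 'y': 127}]
print(first_per_minute(records))
```

As with the `strptime`-based answer, a malformed date string raises a `ValueError` and is not silently grouped.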
