Printing a data frame in Pandas and Python [closed] - python

I am new to pandas and am trying to write some basic code that builds a data frame. I wrote two rows of the data frame as a test, but it is not working. I do not know whether the problem is how the dictionaries and the list continue onto new lines, or something else. Do I need a backslash when moving to a new line? Any help is appreciated.
Here is the code:
import pandas as pd
data = [{'#':1, 'Name': 'BS', 'Type 1': 'grass', 'type 2': 'poison', 'Total': 318, 'HP': 45, 'Attack': 49, 'Defense': 49, 'Sp. Atk': 65, 'Sp. Def': 65, 'Speed': 45, 'Generation': 1, 'Legendary':'false'}, {'#':2, 'Name': 'IS', 'Type 1': 'grass', 'type 2': 'poison', 'Total': 405, 'HP': 60, 'Attack': 62, Defense': 63, 'Sp. Atk': 80, 'Sp. Def': 80, 'Speed': 60, 'Generation': 1, 'Legendary':'false'}]
df = pd.DataFrame(data)

The problem is a syntax error at the 'Defense' key in the second dictionary: its opening apostrophe is missing.
data = [{'#':1, 'Name': 'BS', 'Type 1': 'grass', 'type 2': 'poison', 'Total': 318, 'HP': 45, 'Attack': 49,
'Defense': 49, 'Sp. Atk': 65, 'Sp. Def': 65, 'Speed': 45, 'Generation': 1, 'Legendary':'false'},
{'#':2, 'Name': 'IS', 'Type 1': 'grass', 'type 2': 'poison', 'Total': 405, 'HP': 60, 'Attack': 62,
'Defense': 63, 'Sp. Atk': 80, 'Sp. Def': 80, 'Speed': 60, 'Generation': 1, 'Legendary':'false'}]
>>> pd.DataFrame(data)
   #  Name  Type 1  type 2  Total  HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
0  1    BS   grass  poison    318  45      49       49       65       65     45           1      false
1  2    IS   grass  poison    405  60      62       63       80       80     60           1      false
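On the backslash question: inside (), [] and {} Python joins lines implicitly, so no backslash is needed. The sketch below uses shortened dictionaries, and `bad_source` and `error_message` are illustrative names, not part of the original code. It shows a multi-line literal parsing cleanly and the missing quote surfacing as a SyntaxError:

```python
# Line breaks inside [] and {} need no backslash: the statement
# continues until the brackets are closed.
data = [
    {'#': 1, 'Name': 'BS',
     'Attack': 49, 'Defense': 49},
    {'#': 2, 'Name': 'IS',
     'Attack': 62, 'Defense': 63},
]

# The original bug, reduced: the opening quote of 'Defense' is missing.
bad_source = "row = {'Attack': 62, Defense': 63}"

try:
    compile(bad_source, "<example>", "exec")
    error_message = None
except SyntaxError as exc:
    # The exact wording of the message depends on the Python version.
    error_message = exc.msg

print(len(data), "rows parsed")
print("SyntaxError:", error_message)
```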

import pandas as pd
data = [{'#':1, 'Name': 'BS', 'Type 1': 'grass', 'type 2': 'poison', 'Total': 318,
'HP': 45, 'Attack': 49, 'Defense': 49, 'Sp. Atk': 65, 'Sp. Def': 65, 'Speed': 45,
'Generation': 1, 'Legendary':'false'}, {'#':2, 'Name': 'IS', 'Type 1': 'grass',
'type 2': 'poison', 'Total': 405, 'HP': 60,
'Attack': 62, 'Defense': 63,
'Sp. Atk': 80, 'Sp. Def': 80,
'Speed': 60, 'Generation': 1, 'Legendary':'false'}]
df = pd.DataFrame(data)
print(df)

Related

Convert Pandas DataFrame to dictionary where columns are keys and (column-wise) rows are values

I wish to convert a DataFrame into a dictionary where columns are the keys and (column-wise) rows are its values. I also need to use grouping when doing so.
    team  id    name  salary
0  Alpha  10    Jack    1000
1  Alpha  15    John    2000
2  Alpha  20    John    2000
3  Bravo  50  Thomas    5000
4  Bravo  55  Robert    6000
5  Bravo  60  Albert    7000
Expected output:
# expected output WITH duplicates
ex = {'Alpha': {'id': [10, 15, 20], 'name': ['Jack', 'John', 'John'], 'salary': [1000, 2000, 2000]},
'Bravo': {'id': [50, 55, 60], 'name': ['Thomas', 'Robert', 'Albert'], 'salary': [5000, 6000, 7000]}
}
# expected output WITHOUT duplicates ('name', 'salary')
ex = {'Alpha': {'id': [10, 15, 20], 'name': ['Jack', 'John'], 'salary': [1000, 2000]},
'Bravo': {'id': [50, 55, 60], 'name': ['Thomas', 'Robert', 'Albert'], 'salary': [5000, 6000, 7000]}
}
Can this be done with df.to_dict()?
Code for example:
import pandas as pd
d = {'team': ['Alpha', 'Alpha', 'Alpha', 'Bravo', 'Bravo', 'Bravo'],
'id': [10, 15, 20, 50, 55, 60],
'name': ['Jack', 'John', 'John', 'Thomas', 'Robert', 'Albert'],
'salary': [1000, 2000, 2000, 5000, 6000, 7000]}
df = pd.DataFrame(data=d)
A groupby followed by to_dict should do the trick:
out = df.groupby('team').agg(list).to_dict('index')
print(out)
Output:
{'Alpha': {'id': [10, 15, 20],
'name': ['Jack', 'John', 'John'],
'salary': [1000, 2000, 2000]},
'Bravo': {'id': [50, 55, 60],
'name': ['Thomas', 'Robert', 'Albert'],
'salary': [5000, 6000, 7000]}}
For unique lists:
out = df.groupby('team').agg(lambda x: x.unique().tolist()).to_dict('index')
print(out)
Output:
{'Alpha': {'id': [10, 15, 20],
'name': ['Jack', 'John'],
'salary': [1000, 2000]},
'Bravo': {'id': [50, 55, 60],
'name': ['Thomas', 'Robert', 'Albert'],
'salary': [5000, 6000, 7000]}}
Expected output WITH duplicates:
df.groupby('team').agg(list).T.to_dict()
output:
{'Alpha': {'id': [10, 15, 20],
'name': ['Jack', 'John', 'John'],
'salary': [1000, 2000, 2000]},
'Bravo': {'id': [50, 55, 60],
'name': ['Thomas', 'Robert', 'Albert'],
'salary': [5000, 6000, 7000]}}
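Expected output WITHOUT duplicates (a set does not preserve order, so the elements of each list may come out in a different order than in the data frame):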
df.groupby('team').agg(lambda x: list(set(x))).T.to_dict()
output:
{'Alpha': {'id': [10, 20, 15],
'name': ['Jack', 'John'],
'salary': [1000, 2000]},
'Bravo': {'id': [50, 60, 55],
'name': ['Thomas', 'Albert', 'Robert'],
'salary': [5000, 7000, 6000]}}
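To make the shape of the result explicit, here is a plain-Python sketch of the same grouping, without pandas. The helper name `group_columns` and its `unique` flag are assumptions made for illustration. For the unique variant it uses `dict.fromkeys`, which drops duplicates while keeping first-seen order, unlike `set`:

```python
from collections import defaultdict

rows = [
    {"team": "Alpha", "id": 10, "name": "Jack", "salary": 1000},
    {"team": "Alpha", "id": 15, "name": "John", "salary": 2000},
    {"team": "Alpha", "id": 20, "name": "John", "salary": 2000},
    {"team": "Bravo", "id": 50, "name": "Thomas", "salary": 5000},
    {"team": "Bravo", "id": 55, "name": "Robert", "salary": 6000},
    {"team": "Bravo", "id": 60, "name": "Albert", "salary": 7000},
]


def group_columns(rows, key, unique=False):
    """Group rows by `key` and collect every other column into a list."""
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        for column, value in row.items():
            if column != key:
                grouped[row[key]][column].append(value)

    result = {}
    for group, columns in grouped.items():
        result[group] = {
            # dict.fromkeys removes duplicates but keeps insertion order.
            column: list(dict.fromkeys(values)) if unique else values
            for column, values in columns.items()
        }
    return result


with_duplicates = group_columns(rows, "team")
without_duplicates = group_columns(rows, "team", unique=True)
print(with_duplicates)
print(without_duplicates)
```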

Nested dictionary from a txt with the dictionary

I have a txt file with the dictionary like this:
{'origin': {'Ukraine': 50, 'Portugal': 20, 'others': 10}, 'native language': {'ucranian': 50; 'english': 45, 'russian': 30, 'others': 10}, 'second language': {'ucranian': 50; 'english': 45, 'russian': 30, 'others': 10, 'none': 0}, 'profession': {'medical doctor': 50, 'healthcare professional': 40, 'cooker': 30, 'others': 10, 'spy': 0}, 'first aid skills': {'yes': 50, 'no': 0}, 'driving skills': {'yes': 40, 'no': 0}, 'cooking skills': {'yes': 50, 'some': 30, 'no': 0}, 'IT skills': {'yes': 50, 'little': 35, 'no': 0}}
And I want to create a dictionary from this
I tried using ast.literal_eval but it gives me the following error:
SyntaxError: expression expected after dictionary key and ':'
This is my code:
import ast

def helpersSkills(helpersFile, skillsFile):
    """
    """
    helpers = open(helpersFile, 'r')
    skills = open(skillsFile, 'r')
    # Read the whole file as one string and parse it as a Python literal.
    skillsLines = skills.read()
    dictionary = ast.literal_eval(skillsLines)
    ...

helpersSkills('helpersArrived2.txt', 'skills.txt')
As @ThierryLathuille pointed out, the txt file just contained typos: a semicolon instead of a comma after 'ucranian': 50 in two places.
With those fixed, it works:
{'origin': {'Ukraine': 50, 'Portugal': 20, 'others': 10}, 'native language': {'ucranian': 50, 'english': 45, 'russian': 30, 'others': 10}, 'second language': {'ucranian': 50, 'english': 45, 'russian': 30, 'others': 10, 'none': 0}, 'profession': {'medical doctor': 50, 'healthcare professional': 40, 'cooker': 30, 'others': 10, 'spy': 0}, 'first aid skills': {'yes': 50, 'no': 0}, 'driving skills': {'yes': 40, 'no': 0}, 'cooking skills': {'yes': 50, 'some': 30, 'no': 0}, 'IT skills': {'yes': 50, 'little': 35, 'no': 0}}
This is the code:
import ast

def helpersSkills(helpersFile, skillsFile):
    """
    """
    helpers = open(helpersFile, 'r')
    skills = open(skillsFile, 'r')
    # Read the whole file as one string and parse it as a Python literal.
    skillsLines = skills.read()
    dictionary = ast.literal_eval(skillsLines)
    ...

helpersSkills('helpersArrived2.txt', 'skills.txt')
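A minimal sketch of the failure and the fix, using shortened strings in place of the file contents. The helper `parse_skills` and the variable names are hypothetical. `ast.literal_eval` raises SyntaxError on the semicolon and parses the corrected text into a nested dict:

```python
import ast

# Shortened stand-ins for the contents of skills.txt.
good_text = "{'origin': {'Ukraine': 50, 'Portugal': 20}, 'native language': {'ucranian': 50, 'english': 45}}"
bad_text = "{'native language': {'ucranian': 50; 'english': 45}}"  # ';' instead of ','


def parse_skills(text):
    """Parse a Python-literal dict; return None if the text is malformed."""
    try:
        return ast.literal_eval(text)
    except (SyntaxError, ValueError) as exc:
        print("could not parse:", exc)
        return None


skills = parse_skills(good_text)
broken = parse_skills(bad_text)

print(skills['native language']['english'])
print(broken)
```

Catching the exception and reporting it makes a typo in the data file easier to spot than a bare traceback from deep inside the function.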

Shifting label numbers by new string

I have an example of annotation file
{'text': "BELGIE BELGIQUE BELGIEN\nIDENTITEITSKAART CARTE D'IDENTITE PERSONALAUSWEIS\nBELGIUM\nIDENTITY CARD\nNaam / Name\nDermrive\nVoornamen / Given names\nBrando Jerom L\nGeslacht / Nationaliteit /\nGeboortedatum /\nSex\nNationality\nDate of birth\nM/M\nBEL\n19 05 1982\nRijksregisternr. 7 National Register Nº\n85.08.23-562.77\nKaartnr. / Card Nº\n752-0465474-34\nVervalt op / Expires on\n23 07 2025\n", 'spans': [{'start': 24, 'end': 40, 'token_start': 16, 'token_end': 16, 'label': 'CardType'}, {'start': 41, 'end': 57, 'token_start': 16, 'token_end': 16, 'label': 'CardType'}, {'start': 58, 'end': 73, 'token_start': 15, 'token_end': 15, 'label': 'CardType'}, {'start': 108, 'end': 116, 'token_start': 8, 'token_end': 8, 'label': 'LastName'}, {'start': 141, 'end': 155, 'token_start': 14, 'token_end': 14, 'label': 'FirstName'}, {'start': 229, 'end': 232, 'token_start': 3, 'token_end': 3, 'label': 'Gender_nid'}, {'start': 233, 'end': 236, 'token_start': 3, 'token_end': 3, 'label': 'Nationality_nid'}, {'start': 237, 'end': 247, 'token_start': 10, 'token_end': 10, 'label': 'DateOfBirth_nid'}, {'start': 288, 'end': 303, 'token_start': 15, 'token_end': 15, 'label': 'Ssn'}, {'start': 323, 'end': 337, 'token_start': 14, 'token_end': 14, 'label': 'CardNumber'}, {'start': 362, 'end': 372, 'token_start': 10, 'token_end': 10, 'label': 'ValidUntil_nid'}]}
Given the start and end positions of an entity such as "LastName" (in the example, "Dermrive"), I want to replace it with another, shorter or longer value, for example "Brad", and shift all the following spans by the difference in length so that the other labels stay in the correct position. It works perfectly with one entity, but when I try to change all of them, the output is messy and the labels are no longer correct.
def replace_text_by_index_and_type(self, new_text, type):
    label_position = self.search_label_position_in_spans(self.annotation['spans'], type.value)
    label = self.annotation['spans'][label_position]
    begin_new_string = self.annotation["text"][:label["start"]]
    end_new_string = self.annotation["text"][label["end"]:]
    new_string = begin_new_string + new_text + end_new_string
    for to_change_ent in self.annotation['spans'][label_position+1:]:
        diff = len(new_text) - (label["end"] - label["start"])
        self.annotation['spans'][label_position]["end"] = self.annotation['spans'][label_position]["end"] + diff
        #print(f"Diff between original {to_change_ent} and new_string: {diff}")
        to_change_ent["start"] += diff
        to_change_ent["end"] += diff
    return new_string
I shift every entity after the replaced one, so that the start position of the replaced entity is kept, and I add diff to the end position of the replaced entity. As a result the first name and last name are correct, but the other entities end up shifted to the wrong places.
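One likely cause, judging from the code as posted: diff is recomputed inside the loop after the replaced span's "end" has already been updated, so it becomes 0 from the second iteration on. The new string is also never written back to the annotation, so a second replacement works on stale text. The sketch below is a standalone function and not the original class method. `replace_span_text` and `make_span` are hypothetical names, and the token_start/token_end fields are left out. It computes the difference once and stores the new text:

```python
def make_span(text, value, label):
    """Build a span dict by locating `value` in `text` (illustrative helper)."""
    start = text.index(value)
    return {"start": start, "end": start + len(value), "label": label}


def replace_span_text(annotation, label, new_text):
    """Replace the text of the first span with `label` and shift later spans."""
    spans = annotation["spans"]
    target = next(s for s in spans if s["label"] == label)
    old_end = target["end"]

    # Compute the length difference once, before any span is modified.
    diff = len(new_text) - (old_end - target["start"])

    # Store the new text so later replacements see the current offsets.
    text = annotation["text"]
    annotation["text"] = text[:target["start"]] + new_text + text[old_end:]

    target["end"] = old_end + diff
    for span in spans:
        if span is not target and span["start"] >= old_end:
            span["start"] += diff
            span["end"] += diff
    return annotation["text"]


text = "Name\nDermrive\nGiven names\nBrando Jerom L\nSex\nM/M\n"
annotation = {
    "text": text,
    "spans": [
        make_span(text, "Dermrive", "LastName"),
        make_span(text, "Brando Jerom L", "FirstName"),
        make_span(text, "M/M", "Gender_nid"),
    ],
}

replace_span_text(annotation, "LastName", "Brad")
replace_span_text(annotation, "FirstName", "Alexander")

# Every span should still point at its own text after both replacements.
extracted = {s["label"]: annotation["text"][s["start"]:s["end"]]
             for s in annotation["spans"]}
print(extracted)
```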

Difference between dict.values and dict[key].values

What is the difference between studentsDict.values() and studentsDict[key].values in the following code?
studentsDict = {'Ayush': {'maths': 24, 'english': 19, 'hindi': 97, 'bio': 20, 'science': 0}, 'Pankaj': {'maths': 52, 'english': 76, 'hindi': 68, 'bio': 97, 'science': 66}, 'Raj': {'maths': 85, 'english': 79, 'hindi': 51, 'bio': 36, 'science': 75}, 'iC5z4DK': {'maths': 24, 'english': 92, 'hindi': 31, 'bio': 29, 'science': 91}, 'Zf1WSV6': {'maths': 81, 'english': 58, 'hindi': 85, 'bio': 31, 'science': 7}}
for key in studentsDict.keys():
    for marks in studentsDict[key].values():
        if marks < 33:
            print(key, "FAILED")
            break
studentsDict.keys() gives you each of the keys in the outer dict: "Ayush", "Pankaj", "Raj", "iC5z4DK" and "Zf1WSV6".
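studentsDict[key].values() gives you the values for the entry in studentsDict corresponding to key. For example, if key is "Ayush", you would get 24, 19, 97, 20, and 0.
studentsDict.values(), by contrast, gives you the values of the outer dict, which are the inner dictionaries themselves (one per student), not the individual marks.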
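A small sketch with a two-student dict makes the difference concrete. The variable names are illustrative, and `items()` is used here as an alternative to indexing with each key:

```python
students = {
    'Ayush': {'maths': 24, 'english': 19},
    'Raj': {'maths': 85, 'english': 79},
}

# Values of the outer dict: the inner dictionaries themselves.
outer_values = list(students.values())

# Values of one inner dict: that student's marks.
ayush_marks = list(students['Ayush'].values())

# items() yields name and inner dict together, so no lookup by key is needed.
failed = [name for name, marks in students.items()
          if any(mark < 33 for mark in marks.values())]

print(outer_values)
print(ayush_marks)
print(failed)
```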

How to iterate a key over a list of lists in a dictionary? [closed]

I'm a bit stumped with this right now. I want to take a list of keys and pair it with each list in a list of lists:
tests = ['test 1', 'test 2', 'test 3']
scores = [[90, 70, 60], [40, 50, 100], [60, 65, 90], [30, 61, 67],
[80, 79, 83], [70, 97, 100]]
Expected outcome:
I want to return a dictionary that shows the following:
'test 1': 90,'test 2' : 70, 'test 3': 60, 'test 1': 40, 'test 2': 50,
'test 3': 100... 'test 1' : 70, 'test 2' : 97, 'test 3':100
test 1: score 1
test 2: score 2
test 3: score 3
Use dict with zip:
[dict(zip(tests, score)) for score in scores]
Output:
[{'test 1': 90, 'test 2': 70, 'test 3': 60},
{'test 1': 40, 'test 2': 50, 'test 3': 100},
{'test 1': 60, 'test 2': 65, 'test 3': 90},
{'test 1': 30, 'test 2': 61, 'test 3': 67},
{'test 1': 80, 'test 2': 79, 'test 3': 83},
{'test 1': 70, 'test 2': 97, 'test 3': 100}]
Dictionaries cannot contain duplicate keys; however, you can use a list of tuples:
tests = ['test 1', 'test 2', 'test 3']
scores = [[90, 70, 60], [40, 50, 100], [60, 65, 90], [30, 61, 67], [80, 79, 83], [70, 97, 100]]
result = [(a, b) for i in scores for a, b in zip(tests, i)]
Output:
[('test 1', 90), ('test 2', 70), ('test 3', 60), ('test 1', 40), ('test 2', 50), ('test 3', 100), ('test 1', 60), ('test 2', 65), ('test 3', 90), ('test 1', 30), ('test 2', 61), ('test 3', 67), ('test 1', 80), ('test 2', 79), ('test 3', 83), ('test 1', 70), ('test 2', 97), ('test 3', 100)]
An even better approach is to group the integers by their target key:
from collections import defaultdict
d = defaultdict(list)
for i in scores:
    for a, b in zip(tests, i):
        d[a].append(b)
print(dict(d))
Output:
{'test 1': [90, 40, 60, 30, 80, 70], 'test 2': [70, 50, 65, 61, 79, 97], 'test 3': [60, 100, 90, 67, 83, 100]}
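As an alternative sketch, not taken from the answers above, the same grouping can be written by transposing the rows with `zip(*scores)`. This assumes every inner list has one score per test; `by_test` is an illustrative name:

```python
tests = ['test 1', 'test 2', 'test 3']
scores = [[90, 70, 60], [40, 50, 100], [60, 65, 90], [30, 61, 67],
          [80, 79, 83], [70, 97, 100]]

# zip(*scores) turns the rows into columns, one column per test.
by_test = {name: list(column) for name, column in zip(tests, zip(*scores))}
print(by_test)
```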
