It's showing \n when printing the list [duplicate] - python

The output:
{'name': 'Peter', 'surname': ' Abdilla', 'DOB': ' 22/02/1986', 'mobileNo': '79811526', 'locality': ' Zabbar\n'}
{'name': 'John', 'surname': ' Borg', 'DOB': ' 12/04/1982', 'mobileNo': '99887654', 'locality': ' Paola\n'}
The expected output is supposed to be:
{'name': 'Peter', 'surname': ' Abdilla', 'DOB': ' 22/02/1986', 'mobileNo': '79811526', 'locality': ' Zabbar'}
{'name': 'John', 'surname': ' Borg', 'DOB': ' 12/04/1982', 'mobileNo': '99887654', 'locality': ' Paola'}

Each line in the CSV file ends with a newline character, '\n', so when you read the file line by line, the '\n' is included in the string and ends up in the last field.
To fix this, replace the '\n' with an empty string using Python's str.replace() method before splitting the line:
while True:
    x = file.readline()
    x = x.replace('\n', '')
    if x == '':
        break
    else:
        value = x.split(',')
        contact = dict(tuple(zip(keys,value)))
        filename.append(contact)
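As a possible alternative to the loop above, here is a minimal sketch that iterates over the file object directly and strips the line ending with rstrip(). The sample text and the read_contacts helper are assumptions for illustration; the sample lines are reconstructed from the output shown in the question, not taken from the real file.

```python
import io

# Hypothetical sample standing in for the asker's CSV file
# (reconstructed from the printed output in the question).
sample = (
    "Peter, Abdilla, 22/02/1986,79811526, Zabbar\n"
    "John, Borg, 12/04/1982,99887654, Paola\n"
)
keys = ["name", "surname", "DOB", "mobileNo", "locality"]


def read_contacts(handle, keys):
    """Build one dict per non-empty line, without the trailing newline."""
    contacts = []
    for line in handle:
        line = line.rstrip("\r\n")  # drop only the line ending
        if not line:
            continue  # skip blank lines instead of stopping
        contacts.append(dict(zip(keys, line.split(","))))
    return contacts


contacts = read_contacts(io.StringIO(sample), keys)
for contact in contacts:
    print(contact)
```

Unlike the readline() loop, this version does not stop at the first blank line, which may or may not be what you want for your file.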

Related

Remove white space in value of python dictionary without converting to string? [duplicate]

I have a dictionary and I need to remove one whitespace at the beginning of the result value.
dct = {
    'data': '',
    'order': 100,
    'result': ' home page',
    'step': 'click'
}
So it will end up looking like this:
dct = {
    'data': '',
    'order': 100,
    'result': 'home page',
    'step': 'click'
}
You can do that with the following loop, which strips every value that is a string:
for key in dct:
    if isinstance(dct[key], str):
        dct[key] = dct[key].strip()
Just .strip() the values if they are strings. Here is the dictionary comprehension approach:
dct = {
    'data': '',
    'order': 100,
    'result': ' home page',
    'step': 'click'
}
dct = {k: v.strip() if isinstance(v, str) else v for k, v in dct.items()}
print(dct)
output:
{'data': '', 'order': 100, 'result': 'home page', 'step': 'click'}
.strip() removes whitespace from both ends of the string. If you only need to remove leading whitespace, use .lstrip() instead.
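To make the difference concrete, here is a small sketch comparing the options. The sample value is made up for illustration; the last variant removes exactly one leading space, which is what the question literally asks for.

```python
value = "  home page "  # hypothetical value: two leading spaces, one trailing

both_ends = value.strip()    # whitespace removed on both sides
left_only = value.lstrip()   # only leading whitespace removed
# Remove exactly one leading space, if there is one.
one_space = value[1:] if value.startswith(" ") else value

print(repr(both_ends))
print(repr(left_only))
print(repr(one_space))
```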

Limiting the output

I built dictionaries with the .groupdict() method, but I am having trouble filtering out some of the resulting dictionaries.
For example, my code looks like this (tweet is a string that contains 5 elements separated by ||):
def somefuntion(pattern,tweet):
    pattern = r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
    for paper in tweet:
        for item in re.finditer(pattern,paper):
            item.groupdict()
This produces an output in the form:
{'username': 'yashrgupta ', 'botprob': ' 0.30794588629999997 '}
{'username': 'sterector ', 'botprob': ' 0.39391528649999996 '}
{'username': 'MalcolmXon ', 'botprob': ' 0.05630123819 '}
{'username': 'ryechuuuuu ', 'botprob': ' 0.08492567222000001 '}
{'username': 'dpsisi ', 'botprob': ' 0.8300337045 '}
But I would like it to only return dictionaries whose botprob is above 0.7. How do I do this?
As @WiktorStribizew notes, just skip the iterations you don't want. The captured botprob is a string, so convert it with float() before comparing:
pattern = r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
for paper in tweet:
    for item in re.finditer(pattern,paper):
        item = item.groupdict()
        if float(item["botprob"]) <= 0.7:
            continue
        print(item)
This could be wrapped in a generator expression to save the explicit continue, but there's enough going on as it is without making it harder to read (in this case).
UPDATE: since you are apparently inside a function, collect the matching dictionaries and return them:
pattern = r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
items = []
for paper in tweet:
    for item in re.finditer(pattern,paper):
        item = item.groupdict()
        if float(item["botprob"]) > 0.7:
            items.append(item)
return items
Or using comprehensions:
groupdicts = (item.groupdict() for paper in tweet for item in re.finditer(pattern, paper))
return [item for item in groupdicts if float(item["botprob"]) > 0.7]
I would like it to only return dictionaries whose botprob is above 0.7.
entries = [{'username': 'yashrgupta ', 'botprob': ' 0.30794588629999997 '},
           {'username': 'sterector ', 'botprob': ' 0.39391528649999996 '},
           {'username': 'MalcolmXon ', 'botprob': ' 0.05630123819 '},
           {'username': 'ryechuuuuu ', 'botprob': ' 0.08492567222000001 '},
           {'username': 'dpsisi ', 'botprob': ' 0.8300337045 '}]
filtered_entries = [e for e in entries if float(e['botprob'].strip()) > 0.7]
print(filtered_entries)
output
[{'username': 'dpsisi ', 'botprob': ' 0.8300337045 '}]
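Putting the pieces together, here is a self-contained sketch of the whole function. The likely_bots name, the threshold parameter, and the sample strings are assumptions for illustration (only the usernames and scores come from the question; the middle fields are placeholders). It also strips the captured values and skips scores that cannot be parsed, which the answers above do not do.

```python
import re

# Same pattern as in the question, compiled once as a raw string.
PATTERN = re.compile(
    r"^(?P<username>.*?)(?:\|{2}[^|]+){2}\|{2}(?P<botprob>.*?)(?:\|{2}|$)"
)


def likely_bots(tweets, threshold=0.7):
    """Return stripped groupdicts whose botprob is above the threshold."""
    results = []
    for paper in tweets:
        match = PATTERN.match(paper)
        if match is None:
            continue  # line does not have the expected shape
        record = {key: value.strip() for key, value in match.groupdict().items()}
        try:
            score = float(record["botprob"])
        except ValueError:
            continue  # botprob is not a number; skip the record
        if score > threshold:
            results.append(record)
    return results


# Hypothetical input: five fields separated by "||".
sample = [
    "yashrgupta || text || 2020 || 0.30794588629999997 || en",
    "dpsisi || text || 2020 || 0.8300337045 || en",
]
print(likely_bots(sample))
```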

Append field to a list

How can I extract just the value?
I have this code:
import csv
data = []
with open('city.txt') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        data.append(row[3])
That appends the following to the list (the list is massive):
[....... " 'id': 'AX~only~Mariehamn'", " 'id': 'AX~only~Saltvik'", " 'id': 'AX~only~Sund'"]
How can I append only the value of the key 'id' to the list?
I just want to append AX~only~Saltvik, and so on.
city.txt is a file containing the following (it has about 90k lines):
{'name': 'Herat', 'displayName': 'Herat', 'meta': {'type': 'CITY'}, 'id': 'AF~HER~Herat', 'countryId': 'AF', 'countryName': 'AF', 'regionId': 'AF~HER', 'regionName': 'HER', 'latitude': 34.3482, 'longitude': 62.1997, 'links': {'self': {'path': '/api/netim/v1/cities/AF~HER~Herat'}}}
{'name': 'Kabul', 'displayName': 'Kabul', 'meta': {'type': 'CITY'}, 'id': 'AF~KAB~Kabul', 'countryId': 'AF', 'countryName': 'AF', 'regionId': 'AF~KAB', 'regionName': 'KAB', 'latitude': 34.5167, 'longitude': 69.1833, 'links': {'self': {'path': '/api/netim/v1/cities/AF~KAB~Kabul'}}}
and so on ....
When I print(row) inside the for loop, I get the following (this is just the last line of the output):
["{'name': 'Advancetown'", " 'displayName': 'Advancetown'", " 'meta': {'type': 'CITY'}", " 'id': 'AU~QLD~Advancetown'", " 'countryId': 'AU'", " 'countryName': 'AU'", " 'regionId': 'AU~QLD'", " 'regionName': 'QLD'^C: 152.7713", " 'links': {'self': {'path': '/api/netim/v1/cities/AU~QLD~Basin%20Pocket'}}}"]
This answer assumes that your output is exact, and that each value appended to your list is a string along the lines of " 'id': 'AX~only~Mariehamn'".
This means that, in each parsed row, the key and its value are stored together as one string. You can extract the value with string methods.
for row in readCSV:
    data.append(row[3].split(": ")[1].strip("'"))
The above code splits the string into a list with two parts, one before the colon and one after it: [" 'id'", "'AX~only~Mariehamn'"]. Then it takes the second part and strips the single quotes, resulting in a clean string.
It looks like row[3] is a string representing a key-value pair.
I would split it further and select only the value portion:
data = []
with open('city.txt') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        data.append(row[3].split(':')[1][2:-1])
The [2:-1] slice removes the leading space and the surrounding single quotes.
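Both answers above depend on 'id' always being the fourth comma-separated field. Since each line of city.txt looks like a Python dict literal, another option is to parse the whole line with ast.literal_eval and look the key up by name. This is a different technique from the ones above, sketched under the assumption that every non-empty line is a valid literal; the extract_ids helper is hypothetical and the sample lines are shortened versions of the ones in the question.

```python
import ast

# Shortened versions of the first two lines of city.txt from the question.
sample_lines = [
    "{'name': 'Herat', 'meta': {'type': 'CITY'}, 'id': 'AF~HER~Herat', 'latitude': 34.3482}",
    "{'name': 'Kabul', 'meta': {'type': 'CITY'}, 'id': 'AF~KAB~Kabul', 'latitude': 34.5167}",
]


def extract_ids(lines):
    """Parse each line as a dict literal and collect its 'id' value."""
    ids = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        record = ast.literal_eval(line)  # safe evaluation of literals only
        ids.append(record["id"])
    return ids


print(extract_ids(sample_lines))
```

With the real file, you could pass the open file object itself, for example extract_ids(open('city.txt')), since iterating a file yields its lines.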

How to change inquiry in lambda to a list? [duplicate]

I'm trying to make a list based on a data frame, where if a string is found under the "question" column, it is added. I seem to have made it work with a singular string, but I am not sure how to apply this to a list.
#pd.set_option("display.max_rows", None, "display.max_columns", None)
pd.set_option('display.max_colwidth', -1)
jp = pd.read_csv('jeopardy.csv', delimiter = ",")
jp = jp.rename(columns = {'Show Number': 'show_number', ' Air Date': 'air_date', ' Round': 'round', ' Category': 'category' , " Value": 'value', ' Question': 'question', ' Answer': 'answer'})
#print(jp.head())
print(jp.info())
jp_df = jp[jp.apply(lambda row: 'King' in row['question'], axis = 1)].reset_index(drop=True)
print(jp_df.info())
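I think this is what you want. Note that isin() only matches whole cell values, so to keep the rows whose question contains any of the strings, join them into one pattern and use str.contains():
import re
import pandas as pd
pd.set_option("display.max_rows", None, "display.max_columns", None)
pd.set_option('display.max_colwidth', None)
jp = pd.read_csv('jeopardy.csv', delimiter = ",")
jp = jp.rename(columns = {'Show Number': 'show_number', ' Air Date': 'air_date', ' Round': 'round', ' Category': 'category' , " Value": 'value', ' Question': 'question', ' Answer': 'answer'})
values_wanted = ['King', 'Queen']
pattern = '|'.join(re.escape(v) for v in values_wanted)
jp_list = jp[jp['question'].str.contains(pattern, na=False)]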

check key is present in python dictionary [duplicate]

Below is the "data" dict:
{' node2': {'Status': ' online', 'TU': ' 900', 'Link': ' up', 'Port': ' a0a-180', 'MTU': ' 9000'}, ' node1': {'Status': ' online', 'TU': ' 900', 'Link': ' up', 'Port': ' a0a-180', 'MTU': ' 9000'}}
I am trying to check whether the key node2 is present in the data dict with the code below, but it is not working. Please help.
if 'node2' in data:
    print "node2 Present"
else:
    print "node2Not present"
This is a perfectly appropriate way of determining whether a key is in a dictionary. Unfortunately, 'node2' is not in your dictionary; ' node2' is (note the leading space):
if ' node2' in data:
    print "node2 Present"
else:
    print "node2Not present"
To check whether a key is present in a dictionary:
data = {'node2': {'Status': ' online', 'TU': ' 900', 'Link': ' up', 'Port': ' a0a-180', 'MTU': ' 9000'}, ' node1': {'Status': ' online', 'TU': ' 900', 'Link': ' up', 'Port': ' a0a-180', 'MTU': ' 9000'}}
In Python 2.7, has_key() tests for the presence of a key in the dictionary (it was removed in Python 3):
if data.has_key('node2'):
    print("found key")
else:
    print("invalid key")
In Python 3.x, use the in operator: key in d returns True if d has the key, else False (this also works in Python 2):
if 'node2' in data:
    print("found key")
else:
    print("invalid key")
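If the stray spaces come from how the dictionary was built, one option is to normalize it once rather than remembering the space in every lookup. The following is a minimal Python 3 sketch; the strip_strings helper is hypothetical, and it assumes that no two keys become identical after stripping (if they do, the later one wins).

```python
def strip_strings(obj):
    """Recursively strip whitespace from string keys and values."""
    if isinstance(obj, dict):
        return {strip_strings(k): strip_strings(v) for k, v in obj.items()}
    if isinstance(obj, str):
        return obj.strip()
    return obj  # leave numbers and other types untouched


# The "data" dict from the question, with leading spaces in keys and values.
data = {
    ' node2': {'Status': ' online', 'TU': ' 900', 'Link': ' up', 'Port': ' a0a-180', 'MTU': ' 9000'},
    ' node1': {'Status': ' online', 'TU': ' 900', 'Link': ' up', 'Port': ' a0a-180', 'MTU': ' 9000'},
}

cleaned = strip_strings(data)
print('node2' in data)             # False: the real key is ' node2'
print('node2' in cleaned)          # True
print(cleaned['node2']['Status'])  # online
```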
