How to use map on List of lists in python? - python

I am trying to solve a problem where I have to read a .csv file into a list of lists (list1) and then use the map function to extract the desired output from list1 into another list (list2).
The csv file contains data like:
Last name, First name, Final, Grade
Alfalfa, Aloysius,49, D-
Alfred, University,48, D+
After reading the .csv into a list, I have to check the marks to decide whether each student will be selected, by using map on list1.
So I coded it like this:
import csv
from curses.ascii import isdigit
def selection(lis):
    for x in lis:
        if(x.isdigit() and int(x) > 50):
            return lis
list1=[]
list2=[]
with open('D:\C++\Programs\Advanced Programming\grades.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    next(csv_reader)
    for line in csv_reader:
        list1.append(line)
for i in list1:
    r = map(selection, i)
    R = list(r)
    list2.append(R)
print(list1)
print(list2)
list1 prints correctly
[['Alfalfa', ' Aloysius', '49', ' D-'], ['Alfred', ' University', '48', ' D+']....]
But my list2 is printing
[[None, None, None, None], [None, None, None, None].....]
I don't understand how to use map on a list of lists. Why is it printing None? Please help me solve it.

Just update your selection function to:
def selection(lis):
    return lis if (lis.isdigit() and int(lis) > 50) else None
map sends a single item to the function, not the entire list.
The output will be:
[['boo', 'foo', '53', 'a']]
[[None, None, '53', None]]
You can return something other than None in the else branch of the selection function if you want.
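To make the point about map concrete, here is a minimal sketch that applies map to each inner list in turn. The sample rows and the helper name keep_if_passing are made up for illustration; they are not from the original post.

```python
# Hypothetical sample rows standing in for the parsed CSV.
rows = [["boo", "foo", "53", "a"], ["Alfalfa", " Aloysius", "49", " D-"]]

def keep_if_passing(item):
    # map calls this once per item, so `item` is a single string, not a row.
    return item if item.isdigit() and int(item) > 50 else None

# Apply map to each inner list in turn.
result = [list(map(keep_if_passing, row)) for row in rows]
print(result)
# [[None, None, '53', None], [None, None, None, None]]
```

Each inner list keeps its length, because map always produces exactly one result per input item.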

Issue #1
The first problem is that you are going a little too deep. The code
for i in list1:
    r = map(selection, i)
is sending individual list items to selection when you are expecting the entire list to be sent to selection. You want to change that code to just be
r = map(selection, list1)
How this results in a bunch of Nones
If you add some print statements around the place to debug what is going on you would see that
for i in list1: makes i a single list such as ['Alfalfa', ' Aloysius', '49', ' D-'].
Then map sends each item from our list i to the selection function. So def selection(lis): doesn't actually receive a list; it receives one item from the list, such as Alfalfa or 49.
Then we have for x in lis: where we check each character in Alfalfa to see if it is a number greater than 50. Is A greater than 50? No. Is l greater than 50? No. And so on. The for loop finishes, and whenever a function finishes without returning anything, None is returned. Map then moves on to the next item in the list until it gets to a number such as 51, where it will check each character, a 5 and a 1 in this case. Since 5 is not greater than 50 and 1 is not greater than 50, we continue on. As you can see, you will never end up with a number greater than 50, since we are in a little too deep, checking each individual character instead of the whole item.
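The walkthrough above can be reproduced in a few lines. This is only a sketch: selection has the same body as in the question, and the sample row passed to it at the end is made up.

```python
def selection(lis):
    # Same body as in the question: nothing is returned unless some
    # element of `lis` is a digit string greater than 50.
    for x in lis:
        if x.isdigit() and int(x) > 50:
            return lis

# Iterating over a string yields one character at a time.
print(list("51"))       # ['5', '1']
# No single digit is greater than 50, so the function falls off the end.
print(selection("51"))  # None
# Given a whole row, the same function does find the mark.
print(selection(["Ann", "Lee", "51", "C"]))  # ['Ann', 'Lee', '51', 'C']
```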
Issue #2
The second problem is you want to use filter instead of map to ignore anything that returns None. filter will loop over each item sending each item to selection and will add the item to our list r only if selection returns something true.
The following code does what you want.
import csv
def selection(lis):
    for x in lis:
        if(x.isdigit() and int(x) > 50):
            return True
list1=[]
list2=[]
with open(r'D:\C++\Programs\Advanced Programming\grades.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file)
    next(csv_reader)
    for line in csv_reader:
        list1.append(line)
# Pass the whole list of rows to filter; selection receives one row at a time.
r = filter(selection, list1)
list2 = list(r)
print(list1)
print(list2)
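For a version that can be run without the file on disk, here is a minimal sketch that contrasts filter and map on whole rows. The sample data (including the third student) and the name passed are made up, and io.StringIO stands in for the real grades.csv.

```python
import csv
import io

# Made-up sample data in the same layout as grades.csv.
sample = io.StringIO(
    "Last name, First name, Final, Grade\n"
    "Alfalfa, Aloysius,49, D-\n"
    "Alfred, University,48, D+\n"
    "Doe, Jane,72, B\n"
)

def passed(row):
    # True if any cell in the row is a whole number greater than 50.
    return any(cell.strip().isdigit() and int(cell) > 50 for cell in row)

reader = csv.reader(sample)
next(reader)          # skip the header row
rows = list(reader)

kept = list(filter(passed, rows))  # keeps only the rows where passed() is true
flags = list(map(passed, rows))    # one result per row, nothing is dropped
print(kept)   # [['Doe', ' Jane', '72', ' B']]
print(flags)  # [False, False, True]
```

The difference in the two results is the reason filter fits this task: map always returns one value per row, while filter drops the rows that fail the test.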
How I'd solve this type of problem
Below is how I would go about solving this problem. I'm using list comprehension instead of map or filter.
import csv
csv_file = r'D:\C++\Programs\Advanced Programming\grades.csv'
with open(csv_file) as fh:
    csv_reader = csv.reader(fh)
    headers = next(csv_reader)
    data = list(csv_reader)
# Each row is a list; the "Final" mark is the third column (index 2).
passing_grades = [row for row in data if row[2].strip().isdigit() and int(row[2]) > 50]
print('passing grades: ', passing_grades)

Related

Python can't remove None value from list of lists

I'm trying to remove None values from data I read from a csv file. I converted blank values to None in the first part of the code below, but in the last part, when I invoke filter, it prints column_list with the None values still in it. I need to remove them so I can work out the max/min values of each column, which doesn't appear to work while they are in the list.
with (inFile) as f:
    _= next(f)
    list_of_lists = [[float(i) if i.strip() != '' else None for i in line.split(',')[2:]]
                     for line in f]
inFile.close()
log_GDP_list = [item[0] for item in list_of_lists]
social_support_list = [item[1] for item in list_of_lists]
healthy_birth_list = [item[2] for item in list_of_lists]
freedom_choices_list = [item[3] for item in list_of_lists]
generosity_list = [item[4] for item in list_of_lists]
confidence_gov_list = [item[5] for item in list_of_lists]
column_list = []
column_list.append([log_GDP_list, social_support_list, healthy_birth_list, freedom_choices_list, generosity_list, confidence_gov_list])
res = list(filter(None, column_list))
print(res)
Also, when running the filter on just one of the row lists (such as log_GDP_list) it removes the None values but I still get an error saying I can't run max() or min() on floats (all values were converted from strings to floats in the first bit of the code).
You currently have something like this:
l = [
    float(i) if i.strip() != '' else None
    for i in line.split(',')[2:]
]
What you want is this:
l = [
    float(i)
    for i in line.split(',')[2:]
    if i.strip()
]
This way, when i.strip() is an empty string (which is falsy), the item won't be added to the resulting list at all.
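One caveat: dropping blanks inside each row shifts the later values to the left, so item[3] may no longer refer to the same column in every row. If the columns need to stay aligned, an alternative is to keep None in the rows and filter per column instead. Note also that in the question column_list.append([...]) wraps all six lists in one outer element, so filter(None, column_list) only ever sees that single non-empty list. The following is a minimal sketch under those assumptions, with made-up numbers:

```python
# Made-up rows in the shape produced by the question's comprehension.
list_of_lists = [
    [7.5, 0.9, None],
    [None, 0.8, 66.0],
    [6.9, None, 70.5],
]

# zip(*rows) transposes rows into columns.
columns = [list(col) for col in zip(*list_of_lists)]

# Drop None per column; `is not None` keeps legitimate 0.0 values,
# which filter(None, ...) would discard because 0.0 is falsy.
cleaned = [[v for v in col if v is not None] for col in columns]

for col in cleaned:
    print(min(col), max(col))
```

With only floats left in each column, min() and max() work as expected.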

How to append items in list which starts with alphabet python

How can I find the sublists that start with a letter in Python 3 and append their items to a list?
import re
code_result = [['1', 'abc_123', '0.40','7.55'], ['paragraph', '100', 'ML MY'],
               ['2','abc_456', '0.99'], ['letters and words','end','99']]
index_list = []
sub_list = []
for i in range(0,len(code_result)):
    if code_result[i][0].isalpha():
        index_list.append([i,i-1])
for item in range(0,len(index_list)):
    temp = re.sub('[^0-9a-zA-Z]','',str(code_result[index_list[item][0]]))
    sub_list.append([code_result[index_list[item][1]][1]+" "+temp])
print(sub_list)
My code only works when the first item of the sublist is a single alphabetic word, not when it contains more than one.
Expected Output:
[['abc_123 paragraph 100 MLMY'],['abc_456 letters and words end 99']]
This will do what you need with minimal changes:
import re
code_result = [['1', 'abc_123', '0.40','7.55'], ['paragraph', '100', 'ML MY'], ['2','abc_456', '0.99'], ['letters and words','end','99']]
index_list = []
sub_list = []
for i in range(0,len(code_result)):
    if code_result[i][0][0].isalpha():
        index_list.append([i,i-1])
for item in range(0,len(index_list)):
    temp = re.sub('[^0-9a-zA-Z ]','',str(code_result[index_list[item][0]]))
    sub_list.append([code_result[index_list[item][1]][1]+" "+temp])
print(sub_list)
However, I am still unclear about what you are trying to do, and I think that whatever it is could be done in a better way.
Since only letters have an uppercase and lowercase variation, you could use that as a condition. The whole thing could fit in a single list comprehension:
sub_list = [[s for s in a if s[0].lower()!=s[0].upper()] for a in code_result]
# [['abc_123'], ['paragraph', 'ML MY'], ['abc_456'], ['letters and words', 'end']]
Note that your problem statement and expected output are ambiguous. They could also mean:
Sublists that start with an item that only contains letters (based on the question title):
[ a for a in code_result if a[0].lower()!=a[0].upper()]
# [['paragraph', '100', 'ML MY'], ['letters and words', 'end', '99']]
OR, based on the expected output, sub list elements that start with a letter, sometimes taken individually and other times using the whole sublist and arbitrarily concatenated into a single string within a sub list.
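The case-comparison test used above can be wrapped in a small helper and compared with str.isalpha(). This is a sketch only: starts_with_letter is a hypothetical name, and it checks just the first character of the first item.

```python
def starts_with_letter(s):
    # A cased letter changes under lower()/upper(); digits and punctuation do not.
    return bool(s) and s[0].lower() != s[0].upper()

code_result = [['1', 'abc_123', '0.40', '7.55'], ['paragraph', '100', 'ML MY'],
               ['2', 'abc_456', '0.99'], ['letters and words', 'end', '99']]

# Sublists whose first item starts with a letter.
by_case = [a for a in code_result if starts_with_letter(a[0])]
print(by_case)
# [['paragraph', '100', 'ML MY'], ['letters and words', 'end', '99']]

# str.isalpha() on the first character is the more direct spelling and also
# accepts letters that have no case (for example CJK characters).
by_isalpha = [a for a in code_result if a[0][:1].isalpha()]
print(by_isalpha)
```

For plain ASCII input like this the two tests select the same sublists.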
Here is another solution that ends up with your desired output, using a regular expression anchored at the start of the string (re.search with '^[a-zA-Z]').
import re
code_result = [['1', 'abc_123', '0.40','7.55'], ['paragraph', '100', 'ML MY'], ['2','abc_456', '0.99'], ['letters and words','end','99']]
l1 = []
l2 = []
last = False
for x in code_result:
    if last:
        for y in range(len(x)):
            l1.append(x[y])
            if y == len(x)-1:
                l2.append([' '.join(l1)])
                l1 = []
                last = False
    else:
        for y in x:
            a = re.search('^[a-zA-Z]', y)
            if a:
                l1.append(y)
                last = True
                break
print(l2)
This code iterates over your list of lists and checks whether an item in a list starts with a letter; on the first match it keeps that item and breaks out of the inner loop. If last is True, it appends all items from the subsequent list and joins everything into a single string.

pyspark: keep a function in the lambda expression

I have the following working code:
def replaceNone(row):
    myList = []
    row_len = len(row)
    for i in range(0, row_len):
        if row[i] is None:
            myList.append("")
        else:
            myList.append(row[i])
    return myList
rdd_out = rdd_in.map(lambda row : replaceNone(row))
Here each row is a Row object (from pyspark.sql import Row).
However, it is kind of lengthy and ugly. Is it possible to avoid making the replaceNone function by writing everything in the lambda process directly? Or at least simplify replaceNone()? Thanks!
I'm not sure what your goal is. It seems like you're just trying to replace all the None values in each row of rdd_in with empty strings, in which case you can use a list comprehension:
rdd_out = rdd_in.map(lambda row: [r if r is not None else "" for r in row])
The call to map produces a new list for every row in rdd_in, and the list comprehension replaces each None in that row with an empty string.
This worked on a trivial example (where I defined my own map, since plain lists have no map method):
def map(l, f):
    return [f(r) for r in l]
l = [[1,None,2],[3,4,None],[None,5,6]]
l2 = map(l, lambda row: [i if i is not None else "" for i in row])
print(l2)
>>> [[1, '', 2], [3, 4, ''], ['', 5, 6]]
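If keeping a named function is preferred, replaceNone itself can be reduced to a single comprehension. The sketch below uses the hypothetical name replace_none and plain lists in place of Spark rows, so it runs without a Spark session.

```python
def replace_none(row):
    # Build a new list with every None swapped for an empty string.
    return ["" if value is None else value for value in row]

# Plain lists stand in for Spark rows here; no Spark session is needed.
rows = [[1, None, 2], [3, 4, None], [None, 5, 6]]
cleaned = [replace_none(row) for row in rows]
print(cleaned)
# [[1, '', 2], [3, 4, ''], ['', 5, 6]]
```

With an RDD, such a function could be passed directly, as in rdd_in.map(replace_none), without wrapping it in a lambda.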

Change the display of a list took from text file

I have this code written in Python:
with open ('textfile.txt') as f:
    list=[]
    for line in f:
        line = line.split()
        if line:
            line = [int(i) for i in line]
            list.append(line)
print(list)
This reads integers from a text file and puts them in a list. But the result is:
[[10,20,34]]
However, I would like it to display like:
10 20 34
How do I do this? Thanks for your help!
You probably want to extend the list with the items rather than append each sub-list:
with open('textfile.txt') as f:
    list = []
    for line in f:
        line = line.split()
        if line:
            list += [int(i) for i in line]
print(" ".join([str(i) for i in list]))
If you append a list to a list, you create a sub-list:
a = [1]
a.append([2,3])
print(a)  # [1, [2, 3]]
If you add it with +=, you get:
a = [1]
a += [2,3]
print(a)  # [1, 2, 3]
with open('textfile.txt') as f:
    lines = [x.strip() for x in f.readlines()]
print(' '.join(lines))
With an input file 'textfile.txt' that contains:
10
20
30
prints:
10 20 30
It sounds like you are trying to print a list of lists. The easiest way to do that is to iterate over it and print each list.
for line in list:
    print(" ".join(str(i) for i in line))
Also, list is the name of a built-in type in Python (it is not a reserved keyword, but reusing the name hides the built-in), so try to avoid naming your variables that.
If you know that the file is not extremely long and you want a flat list of integers, you can build it in one step (two lines, one of which is the with open(...) line). To print it your way, convert the elements to strings and join them with ' '.join(...), like this:
#!python3
# Load the content of the text file as one list of integers.
with open('textfile.txt') as f:
    lst = [int(element) for element in f.read().split()]
# Print the formatted result.
print(' '.join(str(element) for element in lst))
Do not use the list identifier for your variables as it masks the name of the list type.

Error handling numpy.float?

I am working on a csv file using python.
I wrote the following script to treat the file:
import pickle
import numpy as np
from csv import reader, writer
dic1 = {'a': 2, 'b': 2, 'c': 2}
dic2 = {'a': 2,'b': 2,'c': 0}
number = dict()
for k in dic1:
    number[k] = dic1[k] + dic2[k]
ctVar = {'a': [0.093323751331788565, -1.0872670058072453, '', 8.3574590513050264], 'b': [0.053169909627947334, -1.0825742255395172, '', 8.0033788558001984], 'c': [-0.44681777279768059, 2.2380488442495348]}
Var = {}
for k in number:
    Var[k] = number[k]
def findIndex(myList, number):
    n = str(number)
    m = len(n)
    for elt in myList:
        e = str(elt)
        l = len(e)
        mi = min(m,l)
        if e[:mi-1] == n[:mi-1]:
            return myList.index(elt)
def sortContent(myList):
    if '' in myList:
        result = ['']
        myList.remove('')
    else:
        result = []
    myList.sort()
    result = myList + result
    return result
An extract of the csv file follows. (Note: the blanks are important. To make it easier to read, I marked them as BL, but they should just be empty cells.)
Each column contains a few distinct elements (including '') repeated many times.
a
0.0933237513
-1.0872670058
0.0933237513
BL
BL
0.0933237513
0.0933237513
0.0933237513
BL
Second column:
b
0.0531699096
-1.0825742255
0.0531699096
BL
BL
0.0531699096
0.0531699096
0.0531699096
BL
Third column:
c
-0.4468177728
2.2380488443
-0.4468177728
-0.4468177728
-0.4468177728
-0.4468177728
-0.4468177728
2.2380488443
2.2380488443
I have only posted an extract of the code (the part where I am facing a problem), so its purpose is not obvious here. Basically, it is part of a larger program that I use to modify this csv file and encode it differently.
In this extract, I am trying at some point (line 68) to sort the elements of a list that contains numbers and ''.
When I remove the line that does this, the elements printed are those of each column (without any repetition).
The problem is that, when I try to sort them, the '' entries are no longer taken into account. Yet when I tested my function sortContent with lists that contain '', it worked perfectly.
I thought this problem was related to the use of numpy.float64 elements in my list. So I converted all these elements into floats, but the problem remains.
Any help would be greatly appreciated!
I assume you mean to use sortContent on something else (as obviously if you want the values in your predefined lists in ctVar in a certain order, you can just put them in order in your code rather than sorting them at runtime).
Let's go through your sortContent piece by piece.
if '' in myList:
    result = ['']
    myList.remove('')
If the list object passed in (let's call it list 1) contains '', create a new list object (let's call it list 2) holding just '', and remove the first instance of '' from list 1.
myList.sort()
Now, sort the contents of list 1.
result = myList + result
Now create a new list object (call it list 3) with the contents of list 1 and list 2.
return result
Keep in mind that list 1 (the list object that was passed in) still has the '' removed.
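The side effect described above can be avoided by building new lists instead of changing the argument. The following is a minimal sketch, not the original function: sort_content is a hypothetical name, the sample values are made up, and it assumes every '' entry should be moved to the end rather than only the first one.

```python
def sort_content(values):
    # Non-mutating variant: numbers sorted first, all '' entries moved to the end.
    numbers = sorted(v for v in values if v != '')
    blanks = [v for v in values if v == '']
    return numbers + blanks

column = [0.09, '', -1.08, '', 0.09]
print(sort_content(column))  # [-1.08, 0.09, 0.09, '', '']
print(column)                # unchanged: [0.09, '', -1.08, '', 0.09]
```

Because sorted() returns a new list, the caller's list still contains its '' entries after the call.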
