Creating a list in Python

I have an input list and apply some if/else logic to it, trying to save the output in a list. When I check the type of the result, it is reported as "class list". I need to convert this list to a DataFrame in the next step so that I can write a Teradata query on top of it.
Also note that when I run the code below in one Jupyter cell and then query the list by index in the next cell, I always get the last value in the list.
I need help getting the output as a list instead of a "class list". Also, how do I convert the list/"class list" to a DataFrame?
data = ['login', 'signup', 'account']
for i in range(len(data)):
    source = []
    if data[i] == 'login':
        table = "sales.login_table"
    elif data[i] == 'signup':
        table = "sales.signup_table"
    elif data[i] == 'account':
        table = 'sales.account'
    elif data[i] == 'addcc':
        table = "sales.addcc"
    elif data[i] == 'consolidatedfunding':
        table = 'sales.consolidatedfunding'
    elif data[i] == 'deposit':
        table = 'sales.deposit'
    elif data[i] == 'holdsassessment':
        table = 'sales.holdsassessment'
    elif data[i] == 'onboardinggc':
        table = 'sales.onboardinggc'
    source.append(table)
    print(source)
print(source)

Output:
['sales.login_table']
['sales.signup_table']
['sales.account']

print(type(source))

Output:
<class 'list'>

You need to declare source = [] outside the for loop; otherwise it is reset to an empty list on every iteration, so only the last value survives.
source = []
for i in range(len(data)):
    if data[i] == 'login':
        table = "sales.login_table"
    elif data[i] == 'signup':
        table = "sales.signup_table"
    elif data[i] == 'account':
        table = 'sales.account'
    elif data[i] == 'addcc':
        table = "sales.addcc"
    elif data[i] == 'consolidatedfunding':
        table = 'sales.consolidatedfunding'
    elif data[i] == 'deposit':
        table = 'sales.deposit'
    elif data[i] == 'holdsassessment':
        table = 'sales.holdsassessment'
    elif data[i] == 'onboardinggc':
        table = 'sales.onboardinggc'
    source.append(table)
print(source)
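The if/elif chain can also be written as a dictionary lookup, which builds a single list by construction. This is only a minimal sketch: the name TABLES is hypothetical, and skipping keys that are missing from the mapping is an assumption (the code above does not handle that case explicitly).

```python
# Hypothetical mapping name; the table names are copied from the question.
TABLES = {
    "login": "sales.login_table",
    "signup": "sales.signup_table",
    "account": "sales.account",
    "addcc": "sales.addcc",
    "consolidatedfunding": "sales.consolidatedfunding",
    "deposit": "sales.deposit",
    "holdsassessment": "sales.holdsassessment",
    "onboardinggc": "sales.onboardinggc",
}

data = ["login", "signup", "account"]

# Build the list once; keys missing from TABLES are skipped (an assumption).
source = [TABLES[name] for name in data if name in TABLES]

print(source)        # ['sales.login_table', 'sales.signup_table', 'sales.account']
print(type(source))  # <class 'list'>
```

Note that <class 'list'> is simply how Python reports the type of an ordinary list, so no conversion from "class list" to list is needed. If pandas is available, pd.DataFrame({"table": source}) should give a one-column DataFrame; the column name "table" is an arbitrary choice.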

Related

More efficient way than iterrows

I have a DataFrame df with columns action and pointerID. In the code snippet below I'm iterating through every row which is pretty inefficient because the DataFrame is pretty large. Is there a more efficient way to do this?
annotated = []
curr_pointers = []
count = 1
for index, row in df.iterrows():
    action = row["action"]
    id = row["pointerID"]
    if action == "ACTION_MOVE":
        annotated.append(curr_pointers[id])
    elif (action == "ACTION_POINTER_DOWN") or (action == "ACTION_DOWN"):
        if row["actionIndex"] != id:
            continue
        if id >= len(curr_pointers):
            curr_pointers.append(count)
        else:
            curr_pointers[id] = count
        annotated.append(count)
        count = count + 1
    elif (action == "ACTION_POINTER_UP") or (action == "ACTION_UP") or (action == "ACTION_CANCEL"):
        if row["actionIndex"] != id:
            continue
        annotated.append(curr_pointers[id])
    else:
        print("{} unknown".format(action))
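Each row's result depends on state left by earlier rows, so the loop does not vectorize in an obvious way. One option is to keep the logic unchanged and iterate over the three columns in parallel with zip instead of iterrows, which avoids building a Series for every row and is usually noticeably cheaper. The sketch below takes plain sequences so it can run without pandas; the function name annotate and its parameter names are hypothetical.

```python
def annotate(actions, pointer_ids, action_indices):
    """Same state machine as the loop above, fed from three parallel sequences."""
    down = {"ACTION_POINTER_DOWN", "ACTION_DOWN"}
    up = {"ACTION_POINTER_UP", "ACTION_UP", "ACTION_CANCEL"}
    annotated = []
    curr_pointers = []
    count = 1
    for action, pid, action_index in zip(actions, pointer_ids, action_indices):
        if action == "ACTION_MOVE":
            annotated.append(curr_pointers[pid])
        elif action in down:
            if action_index != pid:
                continue
            if pid >= len(curr_pointers):
                curr_pointers.append(count)
            else:
                curr_pointers[pid] = count
            annotated.append(count)
            count += 1
        elif action in up:
            if action_index != pid:
                continue
            annotated.append(curr_pointers[pid])
        else:
            print("{} unknown".format(action))
    return annotated


# Small made-up sequence: two pointers going down, moving, and lifting.
actions = ["ACTION_DOWN", "ACTION_MOVE", "ACTION_POINTER_DOWN",
           "ACTION_MOVE", "ACTION_POINTER_UP", "ACTION_UP"]
pointer_ids = [0, 0, 1, 1, 1, 0]
action_indices = [0, 0, 1, 1, 1, 0]
print(annotate(actions, pointer_ids, action_indices))  # [1, 1, 2, 2, 2, 1]
```

With the DataFrame from the question, the call would presumably be annotate(df["action"], df["pointerID"], df["actionIndex"]).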

Apply result to dataset after df.iterrows

import numpy as np
import pandas as pd

df = pd.read_csv('./test22.csv')
df.head(5)
df = df.replace(np.nan, None)
for index,col in df.iterrows():
    # Extract only if date1 happened earlier than date2
    load = 'No'
    if col['date1'] == None or col['date2'] == None:
        load = 'yes'
    elif int(str(col['date1'])[:4]) >= int(str(col['date2'])[:4]) and \
            (len(str(col['date1'])) == 4 or len(str(col['date2'])) == 4):
        load = 'yes'
    elif int(str(col['date1'])[:6]) >= int(str(col['date2'])[:6]) and \
            (len(str(col['date1'])) == 6 or len(str(col['date2'])) == 6):
        load = 'yes'
    elif int(str(col['date1'])[:8]) >= int(str(col['date2'])[:8]):
        load = 'yes'
df.head(5)
After this preprocessing with iterrows, the result is not reflected in the actual dataset, as the code above shows: the value of load is computed for each row but never stored.
How can I apply it to the actual dataset?
Replace your for loop with a function that returns a boolean. You can then use df.apply to apply it to every row and either store the result as a new column (for example df['load'] = df.apply(should_load, axis=1)) or, as below, filter the DataFrame by it:
def should_load(x):
    # A missing date on either side always qualifies
    if x['date1'] is None or x['date2'] is None:
        return True
    # Compare on the year, year-month, or full-date prefix, depending on length
    elif int(str(x['date1'])[:4]) >= int(str(x['date2'])[:4]) and \
            (len(str(x['date1'])) == 4 or len(str(x['date2'])) == 4):
        return True
    elif int(str(x['date1'])[:6]) >= int(str(x['date2'])[:6]) and \
            (len(str(x['date1'])) == 6 or len(str(x['date2'])) == 6):
        return True
    elif int(str(x['date1'])[:8]) >= int(str(x['date2'])[:8]):
        return True
    return False

df[df.apply(should_load, axis=1)].head(5)

Confused by using too many for loops, if and else statements

I am developing a plugin for the GIS software, QGIS. I created a QTableWidget and wish to extract values from it:
The problem is that I use a lot of for loops and if/else statements which, up until the last few lines, seem to work fine. I can no longer follow the logic: the line print constraint_name only prints the last value, "Example_2". If I take it out of its else branch it prints all values correctly, but I need it to sit inside a condition:
qTable = self.dockwidget.tableWidget # QTableWidget
example_group = root.findGroup('Main group') # Group containing sub groups
all_items = []
gis_map = QgsMapLayerRegistry.instance().mapLayersByName( "Map" )[0] # Layer map in QGIS
idx = gis_map.fieldNameIndex("Rank") # Get "Rank" attribute field from gis_map
for row in range(qTable.rowCount()):
    for col in [0]: # For first column "Constraint name"
        constraint_item = qTable.item(row, col)
        constraint_name = str(constraint_item.text())
    for col in [1]: # For second column "Rank"
        item = qTable.item(row, col)
        item_string = str(item.text())
        all_items.append(item_string)
    for group in example_group.children(): # Search for specific group
        if group.name() == "Sub group":
            if len(set(all_items)) == 1: # If all items are the same
                # If "Rank" field exists in layer map
                if idx == -1:
                    print 'success'
                else:
                    print 'fail'
            else:
                if idx == -1:
                    print constraint_name
                else:
                    print 'fail'
Is there a way to tidy this up and still get the correct results?
My sincere thanks to the commenters who directed me to a much more efficient solution. Here is the working code (I'm sure it can be refined further):
qTable = self.dockwidget.tableWidget
example_group = root.findGroup('Main group')
all_items = []
gis_map = QgsMapLayerRegistry.instance().mapLayersByName( "Map" )[0]
idx = gis_map.fieldNameIndex("Rank")
for row in range(qTable.rowCount()):
    constraint_item = qTable.item(row, 0)
    constraint_name = str(constraint_item.text())
    item = qTable.item(row, 1)
    item_string = str(item.text())
    all_items.append(item_string)
    for group in example_group.children():
        if group.name() == "Sub group":
            if idx == -1:
                if len(set(all_items)) == 1:
                    print 'success'
                else:
                    print 'fail'
            else:
                print constraint_name

Properly pickling and unpickling a dictionary

I've just started working with the pickle module in Python 3.4.0 and am trying to apply it to a simple program which handles categories and words. So far it stores everything as planned, but when I try to load back what I dumped into the file, the structure appears to be empty:
new_data = int(input("New Data File?\n 1 = Yes\n 0 = No\nChoice: "))
if (new_data == 1):
    f = open('data.txt', 'wb+')
    data_d = {}
    pickle.dump(data_d, f)
    f.close()
PrMenu()
option = int(input("Option: "))
f = open('data.txt', 'rb+')
d = pickle.load(f)
#Functions inside this menu loop receive the structure (Dictionary)
#and modify it accordingly (add/modify/clear category/word), no
#pickling/unpickling is involved
while (option != 0):
    if (option == 1):
        add_c(d)
    elif (option == 2):
        modify_c(d)
    elif (option == 3):
        clear_c(d)
    elif (option == 4):
        add_w(d)
    elif (option == 5):
        modify_w(d)
    elif (option == 6):
        clear_w(d)
    elif (option == 7):
        pr_cw(d)
    elif (option == 8):
        pr_random(d)
    if (option != 0):
        PrMenu()
        option = int(input("Option: "))
#the output structure would be eg. {category_a:[word1, word2, word3, ...], category_b:[..., ...]}
pickle.dump(d, f)
f.close()
print("End of Program")
I'm not sure where the problem is, I hope I was clear enough.
Thanks.
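You are appending data to your file. After pickle.load the file position sits at the end of the first pickled object, so the first dataset in the file is the empty dictionary, which you read in, and the second dataset is the filled dictionary, which you never read again. You have to seek back to position 0 (f.seek(0)) before writing.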
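The fix can be sketched as follows. The helper name update_pickle and its mutate parameter are hypothetical, and the truncate() call is an addition not mentioned in the answer: it drops the old bytes so nothing stale is left behind if the new pickle happens to be shorter.

```python
import os
import pickle
import tempfile


def update_pickle(path, mutate):
    """Load one pickled object from path, let mutate change it, write it back."""
    with open(path, "rb+") as f:
        data = pickle.load(f)  # reading advances the file position
        mutate(data)
        f.seek(0)              # rewind, as the answer suggests
        f.truncate()           # assumption: discard the old contents first
        pickle.dump(data, f)
    return data


# Demo in a temporary directory so no real data file is touched.
path = os.path.join(tempfile.mkdtemp(), "data.txt")
with open(path, "wb") as f:
    pickle.dump({}, f)  # the "new data file" branch from the question

update_pickle(path, lambda d: d.setdefault("category_a", []).append("word1"))

with open(path, "rb") as f:
    print(pickle.load(f))  # {'category_a': ['word1']}
```

An alternative with the same effect is to close the file after loading and reopen it with mode 'wb' for the final dump, since that mode truncates the file on open.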

Django python escape \n characters

In the following function, I upload a file from a template and pass it to save_csv. The data gets corrupted if a field contains \n or \t (this is a tab-separated file).
1. If there is a \n or some other special character, the rest of the data is stored in the next row. How can I avoid this?
2. The check data is not None or data != "" still lets a null value be stored.
def save_csv(csv_file,cnt):
    ret = 1
    arr = []
    try:
        flag = 0
        f = open(csv_file)
        for l in f:
            if flag == 0:
                flag += 1
                continue
            parts = l.split("\t")
            counter = 1
            if(len(parts) > 6):
                ret = 2
            else:
                taggeddata = Taggeddata()
                for data in parts:
                    data = str(data.strip())
                    if counter == 1 and (data is not None or data != ""):
                        taggeddata.field1 = data
                    elif counter == 2 and (data is not None or data != ""):
                        taggeddata.field2 = data
                    elif counter == 3 and (data is not None or data != ""):
                        taggeddata.field3 = data
                    elif counter == 4 and (data is not None or data != ""):
                        taggeddata.field4 = data
                    elif counter == 5 and (data is not None or data != ""):
                        taggeddata.field5 = data
                    elif counter == 6 and (data is not None or data != ""):
                        taggeddata.field6 = data
                    counter += 1
                taggeddata.content_id = cnt.id
                taggeddata.save()
                arr.append(taggeddata)
        return ret
    except:
        write_exception("Error while processing data and storing")
Use the stdlib's csv module to parse your text; it will be much better at it.
Your expression data is not None or data != "" is always true, because data is a string after str(data.strip()) and therefore never None. You meant data is not None and data != "". Note that you can simplify this to just elif counter == 3 and data:
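Both suggestions can be combined in a short sketch. It assumes that fields containing line breaks are quoted in the file, which is what csv.reader needs in order to keep them in one row. The function name parse_tagged_rows is hypothetical, and plain dictionaries stand in for the Taggeddata model so the example runs without Django.

```python
import csv
import io


def parse_tagged_rows(stream, max_fields=6):
    """Parse tab-separated text into dicts keyed field1..field6."""
    reader = csv.reader(stream, delimiter="\t")
    next(reader, None)  # skip the header row
    rows = []
    for parts in reader:
        if len(parts) > max_fields:
            continue  # the original code flags such rows with ret = 2
        record = {}
        for position, value in enumerate(parts, start=1):
            value = value.strip()
            if value:  # false for "", so empty cells are not stored
                record["field{}".format(position)] = value
        rows.append(record)
    return rows


# The quoted first field contains a line break but stays in one row.
sample = io.StringIO('h1\th2\th3\n"multi\nline"\t\tlast\n')
print(parse_tagged_rows(sample))
# [{'field1': 'multi\nline', 'field3': 'last'}]
```

In the view from the question, each dictionary could then be copied onto a Taggeddata instance, for example with setattr, before calling save(). When reading from a real file, the csv documentation recommends opening it with newline="".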
