Simplify these assignment statements? [duplicate] - python

This question already has answers here:
How do I create variable variables?
(17 answers)
Closed 6 years ago.
Is there some way to simplify this code? Maybe using def or a for loop or lists or something? Thank you!
c=0
c2=0
c3=0
c4=0
c5=0
c6=0
c7=0
c8=0
c9=0
c10=0
c11=0
c12=0
c13=0
c14=0
c15=0
c16=0
c17=0
c18=0
c19=0
c20=0
c21=0
c22=0
c23=0
c24=0
c25=0
c26=0

I would much rather use a dictionary here:
>>> d = {"c{}".format(val): 0 for val in range(27)}
>>> d
{'c19': 0, 'c18': 0, 'c13': 0, 'c12': 0, 'c11': 0, 'c10': 0, 'c17': 0, 'c16': 0, 'c15': 0, 'c14': 0, 'c9': 0, 'c8': 0, 'c3': 0, 'c2': 0, 'c1': 0, 'c0': 0, 'c7': 0, 'c6': 0, 'c5': 0, 'c4': 0, 'c22': 0, 'c23': 0, 'c20': 0, 'c21': 0, 'c26': 0, 'c24': 0, 'c25': 0}
>>> d.get('c15')
0
>>> print(d.get('c10000'))
None
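If the variables are being used as counters (which the names suggest), individual entries can then be updated by key; a quick sketch under that assumption:
d = {"c{}".format(val): 0 for val in range(27)}
d["c15"] += 1  # increment one counter instead of writing c15 += 1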

There are a couple of solutions. What are you using all these variables for? The best solution is probably to put them in a list. So:
c = [0 for _ in range(number_of_variables)]
Then you access them like c[0], c[26], etc.
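A minimal sketch under that assumption (27 counters, matching the dictionary example above):
number_of_variables = 27
c = [0] * number_of_variables  # same result as the comprehension above

c[0] += 1   # instead of incrementing the variable c
c[26] += 1  # instead of incrementing c26
print(c[0], c[26])  # 1 1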

Related

How to construct a minimum bound box tuple for each geometry in a GeoDataFrame

I have a geopandas GeoDataFrame of lakes. I am trying to create a new column named 'MBB' with the bounding box for each lake.
I am using the bounds function from GeoPandas. However, this function exports minx, miny, maxx, and maxy in four separate columns.
# Preview the Use of the .bounds method to ensure it is exporting properly
lakes_a['geometry'].bounds
  minx   miny    maxx   maxy
-69.37  44.19  -69.36  44.20
-69.33  44.19  -69.33  44.19
My desired output would look like the below and would need to be reinserted into the GeoDataFrame:
MBB
(-69.37, 44.19, -69.36, 44.20)
(-69.33, 44.19, -69.33, 44.19)
My gut tells me that I need to use either shapely.geometry.Polygon or shapely.geometry.box.
The Polygon data used to create these is as follows.
Note: This is my first time working with GeoPandas (and new to Python as well); please forgive me if I made any mistakes :)
POLYGON Z ((-69.37232840276027 44.202966598054786 0, -69.37216940276056 44.202966598054786 0, -69.37181966942774 44.20276073138842 0, -69.37156540276146 44.20154879805699 0, -69.37092960276249 44.20138873139058 0, -69.370580002763 44.20111433139101 0, -69.37051640276309 44.20049693139197 0, -69.37042106942994 44.20042833139206 0, -69.37038926942995 44.20015393139249 0, -69.37013506943038 44.19976513139312 0, -69.36969020276439 44.19939919806035 0, -69.36838700276638 44.19903333139422 0, -69.36800546943368 44.198827531394556 0, -69.36787826943385 44.19864459806149 0, -69.3678466694339 44.19784419806274 0, -69.36797380276704 44.1973183313969 0, -69.36876860276584 44.19663233139795 0, -69.36759246943433 44.19658639806471 0, -69.3667658694356 44.1971809980638 0, -69.36641646943616 44.19722673139705 0, -69.36597146943683 44.19695219806414 0, -69.36549480277091 44.196403398065 0, -69.36470006943881 44.19583173139921 0, -69.36425520277282 44.19562593139955 0, -69.3618714694432 44.19500819806717 0, -69.36158546944364 44.19471099806759 0, -69.36152220277705 44.193887798068886 0, -69.36066406944508 44.19363613140263 0, -69.3604098027788 44.19345319806956 0, -69.3604098027788 44.193270198069854 0, -69.36066420277837 44.192995798070285 0, -69.36069540277833 44.19279379807057 0, -69.36069600277835 44.19278999807062 0, -69.36082306944479 44.19276719807061 0, -69.36098206944456 44.19237839807124 0, -69.3623808694424 44.19091499807348 0, -69.36288200277494 44.19074539807377 0, -69.36292126944159 44.19073213140712 0, -69.36342966944079 44.19084653140692 0, -69.36371580277364 44.191029531406684 0, -69.3639380027733 44.19198999807185 0, -69.36419220277293 44.19217279807157 0, -69.36451000277242 44.192195731404865 0, -69.36520940277131 44.191784131405484 0, -69.36587680277029 44.19157833140582 0, -69.3665442694359 44.19157853140581 0, -69.36733886943472 44.191761398072174 0, -69.36772020276743 44.19199013140519 0, -69.36791080276714 44.192516131404375 0, -69.368006002767 44.19256193140427 0, -69.36803786943364 44.19281339807054 0, -69.36845100276634 44.192767598070645 0, -69.36861000276605 44.19210453140499 0, -69.3694046027648 44.19155559807251 0, -69.36997680276392 44.1913039980729 0, -69.37058060276303 44.19118973140644 0, -69.37340926942528 44.19130413140624 0, -69.37448980275695 44.191601331405764 0, -69.37506200275607 44.19155559807251 0, -69.37541146942215 44.191326931406195 0, -69.37579286942156 44.19137273140615 0, -69.3759200027547 44.19146413140601 0, -69.37588826942141 44.19208153140505 0, -69.37534800275563 44.19322493140328 0, -69.37525260275572 44.19397959806872 0, -69.37541166942219 44.19436839806815 0, -69.37582466942155 44.19489433140069 0, -69.37633326942074 44.19521439806681 0, -69.37671466942015 44.19532873139997 0, -69.37798606941817 44.19532859806668 0, -69.37817680275123 44.19542013139983 0, -69.37801800275145 44.19578599806596 0, -69.37757286941883 44.19601473139892 0, -69.3765240027538 44.19601473139892 0, -69.37601546942125 44.19628913139849 0, -69.37557046942192 44.196723598064466 0, -69.37531620275564 44.1972039313971 0, -69.37528446942235 44.198598798061596 0, -69.37544340275548 44.19921619806064 0, -69.37582486942154 44.199970931392784 0, -69.37588846942145 44.20049679805862 0, -69.37607920275445 44.2009541980579 0, -69.37607926942115 44.20184593138987 0, -69.37582486942154 44.20223473138924 0, -69.37493486942293 44.2030807980546 0, -69.3744898694236 44.20337813138747 0, -69.37394946942442 44.20351539805392 0, -69.37340920275864 44.20351539805392 0, -69.37293226942603 
44.2031037980546 0, -69.37232840276027 44.202966598054786 0))
POLYGON Z ((-69.33154920282357 44.19536753139994 0, -69.33170806948999 44.195504798066395 0, -69.3318348694898 44.19584779806587 0, -69.33212086948936 44.196076598065474 0, -69.33224780282251 44.196396798064995 0, -69.3329150028215 44.19676293139776 0, -69.33291466948816 44.19706019806398 0, -69.33278746948832 44.19726599806364 0, -69.33211986948936 44.19733433139686 0, -69.33103926949104 44.19719673139707 0, -69.3307216028249 44.19701373139736 0, -69.33069020282494 44.19653339806479 0, -69.33046780282524 44.19630473139847 0, -69.33046800282528 44.1960073980656 0, -69.33094520282452 44.195458798066454 0, -69.33154920282357 44.19536753139994 0))
You could use pandas.DataFrame.to_records:
pd.Series(
    lakes_a['geometry'].bounds.to_records(index=False),
    index=lakes_a.index,
)
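If plain tuples are preferred over numpy records, a small sketch (assuming the GeoDataFrame is named lakes_a, as in the question) that builds the MBB column directly:
# .bounds returns a DataFrame with minx, miny, maxx, maxy columns aligned
# to lakes_a's index; turning each row into a tuple gives the desired MBB
lakes_a['MBB'] = lakes_a['geometry'].bounds.apply(tuple, axis=1)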

Convert complex list to flat list

I have a long list composed of numpy arrays and integers; below is an example:
[array([[2218.67288865]]), array([[1736.90215229]]), array([[1255.13141592]]), array([[773.36067956]]), array([[291.58994319]]), 0, 0, 0, 0, 0, 0, 0, 0, 0]
and I'd like to convert it to a regular list like so:
[2218.67288865, 1736.90215229, 1255.13141592, 773.36067956, 291.58994319, 0, 0, 0, 0, 0, 0, 0, 0, 0]
How can I do that efficiently?
You can use a generator for flattening the nested list:
def convert(obj):
    try:
        for item in obj:
            yield from convert(item)
    except TypeError:
        yield obj

result = list(convert(data))  # data is the nested input list
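A quick usage sketch (assuming the question's list is bound to data; the array entries come back as numpy scalars, so wrap them in float() if plain Python floats are needed):
import numpy as np

data = [np.array([[2218.67288865]]), np.array([[1736.90215229]]), 0, 0]
flat = [float(v) for v in convert(data)]
# [2218.67288865, 1736.90215229, 0.0, 0.0]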
list(itertools.chain.from_iterable(itertools.chain.from_iterable(...))) should work for removing 2 levels of nesting: just add or remove copies of chain.from_iterable(...) as needed.
Here the simplest approach also seems to be the fastest:
from numpy import array
from timeit import timeit

x = [array([[2218.67288865]]), array([[1736.90215229]]), array([[1255.13141592]]), array([[773.36067956]]), array([[291.58994319]]), 0, 0, 0, 0, 0, 0, 0, 0, 0]
[y if y.__class__ == int else y.item(0) for y in x]
# [2218.67288865, 1736.90215229, 1255.13141592, 773.36067956, 291.58994319, 0, 0, 0, 0, 0, 0, 0, 0, 0]
timeit(lambda: [y if y.__class__ == int else y.item(0) for y in x])
# 2.198630048893392
You can stick to numpy by using np.ravel (here l is the original nested list):
np.hstack([np.ravel(i) for i in l]).tolist()
Output:
[2218.67288865,
1736.90215229,
1255.13141592,
773.36067956,
291.58994319,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0,
0.0]

how to append keys and values from nested dictionaries to lists

I built three nested dictionaries to analyze my data. I am trying to analyze the values inside them to make a scatter plot with matplotlib, so I am creating lists to append my data to. My problem is that I get an error while appending: TypeError: unhashable type: 'list'. I am confused about whether I need to change the structure of my dictionaries, or whether there is a way to handle this with the structure I have already created.
My dictionaries are structured respectively like:
data_geo1:
'ENSG00000268358': {'Sample_19-leish_023_v2': 0, 'Sample_4-leish_012_v3': 0, 'Sample_25-leish027_v2': 0, 'Sample_6-leish_015_v3': 0, 'Sample_23-leish026_v2': 1, 'Sample_20-leish_023_v3': 0, 'Sample_18-leish_022_v3': 0, 'Sample_10-leish_017_v3': 0, 'Sample_13-leish_019_v2': 0, 'Sample_1-Leish_011_v2': 0, 'Sample_11-leish_018_v2': 0, 'Sample_3-leish_012_v2': 0, 'Sample_2-leish_011_v3': 0, 'Sample_29-leish032_v2': 0, 'Sample_8-leish_016_v3': 0, 'Sample_28-leish028_v3': 0, 'Sample_27-leish028_v2': 1, 'Sample_26-leish027_v3': 0, 'Sample_12-leish_018_v3': 0, 'Sample_5-leish_015_v2': 0, 'Sample_16-leish_021_v3': 0, 'Sample_21-leish_024_v2': 0, 'Sample_9-leish_017_v2': 0, 'Sample_24-leish026_v3': 1, 'Sample_22-leish_024_v3': 0, 'Sample_14-leish_019_v3': 0, 'Sample_30-leish032_v3': 0, 'Sample_7-leish_016_v2': 0, 'Sample_15-leish_021_v2': 0, 'Sample_17-leish_022_v2': 1}
data_ali:
{'ENSG00000268358': {'Sample_19-leish_023_v2': 0, 'Sample_16-leish_021_v3': 2, 'Sample_20': 0, 'Sample_24-leish026_v3': 1, 'Sample_6-leish_015_v3': 0, 'Sample_12-leish_018_v3': 0, 'Sample_22-leish_024_v3': 0, 'Sample_23-leish026_v2': 2, 'Sample_25-leish027_v2': 0, 'Sample_18-leish_022_v3': 1, 'Sample_14': 0, 'Sample_2-leish_011_v3': 0, 'Sample_13-leish_019_v2': 0, 'Sample_1-Leish_011_v2': 0, 'Sample_11-leish_018_v2': 0, 'Sample_20-leish_023_v3': 0, 'Sample_3-leish_012_v2': 0, 'Sample_10-leish_017_v3': 1, 'Sample_7': 0, 'Sample_29-leish032_v2': 1, 'Sample_8-leish_016_v3': 0, 'Sample_6': 0, 'Sample_7-leish_016_v2': 0, 'Sample_9': 0, 'Sample_8': 0, 'Sample_27-leish028_v2': 0, 'Sample_26-leish027_v3': 0, 'Sample_5': 1, 'Sample_4': 0, 'Sample_3': 0, 'Sample_19': 0, 'Sample_1': 0, 'Sample_2': 0, 'Sample_9-leish_017_v2': 0, 'Sample_5-leish_015_v2': 0, 'Sample_4-leish_012_v3': 0, 'Sample_21-leish_024_v2': 0, 'Sample_18': 0, 'Sample_13': 0, 'Sample_12': 0, 'Sample_11': 0, 'Sample_10': 1, 'Sample_17': 0, 'Sample_16': 0, 'Sample_15': 1, 'Sample_14-leish_019_v3': 0, 'Sample_30-leish032_v3': 0, 'Sample_28-leish028_v3': 1, 'Sample_15-leish_021_v2': 0, 'Sample_17-leish_022_v2': 0}
Here is all my code from the beginning; as you can see in the last lines, I tried to create a list and append my values to it, but I could not get it to work.
import os
import numpy as np
import matplotlib.pyplot as plt

path = "/home/ali/Desktop/data/"
root = "/home/ali/Desktop/SAMPLES/"

data_geo1 = {}
with open(path + "GSE98212_H_DE_genes_count.txt", "rt") as fin:  # data for sample 1-30
    h = fin.readline()
    sample1 = h.split()
    sample_names = [s.strip('"') for s in sample1[1:31]]
    for l in fin.readlines():
        l = l.strip().split()
        if l:
            gene1 = l[0].strip('"')
            data_geo1[gene1] = {}
            for i, x in enumerate(l[1:31]):
                data_geo1[gene1][sample_names[i]] = int(x)
#print(data_geo1)

data_geo2 = {}
with open(path + "GSE98212_L_DE_genes_count.txt", "rt") as fin:
    h = fin.readline()
    sample2 = h.split()
    sample_names = sample2[1:21]
    for l in fin.readlines():
        l = l.strip().split()
        if l:
            gene2 = l[0].strip()
            data_geo2[gene2] = {}
            for i, x in enumerate(l[1:21]):
                data_geo2[gene2][sample_names[i]] = int(x)
#print(data_geo2)

data_ali = {}
for sample_name in os.listdir(root):
    with open(os.path.join(root, sample_name, "counts.txt"), "r") as fin:
        for line in fin.readlines():
            gene, reads = line.split()
            reads = int(reads)
            if gene.startswith('ENSG'):
                data_ali.setdefault(gene, {})[sample_name] = reads
gene = l[0].strip()
#print(data_ali)

list_samples = data_ali[gene].keys()
#print(list_samples)
for sample in list_samples:
    reads_data_ali = []
    for gene in data_ali.keys():
        reads_data_ali.append(data_ali[gene][sample_name])
I expect the output to look like:
[[0, 0], [0, 2], [11, 12], [4, 4], [18, 17], [2, 2], [381, 383], [1019, 1020], [198, 194], [66, 65], [2223, 2230], [30, 30], [0, 0], [33, 34], [0, 0], [411, 409], [804, 803], [11829, 7286], [137, 139], [277, 278], [3475, 3482], [5, 5], [2, 1], [70, 70], [48, 48], [234, 232], [121, 120], [928, 925], [220, 159], [165, 165], [702, 700], [1645, 1643], [79, 78], [1064, 1067], [971, 972], [0, 0]]
You can try to avoid the KeyError by checking whether the key exists in your dictionary before the .append(...). Have a look at the dictionary .get() method; it is good for preventing this type of error.
Judging from your description, I assume the code that builds the data_ali and data_geo1 dictionaries produces the right output, so the problem is probably in the last part that builds the list.
I see two issues:
1. In for gene in data_ali.keys():, the loop body appends reads_data_geo1.append(data_geo1[gene1][sample_names]); here it uses [gene1] instead of the loop variable [gene].
2. In for sample in list_samples:, you should probably use reads_data_ali.append(data_ali[gene][sample]) rather than [sample_name].
Revise the names of these variables and see if it works; a sketch of the corrected loop follows.
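A minimal sketch of that corrected last loop, assuming the goal is one list of counts per sample and that data_ali is built as in the question (reads_per_sample is an illustrative name):
reads_per_sample = {}
for sample in list_samples:
    reads_data_ali = []
    for gene in data_ali:
        # use the current loop variable `sample`, not the stale `sample_name`
        reads_data_ali.append(data_ali[gene].get(sample, 0))
    reads_per_sample[sample] = reads_data_ali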

Passing array arguments to my own 2D function applied on Pandas groupby

I am given the following pandas dataframe
df
                         long       lat  weekday  hour
dttm
2015-07-03 00:00:38  1.114318  0.709553        6     0
2015-08-04 00:19:18  0.797157  0.086720        3     0
2015-08-04 00:19:46  0.797157  0.086720        3     0
2015-08-04 13:24:02  0.786688  0.059632        3    13
2015-08-04 13:24:34  0.786688  0.059632        3    13
2015-08-04 18:46:36  0.859795  0.330385        3    18
2015-08-04 18:47:02  0.859795  0.330385        3    18
2015-08-04 19:46:41  0.755008  0.041488        3    19
2015-08-04 19:47:45  0.755008  0.041488        3    19
I also have a function that receives two arrays as input:
import pandas as pd
import numpy as np
def time_hist(weekday, hour):
    hist_2d = np.histogram2d(weekday, hour, bins=[xrange(0, 8), xrange(0, 25)])
    return hist_2d[0].astype(int)
I wish to apply my 2D function to each and every group of the following groupby:
df.groupby(['long', 'lat'])
I tried passing *args to .apply():
df.groupby(['long', 'lat']).apply(time_hist, [df.weekday, df.hour])
but I get an error: "The dimension of bins must be equal to the dimension of the sample x."
Of course the dimensions mismatch. The whole idea is that I don't know in advance which mini [weekday, hour] arrays to send to each and every group.
How do I do that?
Do:
import pandas as pd
import numpy as np
df = pd.read_csv('file.csv', index_col=0)
def time_hist(x):
    hour = x.hour
    weekday = x.weekday
    # xrange is Python 2; on Python 3 use range() instead
    hist_2d = np.histogram2d(weekday, hour, bins=[xrange(0, 8), xrange(0, 25)])
    return hist_2d[0].astype(int)
print(df.groupby(['long', 'lat']).apply(time_hist))
Output:
long lat
0.755008 0.041488 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
0.786688 0.059632 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
0.797157 0.086720 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
0.859795 0.330385 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
1.114318 0.709553 [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
dtype: object
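Each group's value is then a 7x24 integer array (weekday x hour); a small usage sketch for pulling one group's histogram back out of the result (positions and counts refer to the sample data above):
result = df.groupby(['long', 'lat']).apply(time_hist)
hist = result.iloc[2]  # group (0.797157, 0.086720), numpy array of shape (7, 24)
print(hist[3, 0])      # weekday 3, hour 0 -> 2 rows in the sample data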

count objects created in django application in past X days, for each day

I have the following unsorted dict (dates are keys):
{"23-09-2014": 0, "11-10-2014": 0, "30-09-2014": 0, "26-09-2014": 0,
"03-10-2014": 0, "19-10-2014": 0, "15-10-2014": 0, "22-09-2014": 0,
"17-10-2014": 0, "29-09-2014": 0, "13-10-2014": 0, "16-10-2014": 0,
"12-10-2014": 0, "25-09-2014": 0, "14-10-2014": 0, "08-10-2014": 0,
"02-10-2014": 0, "09-10-2014": 0, "18-10-2014": 0, "24-09-2014": 0,
"28-09-2014": 0, "10-10-2014": 0, "21-10-2014": 0, "20-10-2014": 0,
"06-10-2014": 0, "04-10-2014": 0, "27-09-2014": 0, "05-10-2014": 0,
"01-10-2014": 0, "07-10-2014": 0}
I am trying to sort it from oldest to newest.
I've tried this code:
mydict = OrderedDict(sorted(mydict.items(), key=lambda t: t[0], reverse=True))
to sort it, and it almost worked. It produced a sorted dict, but it ignored the months:
{"01-10-2014": 0, "02-10-2014": 0, "03-10-2014": 0, "04-10-2014": 0,
"05-10-2014": 0, "06-10-2014": 0, "07-10-2014": 0, "08-10-2014": 0,
"09-10-2014": 0, "10-10-2014": 0, "11-10-2014": 0, "12-10-2014": 0,
"13-10-2014": 0, "14-10-2014": 0, "15-10-2014": 0, "16-10-2014": 0,
"17-10-2014": 0, "18-10-2014": 0, "19-10-2014": 0, "20-10-2014": 0,
"21-10-2014": 0, "22-09-2014": 0, "23-09-2014": 0, "24-09-2014": 0,
"25-09-2014": 0, "26-09-2014": 0, "27-09-2014": 0, "28-09-2014": 0,
"29-09-2014": 0, "30-09-2014": 0}
How can I fix this?
EDIT:
I need this to count the objects created in a Django application in the past X days, for each day.
event_chart = {}
date_list = [datetime.datetime.today() - datetime.timedelta(days=x) for x in range(0, 30)]
for date in date_list:
    event_chart[formats.date_format(date, "SHORT_DATE_FORMAT")] = Event.objects.filter(project=project_name, created=date).count()
event_chart = OrderedDict(sorted(event_chart.items(), key=lambda t: t[0]))
return HttpResponse(json.dumps(event_chart))
You can use the datetime module to parse the strings into actual dates:
>>> from datetime import datetime
>>> sorted(mydict.items(), key=lambda t: datetime.strptime(t[0], '%d-%m-%Y'), reverse=True)
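To get the dict ordered from oldest to newest in one go, the same key can feed an OrderedDict; a small sketch (without reverse=True, since reversing would put the newest date first; mydict is assumed to hold the data from the question):
from collections import OrderedDict
from datetime import datetime

ordered = OrderedDict(
    sorted(mydict.items(), key=lambda t: datetime.strptime(t[0], '%d-%m-%Y'))
)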
If you want to create a JSON response in the format {"22-09-2014": 0, "23-09-2014": 0, "localized date": count_for_that_date}, so that older dates appear earlier in the output, then you could make event_chart an OrderedDict:
event_chart = OrderedDict()
today = DT.date.today()  # use DT.datetime.combine(date, DT.time()) if needed
for day in range(29, -1, -1):  # last 30 days
    date = today - DT.timedelta(days=day)
    localized_date = formats.date_format(date, "SHORT_DATE_FORMAT")
    day_count = Event.objects.filter(project=name, created=date).count()
    event_chart[localized_date] = day_count
return HttpResponse(json.dumps(event_chart))
