Regex for data preparation and further processing in Python
I have quite a big data file that is not in a good state for further processing, so I want to extract the best out of it with regex and then process the data in pandas for further analysis.
The Data-Information segment repeats itself within the file and contains the necessary information.
My regex approach so far extracts some of the header information. What I'm still missing are the three sections of data points: I only need everything from the Points header to the last data point of each section. How could I capture these sections in one or several groups?
^(?:Data-Information.*)
(?:\nName:\t+)(?P<Name>.+)
(?:\nSample:\t+)(?P<Sample>.+)
((?:\r?\n.+)+)
(?:\nSystem:\t+)(?P<System>.+)
(?:\r?\n(?!Data-Information).*)*
Sample file
Data-Information
Name: Polymer A
Sample: Sunday till Monday
User: SUD
Count Segments: 5
Application: RHEOSTAR
Tool: CP
Date/Time: 24.10.2021; 13:37
System: CP25
Constants:
- Csr [min/s]: 2,5421
- Css [Pa/mNm]: 2,54679
Section: 1
Number measuring points: 0
Time limit: 2 measuring points, drop
Duration 30 s
Measurement profile:
Temperature T[-1] = 25 °C
Section: 2
Number measuring points: 30
Time limit: 30 measuring points
Duration 2 s
Points Time Viscosity Shear rate Shear stress Momentum Status
[s] [Pa·s] [1/s] [Pa] [mNm] []
1 62 10,93 100 1.090 4,45 TGC,Dy_
2 64 11,05 100 1.100 4,5 TGC,Dy_
3 66 11,07 100 1.110 4,51 TGC,Dy_
4 68 11,05 100 1.100 4,5 TGC,Dy_
5 70 10,99 100 1.100 4,47 TGC,Dy_
6 72 10,92 100 1.090 4,44 TGC,Dy_
Section: 3
Number measuring points: 0
Time limit: 2 measuring points, drop
Duration 60 s
Section: 4
Number measuring points: 30
Time limit: 30 measuring points
Duration 2 s
Points Time Viscosity Shear rate Shear stress Momentum Status
[s] [Pa·s] [1/s] [Pa] [mNm] []
*** 1 *** 242 -6,334E+6 -0,0000115 72,7 0,296 TGC,Dy_
2 244 63,94 10,3 661 2,69 TGC,Dy_
3 246 35,56 20,7 736 2,99 TGC,Dy_
4 248 25,25 31 784 3,19 TGC,Dy_
5 250 19,82 41,4 820 3,34 TGC,Dy_
Section: 5
Number measuring points: 300
Time limit: 300 measuring points
Duration 1 s
Points Time Viscosity Shear rate Shear stress Momentum Status
[s] [Pa·s] [1/s] [Pa] [mNm] []
1 301 4,142 300 1.240 5,06 TGC,Dy_
2 302 4,139 300 1.240 5,05 TGC,Dy_
3 303 4,138 300 1.240 5,05 TGC,Dy_
4 304 4,141 300 1.240 5,06 TGC,Dy_
5 305 4,156 300 1.250 5,07 TGC,Dy_
6 306 4,153 300 1.250 5,07 TGC,Dy_
Data-Information
Name: Polymer B
Sample: Monday till Tuesday
User: SUD
Count Segments: 5
Application: RHEOSTAR
Tool: CP
Date/Time: 24.10.2021; 13:37
System: CP25
Constants:
- Csr [min/s]: 2,5421
- Css [Pa/mNm]: 2,54679
Section: 1
Number measuring points: 0
Time limit: 2 measuring points, drop
Duration 30 s
Measurement profile:
Temperature T[-1] = 25 °C
Section: 2
Number measuring points: 30
Time limit: 30 measuring points
Duration 2 s
Points Time Viscosity Shear rate Shear stress Momentum Status
[s] [Pa·s] [1/s] [Pa] [mNm] []
1 62 10,93 100 1.090 4,45 TGC,Dy_
2 64 11,05 100 1.100 4,5 TGC,Dy_
3 66 11,07 100 1.110 4,51 TGC,Dy_
4 68 11,05 100 1.100 4,5 TGC,Dy_
5 70 10,99 100 1.100 4,47 TGC,Dy_
6 72 10,92 100 1.090 4,44 TGC,Dy_
Section: 3
Number measuring points: 0
Time limit: 2 measuring points, drop
Duration 60 s
Section: 4
Number measuring points: 30
Time limit: 30 measuring points
Duration 2 s
Points Time Viscosity Shear rate Shear stress Momentum Status
[s] [Pa·s] [1/s] [Pa] [mNm] []
*** 1 *** 242 -6,334E+6 -0,0000115 72,7 0,296 TGC,Dy_
2 244 63,94 10,3 661 2,69 TGC,Dy_
3 246 35,56 20,7 736 2,99 TGC,Dy_
4 248 25,25 31 784 3,19 TGC,Dy_
5 250 19,82 41,4 820 3,34 TGC,Dy_
Section: 5
Number measuring points: 300
Time limit: 300 measuring points
Duration 1 s
Points Time Viscosity Shear rate Shear stress Momentum Status
[s] [Pa·s] [1/s] [Pa] [mNm] []
1 301 4,142 300 1.240 5,06 TGC,Dy_
2 302 4,139 300 1.240 5,05 TGC,Dy_
3 303 4,138 300 1.240 5,05 TGC,Dy_
4 304 4,141 300 1.240 5,06 TGC,Dy_
5 305 4,156 300 1.250 5,07 TGC,Dy_
6 306 4,153 300 1.250 5,07 TGC,Dy_
One option is to do it in two steps.
First, get all the Data-Information parts using a pattern that starts with Data-Information and matches all following lines that do not start with Data-Information:
^Data-Information(?:\n(?!Data-Information$).*)*
Regex demo for Data-Information
Then, for every part, you can match the line that starts with Points, followed by all lines that contain at least one character (no empty lines):
^Points\b.*(?:\n.+)+
Regex demo for Points
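Combining both steps in Python could look roughly like this. This is only a sketch against a trimmed, hard-coded sample; for the real file you would read the text from disk, and I'm assuming ',' is the decimal separator and '.' the thousands separator, as in the sample data:

```python
import io
import re
import pandas as pd

# Trimmed stand-in for the real file contents
text = """Data-Information
Name: Polymer A
Points Time Viscosity
[s] [Pa.s]
1 62 10,93
2 64 11,05
Data-Information
Name: Polymer B
Points Time Viscosity
[s] [Pa.s]
1 301 4,142
"""

# Step 1: one match per Data-Information block
blocks = re.findall(r"^Data-Information(?:\n(?!Data-Information$).*)*",
                    text, flags=re.M)

# Step 2: within each block, match the Points header plus all following
# non-empty lines, then hand the data rows to pandas
tables = []
for block in blocks:
    name = re.search(r"^Name:\s*(.+)$", block, flags=re.M).group(1)
    for m in re.finditer(r"^Points\b.*(?:\n.+)+", block, flags=re.M):
        rows = m.group(0).splitlines()[2:]      # drop header + units lines
        df = pd.read_csv(io.StringIO("\n".join(rows)), sep=r"\s+",
                         header=None, decimal=",", thousands=".")
        df["name"] = name
        tables.append(df)

result = pd.concat(tables, ignore_index=True)
print(result)
```

The "*** 1 ***" rows in the real file have extra tokens, so those lines would need cleaning (or skipping) before read_csv sees them.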
Related
How to calculate the distance between pedestrians and vehicles in each frame? [closed]
I have data from a drone. The first table has the pedestrians' data in each frame: pedestrian id, frame, x_est, y_est, v_abs. The second table has the vehicles' data: vehicle id, frame, x_est, y_est, vel_est. For example, in frame number 1 I have 39 pedestrians and two vehicles. I want to create a new table in which the first column is the distance between each pedestrian and every vehicle in each frame. For example, with 39 pedestrians and 2 vehicles:

d1 = sqrt((ped1_x - veh1_x)^2 + (ped1_y - veh1_y)^2)
d2 = sqrt((ped1_x - veh2_x)^2 + (ped1_y - veh2_y)^2)
d3 = sqrt((ped2_x - veh1_x)^2 + (ped2_y - veh1_y)^2)
d4 = sqrt((ped2_x - veh2_x)^2 + (ped2_y - veh2_y)^2)

and so on. The second and third columns are the associated speeds of the pedestrian and the vehicle: if I get d1, I include the speed of ped1 and the speed of veh1; if I get d2, the speed of ped1 and the speed of veh2, and so on. I have 116 frames. I want to write code in Python or MATLAB to do these tasks. I tried Python with the following code, but it didn't work because the vehicles' data has 182 rows and the pedestrians' data has 3950:

if peds["frame"] == vehs["frame"]:
    distance = math.sqrt((peds.x_est - vehs.x_est)**2 + (peds.y_est - vehs.y_est)**2)

I'm thinking of adding a for loop over each frame. How do I modify the code to loop over each frame and calculate the distances between these objects? This is an example of the data.
Pedestrians data: id 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 frame 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 x_est 15.31251243 14.62957291 14.81940554 16.261254 14.25065235 13.71913744 12.77037156 11.82149333 11.02452266 10.4550769 9.962500442 10.56845947 11.17672903 11.70758495 14.28816743 13.41605196 11.6316746 11.4797624 13.3395301 10.98651306 11.89763531 12.46714593 13.90898049 11.17611058 9.20275126 11.10086732 11.66968305 12.88437259 13.37708455 14.17485782 14.81943565 15.65388549 17.7046208 18.76703333 19.22188971 19.03219028 19.63930168 19.26095788 20.54963754 21.87812196 y_est 24.04146967 23.85122822 22.59973819 21.46111998 22.25845431 22.52372129 22.37086056 22.82695733 22.7892778 22.78941678 21.42243054 21.49884121 21.46045752 21.53683577 19.86642817 19.75351002 19.29791125 17.24875682 16.56578255 14.97104209 10.11358571 9.733765266 9.165258769 8.102246321 8.216836276 6.659910277 6.774266283 2.865368769 3.05553263 4.459266668 4.193704362 4.420605884 4.26981189 3.547987191 4.042100815 4.876447865 5.294238544 6.090777216 3.966160063 4.762697865 v_abs 2.459007157 2.654334571 3.315403455 3.389573803 3.378566929 3.045539512 3.23011785 2.925099475 2.584998721 2.901642056 2.811892448 2.151019342 1.96347414 1.500567927 3.540985451 2.709992115 3.267972565 3.395063149 2.721779676 4.012212099 0.880854234 0.813933137 0.704372621 0.912089788 0.549592663 3.007799428 1.978963898 3.44757396 3.15737162 3.529382782 3.5556166 3.409764593 1.170765247 1.580709745 1.085228781 0.922279132 0.802698916 1.875894301 0.804975425 1.205954878 Vehicles data: id 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 frame 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12 13 13 14 14 15 15 16 16 17 17 18 18 19 19 20 20 x_est 13.12181538 20.79676544 13.01830532 20.73447182 
12.92056089 20.65301595 12.77984823 20.52422202 12.67147093 20.45207468 12.56504216 20.34050999 12.45704283 20.23392863 12.35122477 20.13000728 12.24825411 20.03070371 12.15111691 19.9359101 12.04685347 19.8497257 11.93870377 19.7593045 11.82507185 19.66695714 11.70785362 19.57813133 11.60090539 19.48509113 11.48174963 19.39305883 11.36790853 19.29876475 11.26066404 19.21938896 11.14538066 19.12658484 11.02149472 19.02832259 y_est 12.48631346 12.04945225 12.50757522 12.09243053 12.53150136 12.14119099 12.56295128 12.20178773 12.59034906 12.22451609 12.60945482 12.2800529 12.61885415 12.31947563 12.62842376 12.35799492 12.64016339 12.40403046 12.66696466 12.44343686 12.67398026 12.4828893 12.69217314 12.52761456 12.70404418 12.56747969 12.71201387 12.60893079 12.7352435 12.65130145 12.75948092 12.68935761 12.77039761 12.72362537 12.78182668 12.76864139 12.80477996 12.80719663 12.82141401 12.8529026 vel_est 2.607251494 2.480379041 2.607160714 2.479543211 2.60660445 2.478939563 2.60984436 2.482517229 2.610124888 2.479412431 2.610034708 2.481993315 2.609925021 2.483303075 2.609504867 2.484305008 2.608686875 2.485197239 2.607293512 2.485028991 2.606477634 2.483509058 2.60665034 2.483007857 2.607657856 2.482442343 2.609280114 2.481348408 2.609374438 2.481088229 2.612094232 2.480242567 2.613246915 2.479486909 2.612946522 2.476581726 2.614972124 2.475903808 2.618889831 2.47716006 Thanks in advance
Just think through the problem in words. For each pedestrian entry, for each car entry, if the frame number matches, compute the distance between them, and add a new row. Nothing to it. ped_id = "0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39" ped_frame = "1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1" ped_x_est = "15.31251243 14.62957291 14.81940554 16.261254 14.25065235 13.71913744 12.77037156 11.82149333 11.02452266 10.4550769 9.962500442 10.56845947 11.17672903 11.70758495 14.28816743 13.41605196 11.6316746 11.4797624 13.3395301 10.98651306 11.89763531 12.46714593 13.90898049 11.17611058 9.20275126 11.10086732 11.66968305 12.88437259 13.37708455 14.17485782 14.81943565 15.65388549 17.7046208 18.76703333 19.22188971 19.03219028 19.63930168 19.26095788 20.54963754 21.87812196" ped_y_est = "24.04146967 23.85122822 22.59973819 21.46111998 22.25845431 22.52372129 22.37086056 22.82695733 22.7892778 22.78941678 21.42243054 21.49884121 21.46045752 21.53683577 19.86642817 19.75351002 19.29791125 17.24875682 16.56578255 14.97104209 10.11358571 9.733765266 9.165258769 8.102246321 8.216836276 6.659910277 6.774266283 2.865368769 3.05553263 4.459266668 4.193704362 4.420605884 4.26981189 3.547987191 4.042100815 4.876447865 5.294238544 6.090777216 3.966160063 4.762697865" ped_v_abs = "2.459007157 2.654334571 3.315403455 3.389573803 3.378566929 3.045539512 3.23011785 2.925099475 2.584998721 2.901642056 2.811892448 2.151019342 1.96347414 1.500567927 3.540985451 2.709992115 3.267972565 3.395063149 2.721779676 4.012212099 0.880854234 0.813933137 0.704372621 0.912089788 0.549592663 3.007799428 1.978963898 3.44757396 3.15737162 3.529382782 3.5556166 3.409764593 1.170765247 1.580709745 1.085228781 0.922279132 0.802698916 1.875894301 0.804975425 1.205954878" veh_id = "900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 901 900 
901 900 901 900 901 900 901 900 901" veh_frame = "1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9 9 10 10 11 11 12 12 13 13 14 14 15 15 16 16 17 17 18 18 19 19 20 20" veh_x_est = "13.12181538 20.79676544 13.01830532 20.73447182 12.92056089 20.65301595 12.77984823 20.52422202 12.67147093 20.45207468 12.56504216 20.34050999 12.45704283 20.23392863 12.35122477 20.13000728 12.24825411 20.03070371 12.15111691 19.9359101 12.04685347 19.8497257 11.93870377 19.7593045 11.82507185 19.66695714 11.70785362 19.57813133 11.60090539 19.48509113 11.48174963 19.39305883 11.36790853 19.29876475 11.26066404 19.21938896 11.14538066 19.12658484 11.02149472 19.02832259" veh_y_est = "12.48631346 12.04945225 12.50757522 12.09243053 12.53150136 12.14119099 12.56295128 12.20178773 12.59034906 12.22451609 12.60945482 12.2800529 12.61885415 12.31947563 12.62842376 12.35799492 12.64016339 12.40403046 12.66696466 12.44343686 12.67398026 12.4828893 12.69217314 12.52761456 12.70404418 12.56747969 12.71201387 12.60893079 12.7352435 12.65130145 12.75948092 12.68935761 12.77039761 12.72362537 12.78182668 12.76864139 12.80477996 12.80719663 12.82141401 12.8529026" veh_vel_est = "2.607251494 2.480379041 2.607160714 2.479543211 2.60660445 2.478939563 2.60984436 2.482517229 2.610124888 2.479412431 2.610034708 2.481993315 2.609925021 2.483303075 2.609504867 2.484305008 2.608686875 2.485197239 2.607293512 2.485028991 2.606477634 2.483509058 2.60665034 2.483007857 2.607657856 2.482442343 2.609280114 2.481348408 2.609374438 2.481088229 2.612094232 2.480242567 2.613246915 2.479486909 2.612946522 2.476581726 2.614972124 2.475903808 2.618889831 2.47716006" import math import pandas as pd def convert(s,cvt): return [cvt(k) for k in s.split()] peds = list(zip( convert(ped_id,int), convert(ped_frame,int), convert(ped_x_est,float), convert(ped_y_est,float), convert(ped_v_abs,float) )) cars = list(zip( convert(veh_id,int), convert(veh_frame,int), convert(veh_x_est,float), convert(veh_y_est,float), convert(veh_vel_est,float) )) 
newdata = []
for ped in peds:
    for car in cars:
        if ped[1] != car[1]:
            continue
        dist = math.sqrt((ped[2]-car[2])**2 + (ped[3]-car[3])**2)
        newdata.append((ped[0], car[0], dist, ped[4], car[4]))

df = pd.DataFrame(newdata,
                  columns=("ped id", "car id", "distance", "ped vel", "car vel"))
print(df)

Output:

    ped id  car id   distance   ped vel   car vel
0        0     900  11.760986  2.459007  2.607251
1        0     901  13.186566  2.459007  2.480379
2        1     900  11.464494  2.654335  2.607251
3        1     901  13.316012  2.654335  2.480379
4        2     900  10.254910  3.315403  2.607251
..     ...     ...        ...       ...       ...
75      37     901   6.153415  1.875894  2.480379
76      38     900  11.303343  0.804975  2.607251
77      38     901   8.087069  0.804975  2.480379
78      39     900  11.675921  1.205955  2.607251
79      39     901   7.366554  1.205955  2.480379

[80 rows x 5 columns]
Pandas GroupBy with special sum
Let's say I have data like this, and I want to group it by feature and type:

feature  type  size
Alabama     1   100
Alabama     2    50
Alabama     3    40
Wyoming     1   180
Wyoming     2   150
Wyoming     3    56

When I apply df = df.groupby(['feature','type']).sum()[['size']], I get this, as expected:

             size
(Alabama,1)   100
(Alabama,2)    50
(Alabama,3)    40
(Wyoming,1)   180
(Wyoming,2)   150
(Wyoming,3)    56

However, I want to sum sizes over the same type only, not over both type and feature, while keeping the (feature, type) tuples as the index. I mean I want to get something like this:

             size
(Alabama,1)   280
(Alabama,2)   200
(Alabama,3)    96
(Wyoming,1)   280
(Wyoming,2)   200
(Wyoming,3)    96

I am stuck trying to find a way to do this. I need some help, thanks.
Use set_index for a MultiIndex, and then transform with sum, which returns a same-length Series from the aggregate function:

df = df.set_index(['feature','type'])
df['size'] = df.groupby(['type'])['size'].transform('sum')
print (df)

              size
feature type
Alabama 1      280
        2      200
        3       96
Wyoming 1      280
        2      200
        3       96

EDIT: First aggregate both columns, and then use transform:

df = df.groupby(['feature','type']).sum()
df['size'] = df.groupby(['type'])['size'].transform('sum')
print (df)

              size
feature type
Alabama 1      280
        2      200
        3       96
Wyoming 1      280
        2      200
        3       96
Here is one way, mapping each row's type to the per-type total:

df['size_type'] = df['type'].map(df.groupby('type')['size'].sum())
df.groupby(['feature', 'type'])['size_type'].sum()
# feature  type
# Alabama  1       280
#          2       200
#          3        96
# Wyoming  1       280
#          2       200
#          3        96
# Name: size_type, dtype: int64
Graphically displaying BLAST alignments from local source
I have an issue that I am trying to work through. I have a large dataset of about 25,000 genes that seem to be the product of domain shuffling or gene fusions. I would like to view these alignments in PDF format based on BLAST outfmt 6 output. I have a BLAST output file for each of these genes, with one query sequence (the recombinogenic gene) and a varying number of subject genes, with the following columns:

qseqid sseqid evalue qstart qend qlen sstart send slen length

I was hoping to parse the files through some code to produce images like the attached file, using the following example BLAST output file:

Cluster_1___Hsap10003 Cluster_2___Hsap00200 1e-30 5 100 300 10 105 240 95
Cluster_1___Hsap10003 Cluster_2___Hsap00200 1e-10 200 230 300 205 235 30 95
Cluster_1___Hsap10003 Cluster_3___Aver00900 1e-20 5 100 300 10 105 125 100
Cluster_1___Hsap10003 Cluster_3___Atha00809 1e-20 5 110 300 5 115 120 105
Cluster_1___Hsap10003 Cluster_4___Ecol00002 1e-10 70 170 300 205 235 30 95
Cluster_1___Hsap10003 Cluster_4___Ecol00003 1e-30 75 175 300 10 105 240 95
Cluster_1___Hsap10003 Cluster_4___Sfle00009 1e-10 80 180 300 205 235 30 95
Cluster_1___Hsap10003 Cluster_5___Spom00010 1e-30 160 260 300 10 105 240 95
Cluster_1___Hsap10003 Cluster_5___Scer01566 1e-10 170 270 300 205 235 30 95
Cluster_1___Hsap10003 Cluster_5___Afla00888 1e-30 175 275 300 10 105 240 95

I am looking for the query sequence to be a thick coloured bar, and the alignment section of each subject to be a thick colourful bar, with thin black lines showing the rest of the gene length (one subject per line, showing all alignment sections against the query). Does anyone know of any software, or any GitHub code, that may do something like this? Thanks so much!
Filtering records in Pandas python - syntax error
I have a pandas data frame that looks like this:

   duration  distance  speed  hincome  fi_cost     type
0       359      1601      4        3    40.00  cycling
1       625      3440      6        3    86.00  cycling
2       827      4096      5        3   102.00  cycling
3      1144      5704      5        2   143.00  cycling

If I use the following, I export a new csv that pulls only those records with a distance less than 5000:

distance_1 = all_results[all_results.distance < 5000]
distance_1.to_csv('./distance_1.csv', ",")

Now I wish to export a csv with values from 5001 to 10000. I can't seem to get the syntax right...

distance_2 = all_results[10000 > all_results.distance < 5001]
distance_2.to_csv('./distance_2.csv', ",")
Unfortunately, because of how Python chained comparisons work, we can't use the 50 < x < 100 syntax when x is some vector-like quantity. You have several options. You could create two boolean Series and use & to combine them:

>>> all_results[(all_results.distance > 3000) & (all_results.distance < 5000)]
   duration  distance  speed  hincome  fi_cost     type
1       625      3440      6        3       86  cycling
2       827      4096      5        3      102  cycling

Use between to create a boolean Series and then use that to index (note that it's inclusive by default, though):

>>> all_results[all_results.distance.between(3000, 5000)]  # inclusive by default
   duration  distance  speed  hincome  fi_cost     type
1       625      3440      6        3       86  cycling
2       827      4096      5        3      102  cycling

Or finally you could use .query:

>>> all_results.query("3000 < distance < 5000")
   duration  distance  speed  hincome  fi_cost     type
1       625      3440      6        3       86  cycling
2       827      4096      5        3      102  cycling
all_results[all_results.distance.between(5001, 10000)]
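A runnable check of the 5001–10000 filter; the rows here are made up but shaped like the question's frame:

```python
import pandas as pd

# Illustrative sample: only the columns the filter touches
all_results = pd.DataFrame({
    "duration": [359, 625, 827, 1144, 2000],
    "distance": [1601, 3440, 4096, 5704, 9500],
})

# between() is inclusive on both ends, so this keeps 5001..10000 exactly
distance_2 = all_results[all_results.distance.between(5001, 10000)]
print(distance_2)
```

The result could then be exported with distance_2.to_csv('./distance_2.csv') as in the question.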
Python barbs wrong direction
There is probably a really simple answer to this and I'm only asking as a last resort, as I usually find my answers by searching, but I can't figure this out. Basically I'm plotting some wind barbs in Python, but they are pointing in the wrong direction and I don't know why. Data is imported from a file and put into lists. I found on another Stack Overflow post how to set the U, V for barbs using np.sin and np.cos, which gives the correct wind speed but the wrong direction. I'm basically plotting a very simple tephigram or skew-T.

# Program to read in radiosonde data from a file named "raob.dat"
# Import numpy since we are going to use numpy arrays and the loadtxt function.
import numpy as np
import matplotlib.pyplot as plt

# Open the file for reading and store the file handle as "f"
f = open('data.dat')
# Read the data from the file handle f. np.loadtxt() is useful for reading
# simply-formatted text files.
datain = np.loadtxt(f)
f.close()

# Copy the different columns into pressure, temperature and dewpoint temperature.
# Note that the colon means consider all elements in that dimension,
# and remember indices start from zero.
p = datain[:, 0]
temp = datain[:, 1]
temp_dew = datain[:, 2]
wind_dir = datain[:, 3]
wind_spd = datain[:, 4]

print('Pressure/hPa: ', p)
print('Temperature/C: ', temp)
print('Dewpoint temperature: ', temp_dew)
print('Wind Direction/Deg: ', wind_dir)
print('Wind Speed/kts: ', wind_spd)

# for the barb vectors - this is the bit I think is causing the problem
u = wind_spd*np.sin(wind_dir)
v = wind_spd*np.cos(wind_dir)

# change units
#p = p/10
#temp = temp/10
#temp_dew = temp_dew/10

# plot graphs
fig1 = plt.figure()
x1 = temp
x2 = temp_dew
y1 = p
y2 = p
x = np.linspace(50, 50, len(y1))

plt.plot(x1, y1, 'r', label='Temp')
plt.plot(x2, y2, 'g', label='Dew Point Temp')
plt.legend(loc=3, fontsize='x-small')
plt.gca().invert_yaxis()
#fig2 = plt.figure()
plt.barbs(x, y1, u, v)
plt.yticks(y1)
plt.grid(axis='y')
plt.show()

The barbs should mostly point in the same direction, as you can see from the direction column (in degrees) in the data. Any help is appreciated. Thank you. Here is the data that is used:

996 25.2 24.9 290 12
963.2 24.5 22.6 315 42
930.4 23.8 20.1 325 43
929 23.8 20 325 43
925 23.4 19.6 325 43
900 22 17 325 43
898.6 21.9 17 325 43
867.6 20.1 16.5 320 41
850 19 16.2 320 44
807.9 16.8 14 320 43
779.4 15.2 12.4 320 44
752 13.7 10.9 325 43
725.5 12.2 9.3 320 44
700 10.6 7.8 325 45
649.7 7 4.9 315 44
603.2 3.4 1.9 325 49
563 0 -0.8 325 50
559.6 -0.2 -1 325 50
500 -3.5 -4.9 335 52
499.3 -3.5 -5 330 54
491 -4.1 -5.5 332 52
480.3 -5 -6.4 335 50
427.2 -9.7 -11 330 45
413 -11.1 -12.3 335 43
400 -12.7 -14.4 340 42
363.9 -16.9 -19.2 350 37
300 -26.3 -30.2 325 40
250 -36.7 -41.2 330 35
200 -49.9 0 335 0
150 -66.6 0 0 10
100 -83.5 0 0 30

Liam
np.sin and np.cos expect radians, but your wind directions are in degrees. Instead of

u = wind_spd*np.sin(wind_dir)
v = wind_spd*np.cos(wind_dir)

try:

u = wind_spd*np.sin((np.pi/180)*wind_dir)
v = wind_spd*np.cos((np.pi/180)*wind_dir)

(http://tornado.sfsu.edu/geosciences/classes/m430/Wind/WindDirection.html)
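For reference, a minimal sketch of the conversion, with values taken from the first two rows of the question's data. One caveat: meteorological wind direction is conventionally the direction the wind blows from, so depending on your convention you may also need to negate u and v:

```python
import numpy as np

wind_dir = np.array([290.0, 315.0])  # degrees, first two data rows
wind_spd = np.array([12.0, 42.0])    # knots

# Convert degrees to radians before taking sin/cos
u = wind_spd * np.sin(np.radians(wind_dir))
v = wind_spd * np.cos(np.radians(wind_dir))

# The conversion preserves the wind speed
assert np.allclose(np.hypot(u, v), wind_spd)
```

np.radians(wind_dir) is equivalent to (np.pi/180)*wind_dir, just more readable.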