Python input from a <pre> tag - python

So I am writing some code in Python 2.7 to pull some information from a website, pull the relevant data from that set, then format that data in a way that is more useful. Specifically, I am wanting to take information from a html <pre> tag, put it into a file, turn that information in the file into an array (using numpy), and then do my analysis from that. I am stuck on the "put into a file" part. It seems that when I put it into a file, it is a 1x1 matrix or something and so it won't do what I hope it will. On an attempt previous to the code sample below, the error I got was: IndexError: index 5 is out of bounds for axis 0 with size 0 I had the index for array just to test if it would provide output from what I have so far.
Here is my code so far:
#Pulling data from GFS lamps
from lxml import html
import requests
import numpy as np
ICAO = raw_input("What station would you like GFS lamps data for? ")
page = requests.get('http://www.nws.noaa.gov/cgi-bin/lamp/getlav.pl?sta=' + ICAO)
tree = html.fromstring(page.content)
Lamp = tree.xpath('//pre/text()') #stores class of //pre html element in list Lamp
gfsLamps = open('ICAO', 'w') #stores text of Lamp into a new file
gfsLamps.write(Lamp[0])
array = np.genfromtxt('ICAO') #puts file into an array
array[5]
You can use KOGD as the ICAO to test this. As is, I get Value Error: Some Errors were detected and it lists Lines 2-23 (Got 26 columns instead of 8). What is the first step that I am doing wrong for what I want to do? Or am I just going about this all wrong?

The problem isn't in the putting data into the file part, its getting it out using genfromtxt. The problem is that genfromtxt is a very rigid function, mostly needs complete data unless you specify lots of options to skip columns and rows. Use this one instead:
arrays = [np.array(map(str, line.split())) for line in open('ICAO')]
The arrays variable will contain array of each line which contains each individual element in that line seperated by a space, for ex if your line has the following data:
a b cdef 124
the array for this line will be:
['a','b','cdef','124']
arrays will contain array of each line like this, which can be processed as you wish further.
So complete code is:
from lxml import html
import requests
import numpy as np
ICAO = raw_input("What station would you like GFS lamps data for? ")
page = requests.get('http://www.nws.noaa.gov/cgi-bin/lamp/getlav.pl?sta=' + ICAO)
tree = html.fromstring(page.content)
Lamp = tree.xpath('//pre/text()') #stores class of //pre html element in list Lamp
gfsLamps = open('ICAO', 'w') #stores text of Lamp into a new file
gfsLamps.write(Lamp[0])
gfsLamps.close()
array = [np.array(map(str, line.split())) for line in open('ICAO')]
print array

Related

Converting pixels into wavelength using 2 FITS files

I am new to python and FITS image files, as such I am running into issues. I have two FITS files; the first FITS file is pixels/counts and the second FITS file (calibration file) is pixels/wavelength. I need to convert pixels/counts into wavelength/counts. Once this is done, I need to output wavelength/counts as a new FITS file for further analysis. So far I have managed to array the required data as shown in the code below.
import numpy as np
from astropy.io import fits
# read the images
image_file = ("run_1.fits")
image_calibration = ("cali_1.fits")
hdr = fits.getheader(image_file)
hdr_c = fits.getheader(image_calibration)
# print headers
sp = fits.open(image_file)
print('\n\nHeader of the spectrum :\n\n', sp[0].header, '\n\n')
sp_c = fits.open(image_calibration)
print('\n\nHeader of the spectrum :\n\n', sp_c[0].header, '\n\n')
# generation of arrays with the wavelengths and counts
count = np.array(sp[0].data)
wave = np.array(sp_c[0].data)
I do not understand how to save two separate arrays into one FITS file. I tried an alternative approach by creating list as shown in this code
file_list = fits.open(image_file)
calibration_list = fits.open(image_calibration)
image_data = file_list[0].data
calibration_data = calibration_list[0].data
# make a list to hold images
img_list = []
img_list.append(image_data)
img_list.append(calibration_data)
# list to numpy array
img_array = np.array(img_list)
# save the array as fits - image cube
fits.writeto('mycube.fits', img_array)
However I could only save as a cube, which is not correct because I just need wavelength and counts data. Also, I lost all the headers in the newly created FITS file. To say I am lost is an understatement! Could someone point me in the right direction please? Thank you.
I am still working on this problem. I have now managed (I think) to produce a FITS file containing the wavelength and counts using this website:
https://www.mubdirahman.com/assets/lecture-3---numerical-manipulation-ii.pdf
This is my code:
# Making a Primary HDU (required):
primaryhdu = fits.PrimaryHDU(flux) # Makes a header # or if you have a header that you’ve created: primaryhdu = fits.PrimaryHDU(arr1, header=head1)
# If you have additional extensions:
secondhdu = fits.ImageHDU(wave)
# Making a new HDU List:
hdulist1 = fits.HDUList([primaryhdu, secondhdu])
# Writing the file:
hdulist1.writeto("filename.fits", overwrite=True)
image = ("filename.fits")
hdr = fits.open(image)
image_data = hdr[0].data
wave_data = hdr[1].data
I am sure this is not the correct format for wavelength/counts. I need both wavelength and counts to be contained in hdr[0].data
If you are working with spectral data, it might be useful to look into specutils which is designed for common tasks associated with reading/writing/manipulating spectra.
It's common to store spectral data in FITS files using tables, rather than images. For example you can create a table containing wavelength, flux, and counts columns, and include the associated units in the column metadata.
The docs include an example on how to create a generic "FITS table" writer with wavelength and flux columns. You could start from this example and modify it to suit your exact needs (which can vary quite a bit from case to case, which is probably why a "generic" FITS writer is not built-in).
You might also be able to use the fits-wcs1d format.
If you prefer not to use specutils, that example still might be useful as it demonstrates how to create an Astropy Table from your data and output it to a well-formatted FITS file.

How convert multidimensional array to two dimensional array

Here, my code feats value form text file; and create matrices as multidimensional array, but the problem is the code create more then two dimensional array, that I can't manipulate, I need two dimensional array, how I do that?
Explain algorithm of my code:
Moto of code:
My code fetch value from a specific folder, each folder contain 7 'txt' file, that generate from one user, in this way multiple folder contain multiple data of multiple user.
step1: Start a 1st for loop, and control it using how many folder have in specific folder,and in variable 'path' store the first path of first folder.
step2: Open the path and fetch data of 7 txt file using 2nd for loop.after feats, it close 2nd for loop and execute the rest code.
step3: Concat the data of 7 txt file in one 1d array.
step4(Here the problem arise): Store the 1d arry of each folder as 2d array.end first for loop.
Code:
import numpy as np
from array import *
import os
f_path='Result'
array_control_var=0
#for feacth directory path
for (path,dirs,file) in os.walk(f_path):
if(path==f_path):
continue
f_path_1= path +'\page_1.txt'
#Get data from page1 indivisualy beacuse there string type data exiest
pgno_1 = np.array(np.loadtxt(f_path_1, dtype='U', delimiter=','))
#only for page_2.txt
f_path_2= path +'\page_2.txt'
with open(f_path_2) as f:
str_arr = ','.join([l.strip() for l in f])
pgno_2 = np.asarray(str_arr.split(','), dtype=int)
#using loop feach data from those text file.datda type = int
for j in range(3,8):
#store file path using variable
txt_file_path=path+'\page_'+str(j)+'.txt'
if os.path.exists(txt_file_path)==True:
#genarate a variable name that auto incriment with for loop
foo='pgno_'+str(j)
else:
break
#pass the variable name as string and store value
exec(foo + " = np.array(np.loadtxt(txt_file_path, dtype='i', delimiter=','))")
#z=np.array([pgno_2,pgno_3,pgno_4,pgno_5,pgno_6,pgno_7])
#marge all array from page 2 to rest in single array in one dimensation
f_array=np.concatenate((pgno_2,pgno_3,pgno_4,pgno_5,pgno_6,pgno_7), axis=0)
#for first time of the loop assing this value
if array_control_var==0:
main_f_array=f_array
else:
#here the problem arise
main_f_array=np.array([main_f_array,f_array])
array_control_var+=1
print(main_f_array)
current my code generate array like this(for 3 folder)
[
array([[0,0,0],[0,0,0]]),
array([0,0,0])
]
Note: I don't know how many dimension it have
But I want
[
array(
[0,0,0]
[0,0,0]
[0,0,0])
]
I tried to write a recursive code that recursively flattens the list of lists into one list. It gives the desired output for your case, but I did not try it for many other inputs(And it is buggy for certain cases such as :list =[0,[[0,0,0],[0,0,0]],[0,0,0]])...
flat = []
def main():
list =[[[0,0,0],[0,0,0]],[0,0,0]]
recFlat(list)
print(flat)
def recFlat(Lists):
if len(Lists) == 0:
return Lists
head, tail = Lists[0], Lists[1:]
if isinstance(head, (list,)):
recFlat(head)
return recFlat(tail)
else:
return flat.append(Lists)
if __name__ == '__main__':
main()
My idea behind the code was to traverse the head of each list, and check whether it is an instance of a list or an element. If the head is an element, this means I have a flat list and I can return the list. Else, I should recursively traverse more.

creating a numpy array in a loop

I want to create a numpy array by parsing a .txt file. The .txt file consists of features of iris flowers seperated by commas. every line is has one flower example with 5 data seperated with 4 commas. first 4 number is features and the last one is the name. I parse the .txt in a loop and want to append (using numpy.append probably) every lines parsed data into a numpy array called feature_table.
heres the code;
import numpy as np
iris_data = open("iris_data.txt", "r")
for line in iris_data:
currentline = line.split(",")
#iris_data_parsed = (currentline[0] + " , " + currentline[3] + " , " + currentline[4])
#sepal_length = numpy.array(currentline[0])
#petal_width = numpy.array(currentline[3])
#iris_names = numpy.array(currentline[4])
feature_table = np.array([currentline[0]],[currentline[3]],[currentline[4]])
print (feature_table)
print(feature_table.shape)
so I want to create a numpy array using only first, fourth and fifth data in every line
but I can't make it work as I want to. tried reading numpy docs but couldn't understand it.
While the people in the comments are right in that you are not persisting your data anywhere, your problem, I assume, is incorrect np.array construction. You should enclose all of the arguments in a list like this:
feature_table = np.array([currentline[0],currentline[3],currentline[4]])
And get rid of redundant [ and ] around the arguments.
See the official documentation for more examples. Basically all of the input data needs to be grouped/separated to be only 1 argument as Python will consider the other arguemnts as different positional arguments.

I want to extract the MAC address of the following python-wifi 0.6.1 code

The following code tries to extract the MAC address of all nearby APs and then saves them in a matrix, i dont know what is wrong. it uses the library python-wifi 0.6.1. Here is the code and error:
`
import errno
import sys
import types
import pythonwifi.flags
from pythonwifi.iwlibs import Wireless, WirelessInfo, Iwrange, getNICnames, getWNICnames
i=0
ArregloMAC=[20][30]
wifi= Wireless('wlan0')
results = wifi.scan()
(num_channels, frequencies) = wifi.getChannelInfo()
print "%-8.16s Scan completed :" % (wifi.ifname, )
for ap in results:
index = 1
ArregloMAC[i][index-1]= str("%d-%s" % (_, ap.bssid))
index = index+1
print ArregloMAC`
IndexError: list index out of range
This line
ArregloMAC=[20][30]
will give you an index out of range straight off. What it says is, create a list of one element, [20], then take the 31st element of that list, and assign it to ArregloMAC. Since the list has only one element you will inevitably get an error.
It looks like you are trying to declare a 2-dimensional array. That is not how Python lists work.

I want to store more than one 2d arrays in file using python

I am using python 2.7. I tried to store 2d arrays in file, but it stored only recent value. Suppose if I enter values for 3 arrays which are of 4 rows and two columns then it just store recent single value which i entered for last array. I used numpy for taking input for array. I tried this code:
import numpy as np
from math import *
def main ():
i_p = input("\n Enter number of input patterns:")
out = raw_input("\n Enter number of output nodes:")
hidden = raw_input("\n Enter number of hidden layers:")
print 'Patterns:'
for p in range(0,i_p):
print "z[%d]"%p
rows=input("Enter no of rows:")
cols=input("Enter no of coloumns:")
ff=open('array.txt','w')
for r in range(0,rows):
for c in range(0,cols):
z=np.matrix(input())
ff.write(z)
np.savetxt('array.txt',z)
if __name__=="__main__":
main()
Your
np.savetxt('array.txt',z)
opens the file for a fresh write; thus it destroys anything written to that file before.
Try:
ff=open('array.txt','w')
for i in range(3):
z = np.ones((3,5))*i
np.savetxt(ff,z)
This should write 9 lines, with 5 columns
I was going to adapt your:
for r in range(0,rows):
for c in range(0,cols):
z=np.matrix(input())
np.savetxt...
But that doesn't make sense. You don't write by 'column' with savetxt.
Go to a Python interpreter, make a simple array (not np.matrix), and save it. Make several arrays and save those. Look at what you saved.

Categories

Resources