I am very new to pandas and want to convert this HTML table to a CSV file with pandas. However, the CSV file contains a weird character, and not all of the table was converted.
Here's my code. I have read about using BeautifulSoup, but I'm not sure how to use it.
import as pandas
df = pd.read_html('https://aim-sg.caas.gov.sg/aip/2020-10-13/final/2020-09-10-Non-AIR'
'AC/html/eAIP/ENR-3.1-en-GB.html?s=B2EE1C5E1D2A684224A194E69D18338A560504FC#ENR-3.1')
df[0].to_csv('ENR3.0.csv')
Thank you!
Edit: I have changed my import to import pandas as dp, but I still could not convert the whole HTML table to a CSV file.
Greatly appreciate all your help!
You can do this with pandas itself. The import statement is wrong. Here is the corrected version:
import pandas as pd
df = pd.read_html('https://aim-sg.caas.gov.sg/aip/2020-10-13/final/2020-09-10-Non-AIR'
'AC/html/eAIP/ENR-3.1-en-GB.html?s=B2EE1C5E1D2A684224A194E69D18338A560504FC#ENR-3.1')
df[0].to_csv('ENR3.0.csv', index = False)
If you want to save every DataFrame in the list df, replace the last line with this:
for x in range(len(df)):
    df[x].to_csv(f"CSV_File_{x+1}.csv", index = False)
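If the "weird sign" shows up when the CSV is opened in Excel, one possible cause is the file encoding rather than the parsing. Here is a minimal sketch, assuming that is the problem: write each table with encoding="utf-8-sig". The tables list is a hypothetical stand-in for the list returned by pd.read_html, and the output folder and file names are only examples.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for the list of DataFrames returned by pd.read_html(url)
tables = [
    pd.DataFrame({"Name": ["alpha", "beta"], "Note": ["5° E", "café"]}),
    pd.DataFrame({"Name": ["gamma"], "Note": ["≥ 150"]}),
]

# Example output folder (a temporary directory, so nothing is overwritten)
out_dir = Path(tempfile.mkdtemp())

written = []
for i, table in enumerate(tables, start=1):
    target = out_dir / f"CSV_File_{i}.csv"
    # utf-8-sig writes a byte order mark, which helps Excel detect UTF-8
    table.to_csv(target, index=False, encoding="utf-8-sig")
    written.append(target)

print([p.name for p in written])
```

If the characters still look wrong afterwards, the encoding was probably not the cause and the source HTML is worth checking instead.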
There is an issue with the import statement.
It should be import pandas as pd, not import as pandas, because your code below uses the alias pd.
You could also look into Beautiful Soup with the lxml parser to extract the data you need (lxml is very fast).
This link might help you out:
BeautifulSoup different parsers
If you need any more help, leave a comment on this post and I will try to sort out your issue :)
Here is your code with the correction:
import pandas as pd
df = pd.read_html('https://aim-sg.caas.gov.sg/aip/2020-10-13/final/2020-09-10-Non-AIR'
'AC/html/eAIP/ENR-3.1-en-GB.html?s=B2EE1C5E1D2A684224A194E69D18338A560504FC#ENR-3.1')
df[0].to_csv('ENR3.0.csv')
Related
Hey guys, I've looked around a lot at importing CSV files with pandas. However, even though my file path is correct, I get tons of errors.
import pandas as pd
df = pd.read_csv(r"C:\Users\Liam\PycharmProjects\assignment1\pipeline-incidents-comprehensive-data.csv")
print(df)
all errors as seen here
I am very new to Python (1 week), so I realize this is probably a very simple problem. Any assistance is greatly appreciated.
import pandas as pd
path = "C:\\Users\\Liam\\PycharmProjects\\assignment1\\pipeline-incidents-comprehensive-data.csv"
df = pd.read_csv(path)  # pass the path; read_csv() with no argument raises a TypeError
print(df)
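Both ways of writing the Windows path give the same string, so either works as long as the path is actually passed to read_csv. A small sketch using the path from the question (the existence check is an optional extra I added, not part of the original code):

```python
from pathlib import Path

raw = r"C:\Users\Liam\PycharmProjects\assignment1\pipeline-incidents-comprehensive-data.csv"
escaped = "C:\\Users\\Liam\\PycharmProjects\\assignment1\\pipeline-incidents-comprehensive-data.csv"

# A raw string and a string with doubled backslashes are the same path
same = raw == escaped
print(same)

# Optional sanity check before calling pd.read_csv(raw)
print(Path(raw).exists())
```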
Hello, here is a screener tool for Finviz. My stock_list result is an object of type Screener, and I am trying to put it into a DataFrame, but I am having issues because the data is one long string divided by pipes. I tried to use str, but that method does not exist on the Screener class. I am new to Python; this looks easy, but I just don't know the proper syntax. Can anyone help? Thank you!
import pandas as pd
import nest_asyncio
from finviz.screener import Screener
import csv
import sys
from datetime import datetime
nest_asyncio.apply()
filters = ['idx_sp500']  # Companies in the S&P 500 index
stock_list = Screener(filters=filters, order='price')
You could output the data to a csv file and read the file using pandas:
stock_list = Screener(filters=filters, order='price')
stock_list.to_csv(filename="stocks.csv")
df = pd.read_csv("stocks.csv")
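If you do end up holding pipe-separated text rather than a file, pandas can parse it directly from memory. This is only a sketch: the sample text below is hypothetical, and the real Screener output may be laid out differently.

```python
import io

import pandas as pd

# Hypothetical pipe-separated text; the real Screener output may differ
text = "Ticker|Company|Price\nAAA|Example One|10.5\nBBB|Example Two|20.0\n"

# sep="|" tells read_csv to split columns on the pipe character
df = pd.read_csv(io.StringIO(text), sep="|")
print(df)
```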
I have a large DataFrame in a CSV file, sample1, and I need to generate a new CSV file containing only the first 100 rows. I have written code for it, but I get a KeyError: the label [100] is not in the index.
This is what I tried. Any help would be appreciated.
import pandas as pd
data_frame = pd.read_csv("C:/users/raju/sample1.csv")
data_frame1 = data_frame[:100]
data_frame.to_csv("C:/users/raju/sample.csv")
The correct syntax is with iloc:
data_frame.iloc[:100]
A more efficient way to do it is to use the nrows argument, whose purpose is exactly to read only part of a file. This way you avoid wasting time and memory parsing rows you don't need:
import pandas as pd
data_frame = pd.read_csv("C:/users/raju/sample1.csv", nrows=100)  # nrows counts data rows; the header is not included
data_frame.to_csv("C:/users/raju/sample.csv")
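Note that nrows counts data rows only; the header line is read in addition to them. A quick check on in-memory data (the column names and values are made up for the example) comparing nrows with iloc:

```python
import io

import pandas as pd

# Build a small in-memory CSV: one header line plus 250 data rows
csv_text = "id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(250))

# Read only the first 100 data rows while parsing
first_100 = pd.read_csv(io.StringIO(csv_text), nrows=100)

# Read everything, then slice the first 100 rows by position
sliced = pd.read_csv(io.StringIO(csv_text)).iloc[:100]

print(len(first_100), len(sliced))
print(first_100.equals(sliced))
```

Both approaches give the same rows; nrows just avoids parsing the rest of the file.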
OK, so I'm looking to create a program that will interact with an Excel spreadsheet. The approach that seemed to work best was converting it to a CSV file. I've managed to write a program that prints the data, but I want it to edit the data and save the changes back to the CSV file itself.
Sorry if it's a bit confusing, as my programming skills aren't great.
Here's the code:
import csv
with open('wert.csv') as csvfile:
    freq = csv.reader(csvfile, delimiter=',')
    for row in freq:
        print(row[0], row[1], row[2])
If anyone has a better idea on how to make this program work then it would be greatly appreciated.
Thanks
You could try using the pandas package, a widely used data analysis/manipulation library.
import pandas as pd
data = pd.read_csv('foo.csv')
#change data here, see pandas documentation
data.to_csv('bar.csv')
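To make the "change data here" step concrete, here is a minimal sketch that edits one cell, adds a column, and writes the result back to the same file. The column names and values are hypothetical, and a temporary folder is used so no real file is touched.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Create a small example file (hypothetical columns and values)
work_dir = Path(tempfile.mkdtemp())
csv_path = work_dir / "wert.csv"
csv_path.write_text("name,count,price\nfoo,1,2.5\nbar,3,4.0\n")

data = pd.read_csv(csv_path)

# Change a single value: set count to 10 on the row where name is "foo"
data.loc[data["name"] == "foo", "count"] = 10

# Add a computed column
data["total"] = data["count"] * data["price"]

# Overwrite the original file; index=False keeps the row index out of the CSV
data.to_csv(csv_path, index=False)

print(csv_path.read_text())
```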
You can find the docs here
If your CSV file is composed of just numbers (floats), or numbers and a header, you can try reading it with:
import numpy as np
data=np.genfromtxt('name.csv',delimiter=',',skip_header=1)
Then modify your data in python, and save it with:
data_modified=data**2 #for example
np.savetxt('name_modified.csv',data_modified,delimiter=',',header='whaterverheader,you,want')
You can read the Excel file directly with pandas and do the processing there:
import pandas
measured_data = pandas.read_excel(filename)
print(pd.read_excel(File,Sheet_Name,0,None,0,None,["Column_Name"],1))
Since I am a noob with pandas, I want to retrieve a column of an Excel sheet as an array using pandas. I tried the code above, but it didn't really work.
The way to do it is:
import pandas as pd
df = pd.read_excel(File, sheet_name=Sheet_Name)  # the keyword is sheet_name; the older sheetname is no longer accepted
print(df['column_name'])
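Selecting a column this way gives a pandas Series. If you need an actual array, you can convert it. A small sketch, where the DataFrame is a hypothetical stand-in for the result of pd.read_excel and the column names are made up:

```python
import pandas as pd

# Hypothetical stand-in for df = pd.read_excel(File, sheet_name=Sheet_Name)
df = pd.DataFrame({"Column_Name": [1.5, 2.5, 3.5], "Other": ["a", "b", "c"]})

values = df["Column_Name"].to_numpy()  # NumPy array
as_list = df["Column_Name"].tolist()   # plain Python list

print(values)
print(as_list)
```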