print(pd.read_excel(File,Sheet_Name,0,None,0,None,["Column_Name"],1))
I am new to pandas and want to retrieve one column of an Excel sheet as an array. I tried the code above, but it didn't work.
The way to do it is:
import pandas as pd
df = pd.read_excel(File, sheet_name=Sheet_Name)  # the keyword is sheet_name; the old sheetname spelling was removed
print(df['Column_Name'])
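Since the question asked for an array rather than a pandas Series, here is a minimal sketch of the conversion step. The small in-memory frame is a made-up stand-in for whatever `pd.read_excel` returns, so the example runs without an Excel file.

```python
import pandas as pd

# Hypothetical stand-in for the result of pd.read_excel(...); the data is made up.
df = pd.DataFrame({"Column_Name": [10, 20, 30], "Other": ["a", "b", "c"]})

# Selecting one column gives a Series; to_numpy() converts it to a NumPy array.
column_array = df["Column_Name"].to_numpy()

# tolist() gives a plain Python list instead, if that is what is needed.
column_list = df["Column_Name"].tolist()

print(column_array)
print(column_list)
```

If only one column is needed, `read_excel` also accepts a `usecols` argument, which avoids loading the other columns at all.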
I'm using Jupyter Notebook and I want to display my data frame.
I have this code:
import pandas as pd
import openpyxl
f = pd.ExcelFile('urbanpop.xlsx')
f.sheet_names
df1 = f.parse("1960-1966")
print(df1)
When I print the data frame (df1), I get something like this:
And what I want to get is something like this:
How can I do it?
Thank you.
I found a solution.
I just need to write "df1" as the last line of the cell instead of "print(df1)".
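A rough sketch of why this works: `print` uses the plain-text representation of the frame, while a bare expression at the end of a notebook cell is rendered from its HTML representation. The data below is made up for illustration.

```python
import pandas as pd

# Made-up data standing in for the sheet parsed from urbanpop.xlsx.
df1 = pd.DataFrame({"country": ["A", "B"], "1960": [1.5, 2.5]})

# What print(df1) shows: a plain-text table.
text_view = repr(df1)

# What Jupyter renders when `df1` is the last expression in a cell: an HTML table.
html_view = df1._repr_html_()

print(text_view)
print(html_view[:60])
```

To get the rendered table from the middle of a cell, `display(df1)` (from `IPython.display`) does the same thing as the bare expression.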
I am very new to pandas and wanted to convert this HTML table to a CSV file with pandas. However, my CSV file contains a weird sign, and not all of the table was converted.
Here's my code. I read about using BeautifulSoup, but I'm not sure how to use it.
import as pandas
df = pd.read_html('https://aim-sg.caas.gov.sg/aip/2020-10-13/final/2020-09-10-Non-AIR'
'AC/html/eAIP/ENR-3.1-en-GB.html?s=B2EE1C5E1D2A684224A194E69D18338A560504FC#ENR-3.1')
df[0].to_csv('ENR3.0.csv')
Thank you!
Edited: I have changed my import to import pandas as dp, but I still could not convert the whole HTML table to a CSV file.
Greatly appreciate all your help!
You can use pandas itself to do this. The problem is the import statement. Here is the correct version:
import pandas as pd
df = pd.read_html('https://aim-sg.caas.gov.sg/aip/2020-10-13/final/2020-09-10-Non-AIR'
'AC/html/eAIP/ENR-3.1-en-GB.html?s=B2EE1C5E1D2A684224A194E69D18338A560504FC#ENR-3.1')
df[0].to_csv('ENR3.0.csv', index = False)
pd.read_html returns a list of dataframes, one per table found on the page. If you want to save all of them, replace the last line with this:
for x in range(len(df)):
    df[x].to_csv(f"CSV_File_{x+1}.csv", index = False)
There is an issue in the import statement.
It should be import pandas as pd and not import as pandas, since you are using the alias pd in the code below.
Read up on Beautiful Soup and use the lxml parser to parse the required data (it is very fast).
This link might help you out:
BeautifulSoup different parsers
If you need any other help, leave a comment on this post and I will try to sort out your issue :)
Corrected version of your code:
import pandas as pd
df = pd.read_html('https://aim-sg.caas.gov.sg/aip/2020-10-13/final/2020-09-10-Non-AIR'
'AC/html/eAIP/ENR-3.1-en-GB.html?s=B2EE1C5E1D2A684224A194E69D18338A560504FC#ENR-3.1')
df[0].to_csv('ENR3.0.csv')
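The "weird sign" is not explained in the thread. One possible cause, and this is only an assumption, is that the table contains non-ASCII characters and the CSV is opened in a program that does not detect UTF-8. A minimal sketch of writing the file with a byte-order mark, which some spreadsheet programs use to recognize UTF-8 (the data below is made up):

```python
import os
import tempfile

import pandas as pd

# Made-up table with a non-ASCII character, standing in for df[0] from read_html.
table = pd.DataFrame({"Route": ["A464"], "Remark": ["5° track"]})

# Write to a temporary directory so the example leaves no files behind.
out_path = os.path.join(tempfile.mkdtemp(), "ENR3.0.csv")

# "utf-8-sig" writes UTF-8 with a leading byte-order mark (BOM).
table.to_csv(out_path, index=False, encoding="utf-8-sig")

with open(out_path, "rb") as fh:
    first_bytes = fh.read(3)

# Reading back with the same encoding strips the BOM again.
round_trip = pd.read_csv(out_path, encoding="utf-8-sig")
```

If the stray character has a different origin, for example a non-breaking space in the HTML, this will not remove it.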
I've created a CSV file from a dataframe with a shape of (1081, 165233). I did this using this command: df2.to_csv("better_matched.csv")
However, whenever I try to load this csv as a pandas dataframe later on, the shape of the dataframe becomes (660, 165234). This is the code that I'm using to load the csv:
import pandas as pd
df = pd.read_csv("/content/drive/My Drive/Authorship/better_matched.csv")
df.shape
I think I need to change some parameters with .to_csv() or .read_csv() functions. What can be done?
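The extra column (165234 instead of 165233) is consistent with the index being written to the file by `to_csv` and then read back as an ordinary column. A small sketch of that effect, using a made-up 3x2 frame in place of the real data:

```python
import io

import pandas as pd

# Small made-up stand-in for df2.
df2 = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

buffer = io.StringIO()
df2.to_csv(buffer)  # the index is written as an unnamed first column

# Reading without index_col turns the saved index into a regular column.
buffer.seek(0)
with_extra = pd.read_csv(buffer)

# index_col=0 tells read_csv to use that first column as the index again.
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)

print(with_extra.shape, restored.shape)
```

Writing with `index=False` avoids the extra column in the first place. This does not account for the drop from 1081 to 660 rows; that part would need a look at the file itself.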
I'm reading a CSV file using the pandas library, but when I print the head of the dataframe, it doesn't look like a dataframe but more like a list.
This is the code I typed:
import pandas as pd
df = pd.read_csv('Path/File')
print(df.head())
The output looks like this
How do I get it show the data frame properly?
I have a large data frame in a CSV file, sample1, and I need to generate a new CSV file that contains only the first 100 rows. I have written code for it, but I am getting a KeyError: the label [100] is not in the index.
This is what I have tried. Any help would be appreciated.
import pandas as pd
data_frame = pd.read_csv("C:/users/raju/sample1.csv")
data_frame1 = data_frame[:100]
data_frame.to_csv("C:/users/raju/sample.csv")
The correct syntax is with iloc:
data_frame.iloc[:100]
A more efficient way is to use the nrows argument, whose purpose is exactly to read only part of a file. This way you avoid wasting time and memory parsing rows you don't need:
import pandas as pd
data_frame = pd.read_csv("C:/users/raju/sample1.csv", nrows=100) # nrows counts data rows only; the header line is not included
data_frame.to_csv("C:/users/raju/sample.csv")
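The two approaches can be compared on a small in-memory file. This is a minimal sketch with made-up data (a header plus 250 rows), so no path on disk is needed:

```python
import io

import pandas as pd

# Made-up CSV text: one header line followed by 250 data rows.
csv_text = "id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(250))

# Approach 1: read everything, then take the first 100 rows by position.
full = pd.read_csv(io.StringIO(csv_text))
first_100_by_position = full.iloc[:100]

# Approach 2: stop parsing after 100 data rows (the header is not counted).
first_100_on_read = pd.read_csv(io.StringIO(csv_text), nrows=100)

print(first_100_by_position.shape, first_100_on_read.shape)
```

Both give the same 100 rows; the second one simply never parses the rest of the file.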