Retrieve a column in pandas

print(pd.read_excel(File,Sheet_Name,0,None,0,None,["Column_Name"],1))
I am new to pandas and want to retrieve one column of an Excel sheet as an array. I tried the code above, but it did not work.

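The way to do it is to read the sheet into a data frame and then select the column by its header:
import pandas as pd
# The keyword is sheet_name in current pandas (older releases spelled it sheetname)
df = pd.read_excel(File, sheet_name=Sheet_Name)
# The label must match the column header in the sheet exactly, including case
print(df['Column_Name'])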
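Since the question asks for the column as an array, here is a minimal sketch of that last step. The small data frame below is a made-up stand-in for what pd.read_excel(File, sheet_name=Sheet_Name) would return; the column names and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for: df = pd.read_excel(File, sheet_name=Sheet_Name)
df = pd.DataFrame({"Column_Name": [10, 20, 30], "Other": ["a", "b", "c"]})

column = df["Column_Name"]     # a pandas Series
values = column.to_numpy()     # the same data as a NumPy array
as_list = column.tolist()      # or as a plain Python list

print(values)
print(as_list)
```

Selecting with df["Column_Name"] gives a Series, which is often enough on its own; to_numpy() or tolist() is only needed when other code expects an array or a list.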

Related

Printing Data Frame from xlsx file in specific way

I'm using Jupyter Notebook and I want to display my data frame.
I have this code:
import pandas as pd
import openpyxl
f = pd.ExcelFile('urbanpop.xlsx')
f.sheet_names
df1 = f.parse("1960-1966")
print(df1)
When I print the data frame (df1), I get something like this:
And what I want to get is something like this:
How can I do it?
Thank you.

I found a solution.
I just need to end the cell with "df1" instead of "print(df1)", so that the notebook renders the data frame as a formatted table.

Converting HTML table to CSV file using python

I am very new to pandas. I wanted to convert this HTML table to a CSV file with pandas, but my CSV file contains a weird sign and not all of the table was converted.
Here's my code. I read about using BeautifulSoup, but I'm not sure how to use it.
import as pandas
df = pd.read_html('https://aim-sg.caas.gov.sg/aip/2020-10-13/final/2020-09-10-Non-AIR'
'AC/html/eAIP/ENR-3.1-en-GB.html?s=B2EE1C5E1D2A684224A194E69D18338A560504FC#ENR-3.1')
df[0].to_csv('ENR3.0.csv')
Thank you!
Edit: I have changed my import to import pandas as dp, but I still did not manage to convert the whole HTML table to a CSV file.
Greatly appreciate all your help!

You can do this with pandas alone. The problem is the import statement. Here is the corrected code:
import pandas as pd
df = pd.read_html('https://aim-sg.caas.gov.sg/aip/2020-10-13/final/2020-09-10-Non-AIR'
'AC/html/eAIP/ENR-3.1-en-GB.html?s=B2EE1C5E1D2A684224A194E69D18338A560504FC#ENR-3.1')
df[0].to_csv('ENR3.0.csv', index = False)
read_html returns a list of data frames, one per table found on the page. If you want to save all of them, replace the last line with this:
for x in range(len(df)):
    df[x].to_csv(f"CSV_File_{x+1}.csv", index=False)
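The loop above can also be written with enumerate. The sketch below uses two small made-up data frames in place of the list that pd.read_html returns, and writes to a temporary directory, so the table contents and the output location are assumptions rather than anything taken from the real page.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for: tables = pd.read_html(url)
tables = [
    pd.DataFrame({"route": ["A1", "B2"], "level": [100, 200]}),
    pd.DataFrame({"point": ["X", "Y", "Z"]}),
]

out_dir = Path(tempfile.mkdtemp())
written = []
for number, table in enumerate(tables, start=1):
    path = out_dir / f"CSV_File_{number}.csv"
    table.to_csv(path, index=False)   # index=False leaves out the row numbers
    written.append(path)

print([p.name for p in written])
```

Checking len(tables) first is a quick way to see how many tables pandas actually found on the page.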

There is an issue in the import statement.
It should be import pandas as pd and not import as pandas, because you are using the alias pd in the code below.
Read up on Beautiful Soup and use the lxml parser to parse the required data (it is very fast).
This link might help you out:
BeautifulSoup different parsers
If you need any other help, leave a comment on this post and I will try to sort out your issue :)
Here is your code with the correction:
import pandas as pd
df = pd.read_html('https://aim-sg.caas.gov.sg/aip/2020-10-13/final/2020-09-10-Non-AIR'
'AC/html/eAIP/ENR-3.1-en-GB.html?s=B2EE1C5E1D2A684224A194E69D18338A560504FC#ENR-3.1')
df[0].to_csv('ENR3.0.csv')

pandas incorrectly parsing csv

I've created a CSV file from a dataframe with a shape of (1081, 165233). I did this using this command: df2.to_csv("better_matched.csv")
However, whenever I try to load this csv as a pandas dataframe later on, the shape of the dataframe becomes (660, 165234). This is the code that I'm using to load the csv:
import pandas as pd
df = pd.read_csv("/content/drive/My Drive/Authorship/better_matched.csv")
df.shape
I think I need to change some parameters with .to_csv() or .read_csv() functions. What can be done?
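One part of the mismatch can be reproduced in isolation: to_csv writes the row index as an extra first column by default, and read_csv then loads it as ordinary data, which would account for 165234 columns instead of 165233. The sketch below shows this on a tiny made-up frame. It does not explain the change in row count, whose cause cannot be determined from the question alone.

```python
import io

import pandas as pd

# Tiny made-up frame standing in for df2
df2 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

buffer = io.StringIO()
df2.to_csv(buffer)                 # the index is written as an unnamed first column

buffer.seek(0)
naive = pd.read_csv(buffer)        # the index comes back as a data column

buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)   # treat the first column as the index

print(df2.shape, naive.shape, restored.shape)
```

Writing with df2.to_csv("better_matched.csv", index=False) is the other way to keep the column counts equal, if the index carries no information.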

Pycharm does not show dataframe

When I read a CSV file using the pandas library and print the head of the dataframe, it is not shown as a dataframe but more like a list.
This is the code I typed:
import pandas as pd
df = pd.read_csv('Path/File')
print(df.head())
The output looks like this
How do I get it to show the data frame properly?

way to generate a specified number dataframe of new csv file from existing csv file in python using pandas

I have a large data frame in a CSV file, sample1, and from it I have to generate a new CSV file that contains only 100 rows. I wrote code for it, but I am getting a KeyError: the label [100] is not in the index.
This is what I tried. Any help would be appreciated.
import pandas as pd
data_frame = pd.read_csv("C:/users/raju/sample1.csv")
data_frame1 = data_frame[:100]
data_frame.to_csv("C:/users/raju/sample.csv")

The correct syntax is with iloc:
data_frame.iloc[:100]
A more efficient way to do it is to use the nrows argument of read_csv, whose purpose is exactly to read only part of a file. This way you avoid wasting time and memory parsing rows you do not need:
import pandas as pd
data_frame = pd.read_csv("C:/users/raju/sample1.csv", nrows=100)  # nrows counts data rows only; the header line is not included
data_frame.to_csv("C:/users/raju/sample.csv")
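A small sketch of nrows in action, using an in-memory CSV built for the example rather than the file from the question. It reads 100 data rows and compares the result with slicing the fully parsed file using iloc. Note that nrows counts data rows only, so the header line does not need to be added to it.

```python
import io

import pandas as pd

# Made-up CSV text: one header line followed by 250 data rows
csv_text = "id,value\n" + "\n".join(f"{i},{i * i}" for i in range(250))

# Parse only the first 100 data rows
first_100 = pd.read_csv(io.StringIO(csv_text), nrows=100)

# Parse everything, then keep the first 100 rows by position
same_rows = pd.read_csv(io.StringIO(csv_text)).iloc[:100]

print(len(first_100), first_100.equals(same_rows))
```

Both approaches give the same rows; nrows simply stops parsing early, which matters more as the file grows.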
