I'm using jupyter nootbook and I want to display my data frame.
I have this code:
import pandas as pd
import openpyxl
f = pd.ExcelFile('urbanpop.xlsx')
f.sheet_names
df1 = f.parse("1960-1966")
print(df1)
When I print the data frame (df1), I get something like that:
And what I want to get, is something like that:
How can I do it?
Thank you.
I found a solution.
Just need to write "df1" instead of "print(df1)"
Related
please see attached photo
here's the image
I only need to import a specific column with conditions(such as specific data found in that column). And also, I only need to remove unnecessary columns. dropping them takes too much code. What specific code or syntax is applicable?
How to get a column from pandas dataframe is answered in Read specific columns from a csv file with csv module?
To quote:
Pandas is spectacular for dealing with csv files, and the following
code would be all you need to read a csv and save an entire column
into a variable:
import pandas as pd
df = pd.read_csv(csv_file)
saved_column = df.column_name #you can also use df['column_name']
So in your case, you just save the the filtered data frame in a new variable.
This means you do newdf = data.loc[...... and then use the code snippet from above to extract the column you desire, for example newdf.continent
My task is to take an output from a machine, and convert that data to json. I am using python, but the issue is the structure of the output.
From my research online, csv usually has the first row with the keys and the values in the same order underneath. Example: https://e.nodegoat.net/CMS/upload/guide-import_person_csv_notepad.png
However, the output from my machine doesn't look like this.
Mine looks like:
Date:,10/10/2015
Name:,"Company name"
Location:,"Company location"
Serial num:,"Serial number"
So the machine i'm working with is outputting each result on a new .dat file instead of appending onto a single csv with the row of keys and whatnot. Technically, yes the data is separated with csv, but not sure how to work with this.
How should I go about turning this kind of data to json? Should I look into restructuring the data to the default csv? Or is there a way I can work with this and not do any cleanup to convert this? In either case, any direction is appreciated.
You can try transpose using pandas
import pandas as pd
from io import StringIO
data = '''\
Date:,10/10/2015
Name:,"Company name"
Location:,"Company location"
Serial num:,"Serial number"
'''
f = StringIO(data)
df = pd.read_csv(f)
t = df.set_index(df.columns[0]).T
print(t['Location:'][0])
print(t['Serial num:'][0])
When I'm reading a csv file using the pandas library, but when I print the head of the dataframe it doesn't show it as a dataframe but more like a list.
This is the code I tipped:
import pandas as pd
df = pd.read_csv('Path/File')
print(df.head())
The output looks like this
How do I get it show the data frame properly?
I need to extract the domain for example: (http: //www.example.com/example-page, http ://test.com/test-page) from a list of websites in an excel sheet and modify that domain to give its url (example.com, test.com). I have got the code part figured put but i still need to get these commands to work on excel sheet cells in a column automatically.
here's_the_code
I think you should read in the data as a pandas DataFrame (pd.read_excel), make a function from your code then apply to the dframe (df.apply). Then it is easy to save to excel with pd.to_excel().
ofc you will need pandas to be installed.
Something like:
import pandas as pd
dframe = pd.read_excel(io='' , sheet_name='')
dframe['domains'] = dframe['urls col name'].apply(your function)
dframe.to_excel('your path')
Best
I want to put some data available in an excel file into a dataframe in Python.
The code I use is as below (two examples I use to read an excel file):
d=pd.ExcelFile(fileName).parse('CT_lot4_LDO_3Tbin1')
e=pandas.read_excel(fileName, sheetname='CT_lot4_LDO_3Tbin1',convert_float=True)
The problem is that the dataframe I get has the values with only two numbers after comma. In other words, excel values are like 0.123456 and I get into the dataframe values like 0.12.
A round up or something like that seems to be done, but I cannot find how to change it.
Can anyone help me?
thanks for the help !
You can try this. I used test.xlsx which has two sheets, and 'CT_lot4_LDO_3Tbin1' is the second sheet. I also set the first value as Text format in excel.
import pandas as pd
fileName = 'test.xlsx'
df = pd.read_excel(fileName,sheetname='CT_lot4_LDO_3Tbin1')
Result:
In [9]: df
Out[9]:
Test
0 0.123456
1 0.123456
2 0.132320
Without seeing the real raw data file, I think this is the best answer I can think of.
Well, when I try:
df = pd.read_csv(r'my file name')
I have something like that in df
http://imgur.com/a/Q2upp
And I cannot put .fileformat in the sentence
You might be interested in removing column datatype inference that pandas performs automatically. This is done by manually specifying the datatype for the column. Here is what you might be looking for.
Python pandas: how to specify data types when reading an Excel file?
Using pandas 0.20.1 something like this should work:
df = pd.read_csv('CT_lot4_LDO_3Tbin1.fileformat')
for exemple, in excel:
df = pd.read_csv('CT_lot4_LDO_3Tbin1.xlsx')
Read this documentation:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html