django csv header and row format problem - python

I am trying to create a CSV download, but the downloaded file comes out in a different format than I expect.
def csv_download(request):
    import csv
    import calendar
    from datetime import *
    from dateutil.relativedelta import relativedelta
    now=datetime.today()
    month = datetime.today().month
    d = calendar.mdays[month]
    # Create the HttpResponse object with the appropriate CSV header.
    response = HttpResponse(mimetype='text/csv')
    response['Content-Disposition'] = 'attachment; filename=somefilename.csv'
    m=Product.objects.filter(product_sellar = 'jhon')
    writer = csv.writer(response)
    writer.writerow(['S.No'])
    writer.writerow(['product_name'])
    writer.writerow(['product_buyer'])
    for i in xrange(1,d):
        writer.writerow(str(i) + "\t")
    for f in m:
        writer.writerow([f.product_name,f.porudct_buyer])
    return response
Output of the above code:
product_name
1
2
4
5
6
7
8
9
1|10
1|1
1|2
.
.
.
2|7
mgm | x_name
wge | y_name
I am looking for output like this:
s.no product_name product_buyer 1 2 3 4 5 6 7 8 9 10 .....27 total
1 mgm x_name 2 3 8 13
2 wge y_name 4 9 13
Can you please help me with the above CSV download?
If possible, can you also tell me how to sum each user's values into a total at the end of the row?
Example:
We have a selling table into which seller info is inserted every day.
The table data looks like this:
S.no  product_name  product_seller  sold  Date
1     paint         jhon            5     2011-03-01
2     paint         simth           6     2011-03-02
I have created a table that displays the data in the format below, and I am trying to create a CSV download of it:
s.no  prod_name  prod_sellar  1-03-2011  2-03-2011  3-03-2011  4-03-2011  total
1     paint      john         10         15         0          0          25
2     paint      smith        2          6          2          0          10

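Please read the csv module documentation, particularly the writer object API.
You'll notice that writer.writerow() takes a list whose elements are the fields of one delimited line, in order. Each call writes one complete row, which is why the code in the question puts every header on its own line, and passing a string such as str(i) + "\t" makes each character a separate field. To get the desired header, pass all the column names in a single list:
writer = csv.writer(response)
# d is the number of days in the month, so d + 1 is needed to include the last day
writer.writerow(['S.No', 'product_name', 'product_buyer'] + list(range(1, d + 1)) + ['total'])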
This will give you your desired header output.
You might want to explore the csv.DictWriter class if you want to only populate some parts of the row. It's much cleaner. This is how you would do it:
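writer = csv.DictWriter(response,
                        ['S.No', 'product_name', 'product_buyer'] + list(range(1, d + 1)) + ['total'])
writer.writeheader()  # DictWriter does not write the header row on its own
Your write calls would then look like this; keys that are missing from the dict are left empty:
for f in m:
    writer.writerow({'product_name': f.product_name, 'product_buyer': f.product_buyer})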
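For the second part of the question (one column per day plus a total per seller), a possible Python 3 sketch follows. It assumes the sales have already been fetched as (product_name, seller, day_of_month, sold) tuples; the build_rows helper and the sample data are hypothetical and not part of the original code.

```python
import csv
import io
from collections import defaultdict


def build_rows(sales, days_in_month):
    """Pivot (product, seller, day, sold) tuples into one row per product/seller.

    Hypothetical helper: `sales` stands in for whatever the queryset yields.
    """
    # (product, seller) -> {day: units sold}; dicts keep first-seen order
    per_key = defaultdict(lambda: defaultdict(int))
    for product, seller, day, sold in sales:
        per_key[(product, seller)][day] += sold

    header = ["S.No", "product_name", "product_seller"]
    header += list(range(1, days_in_month + 1)) + ["total"]
    rows = [header]
    for number, ((product, seller), by_day) in enumerate(per_key.items(), start=1):
        # Days without a sale are reported as 0
        daily = [by_day.get(day, 0) for day in range(1, days_in_month + 1)]
        rows.append([number, product, seller] + daily + [sum(daily)])
    return rows


# Illustrative data only
sales = [
    ("paint", "john", 1, 10), ("paint", "john", 2, 15),
    ("paint", "smith", 1, 2), ("paint", "smith", 2, 6), ("paint", "smith", 3, 2),
]

buffer = io.StringIO()  # in a Django view, the HttpResponse would take this place
csv.writer(buffer).writerows(build_rows(sales, days_in_month=4))
csv_text = buffer.getvalue()
```

Because csv.writer only needs an object with a write() method, the same writerows call should work with the response object used in the question.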

Related

How to extract specific data from a csv file, like "Name", "Address"?

Name
4 A-------
5 ---
6 Father Name
7 ------
8 Gender
9 Country of
10 M
11 Oman
12 Identity Number -n?
13 Date of Birth
14 ------------9
15 28.10.1995
16 ----
17 Date of Issue
18 Date of Expiry
To extract a specific column from a csv file, read the file with pandas and then use the iloc indexer to select the column by position.
import pandas as pd
dataset = pd.read_csv("path_of_csv")
# Once the csv file is loaded, slice along the columns
# to get the desired column (example: Name, the first column)
Name = dataset.iloc[:, 0]
Alternatively, select the column by its header label (confirmed to work with pandas 1.3.5):
dataset = pd.read_csv("path_of_csv")
Name = dataset['Name']
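A minimal sketch of both selection styles is below. It assumes the file has a header row with Name and Address columns, which the sample in the question may not have; the CSV text here is made up for illustration.

```python
import io

import pandas as pd

# Illustrative CSV text; a real file path would be passed to read_csv instead
csv_text = "Name,Address\nAlice,1 Main St\nBob,2 High St\n"
dataset = pd.read_csv(io.StringIO(csv_text))

by_position = dataset.iloc[:, 0]         # first column, whatever it is called
by_label = dataset["Name"]               # column selected by its header name
subset = dataset[["Name", "Address"]]    # several columns at once
```

Selecting by label is usually more robust than selecting by position, since it keeps working if the column order in the file changes.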

Fastest Possible python code that can replace this function

I am trying to optimise a function so it can work on a much larger dataframe.
I have a dataframe (called test_data) that looks like this
To create a toy example I have filtered this dataframe like so:
value_list = ["DDD","MMM","AAPL","MSFT","AMZN","TSLA"]
test_data2 = test_data[test_data['Asset'].isin(value_list)]
I have written a basic function to generate the required output:
def generate_stock_price_dataframe():
    price_dataframe = pd.DataFrame()
    for stock in test_data2['Asset'].unique():
        data = pd.DataFrame(index = test_data2.index.unique())
        data[stock] = pd.DataFrame(test_data2.query("Asset==@stock")['Price'])
        price_dataframe = pd.concat([price_dataframe,data],axis=1)
    stock_price_data = price_dataframe
    return stock_price_data
and this gives the required output.
This works nicely for the toy example with only a few assets.
However, when I run this on the full dataframe with thousands of assets, it just doesn't work.
Where is the best place to start to speed this up?
Thank you
EDIT: Here is some code to recreate the question.
assets = ['AAPL','AAPL','AAPL','AAPL','AAPL','MSFT','MSFT','MSFT','MSFT','MSFT','AMZN','AMZN','AMZN','AMZN','AMZN',]
dates = ['05/01/2021','05/02/2021','05/03/2021','05/04/2021','05/05/2021','05/01/2021','05/02/2021','05/03/2021','05/04/2021','05/05/2021','05/01/2021','05/02/2021','05/03/2021','05/04/2021','05/05/2021']
prices = range(1, 16)
test_data2 = pd.DataFrame(index=dates)
test_data2['Asset'] = assets
test_data2['Price'] = prices
df = generate_stock_price_dataframe()
df.tail()
Use DataFrame.pivot to reshape the data in one step instead of looping over the assets:
df = test_data.pivot(columns='Asset')
Output
Price
Asset AAPL AMZN MSFT
05/01/2021 1 11 6
05/02/2021 2 12 7
05/03/2021 3 13 8
05/04/2021 4 14 9
05/05/2021 5 15 10
To drop the Price level from the MultiIndex columns and remove the columns axis name Asset:
df = test_data.pivot(columns='Asset').droplevel(0,1).rename_axis(None, axis='columns')
df
Output
AAPL AMZN MSFT
05/01/2021 1 11 6
05/02/2021 2 12 7
05/03/2021 3 13 8
05/04/2021 4 14 9
05/05/2021 5 15 10
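As a self-contained check against the toy data from the question, the reshaping can be sketched as follows. Passing values="Price" is a variation on the answer above that avoids the extra column level; the pivot_table line is an assumption about how one might handle repeated date/asset pairs, which plain pivot rejects with a ValueError.

```python
import pandas as pd

assets = ["AAPL"] * 5 + ["MSFT"] * 5 + ["AMZN"] * 5
dates = ["05/01/2021", "05/02/2021", "05/03/2021", "05/04/2021", "05/05/2021"] * 3
test_data2 = pd.DataFrame({"Asset": assets, "Price": list(range(1, 16))}, index=dates)

# One column per asset, one row per date; values="Price" gives flat column labels
wide = test_data2.pivot(columns="Asset", values="Price").rename_axis(None, axis="columns")

# If a (date, asset) pair can occur more than once, pivot_table aggregates instead;
# keeping the last price is an arbitrary choice made for this sketch
long_form = test_data2.rename_axis("Date").reset_index()
wide_agg = long_form.pivot_table(index="Date", columns="Asset", values="Price", aggfunc="last")
```

Both calls do the reshaping in a single vectorized step, so they avoid the repeated pd.concat inside the loop, which is the likely bottleneck with thousands of assets.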

Append a variable to each column in a dataframe object

Trying to solve a trading problem, but rephrasing it in a different way.
I have an array of countries as
countries = {'country_name': ['France','Germany','Italy','Japan']}
For each country, I have a CSV stored on my laptop. Each CSV has 3 columns [Date, Birth, Death].
I loop over the array, read each CSV, and create a dataframe object:
countries = {'country_name': ['France','Germany','Italy','Japan']}
countries = pd.DataFrame(countries)
for country in countries['country_name']:
    country_file_name = country + '.csv'
    vars()[country] = pd.read_csv(country_file_name)
    ## Here I want to append country to each column except index
When I do France.head()
I get the output as France
index Birth Deaths
2020-01-01 9 10
2002-01-02 5 12
...
2002-12-10 14 10
But I want the output as France
index France_Birth France_Deaths
2020-01-01 9 10
2002-01-02 5 12
....
2002-12-10 14 10
Note - I do not want to do France.columns= ['France_Birth','France_Deaths'] because it will take me days to do it for all the csv.
I am using a Jupyter notebook here.
https://colab.research.google.com/drive/1aOg3eOhsigbewAhRwQE1QsxGDKzEPyW5?usp=sharing
I am not sure whether there is any way to do this or whether I have to change my approach.
This can be achieved using the rename method of pandas.DataFrame:
countries = {'country_name': ['France','Germany','Italy','Japan']}
countries = pd.DataFrame(countries)
for country in countries['country_name']:
    country_file_name = country + '.csv'
    vars()[country] = pd.read_csv(country_file_name).rename(columns=lambda s: country + "_" + s)
Note that this prefixes every column that read_csv returns, so the date column has to be loaded as the index if it should stay unprefixed. See the pandas documentation for DataFrame.rename.
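A variant that leaves the index alone is sketched below. It assumes the first CSV column holds the dates, and it stores the frames in a dict rather than writing to vars(); the in-memory CSV text stands in for the real France.csv, Germany.csv and so on.

```python
import io

import pandas as pd

# Stand-ins for France.csv, Germany.csv, ...; real code would pass file names
fake_files = {
    "France": "Date,Birth,Deaths\n2020-01-01,9,10\n2020-01-02,5,12\n",
    "Germany": "Date,Birth,Deaths\n2020-01-01,7,3\n2020-01-02,8,4\n",
}

frames = {}
for country, text in fake_files.items():
    # index_col=0 makes Date the index, so it is left out of the renaming
    frame = pd.read_csv(io.StringIO(text), index_col=0)
    frames[country] = frame.add_prefix(country + "_")
```

A dict keyed by country name (frames["France"]) is generally easier to loop over later than dynamically created variables, though that is a design preference rather than a requirement.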

How to sum a column in python without calling it a dataframe

I have data that outputs into a csv file as:
url date id hits
a 2017-01-01 123 2
a 2017-01-01 123 2
b 2017-01-01 45 25
c 2017-01-01 123 5
d 2017-01-03 678 1
d 2017-01-03 678 7
and so on, where hits is the number of times the id value appears on a given day per url (i.e. the id number 123 appears 2 times on 2017-01-01 for url "a").
I need to create another column after hits, called "total_hits", that captures the total number of hits per day for a given url, date and id value. So the output would look like this:
url date id hits total_hits
a 2017-01-01 123 2 4
a 2017-01-01 123 2 4
b 2017-01-01 45 25 25
c 2017-01-01 123 5 5
d 2017-01-03 678 1 8
d 2017-01-03 678 7 8
If there are solutions to this that do not use pandas or numpy, that would be amazing.
Please help! Thanks in advance.
This is simple with a standard Python installation:
- read and parse the file line by line with split
- create a collections.defaultdict(int) to sum the hits for each url/date/id triplet
- add the total as an extra column
- write the result back (I chose csv)
Like this:
import collections,csv
d = collections.defaultdict(int)
rows = []
with open("input.csv") as f:
    title = next(f).split() # read the title row separately
    for line in f:
        toks = line.split()
        d[toks[0],toks[1],toks[2]] += int(toks[3])
        rows.append(toks)
# complete data
for row in rows:
    row.append(d[row[0],row[1],row[2]])
title.append("total_hits")
with open("out.csv","w",newline="") as f:
    cw = csv.writer(f)
    cw.writerow(title)
    cw.writerows(rows)
here's the output file:
url,date,id,hits,total_hits
a,2017-01-01,123,2,4
a,2017-01-01,123,2,4
b,2017-01-01,45,25,25
c,2017-01-01,123,5,5
d,2017-01-03,678,1,8
d,2017-01-03,678,7,8

Create empty csv file with pandas

I am iterating through a number of csv files and want to append the mean temperatures to a blank csv file. How do you create an empty csv file with pandas?
for EachMonth in MonthsInAnalysis:
    TheCurrentMonth = pd.read_csv('MonthlyDataSplit/Day/Day%s.csv' % EachMonth)
    MeanDailyTemperaturesForCurrentMonth = TheCurrentMonth.groupby('Day')['AirTemperature'].mean().reset_index(name='MeanDailyAirTemperature')
    with open('my_csv.csv', 'a') as f:
        df.to_csv(f, header=False)
So in the above code, how do I create my_csv.csv before the for loop?
Just a note: I know you can create a data frame and then save it to csv, but I am interested in whether this step can be skipped.
In terms of context, I have the following csv files:
Each of them has the following structure:
The Day column reads up to 30 days for each file.
I would like to output a csv file that looks like this:
But obviously it should include all the days for all the months.
My issue is that I don't know in advance which months are included in each analysis. Hence I want a for loop that uses a list holding that information to access the relevant csv files, calculate the mean temperature, and save it all into one csv.
Input as text:
Unnamed: 0 AirTemperature AirHumidity SoilTemperature SoilMoisture LightIntensity WindSpeed Year Month Day Hour Minute Second TimeStamp MonthCategorical TimeOfDay
6 6 18 84 17 41 40 4 2016 1 1 6 1 1 10106 January Day
7 7 20 88 22 92 31 0 2016 1 1 7 1 1 10107 January Day
8 8 23 1 22 59 3 0 2016 1 1 8 1 1 10108 January Day
9 9 23 3 22 72 41 4 2016 1 1 9 1 1 10109 January Day
10 10 24 63 23 83 85 0 2016 1 1 10 1 1 10110 January Day
11 11 29 73 27 50 1 4 2016 1 1 11 1 1 10111 January Day
Just open the file in write mode to create it.
with open('my_csv.csv', 'w'):
    pass
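Note that opening in 'w' mode also truncates a file that already exists. If that is not wanted, one alternative from the standard library is pathlib.Path.touch(), sketched here against a temporary directory so the example does not touch real files.

```python
import tempfile
from pathlib import Path

# A temporary directory keeps the example away from real data
target = Path(tempfile.mkdtemp()) / "my_csv.csv"

target.touch()  # creates the file if it is missing, leaves existing content alone
size = target.stat().st_size
```

In the question's setting the path would simply be Path('my_csv.csv').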
That said, I do not think you should open and close the file so many times. It is better to open the file once and write to it several times.
with open('my_csv.csv', 'w', newline='') as f:
    for EachMonth in MonthsInAnalysis:
        TheCurrentMonth = pd.read_csv('MonthlyDataSplit/Day/Day%s.csv' % EachMonth)
        MeanDailyTemperaturesForCurrentMonth = TheCurrentMonth.groupby('Day')['AirTemperature'].mean().reset_index(name='MeanDailyAirTemperature')
        # write the frame computed above (the question's snippet referred to an undefined df)
        MeanDailyTemperaturesForCurrentMonth.to_csv(f, header=False)
Creating a blank csv file is as simple as this:
import pandas as pd
pd.DataFrame({}).to_csv("filename.csv")
I would do it this way: first read all your CSV files (only the columns you really need) into one DF, then call groupby(['Year','Month','Day']).mean() and save the resulting DF to a CSV file:
import glob
import pandas as pd
fmask = 'MonthlyDataSplit/Day/Day*.csv'
df = pd.concat((pd.read_csv(f, sep=',', usecols=['Year','Month','Day','AirTemperature']) for f in glob.glob(fmask)))
df.groupby(['Year','Month','Day']).mean().to_csv('my_csv.csv')
And if you want to ignore the year:
import glob
import pandas as pd
fmask = 'MonthlyDataSplit/Day/Day*.csv'
df = pd.concat((pd.read_csv(f, sep=',', usecols=['Month','Day','AirTemperature']) for f in glob.glob(fmask)))
df.groupby(['Month','Day']).mean().to_csv('my_csv.csv')
Some details:
(pd.read_csv(f, sep=',', usecols=['Month','Day','AirTemperature']) for f in glob.glob('*.csv'))
is a generator expression that yields one data frame per CSV file
pd.concat(...)
concatenates them into a single DF
df.groupby(['Year','Month','Day']).mean()
produces the wanted report as a data frame, which can then be saved to a new CSV file with:
.to_csv('my_csv.csv')
The problem is a little unclear, but assuming you have to iterate month by month and apply the groupby as stated, just use:
# Before the loop
dflist = []
Then in each iteration do something like:
dflist.append(MeanDailyTemperaturesForCurrentMonth)
Then at the end:
final_df = pd.concat(dflist, axis=1)
and this will join everything into one dataframe (axis=1 places the monthly frames side by side; leave it out to stack them as rows).
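If the goal is one long table with all days of all months, a row-wise variant of the same idea might look like the sketch below. The in-memory CSV text and the added Month column are assumptions made for illustration; the real loop would read the files under MonthlyDataSplit/Day/.

```python
import io

import pandas as pd

# Illustrative stand-ins for the monthly CSV files
monthly_csv = {
    1: "Day,AirTemperature\n1,18\n1,20\n2,23\n",
    2: "Day,AirTemperature\n1,10\n2,12\n2,14\n",
}

dflist = []
for month, text in monthly_csv.items():
    current = pd.read_csv(io.StringIO(text))
    means = (
        current.groupby("Day")["AirTemperature"]
        .mean()
        .reset_index(name="MeanDailyAirTemperature")
    )
    means.insert(0, "Month", month)  # remember which month each row came from
    dflist.append(means)

# The default axis=0 stacks the monthly frames as rows
final_df = pd.concat(dflist, ignore_index=True)
csv_text = final_df.to_csv(index=False)  # pass a path instead to write a file
```

Writing once at the end also removes the need to create an empty csv file before the loop.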
Look at:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html
http://pandas.pydata.org/pandas-docs/stable/merging.html
You can also do this to create an empty CSV that has column headers but no index column.
import pandas as pd
pd.DataFrame(columns=["Col1","Col2","Col3"]).to_csv("filename.csv", index=False)
