Pandas gives an unordered csv file - python

What can I do to make this CSV output (picture 1)
look like this one (picture 2) using pandas?
Here's the code I used to create the CSV file shown in picture 1:
import pandas as pd
import os
all_months_data = pd.DataFrame()
files = [file for file in os.listdir('Sales_Data/')]
for file in files:
    df = pd.read_csv('Sales_Data/' + file)
    all_months_data = pd.concat([all_months_data, df])
all_months_data.to_csv('all_data.csv')

I just figured out the problem: Excel itself was reading my CSV file as text.
I did this and it worked:
Open Excel
Go to 'Data' tab
Select 'From Text/CSV' and select the .CSV file you want to import.
Click 'Import' and you're done!
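Separately from the Excel import fix, the concatenation loop in the question can be written so that the combined file carries no extra index column. The following is only a sketch under assumptions: the helper name combine_csv_folder and the two demo files are made up for illustration, and the real Sales_Data folder is assumed to contain only CSV files with matching columns.

```python
import os
import tempfile

import pandas as pd


def combine_csv_folder(folder):
    """Read every .csv file in `folder` and stack the rows into one DataFrame."""
    names = sorted(name for name in os.listdir(folder) if name.endswith(".csv"))
    frames = [pd.read_csv(os.path.join(folder, name)) for name in names]
    # ignore_index=True renumbers rows 0..n-1 instead of repeating each file's index
    return pd.concat(frames, ignore_index=True)


# Self-contained demo with two made-up monthly files
with tempfile.TemporaryDirectory() as folder:
    pd.DataFrame({"Product": ["A", "B"], "Qty": [1, 2]}).to_csv(
        os.path.join(folder, "january.csv"), index=False)
    pd.DataFrame({"Product": ["C"], "Qty": [3]}).to_csv(
        os.path.join(folder, "february.csv"), index=False)
    all_months_data = combine_csv_folder(folder)
    out_path = os.path.join(folder, "all_data.csv")
    # index=False keeps the row numbers out of the output file
    all_months_data.to_csv(out_path, index=False)
    with open(out_path) as f:
        header_line = f.readline().strip()
```

With the real data this would be called as combine_csv_folder('Sales_Data').to_csv('all_data.csv', index=False). Collecting the frames in a list and concatenating once also avoids copying the growing DataFrame on every loop iteration.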

Related

How to read the files of Azure file share as csv that is pandas dataframe

I have a few CSV files in my Azure file share, which I am accessing as text with the following code:
from azure.storage.file import FileService
storageAccount='...'
accountKey='...'
file_service = FileService(account_name=storageAccount, account_key=accountKey)
share_name = '...'
directory_name = '...'
file_name = 'Name.csv'
file = file_service.get_file_to_text(share_name, directory_name, file_name)
print(file.content)
The contents of the CSV files are displayed, but I need to load them into a DataFrame, which I have not been able to do. Can anyone please tell me how to read file.content as a pandas DataFrame?
After reproducing this on my end, I was able to read a CSV file into a DataFrame from the contents of the file with the code below.
from io import StringIO
import pandas as pd
generator = file_service.list_directories_and_files('fileshare/')
for file_or_dir in generator:
    print(file_or_dir.name)
    file = file_service.get_file_to_text('fileshare', '', file_or_dir.name)
    # Wrap the text in StringIO so read_csv can treat it like a file
    df = pd.read_csv(StringIO(file.content), sep=',')
    print(df)

How to convert JSON file into EXCEL file in python

I have two questions:
How do I convert a JSON file into an Excel file in Python?
How do I combine all JSON files into one file?
I currently have 30 JSON files. I would like to extract them all into Excel files (in a readable format).
Finally, I need to combine all of the results into one Excel file, so I am curious how to do that too.
Converting JSON into Excel:
import pandas as pd
df = pd.read_json('./file1.json')
df.to_excel('./file1.xlsx')
Combining multiple Excel files (two files are combined in the example):
import pandas as pd
excl_list_path = ["./file1.xlsx", "./file2.xlsx"]
# Read every workbook into its own DataFrame
excl_list = [pd.read_excel(file) for file in excl_list_path]
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
excl_merged = pd.concat(excl_list, ignore_index=True)
excl_merged.to_excel('file1-file2-merged.xlsx', index=False)
Note: your specific JSON file structure is important for these examples...
And I have the perfect function just for that
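import pandas as pd
def save_to_excel(json_data, filename):
    # json_data is an already-parsed dict, so build the frame from it directly;
    # orient="index" puts each key on its own row
    df = pd.DataFrame.from_dict(json_data, orient="index")
    df.to_excel(filename)
json_data = {"a": "data A", "b": "data B"}
save_to_excel(json_data, "json_data.xlsx")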
You can try to use this library,
https://pypi.org/project/tablib/0.9.3/
It provides a lot of features that can help you on this.
How to combine all JSON files into one file?
Answer:
import json
import glob
import pprint as pp  # Pretty printer
combined = []
for json_file in glob.glob("*.json"):  # Assumes the JSON files and the .py file are in the same directory
    with open(json_file, "rb") as infile:
        combined.append(json.load(infile))
pp.pprint(combined)
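To get from the combined list to a single table, the parsed objects can be flattened with pandas. This is a minimal sketch, assuming each file holds either one JSON object or a list of objects; the helper name load_records and the demo file contents are made up, and the real files may need a different treatment depending on their structure.

```python
import glob
import json
import os
import tempfile

import pandas as pd


def load_records(pattern):
    """Parse every file matching `pattern` and return one flat list of objects."""
    records = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as infile:
            data = json.load(infile)
        # A file may hold a single object or a list of objects
        records.extend(data if isinstance(data, list) else [data])
    return records


# Self-contained demo with two made-up JSON files
with tempfile.TemporaryDirectory() as folder:
    with open(os.path.join(folder, "file1.json"), "w", encoding="utf-8") as f:
        json.dump({"id": 1, "info": {"city": "Oslo"}}, f)
    with open(os.path.join(folder, "file2.json"), "w", encoding="utf-8") as f:
        json.dump([{"id": 2}, {"id": 3}], f)
    records = load_records(os.path.join(folder, "*.json"))

# json_normalize turns nested keys into dotted column names such as "info.city"
table = pd.json_normalize(records)
```

Calling table.to_excel("combined.xlsx", index=False) would then write everything to one workbook; that step needs an Excel writer engine such as openpyxl to be installed.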

Pandas, Python - Problem with converting xlsx to csv

I have a problem converting an .xlsx file to .csv using the pandas library.
Here is the code:
import pandas as pd
# If pandas is not installed: pip install pandas
class Program:
    def __init__(self):
        # file = input("Insert file name (without extension): ")
        file = "Daty"
        self.namexlsx = "D:\\" + file + ".xlsx"
        self.namecsv = "D:\\" + file + ".csv"
        Program.export(self.namexlsx, self.namecsv)
    @staticmethod
    def export(namexlsx, namecsv):
        try:
            read_file = pd.read_excel(namexlsx, sheet_name='Sheet1', index_col=0)
            read_file.to_csv(namecsv, index=False, sep=',')
            print("Conversion to .csv file has been successful.")
        except FileNotFoundError:
            print("File not found, check file name again.")
            print("Conversion to .csv file has failed.")
Program()
After running the code, the console shows the error ValueError: File is not a recognized excel file.
The file I have in that directory is "Daty.xlsx". I tried a couple of things, such as checking the documentation and other examples around the internet, but most had similar code.
Edit & Update
What I intend to do afterwards is use the created CSV file for conversion to a .db file, so in the end the import pipeline will go .xlsx -> .csv -> .db. The idea for this program came up as a training exercise, but I can't get past the point described above.
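The last step of that pipeline (.csv -> .db) is not shown in the thread. A minimal sketch of it, assuming ".db" means an SQLite database and that the CSV has a header row; the function name csv_to_sqlite, the table name "data", and the demo rows are all hypothetical.

```python
import csv
import os
import sqlite3
import tempfile


def csv_to_sqlite(csv_path, db_path, table="data"):
    """Copy a CSV file with a header row into an SQLite table; return the row count."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    columns = ", ".join('"{}"'.format(name) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits if the block succeeds, rolls back on error
            conn.execute('CREATE TABLE IF NOT EXISTS "{}" ({})'.format(table, columns))
            conn.executemany(
                'INSERT INTO "{}" VALUES ({})'.format(table, placeholders), rows)
    finally:
        conn.close()
    return len(rows)


# Self-contained demo with a made-up two-row CSV file
with tempfile.TemporaryDirectory() as folder:
    csv_path = os.path.join(folder, "Daty.csv")
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        f.write("id,name\n1,Ann\n2,Bob\n")
    db_path = os.path.join(folder, "Daty.db")
    inserted = csv_to_sqlite(csv_path, db_path)
    conn = sqlite3.connect(db_path)
    stored = conn.execute('SELECT id, name FROM "data" ORDER BY id').fetchall()
    conn.close()
```

The columns are created without declared types, so every value is stored as the text read from the CSV. Column names containing double quotes would need extra escaping that this sketch leaves out.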
You can do it like this:
import pandas as pd
data_xls = pd.read_excel('excelfile.xlsx', 'Sheet1', index_col=None)
data_xls.to_csv('csvfile.csv', encoding='utf-8', index=False)
I checked the xlsx file itself, and for some reason it was corrupted: the columns in the original file had been merged into one column. After opening the file and correcting the cells, everything runs smoothly.
Thank you for your time, and I apologise for the inconvenience.
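An error like the one above generally means pandas could not identify the file as an Excel format. A quick sanity check before calling pd.read_excel, offered only as a sketch: an .xlsx file is a ZIP package that contains an entry named [Content_Types].xml, so a text file that was merely renamed to .xlsx fails the test. The helper name looks_like_xlsx and the demo files are made up, and passing this check does not prove the workbook is valid.

```python
import os
import tempfile
import zipfile


def looks_like_xlsx(path):
    """Return True if `path` is a ZIP archive with the Office package manifest."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return "[Content_Types].xml" in archive.namelist()


# Self-contained demo: a CSV renamed to .xlsx versus a ZIP with the manifest entry
with tempfile.TemporaryDirectory() as folder:
    renamed_csv = os.path.join(folder, "renamed.xlsx")
    with open(renamed_csv, "w") as f:
        f.write("a,b\n1,2\n")
    package = os.path.join(folder, "package.xlsx")
    with zipfile.ZipFile(package, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
    renamed_ok = looks_like_xlsx(renamed_csv)
    package_ok = looks_like_xlsx(package)
```

If the check fails, opening the file in a text editor usually shows what it really contains.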

Convert TDMS-File to XLSX

I would like to convert a folder full of TDMS files 1:1 to XLSX.
It is important that each Excel file has the same tabs as the TDMS file and the same file name.
I can read the tabs and the file names, but I don't know how to create new Excel files with the same names and content as the TDMS files. This is what I have tried so far:
from nptdms import TdmsFile
from nptdms import tdms
import os, glob
# Names of all TDMS files in a folder
file_names = glob.glob('*.tdms')
for file in glob.glob("*.tdms"):
    tdms_file = TdmsFile.read(file)
    tdms_groups = tdms_file.groups()
    print(tdms_groups)
I have now found out how to save each TDMS file as XLSX:
import os, xlsxwriter, glob
import numpy as np
import pandas as pd
from nptdms import TdmsFile
from nptdms import tdms
file_names = glob.glob('*.tdms')
# Read the files back into a dataframe
dataframe_list = []
for file in glob.glob("*.tdms"):
    tdms_file = TdmsFile.read(file)
    df = tdms_file['Line 1'].as_dataframe()
    dataframe_list.append(df)
    file = file.replace(".tdms", "")
    df.to_excel(str(file) + ".xlsx")
But the problem is that I have to know the group name ('Line 1' in this case).
I want to find out the path or group names so that I can save the XLSX file with all tabs and the same names as in the original TDMS file.
So can someone tell me how to read the individual tab names before opening the file and then create an XLSX file with the same number of tabs, the same tab names, and the same content?
Edit:
When I use the command tdms_file.groups(), I get the following output:
[<TdmsGroup with path /'Line 1'>, <TdmsGroup with path /'Current_Line 1'>]
However, I can't get just the tab names ("Line 1" and "Current Line 1"). After that, I want to create an XLSX file with the tab "Line 1" and the tab "Current Line 1" with the same content.
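No answer is included for this question, so the following is only a sketch of one possible approach. It assumes the installed npTDMS version exposes a name attribute on each TdmsGroup (alongside the as_dataframe method already used above) and that pandas has an Excel writer engine available; the helper names frames_by_group and tdms_to_xlsx are made up.

```python
from pathlib import Path

import pandas as pd


def frames_by_group(tdms_file):
    """Map each group name (e.g. 'Line 1') to that group's data as a DataFrame."""
    return {group.name: group.as_dataframe() for group in tdms_file.groups()}


def tdms_to_xlsx(tdms_path):
    """Write one .xlsx next to the .tdms file, with one sheet per TDMS group."""
    from nptdms import TdmsFile  # third-party; imported here so the helper above works without it

    tdms_path = Path(tdms_path)
    frames = frames_by_group(TdmsFile.read(tdms_path))
    xlsx_path = tdms_path.with_suffix(".xlsx")
    with pd.ExcelWriter(xlsx_path) as writer:
        for name, df in frames.items():
            # Excel limits sheet names to 31 characters
            df.to_excel(writer, sheet_name=name[:31])
    return xlsx_path
```

A whole folder could then be converted with a loop such as: for path in glob.glob("*.tdms"): tdms_to_xlsx(path). Group names containing characters that Excel does not allow in sheet names would need to be cleaned first.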

How to open csv file in pandas displaying all columns instead of entire data in only one column?

Firstly I've saved my file on my local disc with this code:
import os
import requests
import csv
myDailyUrls = ['https://myurl.com/something//something_01-01-2020.csv', 'https://myurl.com/something//something_01-02-2020.csv']
for x in myDailyUrls:
    urldailyLocal = os.path.basename(x)
    response = requests.get(x, verify=False)
    with open('/path/to/my/local/folder/' + urldailyLocal, 'w') as f:
        writer = csv.writer(f)
        for line in response.iter_lines():
            writer.writerow(line.decode('utf-8').split(','))
However, when I try to open a previously saved file in pandas, the DataFrame contains all of the data in a single column:
import csv
import pandas as pd
data = '/path/to/my/local/folder/oneOfMySavedFiles.csv'
lines = pd.read_csv(data, sep=',', header=3, quoting=csv.QUOTE_NONE)
What I realised is that when I copy the CSV file to my local folder manually, the pd.read_csv call above opens it as expected with 8 columns, but when I use one of the files saved with the csv-module code above, everything ends up in 1 column.
Could someone help with this?
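One possible explanation (an assumption, not confirmed in the thread) is that splitting each line on commas and passing it through csv.writer changes the quoting and line endings, so the saved file no longer matches the original. A sketch that avoids this by writing the downloaded bytes unchanged; the helper names save_raw and download_all and the demo payload are made up.

```python
import os
import tempfile


def save_raw(content, folder, url):
    """Write downloaded bytes unchanged to folder/<last part of url>."""
    dest = os.path.join(folder, os.path.basename(url))
    with open(dest, "wb") as f:  # binary mode: no newline translation, no re-quoting
        f.write(content)
    return dest


def download_all(urls, folder):
    """Fetch each URL and store the response body as-is."""
    import requests  # third-party; imported here so save_raw works without it

    for url in urls:
        response = requests.get(url)
        response.raise_for_status()
        save_raw(response.content, folder, url)


# Demo with made-up bytes standing in for response.content
with tempfile.TemporaryDirectory() as folder:
    payload = b'"a","b"\r\n"1","2"\r\n'
    saved = save_raw(payload, folder, "https://example.com/data/something_01-01-2020.csv")
    saved_name = os.path.basename(saved)
    with open(saved, "rb") as f:
        round_trip = f.read()
```

A file saved this way is byte-for-byte what the server sent, which is the same as the manually copied file that pd.read_csv already reads with 8 columns.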
