Python CSV reading multiple Tabs - python

Is there a way I can use the csv reader in python to look at multiple tabs in the workbook?
I am using the following to open the file but how could I target python between tab1 and tab2 in the workbook?
import csv
working_file = open('X:/test/test/test_file.csv', 'r')
working_file_CSV = working_file.read().splitlines()
working_file.close()  # close the file handle before the name is reused
working_file = csv.reader(working_file_CSV)
I want to read tab1 and then append it to a list in python and then read tab2 and append it to a list as well.

You're thinking of a spreadsheet (Excel, LibreOffice, etc.). When you export a spreadsheet to CSV, only the current worksheet is exported; spreadsheet formats are much more complex than a plain CSV file.
So there is no way to switch worksheets: a CSV file holds a single table. Open your CSV in a plain-text editor to see the contents for yourself.
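Since each tab has to be exported as its own CSV file, reading "tab1" and "tab2" becomes reading two files into two lists. A minimal sketch, assuming the worksheets were exported as tab1.csv and tab2.csv (hypothetical names, not from the question); it writes small sample files first so that it runs anywhere:

```python
import csv
import os
import tempfile


def read_csv_rows(path):
    """Return every row of a CSV file as a list of lists of strings."""
    with open(path, "r", newline="", encoding="utf-8") as handle:
        return list(csv.reader(handle))


# Stand-ins for the two exported worksheets (hypothetical file names and data).
demo_dir = tempfile.mkdtemp()
samples = {
    "tab1.csv": [["id", "name"], ["1", "alpha"]],
    "tab2.csv": [["id", "name"], ["2", "beta"]],
}
for file_name, rows in samples.items():
    target = os.path.join(demo_dir, file_name)
    with open(target, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)

# One list per exported worksheet.
tab1_rows = read_csv_rows(os.path.join(demo_dir, "tab1.csv"))
tab2_rows = read_csv_rows(os.path.join(demo_dir, "tab2.csv"))
```

With real data, only the two read_csv_rows calls are needed, pointed at the exported files.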

Related

Export redshift table data to csv file tabs using lambda python

I have a table metric_data that has data in the below format:
I want to export this data into csv file in S3 with separate tabs for components. So I will have 1 file with 3 tabs - COMP-01, COMP-02, COMP-03.
The UNLOAD command can export all the data from the table to one CSV file, but how can I export the data as separate tabs in the CSV file? Below is the UNLOAD command I am using:
unload ('select * from mydb.metric_data')
to 's3://mybucket/demo/folder/file.xlsx'
iam_role 'arn:aws:iam::0123456789012:role/MyRedshiftRole';
This command generates one csv file with all the data from the table. How can I export the data as separate sheets in a single CSV file?
UPDATE: As CSV doesn't support multiple sheets, I am trying to implement the same with Excel. So I updated the UNLOAD command to generate an Excel file, and it produces one file with all the table data.
You can't. The CSV file format doesn't support tabs / sheets. You will need to convert the CSV file to a different format (like .xls for example) that does support multiple sheets in one file.
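Also, the UNLOAD command you posted will produce multiple files, not just one. You can add the PARALLEL OFF option to write a single file, but this only works while the output stays under the per-file size limit (6.2 GB according to the Redshift documentation); larger unloads are still split into several files.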
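Because one CSV file cannot hold several sheets, one option in the Lambda is to split the unloaded rows into one CSV document per component. A minimal standard-library sketch; the column names (component, metric, value) and the sample rows are assumptions, since the real layout of metric_data is not shown in the question:

```python
import csv
import io
from collections import defaultdict


def split_by_column(csv_text, column):
    """Group the rows of one CSV document by the value in `column`.

    Returns a dict mapping each distinct value to CSV text that holds
    the header plus the rows for that value.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    groups = defaultdict(list)
    for row in reader:
        groups[row[column]].append(row)

    outputs = {}
    for key, rows in groups.items():
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=reader.fieldnames, lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
        outputs[key] = buffer.getvalue()
    return outputs


# Hypothetical sample standing in for the unloaded table data.
unloaded = (
    "component,metric,value\n"
    "COMP-01,cpu,10\n"
    "COMP-02,cpu,20\n"
    "COMP-01,mem,30\n"
)
per_component = split_by_column(unloaded, "component")
```

Each value in per_component could then be written to its own object in S3, or passed to an Excel library to fill one sheet per component.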

How do I set up automatic DataFrame structure when I read Excel data into Python?

My data in Excel is not separated by ","; it is Twitter data split into columns. When I load it into Python, it is automatically turned into a DataFrame and the tweets are not shown in full. How can I overcome this?
If you have a copy open in Excel, the easiest solution would be to save a copy as a csv.
File -> Save As -> dropdown and select CSV.
But pandas also allows you to read excel files. This would be recommended if you have a lot of files and don't want to convert all of them.
df = pd.read_excel(<file>)
Now, if you're saying it isn't .xlsx and also not .csv, but you know the delimiter, then:
df = pd.read_csv(<file>, delimiter='\t') # for tab delimited, but you can change '\t' to any delimiter
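To check that the full tweet text survives when the delimiter is a tab, the same parsing can be tried with the standard-library csv module instead of pandas. A minimal sketch, assuming a hypothetical two-column layout (id, text) that is not taken from the question:

```python
import csv
import io

# Hypothetical tab-delimited sample: the tweet text itself contains commas.
raw = (
    "id\ttext\n"
    "1\tHello, world, this is a long tweet\n"
    "2\tAnother tweet, also with commas\n"
)

# delimiter="\t" makes the reader split on tabs only, so commas stay in the text.
rows = list(csv.reader(io.StringIO(raw), delimiter="\t"))
header, records = rows[0], rows[1:]

# One dict per tweet, keyed by the header names.
tweets = [dict(zip(header, record)) for record in records]
```

If the values look complete here but shortened in pandas output, the data is probably intact and only the display is truncated.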

CSV file with Arabic characters is displayed as symbols in Excel

I am using Python to extract Arabic tweets from Twitter and save them as a CSV file, but when I open the saved file in Excel the Arabic text displays as symbols. However, inside Python, Notepad, or Word, it looks fine.
Where is the problem?
This is a problem I face frequently with Microsoft Excel when opening CSV files that contain Arabic characters. Try the following workaround that I tested on latest versions of Microsoft Excel on both Windows and MacOS:
Open Excel with a blank workbook.
On the Data tab, click the From Text button (if it is not activated, make sure an empty cell is selected).
Browse to and select the CSV file.
In the Text Import Wizard, change File origin to "Unicode (UTF-8)".
Click Next and, under Delimiters, select the delimiter used in your file, e.g. comma.
Click Finish and select where to import the data.
The Arabic characters should show correctly.
Just use encoding='utf-8-sig' instead of encoding='utf-8' as follows:
import csv
data = u"اردو"
# newline='' lets the csv module control the line endings itself
with open('example.csv', 'w', encoding='utf-8-sig', newline='') as fh:
    writer = csv.writer(fh)
    writer.writerow([data])
It worked on my machine.
The only solution that I've found for saving Arabic text from Python so that Excel reads it is to use pandas and write an .xlsx file instead of a CSV. In my experience .xlsx works far better here. This is the code I put together, which worked for me:
import pandas as pd

def turn_into_csv(data, csver):
    # Despite the name, this writes an .xlsx workbook; csver is the output path without extension.
    ids = []
    texts = []
    for each in data:
        texts.append(each["full_text"])
        ids.append(str(each["id"]))
    df = pd.DataFrame({'ID': ids, 'FULL_TEXT': texts})
    # The context manager saves and closes the workbook on exit.
    # No encoding argument is passed: .xlsx stores text as Unicode, and recent pandas versions no longer accept one.
    with pd.ExcelWriter(csver + '.xlsx', engine='xlsxwriter') as writer:
        df.to_excel(writer, sheet_name='Sheet1')
The fastest way, after saving the file as .csv from Python:
Open the .csv file in Notepad++.
From the Encoding drop-down menu, choose UTF-8-BOM.
Click Save As and save it under the same name with the .csv extension (e.g. data.csv), leaving the file type as it is (.txt).
Reopen the file in Microsoft Excel.
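The Notepad++ steps can also be done from Python. A minimal sketch of re-encoding an existing UTF-8 file as UTF-8 with a BOM; the helper name add_utf8_bom and the file names are made up for illustration:

```python
import os
import tempfile


def add_utf8_bom(src_path, dst_path):
    """Rewrite a UTF-8 text file as UTF-8 with a byte order mark (BOM)."""
    # 'utf-8-sig' drops a BOM on read if one is present and always writes one,
    # so converting an already converted file does not add a second BOM.
    with open(src_path, "r", encoding="utf-8-sig", newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8-sig", newline="") as dst:
        dst.write(text)


# Demo with a small sample file (hypothetical names and content).
demo_dir = tempfile.mkdtemp()
plain = os.path.join(demo_dir, "data.csv")
with_bom = os.path.join(demo_dir, "data_bom.csv")
with open(plain, "w", encoding="utf-8", newline="") as handle:
    handle.write("id,text\n1,اردو\n")

add_utf8_bom(plain, with_bom)

# The converted file now starts with the three BOM bytes.
with open(with_bom, "rb") as handle:
    first_bytes = handle.read(3)
```

The BOM is what lets Excel detect UTF-8 when the file is opened by double-clicking.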
Excel is known to have an awful CSV import system. Long story short: if you import a CSV file on the same system that just exported it, it will work smoothly. Otherwise, the CSV file is expected to use the Windows system encoding and delimiter.
A rather awkward but robust alternative is to use LibreOffice or Oracle OpenOffice. Both lag behind Excel on most features, but not on CSV import: they let you specify the delimiter and optional quoting character along with the encoding of the CSV file, and you can then save the resulting file as xlsx.
My CSV file was already encoded as UTF-8, but explicitly re-saving it with Notepad resolved the problem.
Steps:
Open your CSV file in Notepad.
Click File --> Save as...
In the "Encoding" drop-down, select UTF-8.
Rename your file using the .csv extension.
Click Save.
Reopen the file with Excel.

How can I create sheet 2 in a CSV file by using Python code?

Is there a way to create a second sheet in the same CSV file using Python code?
Yes, there is, provided the file is an Excel workbook (.xlsx) rather than a CSV:
from openpyxl import load_workbook
workbook = load_workbook(filename="C:\\DWDM\\Status.xlsx")  # open the original file
ws2 = workbook.create_sheet("Summary", 0)  # add a sheet named "Summary" at position 0 in the same workbook
workbook.save("C:\\DWDM\\Status.xlsx")  # write the change back to disk
You can check the result with workbook.sheetnames.
You can do this by using multiple CSV files - one CSV file per sheet.
A comma-separated values file is a plain-text format. It can only represent flat data, such as a single table (or "sheet").
When storing multiple sheets, you should use separate CSV files. You can write each one separately and import/parse them individually into their destination.

Creating multiple CSV sheets in Python

Is there any way to create a CSV file with multiple sheets programmatically in Python?
Multiple CSV files. One CSV file per sheet.
A comma-separated values file is a plain-text format. It can only represent flat data, such as a single table (or "sheet").
For storing multiple sheets, you should use separate CSV files. You can write each one separately and import/parse them individually into their destination.
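One way to follow this advice is to treat a folder as the "workbook" and each CSV file in it as a sheet. A minimal sketch using only the standard library; write_sheets, read_sheets, and the sheet names are hypothetical and chosen for illustration:

```python
import csv
import tempfile
from pathlib import Path


def write_sheets(sheets, folder):
    """Write each named table to '<folder>/<name>.csv' and return the paths."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, rows in sheets.items():
        path = folder / f"{name}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            csv.writer(handle).writerows(rows)
        paths.append(path)
    return paths


def read_sheets(folder):
    """Load every '*.csv' file in a folder into a dict keyed by file stem."""
    result = {}
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open("r", newline="", encoding="utf-8") as handle:
            result[path.stem] = list(csv.reader(handle))
    return result


# Hypothetical sheet names and data.
workbook = {
    "Summary": [["total"], ["3"]],
    "Details": [["item", "qty"], ["a", "1"], ["b", "2"]],
}
out_dir = tempfile.mkdtemp()
written = write_sheets(workbook, out_dir)
loaded = read_sheets(out_dir)
```

The file stem plays the role of the sheet name, so the destination can import each file individually. Values come back as strings, because CSV carries no type information.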
The xlsxwriter library is what you're looking for. It can make a workbook with multiple sheets.
Look at the link for tutorials and code.
P.S I am answering this because this is the first result I got when searching on how to do this. Hopefully this helps others.
I'm not sure what you are trying to do with multiple sheets in a CSV.
Let me elaborate on my thinking.
Maybe you want a folder with different CSV files in it.
If you can use XML, you may want one XML sheet per table inside a larger XML "workbook".
I haven't seen a CSV with multiple sheets yet.
This worked for me to convert from Excel to CSV, exporting each sheet under its own name.
import pandas as pd
# List the sheets you want to export in the brackets below, quoted and separated by commas
xls = pd.read_excel('file name.xls', sheet_name=['1','2','3','4','5','6','7','8','9','10'])
# With a list of sheet names, read_excel returns a dict of {sheet name: DataFrame}
for sheet_name, df in xls.items():
    df['sheets'] = sheet_name  # tag each row with the sheet it came from
    # Replace the file name with the location you want to export to on your machine
    df.to_csv(f'{sheet_name}.csv', index=False, header=True)
