We are trying to export data from Teradata using Python, but when we run {fn teradata_sessions(4)}{fn teradata_require_fastexport}select * from table1; it does not trigger FastExport in Teradata; the query runs as a normal SELECT. How can I use FastExport from Teradata in Python? I also cannot increase the session count beyond 4. Has anyone used FastExport in Python to export data from Teradata into a DataFrame and then write it to a CSV file? Any other way of exporting from Teradata to CSV with Python is also welcome.
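Not a complete answer, but here is a minimal sketch of how such an export is often wired up with the teradatasql driver and pandas. The host, credentials, and table name are placeholders, and whether the FastExport escape clause actually takes effect depends on the driver version and the query, so treat this as an illustration rather than a confirmed fix.

import pandas as pd
import teradatasql  # Teradata SQL Driver for Python

# Placeholder connection details -- replace with your own.
with teradatasql.connect(host="tdhost", user="dbuser", password="dbpassword") as con:
    # Mirrors the escape clause from the question; whether FastExport is
    # actually used depends on the installed driver version.
    query = "{fn teradata_require_fastexport}select * from table1"
    df = pd.read_sql(query, con)

# Write the DataFrame to CSV without the index column.
df.to_csv("table1_export.csv", index=False)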
Is there any way in Python to read an Excel file the way a data provider works in TestNG?
I have a test method (using Python's unittest framework), and from this test I call another method that reads the Excel sheet. I want something like a data provider so that each row of data is treated as a new test case.
You could use pandas to read the Excel or CSV files.
import pandas as pd
excel_data = pd.read_excel('test_file.xlsx')
csv_data = pd.read_csv('test_file.csv')
The result is a DataFrame.
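To get the data-provider behaviour the question asks about, one option is to drive unittest's subTest from the rows of the DataFrame. This is only a sketch: the file name test_data.xlsx, the columns input and expected, and the normalise function are all made up for the example.

import unittest
import pandas as pd

def normalise(value):
    # Hypothetical function under test.
    return str(value).strip().lower()

class ExcelDrivenTest(unittest.TestCase):
    def test_rows_from_excel(self):
        # Hypothetical workbook with 'input' and 'expected' columns.
        rows = pd.read_excel("test_data.xlsx")
        for _, row in rows.iterrows():
            # Each row is reported as its own sub-test, much like a data provider.
            with self.subTest(input=row["input"]):
                self.assertEqual(normalise(row["input"]), row["expected"])

if __name__ == "__main__":
    unittest.main()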
Use pandas to read Excel files in Python. From your question I assume you are not familiar with pandas.
If you added Python to PATH during installation, use pip in the terminal to install it:
py -m pip install pandas
The Python code is:
import pandas as pd
df=pd.read_excel('Data.xlsx')
print(df.head()) # This will print the first 5 rows.
If you want to use Jupyter Notebook, install it from the terminal as well:
py -m pip install notebook
This will work best, but you need to have pandas installed through pip. For more advanced functions, please update the question with what you want: explain what the data provider does so it can be reproduced in Python, and specify the function.
Go through the pandas documentation: https://pandas.pydata.org/docs/
I am working on Windows 10. I installed Spark, and the goal is to use PySpark. I have taken the following steps:
I installed Python 3.7 with Anaconda -- Python was added to C:\Python37
I downloaded winutils from this link -- winutils was added to C:\winutils\bin
I downloaded Spark -- it was extracted to C:\spark-3.0.0-preview2-bin-hadoop2.7
I downloaded Java 8 from AdoptOpenJDK
Under system variables, I set the following variables:
HADOOP_HOME : C:\winutils
SPARK_HOME: C:\spark-3.0.0-preview2-bin-hadoop2.7
JAVA_HOME: C:\PROGRA~1\AdoptOpenJDK\jdk-8.0.242.08-hotspot
And finally, under system path, I added:
%JAVA_HOME%\bin
%SPARK_HOME%\bin
%HADOOP_HOME%\bin
When I run pyspark in the terminal, I would like to know why I am getting this warning: unable to load native-hadoop library... and why it could not bind on port 4040...
Finally, inside Jupyter Notebook, I am getting an error when trying to write to a Parquet file. Screenshots showed a working example, the code with the error, DataMaster__3.csv on my disk, and the resulting DaterMaster_par2222.parquet output.
Any help is much appreciated!!
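One quick sanity check is to confirm from Python that those variables are visible to the process before Spark starts. This is just a sketch; findspark is an assumed extra dependency (pip install findspark) and is not part of the setup described above.

import os
import findspark  # assumed extra dependency: pip install findspark

# Print the variables set in the question to confirm the process can see them.
for var in ("JAVA_HOME", "HADOOP_HOME", "SPARK_HOME"):
    print(var, "=", os.environ.get(var))

# findspark adds SPARK_HOME's pyspark to sys.path so the import below works.
findspark.init()
import pyspark
print(pyspark.__version__)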
If you are writing the file in CSV format, I have found the best way to do that is the following approach:
LCL_POS.toPandas().to_csv(<path>)
There is another way to save it directly without converting to pandas, but the issue is that the output gets split into multiple part files (with awkward names), so I tend to avoid that. If you are happy to split the file up, it is much better, in my opinion, to write a Parquet file.
LCL_POS.repartition(1).write.format("com.databricks.spark.csv").option("header", "true").save(<path>)
Hope that answers your question.
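For the Parquet error itself, a useful sanity check is to run the write from a plain local session outside Jupyter. This is only a sketch; the paths are placeholders and it assumes JAVA_HOME, HADOOP_HOME, and SPARK_HOME are set as described in the question.

from pyspark.sql import SparkSession

# Start a local session; a failure here usually points at the environment
# variables or winutils setup rather than at the write itself.
spark = SparkSession.builder.master("local[*]").appName("parquet-check").getOrCreate()

# Placeholder paths -- replace with your own locations.
df = spark.read.csv("C:/data/DataMaster__3.csv", header=True, inferSchema=True)
df.write.mode("overwrite").parquet("C:/data/DataMaster_check.parquet")

spark.stop()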
I need a Python script to load data from multiple Excel sheets into a Hive table. Can anyone help with this?
You can read the Excel files using pandas and insert the DataFrame using PyHive or any other Hive library.
Inserting a Python Dataframe into Hive from an external server
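As a rough illustration of that approach, the sketch below reads every sheet of a workbook with pandas and appends the combined frame to a Hive table through PyHive's SQLAlchemy dialect. The host, database, table, and file names are placeholders, and it assumes PyHive (with its SQLAlchemy support) and its dependencies are installed and that the target table already exists.

import pandas as pd
from sqlalchemy import create_engine  # PyHive registers the hive:// dialect

# Read all sheets into a dict of DataFrames, then stack them into one frame.
sheets = pd.read_excel("workbook.xlsx", sheet_name=None)
df = pd.concat(sheets.values(), ignore_index=True)

# Placeholder connection string -- adjust user, host, port, and database.
engine = create_engine("hive://user@hive-host:10000/default")

# Append the rows to the existing Hive table.
df.to_sql("target_table", engine, if_exists="append", index=False)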
Yes, it is very easy!!
You should have the pandas library installed; if you don't, install it by typing this in the command prompt: py -m pip install pandas
Then, use the following code:
import pandas as pd
df = pd.read_excel('<file path>', '<sheet name>')
print(df)
You will see the contents of the Excel table printed.
I am a very new Python user and have been having trouble loading an Excel file to play around with in Python.
I am using Python 3.7 on Windows 10 and have tried things like the import statement, and pip and pip3 install commands. I am confused about how to do this, and none of the links I've read online are helping.
pip install pandas
pip3 install pandas
import pandas
I just want to load an Excel file into Python. I'm embarrassed that it's causing me this much stress.
First of all, you have to import pandas (assuming it is installed; in Anaconda it usually comes pre-installed, as far as I know):
import pandas as pd
To read multiple sheets into different DataFrames (tables):
xls = pd.ExcelFile('<path of your file>')
df_schema = pd.read_excel(xls, sheet_name=xls.sheet_names)
df_schema is a dictionary of DataFrames in which the key is the name of the sheet and the value is the corresponding DataFrame.
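For example, to loop over the sheets in that dictionary and peek at each one:

for sheet_name, sheet_df in df_schema.items():
    # Print each sheet's name followed by its first few rows.
    print(sheet_name)
    print(sheet_df.head())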
To read a single sheet, the following should work:
xls = pd.ExcelFile('<path of your file>')
df = pd.read_excel(xls)
I want to convert an xlsx file to xls format using Python. The reason is that I'm using the xlrd library to parse xls files, but xlrd is not able to parse xlsx files.
Switching to a different library is not feasible for me at this stage, as the entire project uses xlrd, so a lot of changes would be required.
So, is there any way I can programmatically convert an xlsx file to xls using Python?
Please help.
Thank you.
If you're using Python on Windows and you have Excel installed, you could use the Python for Windows Extensions (pywin32) to do it. Here's a sample piece of Python code that did the job for me:
import win32com.client

# Drive Excel through COM automation.
xl = win32com.client.Dispatch("Excel.Application")
xl.DisplayAlerts = False  # suppress overwrite/compatibility prompts

wb = xl.Workbooks.Open(r"C:\PATH\TO\SOURCE_FILENAME.XLSX")
# FileFormat 56 is xlExcel8, the Excel 97-2003 (.xls) format.
wb.SaveAs(r"C:\PATH\TO\DESTINATION_FILENAME.XLS", FileFormat=56)
wb.Close()
xl.Quit()
I tested this using Python 2.7.2 with pywin32 build 216 and Excel 2007 on Windows 7.
xlrd 0.9.2 can extract data from Excel spreadsheets (.xls and .xlsx, versions 2.0 onwards) on any platform.
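So, depending on the xlrd version pinned in the project, no conversion may be needed at all. A minimal sketch of reading an xlsx file directly with an older xlrd (0.9.x through 1.2.x; note that xlrd 2.0 and later dropped xlsx support), using a placeholder file name:

import xlrd

# Open the workbook directly; .xlsx works only on xlrd versions before 2.0.
book = xlrd.open_workbook("SOURCE_FILENAME.xlsx")
sheet = book.sheet_by_index(0)

# Print every cell value, row by row.
for row_idx in range(sheet.nrows):
    print([sheet.cell_value(row_idx, col_idx) for col_idx in range(sheet.ncols)])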