Convert rows to columns in Python - python

I have an Excel file in the format below.
Note: the values in Column Name are dynamic. The current example shows 10 records; another data set may contain a different number of column names.
I want to convert the rows into columns, as shown below.
Is there an easy way to handle this scenario in pandas?

Thanks @juhat for the suggestion on pivot table. I was able to achieve the intended result with this code:
import pandas as pd
fsdData = pd.read_csv("py_fsd.csv")
# One row per "msg Srl", one column per distinct "Column Name" value
result = fsdData.pivot(index="msg Srl", columns="Column Name", values="Value")
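For anyone without the original file, here is a minimal sketch of the same pivot on made-up data. The sample values are hypothetical; only the column labels "msg Srl", "Column Name" and "Value" come from the code above.

```python
import pandas as pd

# Hypothetical sample data in the long layout described in the question.
long_df = pd.DataFrame({
    "msg Srl": [1, 1, 1, 2, 2, 2],
    "Column Name": ["Name", "City", "Age", "Name", "City", "Age"],
    "Value": ["Ann", "Oslo", "31", "Bob", "Rome", "45"],
})

# pivot() creates one output column per distinct "Column Name" value,
# so the number of output columns follows the data.
wide_df = long_df.pivot(index="msg Srl", columns="Column Name", values="Value")

# Optional: turn "msg Srl" back into a regular column and drop the axis label.
wide_df = wide_df.reset_index()
wide_df.columns.name = None
print(wide_df)
```

Note that pivot() raises a ValueError if the same ("msg Srl", "Column Name") pair occurs more than once; in that case pivot_table() with an explicit aggfunc may be a better fit.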

Related

pandas data rows part merge

I have created a DataFrame with pandas.
It has more than 1000 rows.
I want to merge the rows that share the same value in a column.
For convenience, here are example screenshots made in Excel.
I want to produce the same layout in Python.
I want to turn the data above into the form below.
This should be as simple as setting the index.
df = df.set_index('Symbol', append=True).swaplevel(0,1)
Output should be as desired.
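A small sketch of what that line does, on a made-up frame. The screenshots are not available here, so every column except 'Symbol' and all values are assumptions.

```python
import pandas as pd

# Hypothetical data; only the 'Symbol' column name comes from the answer above.
df = pd.DataFrame({
    "Symbol": ["AAA", "AAA", "BBB", "BBB"],
    "Date": ["2021-01-04", "2021-01-05", "2021-01-04", "2021-01-05"],
    "Close": [10.0, 10.5, 20.0, 19.5],
})

# Append 'Symbol' to the existing row index, then make it the outer level.
indexed = df.set_index("Symbol", append=True).swaplevel(0, 1)

# pandas prints a repeated outer-level value only once per group,
# which looks like merged cells in Excel.
print(indexed)
```

The underlying rows are unchanged; only the display groups them under each Symbol.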

Inserting values into a CSV from another CSV based on matching values in one field

I have two CSV files. csv-1 has ID, NAME, and other columns, but its ID field is empty. csv-2 also has ID and NAME columns, and both are populated. I want to fill in the ID values in csv-1 from csv-2 wherever the NAME values in both files match. I need to do this in Python (with or without pandas). Any lead will be highly appreciated. Thank you very much.
Some information is missing, such as whether both CSV files have the same number of rows.
Nevertheless, you need a set-up like this.
Assuming the sizes are the same...
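import pandas as pd
# Replace the placeholder strings with the actual file locations.
df1 = pd.read_csv("<csv1 location>")
df2 = pd.read_csv("<csv2 location>")
# Positional copy: only correct if both files list the same rows in the same order.
df1['ID'] = df2['ID'].values
file_name = "<output location>"
df1.to_csv(file_name, encoding='utf-8', index=False)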
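If the two files differ in size or row order, the IDs can instead be looked up by NAME. A minimal sketch, assuming each NAME maps to a single ID in csv-2; the in-memory frames below stand in for the two pd.read_csv results, and their values and the CITY column are made up.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files.
df1 = pd.DataFrame({
    "ID": [None, None, None],
    "NAME": ["alice", "bob", "carol"],
    "CITY": ["Oslo", "Rome", "Lima"],
})
df2 = pd.DataFrame({"ID": [101, 102], "NAME": ["bob", "alice"]})

# Build a NAME -> ID lookup. drop_duplicates keeps the first ID per NAME,
# because map() needs a unique index to look values up.
id_by_name = df2.drop_duplicates("NAME").set_index("NAME")["ID"]

# Names that are missing from csv-2 get NaN instead of an ID.
df1["ID"] = df1["NAME"].map(id_by_name)
print(df1)
```

The result can then be written out with to_csv(..., index=False) as in the snippet above.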

I can't select rows from a datetime in a multiindex dataframe

I'm trying to select rows within a specific timeframe. The DataFrame has a two-level index, and one level holds datetimes (created with pd.to_datetime). When I try to select rows with df_pivot.loc[slice(None), '2021'], I get KeyError: '2021'. Selecting rows by year should be possible with datetimes, right? What am I doing wrong? Picture of the dataframe/indexes
The problem is solved: I used reset_index() and then set_index('Datetime') to make the data easier to navigate.
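A rough sketch of that fix on made-up data, plus one possible way to filter by year without flattening the index. The level name 'Sensor' and all values are assumptions; only df_pivot and 'Datetime' come from the post.

```python
import pandas as pd

# Hypothetical two-level index: a label level plus a datetime level.
index = pd.MultiIndex.from_product(
    [["A", "B"], pd.to_datetime(["2020-12-31", "2021-01-01", "2021-06-15"])],
    names=["Sensor", "Datetime"],
)
df_pivot = pd.DataFrame({"value": list(range(6))}, index=index)

# The fix described above: move both levels to columns, then index by
# Datetime only. Sorting keeps partial-string slicing well defined.
flat = df_pivot.reset_index().set_index("Datetime").sort_index()
rows_2021 = flat.loc["2021"]

# Alternative that keeps the MultiIndex: build a boolean mask from the
# datetime level and filter with it.
is_2021 = df_pivot.index.get_level_values("Datetime").year == 2021
rows_2021_multi = df_pivot[is_2021]

print(rows_2021)
print(rows_2021_multi)
```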

How to force pdfplumber to extract table according to the number of columns in the upper row?

I am trying to extract a table from a PDF document with the Python package pdfplumber. The table has four columns and multiple rows. The first row contains the headers, the second row has a single merged cell, and the remaining rows hold values normally (example).
pdfplumber was able to retrieve the table, but it produced six columns instead of four and did not place the values in the correct columns.
Table as shown in PDF document
I tried various table settings, including "vertical_strategy": "lines", but this gives the same result.
# Python 2.7.16
import pandas as pd
import pdfplumber
path = 'file_path'
pdf = pdfplumber.open(path)
first_page = pdf.pages[7]
df5 = pd.DataFrame(first_page.extract_table())
I get six columns instead of four, with values in the wrong columns.
Output example:
Table as output in jupyter notebooks
I would be happy to hear any suggestions or solutions.
Did you get an answer? I also want to replace the \n that appears in the column text.
This is not exactly what you're looking for, but you could load the output into a DataFrame and iterate over it, using the non-null values in the first row as the column names for a second DataFrame. After that it is easy: collate all the data between two consecutive header columns in the output DataFrame, merge those cells, and insert the result into the new DataFrame.
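A rough sketch of that idea in pandas. It assumes each spilled value lands somewhere between its own header and the next header, which may not hold for every PDF. The raw list is made up and only mimics the shape returned by extract_table(); the helper join_cells is hypothetical and also replaces embedded \n with a space.

```python
import pandas as pd

# Hypothetical raw output: 4 real headers spread over 6 extracted columns.
raw = [
    ["Item", None, "Qty", "Price", None, "Note"],
    ["Section A", None, None, None, None, None],
    ["Bolt", None, "10", None, "0.25", "stainless\nsteel"],
    [None, "Nut", "5", "0.10", None, None],
]
raw_df = pd.DataFrame(raw)

header = raw_df.iloc[0]
body = raw_df.iloc[1:]

# Column positions that carry a real header, and where each span ends.
starts = [i for i, name in enumerate(header) if pd.notna(name)]
ends = starts[1:] + [raw_df.shape[1]]


def join_cells(row):
    """Merge the non-null cells of one span, replacing newlines with spaces."""
    return " ".join(str(v).replace("\n", " ") for v in row if pd.notna(v))


# One output column per real header, built from all raw columns in its span.
clean = pd.DataFrame({
    header.iloc[s]: body.iloc[:, s:e].apply(join_cells, axis=1)
    for s, e in zip(starts, ends)
}).reset_index(drop=True)
print(clean)
```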

export table to csv keeping format python

I have a dataframe grouped by 3 variables. It looks like:
https://i.stack.imgur.com/q8W0y.png
When I export the table to csv, the format changes. I want to keep the original format.
Any ideas?
Thanks!
Pandas to_csv (and csv in general) does not support the MultiIndex used in your data. As such, it just stores the indices "long" (so each level of the MultiIndex would be a column, and each row would have its index value.) I suspect that's what you are calling "format changes".
The upshot is that if you expect to save a pandas dataframe to csv and then reestablish the dataframe from the csv, then you need to re-index the dataframe to the MultiIndex yourself, after importing it.
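A minimal sketch of that round trip, assuming the three grouping variables are written out as ordinary columns and named again on import. The column names and values are made up, and an in-memory buffer stands in for the CSV file.

```python
import io

import pandas as pd

# Hypothetical data grouped by three variables.
df = pd.DataFrame({
    "region": ["N", "N", "S", "S"],
    "year": [2020, 2021, 2020, 2021],
    "product": ["x", "x", "y", "y"],
    "sales": [1, 2, 3, 4],
})
grouped = df.groupby(["region", "year", "product"]).sum()

# to_csv writes every index level as a plain column ("long" layout).
buffer = io.StringIO()
grouped.to_csv(buffer)
print(buffer.getvalue())

# On import, name the same columns as the index to rebuild the MultiIndex.
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=["region", "year", "product"])
print(restored)
```

If the goal is the merged-cell look rather than a reloadable file, the CSV format itself cannot carry it, since each row has to repeat its index values.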
