Search and replace dots and commas in pandas dataframe

This is my DataFrame:
d = {'col1': ['sku 1.1', 'sku 1.2', 'sku 1.3'], 'col2': ['9.876.543,21', 654, '321,01']}
df = pd.DataFrame(data=d)
df
      col1          col2
0  sku 1.1  9.876.543,21
1  sku 1.2           654
2  sku 1.3        321,01
Data in col2 are numbers in local format, which I would like to convert into:
col2
9876543.21
654
321.01
I tried df['col2'] = pd.to_numeric(df['col2'], downcast='float'), which raises ValueError: Unable to parse string "9.876.543,21" at position 0.
I also tried df = df.apply(lambda x: x.str.replace(',', '.')), which raises ValueError: could not convert string to float: '5.023.654.46'

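If possible, the best option is to let read_csv handle the separators through its thousands and decimal parameters:
df = pd.read_csv(file, thousands='.', decimal=',')
If that is not possible, replace should help (first drop the thousands dots, then turn the decimal comma into a dot):
df['col2'] = (df['col2'].replace(r'\.', '', regex=True)
                        .replace(',', '.', regex=True)
                        .astype(float))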
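
To make both routes concrete, here is a minimal self-contained sketch. It assumes the raw data arrives as semicolon-separated text; the `raw` string and the names `df_csv` and `df_fix` are made up for illustration and are not from the question.

```python
import io

import pandas as pd

# Hypothetical raw text standing in for a file on disk.
raw = "col1;col2\nsku 1.1;9.876.543,21\nsku 1.2;654\nsku 1.3;321,01\n"

# Route 1: let the parser treat '.' as thousands and ',' as decimal separator.
df_csv = pd.read_csv(io.StringIO(raw), sep=";", thousands=".", decimal=",")

# Route 2: the frame already exists, with mixed str/int values in col2.
df_fix = pd.DataFrame({
    "col1": ["sku 1.1", "sku 1.2", "sku 1.3"],
    "col2": ["9.876.543,21", 654, "321,01"],
})
df_fix["col2"] = (
    df_fix["col2"]
    .astype(str)                          # turn the int 654 into a string too
    .str.replace(".", "", regex=False)    # drop thousands separators
    .str.replace(",", ".", regex=False)   # decimal comma -> decimal point
    .astype(float)
)

print(df_csv["col2"].tolist())
print(df_fix["col2"].tolist())
```

Only col2 is touched in the second route, so the dots in col1 ('sku 1.1') are left alone.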

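You can also swap the two characters through a temporary placeholder. Note that Series.replace only matches whole cell values unless regex=True, so use the .str accessor for substring replacement. This leaves col2 as strings in the form 9,876,543.21:
s = df['col2'].astype(str)
s = s.str.replace(',', '&', regex=False)
s = s.str.replace('.', ',', regex=False)
df['col2'] = s.str.replace('&', '.', regex=False)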

You are always better off using standard system facilities where they exist. Knowing that some locales use commas and decimal points differently, I could not believe that Pandas would not use the formats of the locale.
Sure enough, a quick search revealed this gist, which explains how to make use of locales to convert strings to numbers. In essence, you need to import locale and, after you've built the dataframe, call locale.setlocale to establish a locale that uses commas as decimal points and periods as thousands separators, then apply the conversion (e.g. locale.atof) with the dataframe's applymap method.

Related

How to remove all string values that precede a character in pandas?

I have the following dataframe:
data = {'Name':['Square_Train_1', 'Stims1/Neut/32Neut1.jpg', 'Square_Train_2',
'Stims1/Neg/114Neg1.jpg', 'Square_Train_3',
'Stims1/Pos/129Pos1.jpg', 'Stims1/Neut/58Neut1.jpg',
'Stims1/Neg/13Neg1.jpg', 'Stims1/Pos/5Pos1.jpg',
'Stims1/Pos/25Pos1.jpg', 'Stims1/Neg/47Neg1.jpg',
'Stims1/Neut/8Neut1.jpg', 'Stims1/Neg/129Neg1.jpg',
'Stims1/Neut/42Neut1.jpg', 'Stims1/Pos/98Pos1.jpg',
'Stims1/Neut/24Neut1.jpg', 'Stims1/Neg/6Neg1.jpg',
'Stims1/Pos/107Pos1.jpg', 'Stims1/Neg/100Neg1.jpg',
'Stims1/Pos/77Pos1.jpg', 'Stims1/Neut/3Neut1.jpg',
'Stims1/Neg/53Neg1.jpg', 'Stims1/Pos/157Pos1.jpg',
'Stims1/Neut/13Neut1.jpg', 'Stims1/Neut/9Neut1.jpg',
'Stims1/Pos/104Pos1.jpg', 'Stims1/Neg/64Neg1.jpg',
'Stims1/Neut/30Neut1.jpg', 'Stims1/Pos/43Pos1.jpg',
'Stims1/Neg/1Neg1.jpg', 'Stims1/Neut/59Neut1.jpg',
'Stims1/Neg/172Neg1.jpg', 'Stims1/Pos/56Pos1.jpg',
'Stims1/Pos/44Pos1.jpg', 'Stims1/Neg/34Neg1.jpg',
'Stims1/Neut/16Neut1.jpg', 'Stims1/Neut/47Neut1.jpg',
'Stims1/Neg/21Neg1.jpg', 'Stims1/Pos/96Pos1.jpg',
'Stims1/Neg/50Neg1.jpg', 'Stims1/Pos/2Pos1.jpg',
'Stims1/Neut/21Neut1.jpg', 'Stims1/Neg/65Neg1.jpg',
'Stims1/Pos/35Pos1.jpg', 'Stims1/Neut/51Neut1.jpg',
'Stims1/Neut/55Neut1.jpg', 'Stims1/Pos/60Pos1.jpg',
'Stims1/Neg/30Neg1.jpg', 'Stims1/Neut/7Neut1.jpg',
'Stims1/Pos/9Pos1.jpg', 'Stims1/Neg/41Neg1.jpg',
'Stims1/Pos/31Pos1.jpg', 'Stims1/Neut/40Neut1.jpg',
'Stims1/Neg/156Neg1.jpg', 'Stims1/Neg/135Neg1.jpg',
'Stims1/Pos/71Pos1.jpg', 'Stims1/Neut/26Neut1.jpg',
'Stims1/Pos/105Pos1.jpg', 'Stims1/Neg/17Neg1.jpg',
'Stims1/Neut/44Neut1.jpg', 'Stims1/Pos/150Pos1.jpg',
'Stims1/Neut/57Neut1.jpg', 'Stims1/Neg/12Neg1.jpg',
'Stims1/Pos/24Pos1.jpg', 'Stims1/Neg/131Neg1.jpg',
'Stims1/Neut/31Neut1.jpg', 'Stims1/Pos/10Pos1.jpg',
'Stims1/Neut/11Neut1.jpg', 'Stims1/Neg/118Neg1.jpg',
'Stims1/Neg/51Neg1.jpg', 'Stims1/Pos/48Pos1.jpg',
'Stims1/Neut/34Neut1.jpg', 'Stims1/Pos/148Pos1.jpg',
'Stims1/Neut/22Neut1.jpg', 'Stims1/Neg/176Neg1.jpg',
'Stims1/Neut/5Neut1.jpg', 'Stims1/Neg/104Neg1.jpg',
'Stims1/Pos/68Pos1.jpg', 'Stims1/Neut/35Neut1.jpg',
'Stims1/Pos/14Pos1.jpg', 'Stims1/Neg/136Neg1.jpg',
'Stims1/Neut/54Neut1.jpg', 'Stims1/Neg/107Neg1.jpg',
'Stims1/Pos/47Pos1.jpg', 'Stims1/Neut/43Neut1.jpg',
'Stims1/Neg/58Neg1.jpg', 'Stims1/Pos/20Pos1.jpg',
'Stims1/Neut/6Neut1.jpg', 'Stims1/Neg/63Neg1.jpg',
'Stims1/Pos/135Pos1.jpg', 'Stims1/Neut/39Neut1.jpg',
'Stims1/Neg/164Neg1.jpg', 'Stims1/Pos/125Pos1.jpg',
'Stims1/Neg/117Neg1.jpg', 'Stims1/Neut/48Neut1.jpg',
'Stims1/Pos/69Pos1.jpg', 'Stims1/Pos/37Pos1.jpg',
'Stims1/Neg/159Neg1.jpg', 'Stims1/Neut/36Neut1.jpg',
'Stims1/Pos/75Pos1.jpg', 'Stims1/Neg/180Neg1.jpg',
'Stims1/Neut/50Neut1.jpg', 'Stims1/Neg/7Neg1.jpg',
'Stims1/Pos/11Pos1.jpg', 'Stims1/Neut/52Neut1.jpg',
'Stims1/Pos/29Pos1.jpg', 'Stims1/Neut/46Neut1.jpg',
'Stims1/Neg/115Neg1.jpg', 'Stims1/Neg/31Neg1.jpg',
'Stims1/Pos/66Pos1.jpg', 'Stims1/Neut/14Neut1.jpg',
'Stims1/Neut/53Neut1.jpg', 'Stims1/Neg/162Neg1.jpg',
'Stims1/Pos/97Pos1.jpg', 'Stims1/Neg/35Neg1.jpg',
'Stims1/Neut/45Neut1.jpg', 'Stims1/Pos/32Pos1.jpg',
'Stims1/Pos/81Pos1.jpg', 'Stims1/Neg/24Neg1.jpg',
'Stims1/Neut/1Neut1.jpg', 'Stims1/Neut/20Neut1.jpg',
'Stims1/Neg/69Neg1.jpg', 'Stims1/Pos/52Pos1.jpg',
'Stims2/Pos/35Pos2.jpg', 'Stims2/Neut/1Neut2.jpg',
'Stims2/Neg/30Neg2.jpg', 'Stims2/Neg/156Neg2.jpg',
'Stims2/Neut/59Neut2.jpg', 'Stims2/Pos/150Pos2.jpg',
'Stims2/Neg/114Neg2.jpg', 'Stims2/Neut/39Neut2.jpg',
'Stims2/Pos/98Pos2.jpg', 'Stims2/Pos/14Pos2.jpg',
'Stims2/Neg/24Neg2.jpg', 'Stims2/Neut/51Neut2.jpg',
'Stims2/Pos/48Pos2.jpg', 'Stims2/Neg/31Neg2.jpg',
'Stims2/Neut/26Neut2.jpg', 'Stims2/Neg/35Neg2.jpg',
'Stims2/Neut/40Neut2.jpg', 'Stims2/Pos/60Pos2.jpg',
'Stims2/Pos/77Pos2.jpg', 'Stims2/Neut/9Neut2.jpg',
'Stims2/Neg/47Neg2.jpg', 'Stims2/Neg/107Neg2.jpg',
'Stims2/Pos/66Pos2.jpg', 'Stims2/Neut/55Neut2.jpg',
'Stims2/Neut/14Neut2.jpg', 'Stims2/Pos/56Pos2.jpg',
'Stims2/Neg/34Neg2.jpg', 'Stims2/Neg/131Neg2.jpg',
'Stims2/Pos/97Pos2.jpg', 'Stims2/Neut/52Neut2.jpg',
'Stims2/Neut/45Neut2.jpg', 'Stims2/Neg/162Neg2.jpg',
'Stims2/Pos/129Pos2.jpg', 'Stims2/Pos/52Pos2.jpg',
'Stims2/Neg/104Neg2.jpg', 'Stims2/Neut/48Neut2.jpg',
'Stims2/Neut/21Neut2.jpg', 'Stims2/Pos/104Pos2.jpg',
'Stims2/Neg/50Neg2.jpg', 'Stims2/Pos/24Pos2.jpg',
'Stims2/Neut/34Neut2.jpg', 'Stims2/Neg/176Neg2.jpg',
'Stims2/Neg/129Neg2.jpg', 'Stims2/Pos/47Pos2.jpg',
'Stims2/Neut/36Neut2.jpg', 'Stims2/Pos/157Pos2.jpg',
'Stims2/Neg/58Neg2.jpg', 'Stims2/Neut/7Neut2.jpg',
'Stims2/Neut/53Neut2.jpg', 'Stims2/Pos/69Pos2.jpg',
'Stims2/Neg/172Neg2.jpg', 'Stims2/Pos/68Pos2.jpg',
'Stims2/Neut/35Neut2.jpg', 'Stims2/Neg/100Neg2.jpg',
'Stims2/Neg/17Neg2.jpg', 'Stims2/Pos/148Pos2.jpg',
'Stims2/Neut/46Neut2.jpg', 'Stims2/Neut/16Neut2.jpg',
'Stims2/Pos/105Pos2.jpg', 'Stims2/Neg/159Neg2.jpg',
'Stims2/Pos/29Pos2.jpg', 'Stims2/Neg/64Neg2.jpg',
'Stims2/Neut/58Neut2.jpg', 'Stims2/Neut/30Neut2.jpg',
'Stims2/Pos/71Pos2.jpg', 'Stims2/Neg/41Neg2.jpg',
'Stims2/Neut/20Neut2.jpg', 'Stims2/Neg/69Neg2.jpg',
'Stims2/Pos/9Pos2.jpg', 'Stims2/Pos/5Pos2.jpg',
'Stims2/Neut/13Neut2.jpg', 'Stims2/Neg/1Neg2.jpg',
'Stims2/Pos/31Pos2.jpg', 'Stims2/Neg/21Neg2.jpg',
'Stims2/Neut/32Neut2.jpg', 'Stims2/Pos/96Pos2.jpg',
'Stims2/Neg/118Neg2.jpg', 'Stims2/Neut/57Neut2.jpg',
'Stims2/Neut/3Neut2.jpg', 'Stims2/Pos/32Pos2.jpg',
'Stims2/Neg/117Neg2.jpg', 'Stims2/Neg/6Neg2.jpg',
'Stims2/Pos/10Pos2.jpg', 'Stims2/Neut/44Neut2.jpg',
'Stims2/Pos/25Pos2.jpg', 'Stims2/Neut/50Neut2.jpg',
'Stims2/Neg/51Neg2.jpg', 'Stims2/Neut/47Neut2.jpg',
'Stims2/Neg/135Neg2.jpg', 'Stims2/Pos/125Pos2.jpg',
'Stims2/Neut/43Neut2.jpg', 'Stims2/Neg/7Neg2.jpg',
'Stims2/Pos/11Pos2.jpg', 'Stims2/Neut/22Neut2.jpg',
'Stims2/Pos/20Pos2.jpg', 'Stims2/Neg/180Neg2.jpg',
'Stims2/Neut/31Neut2.jpg', 'Stims2/Neg/164Neg2.jpg',
'Stims2/Pos/37Pos2.jpg', 'Stims2/Neg/13Neg2.jpg',
'Stims2/Neut/5Neut2.jpg', 'Stims2/Pos/135Pos2.jpg',
'Stims2/Neg/53Neg2.jpg', 'Stims2/Neut/54Neut2.jpg',
'Stims2/Pos/81Pos2.jpg', 'Stims2/Pos/44Pos2.jpg',
'Stims2/Neut/11Neut2.jpg', 'Stims2/Neg/115Neg2.jpg',
'Stims2/Neut/6Neut2.jpg', 'Stims2/Pos/107Pos2.jpg',
'Stims2/Neg/136Neg2.jpg', 'Stims2/Pos/75Pos2.jpg',
'Stims2/Neg/65Neg2.jpg', 'Stims2/Neut/42Neut2.jpg',
'Stims2/Pos/43Pos2.jpg', 'Stims2/Neut/24Neut2.jpg',
'Stims2/Neg/12Neg2.jpg', 'Stims2/Neut/8Neut2.jpg',
'Stims2/Pos/2Pos2.jpg', 'Stims2/Neg/63Neg2.jpg']}
# Create DataFrame
df = pd.DataFrame(data)
# Print the output.
df
The goal is to remove all characters that precede the '/' character.
I tried 'lstrip':
df['Name'] = df['Name'].map(lambda x: x.lstrip('Stims1/Neut/'))
df['Name'] = df['Name'].map(lambda x: x.lstrip('Stims1/Pos/'))
df['Name'] = df['Name'].map(lambda x: x.lstrip('Pos/'))
df['Name'] = df['Name'].map(lambda x: x.lstrip('2'))
df['Name'] = df['Name'].map(lambda x: x.lstrip('/Pos/'))
df['Name'] = df['Name'].map(lambda x: x.lstrip('Neg/'))
df['Name'] = df['Name'].map(lambda x: x.lstrip('/Neut/'))
df['Name'] = df['Name'].map(lambda x: x.lstrip('ut/'))
The problem with lstrip is that it requires a lot of different inputs to match the strings, and it often strips too many characters.
I would like to avoid using 'replace', as it is even less efficient; it requires entering every single combination of strings. The same problem seems to apply to 're'.
Is there a way to remove all characters that precede the '/' most efficiently?

It looks like what you're really trying to do is grab just the filename and drop the rest of the directory from the filepath. If that is the case, I would use apply on the column with os.path.basename:
>>> import os
>>> df['Name'] = df['Name'].apply(os.path.basename)
Which results in
>>> df
Name
0 Square_Train_1
1 32Neut1.jpg
2 Square_Train_2
3 114Neg1.jpg
4 Square_Train_3
.. ...
238 24Neut2.jpg
239 12Neg2.jpg
240 8Neut2.jpg
241 2Pos2.jpg
242 63Neg2.jpg
[243 rows x 1 columns]
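
For reference, a small sketch of why lstrip over-strips, plus a vectorized alternative to os.path.basename. The shortened `names` Series is a stand-in for the full column, and `str.rsplit` is a different technique from the one in the answer above, shown only as an option.

```python
import pandas as pd

# A few representative values instead of the full 243-row column.
names = pd.Series([
    "Square_Train_1",
    "Stims1/Neut/32Neut1.jpg",
    "Stims1/Neut/1Neut1.jpg",
    "Stims2/Pos/2Pos2.jpg",
])

# lstrip removes any leading characters contained in the given set,
# not a literal prefix, so it also eats the start of the filename here.
over_stripped = "Stims1/Neut/1Neut1.jpg".lstrip("Stims1/Neut/")
print(over_stripped)

# Split once from the right on '/' and keep the last piece.
# Values without a '/' are returned unchanged.
basenames = names.str.rsplit("/", n=1).str[-1]
print(basenames.tolist())
```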

Subset string rows that contain a 'flexible' pattern

I have the following df.
data = [
['DWWWWD'],
['DWDW'],
['WDWWWWWWWWD'],
['DDW'],
['WWD'],
]
df = pd.DataFrame(data, columns=['letter_sequence'])
I want to subset the rows that contain the pattern 'D' + '[whichever number of W's]' + 'D'. Examples of rows I want in my output df: DWD, DWWWWWWWWWWWD, WWWWWDWDW...
I came up with the following, but it does not really work for 'whichever number of W's'.
df[df['letter_sequence'].str.contains(
'DWD|DWWD|DWWWD|DWWWWD|DWWWWWD|DWWWWWWD|DWWWWWWWD|DWWWWWWWWD', regex=True
)]
Desired output new_df:
letter_sequence
0 DWWWWD
1 DWDW
2 WDWWWWWWWWD
Any alternatives?

Use [W]{1,} (equivalent to W+) for one or more W. regex=True is the default, so it can be omitted:
df = df[df['letter_sequence'].str.contains('D[W]{1,}D')]
print (df)
letter_sequence
0 DWWWWD
1 DWDW
2 WDWWWWWWWWD
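
A quick, self-contained check of the pattern against the sample data from the question (a sketch; `mask` and `new_df` are illustrative names):

```python
import pandas as pd

df = pd.DataFrame(
    {"letter_sequence": ["DWWWWD", "DWDW", "WDWWWWWWWWD", "DDW", "WWD"]}
)

# 'W+' means one or more W, so this matches D, then at least one W, then D,
# anywhere in the string.
mask = df["letter_sequence"].str.contains(r"DW+D")
new_df = df[mask]

print(mask.tolist())
print(new_df["letter_sequence"].tolist())
```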

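You can use the regex D\w+D. Note that \w matches any word character, not only W, so DW+D is the stricter choice if other letters can occur between the two D's.
The code is shown below:
df = df[df['letter_sequence'].str.contains(r'D\w+D')]
Please let me know if it helps.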

Sorting pandas dataframe with German Umlaute

I have a dataframe which I want to sort via sort_values on one column.
Problem is there are German umlaute as first letter of the words.
Like Österreich, Zürich.
Which will sort to Zürich, Österreich.
It should be sorting Österreich, Zürich.
Ö should be between N and O.
I have found out how to do this with lists in python using locale and strxfrm.
Can I do this in the pandas dataframe somehow directly?
Edit:
Thank you. Stef's example worked quite well, but my real-life DataFrame also contains numbers, for which his version did not work, so I used alexey's idea.
I did the following (it can probably be shortened):
df = pd.DataFrame({'location': ['Österreich','Zürich','Bern', 254345],'code':['ö','z','b', 'v']})
#create index as column for joining later
df = df.reset_index(drop=False)
#convert int to str
df['location']=df['location'].astype(str)
#sort by location with umlaute
df_sort_index = df['location'].str.normalize('NFD').sort_values(ascending=True).reset_index(drop=False)
#drop location so we dont have it in both tables
df = df.drop('location', axis=1)
#inner join on index
new_df = pd.merge(df_sort_index, df, how='inner', on='index')
#drop index as column
new_df = new_df.drop('index', axis=1)

You could use sorted with a locale aware sorting function (in my example, setlocale returned 'German_Germany.1252') to sort the column values. The tricky part is to sort all the other columns accordingly. A somewhat hacky solution would be to temporarily set the index to the column to be sorted and then reindex on the properly sorted index values and reset the index.
import functools
import locale
locale.setlocale(locale.LC_ALL, '')
df = pd.DataFrame({'location': ['Österreich','Zürich','Bern'],'code':['ö','z','b']})
df = df.set_index('location')
df = df.reindex(sorted(df.index, key=functools.cmp_to_key(locale.strcoll))).reset_index()
Output of print(df):
location code
0 Bern b
1 Österreich ö
2 Zürich z
Update for mixed type columns
If the column to be sorted is of mixed types (e.g. strings and integers), then you have two possibilities:
a) convert the column to string and then sort as written above (result column will be all strings):
locale.setlocale(locale.LC_ALL, '')
df = pd.DataFrame({'location': ['Österreich','Zürich','Bern', 254345],'code':['ö','z','b','v']})
df.location=df.location.astype(str)
df = df.set_index('location')
df = df.reindex(sorted(df.index, key=functools.cmp_to_key(locale.strcoll))).reset_index()
print(df.location.values)
# ['254345' 'Bern' 'Österreich' 'Zürich']
b) sort on a copy of the column converted to string (result column will retain mixed types)
locale.setlocale(locale.LC_ALL, '')
df = pd.DataFrame({'location': ['Österreich','Zürich','Bern', 254345],'code':['ö','z','b','v']})
df = df.set_index(df.location.astype(str))
df = df.reindex(sorted(df.index, key=functools.cmp_to_key(locale.strcoll))).reset_index(drop=True)
print(df.location.values)
# [254345 'Bern' 'Österreich' 'Zürich']

You can use the Unicode NFD normal form:
>>> names = pd.Series(['Österreich', 'Ost', 'S', 'N'])
>>> names.str.normalize('NFD').sort_values()
3 N
1 Ost
0 Österreich
2 S
dtype: object
# use the result to rearrange a dataframe
>>> df.loc[names.str.normalize('NFD').sort_values().index]
It's not quite what you wanted, but for proper ordering you need language knowledge (like the locale you mentioned).
NFD represents an umlaut as two code points, a base letter plus a combining diaeresis, e.g. Ö becomes O\xcc\x88 in UTF-8 (you can see the difference with names.str.normalize('NFD').str.encode('utf-8')).
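
The same idea can be written without building a separate Series, by passing the normalization as a sort key. This is a sketch that assumes pandas 1.1 or later (where sort_values accepts key); the column values are spelled with escapes only to guarantee the precomposed form, and the 'code' values are made up.

```python
import pandas as pd

df = pd.DataFrame({
    # "\u00d6sterreich" is Österreich, "Z\u00fcrich" is Zürich (precomposed).
    "location": ["\u00d6sterreich", "Z\u00fcrich", "Bern"],
    "code": ["o", "z", "b"],
})

# Plain code-point ordering puts Ö (U+00D6) after Z.
plain_order = df.sort_values("location")["code"].tolist()

# Sorting on the NFD form compares 'O' + combining diaeresis, so Ö sorts
# with O. The key only affects ordering; the stored values are unchanged.
sorted_df = df.sort_values(
    "location", key=lambda col: col.str.normalize("NFD")
)

print(plain_order)
print(sorted_df["code"].tolist())
```

As with the Series version, this is code-point ordering on decomposed strings, not true German collation.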

Sort with locale:
import pandas as pd
import locale
locale.setlocale(locale.LC_ALL, 'de_de')
#codes: https://github.com/python/cpython/blob/3.10/Lib/locale.py
#create df
df = pd.DataFrame({'location': ['Zürich','Österreich','Bern', 254345],'code':['z','ö','b','v']})
#convert int to str
df['location']=df['location'].astype(str)
#sort
df_ord = df.sort_values(by = 'location', key = lambda col: col.map(lambda x: locale.strxfrm(x)))
Multisort with locale:
import pandas as pd
import locale
locale.setlocale(locale.LC_ALL, 'es_es')
# create df
lista1 = ['sarmiento', 'ñ', 'á', 'sánchez', 'a', 'ó', 's', 'ñ', 'á', 'sánchez']
lista2 = [10, 20, 60, 40, 20, 20, 10, 5, 30, 20]
df = pd.DataFrame(list(zip(lista1, lista2)), columns = ['Col1', 'Col2'])
#sort by Col2 and Col1
df_temp = df.sort_values(by = 'Col2')
df_ord = df_temp.sort_values(by = 'Col1', key = lambda col: col.map(lambda x: locale.strxfrm(x)), kind = 'mergesort')

How to use pandas.to_clipboard with comma decimal separator

How can I copy a DataFrame to_clipboard and paste it in excel with commas as decimal?
In R this is simple.
write.table(obj, 'clipboard', dec = ',')
But I cannot figure out in pandas to_clipboard.
I unsuccessfully tried changing:
import locale
locale.setlocale(locale.LC_ALL, '')
Spanish_Argentina.1252
or
df.to_clipboard(float_format = '%,%')

Since Pandas 0.16 you can use
df.to_clipboard(decimal=',')
to_clipboard() passes extra kwargs to to_csv(), which has other useful options.
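
Since the clipboard is not always available (for example on a headless machine), the effect of decimal=',' can be previewed with to_csv directly. A small sketch with made-up data; the tab separator is chosen to resemble what gets pasted into Excel:

```python
import pandas as pd

df = pd.DataFrame({"A": [1.5, 2.25], "B": [-0.5, 10.0]})

# With no path given, to_csv returns the text instead of writing a file.
text = df.to_csv(sep="\t", decimal=",")
print(text)
```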

There are some different ways to achieve this. First, it is possible with float_format and your locale, although the use is not so straightforward (but simple once you know it: the float_format argument should be a function that can be called):
df.to_clipboard(float_format='{:n}'.format)
A small illustration:
In [97]: df = pd.DataFrame(np.random.randn(5,2), columns=['A', 'B'])
In [98]: df
Out[98]:
A B
0 1.125438 -1.015477
1 0.900816 1.283971
2 0.874250 1.058217
3 -0.013020 0.758841
4 -0.030534 -0.395631
In [99]: df.to_clipboard(float_format='{:n}'.format)
gives:
A B
0 1,12544 -1,01548
1 0,900816 1,28397
2 0,87425 1,05822
3 -0,0130202 0,758841
4 -0,0305337 -0,395631
If you don't want to rely on the locale setting but still have comma decimal output, you can do this:
class CommaFloatFormatter:
    def __mod__(self, x):
        return str(x).replace('.', ',')
df.to_clipboard(float_format=CommaFloatFormatter())
or simply do the conversion before writing the data to clipboard:
df.applymap(lambda x: str(x).replace('.',',')).to_clipboard()

remove parentheses from complex numbers - pandas

As a follow up to this post python pandas complex number and now that complex works fine with pandas, I want to save the complex numbers but without the parentheses -
when I use the following command the last column (complex number) is printed inside parentheses
EDIT: here is the full code, to read the data file (sample here)
import numpy as np
import pandas as pd
df = pd.read_csv('final.dat', sep=",", header=None)
df.columns=['X.1', 'X.2', 'X.3', 'X.4','X.5', 'X.6', 'X.7', 'X.8']
df['X.8'] = df['X.8'].str.replace('i','j').apply(complex)  # np.complex was removed from NumPy; the builtin complex does the same
df1 = df.groupby(["X.1","X.2","X.5"])["X.8"].mean().reset_index()
df1['X.3'] = df["X.3"] #add extra columns
df1['X.4']=df["X.4"]
df1['X.6']=df["X.6"]
df1['X.7']=df["X.7"]
sorted_data = df1.reindex(sorted(df1.columns), axis=1)  # reindex_axis was removed in pandas 1.0
sorted_data.to_csv('final_sorted.dat', sep=',', header=False)
Everything works well, but in the output CSV file the complex numbers are wrapped in parentheses. I cannot use them this way, so I want to remove them.

Pandas could probably have better support for reading/writing complex numbers, but for the moment this will work.
In [25]: df = DataFrame([[1+2j],[2-1j]],columns=list('A'))
In [26]: df
Out[26]:
A
0 (1+2j)
1 (2-1j)
In [27]: df['A'] = df['A'].apply(str).str.replace(r'\(|\)', '', regex=True)
In [28]: df
Out[28]:
A
0 1+2j
1 2-1j
In [29]: df.to_csv('test.csv')
In [30]: !cat test.csv
,A
0,1+2j
1,2-1j
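
A variant of the same idea as a standalone sketch: because the parentheses only ever appear at the two ends of the string, str.strip is enough and avoids the regex. The names `cleaned` and `restored` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"A": [1 + 2j, 2 - 1j]})

# str() of a complex number is wrapped in parentheses, e.g. '(1+2j)';
# strip removes the leading '(' and trailing ')'.
cleaned = df["A"].apply(str).str.strip("()")
print(cleaned.tolist())

# The stripped strings can still be parsed back into complex numbers.
restored = cleaned.apply(complex)
print(restored.tolist())
```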
