I have a pandas dataframe
df = pd.DataFrame({'num_legs': [1, 34, 34, 104 , 6542, 6542 , 48383]})
I want to prepend a string to each row's value.
The string template is ZZ00000.
The catch is that each value must always be exactly 7 characters long in total,
so the desired output will be
df = num_legs
0 ZZ00001
1 ZZ00034
2 ZZ00034
3 ZZ00104
4 ZZ06542
5 ZZ06542
6 ZZ48383
As the column is of type int, I was thinking of converting it to str and then possibly using regex and some string manipulation to achieve my desired outcome.
Is there a more streamlined way possibly using a function with pandas?
Use
df['num_legs'] = "ZZ" + df['num_legs'].astype(str).str.rjust(5, "0")
You could use string concatenation here:
df["num_legs"] = 'ZZ' + ('00000' + df["num_legs"].astype(str)).str[-5:]
The idea here is that, given a num_legs integer value of say 6542, we first form the following string:
000006542
Then we retain the right 5 characters, leaving 06542.
You could also pad using the following:
'ZZ' + df['num_legs'].astype(str).str.pad(width=5, side='left', fillchar='0')
Here you pad your current number (converted to a string) on the left with zeros up to a width of 5 and concatenate that to your 'ZZ' string.
Use Python's str.zfill() through the pandas .str accessor:
df['num_legs'] = 'ZZ' + df['num_legs'].astype(str).str.zfill(5)
You could try this, using a regex and a list comprehension; for strings, a plain Python loop is often no slower than the pandas string methods:
import re
variable = "ZZ00000"
# replace the last len(num) digits of the template with the number itself
df["new_val"] = [re.sub(r"\d" + f"{{{len(num)}}}$", num, variable)
                 for num in df.num_legs.astype(str)]
df
   num_legs  new_val
0         1  ZZ00001
1        34  ZZ00034
2        34  ZZ00034
3       104  ZZ00104
4      6542  ZZ06542
5      6542  ZZ06542
6     48383  ZZ48383
out = []
for nl in df["num_legs"]:
    out.append(f'ZZ{nl:05d}')
The rest is up to your output manipulation
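As a quick cross-check, the padding styles used in the answers above (str.zfill, str.rjust and the 05d format spec) give the same result on plain Python values. This is only a minimal sketch on a plain list rather than a DataFrame, and the helper name tag is made up for illustration:

```python
nums = [1, 34, 34, 104, 6542, 6542, 48383]

def tag(n, prefix="ZZ", width=5):
    # hypothetical helper: zero-pad n to `width` digits and prepend the prefix
    return f"{prefix}{n:0{width}d}"

tagged = [tag(n) for n in nums]
print(tagged)

# the same padding expressed with the str methods from the other answers
assert tagged == ["ZZ" + str(n).zfill(5) for n in nums]
assert tagged == ["ZZ" + str(n).rjust(5, "0") for n in nums]
```

Note that none of these truncate: a number with more than 5 digits would produce a result longer than 7 characters.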
I ask a measurement device to give me some data. First it tells me how many bytes of data are in storage; it is always 14. Then it gives me the data, which I have to encode into hex. This is Python 2.7; I can't use newer versions. Lines 6 to 10 tell the device to give me the measured data.
Lines 12 to 14 do the encoding to hex. In other programs it works, but when I print result (line 14) I get a hex number with 13 bytes plus 1, which cannot be correct because it has an L at the end. I guess it is some long or whatever, and I don't need the last byte. But I think it changes the data too, which is picked out from line 15 onward, first in hex and then converted into int.
Is it possible that the L has an effect on the data or not?
How can I fix it?
1 ap.write(b"ML\0")
rmemb = ap.read(2)
print(rmemb)
rmemb = int(rmemb)+1
5 rmem = rmemb #must be and is 14 Bytes
addmem = ("MR:%s\0" % rmem)
# addmem = ("MR:14\0")
ap.write(addmem.encode())
10 time.sleep(1)
test = ap.read(rmem)
result = hex(int(test.encode('hex'), 16))
print(result)
15 ftflash = result[12:20]
ftbg = result[20:28]
print(ftflash)
print(ftbg)
ftflash = int(ftflash, 16)
20 # print(ftflash)
ftbg = int(ftbg, 16)
# print(ftbg)
OUTPUT:
14
0x11bd5084c0b000001ce00000093L
b000001c
e0000009
Python 2 has two built-in integer types, int and long. hex returns a string representing a Python hexadecimal literal, and in Python 2, that means that longs get an L at the end, to signify that it's a long.
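One way to sidestep the suffix is to build the hex string from the bytes instead of from an integer. The sketch below is written for Python 3, but binascii.hexlify and format(n, 'x') are also available in Python 2.7. The sample bytes are an assumption: they are reconstructed from the printed output in the question, with a leading zero nibble added to make 14 full bytes.

```python
import binascii

# Assumption: 14 sample bytes rebuilt from the output shown in the question,
# including the leading zero that hex(int(...)) would have dropped.
raw = binascii.unhexlify("011bd5084c0b000001ce00000093")

# hexlify works on the bytes directly: no "0x" prefix, no "L" suffix,
# and always exactly two hex digits per byte.
hex_digits = binascii.hexlify(raw).decode("ascii")
print(len(raw), hex_digits)

# Going through an integer drops leading zeros, so slice offsets can shift.
via_int = format(int(hex_digits, 16), "x")
print(via_int)
```

With the hexlify string, slice positions depend only on the byte layout, so the indices used to pick out fields would need to be recounted without the two-character 0x prefix.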
I would like to always have the same number of digits/bytes (8 bytes).
Example: my first number has 5 decimals after the decimal point: 95.12345, so 8 bytes.
If this number becomes 100.12345, I get 9 bytes. Is it possible to drop the last digit so that it always stays at 8 bytes, like this:
100.12345 ===> 100.1234
1000.1234 ===> 1000.123
Thanks for your help !
x = 95.12345
print(str(x)[:8])
95.12345
And, to avoid problems with strings that are too short, you might do:
x = 1.00
print("{:0>8s}".format(str(x)[:8]))
000001.0
d = 100.12345
print(d, str(d)[:8], sep=' => ')
100.12345 => 100.1234
But note that 1 digit equals 1 byte only in the string representation, not in the float itself.
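To make that distinction concrete, here is a small sketch comparing the truncated text with the binary size of the float. It assumes the value is stored as an IEEE 754 double, which is what struct's "d" format packs:

```python
import struct

for x in (95.12345, 100.12345, 1000.1234):
    text = str(x)[:8]              # at most 8 characters of text
    packed = struct.pack("<d", x)  # binary double: always 8 bytes
    print(text, len(text), len(packed))
```

The packed form is 8 bytes regardless of how many digits the number prints with, whereas the string length changes with the value.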
What would be the best way to convert a numerical column containing float AND unit as in :
df = pd.DataFrame(["211.301 MB","435.5 GB","345.234 Bytes"])
expected output in Bytes for example:
211.301*1024*1024 = 221565157.376
Many questions like this one:
Reusable library to get human readable version of file size?
show ways of doing the opposite: converting a number to human-readable form. How do I convert human-readable text to a float?
Is there a more efficient way than splitting :
spl = pd.DataFrame(dataf['Total_Image_File_Size'].str.split(' ',expand=True))
and then parsing the units column with multiple ifs?
Thanks
I think this one should work: https://pypi.python.org/pypi/humanfriendly
>>> import humanfriendly
>>> user_input = raw_input("Enter a readable file size: ")
Enter a readable file size: 16G
>>> num_bytes = humanfriendly.parse_size(user_input)
>>> print num_bytes
17179869184
>>> print "You entered:", humanfriendly.format_size(num_bytes)
You entered: 16 GB
You could create function to convert text to value and use apply
import pandas as pd
df = pd.DataFrame(["211.301 MB","435.5 GB","345.234 Bytes"])
def convert(text):
    # split "211.301 MB" into the number and the unit
    parts = text.split(' ')
    value = float(parts[0])
    if parts[1] == 'KB':
        value *= 1024
    elif parts[1] == 'MB':
        value *= 1024 * 1024
    elif parts[1] == 'GB':
        value *= 1024 * 1024 * 1024
    return value
df['value'] = df[0].apply(convert)
               0         value
0     211.301 MB  2.215652e+08
1       435.5 GB  4.676146e+11
2  345.234 Bytes  3.452340e+02
EDIT: you could use humanfriendly in this function instead of if/elif
Just another idea.
>>> for size in "211.301 MB", "435.5 GB", "345.234 Bytes":
        number, unit = size.split()
        print float(number) * 1024**'BKMGT'.index(unit[0])
221565157.376
4.67614564352e+11
345.234
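The unit lookup in the two answers above can also be written with a dictionary, which avoids the if/elif chain and fails loudly on an unknown unit. This is a minimal sketch; the names UNIT_FACTORS and to_bytes are made up for illustration, and it assumes binary multiples (1 KB = 1024 bytes) as in the question:

```python
# Assumption: binary multiples, matching 211.301*1024*1024 in the question.
UNIT_FACTORS = {
    "BYTES": 1,
    "B": 1,
    "KB": 1024,
    "MB": 1024 ** 2,
    "GB": 1024 ** 3,
    "TB": 1024 ** 4,
}

def to_bytes(text):
    """Convert a string such as '211.301 MB' to a float number of bytes."""
    number, unit = text.split()
    # an unrecognised unit raises KeyError instead of passing through silently
    return float(number) * UNIT_FACTORS[unit.upper()]

for size in ("211.301 MB", "435.5 GB", "345.234 Bytes"):
    print(size, "->", to_bytes(size))
```

On a DataFrame column this could be applied with something like df[0].map(to_bytes).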
I'm trying to create a pyramid that looks like the one below (numberPyramid(6)), where the pyramid isn't made of numbers but is actually a blank space with the numbers around it. The function takes a parameter called "num", which is the number of rows in the pyramid. How would I go about doing this? I need to use a for loop but I'm not sure how to implement it. Thanks!
666666666666
55555  55555
4444    4444
333      333
22        22
1          1
def pyramid(num_rows, block=' ', left='', right=''):
    for idx in range(num_rows):
        # the fill character is the row number as a single hex digit
        print('{py_layer:{num_fill}{align}{width}}'.format(
            py_layer='{left}{blocks}{right}'.format(
                left=left,
                blocks=block * (idx*2),
                right=right),
            num_fill=format((num_rows - idx) % 16, 'x'),
            align='^',
            width=num_rows * 2))
This works by using Python's string format method in an interesting way: the spaces are the string to be printed, and the row number is used as the fill character for the rest of the row.
Using the built-in format() function to produce the hex digit without the leading 0x lets you build pyramids of up to 15 rows.
Sample:
In [45]: pyramid(9)
999999999999999999
88888888  88888888
7777777    7777777
666666      666666
55555        55555
4444          4444
333            333
22              22
1                1
Other pyramid "blocks" could be interesting:
In [52]: pyramid(9, '_')
999999999999999999
88888888__88888888
7777777____7777777
666666______666666
55555________55555
4444__________4444
333____________333
22______________22
1________________1
With the added left and right options and showing hex support:
In [57]: pyramid(15, '_', '/', '\\')
ffffffffffffff/\ffffffffffffff
eeeeeeeeeeeee/__\eeeeeeeeeeeee
dddddddddddd/____\dddddddddddd
ccccccccccc/______\ccccccccccc
bbbbbbbbbb/________\bbbbbbbbbb
aaaaaaaaa/__________\aaaaaaaaa
99999999/____________\99999999
8888888/______________\8888888
777777/________________\777777
66666/__________________\66666
5555/____________________\5555
444/______________________\444
33/________________________\33
2/__________________________\2
/____________________________\
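The fill-character trick this answer relies on can be seen in isolation. In a format spec, the character before the alignment symbol is the fill, so centring a run of spaces with a digit as fill produces one pyramid row. A minimal sketch with the values for one row hard-coded:

```python
# fill character '8', centre alignment '^', total width 18
row = "{:8^18}".format("  ")
print(row)

# the same spec assembled from parts, as the nested format call does
fill, width = format(8, "x"), 18
same = "{text:{fill}^{width}}".format(text="  ", fill=fill, width=width)
assert row == same
```

Because the fill must be a single character, the row number is reduced to one hex digit, which is where the 15-row limit comes from.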
First the code:
max_depth = int(raw_input("Enter max depth of pyramid (2 - 9): "))
for i in range(max_depth, 0, -1):
    print str(i)*i + " "*((max_depth-i)*2) + str(i)*i
Output:
(numpyramid)macbook:numpyramid joeyoung$ python numpyramid.py
Enter max depth of pyramid (2 - 9): 6
666666666666
55555  55555
4444    4444
333      333
22        22
1          1
How this works:
Python has a built-in function named range() which can help you build the iterator for your for-loop. You can make it decrement instead of increment by passing in -1 as the 3rd argument.
Our for loop will start at the user supplied max_depth (6 for our example) and i will decrement by 1 for each iteration of the loop.
Now the output line should do the following:
Print out the current iterator number (i) and repeat it i times.
Figure out how much white space to add in the middle.
This will be the max_depth minus the current iterator number, multiplied by 2, because the gap grows by two characters with each iteration.
Attach the whitespace to the first set of repeated numbers.
Attach a second set of repeated numbers: the current iterator number (i) repeated i times.
When you print characters, they can be repeated by following the string with an asterisk * and the number of times you want it repeated.
For example:
>>> # Repeats the character 'A' 5 times
... print "A"*5
AAAAA
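The steps above can also be wrapped in a function that takes the number of rows, as the question asks. This is a sketch in Python 3 syntax; the name number_pyramid is made up, and it returns the rows rather than printing them so that they can be checked:

```python
def number_pyramid(num):
    """Return the rows of the hollow number pyramid as a list of strings."""
    rows = []
    for i in range(num, 0, -1):
        side = str(i) * i            # the digit i repeated i times
        gap = " " * ((num - i) * 2)  # the gap grows by two per row
        rows.append(side + gap + side)
    return rows

print("\n".join(number_pyramid(6)))
```

Rows stay aligned only while num is a single digit, since str(i) is two characters wide from 10 upward.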
I need to save a tuple of 4 numbers inside a column that only accepts numbers (int or floats)
I have a list of 4 numbers like -0.0123445552, -29394.2393339, 0.299393333, 0.00002345556.
How can I "store" all these numbers inside a number and be able to retrieve the original tuple in Python?
Thanks
Following up on @YevgenYampolskiy's idea of using numpy:
You could use numpy to convert the numbers to 16-bit floats, and then view the array as one 64-bit int:
import numpy as np
data = np.array((-0.0123445552, -29394.2393339, 0.299393333, 0.00002345556))
stored_int = data.astype('float16').view('int64')[0]
print(stored_int)
# 110959187158999634
recovered = np.array([stored_int], dtype='int64').view('float16')
print(recovered)
# [ -1.23443604e-02 -2.93920000e+04 2.99316406e-01 2.34842300e-05]
Note: This requires numpy version 1.6 or better, as this was the first version to support 16-bit floats.
If by int you mean the datatype int in Python (which is unlimited as of the current version), you may use the following solution
>>> x
(-0.0123445552, -29394.2393339, 0.299393333, 2.345556e-05)
>>> def encode(data):
        sz_data = str(data)
        import base64
        b64_data = base64.b16encode(sz_data)
        int_data = int(b64_data, 16)
        return int_data
>>> encode(x)
7475673073900173755504583442986834619410853148159171975880377161427327210207077083318036472388282266880288275998775936614297529315947984169L
>>> def decode(data):
        int_data = data
        import base64
        hex_data = hex(int_data)[2:].upper()
        if hex_data[-1] == 'L':
            hex_data = hex_data[:-1]
        b64_data = base64.b16decode(hex_data)
        import ast
        sz_data = ast.literal_eval(b64_data)
        return sz_data
>>> decode(encode(x))
(-0.0123445552, -29394.2393339, 0.299393333, 2.345556e-05)
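For reference, the same idea can be sketched in Python 3, where int.from_bytes and int.to_bytes replace the hex detour and there is no L suffix to strip. This is an adaptation rather than the code above; it uses repr() instead of str(), so the resulting integer may differ from the one shown:

```python
import ast

def encode(data):
    # the text form of the tuple, read as one big-endian integer
    return int.from_bytes(repr(data).encode("ascii"), "big")

def decode(number):
    # recover the byte length from the integer's bit length
    size = (number.bit_length() + 7) // 8
    text = number.to_bytes(size, "big").decode("ascii")
    return ast.literal_eval(text)

x = (-0.0123445552, -29394.2393339, 0.299393333, 0.00002345556)
n = encode(x)
print(n)
assert decode(n) == x
```

The integer is far larger than 64 bits, so this only helps if the target column really accepts arbitrary-precision integers.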
You can combine 4 integers into a single integer, or two floats into a double using struct module:
from struct import *
s = pack('hhhh', 1, -2, 3,-4)
i = unpack('Q', s)
print i
print unpack('hhhh', pack('Q', i[0]))
s = pack('ff', 1.12, -2.32)
f = unpack('d', s)
print f
print unpack('ff', pack('d', f[0]))
prints
(18445618190982447105L,)
(1, -2, 3, -4)
(-5.119999879002571,)
(1.1200000047683716, -2.319999933242798)
Basically in this example tuple (1,-2,3,-4) gets packed into an integer 18445618190982447105, and tuple ( 1.12, -2.32) gets packed into -5.119999879002571
To pack 4 floats into a single float you will need to use half-floats; however, this is a problem here:
With half-floats, it looks like there is no native support in Python as of now:
http://bugs.python.org/issue11734
However, the numpy module does have some support for half-floats (http://docs.scipy.org/doc/numpy/user/basics.types.html). Maybe you can use it somehow to pack 4 floats into a single float.
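Since Python 3.6 the struct module has an "e" format code for half-precision floats, so the half-float idea can be sketched without numpy. The example below packs the four values from the question into 8 bytes and reads them back as one unsigned 64-bit integer; the explicit little-endian byte order is a choice made here, not something from the original answer:

```python
import struct

values = (-0.0123445552, -29394.2393339, 0.299393333, 0.00002345556)

# four 16-bit half floats occupy 8 bytes, the size of one unsigned 64-bit int
packed = struct.pack("<4e", *values)
(stored_int,) = struct.unpack("<Q", packed)
print(stored_int)

# reverse the steps to get the four (rounded) values back
recovered = struct.unpack("<4e", struct.pack("<Q", stored_int))
print(recovered)  # close to the inputs, but only to about 3 significant digits
```

The round trip is lossy: half precision keeps roughly three significant decimal digits and overflows above 65504, so this only suits data that tolerates that rounding.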
This does not really answer your question, but what you're trying to do violates 1NF. Is changing the DB schema to introduce an intersection table really not an option?
my idea is weird; but will it work??
In [31]: nk="-0.0123445552, -29394.2393339, 0.299393333, 0.00002345556"
In [32]: nk1="".join(str(ord(x)) for x in nk)
In [33]: nk1
Out[33]: '454846484950515252535353504432455057515752465051575151515744324846505757515751515151443248464848484850515253535354'
In [34]: import math
In [35]: math.log(long(nk1), 1000)
Out[36]: 37.885954947611985
In [37]: math.pow(1000,_)
Out[37]: 4.548464849505043e+113
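You can easily unpack this string (Out[33]); for example, split it at 32, which is the code for a space.
This string is also very long; math.log turns it into a small number, as in Out[36]. Note, though, that a float keeps only about 16 significant digits, so Out[37] only approximates the original integer and the digits cannot be recovered exactly from the logarithm.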