Collect for a timestamp column - python

When I execute the following statement:
spark.sql("SELECT CAST('0001-01-01' AS TIMESTAMP)").show()
I get:
+-----------------------------+
|CAST(0001-01-01 AS TIMESTAMP)|
+-----------------------------+
|          0001-01-01 00:00:00|
+-----------------------------+
but when I use spark.sql("SELECT CAST('0001-01-01' AS TIMESTAMP)").collect() I get the following error:
Fail to execute line 1: spark.sql("SELECT CAST('0001-01-01' AS TIMESTAMP)").collect()
Traceback (most recent call last):
File "/tmp/zeppelin_pyspark-6127737743421449115.py", line 380, in <module>
exec(code, _zcUserQueryNameSpace)
File "<stdin>", line 1, in <module>
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 535, in collect
return list(_load_from_socket(sock_info, BatchedSerializer(PickleSerializer())))
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 147, in load_stream
yield self._read_with_length(stream)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 172, in _read_with_length
return self.loads(obj)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 580, in loads
return pickle.loads(obj, encoding=encoding)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 1396, in <lambda>
return lambda *a: dataType.fromInternal(a)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 633, in fromInternal
for f, v, c in zip(self.fields, obj, self._needConversion)]
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 633, in <listcomp>
for f, v, c in zip(self.fields, obj, self._needConversion)]
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 445, in fromInternal
return self.dataType.fromInternal(obj)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/types.py", line 199, in fromInternal
return datetime.datetime.fromtimestamp(ts // 1000000).replace(microsecond=ts % 1000000)
ValueError: year 0 is out of range

In your local timezone, 0001-01-01 gets shifted back to 0000-12-31 when collect() converts the internal value with datetime.datetime.fromtimestamp(), and a year of 0 is invalid for the internal ymd_to_ord() function in the datetime module. See the reference.
This does not happen in the Scala version of Spark.
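If you only need the value on the Python side, one workaround (a sketch, not the only option) is to cast the timestamp to a string inside Spark, so the driver never has to build a Python datetime for year 1 in your local timezone:
# Workaround sketch: keep the conversion on the JVM side and collect a string.
rows = spark.sql("SELECT CAST(CAST('0001-01-01' AS TIMESTAMP) AS STRING) AS ts").collect()
print(rows[0].ts)  # '0001-01-01 00:00:00'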

concat Error using join, MultiIndex and tuples

How can I write this command differently?
s1 = df.set_index('COTA').groupby(['TradingDesk', 'DATA4'])['DATA5'].idxmax()
s2 = s1.reset_index(level=0).groupby(['TradingDesk'])['DATA5'].shift(freq='BM')
df = df.join(pd.concat([s1.rename('COTA_LastDay'), s2.rename('COTA_LastDayPrevMonth')], axis=1), on=['TradingDesk', 'DATA4'])
Written this way, it gives the following error:
File "<stdin>", line 1, in <module>
File "C:\Users\eric.santos.INFINITY\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\util\_decorators.py", line 331, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\eric.santos.INFINITY\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\reshape\concat.py", line 368, in concat
op = _Concatenator(
^^^^^^^^^^^^^^
File "C:\Users\eric.santos.INFINITY\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\reshape\concat.py", line 563, in __init__
self.new_axes = self._get_new_axes()
^^^^^^^^^^^^^^^^^^^^
File "C:\Users\eric.santos.INFINITY\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\reshape\concat.py", line 633, in _get_new_axes
return [
^
File "C:\Users\eric.santos.INFINITY\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\reshape\concat.py", line 634, in <listcomp>
self._get_concat_axis if i == self.bm_axis else self._get_comb_axis(i)
^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\eric.santos.INFINITY\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\reshape\concat.py", line 640, in _get_comb_axis
return get_objs_combined_axis(
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\eric.santos.INFINITY\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\api.py", line 105, in get_objs_combined_axis
return _get_combined_index(obs_idxes, intersect=intersect, sort=sort, copy=copy)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\eric.santos.INFINITY\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\api.py", line 158, in _get_combined_index
index = union_indexes(indexes, sort=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\eric.santos.INFINITY\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\api.py", line 310, in union_indexes
result = result.union(other, sort=None if sort else False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\eric.santos.INFINITY\AppData\Local\Programs\Python\Python311\Lib\site-packages\pandas\core\indexes\base.py", line 3336, in union
raise NotImplementedError(
NotImplementedError: Can only union MultiIndex with MultiIndex or Index of tuples, try mi.to_flat_index().union(other) instead.
How do I solve this?
When I run this code in JupyterLab it works fine, but when I run it from cmd it gives this error.
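For what it's worth, this error usually means the two objects passed to pd.concat do not share the same index type: s1 carries a (TradingDesk, DATA4) MultiIndex, while s2, after reset_index and shift, ends up with a plain index, and recent pandas versions refuse to union the two (a pandas version difference would also explain why it works in JupyterLab but not from cmd). A minimal, self-contained sketch of the cause and one way around it, using made-up data rather than your DataFrame:
import pandas as pd

# Two Series with incompatible index types reproduce the NotImplementedError:
a = pd.Series([1, 2], index=pd.MultiIndex.from_tuples(
    [('x', 1), ('y', 2)], names=['TradingDesk', 'DATA4']))
b = pd.Series([3, 4], index=pd.Index([1, 2], name='DATA4'))
# pd.concat([a, b], axis=1)  # NotImplementedError: Can only union MultiIndex ...

# Lifting b onto the same MultiIndex lets concat align the two Series:
b.index = pd.MultiIndex.from_product([['x'], b.index], names=['TradingDesk', 'DATA4'])
print(pd.concat([a, b], axis=1))
So the fix would be to rebuild the (TradingDesk, DATA4) MultiIndex on s2 before the pd.concat call.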

Sending raw transaction from web3py: TypeError: <lambda>() missing 4 required positional arguments: 'hash', 'r', 's', and 'v'

I am trying to send a raw transaction with web3.py using this code:
t = w3.eth.account.sign_transaction(test_contract.functions.edit("test").buildTransaction(
    {
        "nonce": w3.eth.get_transaction_count(w3.eth.default_account)
    }
), pkey)
w3.eth.send_raw_transaction(t)
But when Python reaches the last line, I get this error in the console:
Traceback (most recent call last):
File "***/main.py", line 64, in <module>
w3.eth.send_raw_transaction(t)
File "***/venv/lib/python3.9/site-packages/web3/module.py", line 53, in caller
(method_str, params), response_formatters = method.process_params(module, *args, **kwargs) # noqa: E501
File "***/venv/lib/python3.9/site-packages/web3/method.py", line 194, in process_params
_apply_request_formatters(params, self.request_formatters(method)))
File "***/venv/lib/python3.9/site-packages/eth_utils/functional.py", line 45, in inner
return callback(fn(*args, **kwargs))
File "***/venv/lib/python3.9/site-packages/web3/method.py", line 50, in _apply_request_formatters
formatted_params = pipe(params, request_formatters)
File "cytoolz/functoolz.pyx", line 667, in cytoolz.functoolz.pipe
File "cytoolz/functoolz.pyx", line 642, in cytoolz.functoolz.c_pipe
File "cytoolz/functoolz.pyx", line 254, in cytoolz.functoolz.curry.__call__
File "cytoolz/functoolz.pyx", line 250, in cytoolz.functoolz.curry.__call__
File "***/venv/lib/python3.9/site-packages/web3/_utils/abi.py", line 799, in map_abi_data
return pipe(data, *pipeline)
File "cytoolz/functoolz.pyx", line 667, in cytoolz.functoolz.pipe
File "cytoolz/functoolz.pyx", line 642, in cytoolz.functoolz.c_pipe
File "cytoolz/functoolz.pyx", line 254, in cytoolz.functoolz.curry.__call__
File "cytoolz/functoolz.pyx", line 250, in cytoolz.functoolz.curry.__call__
File "***/venv/lib/python3.9/site-packages/web3/_utils/abi.py", line 833, in data_tree_map
return recursive_map(map_to_typed_data, data_tree)
File "***/venv/lib/python3.9/site-packages/web3/_utils/decorators.py", line 30, in wrapped
wrapped_val = to_wrap(*args)
File "***/venv/lib/python3.9/site-packages/web3/_utils/formatters.py", line 89, in recursive_map
items_mapped = map_collection(recurse, data)
File "***/venv/lib/python3.9/site-packages/web3/_utils/formatters.py", line 76, in map_collection
return datatype(map(func, collection))
File "***/venv/lib/python3.9/site-packages/web3/_utils/formatters.py", line 88, in recurse
return recursive_map(func, item)
File "***/venv/lib/python3.9/site-packages/web3/_utils/decorators.py", line 30, in wrapped
wrapped_val = to_wrap(*args)
File "***/venv/lib/python3.9/site-packages/web3/_utils/formatters.py", line 89, in recursive_map
items_mapped = map_collection(recurse, data)
File "***/venv/lib/python3.9/site-packages/web3/_utils/formatters.py", line 76, in map_collection
return datatype(map(func, collection))
File "***/venv/lib/python3.9/site-packages/web3/_utils/abi.py", line 855, in __new__
return super().__new__(cls, *iterable)
File "***/venv/lib/python3.9/site-packages/web3/_utils/formatters.py", line 88, in recurse
return recursive_map(func, item)
File "***/venv/lib/python3.9/site-packages/web3/_utils/decorators.py", line 30, in wrapped
wrapped_val = to_wrap(*args)
File "***/venv/lib/python3.9/site-packages/web3/_utils/formatters.py", line 89, in recursive_map
items_mapped = map_collection(recurse, data)
File "***/venv/lib/python3.9/site-packages/web3/_utils/formatters.py", line 76, in map_collection
return datatype(map(func, collection))
TypeError: <lambda>() missing 4 required positional arguments: 'hash', 'r', 's', and 'v'
I am using an Infura custom node, which is why I can't send the transaction with contract.functions.method.transact(). I don't know what to do with this error; I have spent a lot of time reading the docs and got nothing.
How can I fix that?
sign_transaction returns a SignedTransaction object, while send_raw_transaction accepts raw transaction bytes. So change your last line to:
w3.eth.send_raw_transaction(t.rawTransaction)
You also probably want to save the result to a variable so you can track the transaction later.
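Putting it together, a sketch of the corrected flow (same names as in the question):
signed = w3.eth.account.sign_transaction(
    test_contract.functions.edit("test").buildTransaction(
        {"nonce": w3.eth.get_transaction_count(w3.eth.default_account)}
    ),
    pkey,
)
tx_hash = w3.eth.send_raw_transaction(signed.rawTransaction)  # raw bytes, not the SignedTransaction object
receipt = w3.eth.wait_for_transaction_receipt(tx_hash)        # use the saved hash to track the transaction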
You need to sign the transaction before sending it, using an account that has an ETH balance.
You can do this with the signing middleware:
>>> from web3 import Web3, EthereumTesterProvider
>>> w3 = Web3(EthereumTesterProvider())
>>> from web3.middleware import construct_sign_and_send_raw_middleware
>>> from eth_account import Account
>>> acct = Account.create('KEYSMASH FJAFJKLDSKF7JKFDJ 1530')
>>> w3.middleware_onion.add(construct_sign_and_send_raw_middleware(acct))
>>> w3.eth.default_account = acct.address
# Now you can send a tx from acct.address without having to build and sign each raw transaction
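For example (a sketch, reusing the contract object from the question), the contract call can then be sent directly:
>>> tx_hash = test_contract.functions.edit("test").transact()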

Issue TypeError: argument must be a string or number

There is only one categorical column that I want to encode. It works fine in the notebook, but when it is uploaded to the AIcrowd platform it produces this error.
There are three categorical features in total: one is the target feature, one is the column of row ids, and after excluding those from training I am left with one feature.
df[['intersection_pos_rel_centre']]
from sklearn.preprocessing import LabelEncoder
le=LabelEncoder()
df[['intersection_pos_rel_centre']]=le.fit_transform(df[['intersection_pos_rel_centre']])
df[['intersection_pos_rel_centre']]
My error is
Selecting runtime language: python
[NbConvertApp] Converting notebook predict.ipynb to notebook
[NbConvertApp] Executing notebook with kernel: python
Traceback (most recent call last):
File "/opt/conda/bin/jupyter-nbconvert", line 11, in <module>
sys.exit(main())
File "/opt/conda/lib/python3.8/site-packages/jupyter_core/application.py", line 254, in launch_instance
return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/traitlets/config/application.py", line 845, in launch_instance
app.start()
File "/opt/conda/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 350, in start
self.convert_notebooks()
File "/opt/conda/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 524, in convert_notebooks
self.convert_single_notebook(notebook_filename)
File "/opt/conda/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 489, in convert_single_notebook
output, resources = self.export_single_notebook(notebook_filename, resources, input_buffer=input_buffer)
File "/opt/conda/lib/python3.8/site-packages/nbconvert/nbconvertapp.py", line 418, in export_single_notebook
output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
File "/opt/conda/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 181, in from_filename
return self.from_file(f, resources=resources, **kw)
File "/opt/conda/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 199, in from_file
return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
File "/opt/conda/lib/python3.8/site-packages/nbconvert/exporters/notebook.py", line 32, in from_notebook_node
nb_copy, resources = super().from_notebook_node(nb, resources, **kw)
File "/opt/conda/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 143, in from_notebook_node
nb_copy, resources = self._preprocess(nb_copy, resources)
File "/opt/conda/lib/python3.8/site-packages/nbconvert/exporters/exporter.py", line 318, in _preprocess
nbc, resc = preprocessor(nbc, resc)
File "/opt/conda/lib/python3.8/site-packages/nbconvert/preprocessors/base.py", line 47, in __call__
return self.preprocess(nb, resources)
File "/opt/conda/lib/python3.8/site-packages/nbconvert/preprocessors/execute.py", line 79, in preprocess
self.execute()
File "/opt/conda/lib/python3.8/site-packages/nbclient/util.py", line 74, in wrapped
return just_run(coro(*args, **kwargs))
File "/opt/conda/lib/python3.8/site-packages/nbclient/util.py", line 53, in just_run
return loop.run_until_complete(coro)
File "/opt/conda/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
File "/opt/conda/lib/python3.8/site-packages/nbclient/client.py", line 553, in async_execute
await self.async_execute_cell(
File "/opt/conda/lib/python3.8/site-packages/nbconvert/preprocessors/execute.py", line 123, in async_execute_cell
cell, resources = self.preprocess_cell(cell, self.resources, cell_index)
File "/opt/conda/lib/python3.8/site-packages/nbconvert/preprocessors/execute.py", line 146, in preprocess_cell
cell = run_sync(NotebookClient.async_execute_cell)(self, cell, index, store_history=self.store_history)
File "/opt/conda/lib/python3.8/site-packages/nbclient/util.py", line 74, in wrapped
return just_run(coro(*args, **kwargs))
File "/opt/conda/lib/python3.8/site-packages/nbclient/util.py", line 53, in just_run
return loop.run_until_complete(coro)
File "/opt/conda/lib/python3.8/site-packages/nest_asyncio.py", line 98, in run_until_complete
return f.result()
File "/opt/conda/lib/python3.8/asyncio/futures.py", line 178, in result
raise self._exception
File "/opt/conda/lib/python3.8/asyncio/tasks.py", line 280, in __step
result = coro.send(None)
File "/opt/conda/lib/python3.8/site-packages/nbclient/client.py", line 852, in async_execute_cell
self._check_raise_for_error(cell, exec_reply)
File "/opt/conda/lib/python3.8/site-packages/nbclient/client.py", line 760, in _check_raise_for_error
raise CellExecutionError.from_cell_and_msg(cell, exec_reply_content)
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:
------------------
df[['intersection_pos_rel_centre']]
from sklearn.preprocessing import LabelEncoder
le=LabelEncoder()
df[['intersection_pos_rel_centre']]=le.fit_transform(df[['intersection_pos_rel_centre']])
df[['intersection_pos_rel_centre']]
------------------
TypeError: argument must be a string or number
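The usual cause of this TypeError from LabelEncoder is missing values (NaN, which are floats) mixed into the string column, present in the platform's data but not in the local notebook. A hedged sketch of one way around it, assuming that is what is happening here, is to fill the gaps and pass a 1-D Series rather than a one-column DataFrame:
from sklearn.preprocessing import LabelEncoder

# Assumption: NaN (float) values are mixed into the otherwise string-valued column.
col = df['intersection_pos_rel_centre'].fillna('missing').astype(str)
le = LabelEncoder()
df['intersection_pos_rel_centre'] = le.fit_transform(col)  # fit_transform expects a 1-D array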

Pandas error when using resample and groupby to apply function

I am new to Python. I used to code in R and work with the period_apply function, so I tried the approaches below in Python.
First, I do not understand what the errors are trying to tell me.
Second, I do not understand why I only get errors with groupby if I include the first row of the data, yet with resample I get an error whether I include the first row or not.
Third, how do I resolve this problem? Please do not tell me to skip the first row, because I work with a much, much bigger dataset.
Data
Best_Bid Best_Ask
Timestamp
2019-05-02 11:59:59.602 29636.0 29638.0
2019-05-02 12:59:00.033 NaN NaN
2019-05-02 12:59:00.033 NaN NaN
2019-05-02 12:59:00.033 NaN NaN
2019-05-02 12:59:00.033 NaN NaN
2019-05-02 12:59:00.033 NaN NaN
2019-05-02 12:59:00.033 NaN NaN
{'Best_Bid': {Timestamp('2019-05-02 11:59:59.602000'): 29636.0,
Timestamp('2019-05-02 12:59:00.033000'): nan},
'Best_Bid_Q': {Timestamp('2019-05-02 11:59:59.602000'): 4.0,
Timestamp('2019-05-02 12:59:00.033000'): nan},
'Best_Ask': {Timestamp('2019-05-02 11:59:59.602000'): 29638.0,
Timestamp('2019-05-02 12:59:00.033000'): nan}}
And I am trying to apply the function below (I know I could have just done .agg({'Best_Bid': ['last']}), but this is a simplified version of my original code).
Function
def func(x):
    best_bid = (x['Best_Bid'])[-1]
    best_ask = (x['Best_Ask'])[-1]
    return pd.Series([best_bid, best_ask], index=['bbbid', 'aaask'])
groupby and grouper
If I skip the first row and run the following, things work fine.
df.iloc[1:,:].groupby(pd.Grouper(freq='180S',closed='right',label='right',base=-0.0001)).apply(func)
bbbid aaask
Timestamp
2019-05-02 12:59:59.999899904 NaN NaN
However, if I include the first row, I get the following error.
df.groupby(pd.Grouper(freq='180S',closed='right',label='right',base=-0.0001)).apply(func)
Traceback (most recent call last):
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\base.py", line 4405, in get_value
return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
File "pandas\_libs\index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 471, in pandas._libs.index.DatetimeEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 997, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1004, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: -1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-191-85550b07b869>", line 1, in <module>
df.iloc[371448:371455,0:3].groupby(pd.Grouper(freq='180S',closed='right',label='right',base=-0.0001)).apply(func)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\groupby\groupby.py", line 735, in apply
result = self._python_apply_general(f)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\groupby\groupby.py", line 751, in _python_apply_general
keys, values, mutated = self.grouper.apply(f, self._selected_obj, self.axis)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\groupby\ops.py", line 206, in apply
res = f(group)
File "<ipython-input-104-c57c7e2b6885>", line 2, in func
best_bid = (x['Best_Bid'])[-1]
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\series.py", line 871, in __getitem__
result = self.index.get_value(self, key)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\datetimes.py", line 651, in get_value
value = Index.get_value(self, series, key)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\base.py", line 4411, in get_value
return libindex.get_value_at(s, key)
File "pandas\_libs\index.pyx", line 44, in pandas._libs.index.get_value_at
File "pandas\_libs\index.pyx", line 45, in pandas._libs.index.get_value_at
File "pandas\_libs\util.pxd", line 98, in pandas._libs.util.get_value_at
File "pandas\_libs\util.pxd", line 89, in pandas._libs.util.validate_indexer
IndexError: index out of bounds
Traceback (most recent call last):
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\base.py", line 4405, in get_value
return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
File "pandas\_libs\index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 471, in pandas._libs.index.DatetimeEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 997, in pandas._libs.hashtable.Int64HashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1004, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: -1
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-191-85550b07b869>", line 1, in <module>
df.iloc[371448:371455,0:3].groupby(pd.Grouper(freq='180S',closed='right',label='right',base=-0.0001)).apply(func)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\groupby\groupby.py", line 735, in apply
result = self._python_apply_general(f)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\groupby\groupby.py", line 751, in _python_apply_general
keys, values, mutated = self.grouper.apply(f, self._selected_obj, self.axis)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\groupby\ops.py", line 206, in apply
res = f(group)
File "<ipython-input-104-c57c7e2b6885>", line 2, in func
best_bid = (x['Best_Bid'])[-1]
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\series.py", line 871, in __getitem__
result = self.index.get_value(self, key)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\datetimes.py", line 651, in get_value
value = Index.get_value(self, series, key)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\base.py", line 4411, in get_value
return libindex.get_value_at(s, key)
File "pandas\_libs\index.pyx", line 44, in pandas._libs.index.get_value_at
File "pandas\_libs\index.pyx", line 45, in pandas._libs.index.get_value_at
File "pandas\_libs\util.pxd", line 98, in pandas._libs.util.get_value_at
File "pandas\_libs\util.pxd", line 89, in pandas._libs.util.validate_indexer
IndexError: index out of bounds
Resample
I get the following error regardless of whether I include the first row or not.
df.resample(rule='180S',closed='right',label='right',base=-0.0001).agg(func)
Traceback (most recent call last):
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\base.py", line 4411, in get_value
return libindex.get_value_at(s, key)
File "pandas\_libs\index.pyx", line 44, in pandas._libs.index.get_value_at
File "pandas\_libs\index.pyx", line 45, in pandas._libs.index.get_value_at
File "pandas\_libs\util.pxd", line 98, in pandas._libs.util.get_value_at
File "pandas\_libs\util.pxd", line 83, in pandas._libs.util.validate_indexer
TypeError: 'str' object cannot be interpreted as an integer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\datetimes.py", line 651, in get_value
value = Index.get_value(self, series, key)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\base.py", line 4419, in get_value
raise e1
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\base.py", line 4405, in get_value
return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
File "pandas\_libs\index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 90, in pandas._libs.index.IndexEngine.get_value
File "pandas\_libs\index.pyx", line 473, in pandas._libs.index.DatetimeEngine.get_loc
File "pandas\_libs\index.pyx", line 479, in pandas._libs.index.DatetimeEngine._date_check_type
KeyError: 'Best_Bid'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "pandas\_libs\tslibs\conversion.pyx", line 520, in pandas._libs.tslibs.conversion.convert_str_to_tsobject
File "pandas\_libs\tslibs\parsing.pyx", line 228, in pandas._libs.tslibs.parsing.parse_datetime_string
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\dateutil\parser\_parser.py", line 1374, in parse
return DEFAULTPARSER.parse(timestr, **kwargs)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\dateutil\parser\_parser.py", line 649, in parse
raise ParserError("Unknown string format: %s", timestr)
dateutil.parser._parser.ParserError: Unknown string format: Best_Bid
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\datetimes.py", line 660, in get_value
return self.get_value_maybe_box(series, key)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\datetimes.py", line 675, in get_value_maybe_box
key = Timestamp(key)
File "pandas\_libs\tslibs\timestamps.pyx", line 418, in pandas._libs.tslibs.timestamps.Timestamp.__new__
File "pandas\_libs\tslibs\conversion.pyx", line 292, in pandas._libs.tslibs.conversion.convert_to_tsobject
File "pandas\_libs\tslibs\conversion.pyx", line 523, in pandas._libs.tslibs.conversion.convert_str_to_tsobject
ValueError: could not convert string to Timestamp
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\IPython\core\interactiveshell.py", line 3331, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-190-d2caa0c5152a>", line 1, in <module>
df.iloc[371448:371455,0:3].resample(rule='180S',closed='right',label='right',base=-0.0001).agg(func)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\resample.py", line 285, in aggregate
result = self._groupby_and_aggregate(how, grouper, *args, **kwargs)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\resample.py", line 359, in _groupby_and_aggregate
result = grouped._aggregate_item_by_item(how, *args, **kwargs)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\groupby\generic.py", line 1172, in _aggregate_item_by_item
result[item] = colg.aggregate(func, *args, **kwargs)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\groupby\generic.py", line 269, in aggregate
result = self._aggregate_named(func, *args, **kwargs)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\groupby\generic.py", line 452, in _aggregate_named
output = func(group, *args, **kwargs)
File "<ipython-input-104-c57c7e2b6885>", line 2, in func
best_bid = (x['Best_Bid'])[-1]
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\series.py", line 871, in __getitem__
result = self.index.get_value(self, key)
File "C:\Users\testUser\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\indexes\datetimes.py", line 662, in get_value
raise KeyError(key)
KeyError: 'Best_Bid'
From what I can tell, the error comes from (x['Best_Bid'])[-1]: on a datetime-indexed Series, [-1] is first treated as a label lookup, which raises KeyError: -1, and the positional fallback then fails once a 180-second bin is empty, which is what happens when the first row (and the long gap after it) is included.
I don't have your dataset in front of me to work with, but I would try this code to see if it works.
gdf = df.groupby(pd.Grouper(freq='180S', closed='right', label='right', base=-0.0001))
# last() returns the last non-null value in each 180-second bin
print(gdf['Best_Bid'].last())
print(gdf['Best_Ask'].last())
This can certainly be refined further, but it should work for all rows, and it will be much faster than the .apply approach on a large dataset.
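Judging from the tracebacks, the resample variant fails for a related reason: .agg with a custom function is applied column by column, so x is a single Series and x['Best_Bid'] becomes a string label lookup on the DatetimeIndex, hence the KeyError: 'Best_Bid'. If you want to keep the apply-style function, here is a hedged sketch that uses positional indexing and guards against the empty 180-second bins created by the gap after the first row:
import numpy as np
import pandas as pd

def func(x):
    # x is the sub-DataFrame for one 180S bin; it can be empty when the bin
    # falls inside the gap between the first and second timestamps.
    if len(x) == 0:
        return pd.Series([np.nan, np.nan], index=['bbbid', 'aaask'])
    return pd.Series([x['Best_Bid'].iloc[-1], x['Best_Ask'].iloc[-1]],
                     index=['bbbid', 'aaask'])

df.groupby(pd.Grouper(freq='180S', closed='right', label='right', base=-0.0001)).apply(func)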

Astropy get_gcrs_posvel in Python 2.7

In Python 2.7 I am trying to calculate the position and velocity of an observatory by doing
>>> from astropy.time import Time
>>> from astropy.coordinates import SkyCoord, EarthLocation, ICRS
>>> from astropy import units as u
>>> from astropy import coordinates
>>> time=Time(58121.93, format='mjd')
>>> location=EarthLocation(53.2367, 2.3085, 100, ellipsoid = None)
>>> op, ov = location.get_gcrs_posvel(time)
But I get the error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\p\lib\site-packages\astropy\coordinates\earth.py", line 653, in get_gcrs_posvel
gcrs_data = self._get_gcrs(obstime).data
File "C:\p\lib\site-packages\astropy\coordinates\earth.py", line 633, in _get_gcrs
return itrs.transform_to(GCRS(obstime=obstime))
File "C:\p\lib\site-packages\astropy\coordinates\baseframe.py", line 934, in transform_to
return trans(self, new_frame)
File "C:\p\lib\site-packages\astropy\coordinates\transformations.py", line 1314, in __call__
curr_coord = t(curr_coord, curr_toframe)
File "C:\p\lib\site-packages\astropy\coordinates\transformations.py", line 847, in __call__
reprwithoutdiff = supcall(from_diffless, toframe)
File "C:\p\lib\site-packages\astropy\coordinates\builtin_frames\intermediate_rotation_transforms.py", line 72, in cirs_to_gcrs
return gcrs.transform_to(gcrs_frame)
File "C:\p\lib\site-packages\astropy\coordinates\baseframe.py", line 934, in transform_to
return trans(self, new_frame)
File "C:\p\lib\site-packages\astropy\coordinates\transformations.py", line 1314, in __call__
curr_coord = t(curr_coord, curr_toframe)
File "C:\p\lib\site-packages\astropy\coordinates\transformations.py", line 914, in __call__
return supcall(fromcoord, toframe)
File "C:\p\lib\site-packages\astropy\coordinates\builtin_frames\icrs_cirs_transforms.py", line 221, in gcrs_to_gcrs
return from_coo.transform_to(ICRS).transform_to(to_frame)
File "C:\p\lib\site-packages\astropy\coordinates\baseframe.py", line 934, in transform_to
return trans(self, new_frame)
File "C:\p\lib\site-packages\astropy\coordinates\transformations.py", line 1314, in __call__
curr_coord = t(curr_coord, curr_toframe)
File "C:\p\lib\site-packages\astropy\coordinates\transformations.py", line 914, in __call__
return supcall(fromcoord, toframe)
File "C:\p\lib\site-packages\astropy\coordinates\builtin_frames\icrs_cirs_transforms.py", line 188, in gcrs_to_icrs
i_ra, i_dec = aticq(gcrs_ra, gcrs_dec, astrom)
File "C:\p\lib\site-packages\astropy\coordinates\builtin_frames\utils.py", line 196, in aticq
before = norm(ppr-d)
File "C:\p\lib\site-packages\astropy\coordinates\builtin_frames\utils.py", line 125, in norm
return p/np.sqrt(np.einsum('...i,...i', p, p))[..., np.newaxis]
File "C:\p\lib\site-packages\numpy\core\einsumfunc.py", line 1087, in einsum
einsum_call=True)
File "C:\p\lib\site-packages\numpy\core\einsumfunc.py", line 688, in einsum_path
input_subscripts, output_subscript, operands = _parse_einsum_input(operands)
File "C:\p\lib\site-packages\numpy\core\einsumfunc.py", line 432, in _parse_einsum_input
raise TypeError("For this input type lists must contain "
TypeError: For this input type lists must contain either int or Ellipsis
I searched for this issue on Astropy's issue tracker and came across this:
numpy TypeError on coordinate transform (TypeError: For this input type lists must contain either int or Ellipsis)
It was fixed as of Astropy v2.0.4, so fairly recently. You should be able to upgrade your Astropy installation.
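For reference, a quick way to check which version you have before upgrading (anything below 2.0.4 is affected):
>>> import astropy
>>> astropy.__version__  # if this is < 2.0.4, `pip install --upgrade astropy` should fix it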
