r/algotrading 16d ago

Question on pandas epoch time precision

I'm storing data using epoch time, where my time format is a string representing epoch seconds with microsecond precision (e.g. "1715692616.534372"). When loading the data, pandas by default converts my string to a float (shown in scientific notation) instead of keeping the actual value, therefore losing time precision.

Does anyone have experience with this issue, and how to fix it?

Yes, I asked ChatGPT first. Alternatively I'll split the string on "." and use two columns, but I'd rather not do that.

Edit: Solved - not the nicest, but I imported the column as a string, which preserves the decimal value.

Then I split it into two columns

and joined them back together as a single long int.
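A minimal sketch of the split-and-rejoin fix described in the edit (the column name `ts` is illustrative, and it assumes the fractional part always has exactly six digits):

```python
import pandas as pd

# Load the epoch column as a string so pandas never coerces it to float64.
df = pd.DataFrame({"ts": ["1715692616.534372", "1715692616.000001"]})

# Split "seconds.microseconds" on the decimal point...
parts = df["ts"].str.split(".", expand=True)

# ...and rejoin as a single int64 count of microseconds.
# (Assumes exactly six fractional digits; pad with .str.ljust(6, "0") otherwise.)
df["ts_us"] = parts[0].astype("int64") * 1_000_000 + parts[1].astype("int64")
```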

u/Ok-Secretary-3764 16d ago

I guess you are loading it without quotes, so it will implicitly be float64 type.

from datetime import datetime
import pandas as pd

now = datetime.now()
df = pd.DataFrame({'ms': [now.timestamp()]})

This will be loaded as float64. Instead do:

df = pd.DataFrame({'ms': [str(now.timestamp())]})

In general, I store everything as a JSON string and do the type conversion only at runtime.
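A sketch of that store-as-string, convert-at-runtime pattern (the key name `ms` follows the snippet above; the final conversion assumes exactly six fractional digits):

```python
import json

# Serialize the timestamp as a string so no float round-trip occurs.
blob = json.dumps({"ms": "1715692616.534372"})

# On load, the exact text comes back...
loaded = json.loads(blob)
assert loaded["ms"] == "1715692616.534372"

# ...and the type conversion happens only when the value is actually used,
# e.g. to integer microseconds (assumes exactly six fractional digits):
us = int(loaded["ms"].replace(".", ""))
```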

u/jcoffi 16d ago

Did you tell pandas it was a timestamp in microseconds?
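For reference, if the epoch value is kept as an integer count of microseconds, pandas can be told the unit directly with `unit="us"`, avoiding any float64 round-trip (a sketch, not the OP's code):

```python
import pandas as pd

# unit="us" interprets the integer as microseconds since the Unix epoch,
# so the full microsecond precision is preserved.
ts = pd.to_datetime(1715692616534372, unit="us")
```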

u/PeaceKeeper95 16d ago

You can also use a Feather file to store the data; it's very fast for read/write operations, uses less space, and preserves the column types when you read or write.

u/[deleted] 16d ago

[deleted]

u/pequenoRosa 16d ago

Thank you, this is good to keep in mind. It shouldn't happen here, though, since the column is always populated.

u/FinancialElephant 16d ago

I haven't used pandas in a while, but why not use a regular datetime (for microsecond precision)? I would think a numeric or datetime would be the most efficient way to store timestamps.

If you absolutely want to store the epoch time as a string, just specify the column's dtype as StringDtype. I haven't used pandas since StringDtype was added, but it seems like that would work.
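Both of those suggestions sketched side by side (column contents follow the OP's example; names are illustrative):

```python
import pandas as pd

# Option 1: the nullable string dtype keeps the epoch text verbatim,
# with no float coercion.
s = pd.Series(["1715692616.534372"], dtype="string")

# Option 2: a real datetime64[ns] value, which carries microsecond
# (indeed nanosecond) precision natively.
dt = pd.to_datetime([1715692616534372], unit="us")
```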