Show HN: A high-performance TensorFlow library for quantitative finance (github.com/google)
153 points by partingshots on Jan 19, 2021 | 38 comments



Is there any actual value in doing quantitative finance via TF as opposed to just Pandas? I get that calculating a zillion option prices on a GPU is faster, but does anyone actually do that? Not to mention that dataframes (both conceptually, and in practice) seem much more mature and easier to grok than tensors.


If you are pricing and risk-managing complex derivatives books, then Pandas is not the right tool for the job. Pandas does not include derivatives pricing models, daycount conventions, schedule generation, or algorithmic differentiation, which are all basic ingredients in a modern derivatives pricing framework. In general, what you are concerned with here is computing the fair value of each instrument, usually as an expectation in a simulation or as a solution to a partial differential equation, along with the first (and sometimes second-order) derivatives of the fair value with respect to the model parameters. Pandas makes a lot of work in empirical finance nicer: data cleaning, time series work, etc. But when pricing and risk-managing a derivatives book as a market maker, you are less concerned with inferring the future from the past and more with interpolating current data in such a way that you do not open yourself up to arbitrage.
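To make the "expectation in a simulation, plus sensitivities" point concrete, here is a minimal pure-Python sketch with illustrative parameters. (Real frameworks, including the one posted, use vectorized math and algorithmic differentiation; the bump-and-revalue finite difference below is just the simplest stand-in.)

```python
import math
import random

def mc_call_price(spot, strike, rate, vol, expiry, n_paths=100_000, seed=42):
    """Fair value of a European call as an expectation over simulated
    terminal prices (geometric Brownian motion, i.e. Black-Scholes dynamics)."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol**2) * expiry
    diffusion = vol * math.sqrt(expiry)
    total = 0.0
    for _ in range(n_paths):
        s_t = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        total += max(s_t - strike, 0.0)
    return math.exp(-rate * expiry) * total / n_paths

def mc_delta(spot, strike, rate, vol, expiry, bump=0.01):
    # First derivative w.r.t. spot by central finite difference; reusing the
    # same seed (common random numbers) keeps the estimate from being noisy.
    up = mc_call_price(spot + bump, strike, rate, vol, expiry)
    down = mc_call_price(spot - bump, strike, rate, vol, expiry)
    return (up - down) / (2.0 * bump)

price = mc_call_price(100.0, 100.0, 0.01, 0.2, 1.0)  # close to the analytic 8.43
delta = mc_delta(100.0, 100.0, 0.01, 0.2, 1.0)       # close to the analytic ~0.56
```

An autodiff framework gets you `delta` (and vega, rho, etc.) from one pass instead of two extra repricings per parameter, which is the point of building this on TensorFlow.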

I worked as a model quant for a year a couple of years ago in a bank in the team responsible for developing and implementing the derivatives pricing models. Everything was written in (a lot of) C++, but I suspected and still suspect that you could get the same performance by leveraging one of the off-the-shelf deep learning libraries with a lot less code. Maybe you would have to write a few central components in C++ and assembly to get the exact same performance. I am glad to see someone actually following through on that idea.


> but does anyone actually do that?

Backtesting a regular single-symbol model on CPU is often not cheap, especially at higher frequencies. Now multiply that by 50 correlated instruments, each having, say, 1,500-3,000 strikes+sides+expiries in their respective option markets, and the value of GPU offload should become obvious. Backtesting dispersion trades on the S&P 500 using just 5 strikes per underlying, or any other scenario where the evolution of the volatility surface is important, would already potentially involve >25,000 individual option markets.
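For scale: even one underlying's chain is thousands of closed-form prices per evaluation. A toy scalar loop over a hypothetical 3,000-strike chain (flat 20% vol, made-up parameters), i.e. the kind of workload a GPU would batch into one kernel launch:

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike, rate, vol, expiry):
    """Black-Scholes price of a European call."""
    sqrt_t = math.sqrt(expiry)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol**2) * expiry) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * expiry) * norm_cdf(d2)

# One hypothetical chain: 3,000 strikes around spot = 100, T = 0.5y.
strikes = [50.0 + 0.05 * i for i in range(3000)]
prices = [bs_call(100.0, k, 0.01, 0.2, 0.5) for k in strikes]
```

Multiply this loop by hundreds of underlyings, thousands of backtest timestamps, and a per-strike implied-vol inversion, and the batch-on-GPU argument makes itself.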

All professionals backtest, often numerous times per day.


> I get that calculating a zillion option prices on a GPU is faster, but does anyone actually do that?

Yes.

Counterparty credit risk calculations are done as Monte Carlo simulations far into the future and on multiple scenarios on top of the baseline. The total easily grows to 10^8 times the number of valuations in the baseline scenario at the present time (which itself can be millions of positions).
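A back-of-envelope count shows how that multiplier arises. All figures below are illustrative assumptions, not from any real system:

```python
# Back-of-envelope valuation count for a counterparty credit risk (CCR) run.
# Every figure here is an illustrative assumption.
positions = 1_000_000      # trades in the baseline book
mc_paths = 10_000          # Monte Carlo paths per run
time_steps = 100           # future exposure dates out to portfolio maturity
stress_runs = 100          # what-if scenarios on top of the baseline

total_valuations = positions * mc_paths * time_steps * stress_runs
multiplier = total_valuations // positions   # vs. one baseline sweep of the book

print(f"{multiplier:.0e} x the baseline valuations")
```

Each of those valuations is itself a (possibly nontrivial) pricing call, which is why this workload is a natural fit for accelerators.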


Agreed, this is less convenient than Pandas for many tasks; however, large market makers (Citadel Securities, Susquehanna, Jane Street, Hudson River Trading) employ abstractions like this out of necessity.


They’re completely different tools. TensorFlow is a specialised matrix math library; Pandas is just for SQL-like data handling.


Production systems are more likely to be written in C++ than Python.


Appears to be the result of an Area120 incubated project: https://avera.area120.com/


I understand that this is Google's library, and they're promoting TensorFlow's usage, and subsequently the usage of TPUs.

But this can just as easily be done in PyTorch, right? Yes, we can't have TPUs there, but having your model train 50% slower (exaggeration) is better than having to spend 150% of the time taken in PyTorch to debug anything in TensorFlow.

I may be beating a dead horse here, but why doesn't Google just accept that TensorFlow needs to be redesigned for more ease of use?

Can anyone point out the benefits of TensorFlow over PyTorch (besides TPUs)? It's been a few years since I've used TF, and I may have missed something.


Have you used TF 2.0? It’s definitely better in terms of ease of use. Still not as good as PyTorch, in my opinion. PyTorch is still just more... ergonomic, I would say.

However, I don’t think the PyTorch ecosystem really matches TensorFlow yet for production, with TFX and all the other nice-to-haves that Google and others have open sourced.

As someone who spends a lot of their time doing POCs and research work, I strongly prefer PyTorch. But I have colleagues who mainly productionize language models, and they all seem to like TensorFlow. I don’t know if that’s just inertia on their part or the result of a considered choice, however.


TFX/TensorFlow Serving doesn't have a PyTorch equivalent. TF has better mobile support. Jax also works on TPUs and is closer (with other libraries on top, like Haiku) to PyTorch.

But framework wars are like language wars :) There probably aren't many productive arguments that haven't already been made.


Is this using floating point numbers under the hood? There is a lot you can do with floating point, but for some of what I've worked on you really do want fixed precision types.


This is a common misapprehension. Yes, fixed precision is great for accounting (like your bank statement), but when you're making a predictive model of asset prices (or higher moments of price distributions), being off by 1 in 10^14 is not important (because your model isn't that precise anyway!), and the performance you get from dedicated floating point hardware is well worth the "loss" of precision.

This wholesale dismissal of floating point for "financial" systems ignores the real business needs which might point you toward fixed or floating point numbers. Always ask yourself - do you need to know this number to better than 1 in 10^14? Are you going to find its square root at some point? Also remember that storing fixed point numbers usually takes more bytes than double-precision floats.
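The 1-in-10^14 figure follows from double precision carrying roughly 15-16 significant decimal digits. A quick sanity check of the machine epsilon:

```python
import sys

# Unit roundoff for IEEE 754 doubles: ~2.22e-16, i.e. ~15-16 decimal digits.
eps = sys.float_info.epsilon

# One correctly-rounded operation carries at most ~eps of relative error,
# comfortably below 1 part in 10^14:
relative_error = abs((1.0 / 3.0) * 3.0 - 1.0)
assert relative_error <= 2 * eps
```

Error can of course accumulate over long chains of operations, which is where the numerical-analysis questions the parent raises come in.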


>Always ask yourself - do you need to know this number to better than 1 in 10^14?

Yes, because over time thousands of multiplies and divides mean you will end up more than a penny off. The more high-frequency the trading, the more floats become a problem.


Floating point multiplication and division are generally much safer in terms of precision loss than addition or subtraction, and on the flip side you could easily be more than a penny off with zero operations if the quantities involved were large enough.
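The "more than a penny off with zero operations" point is easy to demonstrate (Python 3.9+ for `math.ulp`):

```python
import math  # math.ulp requires Python 3.9+

# With no arithmetic at all, a large enough dollar amount cannot hold cents:
big = 1e16                   # ten quadrillion dollars
assert big + 0.01 == big     # the cent vanishes entirely

# Doubles near 1e16 are spaced 2.0 apart, so even whole dollars get rounded:
print(math.ulp(1e16))        # 2.0
```

For realistic portfolio magnitudes the spacing is far below a cent, which is why this only bites at extreme scales.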

Quibbles aside, they're not suggesting doing accounting with floats. E.g., suppose you want to estimate the expected value of an option. You'll have a model that attempts to describe that option's behavior (e.g. Black Scholes), and you want to evaluate that model with a certain set of parameters. The model itself is imperfect, and given the transcendentals involved even if it were flawless there would be a guaranteed loss of precision when attempting to clamp a real option to its predicted expected value. The model is a tool that guides decisions, but nobody really cares if it's off by a little bit because there are a ton of other error sources anyway. 1 in 10^14 is more than good enough.

Edit: Unless you're just suggesting that people should do a little numerical analysis and be cognizant of the total error in a model?


Black-Scholes is a reasonably good way to estimate the value of an options contract, which is fine for floating point. Just about anything that is macro (to the larger system) is fine with floating point.

But for actually simulating trading, where calculations compound on themselves instead of one algorithm calculating something once and being done, floating point becomes an issue.

----

Because of the downvotes, let's take a step back to 101 about floating point error:

>>> 1.20 - 1.00

> 0.19999999999999996

In this example, you're a penny off. This is a single equation. So you have to check for rounding error and possibly round up for EVERY calculation you do, which is time consuming.

Alternatively, there are fixed precision types, which are fast - very fast, faster than regularly checking for errors.

If you have an equation that calculates once and then is done, a penny or two off is no big deal. But when you're backtesting, a penny or two off for every trade compounds, and you end up dollars a day off. If the algorithm is identifying prices to the fraction of a penny in the middle of the trading day and adjusting its configuration accordingly, it will be wrong, which creates a butterfly effect that ripples out into the rest of the day. With high frequency trading it's somewhat uncommon, but possible, to end up 10% off for a day from floating point precision, with the algorithm's failure to decide the correct path forward while trading significantly compounding the issue. Now compound that out over months and you see the problem.
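For reference, the exact behaviour in Python, and the stdlib fixed-point alternative being alluded to (`decimal` is one such type, though it is implemented in software, not hardware):

```python
from decimal import Decimal

# Binary floating point cannot represent 1.20 exactly, so the subtraction
# is off in the 17th significant digit:
print(1.20 - 1.00)                        # 0.19999999999999996

# The stdlib Decimal type does exact fixed-point decimal arithmetic:
print(Decimal("1.20") - Decimal("1.00"))  # 0.20

# The float error only becomes "a penny off" if you truncate instead of round:
assert round(1.20 - 1.00, 2) == 0.20
```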


In your example, you're only a penny off if you truncate 0.199999999999999996, rather than rounding (which is described in IEEE 754!). Here's a real simple example. Let's say your model depends on the average of the last three ticks. The last three ticks are $1.00, $1.00, and $2.00. Ok, what's the (exact!) average without being off by a fraction of a penny? This is the point - as soon as you start manipulating numbers in anything other than the most trivial way, you run into the dreaded floating point error, because that's how the real numbers work.

I am unaware of fixed precision types that are hardware-optimized (other than on FPGAs, which are used for feed handling in HFT anyway). If you are modeling discrete things like Minimum Price Variations, then yes, use fixed precision, or even encode it in a way that saves space. But if you're numerically solving a partial differential equation, e.g. Black-Scholes, it's difficult to see how fixed precision numbers are going to have an advantage.
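The three-tick average makes the point well: no finite decimal (fixed-point) type stores $4/3 exactly either, so "exact" here really means exact rationals, e.g. the stdlib `fractions` module:

```python
from fractions import Fraction

# Three ticks: $1.00, $1.00, $2.00. The exact average is $4/3, which no
# finite decimal (or binary) representation can store.
ticks = [Fraction("1.00"), Fraction("1.00"), Fraction("2.00")]
exact_avg = sum(ticks) / 3                  # Fraction(4, 3)
float_avg = (1.00 + 1.00 + 2.00) / 3        # nearest double to 4/3

# The float answer is off by a tiny sub-penny amount - exactly the point:
error = abs(Fraction(float_avg) - exact_avg)
assert 0 < error < Fraction(1, 10**15)
```

Exact rationals are correct but slow and unbounded in size, which is why nobody prices derivatives with them.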


I think the point they're going after is that algorithmic trading behavior can be meaningfully sensitive to rounding errors (which seems plausible if you profit by amplifying tiny signals), so in the context of a simulation you might still have components like Black Scholes, but for the trades themselves (even simulated) you need to take more care or risk an excessive error.

In other words, they're describing a scenario where 1 in 10^14 error is potentially not tolerable because of some amplified discrete behavior.


Agree - real world discrete things should be modeled as such. If MPV was $0.23, then model $0.23 increments - whether you use fixed point, or the cardinality of increments, who cares. But all the other math leading up to a discrete decision on the increment is almost certain to be best described with, and faster to implement in, floats.


On modern-day Intel CPUs (no idea on AMD's side), math using longs is faster than math using floats. YMMV depending on architecture.

https://stackoverflow.com/questions/2550281/floating-point-v...

Floats were not historically adopted for speed, but because you can represent fractional values and a wide range of magnitudes even in small word sizes like 8-bit numbers.

Today, with 64- and 128-bit numbers, floating point loses most of its historical benefit, though of course it still has its uses.


You want fixed point for accounting, but there's no reason not to use floating point in quantitative analysis.


I wonder why they are not using the built-in NumPy API: https://www.tensorflow.org/guide/tf_numpy

This would be accelerated by TensorFlow, which means acceleration on the MacBook M1 as well.


We tried to use TensorFlow's native C++ library to run a predictive model in real time on CPU only. The performance sucked, even though the model was not particularly complex.

My advice would be to only use TensorFlow through the Python wrappers. It's just too complex otherwise.


Hi, I need to know the price of liquidity pool shares without relying on many oracles and multiple ABI calls.

Liquidity pool shares are assets backed by at least two other assets.

Can TensorFlow help me with pricing?


Right now, it looks like a fun project of students who got their hands on Andersen/Piterbarg, Brigo/Mercurio and the QMC part of Glasserman. I wonder what their real goal is with this.


> I wonder what their real goal is with this.

Google does shadow banking. They are "contracted" by Lloyds, but it's all Google on the backend, developing a Revolut alternative, among other things.


I am not sure if that was meant as criticism but if so, could you be more specific for those of us who are not familiar with the domain?


This was not done by people who make their living in markets


Can you expand? Would love to hear what issues you have so far with the existing work.


Exactly


Why would Google employees be doing quant finance on company time?


This is made by a project in Google's internal incubator: https://avera.area120.com/

(I work for Alphabet but have nothing to do with this)


Interesting - I have noticed Google knocking on my door much more (I am in capital markets). They have hired off the street, so that provides some credibility when mixed in with their strong tech teams. I have interesting debates with my head quant about quant vs. AutoML - what's 'better' :)


Are they trying to spin off a hedge fund?

wtf.


Was this developed for Google internal finance projects?


I was wondering if it was developed to help promote TF more to finance firms.


Is anyone still using Black-Scholes for option pricing nowadays?


Yes, but it is used as a function for transforming vanilla option prices into a more natural unit: their implied volatility, rather than their dollar price. Option traders think of option prices in terms of implied volatility, not the dollar price of an option. So in that sense Black-Scholes is used in the pricing of almost all vanilla options on stock-like things, even though the price from a market maker usually comes out of a more complex model that (attempts to) ensure the quoted prices are free of arbitrage in the cross-section of strikes and across maturities - usually some form of stochastic volatility, local volatility, or LSV (local stochastic volatility) model.
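A minimal sketch of that unit transformation - recovering the implied vol from a dollar price by inverting Black-Scholes. (Toy bisection solver with made-up parameters; production systems use faster, more robust root-finders.)

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike, rate, vol, expiry):
    """Black-Scholes price of a European call."""
    sqrt_t = math.sqrt(expiry)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol**2) * expiry) / (vol * sqrt_t)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * expiry) * norm_cdf(d1 - vol * sqrt_t)

def implied_vol(price, spot, strike, rate, expiry, lo=1e-6, hi=5.0):
    # A call's price is strictly increasing in vol, so bisection converges.
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, rate, mid, expiry) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round-trip: price an option at 20% vol, then recover the vol from its price.
dollar_price = bs_call(100.0, 105.0, 0.01, 0.20, 0.5)
vol = implied_vol(dollar_price, 100.0, 105.0, 0.01, 0.5)   # ~0.20
```

Quoting in vol units makes prices comparable across strikes and maturities, which is why the transformation survives even where the model's dynamics don't.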



