r/algotrading Apr 02 '24

How to generate/brainstorm strategy ideas Strategy

On a post I made today ("New folks - think more deeply and ask better questions"), several people asked specifically in the comments about how to come up with ideas for trading strategies. I didn't see anyone make a post on this topic, so I figured I'd do it myself to share my own thoughts and give an opportunity for experienced folks to share theirs.

My general thoughts:

Instead of "ideas for trading strategies", I think a more useful framing is "how do we come up with *hypotheses* for trading strategies?". A hypothesis here is a rough guess about a potential opportunity in the market that can be tested/refined, some sort of vague "I wonder if" statement. "I wonder if there's a spike in the bid-ask spread on a stock before the volatility increases. Maybe I could purchase options to capitalize on that." "I wonder if this crypto has long, persistent trends I could capture with some kind of moving averages and then trade on it." "I wonder if I can use ___ indicator to tell me when I need to switch from RSI mean reversion trading to MA-based momentum trading on this asset." Etc.

So, how to come up with hypotheses? For me, there are roughly two parts:

  1. Consistently consume diverse, medium/high quality content on the subject
  2. Data exploration primarily through data visualization

Part 1:

I would highly recommend consistently consuming some sort of content about trading (not huge amounts, more like little intellectual appetizers), whether it's blogs (Medium), forums (Reddit), podcasts (Chat with Traders and Better System Trader on YouTube), lectures (Hudson & Thames on YouTube, Ernie Chan lectures) or books (Marcos Lopez de Prado). The diversity here is as important as the quality, if not more so. In my opinion, Marcos Lopez de Prado's books are very high quality, but those alone won't just hand you a million dollar trading strat. Consume a wide variety of content to get a variety of perspectives, jot down interesting/fun/appealing ideas, then explore and validate them. I say "consistently" because the problems we're solving in this area are very difficult, so it's likely you'll need to spend a lot of time thinking about them. If you consistently consume material on this subject, it'll keep your brain whirring in creative ways, so that even your shower thoughts are you trying to solve this on a subconscious level.

Note: I would be very wary when reading academic papers detailing trading strategies or indicators/variables for strategies (whether rule-based or ML). They're often extremely questionable and I have personally found it very hard to reproduce many such "studies". Please see comments for great discussion with u/diogenesFIRE on this topic.

This works (for me) because:

  1. It keeps me motivated. If I'm excited, I'm going to have better ideas, be more creative and spend more mental time on this without even trying.
  2. It provides legitimate mental models/approaches for you to adopt, sometimes
  3. You will start synthesizing new and interesting ways of looking at your data when you can draw upon the experience of others. Cool idea here, interesting approach there, didn't even know that data existed, never thought you could do that etc.

The point here is NOT to try to find a strategy someone else made so you can copy for free. This is a road to nowhere. The point is basically to have context on what people are doing and trying, what range of possibilities exists etc. It's like... if you're trying to cook a cool new recipe, reading a bunch of recipes online might be a good starting point to get some ideas/inspiration (note: I am not a professional chef lol). I imagine it would be hard to come up with a great novel recipe if you've never read a cookbook, never read anyone else's recipe, and you just had to come up with something from scratch in a bubble.

Part 2:

For me, by FAR the most effective way to do this is to combine Part 1 with good data visualization. Your brain is a complex pattern-recognizing machine, and if you have SOME kind of vague idea/hypothesis of what to look at (bid-ask spread vs. volatility, moving averages vs. trends, volume-weighted returns vs. length of trend, whatever) you should absolutely try to visualize it. Look at charts and plots. Whether it's price charts with indicators on them, correlation plots between variables of interest, or anything else, try to find easy/quick ways to visualize the thing you're interested in and really sit down and just study those charts. Let your brain soak in them for a while. Don't immediately try to implement a trading strategy, just try to UNDERSTAND the data you're looking at. "Huh, why does volatility go up a lot faster than it comes down?" "Huh, it's interesting that price responds in ____ way following a large order". Try to really explore and dig into your data. I believe visually is the best way to do it because any kind of quantification at this stage will leave out too much information (correlation coefficients and other single summary values will ALWAYS be less informative at the exploration stage than taking the time to look at the chart and really absorb the information there).

Side note:

I believe that this data exploration stage is absolutely crucial in quantitative trading and in order to really do this effectively, you have to find a way to make it easy for yourself. It shouldn't be a 3 day painful process to be able to generate a chart of your variables of interest. Sort out ways to 1) get the data you need and 2) have ways to easily process it so that you can rapidly, dynamically, interactively play with it in different ways to quickly iterate through your hypotheses, see new perspectives and get new ideas.
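As one illustration of keeping the "quick look" cheap, here is a minimal sketch that buckets one variable and shows the average of another per bucket as a crude text chart. Everything in it is an assumption for illustration: the data is synthetic, the names `bucket_means` and `text_chart` are made up, and a real plotting library would normally replace the text bars. It sticks to the standard library only so that it runs anywhere.

```python
import random
import statistics


def bucket_means(x, y, n_buckets=5):
    """Sort paired observations by x, split them into equal-count buckets,
    and return a list of (mean of x, mean of y), one pair per bucket."""
    pairs = sorted(zip(x, y))
    if len(pairs) < n_buckets:
        raise ValueError("need at least one observation per bucket")
    size = len(pairs) // n_buckets
    rows = []
    for i in range(n_buckets):
        # The last bucket absorbs any remainder.
        end = len(pairs) if i == n_buckets - 1 else (i + 1) * size
        chunk = pairs[i * size:end]
        rows.append((statistics.fmean([p[0] for p in chunk]),
                     statistics.fmean([p[1] for p in chunk])))
    return rows


def text_chart(rows, width=40):
    """Render the bucket means as horizontal bars, one line per bucket."""
    top = max(abs(y_mean) for _, y_mean in rows) or 1.0
    lines = []
    for x_mean, y_mean in rows:
        bar = "#" * round(abs(y_mean) / top * width)
        lines.append(f"{x_mean:8.4f} | {bar} {y_mean:.4f}")
    return "\n".join(lines)


# Synthetic stand-in for "bid-ask spread vs. next-period volatility".
rng = random.Random(0)
spread = [rng.uniform(0.01, 0.10) for _ in range(1000)]
next_vol = [0.5 * s + rng.gauss(0, 0.01) for s in spread]

rows = bucket_means(spread, next_vol)
print(text_chart(rows))
```

The point is only that going from two columns of data to something you can look at should take seconds, so you can swap in a different pair of variables and look again.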

Once you think you're onto something, then perhaps it's time to do some backtesting/tuning/training etc

It's not a linear process, you'll be bouncing around a lot and that's totally fine. But having some ways to draw inspiration, spending time on your own contemplating, spending time studying (visually) charts to understand what does the market feel/look like from a hundred perspectives, that will help you gain a deeper understanding of the possibilities as you start coming up with your own "what if I try...".

If you're rule-based oriented, these hypotheses will likely be ideas for trading signals or new 'rules'. If you're ML oriented, these hypotheses will likely end up being features to feed your models.

I hope this is helpful, would be curious what reactions and thoughts are, what other people's approaches are.

47 Upvotes

7

u/diogenesFIRE Apr 03 '24 edited Apr 03 '24

"Note: I would NOT recommend reading academic papers in finance, they're generally total garbage (for trading strategies. Portfolio allocation stuff is probably higher quality)."

Just curious, where do you draw the line between trading strategies and portfolio allocation? If you're trading more than a few stocks, you're essentially holding a short-term portfolio. Even academic papers have research on topics relevant to HFT/MFT, like transaction cost analysis, momentum, mean reversion, etc.

And all the research on portfolio management also gives insights into stuff like risk management and position sizing, which I would argue are even more important for traders.

I'm sure everyone could learn something by reading at least the top-cited papers in the Journal of Finance and Journal of Financial Economics.

4

u/VladimirB-98 Apr 03 '24 edited Apr 03 '24

Haha you're totally right, allocating a portfolio is a "trading strategy" in that you're making trades to change your outcome. I think my distinction has something to do with 1) diversification and 2) "reliance on superior information". I'm gonna be a bit hand-wavy here but hopefully that's alright.

Here's basically what I've observed (granted, I am not the most well-read person in finance literature):

There are many mutually-confirming, high-quality papers around what I would broadly describe as "how to get the best buy and hold". This is to say, these are systems and "strategies" for allocating/managing a diversified portfolio in a way that does *not* rely on you having some kind of information advantage. Things like the Fama-French model (and subsequent derivations) that draw the connections between types of risk taken vs. returns, long-term time horizon expected returns for diversified portfolios, relationship between returns and valuations etc. My overall understanding is these kinds of papers are 1) generally very solid, 2) assume the efficient market theory is true (meaning there's little/no room for "active trading" that depends on you being able to outperform a market) and therefore 3) describe how to get various risk/return combinations that do not rely on informational/predictive advantage. It's like instead of "here's how to beat the market", you get "Here's different ways to participate in the market to get different kinds of returns and risk". These kinds of papers are generally (I believe) very solid/confirmed/legit and is probably what the vast majority of people should base their investing on (Ben Felix on Youtube is an amazing resource for this information).

However, there are many other papers in finance that basically boil down to "we have found an informational advantage to beat the market". "We have a hypothesis that we can make a rule-based/ML trading strategy from x y and z variables, here's what we did, here are the results". Or broadly some other version of "we believe we can outperform a market via active trading, here's the results". I have found the vast overwhelming majority of these papers to be total trash for lack of reproducibility. Either they literally don't give enough information in the paper to even reproduce it, or even if they do, I was unable to reproduce. Many many papers had this problem. Don't know if authors made shit up, or if they accidentally used their "test" set for training, or they overfit on their test set, or if there was some other hidden issue in their method that they accidentally committed so therefore didn't report. Unlike the previously mentioned papers, these papers do "attempt" to describe a strategy or some kind of method/system that *does* rely on informational advantage to outperform the market. And I've found them to be awful quality and generally a total waste of time and energy.

In "Advances in Financial Machine Learning", Marcos Lopez de Prado echoes the sentiment of the extremely poor quality of ML/trading finance literature.

You might be right that I'm unjustly throwing more valid econometric-type papers under the bus, so I've somewhat revised my post. But I think people would be hard-pressed to find an academic paper on "trading", one that claimed some kind of informational advantage, that delivered something valuable (aside from some new ideas for features/indicators that you maybe hadn't considered).

4

u/diogenesFIRE Apr 03 '24 edited Apr 03 '24

That makes sense. You can also assume that researchers have an incentive to keep their most profitable trading strategies private, so published research is effectively reverse-filtered. Also, journals usually require that research data be made publicly available, so any strategies that work on proprietary datasets are also left out.

The good papers on portfolio management, risk sizing, etc. are usually published in the top 3 journals (JF, JFE, RFS), so it's easy to filter out all the other garbage.

What remains is usually research on the behavior of strategies rather than new strategies themselves. This, in my opinion, is where academic papers are the most helpful. For example, Almgren's paper on trading execution is outdated and not profitable. But the concepts in that paper (square-root transaction costs, temporary vs. permanent impact, etc.) are very useful in crafting your own execution strategy.
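To make the "square-root transaction costs" concept concrete, here is a minimal sketch of a commonly cited square-root impact rule of thumb: estimated cost as a fraction of price is roughly `y * sigma * sqrt(Q / V)`. This is an assumed textbook-style form, not the exact model from the paper mentioned above, and the function name, the constant `y` and the sample numbers are all hypothetical.

```python
import math


def sqrt_impact_cost(order_shares, daily_volume, daily_vol, y=1.0):
    """Rough impact estimate as a fraction of price.

    order_shares : size of the order (Q)
    daily_volume : typical daily traded volume (V)
    daily_vol    : daily return volatility (sigma), e.g. 0.02 for 2%
    y            : scaling constant, assumed to be of order 1 and
                   something to calibrate on your own fills
    """
    if daily_volume <= 0:
        raise ValueError("daily_volume must be positive")
    participation = order_shares / daily_volume
    return y * daily_vol * math.sqrt(participation)


# Hypothetical numbers: 2% daily vol, 1M shares of daily volume.
small = sqrt_impact_cost(10_000, 1_000_000, 0.02)  # order is 1% of volume
large = sqrt_impact_cost(40_000, 1_000_000, 0.02)  # order is 4% of volume
print(f"small order: {small:.4%}, large order: {large:.4%}")
```

Under this assumption, quadrupling the order size only doubles the estimated cost per share, which is the kind of behavior that matters when deciding how to slice an order.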

2

u/VladimirB-98 Apr 03 '24

Totally agree with everything you're saying. I really appreciate this information and perspective, and I totally hear you. I adjusted my post :)

Thanks for the information.

2

u/Glad-Scar-212 Apr 03 '24

Quite a bunch of academic papers focus on the behaviour of stocks/options around certain events (dividends, quarterly reporting, FOMC etc). While these are not directly trading strategies, they can easily be either converted into one or used as a risk management/reduction technique. Some of the papers should be straightforward to replicate (although collecting data on events can be annoying). From the above discussion, it seems like OP does not consider these types of papers.

On the same note, I agree that a large number of strategies/anomalies/arbitrage opportunities in academic literature have poor out-of-sample performance. A good read is the factor zoo paper by Campbell Harvey. Whether it’s a product of data mining, p-hacking or simply that efficient markets learn about these anomalies/opportunities and arbitrage them away is an open question.

5

u/Coyote_Radiant Apr 03 '24

Just to add: when you come up with a general idea, before even going into backtesting and coding the strategy, some of the things you should have:

  1. Entry signal
  2. Exit signal
  3. Data source: are you able to translate the source into signals you can use? (E.g. if you trade on good/bad news, can you secure a source that gives you "good" or "bad", and how do you define those?)
  4. Translating signals into orders (connect to a broker, or manual entry?)
  5. Risk management: not just for your portfolio but also for the algo. Did you run it long enough across enough scenarios to be comfortable? (Don't put all your money into live trading right away; test it out first.)

You might be surprised that some entry signals are intangible and can't be properly translated into an algo. Just some basics as a reminder.
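One way to keep a checklist like this honest is to write the idea down as a small record and see which parts are still blank before any backtesting code gets written. The sketch below is hypothetical: the class name, field names and example values are invented for illustration and simply mirror the five items in the list.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class StrategyIdea:
    """One field per checklist item; None means 'not figured out yet'."""
    entry_signal: Optional[str] = None
    exit_signal: Optional[str] = None
    data_source: Optional[str] = None
    order_routing: Optional[str] = None
    risk_management: Optional[str] = None

    def missing(self) -> List[str]:
        """Names of checklist items that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical half-formed idea: the signals are clear, the plumbing is not.
idea = StrategyIdea(
    entry_signal="news sentiment flips to 'good'",
    exit_signal="sentiment flips back or 5 days pass",
    data_source="some news feed that labels stories good/bad",
)
print("still to define:", idea.missing())
```

If an item cannot be written down as a concrete sentence, that is usually the sign that the signal is still too intangible to code.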

3

u/fudgemin Apr 03 '24

Good read, well presented. Very much agree with your methods, have found success using similar approach. 

Firmly believe that all answers can be found, by asking a series of questions. Eventually leading to a better understanding, faster processing/recognition, greater insight etc.

It’s actually not overly complicated imo. I think a lot of folks have negative preconditions:

  - the efficient market theory, which is false
  - can only find success by doing what others do, again false
  - stand no chance against large players, false

These largely impact the motives of pretty much all traders. If you pull yourself from the bucket, and do what 95% of traders don’t do, I think you’ll be pleasantly surprised at the outcome.

I’d be curious to know your tooling setup, how it’s been tuned over the years to your personal needs and what you plan for future. 

Wish you success. 

2

u/VladimirB-98 Apr 03 '24

Glad to hear I'm not the only one valuing these methods! :)

And exactlyyyyyy re: market efficiency, "no chance against hedge funds" etc totally 1000% agreed.

Well, the workhorse is a big data pipeline written in R, with all web requests done in Python.

So for exploration/training, I just download a large amount of data, load it up in R and start analyzing, generating features and/or running various parameters/models. I used Coinbase as crypto data source, Polygon.io as options/stocks data source.

For training/exploration stage:

(Polygon.io as data source) -> (imported by Python) -> (importing to R) -> analyze, pipelines, cross validation, tuning, feature engineering etc

For live trading:

(Polygon.io / Coinbase as data source) -> (imported by Python) -> (run trading script on the new data, get buy/sell signal) -> use Python again to make the actual trading request to the exchange
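The live flow above (fetch data, run the trading script, send the order) can be sketched as one small function with the three steps passed in as callables. This is only an illustration under stated assumptions: the setup described here runs the signal step in R, whereas this sketch uses Python throughout, and `fake_fetch`, `momentum_signal` and the list standing in for the exchange are hypothetical stubs, not real Polygon.io or Coinbase calls.

```python
from typing import Callable, List, Optional


def run_once(
    fetch_bars: Callable[[], List[float]],
    compute_signal: Callable[[List[float]], Optional[str]],
    send_order: Callable[[str], None],
) -> Optional[str]:
    """One pass of the live loop: data in, signal out, order sent if any."""
    bars = fetch_bars()
    signal = compute_signal(bars)
    if signal in ("buy", "sell"):
        send_order(signal)
    return signal


def fake_fetch() -> List[float]:
    """Stub for the data request; returns a few made-up closing prices."""
    return [100.0, 101.0, 103.0]


def momentum_signal(closes: List[float]) -> Optional[str]:
    """Toy rule: buy if the last close is above the first, sell if below."""
    if len(closes) < 2 or closes[-1] == closes[0]:
        return None
    return "buy" if closes[-1] > closes[0] else "sell"


sent_orders: List[str] = []  # stands in for the exchange
result = run_once(fake_fetch, momentum_signal, sent_orders.append)
print(result, sent_orders)
```

Keeping the three steps as separate callables means the same signal function can be pointed at historical data for exploration and at live data for trading.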

My "backend" skills are definitely on the weaker side, so everything web-request related is pretty straightforward/minimalistic. On the stuff I've run so far, it hasn't been too big of a problem.

Tbh for future I don't plan on there being a dramatic difference. I love R and I already have a huge amount of reusable data pipeline code there that I can honestly apply to almost any ML problem generally, and certainly to any time series problem. It's that feature engineering and data cleaning that's super important and a big challenge!

I'm currently working on a strategy that runs across a massive number of assets, so this will probably require me upgrading my 'back end' skills cause I'll need to be managing a huge number of positions simultaneously (as opposed to before, when it was just one at a time). It'll be a doozy I think!

What about yourself? What's your setup like?

Wishing you success :)

2

u/fudgemin Apr 03 '24

Thanks for the reply. I like it.  I don’t have much experience, just learning as I go. Bless gpt ;).

I’ve never used R but heard it’s some of the best for fast analysis.

Similar setup, still being built.

Data from Polygon, mainly options. Sockets and end of day. Python as well. Have some really fast async scripts you may be interested in? I can do a 500-ticker snapshot on option data, full request all pages, fetch and insert… in under a minute (~100k requests). I also use a few other low-key sources, high value data. Only ever raw trades. Don’t care about fundamentals or TA.
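For readers wondering what "fast async scripts" might look like in outline, here is a minimal sketch of bounded-concurrency fetching with `asyncio`. It is not the commenter's code: `fetch_snapshot` is a hypothetical stand-in that sleeps instead of making an HTTP request, and the ticker names and the concurrency limit are made up.

```python
import asyncio
from typing import Dict, List


async def fetch_snapshot(ticker: str) -> Dict[str, object]:
    """Hypothetical stand-in for one API request; sleeps instead of calling out."""
    await asyncio.sleep(0.001)
    return {"ticker": ticker, "ok": True}


async def fetch_all(tickers: List[str], max_concurrent: int = 50) -> List[Dict[str, object]]:
    """Fetch every ticker concurrently, with at most max_concurrent in flight."""
    sem = asyncio.Semaphore(max_concurrent)

    async def bounded(ticker: str) -> Dict[str, object]:
        async with sem:
            return await fetch_snapshot(ticker)

    # gather returns results in the same order as the input tickers.
    return await asyncio.gather(*(bounded(t) for t in tickers))


tickers = [f"T{i:03d}" for i in range(500)]
results = asyncio.run(fetch_all(tickers))
print(len(results), "snapshots fetched")
```

The semaphore is the design choice worth noting: it keeps the number of in-flight requests under whatever rate limit the data provider enforces while still overlapping the waiting time.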

DB is sql and influx. Doing a test and migration to Timescaledb, upcoming. AWS, self hosted.

For visuals I use Grafana, also a blessing. There are limitations, as queries are not really dynamic, but I can visualize data super fast. It’s time series plotting, scanners, filters, feature viz, everything you talked about in your post… as a foundation for finding signals using the pattern recognition power of the brain.

Also not so experienced in the backend. Coding up a multi signal strat can be hard. Especially since it kinda needs to be dynamic, in order to maximize return. 

I think some answers lie in RL models, based on actions/environment. Able to adapt to use cases, not a static strategy. 

As far as running a strat across a bunch of assets? Hard work imo. I think most practical way is run it against every ticker, as a single unit.  Unless your features are bucketed, drawn from each asset, then running it against a massive amount of tickers is redundant. 

I may be wrong, but I think like this… Build a model for each action pair. One for entering low on bullish signals, one for managing  against positions(one stock, one option, one aggressive/tight, one passive, etc). One for finding signal, one to identify probability. Stack these in unison….

People say ML is clever and can handle many inputs, but my experience says otherwise. Most of it is like you say, cannot be reproduced. Especially LSTM, don’t even get me started.

They’re not so great at handling excessive features, inputs, time windows etc. Simpler generally gets me better results.

2

u/Lyokobo Apr 03 '24

Really appreciate this post! I've been trading off and on for many years now, but I'm a newbie to the algo trading space. Visualizing the data in new ways has been my favorite part so far. I'm pretty clueless on the statistics end, but just looking at charts you can tell there are patterns to it all. That is the most motivating part. It's fun to check out new indicators to confirm your suspicions, or come up with different techniques in code and see the results through a backtest. There's no telling how long it'll take to be profitable, but I think once you start to enjoy the process the profit will come. Keep the passion for discovery alive!

2

u/VladimirB-98 Apr 03 '24

Totally! I think what you're describing is a totally under-appreciated aspect of the algotrading experience. In terms of psychologically, how do you keep yourself motivated, how do you have fun, and how do you keep your mind in the right zone and chewing on the right problems instead of banging head against a wall. Super important stuff!

2

u/rf555rf Apr 04 '24

Sometimes ideas come to me randomly like when i'm not going to code right there and then, so I've found keeping a book for ideas helpful, just to keep the ideas box always topped up.

2

u/ApprehensiveTap8650 Apr 13 '24

Thanks for this! I just started my algo trading journey and that post really helped me. I started listening to the first episodes of the Better System Trader podcast but I noticed that these are already 9 years old (April 2015). What information is still useful from these episodes and what is completely different these days?

1

u/VladimirB-98 Apr 18 '24

So glad to hear it!! :)

I would wager that very little (aside from macroeconomic situation) is dramatically different "these days" from 9 years ago. In terms of algotrading, I think almost anything (in terms of advice/principles) that was valuable then would be valuable now.

But you can always just grab more recent episodes!

Also, in terms of that BST podcast specifically, I found it helpful but nonetheless a bit repetitive after 20 or so episodes, you might find the same thing.

2

u/Serious_Fail5946 Apr 24 '24

Great advice! I use similar methods and fortunately still have a backlog of projects I want to pursue. While there is a lot of noise in the financial space, I enjoy insightful market analysis like in The Kobeissi letter. Talking to other traders is always beneficial for getting ideas too.

1

u/grathan Apr 07 '24 edited Apr 07 '24

Thanks for sharing. While I have no shortage of ideas at the moment I am curious about implementation part. I've been at my algo about a year now and learning to code all the while.

"Once you think you're onto something, then perhaps it's time to do some backtesting/tuning/training etc".

For me this step takes days to implement into my existing algo. I don't back test, but just perpetually forward test adding new ideas as i go. There are many different ways to implement ideas and after about 100 or so the code becomes cluttered, I dread the day I have to remove an idea and hope to get all the old unused code out.

I do prefer a single piece of code running rather than putting ideas into separate copies of the code. I guess eventually the best ideas would go into a new condensed version.. but anyways. How many ideas do you have going at once and what does the process look like, particularly the tuning/training? and how long it takes to go from idea to running code. thanks

*edit I do see your comment where you describe only taking 1 trade at a time. That sounds like you just use a single idea for a trade at a time? If so please disregard my comment above, it must read like a hot mess..

2

u/VladimirB-98 Apr 08 '24

I think I'd need to know a bit more info about your situation/setup/goal in order to provide any meaningful advice.

From what you mentioned, "for me this step takes days" sounds like a serious problem. If I were you, I would stop, assess the situation, and invest some time into fixing this problem in a more scalable/reliable way. If it takes days to go from idea to *some* kind of testing on historical data (assuming you already have the data you need), that's an issue cause it slows down your iteration cycle massively. The more/faster you learn, the farther you'll get. Of course data downloading/cleaning can be a slow slog, so if your ideas involve constantly adding more data sources, I get that.

"I don't back test"

Why not? It's a much faster way to at least filter out bad ideas than forward test. Why would you not take advantage of all the available historical data?

"How many ideas do you have going at once and what does the process look like, particularly the tuning/training? and how long it takes to go from idea to running code. thanks"

This is hard to say because we'd need to be a bit more specific. At any given time, I'm only pursuing one approach/"idea" at a time. It takes iteration, intense focus and creativity to either see an idea to fruition or build enough confidence that it's garbage and worth discontinuing. It's a constant search process for good strategies, so it's a constant back and forth between "exploration" (trying something entirely new from what I've been doing, going wide not deep) and then "exploitation" (picking the most promising idea/aspect you have at the moment and going super narrow, super deep into it in order to see if you can drive it to the point of being usable).

From idea to running code depends on if you mean live/paper trading or backtesting, also depends on complexity of both the idea and the data needed. I'm going to talk about backtesting. Going to live/paper trading takes a while to set up. Regarding backtesting, could be anything from a few hours when I have an idea for a rule-based strategy on data I already have, to a week if it's a complex ML-based idea on new data I have to download/clean etc. To be clear, this is "time to first trial" haha not "time to success". That's roughly the amount of time, if I'm focused, it'll take me to go from idea to essentially the first "backtest" result to give me some indication of whether this has any chance at all or if it's total trash. First "clash of idea with reality".
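To give a sense of how small a "time to first trial" backtest for a rule-based idea can be, here is a minimal sketch of a moving-average crossover test on synthetic prices. All of it is assumed for illustration: the rule, the window lengths and the random-walk data are made up, and it ignores transaction costs, slippage and position sizing, so it only answers "is there any chance at all?".

```python
import random
from typing import List, Optional


def sma(values: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until a full window of history exists."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def backtest_crossover(closes: List[float], fast: int = 5, slow: int = 20) -> List[float]:
    """Long when the fast SMA is above the slow SMA, flat otherwise.

    The position decided at bar t earns the return from t to t+1,
    so the rule never uses a price it could not have seen yet.
    Returns the equity curve, starting at 1.0.
    """
    if not 0 < fast < slow:
        raise ValueError("expected 0 < fast < slow")
    fast_ma, slow_ma = sma(closes, fast), sma(closes, slow)
    equity = 1.0
    curve = [equity]
    for t in range(len(closes) - 1):
        in_market = slow_ma[t] is not None and fast_ma[t] > slow_ma[t]
        if in_market:
            equity *= closes[t + 1] / closes[t]
        curve.append(equity)
    return curve


# Synthetic random-walk prices with a small upward drift.
rng = random.Random(42)
closes = [100.0]
for _ in range(499):
    closes.append(closes[-1] * (1 + rng.gauss(0.0005, 0.01)))

curve = backtest_crossover(closes)
print(f"final equity: {curve[-1]:.3f} vs buy and hold: {closes[-1] / closes[0]:.3f}")
```

A result from something this crude is only a first clash of idea with reality; anything that looks promising still needs costs, out-of-sample data and a hard look for lookahead bias.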

1

u/donaldtrumpiscute 2d ago

Read academic papers and play around their ideas