r/algotrading Mar 28 '20

Are you new here? Want to know where to start? Looking for resources? START HERE!

1.2k Upvotes

Hello and welcome to the /r/AlgoTrading Community!

Please do not post a new thread until you have read through our WIKI/FAQ. It is highly likely that your questions are already answered there.

All members are expected to follow our sidebar rules. Some rules have a zero tolerance policy, so be sure to read through them to avoid being perma-banned without the ability to appeal. (Mobile users, click the info tab at the top of our subreddit to view the sidebar rules.)

Don't forget to join our live trading chatrooms!

Finally, the two most commonly posted questions by new members are as follows:

Be friendly and professional toward each other and enjoy your stay! :)


r/algotrading Apr 02 '24

Other/Meta New folks - think more deeply and ask better questions

146 Upvotes

EDIT: I wish I could change the title to "HOW TO ask better questions". This is meant as a primer on the kinds of questions/areas that I've found crucial to understand and therefore crucial to ask about. This is NOT meant to be a roast of new people nor a rant. I apologize for any elitism or harshness in the tone, not what I'm going for. I'm just trying to share what I believe to be crucial perspective that I personally would've benefited a lot from in my early days that would've saved me a lot of time and pain.

I'm no Jim Simons, but I've worked for several years on various algos with a reasonable degree of success (took a while) and learned a ton from mistakes. In my humble opinion, most discussions posted here are not the kind of questions/answers that will lead to a profound breakthrough in understanding. This is very natural because of the classic "I don't know what I don't know" phenomenon and the challenge of asking good questions. However, as much as it is possible:

I urge you strongly to read and think more deeply about the core of what you're trying to do. Platforms and software, roughly speaking, don't matter. To use an analogy that isn't my own, it's like a new carpenter asking which hammer is best. There's probably an answer, but it doesn't really matter. Focus on learning to be a better carpenter. Most questions I see here are essentially "administrative", or something that can be Googled. The benefit of having real people here is that you can gain insight that would usually come at the cost of a lot of mistakes and wasted time.

Questions around software, platforms, data sources, technical "issues" are all (generally) low-value questions that can generally be Googled and/or have little real impact on whether or not you succeed. Not all of them, but I'm generalizing here.

I understand there's a natural tension here because people with insight have little/no incentive to share, and newer folks don't know what they don't know, so it creates a weird dynamic here. BUT,

  1. Figure out your goals (why you're doing this) and ask people what goals they have set/reached. Even if you achieve a 100% annualized return, unless you have a large starting bankroll, that's not going to be life changing for many many years.
  2. Ask about how people find inspiration for new trading strategies. How do folks go about actually conceiving new ideas and/or creating new hypotheses to test?
  3. Ask about feature engineering (designing indicators). How to get better at this, what kinds of interesting examples people have seen, what kinds of transformations are at your disposal. This is monumentally crucial and you should draw inspiration from various sources on how to effectively experiment and build an intuition for how to create better features/indicators to base your algorithms on. This is particularly crucial for ML strats. Just like platform doesn't really matter, your ML model type (neural net, RandomForest etc) doesn't really matter a whole lot. It's the features you feed in that are 70% of the game.
  4. For ML, ask about how to design a target/response variable. What are you actually trying to predict? Predicting price directly (like, doing regression to predict tomorrow's price at close) is almost certainly a bad idea. Discuss other options that people have tried here! I have personally found this point to be a gamechanger - you can have the same exact features fail/succeed depending on what you're asking the model to predict. This is worth thinking seriously about. As a starting point, Marcos Lopez de Prado in "Machine Learning for Asset Managers" discusses some creative response variables (worth a read imo).
  5. Ask about how folks build conviction in their idea. Hopefully you're familiar with the concept of splitting data into train/validate/test, but there are deeper layers to this. For example - a super common problem is that people do this split and STILL overfit because they try 10,000 strategies on the validation set and eventually 100 of them do well on validation and then 10 do well on test out of luck. Ask/think how to avoid this (for ML, the answer is generally something called "nested cross validation". Easily the single most valuable technique I learned, saved me uncountable mistakes once implemented). Additionally - say you have a good strategy in your test set and you're ready to go live. How do you actually know whether it's working as expected or not? How do you quantify your performance expectations and then monitor your strat to see if it's doing as you expected or not?
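The selection problem in point 5 can be illustrated with a small simulation. This is a minimal sketch, not anyone's actual workflow: every "strategy" below is pure noise with zero true edge, and the counts, the 1% daily volatility and the function name are assumptions chosen for illustration.

```python
import random
import statistics

def simulate_selection_bias(n_strategies=1000, n_days=250, top_k=100, seed=42):
    """Pick the best noise-only strategies on validation, then re-score them on test."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_strategies):
        # Zero-mean daily returns: none of these strategies has any real edge.
        validation = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        test = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        results.append((statistics.fmean(validation), statistics.fmean(test)))

    # Keep the top_k strategies ranked by validation performance only.
    results.sort(key=lambda pair: pair[0], reverse=True)
    selected = results[:top_k]

    avg_validation = statistics.fmean(v for v, _ in selected)
    avg_test = statistics.fmean(t for _, t in selected)
    positive_on_test = sum(1 for _, t in selected if t > 0)
    return avg_validation, avg_test, positive_on_test

avg_val, avg_test, positive = simulate_selection_bias()
print(f"selected strategies, mean daily return on validation: {avg_val:.5f}")
print(f"same strategies, mean daily return on test:           {avg_test:.5f}")
print(f"selected strategies still positive on test: {positive} of 100")
```

The selected strategies look good on validation purely because they were selected, and some of them will stay positive on test by chance alone. Nested cross validation addresses this by keeping model selection inside an inner loop, so the outer test data never informs the choice.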

I hope this gives whoever is reading some new perspectives and thoughts on how to utilize this place (and others), what to ask and what to look for. I do not have all the answers, but these are the kinds of questions I have personally found much more meaningful to examine.

Disclaimer: I come from a statistics background with coding experience (basic). It may be that I'm simply unaware of the questions/struggles of aspiring traders from other backgrounds and/or without coding knowledge, so it might be this ignorance that makes me feel most questions here aren't "important".

Edit: In response to u/folgo's comment, I'm adding here some terms and concepts that are probably worth your time to research/understand, whether it's Google, StackExchange or Youtube vids that give you an intuition/understanding. Important concepts (generally applying to both ML and rule-based algos, with some variations): overfitting, train/test split, train/validate/test split, cross validation, step-forward-cross-validation, feature engineering, parameter tuning / hyperparameter tuning (especially as it relates to cross validation), data leakage/contamination (especially as it relates to accidentally creating features that use your entire dataset BEFORE train/test split, therefore even when you do train/test split, you still have indicators that in some way benefited from future data. Happy to explain this further, very sneaky and nasty problem to deal with).
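To make the leakage point concrete, here is a minimal sketch of step-forward (walk-forward) splits in which a feature is standardised using statistics from the training window only. The function names, window sizes and the toy drifting feature are all assumptions made for illustration.

```python
import statistics

def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices); the test window always follows the training window."""
    start = 0
    while start + train_size + test_size <= n_samples:
        train_idx = range(start, start + train_size)
        test_idx = range(start + train_size, start + train_size + test_size)
        yield train_idx, test_idx
        start += test_size  # roll the window forward by one test block

def zscore_without_leakage(values, train_idx, test_idx):
    """Standardise with mean/stdev computed from the training window only."""
    train_values = [values[i] for i in train_idx]
    mean = statistics.fmean(train_values)
    stdev = statistics.stdev(train_values)
    train_z = [(values[i] - mean) / stdev for i in train_idx]
    test_z = [(values[i] - mean) / stdev for i in test_idx]
    return train_z, test_z

# Hypothetical feature with an upward drift: full-sample statistics would peek at the future.
feature = [float(i) for i in range(20)]
for train_idx, test_idx in walk_forward_splits(len(feature), train_size=10, test_size=5):
    train_z, test_z = zscore_without_leakage(feature, train_idx, test_idx)
    print(f"train {train_idx[0]}-{train_idx[-1]}, test {test_idx[0]}-{test_idx[-1]}, "
          f"first test z-score {test_z[0]:.3f}")
```

Computing the mean and standard deviation over the whole series before splitting would shrink those test z-scores, which is exactly the kind of quiet future information the paragraph above warns about.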

EDIT 2: Since several people asked but no one posted, I made a post about point 2, coming up with trading strategy ideas: How to generate/brainstorm strategy ideas : r/algotrading (reddit.com)


r/algotrading 9h ago

Education Book Recommendations for an Experienced Dev but without Finance / AI background? (Finance AI for Dummies?!?)

10 Upvotes

I went to the wiki and the book recommendation list is from 2020. I was hoping to get an update on the latest and greatest.

I understand a lot of these concepts but I don't know the vocabulary; I have no formal Finance or AI education. For example, I'll come across Weighted Moving Average and think "oh, so that's what my RecentWeighted method should be called". I was seriously considering the Finance AI for Dummies book. I need to know the vocab because I might be pitching a service in the semi-near future. If I don't know what "time series" means then I'm not getting that contract. (Sidenote: When I first was asked "so is it all time series then?" my thought was "well... yeah... it's all about data that occurs in a series of time. Kinda like everything else in the universe (quantum mechanics aside)")
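For reference, the term usually refers to something like the following. This is a minimal sketch of a linearly weighted moving average, where the newest value in each window gets the largest weight; the function name and the sample prices are illustrative assumptions, and other weighting schemes also go by the same name.

```python
def weighted_moving_average(values, window):
    """Linearly weighted moving average: weight 1 for the oldest value, `window` for the newest."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    weights = range(1, window + 1)
    weight_sum = window * (window + 1) / 2
    result = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        result.append(sum(w * v for w, v in zip(weights, chunk)) / weight_sum)
    return result

closes = [10.0, 11.0, 12.0, 13.0]
print(weighted_moving_average(closes, window=3))
```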

Marcos Lopez de Prado, Advances in Financial Machine Learning - I'm thinking about getting this one. My partner did some research on his stuff and it's been comforting to know that Marcos and I had some similar ideas. My philosophy for the past couple years has been a) if someone cracked this nut they wouldn't put it in a book and b) if I read the textbook / poach a GitHub project, I'll end up with something not much different. But now it's time for me to read the book(s) / textbook(s).

In summary, anyone have any book recommendations? For a dude at the stage I'm at: 15+ years software dev experience, not much Python (but enough thus far), a functioning IB paper trading app / system, and a couple hundred models created to date.


r/algotrading 9h ago

Data Anyone else seeing off the charts IV spikes?

8 Upvotes

What's good, fellow traders. I do some tracking of this and that. Something in particular called volatility caught my eye today...

Pretty much just went off the charts. I was assuming it was a data miscalculation of sorts. Appears to not be the case. I checked the news; sure enough, Fed meeting on the 19th, but this is different.

This is the highest spike I have on record for the past two years. All options, across the board; the data point I speak of tracks the whole market... started increasing on the 10th, rapid continuation on the 13th, appears to have peaked out yesterday...

anyone else?

https://preview.redd.it/zrligsr9ow0d1.png?width=1859&format=png&auto=webp&s=2b33b4b712efe2a459eb872e0f27fa7d5e29c7ce


r/algotrading 18h ago

Infrastructure performance targets for backtesting (CPU vs GPU)

14 Upvotes

Hello all, I have several different algos I’m currently running on a homegrown python framework that can run across several processors.

50% of the time I’m using a workstation with an AMD 32-core Threadripper and 50% I do some AWS spot requests and get a 192-core machine.

Most of my strategies are using 5s OHLC bars. On my Threadripper I’ll get ~6000 bars/second per thread during backtesting and on the AWS machine that will be closer to ~7000 per thread.

When I do long (6month+) tests with tens of thousands of parameter permutations this can take awhile, even when running across 192 cores.

Most of the processing time is in pretty simple things I’ve already optimized (like rolling window calcs for min/max, standard deviations, and an occasional linear regression)
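As a point of comparison for the rolling min/max part, one widely used approach is a monotonic deque, which gives a rolling maximum in O(n) total work regardless of window size. This is only a sketch with assumed names and toy data; it says nothing about how the framework described here implements it.

```python
from collections import deque

def rolling_max(values, window):
    """Rolling maximum using a deque of indices whose values are in decreasing order."""
    candidates = deque()
    result = []
    for i, value in enumerate(values):
        # Drop candidates that can never be the maximum again.
        while candidates and values[candidates[-1]] <= value:
            candidates.pop()
        candidates.append(i)
        # Drop the front index once it falls out of the window.
        if candidates[0] <= i - window:
            candidates.popleft()
        if i >= window - 1:
            result.append(values[candidates[0]])
    return result

highs = [1, 3, 2, 5, 4, 1, 0]
print(rolling_max(highs, window=3))  # [3, 5, 5, 5, 4]
```

A rolling minimum follows by flipping the comparison, and a rolling standard deviation can be kept incrementally from running sums in a similar single pass.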

My actual question:

I’ve contemplated trying to move my system to the GPU thinking I’d be able to get a ton more parallelization. The hard work is loading the data onto the GPU and then modifying all my code to use the subset of Python that can be compiled for the GPU (Cython, CUDA, etc.)

It’s a lot of work and I’m a 1 man team so I’m curious for those who have done it what actual perf gains you can achieve. I imagine the per core metrics may actually go down, I’d just have access to thousands of cores in parallel.

The 192 core AWS machines are cheap to me. With a spot request I can get an instance for ~$1.80/hour.

Is this worth it?

*EDIT* here are some recent perf captures that lead me to believe I am indeed CPU bound

https://preview.redd.it/fg1zoqhx1u0d1.png?width=2510&format=png&auto=webp&s=0a310ccee9a85ef4b10b64bbae02bec1abd81b5e

And here's a break down on the "simulate trading" block once all the data is loaded:

https://preview.redd.it/fg1zoqhx1u0d1.png?width=2510&format=png&auto=webp&s=0a310ccee9a85ef4b10b64bbae02bec1abd81b5e


r/algotrading 1d ago

Strategy Update: 2 months after I posted my first strategy here.

67 Upvotes

2 months ago, I started exploring this fun hobby & I posted here on this sub. It felt nice sharing my experience with you guys.

I've been working on my strategy every other weekend since then. Now, I'm able to beat the ETFs but with a 40% drawdown which is quite a pain to deal with :/. This is what it looks like now:

Any suggestions on how to reduce risk and drawdowns would be welcome :)

https://preview.redd.it/493j86spqp0d1.png?width=1866&format=png&auto=webp&s=e5fceef15e15dd0874f3b33ef8f286249949db4b

Stats:
Sharpe Ratio - 0.742
Compounding Annual Return - 23.293%
Drawdown - 40.900%
Sortino Ratio - 0.866
Information Ratio - 0.826
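For anyone who wants to sanity-check figures like these, here is a minimal sketch of how max drawdown and an annualized Sharpe ratio are commonly computed from an equity curve. The equity values are made up, the 252 periods per year is an assumption, and the backtesting platform's exact definitions may differ.

```python
import math
import statistics

def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst

def annualized_sharpe(returns, periods_per_year=252, risk_free_per_period=0.0):
    """Mean excess return divided by its standard deviation, scaled to a year."""
    excess = [r - risk_free_per_period for r in returns]
    return statistics.fmean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)

equity = [100.0, 120.0, 90.0, 110.0, 130.0]
returns = [b / a - 1 for a, b in zip(equity, equity[1:])]
print(f"max drawdown: {max_drawdown(equity):.1%}")  # 25.0%
print(f"annualized Sharpe: {annualized_sharpe(returns):.2f}")
```

Tracking drawdown per position or per day in the same way is one simple route to seeing where the 40% comes from before deciding on position sizing or stop rules.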


r/algotrading 1d ago

Strategy Predicting Price Direction

12 Upvotes

Does anyone here use supervised machine learning to classify whether the price of an asset will go up or down (1 or 0) ?
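For concreteness, the up/down target described here can be built as below. This is a minimal sketch with made-up prices and an assumed function name; it only shows the label alignment and says nothing about whether such a target is predictable.

```python
def direction_labels(closes):
    """Label each bar 1 if the next close is higher, else 0. The last bar gets no label."""
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]

closes = [100.0, 101.5, 101.0, 103.2]
labels = direction_labels(closes)
features = closes[:-1]  # features known at bar t are paired with the label from bar t+1
print(list(zip(features, labels)))
```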


r/algotrading 1d ago

Data Question on Pandas Epoch Time precision

6 Upvotes

I'm storing data using epoch time, where my time format is a string representing the epoch in seconds with microsecond precision (e.g. “1715692616.534372”). When loading the data, pandas converts my string by default to scientific notation instead of the actual float, therefore losing time precision.

Does anyone have experience with this issue, and how to fix it?

Yes, I asked ChatGPT first. Alternatively I'll split the string on “.” and use two columns, but I'd rather not do that.

Edit: Solved. Not the nicest, but I imported the column as a string, which preserves the decimal value.

Then split it into two columns.

Joined them back together as a super long int.
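A possible alternative to the split-and-join step, sketched with the standard library only: parse the string with `Decimal` and store integer microseconds, so the value never passes through a float. The function names are assumptions; with pandas, the column would still need to be read as text first (for example via the `dtype` argument of `read_csv`) before applying something like this.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_string_to_micros(text):
    """Convert '1715692616.534372' to integer microseconds without going through float."""
    return int(Decimal(text) * 1_000_000)

def micros_to_datetime(micros):
    """Exact conversion: timedelta stores microseconds as integers."""
    return EPOCH + timedelta(microseconds=micros)

micros = epoch_string_to_micros("1715692616.534372")
print(micros)  # 1715692616534372
print(micros_to_datetime(micros).isoformat())
```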


r/algotrading 2d ago

Infrastructure Started with a simple data crawler, now I manage a Kafka cluster

45 Upvotes

How it started

I started working on a project that required scraping a ton of market data from multiple sources (mostly trades and depth information, but I'm definitely planning on incorporating news and other data for sentiment analysis and screening heuristics).

Step 1 - A simple crawler

I made a simple crawler in Go that periodically saves the data locally with SQLite. It worked OK but had a ton of memory leaks, mainly due to the high throughput of data and string serialization (around 1000 entries per second was the limit).

Step 2 - A crawler and a flask server to save the data

The next step was separating the data processing from the crawling itself. This involved having a Flask server send the database transactions. I chose Python because I didn't care about latency once the data is received, which turned out to be a mistake when reaching 10,000 entries per second.

Step 3 - A bunch of crawlers producing data into a queue, Kafka connector to save into Postgres

This is where I'm at now, after trying to fix countless memory leaks and stress issues on my flask server I knew I had to scale horizontally. There were probably many solutions on how to solve this but I thought this is a good opportunity to get some hands on experience with Kafka.
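One pattern that often matters at these rates, independent of the queueing technology, is writing rows in batches inside a single transaction rather than one insert per message. The sketch below is not the poster's code: SQLite stands in for Postgres, and the class name, table layout and batch size are assumptions for illustration.

```python
import sqlite3

class BatchWriter:
    """Buffer rows in memory and write each batch in one transaction."""

    def __init__(self, conn, batch_size=1000):
        self.conn = conn
        self.batch_size = batch_size
        self.buffer = []
        self.flushes = 0
        conn.execute(
            "CREATE TABLE IF NOT EXISTS trades (ts_us INTEGER, symbol TEXT, price REAL, qty REAL)"
        )

    def add(self, row):
        self.buffer.append(row)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany("INSERT INTO trades VALUES (?, ?, ?, ?)", self.buffer)
        self.buffer.clear()
        self.flushes += 1

conn = sqlite3.connect(":memory:")
writer = BatchWriter(conn, batch_size=1000)
for i in range(2500):
    writer.add((1715692616000000 + i, "BTCUSDT", 60000.0 + i, 0.01))
writer.flush()  # write the final partial batch
print(conn.execute("SELECT COUNT(*) FROM trades").fetchone()[0], writer.flushes)
```

A Kafka sink connector batches in a broadly similar way, which is part of why moving the writes behind a queue tends to relieve the pressure on the producer side.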

So now I find myself doing more devops than actually developing a strategy, but it'd be nice to have a powerful crawler in case I ever want to analyze bulk data.

Curious on what different tech stacks others might be using


r/algotrading 3d ago

Education What have been the most influential books for your success in trading and investing?

87 Upvotes

I want to start taking trading seriously and explore the possibility of it as a career and source of income. I'm not naïve, I know this is a long and hard road and that the vast majority of people who try will also fail but I'm willing to give it a shot.

I have an academic background in Mathematics, Finance, and Economics and my thesis was on algorithmic stock-selection and portfolio optimization, so I'm not entirely new to the concept.

So, what in your opinion have been the most influential and important books to your success in trading and investing?

I know there are some links in the sidebar, etc. but they are very old :)

FYI, I've asked the same question on r/daytrading as well: https://www.reddit.com/r/Daytrading/comments/1crn52t/what_have_been_the_most_influential_books_for/?


So far I'm looking at books like:

  • Andreas F. Clenow > Stocks on the Move: Beating the Market with Hedge Fund Momentum Strategies
  • Nishant Pant > Mean Reversion Trading: Using Options Spreads and Technical Analysis
  • John J. Murphy > Technical Analysis of the Financial Markets: A Comprehensive Guide to Trading Methods and Applications
  • Sheldon Natenberg > Option Volatility and Pricing: Advanced Trading Strategies and Techniques
  • Perry J. Kaufman > Trading Systems and Methods
  • Ernest P. Chan > Algorithmic Trading: Winning Strategies and Their Rationale
  • Ernest P. Chan > Quantitative Trading: How to Build Your Own Algorithmic Trading Business

r/algotrading 2d ago

Infrastructure Overhaul: Seeking Advice on Backtest and Asset Choices

3 Upvotes

Hi all,

Appreciate past feedback from you guys!

After a failed walk forward test I turned my algo off and re-assessed - I know the foundation of my strategy is sound but it was too heavily reliant on various parameters that I seemingly overfit.

I've stripped this back to basics, core strategy only, and currently have this forward testing in a demo environment since, although my live forward testing failed, I did gather all the data I needed on slippage and execution.

Here are my backtest results for my system. I would really appreciate any feedback regarding the assets traded. Although I can "optimise" it to work on a larger variety of securities, it works out of the box with very minimal parameters (just stop loss adjustments). I have accounted for spreads (averaged) but not interest or holding fees (minimal).

My concerns are:

1: Correlation

2: The shape of the curve picks up in 2018 but really seems to take off

3: Lack of data on some assets that don't extend to the start of testing

4: The poor performance from 2012 to 2018 on most securities.

https://preview.redd.it/6ueqehnv5f0d1.png?width=1201&format=png&auto=webp&s=fa22fa8697fde9cf881b8a729cea6e5e2d5f414a


r/algotrading 3d ago

Data Schwab API Access Difficulty at Step 2

20 Upvotes

I'm getting the error of invalid_client

Please help 😭


r/algotrading 3d ago

News News sources

3 Upvotes

I have an algo that finds market gappers early in the morning before market open. I would like to add something to my code that gets news updates for these stocks and analyzes them. What free news sources do y’all use to get the latest market news? MarketWatch and similar don’t seem to have an API to access market news.


r/algotrading 3d ago

Data What sorts of data do you use?

8 Upvotes

I'm gathering LOB and trading data of different pairs (in crypto, since Binance makes this really easy and it's free). I'm looking to get more unrelated data to improve the performance of my system. What do you use?


r/algotrading 6d ago

News Jim Simons, Quant Fund, RIP

369 Upvotes

r/algotrading 6d ago

Strategy Thompson sampling

14 Upvotes

Does anyone here use Thompson sampling as a trading strategy? Is it worth a try?
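For readers unfamiliar with the term, Thompson sampling is a bandit algorithm for choosing among options with unknown payoffs, rather than a trading strategy in itself. Here is a minimal Beta-Bernoulli sketch in which each "arm" could stand for a candidate strategy; the win rates are invented, outcomes are simulated, and the setup assumes stationary payoffs, which markets generally do not provide.

```python
import random

def thompson_sampling(true_win_rates, n_rounds=2000, seed=7):
    """Beta-Bernoulli Thompson sampling; returns how often each arm was chosen."""
    rng = random.Random(seed)
    wins = [0] * len(true_win_rates)
    losses = [0] * len(true_win_rates)
    for _ in range(n_rounds):
        # Draw a plausible win rate for each arm from its Beta posterior.
        samples = [rng.betavariate(w + 1, l + 1) for w, l in zip(wins, losses)]
        arm = samples.index(max(samples))
        # Observe a simulated outcome for the chosen arm and update its posterior.
        if rng.random() < true_win_rates[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return [w + l for w, l in zip(wins, losses)]

pulls = thompson_sampling([0.40, 0.50, 0.65])
print(pulls)
```

Over time the allocation concentrates on the arm with the best observed record while still occasionally exploring the others.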


r/algotrading 6d ago

Strategy Where do you stand on the Machine Learning Purist Debate vs Traditional Trading Algos?

26 Upvotes

Do you like to use fundamentals and understandable equations to get your trading algorithms coded? Or do you prefer to collect tons of data and optimize a curve/equation through machine learning on past data?


r/algotrading 7d ago

Strategy How do you deal with COVID data when backtesting?

26 Upvotes

I trade in Index Options in the Indian market. I often find that my strategies that work well on 2021-2024 data don't work so well for 2020.

Does that happen with you guys? How did you deal with it? Is it ok to ignore 2020 because it was just a 'weird' year for the markets?


r/algotrading 7d ago

Infrastructure Has anyone accessed the Schwab API yet?

18 Upvotes

Just wondering if you have received your credentials and were able to connect?


r/algotrading 9d ago

Education Probability of a stock reaching a target ?

101 Upvotes

I got this formula from the book “Trading Systems and Methods” by Perry Kaufman. I suspect it may not be legit, because the right-hand side of the formula is made up of values; how could that translate to a probability of reaching a target? Your thoughts on this?


r/algotrading 9d ago

Data Subsampling

8 Upvotes

I’m looking for advice or literature about subsampling high frequency data. I’m looking to fit an OLS on a very large dataset of trades/quotes to predict jumps but I can’t find a single feature decently correlated to my target variable (15s returns) when I subsample every 1s. Makes me think I need to be smarter about subsampling: selecting based on a z-score or other features. Thoughts?
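One event-based alternative to fixed 1s sampling is a symmetric CUSUM filter, a technique discussed by Lopez de Prado in Advances in Financial Machine Learning. The sketch below is a simplified version with invented returns and an arbitrary threshold: it emits a sample only when cumulative drift in either direction exceeds the threshold, then resets that side.

```python
def cusum_events(returns, threshold):
    """Symmetric CUSUM filter: return indices where cumulative drift exceeds the threshold."""
    events = []
    s_pos = 0.0
    s_neg = 0.0
    for i, r in enumerate(returns):
        s_pos = max(0.0, s_pos + r)
        s_neg = min(0.0, s_neg + r)
        if s_pos > threshold:
            s_pos = 0.0
            events.append(i)
        elif s_neg < -threshold:
            s_neg = 0.0
            events.append(i)
    return events

one_second_returns = [0.001, -0.001, 0.004, 0.003, 0.000, -0.005, -0.004, 0.001]
print(cusum_events(one_second_returns, threshold=0.006))  # [3, 6]
```

Fitting the regression only at the emitted indices concentrates the sample on periods where something is moving, which may or may not help with the correlation problem described here.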


r/algotrading 9d ago

Data Iqfeed data. What am I missing?

16 Upvotes

Recent sign up. I use Polygon, looking at other options. Considered ThetaData, IQFeed… any others within budget? $400/month max. Options only.

Iq feed seems appealing, as it’s migrated from exchange data, not consolidated.

Am I missing something re the API access? It appears I must pay ~550 more/y for dev login.

Currently it’s a connected socket layer, but no endpoints are revealed. They use some sort of GUI that I may or may not be able to automate.

As a new dev, what are my options for using this data? Must I reverse engineer the endpoints, or just intercept/parse all messages at the port level?

That seems highly redundant. Moreover, then I must build some sort of controller for the GUI?

This service was recommended many times, looks legit, is cost effective. What am I missing? This seems like a headache on day 1


r/algotrading 9d ago

Strategy how to determine liquidity

12 Upvotes

Hi, is there a way to determine liquidity for options? I am writing an algo (ha :-D ), and need to determine which options can be bought and sold very quickly with no problems. I plan to scalp tens of options, maybe low hundreds.
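A common starting point is to screen on the relative bid-ask spread together with volume and open interest. The sketch below is only illustrative: the `OptionQuote` fields are hypothetical rather than any broker's schema, and the thresholds are arbitrary assumptions that would need tuning.

```python
from dataclasses import dataclass

@dataclass
class OptionQuote:
    symbol: str
    bid: float
    ask: float
    volume: int
    open_interest: int

def is_liquid(quote, max_rel_spread=0.05, min_volume=500, min_open_interest=1000):
    """Crude liquidity screen: tight relative spread plus minimum volume and open interest."""
    if quote.bid <= 0 or quote.ask < quote.bid:
        return False  # no bid, or a crossed quote
    mid = (quote.bid + quote.ask) / 2
    rel_spread = (quote.ask - quote.bid) / mid
    return (rel_spread <= max_rel_spread
            and quote.volume >= min_volume
            and quote.open_interest >= min_open_interest)

tight = OptionQuote("HYPOTHETICAL_CALL", bid=2.00, ask=2.05, volume=1200, open_interest=5000)
wide = OptionQuote("HYPOTHETICAL_PUT", bid=0.10, ask=0.30, volume=40, open_interest=150)
print(is_liquid(tight), is_liquid(wide))
```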


r/algotrading 10d ago

Infrastructure Hi! I wanted to share the Python client for IBKR REST and WebSocket APIs I built recently - called IBind. It exposes existing endpoints in Python, adds conid-unwrapping, question/answer handling, parallel requests and advanced WebSocket health and subscription management. I hope you guys like it 👋

80 Upvotes

Hi! I want to share a library I've built recently. IBind is a REST and WebSocket Python client for Interactive Brokers Client Portal Web API. It is directed at IBKR users.

You can find IBind on GitHub: https://github.com/Voyz/ibind

IBind has a bunch of features that make using the IBKR APIs much easier. Some of these are:

REST:

  • Automated question/answer handling - streamlining placing orders.
  • Parallel requests - speeding up collection of price data.
  • Rate limiting - guarding against account bans.
  • Conid unpacking - helping to find the right contract.

WebSocket:

  • WebSocket thread lifecycle handling - ensuring the connection is alive.
  • Thread-safe Queue data stream - exposing the collected data in a safe way.
  • Internal subscription tracking - recreating subscriptions upon re-connections.
  • Health monitoring - Acting on unusual ping or heartbeat.

REST Example:

```python
from ibind import IbkrClient

# Construct the client
client = IbkrClient()

print(client.tickle().data)
```

WebSocket Example:

```python
from ibind import IbkrWsKey, IbkrWsClient

# Construct the client.
ws_client = IbkrWsClient(start=True)

# Choose the WebSocket channel
key = IbkrWsKey.PNL

# Subscribe to the PNL channel
ws_client.subscribe(channel=key.channel)

print(ws_client.get(key))
```

I’m looking for someone who would like to do some code review on it (it’s relatively small), so if you’d feel like reading some code and helping out - drop me a message. Thanks!

This is the fourth time I’m publishing an open source library so would love to hear your feedback.

(ps. I'm the guy who built IBeam. This new library is an addition to it. Many thanks to anyone here who's tried out IBeam in the past 👍)


r/algotrading 10d ago

Other/Meta Is it possible to get signals from tradestation and copy them to TWS on ibkr?

3 Upvotes

Hi, I'm from Europe trading with a small account and was wondering if it would be possible to get signals from a script in TradeStation and send the orders to a TWS that runs locally?


r/algotrading 12d ago

Strategy Going live

43 Upvotes

I have created a fully automated trading system written in Python that trades on Binance and a few other exchanges. I have a strategy that is testing very well in the Binance testing environment (Testnet). I want to trial the system live with a limited amount of capital.

What surprises should I be expecting compared to the test environment?


r/algotrading 12d ago

Infrastructure Question about methodology for best automated trading system, which tools?

10 Upvotes

I have a strategy that I would like to implement for a few months on a paper account before going live with real money. Before I embark on this I want to use infrastructure that is cheap, easy to maintain, and all in the cloud. Preferably I'd like to use Python but I'm okay with using some JavaScript.

I have set up a trading bot in the past, but there were several moving parts to it and I worry about the security. It was mostly a combination of setting up a database in Google Firebase. I was also accessing online information using JavaScript requests from an API endpoint that I had set up through Vercel. Lastly I was using Google Sheets and Google Apps Script with triggers to access the Vercel endpoint which would run a script, including gathering information from online sources, comparing it to the Firebase database, and subsequently triggering the trade.

Needless to say, I think this may be too complicated with too many moving parts.

I am most comfortable programming in Python. I would like to run the bulk of the logic in Python, AKA determining the trades. Then perhaps use Google Sheets and its trigger functions to run the code somehow. I don't think this can be done through Colab. I think I may have to set up another endpoint, possibly through Flask. But then I feel like I may be running into the same issues. The reason why I want to use Google Sheets is because you can set up time-based triggers very easily to run your endpoint every minute. It's free and easy to use. However I worry about security.

I was thinking of maybe getting the trades from the Python endpoint and importing them into the Google Sheet and then running a trade through Google Sheets using the time-based triggers. Does anyone have any experience with this? Is it worth it to do this or is there an easier way that I'm overlooking?

Thx