AWS S3 to Lambda to Aurora to Lambda to Binance | Serverless architecture for crypto trading

I recently asked on LinkedIn for advice and opinions on infrastructure for collecting, storing, and processing data, and for storing the derived data (features, signals) back, for some simple mid-frequency / stat-arb trading strategies.

I did not expect to receive so much feedback about infrastructure for trading data pipelines:

Many opinions on infra

I currently use a very simple stack based on serverless AWS cloud services: S3, Lambda, and PostgreSQL (Aurora Serverless v2). Before that, I hosted databases (PostgreSQL and MongoDB) on EC2 instances and maintained a whole bunch of Python scripts, scheduled by cron jobs, that wrote and read parquet files. On several occasions, instances went down, the services did not all restart properly, and I lost a couple of days of data and trading.
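To make the first hop of this stack (S3 to Lambda to PostgreSQL) concrete, here is a minimal sketch, not my actual code. It assumes the Lambda is triggered by an S3 upload notification and that the uploaded object is a CSV with hypothetical columns `ts`, `symbol`, and `price`. The helper names (`s3_objects_from_event`, `rows_from_csv`, `make_handler`) and the injected `read_object` / `write_rows` callables are illustrative assumptions; in a real deployment `write_rows` would wrap whatever PostgreSQL driver you use against Aurora.

```python
import csv
import io
import json
from urllib.parse import unquote_plus


def s3_objects_from_event(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications (spaces become '+').
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def rows_from_csv(text):
    """Parse CSV text with the assumed columns ts, symbol, price into tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(r["ts"], r["symbol"], float(r["price"])) for r in reader]


def read_object_from_s3(bucket, key):
    """Default reader: fetch the object body as text using boto3."""
    import boto3  # imported lazily so the parsing logic can be tested without AWS

    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"]
    return body.read().decode("utf-8")


def make_handler(read_object, write_rows):
    """Build a Lambda handler from a reader and a writer.

    read_object(bucket, key) -> str and write_rows(rows) -> None are injected,
    so the database side (hypothetical here) stays out of the parsing logic.
    """

    def handler(event, context):
        total = 0
        for bucket, key in s3_objects_from_event(event):
            rows = rows_from_csv(read_object(bucket, key))
            write_rows(rows)
            total += len(rows)
        return {"statusCode": 200, "body": json.dumps({"inserted": total})}

    return handler
```

In a deployed function, the module-level entry point would be something like `handler = make_handler(read_object_from_s3, my_pg_writer)`, where `my_pg_writer` is a placeholder for a function that inserts the rows into the Aurora table. Injecting the reader and writer keeps the handler runnable locally with stubs, which is handy when there is no time to maintain a separate test environment.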

The common theme in the opinions I collected is a challenge to the use of a PostgreSQL database. People suggested doing one of the following instead:

Don’t use a database at all, but instead:

  • use csv files
  • use parquet files
  • use flat files on S3

or use other databases:

or other services such as:

Concerning the Lambdas:

  • use Docker container images (10 GB limit) instead of zip archives (250 MB limit, unzipped)

Of course, I am not familiar with all of the above technologies, let alone a master of them, and I have limited time to maintain the trading infrastructure (this is a solo project); my main focus is alpha research rather than technology. I am essentially looking for a sound, robust, and easy-to-maintain pipeline to go from data to trading; hence serverless is convenient for me, for now.

However, thanks to the feedback, I am thinking of exploring two options further: 1) Amazon SageMaker Feature Store; 2) Prefect.