One benefit of a non-traditional career trajectory is the diversity of projects you complete. My record spans fields such as Quantitative Finance, Data Science for Finance, Corporate Finance and Blockchain.
In each case, I worked with data in Python. Here are some notable projects that I enjoyed.
1 — Detecting fraudulent transactions on accounting data
During my master’s, I was introduced to Data Science while working as a research assistant. We were investigating methods to automatically detect potentially fraudulent transactions in a set of accounting data.
Our research started with simple statistical methods like Bayes’ Law and continued with supervised machine learning algorithms like Random Forests. The most time-consuming investigation was the application of unsupervised machine learning, e.g., Autoencoders.
My task was to code the models in Python and test them on accounting data.
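As a flavor of the simple statistical screens we started with (before moving to Random Forests and autoencoders), here is a minimal sketch: flag any transaction whose amount sits far from the rest of the sample. The threshold and the single-feature setup are illustrative assumptions, not the actual research pipeline.

```python
from statistics import mean, stdev

def flag_outliers(amounts, z_threshold=3.0):
    """Return indices of transactions whose amount deviates from the
    sample mean by more than z_threshold standard deviations.
    A toy z-score screen; real features and cutoffs were richer."""
    mu = mean(amounts)
    sigma = stdev(amounts)
    return [
        i for i, a in enumerate(amounts)
        if sigma > 0 and abs(a - mu) / sigma > z_threshold
    ]

# Twenty routine amounts plus one suspicious spike (synthetic data).
amounts = [100.0 + (i % 10) for i in range(20)] + [50_000.0]
print(flag_outliers(amounts))  # flags the spike at index 20
```

A screen like this produces a shortlist for a human to review; the machine learning models then reduce the false positives that a blunt z-score cutoff inevitably generates.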
2 — Model validation
Model validation relates to testing the assumptions of the broad calculations applied in banking, such as the validity of VaR, Expected Shortfall and other financial concepts.
This is a bureaucratic procedure. The goal is to thoroughly evaluate the model and advise on improvements. The output is a written document with the test results.
In this case, I had to use Python to backtest against the historical data, calculate the realized VaR and compare it to the figures used by the bank.
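The core of such a backtest can be sketched in a few lines: compute the empirical VaR from past returns, then count how many days actually breached it. This is a simplified illustration; a real validation uses rolling estimation windows and formal coverage tests, and the confidence level here is just an example.

```python
def historical_var(returns, confidence=0.99):
    """One-day historical VaR: the loss not exceeded on `confidence`
    of past days, reported as a positive number."""
    losses = sorted(-r for r in returns)          # losses, ascending
    idx = int(confidence * len(losses))           # empirical quantile
    return losses[min(idx, len(losses) - 1)]

def count_breaches(returns, var):
    """Number of days whose realized loss exceeded the VaR estimate."""
    return sum(1 for r in returns if -r > var)

# Synthetic return series: 99 quiet days and one crash day.
rets = [0.001] * 99 + [-0.10]
var_99 = historical_var(rets)
print(var_99, count_breaches(rets, 0.05))
```

Comparing the breach count against the count the confidence level implies (about one day in a hundred at 99%) is exactly the kind of evidence that ends up in the written validation report.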
3 — Excel process automation
The truth is that traditional financial institutions still heavily rely on Excel for data processing despite all the technological advancements. Being tasked with daily manual copy-paste was unacceptable, so I automated most of those processes.
The code was fairly simple — take data from multiple sources (the pandas library is your best friend here), process the data and perform calculations. Typically, the final output also had to be a newly generated Excel file, so I played around with formatting pandas data frames to make the result look like I had done it manually.
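The skeleton of such an automation might look like this. The column names and the small inline frames are placeholders — in practice each frame came out of `pd.read_excel` or `pd.read_csv` on a different source file, and the final write (commented out, since it needs openpyxl installed) produces the formatted Excel report.

```python
import pandas as pd

def consolidate(frames):
    """Stack per-source extracts and add a calculated column.
    Column names here are hypothetical placeholders."""
    df = pd.concat(frames, ignore_index=True)
    df["value_eur"] = df["quantity"] * df["price"]
    return df

# Small literals stand in for pd.read_excel(...) on real source files.
source_a = pd.DataFrame({"quantity": [10, 5], "price": [2.0, 3.0]})
source_b = pd.DataFrame({"quantity": [7], "price": [1.5]})
report = consolidate([source_a, source_b])

# Final step (requires openpyxl): format the numbers so the generated
# file looks hand-made, then write it out.
# report.style.format({"value_eur": "{:,.2f}"}).to_excel("report.xlsx")
```

Even this tiny pattern — concatenate, compute, export — replaces a surprising amount of daily copy-paste once the source files are stable.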
4 — Collecting trading data from multiple exchanges
When working with tokens of small market capitalisation listed on a few exchanges, observing the trading patterns on each exchange is insightful. To do that, you must have a system that collects and unifies the realized trades on each exchange of interest.
Here is a simple outline for such a project:
- Collect historical trade data from all exchanges using an API call (typically, no registration is needed);
- Create an SQL database with columns of interest (e.g., trade ID, timestamp, exchange, side, price, etc.) — be aware that such a database needs archiving measures in place as it becomes huge quite fast;
- Use SQL commands in combination with Python to get valuable insights such as trading frequency, average trade size, etc.
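The steps above can be sketched with the standard-library sqlite3 module. The sample rows stand in for an exchange API response — a real pipeline would fetch JSON from each exchange's public trades endpoint — and the schema mirrors the columns of interest listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real setup uses a file/server DB
conn.execute("""
    CREATE TABLE trades (
        trade_id  TEXT PRIMARY KEY,
        ts        INTEGER,     -- unix timestamp
        exchange  TEXT,
        side      TEXT,        -- 'buy' / 'sell'
        price     REAL,
        size      REAL
    )
""")

# Sample rows standing in for data collected from exchange APIs.
sample = [
    ("a1", 1700000000, "exchange_a", "buy",  1.02, 500.0),
    ("a2", 1700000030, "exchange_a", "sell", 1.01, 250.0),
    ("b1", 1700000010, "exchange_b", "buy",  1.03, 100.0),
]
conn.executemany("INSERT INTO trades VALUES (?, ?, ?, ?, ?, ?)", sample)

# One of the "valuable insights": trade count and average size per venue.
rows = conn.execute(
    "SELECT exchange, COUNT(*), AVG(size) FROM trades GROUP BY exchange"
).fetchall()
print(rows)
```

The `trade_id` primary key also gives you deduplication for free when a collector re-fetches an overlapping window — useful once the table starts growing fast enough to need the archiving measures mentioned above.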
5 — Trading bot detection
To expand on the previous project, I had to look at the patterns in the trading data to identify if a bot was potentially present on any of the exchanges.
In this project, I had to check if the trades followed a predefined suspicious pattern, such as orders incoming at an interval below a few seconds or equal-size round orders.
The final touch to this project is an automated notification when a suspicious pattern is detected.
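One of the suspicious patterns described above — equal-size orders arriving seconds apart — can be checked with a short scan over the trade stream. The gap and run-length thresholds here are illustrative assumptions, not the values actually used.

```python
def looks_like_bot(trades, max_gap_s=2, min_run=3):
    """Return True if the stream contains a run of at least `min_run`
    equal-size trades, each within `max_gap_s` seconds of the last.
    `trades` is a list of (timestamp_s, size) tuples sorted by time.
    Thresholds are illustrative, not production values."""
    run = 1
    for (t0, s0), (t1, s1) in zip(trades, trades[1:]):
        if t1 - t0 <= max_gap_s and s1 == s0:
            run += 1
            if run >= min_run:
                return True   # hook an automated notification here
        else:
            run = 1
    return False

human_like = [(0, 120.0), (40, 75.0), (95, 300.0)]
bot_like   = [(0, 100.0), (1, 100.0), (2, 100.0), (3, 100.0)]
print(looks_like_bot(human_like), looks_like_bot(bot_like))
```

In practice the flagged run would be written back to the trades database and pushed out as an alert (e.g., a message to a chat channel) rather than just returned as a boolean.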
I went from corporate banking to [data] engineering at a privacy startup. I code, enjoy remote living, explore life, read books and dance ballet.
Check out these other resources if you want to connect with me: