Crypto Antifraud: Bitcoin transaction analysis with machine learning
Keywords BITCOIN, CRYPTOCURRENCY, MACHINE LEARNING, ARTIFICIAL NEURAL NETWORKS, DEEP LEARNING, OPTIMIZATION, PYTHON
Reference persons EROS GIAN ALESSANDRO PASERO, VINCENZO RANDAZZO
External reference persons The thesis is carried out in collaboration with a company
Research Groups Neuronics (Artificial Neural Networks)
Thesis type EXPERIMENTAL APPLIED, START UP
Description The primary objective of this project is to develop a predictive service through the application of AI/ML that analyzes transactional patterns of Bitcoin accounts to ascertain if they are potentially fraudulent.
The key aim is to flag an address as malicious before it is reported by sites that collect fraud reports and allegations, in order to protect community users preemptively. Because the Bitcoin addresses will be classified with a supervised learning technique, a database of reliably labelled (licit vs. illicit) transactions will be provided as a representative sample of the real population.
Neural networks will be used to recognize patterns in the transaction history of an account, with reference to this specific use case. The performance of various machine learning models will be compared along key dimensions such as computational efficiency, training time, and inference throughput.
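As an illustration of how two of the comparison dimensions named above (training time and inference throughput) might be measured, here is a minimal sketch in plain Python. The `benchmark` function and the `MajorityBaseline` class are hypothetical names introduced only for this example; the baseline is a toy stand-in for a real model and is not part of the project.

```python
import time


class MajorityBaseline:
    """Toy stand-in for a real model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def benchmark(model, X_train, y_train, X_test):
    """Return training time (seconds) and inference throughput (samples per second)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predictions = model.predict(X_test)
    # Guard against a zero reading on very small inputs.
    infer_seconds = max(time.perf_counter() - start, 1e-9)

    return {
        "train_seconds": train_seconds,
        "throughput_per_s": len(X_test) / infer_seconds,
        "predictions": predictions,
    }


# Made-up data: one feature per address, label 1 = illicit.
X = [[0.1], [0.4], [0.9], [0.7]]
y = [0, 0, 1, 0]
report = benchmark(MajorityBaseline(), X, y, X)
```

Any object exposing `fit` and `predict` could be passed to the same function, so that every candidate model is timed under identical conditions.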
Many features will be extracted to train a useful model for label predictions, including:
- Transaction stats: total number of transactions, incoming and outgoing transactions.
- Amount stats: calculated both in BTC and US Dollars, taking into account the market value of BTC at the time of the transaction. The max, min, average and median of the total amount sent and received by each address, as well as a range and ratio, will be computed.
- Time stats: statistics on the temporal distance between consecutive transactions made by the same address.
- Advanced stats: complex indicators based on signal theory.
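The transaction, amount, and time statistics listed above could be computed per address roughly as follows. This is only a sketch under stated assumptions: the `Tx` record layout, the feature names, and the reading of "ratio" as maximum divided by minimum are all hypothetical choices made for the example, and the signal-theory indicators are left out.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass
class Tx:
    timestamp: int     # Unix time in seconds
    amount_btc: float  # always positive
    btc_usd: float     # BTC market price at the time of the transaction
    incoming: bool     # True if the address received the funds


def summarize(values, prefix):
    """Max, min, mean, median, range and max/min ratio (ratio definition is an assumption)."""
    if not values:
        keys = ("max", "min", "mean", "median", "range", "ratio")
        return {f"{prefix}_{k}": 0.0 for k in keys}
    hi, lo = max(values), min(values)
    return {
        f"{prefix}_max": hi,
        f"{prefix}_min": lo,
        f"{prefix}_mean": mean(values),
        f"{prefix}_median": median(values),
        f"{prefix}_range": hi - lo,
        f"{prefix}_ratio": hi / lo if lo > 0 else 0.0,
    }


def address_features(txs):
    """Build one flat feature dictionary for a single address."""
    txs = sorted(txs, key=lambda t: t.timestamp)
    received = [t for t in txs if t.incoming]
    sent = [t for t in txs if not t.incoming]
    feats = {"n_tx": len(txs), "n_in": len(received), "n_out": len(sent)}
    for name, group in (("recv", received), ("sent", sent)):
        feats.update(summarize([t.amount_btc for t in group], f"{name}_btc"))
        feats.update(summarize([t.amount_btc * t.btc_usd for t in group], f"{name}_usd"))
    # Temporal distance between consecutive transactions of the same address.
    gaps = [b.timestamp - a.timestamp for a, b in zip(txs, txs[1:])]
    feats.update(summarize(gaps, "gap_s"))
    return feats


# Made-up example transactions for one address.
txs = [
    Tx(0, 1.0, 100.0, True),
    Tx(60, 0.5, 200.0, False),
    Tx(180, 2.0, 100.0, True),
]
features = address_features(txs)
```

The USD amounts use the price attached to each transaction, which mirrors the requirement to value BTC at the time of the transaction rather than at a single reference date.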
Several machine learning models will be trained, e.g. logistic regression, XGBoost, and deep learning models. These models can be tuned in many ways, such as feature selection (for example, removing BTC features to avoid the influence of market price fluctuations) or averaging predictions in a more elaborate manner to balance model performance.
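The two tuning ideas mentioned here can be sketched in a few lines of plain Python. The `_btc` naming convention, the function names, and the weights are hypothetical, and a weighted mean is only one possible way of averaging predictions "in a more complex manner"; the probabilities below are made-up numbers, not results.

```python
def drop_btc_features(features):
    """Keep only features whose name does not carry the (hypothetical) '_btc' tag."""
    return {name: value for name, value in features.items() if "_btc" not in name}


def weighted_average(probabilities, weights):
    """Blend per-model illicit-class probabilities using per-model weights."""
    total = sum(weights[m] for m in probabilities)
    n_samples = len(next(iter(probabilities.values())))
    return [
        sum(weights[m] * probabilities[m][i] for m in probabilities) / total
        for i in range(n_samples)
    ]


# Feature selection: remove the BTC-denominated column.
features = {"sent_btc_mean": 0.8, "sent_usd_mean": 52000.0, "n_tx": 14}
selected = drop_btc_features(features)

# Prediction averaging: two models scoring the same two addresses.
probabilities = {"logreg": [0.2, 0.8], "xgboost": [0.4, 0.6]}
weights = {"logreg": 1.0, "xgboost": 3.0}
blended = weighted_average(probabilities, weights)
```

In practice the weights would be chosen on a validation set, for instance in proportion to each model's validation score.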
To ensure model integration and scalability across our software ecosystem, we require that the final neural network model be delivered in a standardized format (e.g., .h5, .pb, .pt, or .pth), selected based on compatibility with our deployment platforms. Detailed documentation of the model will also be necessary, illustrating the training techniques used and the network architectures chosen, to facilitate the interpretation of the results and any iteration on the model.
Required skills Passion for thesis work. Basic knowledge of neural networks, deep learning, and finance can be useful; otherwise, we will provide all the material.
Notes The thesis is done in collaboration with a company
Deadline 21/02/2025