In a world where nearly everyone is connected to the internet and sensors are everywhere, my main goals are to work as a data scientist and to master machine learning in order to make accurate predictions about the future.
The fields where I am most interested in applying machine learning and data science techniques to large datasets are transportation, health, social networks and cryptocurrencies. While finishing my Master's in Data Science, I currently work as an intern in the MyDrive team at TomTom Amsterdam and independently as a Professional Fitness Coach, helping people improve their body composition, health, fitness performance and long-term habits.
Other interests of mine are GNU/Linux administration, physics, athletic performance optimization, human nutrition and traveling. If you want to see the rest of my interests and skills, please check out my profile on LinkedIn.
These are some of my most recent data science projects. To access the full content, click on the images.
Google Trends Modeling: Predicting depression
In this team project, my colleague M. Bragagnolo and I built a model that predicts the percentage of people who have had depression in the USA at the state level, achieving a coefficient of determination of 0.74. As the dataset, we used Google Trends public search interest for a set of depression-related keywords, with official surveys as ground truth.
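For readers unfamiliar with the metric, the coefficient of determination (R²) compares the model's squared errors with those of a baseline that always predicts the mean, so a value of 0.74 means the model accounts for roughly 74% of the variance in the survey figures. Below is a minimal sketch of that computation. It is not the project's code, the function name `r_squared` is my own choice for illustration, and the numbers are made up.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of a baseline that always predicts the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only; they are not data from the project.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(observed, predicted)
print(round(score, 2))
```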
Data Science challenge: Decision making
This is my personal solution to a Data Science Challenge proposed by a well-known international transportation network company. Given a dataset, the challenge is to determine whether adopting a certain new technology would be profitable and how much the company should be willing to pay for it. I also performed data exploration and modeling using a random forest.
Brain Oscillations and Network Activity
In this team project, my colleague O. Lazareva and I performed a spectral and network analysis to study the topology of the human brain from EEG signals recorded from a subject under two conditions: eyes open and eyes closed. We used multivariate autoregressive models and the Louvain method for community detection.
Bayesian Inference: Predicting bodyweight changes
There is a clear dependence between changes in body weight and energy balance. In this project, I model this dependence using Bayesian regression on 607 days of data taken from one of my patients. I also provide a stability analysis of the regression and a prediction made with the model.
Flights Scraper: a dynamic web scraper to get the best flight deals
Suppose you want to fly somewhere any time in May and come back any time in July. There are about 900 possible day combinations for your trip, too many to track manually. For that reason, I built this tool, which scrapes the dynamic website of the popular travel site Kayak and finds all the flights that match your dates, along with all relevant information.
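To give a sense of where that number comes from, here is a minimal sketch that enumerates every departure and return pair for the May and July example. It is not the scraper itself and makes no requests to Kayak. The year and the helper name `days_in_range` are assumptions chosen only for illustration.

```python
from datetime import date, timedelta
from itertools import product


def days_in_range(start, end):
    """Return every date from start to end, inclusive."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


# Hypothetical year, chosen only for illustration.
departures = days_in_range(date(2018, 5, 1), date(2018, 5, 31))
returns = days_in_range(date(2018, 7, 1), date(2018, 7, 31))

# Each (departure, return) pair is one search that would have to be checked.
combinations = list(product(departures, returns))
print(len(combinations))  # 31 * 31 = 961
```

With 31 possible days on each side, the exact count is 961, which is in line with the rough figure above.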
House Prices: Advanced Regression Techniques
This is my personal solution to the Kaggle challenge of predicting house prices with Extreme Gradient Boosting, based on a dataset of the characteristics and prices of different houses.
Epidemic spreading in Social Networks
In this project, I simulate and study the spread of a disease on a Barabási-Albert graph. I also propose different strategies for immunizing nodes under different bounds in order to stop the disease.
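As a rough illustration of the idea, the sketch below grows a graph by preferential attachment and runs a simple discrete-time epidemic on it, with and without immunizing the best-connected nodes. Everything specific here is an assumption made for the example rather than the project's actual setup: the SIR model, the parameter values, the hub-immunization strategy, and the function names `barabasi_albert` and `simulate_sir`.

```python
import random


def barabasi_albert(n, m, rng):
    """Grow a graph by preferential attachment: each new node links to m
    existing nodes chosen with probability proportional to their degree."""
    adj = {i: set() for i in range(n)}
    pool = []  # every node appears once per incident edge
    # Seed: a complete graph on the first m + 1 nodes.
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            pool += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(pool))
        for target in sorted(chosen):
            adj[new].add(target)
            adj[target].add(new)
            pool += [new, target]
    return adj


def simulate_sir(adj, beta, gamma, immune, rng, steps=50):
    """Discrete-time SIR. Immune nodes can never be infected.
    Returns how many nodes were infected at any point."""
    susceptible = set(adj) - set(immune)
    patient_zero = rng.choice(sorted(susceptible))
    susceptible.discard(patient_zero)
    infected = {patient_zero}
    ever_infected = 1
    for _ in range(steps):
        if not infected:
            break
        newly = set()
        for node in sorted(infected):
            for neighbor in sorted(adj[node]):
                # Each infected neighbor gets an independent chance to transmit.
                if neighbor in susceptible and rng.random() < beta:
                    newly.add(neighbor)
        recovered = {node for node in sorted(infected) if rng.random() < gamma}
        susceptible -= newly
        infected = (infected - recovered) | newly
        ever_infected += len(newly)
    return ever_infected


rng = random.Random(42)
graph = barabasi_albert(200, 2, rng)
# Assumed strategy: immunize the 10 highest-degree nodes (the hubs).
hubs = set(sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:10])
for label, immune in [("no immunization", set()), ("hubs immunized", hubs)]:
    sizes = [simulate_sir(graph, 0.2, 0.3, immune, rng) for _ in range(100)]
    print(label, sum(sizes) / len(sizes))
```

Comparing the average outbreak size across the two settings is one simple way to judge an immunization strategy under a fixed budget of immunized nodes.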
Implementation of recommender systems for books on big datasets
This software gives book recommendations based on a dataset of 1,200,000 ratings from 279,000 users on 271,000 books. The algorithms used are item-based and user-based collaborative filtering, and association rules.
Book clustering by language using Jaccard similarity
Using the Jaccard similarity on the words of the books, this software finds the books that are written in the same language and clusters them together.
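The Jaccard similarity of two sets is the size of their intersection divided by the size of their union, so two texts in the same language share many common words and score high, while texts in different languages share almost none. Here is a minimal sketch of the idea, not the project's code. The greedy clustering rule, the 0.2 threshold, the helper names and the toy sentences are all assumptions made for illustration.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def word_set(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def cluster_by_language(texts, threshold=0.2):
    """Greedy clustering: a text joins the first cluster whose first
    member is similar enough, otherwise it starts a new cluster."""
    clusters = []
    for name, text in texts.items():
        words = word_set(text)
        for c in clusters:
            if jaccard(words, c["words"]) >= threshold:
                c["names"].append(name)
                break
        else:
            clusters.append({"words": words, "names": [name]})
    return [c["names"] for c in clusters]


# Toy sentences standing in for whole books.
texts = {
    "en_1": "the cat and the dog are in the house",
    "en_2": "the dog and the bird are in the garden",
    "es_1": "el gato y el perro duermen en la casa",
    "es_2": "el perro y el gato comen en el parque",
}
clusters = cluster_by_language(texts)
print(clusters)
```

On real books the word sets are much larger, so the threshold would need tuning, and comparing against a single representative per cluster is only one of several reasonable choices.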