What do I do?

In a world where everyone is connected to the internet and sensors are everywhere, my main goals are to work as a data scientist and to master machine learning in order to make accurate predictions about the future.

The fields where I am most interested in applying machine learning and data science techniques to large datasets are health, social networks and cryptocurrencies. While finishing my Master's in Data Science, I work independently as a Professional Fitness Coach, helping people improve their body composition, health, fitness performance and long-term habits.

Other interests of mine are GNU/Linux administration, physics, athletic performance optimization, human nutrition and traveling. If you want to see the rest of my interests and skills, please check out my profile on LinkedIn.

Recent Projects

Data Science challenge: Decision making

This is my personal solution to a Data Science Challenge proposed by a well-known international transportation network company. Given a dataset, the challenge is to find out whether adopting a certain new technology will be profitable and how much the company would be willing to pay for it.

Flights Scraper: a dynamic web scraper to get the best flight deals

Suppose you want to fly somewhere anytime in May and come back anytime in July. There are about 900 possible day combinations for your trip, too many to track manually. That is why I wrote this software, which scrapes the dynamic website of the popular travel agency Kayak and finds all the flights that match your dates, along with all relevant information.
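To make the size of the search space concrete, here is a minimal sketch that enumerates every departure/return pair for the May and July example above. It only illustrates the counting step and is not the scraper itself; the function names and the year 2024 are assumptions chosen for illustration.

```python
from datetime import date, timedelta
from itertools import product


def days_in_month(year, month):
    """Return every calendar date in the given month."""
    first = date(year, month, 1)
    # First day of the following month (rolls over to January after December).
    following = date(year + (month == 12), month % 12 + 1, 1)
    return [first + timedelta(days=i) for i in range((following - first).days)]


def trip_combinations(year, depart_month, return_month):
    """Pair every departure day with every return day (hypothetical helper)."""
    return list(product(days_in_month(year, depart_month),
                        days_in_month(year, return_month)))


# The year is an assumption; May and July come from the example above.
pairs = trip_combinations(2024, 5, 7)
print(len(pairs))  # 31 departure days x 31 return days
```

With 31 days in each month this gives 961 pairs, in line with the rough figure of 900 mentioned above, and each pair would be one search to run against the travel site.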

Predicting bodyweight changes with Bayesian Inference

There is a clear dependence between changes in body weight and energy balance. In this project, I model this dependence using Bayesian Regression on 607 days of data taken from one of my patients, who provided written consent for this publication. I also provide a stability analysis of the regression and a prediction using the model. The article includes the R code used.

House Prices: Advanced Regression Techniques

This is my personal solution to the Kaggle challenge of predicting house prices with Extreme Gradient Boosting, based on a dataset of the characteristics and prices of different houses.

Implementation of recommender systems for books on big datasets

This software gives book recommendations based on a dataset of 1,200,000 ratings from 279,000 users on 271,000 books. The algorithms used are collaborative filtering (item-based and user-based) and association rules.
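As an illustration of the item-based variant, here is a minimal sketch of collaborative filtering on a toy ratings dictionary. It is not the project's implementation: the cosine similarity, the similarity-weighted scoring rule, the function names and the sample users and books are all assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(ratings):
    """Cosine similarity between items, from {user: {item: rating}}."""
    by_item = defaultdict(dict)
    for user, items in ratings.items():
        for item, rating in items.items():
            by_item[item][user] = rating

    sims = {}
    for a in by_item:
        for b in by_item:
            if a == b:
                continue
            common = by_item[a].keys() & by_item[b].keys()
            if not common:
                continue  # no user rated both items
            dot = sum(by_item[a][u] * by_item[b][u] for u in common)
            norm_a = sqrt(sum(r * r for r in by_item[a].values()))
            norm_b = sqrt(sum(r * r for r in by_item[b].values()))
            sims[(a, b)] = dot / (norm_a * norm_b)
    return sims


def recommend(ratings, user, k=3):
    """Rank unseen items by similarity to the user's rated items."""
    seen = ratings[user]
    scores = defaultdict(float)
    for (a, b), sim in item_similarities(ratings).items():
        if a in seen and b not in seen:
            scores[b] += sim * seen[a]  # weight by the user's own rating
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data, invented for illustration only.
ratings = {
    "ann": {"dune": 5, "emma": 1, "neuromancer": 5},
    "bob": {"dune": 4, "neuromancer": 5, "foundation": 5},
    "cat": {"emma": 5, "persuasion": 5},
    "dan": {"dune": 5, "emma": 1},
}
print(recommend(ratings, "dan", k=2))
```

Comparing every pair of items as done here is quadratic in the number of books, so at the scale of the real dataset a sparse-matrix or neighbourhood-limited approach would likely be needed.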

Book clustering by language using Jaccard similarity

Using the Jaccard similarity on the words of the books, this software lets you find the books that are written in the same language and cluster them together.
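The idea can be sketched in a few lines: the Jaccard similarity of two word sets is the size of their intersection divided by the size of their union, and texts in the same language tend to share many common words. The sketch below is only an illustration under assumptions: the greedy clustering rule, the 0.2 threshold, the function names and the sample sentences are invented for the example and are not taken from the project.

```python
import re


def word_set(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[^\W\d_]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two sets (0.0 if both empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_similarity(texts, threshold=0.2):
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters = []  # list of (representative word set, member names)
    for name, text in texts.items():
        words = word_set(text)
        for representative, members in clusters:
            if jaccard(words, representative) >= threshold:
                members.append(name)
                break
        else:
            clusters.append((words, [name]))
    return [members for _, members in clusters]


# Tiny stand-ins for full books, invented for illustration.
texts = {
    "en_1": "the cat sat on the mat and the dog ran",
    "en_2": "the dog and the cat are on the road",
    "es_1": "el gato y el perro están en la casa",
    "es_2": "el perro corre en la calle y el gato duerme",
}
print(cluster_by_similarity(texts))
```

On full-length books the shared vocabulary is dominated by frequent function words, which is what makes word-level Jaccard a reasonable language signal; the threshold would need tuning on real data.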