Here are 7 Data Science Projects on GitHub to Showcase your Machine Learning Skills!

These are critical questions a data scientist needs to answer.

And the HungaBunga project will help you reach that answer faster than most data science libraries.

It runs through all the sklearn models (yes, all!) with all the possible hyperparameters and ranks them using cross-validation.
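The brute-force idea — fit many candidate models and rank them by cross-validation score — can be sketched in plain Python. The threshold "models" and toy data below are illustrative stand-ins, not HungaBunga's internals (it iterates over real sklearn estimators and their hyperparameter grids), but the ranking logic has the same shape:

```python
import random

# Toy dataset: x in [0, 1), label 1 when x > 0.5.
data = [(x / 100.0, 1 if x / 100.0 > 0.5 else 0) for x in range(100)]

# Hypothetical stand-ins for a model zoo: each "model" is just a decision
# threshold. HungaBunga does this with real sklearn estimators and their
# hyperparameter grids instead.
candidate_models = {f"threshold_{t / 10:.1f}": t / 10 for t in range(1, 10)}

def cross_val_score(threshold, samples, k=5):
    """Mean accuracy of a threshold classifier over k folds."""
    samples = list(samples)
    random.Random(0).shuffle(samples)
    fold_size = len(samples) // k
    scores = []
    for i in range(k):
        fold = samples[i * fold_size:(i + 1) * fold_size]
        correct = sum(1 for x, y in fold if (x > threshold) == bool(y))
        scores.append(correct / len(fold))
    return sum(scores) / k

# Rank every candidate by cross-validated accuracy, best first.
ranking = sorted(candidate_models.items(),
                 key=lambda kv: cross_val_score(kv[1], data),
                 reverse=True)
best_name, best_threshold = ranking[0]
```

With this toy data the threshold at 0.5 reproduces the labels exactly, so it comes out on top of the ranking.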

Here’s how to import all the models (both classification and regression):

from hunga_bunga import HungaBungaClassifier, HungaBungaRegressor

You should check out the below comprehensive article on supervised machine learning algorithms: Commonly used Machine Learning Algorithms (with Python and R Codes)

Deep Learning Projects

Behavior Suite for Reinforcement Learning (bsuite) by DeepMind

DeepMind has been in the news recently for the huge losses it has posted year-on-year.

But let’s face it, the company is still clearly ahead in terms of its research in reinforcement learning.

They have bet big on this field as the future of artificial intelligence.

So here comes their latest open source release – the bsuite.

This project is a collection of experiments that aims to understand the core capabilities of a reinforcement learning agent.

I like this area of research because it is essentially trying to fulfill two objectives (as per their GitHub repository):

- Collect informative and scalable problems that capture key issues in the design of efficient and general learning algorithms
- Study the behavior of agents via their performance on these shared benchmarks

The GitHub repository contains a detailed explanation of how to use bsuite in your projects.
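The suite idea — run one agent across a set of small, targeted experiments and record a summary score per experiment — can be sketched in plain Python. Everything below (the bandit environment, the agent, the experiment names) is an illustrative toy, not bsuite's actual API:

```python
import random

# A hypothetical stand-in for one experiment: a tiny two-armed bandit.
class Bandit:
    def __init__(self, best_arm, seed):
        self.best_arm = best_arm
        self.rng = random.Random(seed)

    def step(self, action):
        # Reward 1 with high probability on the best arm, low otherwise.
        p = 0.9 if action == self.best_arm else 0.1
        return 1 if self.rng.random() < p else 0

def greedy_agent_score(env, episodes=200):
    """Epsilon-greedy agent; returns mean reward as the benchmark score."""
    counts, totals = [0, 0], [0.0, 0.0]
    rng, rewards = random.Random(0), []
    for _ in range(episodes):
        if rng.random() < 0.1 or 0 in counts:
            action = rng.randrange(2)  # explore (or arm never tried yet)
        else:
            action = 0 if totals[0] / counts[0] >= totals[1] / counts[1] else 1
        r = env.step(action)
        counts[action] += 1
        totals[action] += r
        rewards.append(r)
    return sum(rewards) / len(rewards)

# The "suite" is several experiment instances; the report is one score each.
suite = {f"bandit/{i}": Bandit(best_arm=i % 2, seed=i) for i in range(3)}
report = {name: greedy_agent_score(env) for name, env in suite.items()}
```

bsuite's real experiments probe much richer capabilities (memory, exploration, credit assignment), but the run-everything-and-report-scores loop is the same pattern.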

You can install it using the below command:

pip install git+git://github.com/deepmind/bsuite.git

If you’re new to reinforcement learning, here are a couple of articles to get you started:

Simple Beginner’s Guide to Reinforcement Learning & its Implementation
A Hands-On Introduction to Deep Q-Learning using OpenAI Gym in Python

DistilBERT – A Lighter and Cheaper Version of Google’s BERT

You must have heard of BERT at this point.

It is one of the most popular Natural Language Processing (NLP) frameworks around and is quickly becoming widely adopted.

BERT is based on the Transformer architecture.

But it comes with one caveat – it can be quite resource-intensive.

So how can data scientists work with BERT on their own machines? Step up – DistilBERT!

DistilBERT, short for Distilled BERT, comes from the team behind the popular PyTorch-Transformers framework.

It is a small and cheap Transformer model built on the BERT architecture.

According to the team, DistilBERT runs 60% faster while preserving over 95% of BERT’s performance.

This GitHub repository explains how DistilBERT works along with the Python code.
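DistilBERT is trained with knowledge distillation: a small student model learns to match the softened output distribution of a large teacher. Here is a generic sketch of that core loss term in plain Python — the toy logits and temperature value are illustrative, not code or numbers from the repository:

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature; higher T gives a softer distribution."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between the teacher's softened distribution and the
    student's — the heart of the knowledge-distillation objective."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

# Toy logits: a student that tracks the teacher incurs a lower loss
# than one that disagrees with it.
teacher = [4.0, 1.0, 0.5]
good_student = [3.8, 1.1, 0.4]
bad_student = [0.5, 1.0, 4.0]
```

The temperature softens both distributions so the student also learns from the teacher's "dark knowledge" — the relative probabilities it assigns to wrong classes.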

You can learn more about PyTorch-Transformers and how to use it in Python here: Introduction to PyTorch-Transformers: An Incredible Library for State-of-the-Art NLP (with Python code)

ShuffleNet Series – An Extremely Efficient Convolutional Neural Network for Mobile Devices

A computer vision project for you! ShuffleNet is an extremely computation-efficient convolutional neural network (CNN) architecture.

It has been designed for mobile devices with very limited computing power.
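The trick that keeps ShuffleNet cheap is pairing grouped convolutions with a channel shuffle that mixes information between the groups. On a plain Python list of channel indices, the shuffle is just a reshape-transpose-flatten — a sketch of the idea, not the repository's tensor code:

```python
def channel_shuffle(channels, groups):
    """ShuffleNet-style channel shuffle: view the channels as a
    (groups, channels_per_group) grid, transpose it, and flatten, so the
    next grouped convolution sees channels drawn from every group."""
    n = len(channels)
    assert n % groups == 0, "channel count must divide evenly into groups"
    per_group = n // groups
    # Channels laid out group-by-group: [g0c0, g0c1, ..., g1c0, g1c1, ...]
    grid = [channels[g * per_group:(g + 1) * per_group] for g in range(groups)]
    # Transpose: take one channel from each group in turn.
    return [grid[g][c] for c in range(per_group) for g in range(groups)]
```

In the real network this is a reshape and transpose on the channel dimension of a feature-map tensor; the list version above shows why, after the shuffle, no group's output depends only on its own group's input.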

This GitHub repository includes the below ShuffleNet models (yes, there are multiple):

- ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
- ShuffleNetV2: Practical Guidelines for Efficient CNN Architecture Design
- ShuffleNetV2+: A strengthened version of ShuffleNetV2
- ShuffleNetV2.Large: A deeper version based on ShuffleNetV2
- OneShot: Single Path One-Shot Neural Architecture Search with Uniform Sampling
- DetNAS: Backbone Search for Object Detection

So are you looking to understand CNNs? You know I have you covered: A Comprehensive Tutorial to learn Convolutional Neural Networks from Scratch

RAdam – Improving the Variance of Learning Rates

RAdam was released less than two weeks ago and it has already accumulated 1,200+ stars.

That tells you a lot about how well this repository is doing! The developers behind RAdam show in their paper that the convergence issue we face in deep learning techniques is due to the undesirably large variance of the adaptive learning rate in the early stages of model training.

RAdam is a new variant of Adam that rectifies the variance of the adaptive learning rate.

This release brings a solid improvement over the vanilla Adam optimizer, which does suffer from this variance issue.
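As I read the paper, the fix is a rectification term r_t that stays switched off early in training, while the variance of the adaptive learning rate is intractable, and approaches 1 as training progresses. A simplified sketch — the formula is transcribed from my reading of the paper, so treat the exact constants as an assumption:

```python
import math

def radam_rectification(t, beta2=0.999):
    """Rectification term r_t from the RAdam paper (sketch).

    For small step counts t the approximated degrees of freedom rho_t are
    too low, the variance of the adaptive term is intractable, and we
    return None (fall back to an un-adapted, SGD-with-momentum-style step).
    As t grows, rho_t -> rho_inf and r_t -> 1, recovering plain Adam.
    """
    rho_inf = 2.0 / (1.0 - beta2) - 1.0
    rho_t = rho_inf - 2.0 * t * beta2 ** t / (1.0 - beta2 ** t)
    if rho_t <= 4.0:
        return None  # variance not tractable yet: skip the adaptive term
    num = (rho_t - 4.0) * (rho_t - 2.0) * rho_inf
    den = (rho_inf - 4.0) * (rho_inf - 2.0) * rho_t
    return math.sqrt(num / den)
```

Multiplying the Adam step by r_t damps the adaptive learning rate exactly when its variance would otherwise blow up, which is why RAdam behaves like a built-in warmup.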

Here is the performance of RAdam compared to Adam and SGD with different learning rates (the X-axis is the number of epochs):

You should definitely check out the below guide on optimization in machine learning (including Adam): Introduction to Gradient Descent Algorithm (along with variants) in Machine Learning

Programming Projects

ggtext – Improved Text Rendering for ggplot2

This one is for all the R users in our community.

And especially all of you who work regularly with the awesome ggplot2 package (which is basically everyone).

The ggtext package enables us to produce rich-text rendering for the plots we generate.

Here are a few things you can try out using ggtext:

- A new theme element called element_markdown() renders text as markdown or HTML
- You can include images on the axis (as shown in the above picture)
- Use geom_richtext() to produce markdown/HTML labels (as shown below)

The GitHub repository contains a few intuitive examples which you can replicate on your own machine.

ggtext is not yet available through CRAN, so you can download and install it from GitHub using this command:

devtools::install_github("clauswilke/ggtext")

Want to learn more about ggplot2 and how to work with interactive plots in R? Here you go:

10 Questions R Users always ask while using ggplot2 package
How I Built Animated Plots in R to Analyze my Fitness Data (and you can too!)

End Notes

I love working on these monthly articles.

The amount of research, and hence the number of breakthroughs, happening in data science is extraordinary.

No matter which era or standard you compare it with, the rapid advancement is staggering.

Which data science project did you find the most interesting? Will you be trying anything out soon? Let me know in the comments section below and we’ll discuss ideas!
