Top 7 Machine Learning GitHub Repositories for Data Scientists

Well, this matrix profile is a vector that stores the z-normalized Euclidean distance between any subsequence within a time series and its nearest neighbor.
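To make that definition concrete, here is a minimal brute-force sketch in plain Python. It is not how STUMPY computes the matrix profile (the library is far more efficient); it only spells out the definition above. The function names, the toy series and the width of the exclusion zone (which skips trivial matches right next to each subsequence) are my own assumptions for illustration.

```python
import math


def z_normalize(window):
    """Rescale a window to zero mean and unit standard deviation."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        # Constant window: map to all zeros (an assumption of this sketch).
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


def naive_matrix_profile(series, m):
    """For each length-m subsequence, return the z-normalized Euclidean
    distance to its nearest neighbor elsewhere in the series."""
    n_sub = len(series) - m + 1
    subs = [z_normalize(series[i:i + m]) for i in range(n_sub)]
    exclusion = max(1, m // 2)  # assumed width; ignores trivial self-matches
    profile = []
    for i in range(n_sub):
        best = math.inf
        for j in range(n_sub):
            if abs(i - j) < exclusion:
                continue
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            best = min(best, dist)
        profile.append(best)
    return profile


# Toy series: a repeating pattern with a single spike injected at index 13.
series = [0, 1, 2, 1] * 6
series[13] = 9

profile = naive_matrix_profile(series, m=4)

# The largest profile value marks the subsequence with no close match.
anomaly_start = max(range(len(profile)), key=profile.__getitem__)
print(anomaly_start, profile[anomaly_start])
```

Low values in the profile point to repeated patterns (motifs), while high values point to subsequences that look like nothing else in the series (anomalies).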

Below are a few time series data mining tasks this matrix profile helps us perform:

Anomaly discovery
Semantic segmentation
Density estimation
Time series chains (temporally ordered set of subsequence patterns)
Pattern/motif discovery (approximately repeated subsequences within a longer time series)

Use the below code to install it directly via pip:

pip install stumpy

MeshCNN in PyTorch

MeshCNN is a general-purpose deep neural network for 3D triangular meshes.

These meshes can be used for tasks such as 3D-shape classification or segmentation.

A superb application of computer vision.

The MeshCNN framework includes convolution, pooling and unpooling layers which are applied directly on the mesh edges.

Convolutional Neural Networks (CNNs) are perfect for working with image and visual data.

CNNs have become all the rage in recent times, with a boom of image-related tasks springing up from them.

Object detection, image segmentation, image classification and more: these are all possible thanks to the advancements in CNNs.

3D deep learning is attracting interest in the industry, including fields like robotics and autonomous driving.

The problem with 3D shapes is that they are inherently irregular.

This makes operations like convolutions difficult and challenging.

This is where MeshCNN comes into play.

From the repository: Meshes are a list of vertices, edges and faces, which together define the shape of the 3D object.

The problem is that every vertex has a different # of neighbors, and there is no order.
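A tiny example makes this irregularity easy to see. The sketch below builds a hypothetical mesh (a closed square pyramid, my own choice, not taken from the repository) and counts neighbors. It also illustrates my understanding of why working on edges is convenient: in a closed triangular mesh every edge borders exactly two triangles, so it always has exactly four neighboring edges, even though vertices do not have a fixed number of neighbors.

```python
from collections import defaultdict

# A closed square pyramid: vertex 0 is the apex, vertices 1-4 form the base.
# Only connectivity matters here, so vertex coordinates are omitted.
faces = [
    (0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1),  # four sides
    (1, 3, 2), (1, 4, 3),                        # base split into two triangles
]


def face_edges(face):
    """Return the three edges of a triangle as sorted vertex pairs."""
    a, b, c = face
    return [tuple(sorted(pair)) for pair in ((a, b), (b, c), (c, a))]


vertex_neighbors = defaultdict(set)
edge_faces = defaultdict(list)
for face in faces:
    for u, v in face_edges(face):
        vertex_neighbors[u].add(v)
        vertex_neighbors[v].add(u)
        edge_faces[(u, v)].append(face)

# Number of neighbors per vertex: this varies from vertex to vertex.
vertex_degrees = {v: len(nbrs) for v, nbrs in sorted(vertex_neighbors.items())}

# Neighboring edges of an edge: the other two edges of each incident triangle.
edge_neighbors = {
    edge: [e for face in incident for e in face_edges(face) if e != edge]
    for edge, incident in edge_faces.items()
}

print(vertex_degrees)
print({edge: len(nbrs) for edge, nbrs in edge_neighbors.items()})
```

Here the vertices have either 3 or 4 neighbors, while each of the 9 edges has exactly 4 neighboring edges.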

If you’re a fan of computer vision and are keen to learn or apply CNNs, this is the perfect repository for you.

You can learn more about CNNs through our articles:

A Comprehensive Tutorial to learn Convolutional Neural Networks from Scratch
Architecture of Convolutional Neural Networks (CNNs) Demystified

Awesome Decision Tree Research Papers

Decision Tree algorithms are among the first advanced techniques we learn in machine learning.

Honestly, I truly appreciated this technique when I came to it after logistic regression.

I could use it on bigger datasets, understand how it worked, how the splits happened, etc.
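If you are curious how a split actually happens, here is a minimal sketch for a single numeric feature. It uses Gini impurity, which is one common splitting criterion and not necessarily the one used by any particular paper in the repository. The function names and the toy data are made up for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Try a threshold between each pair of adjacent distinct values and
    return (threshold, weighted_impurity) for the purest split found."""
    pairs = sorted(zip(values, labels))
    best = (None, gini(labels))  # no split at all is the baseline
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: customer age and whether they bought a product.
ages = [22, 25, 30, 35, 40, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_split(ages, bought)
print(threshold, impurity)
```

A full tree simply repeats this search on each side of the chosen threshold until the groups are pure enough or a depth limit is reached.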

I personally love this repository.

It is a treasure trove for data scientists.

The repository contains a collection of papers on tree-based algorithms, including decision, regression and classification trees.

The repository also contains the implementation of each paper.

What more could we ask for?

TensorWatch by Microsoft Research

Have you ever wondered how your machine learning algorithm’s training process works?

We write the code, some complication happens behind the scenes (the joy of programming!), and we get the results.

Microsoft Research have come up with a tool called TensorWatch that enables us to see real-time visualizations of our machine learning model’s training process.

Incredible!

TensorWatch, in simple terms, is a debugging and visualization tool for deep learning and reinforcement learning.

It works in Jupyter notebooks and enables us to perform many other customized visualizations of our data and our models.

Reddit Discussions

Let’s spend a few moments checking out the most awesome Reddit discussions related to data science and machine learning from May 2019.

There’s something here for everyone, whether you’re a data science enthusiast or practitioner.

So let’s dig in!

Which Skills should a PhD student have if he/she wants to work in the industry?

This is a tough nut to crack.

The first question is whether you should actually opt for a Ph.D. ahead of an industry role.

And if you did opt for one, what skills should you pick up to make your industry transition easier?

I believe this discussion could be helpful in decoding one of the biggest enigmas in our career: how do we make a transition from one field or line of work to another?

Don’t just look at this from the point of view of a Ph.D. student.

This is very relevant for most of us wanting to get that first break in machine learning.

I strongly encourage you to go through this thread as so many experienced data scientists have shared their personal experiences and learning.

Neural nets typically contain smaller “sub networks” that can often learn faster – MIT

Recently, a research paper was released expanding on the headline of this thread.

The paper explains the Lottery Ticket Hypothesis, according to which a smaller sub-network, also known as a winning ticket, can be trained faster than the larger network that contains it.

This discussion focuses on this paper.
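As I understand the procedure described in the paper, a winning ticket is found by training the full network, pruning the weights with the smallest magnitudes, and resetting the surviving weights to their original initial values before training again. Here is a minimal one-shot sketch of that pruning step in plain Python. The function names and the weight values are hypothetical, and real experiments work on whole networks and usually repeat the train-prune-reset cycle several times.

```python
def magnitude_mask(trained, keep_fraction):
    """Return a 0/1 mask that keeps the largest-magnitude trained weights."""
    n_keep = max(1, round(len(trained) * keep_fraction))
    ranked = sorted(range(len(trained)), key=lambda i: abs(trained[i]), reverse=True)
    kept = set(ranked[:n_keep])
    return [1 if i in kept else 0 for i in range(len(trained))]


def winning_ticket(initial, trained, keep_fraction):
    """Reset surviving weights to their initial values and zero out the rest."""
    mask = magnitude_mask(trained, keep_fraction)
    ticket = [w0 if m else 0.0 for w0, m in zip(initial, mask)]
    return ticket, mask


# Hypothetical weights of one tiny layer before and after training.
initial = [0.10, -0.20, 0.05, 0.30, -0.15, 0.02]
trained = [0.90, -0.05, 0.01, -1.20, 0.40, 0.03]

ticket, mask = winning_ticket(initial, trained, keep_fraction=0.5)
print(mask)
print(ticket)
```

Note that the mask is chosen from the trained weights, but the values kept in the ticket come from the initialization. That sub-network is what gets retrained.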

To read more about the Lottery Ticket Hypothesis and how it works, you can refer to my article where I break down this concept so that even beginners can understand it:

Decoding the Best Papers from ICLR 2019 – Neural Networks are Here to Rule

Does anybody else feel overwhelmed looking at how much there is to learn?

I picked this discussion because I can totally relate to it.

I used to think – I’ve learned so much, and yet there is so much more left.

Will I ever become an expert?

I made the mistake of looking just at the quantity and not the quality of what I was learning.

With the continuous and rapid advancement of technology, there will always be a LOT to learn.

This thread has some solid advice on how you can set priorities, stick to them, and focus on the task at hand rather than trying to become a jack of all trades.

End Notes

I had a lot of fun (and learned a lot) putting together this month’s machine learning GitHub collection!

I highly recommend bookmarking both these platforms (GitHub and Reddit) and regularly checking them.

It’s a great way to stay up to date with all that’s new in machine learning.

Or, you can always come back each month and check out our top picks.

If you think I’ve missed any repository or any discussion, comment below and I’ll be happy to discuss it!
