In this article I’ll be testing almost all the tools they have so you don’t have to.
The project is public on the platform; you can see it here: MatrixDS | The Data Project Workbench (community.platform.matrixds.com). MatrixDS is a place to build, share and manage data projects at any scale.
If you want to test it out, all you have to do is forklift it and that’s it.
Testing Python things

Jupyter Notebook

My favorite programming language of the moment is Python.
There are lots of great tools and features that can help you when using this language.
One of the most popular ones is Jupyter Notebook.
To launch a notebook in MatrixDS do this:

1. Go to the Tools tab in the platform.
2. Click on the (+) button on the right hand side.
3. Choose Python 3 (or 2) with Jupyter Notebook.
4. Choose a name for the tool and set the number of cores and RAM.
5. When the notebook is created and started, just open it.
6. Have fun programming ;)

Inside the notebook you are free to do whatever you want.
I created a simple Python notebook to test PySnooper so you can try.
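To give a rough idea of what a PySnooper test looks like, here is a minimal sketch. It is not the notebook from the project: the function `number_to_bits` is a made-up example, and the fallback decorator is only there so the snippet still runs if the third-party `pysnooper` package is not installed in your tool.

```python
# PySnooper is a third-party package (pip install pysnooper).
# The fallback below is an assumption added so this sketch runs without it.
try:
    import pysnooper
    snoop = pysnooper.snoop()  # traces executed lines and variable changes to stderr
except ImportError:
    def snoop(func):
        # No-op replacement: the function runs normally, without tracing.
        return func


@snoop
def number_to_bits(number):
    """Return the binary digits of a non-negative integer, most significant first."""
    if number == 0:
        return [0]
    bits = []
    while number:
        number, remainder = divmod(number, 2)
        bits.insert(0, remainder)
    return bits


print(number_to_bits(6))
```

With PySnooper installed, running the cell prints a line-by-line trace of the function to stderr along with the value of each variable as it changes, which is handy when you would otherwise scatter print statements around.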
You can find that notebook in the MatrixDS project.

JupyterLab

JupyterLab is the next-generation web-based user interface for Project Jupyter.
It’s like Jupyter Notebooks on steroids.
To launch JupyterLab in MatrixDS do this:

1. Go to the Tools tab in the platform.
2. Click on the (+) button on the right hand side.
3. Choose Python 3 with JupyterLab.
4. Choose a name for the tool and set the number of cores and RAM.
5. When the tool is created and started, just open it.
6. Have more fun :)

I created a simple Python notebook in the JupyterLab instance so you can try it.
If you’ve been following along, you should now have the notebook open in JupyterLab.

By the way, if you want to know how to use git with MatrixDS, check this article: “Data Science with Optimus. Part 2: Setting your DataOps Environment” (towardsdatascience.com).

The test notebook I created tests the new library fklearn for functional machine learning.
You can find that notebook in the MatrixDS project.

Testing R things

https://www.computerworld.com/video/series/8563/do-more-with-r

I started my data science career on R.
It’s a great tool for doing data analysis, data cleaning, plotting and much more.
I think that right now the machine learning part is better in Python, but to be a successful data scientist you need to know them both.
To launch RStudio in MatrixDS do this:

1. Go to the Tools tab in the platform.
2. Click on the (+) button on the right hand side.
3. Choose R 3.5 with RStudio.
4. Choose a name for the tool and set the number of cores and RAM.
5. When the tool is created and started, just open it.
6. Have R fun :)

The test R environment I created tests a new library called g2r, which creates interactive visualizations using g2.
By the way, I had to do this before running g2r:

    sudo su
    apt-get install libv8-dev

Normally, this is what you do to get a plot with ggplot2:

    library(ggplot2)
    ggplot(iris, aes(Petal.Length, Petal.Width, color = Species)) +
      geom_point() +
      facet_wrap(. ~ Species)

That gives you a static scatter plot with one panel per species. Not so bad, but what about bringing interactivity to that? With g2r it’s very easy. This is the code for doing that:

    library(g2r)
    g2(iris, asp(Petal.Length, Petal.Width, color = Species)) %>%
      fig_point() %>%
      plane_wrap(planes(Species))

And you’ll get the same kind of plot, but interactive. In the code you change:

    aes -> asp
    geom_point() -> fig_point()
    facet_wrap(. ~ Species) -> plane_wrap(planes(Species))

I’m still wondering why they didn’t use the same API, but it’s a very cool project.
There are more examples you can check out here: Articles (g2r.dev). Get up and running with g2r, discover the similarities with ggplot2 and a few examples to point you in the right…

All the code is in the MatrixDS project.

Testing Julia things

When I was studying for my master’s in Physics (like 2 years ago) I really thought that Julia was going to revolutionize the scientific programming world.
Don’t get me wrong, it’s doing an amazing job, but I think new advancements in Python have left the project in second place for so many things.
For testing the Julia capabilities of MatrixDS I wanted to take a look at the data libraries of the language.
And you can see that below.
To launch a Julia notebook in MatrixDS do this:

1. Go to the Tools tab in the platform.
2. Click on the (+) button on the right hand side.
3. Choose Julia 1.1.0 with JupyterLab.
4. Choose a name for the tool and set the number of cores and RAM.
5. When the tool is created and started, just open it.
6. Let’s Julia 🙂 (sounds weird)

When you launch it, you can open or create any Python or Julia notebook. In the notebook I created for testing Julia’s capabilities for data science, I tested some libraries like DataFrames, Gadfly, Queryverse, Vega for plotting and more :)

So as you can see, for me this is the most complete platform for doing data science in the cloud.
It needs minimal configuration, and you can even install your own tools with Docker.
There’s much more to cover and things to do with the platform, and I’ll be doing that in other articles.
If you want to get in touch with me, follow me on Twitter: Favio Vázquez (@FavioVaz), twitter.com.