10 Powerful Python Tricks for Data Science You Need to Try Today

Pandas is one of the most popular Python libraries around and is widely used for data manipulation and analysis.

We know that Pandas has amazing capabilities to manipulate and summarize the data.

I was recently working on a time series problem and noticed that Pandas had a Grouper function that I had never used before.

I became really curious about its use (the data scientist curse!).

As it turns out, Grouper is quite important for time series data analysis.

So, let’s try this out and see how it works.

You can download the dataset for this code here.

Now, the first step in dealing with any time series data is to convert the date column into a datetime format.
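As a minimal sketch of that conversion, assuming a small made-up frame (the column names `date` and `ext_price` are placeholders for illustration, not the article's dataset):

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative assumptions.
df = pd.DataFrame({
    "date": ["2019-01-05", "2019-02-11"],
    "ext_price": [100.0, 75.0],
})

# Parse the string column into pandas datetime values.
df["date"] = pd.to_datetime(df["date"])

# Datetime columns expose the .dt accessor (month, year, and so on).
print(df["date"].dt.month.tolist())
```

Once the column holds real datetimes, pandas can group and resample it by calendar period.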

  Suppose our objective is to see the monthly sales for each customer.

Most of us try to write something complex here.

But this is where Pandas has something more useful for us (got to love Pandas!).

Instead of having to play around with reindexing, we can use a simple approach via the groupby syntax.

We’ll add something extra to this function by providing a little more information on how to group the data in the date column.

It looks cleaner and works in exactly the same way.
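The two approaches might look like the sketch below. The data and the column names (`name`, `date`, `ext_price`) are invented for illustration and are not the article's dataset. The frequency alias `"MS"` (month start) is used only because it is accepted across pandas versions; the month-end alias differs between releases.

```python
import pandas as pd

# Hypothetical toy sales data; names and values are assumptions.
sales = pd.DataFrame({
    "name": ["Acme", "Acme", "Acme", "Zenith", "Zenith"],
    "date": ["2019-01-05", "2019-01-20", "2019-02-11", "2019-01-15", "2019-02-28"],
    "ext_price": [100.0, 50.0, 75.0, 20.0, 30.0],
})
sales["date"] = pd.to_datetime(sales["date"])

# Approach 1: move the date into the index, group by customer, then resample.
via_resample = (
    sales.set_index("date")
    .groupby("name")["ext_price"]
    .resample("MS")
    .sum()
)

# Approach 2: stay in groupby syntax and let pd.Grouper describe
# how the date column should be bucketed (here: by calendar month).
via_grouper = (
    sales.groupby(["name", pd.Grouper(key="date", freq="MS")])["ext_price"]
    .sum()
)

print(via_grouper)
```

On this toy data both forms give the same monthly totals per customer, indexed by (name, month).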

7. unstack: Transform the Index into Columns of your Dataframe

We just saw how grouper can be helpful in grouping time series data.

Now, here’s a challenge – what if we want to see the name column (which is part of the index in the above example) as columns of our dataframe?

This is where the unstack function becomes vital.

Let’s apply the unstack function on the above code sample and see the results.

Quite useful!

Note: If the index is not a MultiIndex, the output will be a Series.
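Here is a small sketch of unstack on a grouped result, again using invented data and column names (`name`, `date`, `ext_price`). By default unstack pivots the innermost index level into columns; passing `level="name"` pivots the customer names instead.

```python
import pandas as pd

# Hypothetical toy sales data; names and values are assumptions.
sales = pd.DataFrame({
    "name": ["Acme", "Acme", "Acme", "Zenith", "Zenith"],
    "date": pd.to_datetime(
        ["2019-01-05", "2019-01-20", "2019-02-11", "2019-01-15", "2019-02-28"]
    ),
    "ext_price": [100.0, 50.0, 75.0, 20.0, 30.0],
})

# Monthly totals per customer: a Series with a (name, date) MultiIndex.
monthly = sales.groupby(["name", pd.Grouper(key="date", freq="MS")])["ext_price"].sum()

# Default: the innermost level (date) becomes the columns, one row per customer.
by_month = monthly.unstack()

# Alternative: pivot the 'name' level, giving one column per customer.
by_name = monthly.unstack(level="name")

print(by_month)
print(by_name)
```

Which level to pivot depends on how you want to read the table: customers down the side, or customers across the top.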

8. %matplotlib notebook: Interactive Plots in your Jupyter Notebook

I’m a big fan of the matplotlib library.

It is the most common visualization library that we use to generate all kinds of plots in our Jupyter Notebooks.

To view these plots, we generally use one line – %matplotlib inline – while importing the matplotlib library.

This works well, but it renders only static plots within the Jupyter Notebook.

Just replace the line %matplotlib inline with %matplotlib notebook and watch the magic unfold.

You will get resizable and zoomable plots within your notebook!

https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2019/08/matplotlib-final.mp4

Brilliant! With just one word change, we can get interactive plots that allow us to resize and zoom within the plots.

9. %%time: Check the Running Time of a Particular Block of Python Code

There can be multiple approaches to solve one problem.

We know this pretty well as data scientists.

Computational costs matter in the industry – especially if it’s a small or medium-sized organization.

You might want to choose the best approach which completes the task in the minimum amount of time.

It’s actually very easy to check the running time of a particular block of code in your Jupyter notebook.

Just add the %%time command at the top of a cell to check the running time of that cell.

Here, we have the CPU time and the Wall time.

The CPU time is the total execution time or runtime for which the CPU was dedicated to a process.

Wall time is the time that a clock would have measured as having elapsed between the start of the process and ‘now’.
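To see the difference between the two numbers outside a notebook, here is a rough sketch using only the standard library. It is not what %%time does internally; it simply measures the same two quantities by hand. The function `busy_sum` is a made-up workload.

```python
import time


def busy_sum(n):
    """Made-up CPU-bound workload: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


wall_start = time.perf_counter()   # clock-on-the-wall timer
cpu_start = time.process_time()    # CPU time used by this process

result = busy_sum(200_000)
time.sleep(0.1)  # sleeping adds wall time but almost no CPU time

cpu_elapsed = time.process_time() - cpu_start
wall_elapsed = time.perf_counter() - wall_start

print(f"CPU time:  {cpu_elapsed:.3f} s")
print(f"Wall time: {wall_elapsed:.3f} s")
```

Because of the sleep, the wall time here should come out noticeably larger than the CPU time, which is the kind of gap that waiting on disk or network produces in real code.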

10. rpy2: R and Python in the Same Jupyter Notebook!

R and Python are two of the best and most popular open-source programming languages in the data science world.

R is mainly used for statistical analysis while Python provides an easy interface to translate mathematical solutions into code.

Here’s the good news – we can use both of them in a single Jupyter Notebook! We can make use of both ecosystems, and for that we just need to install rpy2.

So, let’s shelve the R versus Python debate for now and enjoy plotting ggplot-level charts in our Jupyter Notebook.

!pip3 install rpy2

We can use both languages together, and even pass variables between them.

  Here, we created a dataframe df in Python and used that to create a scatterplot using R’s ggplot2 library (the function geom_point).

Go ahead and try this out – you’re sure to love it.

End Notes

This is my essential Python tricks collection.

I love using these packages and functions in my day-to-day tasks.

Honestly, my productivity has increased and it’s made working in Python more fun than ever before.

Are there any Python tricks you want me to know apart from these? Let me know in the comments section below and we’ll trade ideas!

And if you’re a Python beginner and a newcomer in data science, you really should check out our comprehensive and best-selling course: Applied Machine Learning – Beginner to Professional.
