What are .npy files and why you should use them…

More than 70x faster! That is A LOT faster. Also notice that we didn’t need to reshape the data, since that information was contained in the .npy file.
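To make the reshaping point concrete, here is a minimal sketch. The array, its shape and the temporary file names are made up for illustration and are not the data timed above. It saves the same array as plain text and as .npy, then loads both back:

```python
import os
import tempfile

import numpy as np

# Hypothetical example data: a small 3-D array (shape chosen arbitrarily).
original = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "data.txt")
    npy_path = os.path.join(tmp, "data.npy")

    # np.savetxt only accepts 1-D or 2-D data, so the array is flattened first.
    np.savetxt(txt_path, original.reshape(-1))
    # np.save writes the shape and dtype into a header next to the raw values.
    np.save(npy_path, original)

    from_txt = np.loadtxt(txt_path)
    from_npy = np.load(npy_path)

print(from_txt.shape)  # (24,)
print(from_npy.shape)  # (2, 3, 4)

# The text version has to be reshaped by hand to get the original back.
restored = from_txt.reshape(original.shape)
```

The text round trip only works because we still remember the original shape; the .npy file carries it along by itself.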

Another “minor” feature of using .npy files is the reduced storage the file occupies. In this case it’s more than a 50% reduction in size. This can vary a lot, but in general .npy files are more storage friendly.
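If you want to check the size difference on your own data, a rough sketch could look like this. The random float array and the file names are assumptions for the example, so the ratio it prints will not match the figure quoted above:

```python
import os
import tempfile

import numpy as np

# Hypothetical data: 100,000 random float64 values (not the article's data).
rng = np.random.default_rng(0)
data_array = rng.random((1000, 100))

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "data.txt")
    npy_path = os.path.join(tmp, "data.npy")

    np.savetxt(txt_path, data_array)  # default text format is '%.18e' per value
    np.save(npy_path, data_array)     # 8 bytes per float64 plus a small header

    txt_size = os.path.getsize(txt_path)
    npy_size = os.path.getsize(npy_path)

print(f"data.txt: {txt_size} bytes")
print(f"data.npy: {npy_size} bytes")
print(f"ratio:    {npy_size / txt_size:.2f}")
```

How much you save depends on the dtype and on how the text was formatted. Small integers written as text can even take less space than the same values stored as int64 in a .npy file.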

“What about Pandas and their .csv handling?”

Let’s find out! First let’s create a proper .csv file for Pandas to read, since this would be the most likely real-life scenario.

    data = pd.DataFrame(data_array)
    data.to_csv('data.csv', index=False)

This simply saves the ‘data_array’ we created before as a standard .csv file without an index column.

Now let’s load it and see what kind of time we get. This gives me the following output: 2.66 seconds.
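Here is a minimal sketch of how such a load could be timed. It assumes `time.perf_counter` as the stopwatch and uses a randomly generated stand-in for ‘data_array’ whose size is my own pick. The number it prints depends on your machine and data and is not the 2.66 seconds quoted above:

```python
import os
import tempfile
import time

import numpy as np
import pandas as pd

# Hypothetical stand-in for 'data_array'; the size is an assumption.
data_array = np.random.default_rng(0).random((50_000, 10))

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "data.csv")
    pd.DataFrame(data_array).to_csv(csv_path, index=False)

    # Time only the read, not the file creation above.
    start = time.perf_counter()
    data = pd.read_csv(csv_path)
    elapsed = time.perf_counter() - start

print(f"{elapsed:.2f} seconds")
```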

Faster than the standard .txt read, but still a snail’s pace compared to the .npy file!

Now you might think this is cheating because we’re also loading into a Pandas DataFrame, but it turns out that the time loss for that is negligible if we read the data in like this:

    data_array = np.load('data.npy')
    data = pd.DataFrame(data_array)

If we time it, the result is almost no different from loading without a DataFrame.
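To see the DataFrame overhead for yourself, something along these lines should do. Again, the array size and the use of `time.perf_counter` are assumptions for the sketch, and the timings are whatever your machine produces:

```python
import os
import tempfile
import time

import numpy as np
import pandas as pd

# Hypothetical stand-in for 'data_array'; the size is an assumption.
data_array = np.random.default_rng(0).random((50_000, 10))

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "data.npy")
    np.save(npy_path, data_array)

    # Load the raw array only.
    start = time.perf_counter()
    loaded = np.load(npy_path)
    load_only = time.perf_counter() - start

    # Load the array and wrap it in a DataFrame.
    start = time.perf_counter()
    data = pd.DataFrame(np.load(npy_path))
    load_and_wrap = time.perf_counter() - start

print(f"np.load only:        {load_only:.4f} seconds")
print(f"np.load + DataFrame: {load_and_wrap:.4f} seconds")
```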

The take-away

You’re probably used to loading and saving data as .csv, but the next time you do a data-science project, try getting into the habit of loading and saving to .npy files instead! It’ll save you a lot of downtime and annoyance when you’re waiting for the kernel to load your file!

◾️ ◾️ ◾️

I hope this was enjoyable and useful even though it was short! Make sure to follow my profile if you enjoyed this article and want to see more!

◾️ ◾️ ◾️
