More than 70x faster! Also notice that we didn't need to reshape the data, since the shape information is stored in the .npy file itself.
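To illustrate the point about reshaping, here is a minimal sketch. It assumes a small, hypothetical 3-D array (the article's real data is not reproduced here) and temporary file paths chosen just for the example. It shows that `np.load` returns the array with its original shape and dtype, while the text route needs a manual reshape on both sides:

```python
import os
import tempfile

import numpy as np

# Hypothetical 3-D array used only for illustration.
original = np.arange(24, dtype=np.float64).reshape(2, 3, 4)

with tempfile.TemporaryDirectory() as tmp:
    npy_path = os.path.join(tmp, "data.npy")
    txt_path = os.path.join(tmp, "data.txt")

    # .npy stores the shape and dtype in the file header.
    np.save(npy_path, original)
    from_npy = np.load(npy_path)

    # np.savetxt only accepts 1-D or 2-D arrays, so we flatten to 2-D
    # before saving and reshape by hand after loading.
    np.savetxt(txt_path, original.reshape(2, 12))
    from_txt = np.loadtxt(txt_path).reshape(2, 3, 4)

print(from_npy.shape, from_npy.dtype)  # (2, 3, 4) float64
```

With the .npy route there is no need to remember the original dimensions, because the file carries them.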
Another “minor” feature of .npy files is the reduced storage the file occupies. In this case it’s more than a 50% reduction in size. This can vary a lot, but in general .npy files are more storage-friendly.
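If you want to check the size difference on your own data, something like the following sketch should work. It assumes the comparison is against a text file written with `np.savetxt` using its default format, and it uses a small random array as a stand-in for the real data, so the exact percentage will differ from the one quoted above:

```python
import os
import tempfile

import numpy as np

# Stand-in data (assumption): replace with your own array.
data_array = np.random.rand(1_000, 10)

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "data.txt")
    npy_path = os.path.join(tmp, "data.npy")

    np.savetxt(txt_path, data_array)  # human-readable text
    np.save(npy_path, data_array)     # binary .npy

    txt_bytes = os.path.getsize(txt_path)
    npy_bytes = os.path.getsize(npy_path)

print(f".txt: {txt_bytes} bytes")
print(f".npy: {npy_bytes} bytes")
print(f"reduction: {1 - npy_bytes / txt_bytes:.0%}")
```

The saving depends heavily on the text format and the dtype, which is why the reduction can vary so much from one dataset to another.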
“What about Pandas and its .csv handling?” Let’s find out! First, let’s create a proper .csv file for Pandas to read, since this would be the most likely real-life scenario.
    data = pd.DataFrame(data_array)
    data.to_csv('data.csv', index=False)  # do not write the index column

This simply saves the ‘data_array’ we created before as a standard .csv file without an index column.
Now let’s load it and see what kind of time we get. This gives me the following output: 2.66 seconds. Faster than the standard .txt read, but still a snail’s pace compared to the .npy file!

Now you might think this is cheating because we’re also loading into a Pandas DataFrame, but it turns out that the time lost to that is negligible. If we read the data in like this:

    data_array = np.load('data.npy')
    data = pd.DataFrame(data_array)

and time it, the result is almost no different from loading without a DataFrame.
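For anyone who wants to reproduce this comparison, here is one way it could be timed. This is only a sketch: the random stand-in array, the temporary file paths, and the use of `time.perf_counter` are my assumptions and not the original timing code, so the numbers will not match the ones above.

```python
import os
import tempfile
import time

import numpy as np
import pandas as pd

# Stand-in data (assumption): the real data_array is much larger.
data_array = np.random.rand(10_000, 20)

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "data.csv")
    npy_path = os.path.join(tmp, "data.npy")

    # Write the same data in both formats.
    pd.DataFrame(data_array).to_csv(csv_path, index=False)
    np.save(npy_path, data_array)

    # Time the Pandas .csv read.
    start = time.perf_counter()
    df_from_csv = pd.read_csv(csv_path)
    csv_seconds = time.perf_counter() - start

    # Time the .npy read, including the DataFrame construction.
    start = time.perf_counter()
    df_from_npy = pd.DataFrame(np.load(npy_path))
    npy_seconds = time.perf_counter() - start

print(f"read_csv:            {csv_seconds:.4f} s")
print(f"np.load + DataFrame: {npy_seconds:.4f} s")
```

Single-run timings like these are noisy, so it is worth repeating the measurement a few times before drawing conclusions.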
The take-away

You’re probably used to loading and saving data as .csv, but the next time you do a data-science project, try getting into the habit of loading and saving to .npy files instead! It’ll save you a lot of downtime and annoyance when you’re waiting for the kernel to load your file!

◾️ ◾️ ◾️

I hope this was enjoyable and useful, even though it was short! Make sure to follow my profile if you enjoyed this article and want to see more!

◾️ ◾️ ◾️