A Data Science Public Service Announcement

These contributions are a great place to get your feet wet working on open source, and they also help you build a profile that shows employers you’re committed to data science (please try not to make only token contributions for your profile; make a commitment to a project).

The ethos of open source is that each issue should be viewed as an opportunity: if you see a problem, there is nothing stopping you from making a contribution that improves the library for yourself and for millions of other people.

If neither of those options appeals to you, or you want to help out even more, you can provide financial support to open-source projects.

Now, you may say, “my donation can never make a difference,” but given the criminally small amounts most of these projects receive ($3000 for Pandas in 2017, $1300 for Numpy), even a minor donation can go a long way.

Moreover, a lot of small donations from many individuals add up: if everyone reading my articles in a typical month gave just $1 a month to open source, that would be over half a million dollars to support our data science tools!

When it comes to giving donations, sometimes there can be too many options, so I’ll give you just one focused on data science: NumFocus.

The stated goal of this charity is to “provide a stable, independent, and professional home for projects in the open source scientific data stack.” NumFocus gives support to open-source projects such as numpy, pandas, matplotlib, project jupyter, ipython, pymc3, pytables, and many others.

Donating to NumFocus is simple: click the image and within 30 seconds you’ll be a sustaining member.

Click this image and make a difference!

Personally, I donate $2 a month to NumFocus.

It’s a tiny amount — in the universal measure of value, less than one cup of coffee a month — but I enjoy knowing I’m doing a small part to help the libraries I love.

Becoming a sustaining member is also great because I don’t even have to remember to donate: I signed up once, my donation goes through automatically, and each month I get a thank-you email from NumFocus.

In case data science isn’t your field, or you want to contribute to other projects, here are a few organizations that support open-source:

Python Software Foundation
Apache Software Foundation
Software Freedom Conservancy

(You can see a larger list here.) Again, I think it’s important not to get overwhelmed, so just pick one or two, automate your support, and then you don’t even have to think about it.

The best automation lets us make the world better without any conscious effort!

If you want to go further, ask your company to become a sustaining member of an open-source organization as well.

If my experience is any indication, then your company will be glad to invest in these tools if you rely on them in your work.

A few weeks ago, I ran into a strange bug in Pandas which I posted about on Stack Overflow and then GitHub, where it joined 2800 other open issues for Pandas.

Shortly after, I received a comment directing me to the likely source of the issue (line 7400 in an 8000-line file forming the basis of dataframes in pandas) and was told that a pull request would be welcome.

Unfortunately, my technical skills and knowledge of pandas are nowhere near the level at which I want to go messing with the internals of the library.

So, feeling like I needed to do something to help, I turned to the CTO of my company (Cortex Building Intel) and asked if the company would be willing to contribute monthly to NumFocus.

Fortunately, our CTO realizes the value of supporting the technology we use every day and was happy to help out.

I share this story not because I’m a paragon, but because it shows there are multiple ways to support open-source.

When I was out of my technical league, I turned to another way to make a difference.

I’m not naive enough to think my action alone will alleviate the issue, but if enough people act, we can improve the sustainability of these tools.

Although it’s a little simplistic to think throwing money at problems will solve them, more paid developer time does help.

Pandas has a list of goals to reach before version 1.0 is released, and the only way these will get done in a timely manner is with funds to support paid developers.

Pandas road map (if they get more donations!)

To make your pitch to your company more effective, frame it this way: supporting open-source tools is investing in the future of your company.

Free and open-source technology has allowed many start-ups to get off the ground and now forms the technical core of numerous companies and even entire parts of the web.

Donations now will ensure open-source continues to level the technical field, strengthen our infrastructure, and provide us with the best data science tools.

Final Thoughts

Supporting open-source is about more than just having effective, free tools; it’s about being part of a larger community.

The most powerful solution to the tragedy of the commons is fostering a sense of community.

Make people feel like they belong to a shared group and they will work to ensure the resources are maintained for all members.

When you start to make contributions, you feel a stronger sense of community (something severely lacking in our world) and know you are helping yourself and others.

Also, if you do make a contribution of any type, you have my complete permission to boast about it on all your social media channels.

There are some activities — donating blood, volunteering at a food bank — which are so inherently good for the world that I never get tired of seeing posts about them.

When you donate to open source, shout it from the top of your personal mountain.

If you donate more than me (yes, this is a challenge), let me know; I’ll be glad to hear it.

If someone calls you annoying, just shrug them off: you are making the world a better place and they are not.

The open-source data science community will only keep growing, so let’s work to provide a sustainable foundation for the tools we all use.

As a reminder, here’s the game plan:

Submit quality issues and do your best to help out those solving them.
Go to your favorite open-source library, pick an issue (there should be some marked “good first issue”), and try to solve it.
If you are able, make a sustaining donation to NumFocus or another open-source organization.
If your company relies on open-source tools, talk to someone about sustaining company support for open-source software.
Post about your donations wherever you want. Convince others to do the same through conversations and writing.

As always, I welcome feedback and constructive criticism.

I can be reached on Twitter @koehrsen_will.

