Is It Time for a Data Scientist Code of Ethics?

While DeepNude is an implementation of a collection of publicly available machine learning techniques, perhaps we need to take a step back and look at how the research for such technology is being done and made possible by the data scientist community itself.

The Medical Code of Ethics

To start, as a point of reference, medicine has had a code of ethics dating back to the Greek Hippocratic Oath.

Image: The Greek physician Hippocrates

More modern interpretations of the medical ethics system revolve around a set of moral principles that serve as guidelines.

According to Wikipedia’s page on medical ethics, they break down into the following four values:

Respect for autonomy: the patient has the right to refuse or choose their treatment.

Beneficence: a practitioner should act in the best interest of the patient.

Non-maleficence: a practitioner should not be the cause of harm. Closely related is “utility”: to promote more good than harm.

Justice: concerns the distribution of scarce health resources and the decision of who gets what treatment.

Even these alone are a good starting place for a data scientist code of ethics, but let’s take a moment to explore some potential concepts that could define a specialized set of ethics for data scientists.

Data Science Code of Ethics

At the top of most people’s minds may be the idea of using such a code to define accountability.

But the real concept behind a code of ethics isn’t accountability, per se, but the idea that the group can collectively agree on a set of core principles.

These principles drive the actions at a systemic level to help ensure that an individual’s moral compass is pointing in the right direction when their own values or beliefs are questioned.

Several groups are currently trying to define these principles, including The Data Science Association, Alan Fritzler’s data science for social good, and the Oxford-Munich Code of Conduct.

But while these are great, detailed attempts to cover the wide variety of a data scientist’s responsibilities and the work they perform, perhaps we need a smaller, more focused set of values to agree on:

Non-maleficence

This one is lifted directly from above: data scientists should work in the best interest of humanity in a way that doesn’t intentionally cause harm.

While using machine learning for autonomous vehicle driving is something that is for the greater good, the conversation becomes more complicated when applied to military vehicles.

Data scientists need to think through the ramifications of their actions to better understand if the final creation will promote more good than harm.

Statutory

It should go without saying that data scientists should stay within not only the laws of their own country but also internationally agreed regulations.

Data scientists should be directly aware of the legal ramifications of their creations.

Rushing something out just to beat someone else to it, on the pretext that its release was inevitable anyway, is not acceptable.

The Greater Good

Not every creation is solely good or bad.

Perhaps one of the most challenging positions for data scientists is the reality that they are, for the most part, performing research.

It’s not always possible to control the implementation of such research.

But as a group, the research should be for the greater good.

Notability

Hiding behind anonymity for fear of the potential outcome shouldn’t be a way to protect oneself.

Data scientists’ names are attached to the research and, in turn, have a direct connection to the implementation.

Standing behind one’s work and the derivatives of it can help ensure that there is a level of responsibility to the creations it inspires.

Sometimes, it takes events like these to force our hand to regulate something that may have previously been considered harmless or inconsequential.

It doesn’t take a code of ethics to judge the use case for releasing something like DeepNude.

If the author of the app were confident of the legality of releasing it, they wouldn’t have done it anonymously.

I’m hesitant to see what the ramifications of that action even are.

The above set of suggested values is just my knee-jerk reaction to something alarming, which I’m afraid will only get worse as the deepfake space continues to grow on the backs of good research done by unassuming data scientists.

There are even larger conversations to be had around the collection and usage of data that goes into machine learning.

What about the privacy of individuals connected directly or indirectly to the work’s outcome? How can legislation be created to protect data scientists and those affected by their actions? It’s too large a subject to cover here, but it’s a conversation we need to start now.

What’s Next?

There is a broader exploration to be had about real accountability for the work being done by data scientists since there is a fine line between regulation and stifling innovation.

One can use any instrument for good or bad, but a clear line in the sand is the belief that the tools you build should deliver benefits that far outweigh the harm.

Photo by Neil Rosenstech on Unsplash

I fear that the situation will get worse before it gets better.

If we don’t take steps now to instill a sense of responsibility in the new generation of data scientists entering the field, we as a society may have a difficult time overcoming the damage being done.

And while a Google search for “data scientist code of ethics” returns results, the fact that there is no single source of truth is something we need to address before it’s too late.

