Doing Data the Right Way

One of my main resources is the Data Science Ethics course on Coursera by Professor Jagadish at the University of Michigan.

Data Collection

Given a Data Science task, the first general phase of the process is the collection and cleaning of data. Many points of ethical contention arise, like my encounter with web scraping.

Here’s a question to kickstart the ethics of data collection: if I take a photograph of you, does the photograph belong to me because I took it? Or, since it has you as the subject, does the photograph belong to you?

This basic example already brings up two major concepts that have been gaining traction in the field of Data Science Ethics: informed consent and data ownership.

The notion of informed consent comes from the field of medical research, where patients have to know the full risks of a treatment before undergoing it. More relevantly, informed consent means that subjects of a study have to know that they are being researched. This seems obvious, doesn’t it? If you’re taking part in a clinical trial, you’d know that you’re part of it, and it would be completely unethical for researchers to drug random people to test the effectiveness of some treatment.

Now apply this concept to Data Science. Many businesses make decisions based on A/B tests. A/B testing might not have the same impact on subjects as, say, testing a cancer drug, but it’s also a kind of experiment.

Here are a couple of real-life examples to consider. In 2014, it was revealed that Facebook had manipulated the news feeds of users in a study on emotional contagion, conducted with Cornell University and the University of California, San Francisco. The researchers wanted to find out whether users who saw more positive posts in their feeds would post more positively, and whether users who saw more negative posts would post more negatively.
