How not to Use Machine Learning Models

I was very interested in how the presentation team would be using the model though, so I asked, “So what are you going to use the model for once it is built?”

The answer that came back was, “We will use the model to determine who is at risk of failing the year and who is not. Once we have done that, we will focus our resources more on those who are not going to fail, to ensure that they pass with significantly good results.”

*Jaw drop*

Since the team was made up of students who are new to data science, I decided it was a good chance to give them a lesson on the usage of models.

So I continued, “So if a student is classified by the model as being at risk, it means he/she will be deprived of any resources to succeed and get good grades?”

“Yes!” came back the answer.

So I continued, “What happens if I am using the model and YOU are classified as at risk of failing? Do you think it is fair for me to deprive you of any teaching/learning resources?”

Silence…

“You have to know that a machine learning model here only gives us a probability of an outcome, not a 100% guarantee. If someone is ‘predicted’ to be at risk, it is only because the current features inside the model tell us that. Life is more complicated, since there are other factors that can affect an outcome, in this case failure in exams, besides those captured in the data.”

Continuing, “What you could have proposed, with the same model, is that we can firstly investigate which factors indicate that a student is at risk, and why that is the case. Secondly, we should aim to devise a good plan to help the students who are at risk of failure, since we can now identify them better. This will improve society overall, producing more productive members of society and perhaps helping more students get out of the poverty trap.”

Usage of Machine Learning Model

What you might notice is that with the same model, depending on how it is used, we can either help more people with it or discriminate against them. I am a very strong believer that building the model is just a small part of a data science project; knowing how to use the model, especially ethically, is just as important. This is the reason why, for projects where I have control over the grading criteria, I place a significant portion of the score on strategy and implementation, not on the IT infrastructure but on the business processes.

I hope that readers, after reading this, will put more thought into how they use their models and consider the impact on people or customers. Building the ‘best’ machine learning model is only a small part of the bigger picture of deriving value from our data.

I hope the blog has been useful to you. I wish all readers a FUN Data Science learning journey, and do visit my other blog posts and LinkedIn profile.
