The problem with metrics is a big problem for AI

What is the long-term impact on hiring of being the subject of years’ worth of privacy scandals, political manipulation, and facilitating genocide? Simply measuring what users click on is a short-term concern, and does not take into account factors like the potential long-term impact of a long-form investigative article, which may have taken months to research and could help shape a reader’s understanding of a complex issue and even lead to significant societal changes.

A recent Harvard Business Review article looked at Wells Fargo as a case study of how letting metrics replace strategy can harm a business.

After identifying cross-selling as a measure of long-term customer relationships, Wells Fargo went overboard emphasizing the cross-selling metric: intense pressure on employees combined with an unethical sales culture led to 3.5 million fraudulent deposit and credit card accounts being opened without customers’ consent.

The metric of cross-selling is a much more short-term concern compared to the loftier goal of nurturing long-term customer relationships.

Overemphasizing metrics myopically shifts our focus onto the short term and away from long-term concerns such as our values, trust and reputation, and our impact on society and the environment.

It matters which metrics we gather and in what environment we do so.

Metrics such as what users click on, how much time they spend on sites, and “engagement” are heavily relied on by tech companies as proxies for user preference, and are used to drive important business decisions.

Unfortunately, these metrics are gathered in environments engineered to be highly addictive, laden with dark patterns, and where financial and design decisions have already greatly circumscribed the range of options.

Zeynep Tufekci, a professor at UNC and regular contributor to the New York Times, compares recommendation algorithms (such as YouTube choosing which videos to auto-play for you and Facebook deciding what to put at the top of your newsfeed) to a cafeteria shoving junk food into children’s faces. “This is a bit like an autopilot cafeteria in a school that has figured out children have sweet teeth, and also like fatty and salty foods. So you make a line offering such food, automatically loading the next plate as soon as the bag of chips or candy in front of the young person has been consumed.” As those selections get normalized, the output becomes ever more extreme: “So the food gets higher and higher in sugar, fat and salt – natural human cravings – while the videos recommended and auto-played by YouTube get more and more bizarre or hateful.”

Too many of our online environments are like this, with metrics capturing that we love sugar, fat, and salt, not taking into account that we are in the digital equivalent of a food desert and that companies haven’t been required to put nutrition labels on what they are offering.

Such metrics are not indicative of what we would prefer in a healthier or more empowering environment.

All this is not to say that we should throw metrics out altogether.

Data can be valuable in helping us understand the world, test hypotheses, and move beyond gut instincts or hunches.

Metrics can be useful when they are in their proper context and place.

One way to keep metrics in their place is to consider a slate of many metrics for a fuller picture (and resist the temptation to try to boil these down to a single score).

For instance, knowing the rates at which tech companies hire people from under-indexed groups is a very limited data point.

For evaluating diversity and inclusion at tech companies, we need to know comparative promotion rates, cap table ownership, retention rates (many tech companies are revolving doors driving people from under-indexed groups away with their toxic cultures), number of harassment victims silenced by NDAs, rates of under-leveling, and more.

Even then, all this data should still be combined with listening to first-person experiences of those working at these companies.
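To make the idea of a “slate” concrete, here is a minimal sketch in Python; the metric names and values are hypothetical and serve only to illustrate reporting several metrics side by side rather than collapsing them into a single composite score.

from dataclasses import dataclass

@dataclass
class InclusionSlate:
    # Hypothetical slate of diversity-and-inclusion metrics, reported together.
    hiring_rate: float            # share of new hires from under-indexed groups
    promotion_rate_ratio: float   # promotion rate relative to the company-wide rate
    retention_rate: float         # one-year retention for the same group
    under_leveling_rate: float    # share hired below the level their experience warrants

    def report(self) -> str:
        # Show each metric on its own line; deliberately no single weighted score.
        return "\n".join(f"{name}: {value:.2f}" for name, value in vars(self).items())

# Example with made-up numbers.
slate = InclusionSlate(
    hiring_rate=0.18,
    promotion_rate_ratio=0.85,
    retention_rate=0.72,
    under_leveling_rate=0.30,
)
print(slate.report())

The only design point here is that each number stays visible and interpretable on its own; nothing weights or averages the slate away into one figure.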

Columbia professor and New York Times Chief Data Scientist Chris Wiggins wrote that quantitative measures should always be combined with qualitative information: “Since we can not know in advance every phenomenon users will experience, we can not know in advance what metrics will quantify these phenomena. To that end, data scientists and machine learning engineers must partner with or learn the skills of user experience research, giving users a voice.”

Another key to keeping metrics in their proper place is to keep domain experts and those who will be most impacted closely involved in their development and use.

Surely most teachers could have foreseen that evaluating teachers primarily on the standardized test scores of their students would lead to a host of negative consequences.

I am not opposed to metrics; I am alarmed about the harms caused when metrics are overemphasized, a phenomenon that we see frequently with AI, and which is having a negative, real-world impact.

AI running unchecked to optimize metrics has led to Google/YouTube’s heavy promotion of white supremacist material, essay grading software that rewards garbage, and more.

By keeping the risks of metrics in mind, we can try to prevent these harms.

Bio: Rachel Thomas is director of the USF Center for Applied Data Ethics and co-founder of fast.ai, which has been featured in The Economist, MIT Tech Review, and Forbes.

She was selected by Forbes as one of 20 Incredible Women in AI, earned her math PhD at Duke, and was an early engineer at Uber.

Rachel is a popular writer and keynote speaker.

Original.

Reposted with permission.
