What does a Principal Data Scientist look like in 2025?

That’s why we need PDSs.

And they’re certainly experts in ML.

They’ve each read the 80-page Machine Learning as a Service (MLaaS) manual provided by Google Cloud Platform (GCP).

(Don’t be intimidated by this long book; it’s only a picture book.

Or, as one might say, a slide deck.)

Hence, each PDS has mastered this technology and therefore knows how to click through the ML wizard, always accepting the default settings, and finally click the big “Run Machine Learning!!!” button.

(Yes, Google added the exclamation points in response to overwhelming customer feedback.)

But their expertise in ML alone is insufficient for them to solve such challenging artificial intelligence problems.

For as every PDS knows, ML applied to poorly prepared data is useless.

Hence, PDSs also need deep expertise in their problem domain so as to properly prepare their data.

Their specialized graduate coursework trains them accordingly.

Further, PDSs are experts at leveraging data microscopes.

These are advanced technologies that allow their highly technical users to explore the structure of data and annotate data elements.

For example, a PDS specialized in comment moderation knows how to discover flagged comments by clicking the appropriate tab in their data microscope.

They then apply their professional judgment in annotating individual comments with the right labels.

After roughly a week, they’ve prepared sufficient data (averaging 2k comments/day) to start applying state-of-the-art ML methods.

In 2025, running MLaaS on GCP takes on average 4 minutes and costs $7.


A newly minted PDS Ph.D. may think this is where the project ends.

But as any veteran knows, no ML solution is perfect on the first try.

Instead, real-world data science is an iterative loop whereby we constantly improve the methods through user feedback.

One might call this “organic reinforcement learning” whereby users of a platform let the PDS know where the AI methods are currently failing by contesting ML-predicted labels.

In an attempt to improve the ML performance, some junior PDSs may be tempted to play with the MLaaS settings in GCP — thinking they can do ML better than Google — but they soon find this to be in vain.

For our God, Google, is omnipotent in the domain of ML.

Instead, the problem is always insufficiently prepared data.

Back to the microscope!

And remember, don’t hit that “Run Machine Learning!!!” button more than once a week; it costs us real money.

As you might imagine, PDSs are well-compensated professionals in 2025.

They average $18/hr and the top contractors can make north of $32/hr.

These top professionals are chiefly distinguished by their emotional fortitude in such soul-racking domains as identifying child pornography.

Hence, the big bucks.

Like dentists before them, these professionals exhibit a statistically significantly higher suicide rate.

Too bad they don’t know statistics.

Look out for part II where we consider how large teams of PDSs are coordinated to collaboratively tackle massive artificial intelligence problems in 2025.

Spoiler: coordinating and managing PDSs is such a difficult job that we only trust computers to do it.

