Why You Should Learn About Streaming Data Science

Mark Palmer, Mar 2

Adaptive learning and the unique use cases for data science on streaming data.

By Dr. Tom Hill and Mark Palmer

Traditional machine learning trains models based on historical data.

This approach assumes that the world essentially “stays the same” — that the same patterns, anomalies, and mechanisms observed in the past will happen in the future.

So, “predictive” analytics is really looking-to-the-past rather than the future.

Streaming business intelligence is innovative technology that allows business users to “query the future” based on real-time streaming data from any streaming data source including IoT sensors, web interactions or transactions, GPS position information or social media content.

And, at the same time, we can now apply data science models to streaming data.

Because we are no longer bound to look only at the past, the implications of streaming data science are profound.

Data science models based on historical data are good but not for everything

The majority of applications for machine learning today seek to identify repeated and reliable patterns in historical data that are predictive of future events.

When the relationships between dimensions and “concepts” are stable and predictive of future events, then this approach is practical.

For example, the number of visitors expected at a beach can be predicted from the weather and season-of-the-year: Fewer people will visit the beach in the winter or when it rains, and these relationships will be stable over time.

Likewise, the numbers, amounts, and types of credit card charges made by most consumers will follow patterns that are predictable from historical spending data, and any deviations from those patterns can serve as useful triggers for fraud alerts.

Further, even when there is so-called “concept-drift” and when relationships between variables change over time — for example when credit card spending patterns change — efficient model monitoring and automatic batch-updates (“recalibration,” or “re-basing”) of models can yield an effective, accurate, yet adaptive system.
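The monitoring-plus-recalibration loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the "model" is a bare mean and standard deviation of charge amounts, and the class name, thresholds, and window size are all hypothetical rather than taken from any particular product.

```python
from collections import deque
from statistics import mean, pstdev


class RecalibratingBaseline:
    """Flags unusual charges and re-fits its baseline when alerts pile up.

    Hypothetical illustration: the 'model' is only a mean and a standard
    deviation learned from historical amounts.
    """

    def __init__(self, history, z_limit=3.0, window=50, max_alert_rate=0.2):
        self.z_limit = z_limit
        self.max_alert_rate = max_alert_rate
        self.recent = deque(maxlen=window)   # most recent raw amounts
        self.alerts = deque(maxlen=window)   # 1 if the amount was flagged, else 0
        self.recalibrations = 0
        self._fit(history)

    def _fit(self, amounts):
        self.mu = mean(amounts)
        self.sigma = pstdev(amounts) or 1.0  # guard against a zero spread

    def observe(self, amount):
        """Score one charge against the current baseline, then monitor drift."""
        flagged = abs(amount - self.mu) / self.sigma > self.z_limit
        self.recent.append(amount)
        self.alerts.append(1 if flagged else 0)
        # Monitoring: a full window dominated by alerts suggests the pattern
        # itself has shifted, so re-fit on the recent batch ("re-basing").
        window_full = len(self.alerts) == self.alerts.maxlen
        if window_full and mean(self.alerts) > self.max_alert_rate:
            self._fit(list(self.recent))
            self.alerts.clear()
            self.recalibrations += 1
        return flagged
```

A single odd charge raises an alert, while a sustained shift in spending triggers a re-fit, after which the new level is treated as normal. A real system would also need to keep confirmed fraud out of the data used for re-fitting, which this sketch does not attempt.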

Streaming Data Science applies algorithms in-stream

In some cases, however, there are significant advantages to applying learning algorithms to streaming data in real time.

Sometimes, a critical factor that drives application value is the speed at which newly identified and emerging insights are translated into actions.

Sometimes, there are advantages to applying learning algorithms to streaming data in real time, rather than waiting for it to come to rest in a database.

For example, identifying the critical factors that predict public opinions, fashions, consumer preferences, or the dynamics of complex automated manufacturing processes requires a highly agile approach: continuous modeling, continuous model updating, and learning from the most recent data in real time.

Streaming BI: an enabling technology for Streaming Data Science

To understand streaming data science, it helps to understand streaming BI first.

The short video below shows Streaming BI in action for a Formula One race car.

Embedded IoT sensors stream data as the car speeds around the track.

Analysts see a real-time, continuous view of the car’s position and data: throttle, RPM, brake pressure — potentially hundreds, or thousands of metrics.

By visualizing some of those metrics, a race strategist can see what static snapshots could never reveal: motion, direction, relationships, the rate of change.

Like an analytics surveillance camera.

Streaming Business Intelligence allows business analysts to query real-time data.

By embedding data science models into the streaming engine, those queries can also include predictions from models scored in real time.

The innovation of Streaming BI is that you can query real-time data, and also query the future.

Once you create a visualization, the system remembers your questions that power the visualization and continuously updates the results.

You just set it and forget it.

In this case, the BI tool registers this question:

“Select Continuous * [location, RPM, Throttle, Brake]”

When any data changes on the stream (location, RPM, throttle, brake pressure), the visualization updates automatically.

Computations change.

Relationships change.

Visual elements change.
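To make the "set it and forget it" idea concrete, here is a minimal sketch of a continuous query in Python. It is not the syntax or API of any Streaming BI product: the ContinuousQuery class, its handle method, and the callback standing in for a visualization are all hypothetical names chosen for illustration.

```python
class ContinuousQuery:
    """Remembers a selection of fields and pushes a fresh result on every change."""

    def __init__(self, fields, on_update):
        self.fields = tuple(fields)
        self.on_update = on_update   # callback standing in for a visualization
        self.current = {}            # latest known value of each selected field

    def handle(self, event):
        """Apply one stream event; re-emit the result only if a selected field changed."""
        changed = {
            key: value
            for key, value in event.items()
            if key in self.fields and self.current.get(key) != value
        }
        if changed:
            self.current.update(changed)
            self.on_update(dict(self.current))
        return bool(changed)


# Register the question once; every later event is evaluated against it.
frames = []
query = ContinuousQuery(["location", "RPM", "Throttle", "Brake"], frames.append)
query.handle({"RPM": 11200, "Throttle": 0.92})
query.handle({"RPM": 11200, "gear": 6})      # no selected field changed: no update
query.handle({"Brake": 0.4, "RPM": 9800})
```

The design point is that the question is stored and the data flows past it, the reverse of a database query, where the data is stored and the question is run once.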

The ground-breaking innovation of Streaming BI is that you can query real-time data, and also query for future events and conditions.

What questions would you ask if you could query the future?

A race team can ask when the car is about to take a suboptimal path into a hairpin turn, figure out when the tires will start showing signs of wear given track conditions, or understand when the weather forecast is about to affect tire performance.

Streaming BI allows you, for the first time, to query the future.
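One simple way to picture a "query on the future" is trend extrapolation: fit a line to the most recent readings and ask when it will cross a threshold. The sketch below does this for a made-up tire wear metric. The ThresholdForecast class, the wear scale, and the window size are assumptions for illustration, and a production system would likely use a richer model than a straight line.

```python
from collections import deque


class ThresholdForecast:
    """Estimates when a streaming metric will reach a threshold.

    Hypothetical illustration: a least-squares line over a sliding window
    of (lap, wear) readings, extrapolated forward.
    """

    def __init__(self, threshold, window=5):
        self.threshold = threshold
        self.points = deque(maxlen=window)   # most recent (lap, wear) pairs

    def update(self, lap, wear):
        """Add a reading; return the lap at which the trend hits the threshold, or None."""
        self.points.append((lap, wear))
        n = len(self.points)
        if n < 2:
            return None                      # a trend needs at least two points
        mean_x = sum(x for x, _ in self.points) / n
        mean_y = sum(y for _, y in self.points) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in self.points)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in self.points)
        if sxx == 0 or sxy <= 0:
            return None                      # flat or falling: no crossing ahead
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        return (self.threshold - intercept) / slope
```

Calling update on every new reading keeps the answer current, so an alert can fire as soon as the predicted crossing falls within, say, the next few laps.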

But what if those queries could also incorporate data science algorithms? Well, they can!

Adaptive learning use cases

In environments where the real world changes frequently, analyzing only what happened in the past may not be as effective as also analyzing what’s happening now.

Adaptive learning with streaming data is the data science equivalent of how humans learn by continuously observing the environment.

In complex manufacturing, a nearly infinite number of different failure modes can occur.

To avoid such failures, streaming data can help identify patterns associated with quality problems as they emerge, and as quickly as possible.

When never-before-seen root causes (machines, manufacturing inputs) begin to affect product quality, which is evidence of concept drift, staff can respond quickly.

Adaptive learning from streaming data means continuously learning and calibrating models on the newest data, and sometimes applying specialized algorithms that improve the prediction models and make the best available predictions at the same time.

Other examples where continuous adaptive learning from streaming data is instrumental include price optimization for insurance products or consumer goods, fraud detection applications in financial services, or the rapid identification of changing consumer sentiment and fashion preferences.
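The idea of predicting and improving the model at the same time can be sketched as a "predict, then learn" loop, where each event is scored before its outcome is used for training. The example below is a minimal illustration, assuming a linear model updated by stochastic gradient descent. The class and function names are hypothetical, and real deployments would typically rely on a library built for incremental learning.

```python
class OnlineLinearModel:
    """A linear model updated one observation at a time (stochastic gradient descent)."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def learn(self, x, y):
        """Take one gradient step on the squared error of a single observation."""
        error = self.predict(x) - y
        self.bias -= self.learning_rate * error
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]


def predict_then_learn(model, stream):
    """Score each event before its label is used for training.

    Returns the absolute prediction error per event, which doubles as a
    running picture of how well the model tracks the stream.
    """
    errors = []
    for x, y in stream:
        errors.append(abs(model.predict(x) - y))
        model.learn(x, y)
    return errors
```

If the relationship in the stream changes partway through, the per-event error jumps and then shrinks again as the model adapts, with no separate retraining job. The learning rate controls the trade-off: higher values adapt faster but react more strongly to noise.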

Towards the future of Streaming Data Science

Learning from continuously streaming data is different from learning based on historical data or data at rest.

Most implementations of Machine Learning and Artificial Intelligence depend on large data repositories of relevant historical data and assume that historical data patterns and relationships will be useful for predicting future outcomes.

However, when streaming data is used to monitor and support business-critical continuous processes and applications, dynamic changes in data patterns are often expected.

Different analytic and architectural approaches are required to analyze data in motion, compared to data at rest.

Streaming BI provides unique capabilities enabling analytics and AI for practically all streaming use cases.

Applied well, these capabilities can become a source of business-critical competitive differentiation.

