The Data Fabric for Machine Learning – Part 2: Building a Knowledge-Graph

In the previous articles of the series:

The Data Fabric for Machine Learning. Part 1. How the new advances in semantics can help us be better at Machine Learning. (towardsdatascience.com)

The Data Fabric for Machine Learning. Part 1-b: Deep Learning on Graphs. Deep learning on graphs is taking more importance by the day. Here I’ll show the basics of thinking about machine… (towardsdatascience.com)

I’ve been talking about the data fabric in general, and giving some concepts of Machine Learning and Deep Learning in the data fabric.

I also gave my definition of the data fabric:

The Data Fabric is the platform that supports all the data in the company: how it’s managed, described, combined and universally accessed. This platform is formed from an Enterprise Knowledge Graph to create a uniform and unified data environment.

If you take a look at the definition, it says that the data fabric is formed from an Enterprise Knowledge Graph, so we’d better know how to create and manage it.

General: Set up the basis of knowledge-graph theory and construction.

Specifics: The fabric in the data fabric is built from a knowledge-graph. To create a knowledge-graph you need semantics and ontologies to find a useful way of linking your data that uniquely identifies and connects data with common business terms.

The knowledge-graph consists of integrated collections of data and information that also contain huge numbers of links between different data items.

The key here is that instead of looking for possible answers, under this new model we’re seeking an answer.

 We want the facts — where those facts come from is less important.

The data here can represent concepts, objects, things, people and actually whatever you have in mind.

The graph fills in the relationships, the connections between the concepts.

In this context we can ask this question of our data lake: what exists here?

We are in a different kind of “here” now: one where it’s possible to set up a framework to study data and its relation to other data. In a knowledge-graph, information represented in a particular formal ontology can be more easily accessed by automated information processing, and how best to do this is an active area of research in computer science fields like data science.

All data modeling statements (along with everything else) in ontological languages and the world of knowledge-graphs are incremental by their very nature. Enhancing or modifying a data model after the fact can be easily accomplished by modifying the concept.

With a knowledge-graph what we are building is a human-readable representation of data that uniquely identifies and connects data with common business terms.

This “layer” helps end users access data autonomously, securely and confidently.

Remember this image? I proposed before that an insight in the data fabric can be thought of as a dent in it, and that the automatic process of discovering what that insight is, is machine learning.

But what is this fabric? It is the object formed by the knowledge-graph.

Like in Einstein’s theory of relativity, where the fabric is made by the continuum (or discrete?) of spacetime, here the fabric is built when you create a knowledge-graph.

For building the knowledge-graph you need linked data. The goal of linked data is to publish structured data in such a way that it can be easily consumed and combined with other linked data, with ontologies as the way we connect entities and understand their relationships.
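As a toy sketch of that goal (the datasets, identifiers, and predicates below are made up for illustration), two independently published sets of triples that share identifiers can be combined with a plain union, after which cross-source questions become answerable:

```python
# Toy sketch: two independent linked-data sources that reuse the same
# identifier ("emp:001") can be combined by a simple set union.
hr_data = {
    ("emp:001", "hasName", "Alice"),
    ("emp:001", "worksIn", "dept:sales"),
}
finance_data = {
    ("dept:sales", "hasBudget", "1000000"),
    ("emp:001", "hasSalary", "90000"),
}

combined = hr_data | finance_data

# Because both sources identify the employee the same way, we can now
# answer a question neither source could answer alone.
alice_facts = {(p, o) for (s, p, o) in combined if s == "emp:001"}
print(alice_facts)
```

The whole point of the shared identifier is that no joining logic is needed; the links are already in the data.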

A while ago Sebastien Dery wrote an interesting article about the challenges of knowledge-graphs. Here you can take a look:

Challenges of Knowledge Graphs. From Strings to Things — An Introduction. (medium.com)

And from the great blog at cambridgesemantics.com:

Learn RDF. Introduction. This set of lessons is an introduction to RDF, the core data model of the Semantic Web and the foundation… (www.cambridgesemantics.com)

From these and more resources comes a concept I haven’t even mentioned in any article, but which is very important: the concept of triples: subject, predicate, and object (or entity-attribute-value).

Commonly, when you study triples, what is actually meant is the Resource Description Framework (RDF).

RDF is one of the three foundational Semantic Web technologies, the other two being SPARQL and OWL.

RDF is the data model of the Semantic Web.
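As a rough illustration of the RDF data model (a hand-rolled sketch; real projects would use an RDF library, and the example.org URIs below are placeholders, not a real vocabulary), a triple can be written in an N-Triples-like line where subject, predicate, and object each become a URI:

```python
# Hand-rolled sketch of an N-Triples-like serialization of one triple.
# The http://example.org/ namespace is a made-up placeholder.
BASE = "http://example.org/"

def to_ntriple(subject, predicate, obj):
    """Render a (subject, predicate, object) triple as one N-Triples-like line."""
    return f"<{BASE}{subject}> <{BASE}{predicate}> <{BASE}{obj}> ."

line = to_ntriple("Geoffrey_Hinton", "isA", "Researcher")
print(line)
```

Writing each part as a URI is what turns a string into a *thing*: anyone reusing the same URI is talking about the same entity.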

NOTE: By the way, almost all of these concepts came with the new definition of semantics for the World Wide Web, but we will use them for knowledge-graphs in general.

I’m not going to give a full description of the framework here, but I’ll give you an example of how triples work. Remember that I’m doing this because this is the way we start building ontologies, linking data and the knowledge-graph.

Let’s take a look at an example to see what these triples are.

This is closely related to the example from Sebastien.

We will start with the string “geoffrey hinton”.

Here we have a simple string that represents the first edge: the thing I want to know more about.

Now, to start building a knowledge-graph, the system first recognizes that the string actually refers to the person Geoffrey Hinton, and then it recognizes the entities related to that person.

At this point we have some entities that are related to Geoffrey, but we don’t know what they are yet. (By the way, this is Geoffrey Hinton, in case you don’t know him.)

Then the system starts giving names to the relationships: now we have named relationships, where we know what type of connection we have for our main entity.

This system can go on for a while, finding connections of connections, and thus creating a huge graph representing the different relationships for our “search string”.

To do this the knowledge-graph uses the triples.

Like this: to have a triple we need a subject and an object, and a predicate linking the two.

So as you can see we have the subject <Geoffrey Hinton> related to the object <Researcher> by the predicate <is a>.

This may sound easy for us humans, but it needs a very comprehensive framework to do this with machines.
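The triple structure and the “connections of connections” expansion described above can be sketched in plain Python (the facts and relationship names here are illustrative, not pulled from any real knowledge base):

```python
# Minimal in-memory triple store plus a breadth-first expansion that
# follows connections of connections outward from a seed entity.
triples = {
    ("Geoffrey Hinton", "is a", "Researcher"),
    ("Geoffrey Hinton", "works on", "Deep Learning"),
    ("Geoffrey Hinton", "affiliated with", "University of Toronto"),
    ("Deep Learning", "is a", "Field of Machine Learning"),
}

def related(subject):
    """All (predicate, object) pairs whose subject matches."""
    return {(p, o) for (s, p, o) in triples if s == subject}

def expand(seed, depth=2):
    """Grow a subgraph by following objects as new subjects, `depth` hops out."""
    seen, frontier = set(), {seed}
    for _ in range(depth):
        new_frontier = set()
        for subj in frontier:
            for pred, obj in related(subj):
                if (subj, pred, obj) not in seen:
                    seen.add((subj, pred, obj))
                    new_frontier.add(obj)
        frontier = new_frontier
    return seen

subgraph = expand("Geoffrey Hinton", depth=2)
```

With depth 1 we only get the facts directly about the seed; at depth 2 the expansion also pulls in facts about the entities it discovered, which is exactly how the graph around a “search string” grows.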

This is the way the knowledge-graph gets formed and how we link data using ontologies and semantics.

So, what do we need to create a successful knowledge-graph? Partha Sarathi from Cambridge Semantics wrote a great blog about that. You can read it here:

Creating a Successful Enterprise Knowledge Graph. Ever since Google mainstreamed knowledge graphs in 2012 through a popular blog on enhanced web search, enterprises have… (blog.cambridgesemantics.com)

To sum up, he says we need people who envision it, diverse data, and a good product to build it. You can read more here:

Build Your Enterprise Knowledge Graph. Enterprise Knowledge Graphs are helping companies connect their complex data sources. With Anzo® you can design, build… (info.cambridgesemantics.com)

Google: Google is basically a huge knowledge graph (with more additions on top), and upon it they created maybe the biggest data fabric there is. Google has billions of facts, including information about and relationships between millions of objects, and allows us to search through their system to discover insights inside it.

Here you can learn more:

LinkedIn: My favorite social network, LinkedIn, has a huge knowledge graph built upon “entities” on LinkedIn, such as members, jobs, titles, skills, companies, geographical locations, schools, etc. These entities and the relationships among them form the ontology of the professional world.

These insights help leaders and salespeople make business decisions, and increase member engagement on LinkedIn. Remember that LinkedIn’s (and almost every) knowledge-graph needs to scale as new members register, new jobs are posted, and new companies, skills, and titles appear in member profiles and job descriptions.

You can read more here:

Building The LinkedIn Knowledge Graph. Authors: Qi He, Bee-Chung Chen, Deepak Agarwal. (engineering.linkedin.com)

Knowledge-graphs for financial institutions: In this article, Marty Loughlin shows an example of what the platform Anzo can do for a bank, and there you can see that this technology is not only related to search engines; it also works with disparate data. He shows several ways a knowledge-graph can help this kind of institution. Go take a look.

To create a knowledge-graph you need semantics and ontologies to find a useful way of linking your data that uniquely identifies and connects data with common business terms, thus building the underlying structure of the data fabric.

When we build a knowledge-graph we need to form triples to link data using ontologies and semantics.

Also, the construction of the knowledge-graph depends on basically three things: people who envision it, data diversity, and a good product to build it.

There are many examples of knowledge-graphs around us that we don’t even notice.

Most successful companies in the world are implementing and migrating their systems to build a data fabric and of course all the things inside of it.

  Bio: Favio Vazquez is a physicist and computer engineer working on Data Science and Computational Cosmology.

He has a passion for science, philosophy, programming, and music.

He is the creator of Ciencia y Datos, a Data Science publication in Spanish.

He loves new challenges, working with a good team and having interesting problems to solve.

He is part of the Apache Spark collaboration, helping with MLlib, Core, and the documentation.

He loves applying his knowledge and expertise in science, data analysis, visualization, and automatic learning to help the world become a better place.

Original.

Reposted with permission.
