Ontology and Data Science

I’ll be explicit about the difference between philosophical ontology and the ontology related to information and data in computer science.

Ontology (the philosophical part)

In simple words, one can say that ontology is the study of what there is. [...] Such theories usually propose axioms about the entities in question, spelled out in some formal language based on some system of formal logic.

And this will allow us to make a quantum jump to the next part of the article.

https://xkcd.com/1240/

Ontology (the information and computational part)

If we bring back the definition of formal ontology from above and then think of data and information, it’s possible to set up a framework to study data and its relation to other data. [...] Information represented in a particular formal ontology can be more easily accessible to automated information processing, and how best to do this is an active area of research in computer science fields like data science. [...] It is a framework to represent information, and as such it can be representationally successful whether or not the formal theory used in fact truly describes a domain of entities.

Now it’s a good moment to see how ontology can help us in the data science world.

If you remember, in my last article about semantic technologies, “Deep Learning for the Masses (… and The Semantic Layer)” (towardsdatascience.com), I talked about the concept of Linked Data. [...] I also discussed the concept of the knowledge graph, which consists of integrated collections of data and information that also contain huge numbers of links between different data.

Well, the missing concept in all those definitions was ontology. [...] With ontology one can enable such a description, but first we need to formally specify components such as individuals (instances of objects), classes, attributes and relations, as well as restrictions, rules and axioms.

Here’s a pictographic way of explaining the paragraph above:

https://www.ontotext.com/knowledgehub/fundamentals/what-are-ontologies/

Database modeling and ontologies

Currently, most of the technologies that employ data modeling languages (like SQL) are designed using a rigid “Build the Model, then Use the Model” mindset.

For example, suppose you want to change a property in a relational database. [...] This transition can also be thought of as going from traditional databases to graph databases + semantics.

According to this presentation by the company Cambridge Semantics, there are three reasons why graph databases are useful:

Graph database offerings are showing maturity in capability and diversity.
Graph is being used beyond classical graph problems.
Digital transformation of complex data requires graph.

So if you tie these benefits to a semantic layer built on ontologies, you can go from having your data like this:

[figure]

to this:

[figure]

where you have a human-readable representation of data that uniquely identifies and connects data with common business terms. [...]

In a piece on www.kdnuggets.com, I talked about what will happen (maybe) in the next few years for data science. For me, the key trends for 2019 are:

AutoX: We will see more companies developing and including in their stack technologies and libraries for automatic Machine and Deep Learning. The X here means that these auto-tools will be extended to data ingestion, data integration, data cleansing, exploration and deployment. Automation is here to stay.

Semantic technologies: One of the most interesting discoveries for me this year was the connection between DS and semantics. It’s not a new field in the data world, but I see more people taking an interest in semantics, ontologies, knowledge graphs and their connection to DS and ML.

Programming less: This is a hard thing to say, but with automation in almost every step of the DS process we will program less and less every day. We will have tools that create code, that understand what we want through NLP and then transform that into queries, sentences and full programs. I think programming is still a very important thing to learn, but it will be easier soon.

This is one of the reasons why I’m writing this article: I’m trying to follow what’s happening across the industry, and you should be aware of it. We will program less, and will use semantic technologies more in the near future. It’s closer to the way we think. I mean, do you think in relational databases? I’m not saying we think in graphs, but it’s much easier to pass information between our heads and a knowledge graph than to create weird database models.

Expect more from me about this interesting topic. And please, if you have any suggestions, let me know :)

If you have questions, just follow me on Twitter (Favio Vázquez, @FavioVaz, twitter.com) and LinkedIn (Favio Vázquez, Founder, Ciencia y Datos, www.linkedin.com).

See you there :)
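
For readers who want something concrete, the components named earlier (individuals, classes, attributes, relations and restrictions) can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions, not a real ontology language or library: the tuple-based triple store, the helper functions `classes_of` and `violations`, and every name in it (`Person`, `Employee`, `Company`, `works_for`, `alice`, `acme`) are hypothetical and made up for the example.

```python
# Toy illustration only: a hand-rolled triple store, not OWL/RDF tooling.
# All names below (Person, Employee, Company, works_for, alice, acme)
# are hypothetical and exist only for this sketch.

# Each fact is a (subject, predicate, object) triple.
triples = {
    # A class hierarchy axiom
    ("Employee", "subclass_of", "Person"),
    # Individuals (instances of classes)
    ("alice", "instance_of", "Employee"),
    ("acme", "instance_of", "Company"),
    # An attribute of an individual
    ("alice", "name", "Alice"),
    # A relation between two individuals
    ("alice", "works_for", "acme"),
}

# A restriction: works_for must link a Person to a Company.
restrictions = {"works_for": ("Person", "Company")}


def classes_of(individual):
    """Return the declared classes of an individual plus all superclasses."""
    found = {o for s, p, o in triples if s == individual and p == "instance_of"}
    frontier = set(found)
    while frontier:
        cls = frontier.pop()
        parents = {o for s, p, o in triples if s == cls and p == "subclass_of"}
        frontier |= parents - found  # only visit classes not seen yet
        found |= parents
    return found


def violations():
    """List relation triples whose endpoints break a domain/range restriction."""
    bad = []
    for s, p, o in triples:
        if p in restrictions:
            domain, range_ = restrictions[p]
            if domain not in classes_of(s) or range_ not in classes_of(o):
                bad.append((s, p, o))
    return bad


print(classes_of("alice"))  # declared class plus inherited superclass
print(violations())         # relation triples that break a restriction
```

In this sketch, adding a new kind of relation means adding one more triple rather than altering a table definition, which is the flexibility the database modeling section points to. A real project would more likely use an established ontology language and tooling than a hand-written set of tuples.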
