For this, we need to select the sample corresponding to Drug 2 = No

[Table: No Drug 2 table]

and calculate the probability of recovering:

P(Recovering | Drug 2 = No) = 50 / (50 + 5) ≈ 91%

Which is a totally different result! What has happened?!

We can model the data generation process using a causal graph.

[Figure: Data generation graph]

This graph reflects our observational data. The data generated by this process is what we see in the Drugstore selling table above.

However, our question about the absence of Drug 2 refers to a different data generation process. If we could sample data from the graph of that process, we would obtain the Interventional data table

[Table: Interventional data table]

and we could directly calculate the probability under the new distribution P_intervened:

P_intervened(Recovering | Drug 2 = No) = 50 / (50 + 50) = 50%

which is the expected result.

Of course, in many situations, actually carrying out such an intervention would have large unwanted consequences. One of the main questions of causal inference is whether we can derive the data of the interventional graph from gathered observational data alone, without having to go out into the real world and run an experiment. In other words, can we get the probability of Recovery in the interventional graph with historical data only?
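
To make the gap between the two numbers concrete, here is a minimal Python sketch that recomputes both probabilities from the counts quoted in this section. The helper name `recovery_probability` is hypothetical, and reading the denominators as "recovered + not recovered" among the Drug 2 = No rows is an assumption based on the formulas above, not on the full tables.

```python
def recovery_probability(recovered: int, not_recovered: int) -> float:
    """Estimate P(Recovering | condition) as a simple relative frequency."""
    total = recovered + not_recovered
    if total == 0:
        raise ValueError("no samples for this condition")
    return recovered / total


# Counts for Drug 2 = No, as quoted in the text (assumed split:
# recovered vs. not recovered).
observational = recovery_probability(recovered=50, not_recovered=5)
interventional = recovery_probability(recovered=50, not_recovered=50)

print(f"Observational  P(Recovering | Drug 2 = No): {observational:.0%}")   # 91%
print(f"Interventional P(Recovering | Drug 2 = No): {interventional:.0%}")  # 50%
```

The calculation is identical in both cases. Only the data it is applied to differs, which is the point: conditioning on observational data and sampling from the interventional process answer different questions.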