Simple Yet Practical Analysis of ClinicalTrials.gov with a pinch of machine learning
Glib Radchenko, Jan 17
Working in clinical research has taught me a few important things, and all of them can be expressed in a single quote: “Things done well and with a care, exempt themselves from fear.”
This is especially true for clinical trials, in which researchers take responsibility for the health and well-being of patients.
The entire process of developing a drug is extremely complex and can take decades.
A new drug is first studied in the laboratory, and if the drug looks promising, it is carefully studied in people during the course of clinical trials.
Clinical trials are conducted to find better ways to prevent, diagnose and treat disease.
So essentially, the goal of clinical trials is to determine whether a drug is both safe and effective.
Today, most clinical trials are subject to a thorough registration procedure.
The FDA (Food and Drug Administration) requires that all drugs and devices undergoing clinical trials in humans be registered on ClinicalTrials.gov, a web-based resource that provides the public with easy access to information on publicly and privately supported clinical studies.
While it is undoubtedly a great tool for helping patients find studies that they may be able to participate in, it also provides researchers and health care professionals with up-to-date information on new drug developments.
This is where we come to another thing that I’ve learned from my journey into data science: there is nothing more satisfying than working with easily accessible and structured data.
A single zip file containing all study records in XML format is available for download.
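As a rough illustration of how such an archive could be read, here is a minimal Python sketch using only the standard library. The tag names (`id_info/nct_id`, `start_date`, `sponsors/lead_sponsor/agency`, `location_countries/country`, `condition`) are my assumptions about the record layout, and the demo archive holds a single made-up record rather than real data, so check both against the actual download.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def iter_studies(zip_source):
    """Yield one dict per XML study record found in the archive."""
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.endswith(".xml"):
                continue
            root = ET.fromstring(archive.read(name))
            yield {
                # Tag names below are assumptions about the record schema.
                "nct_id": root.findtext("id_info/nct_id"),
                "start_date": root.findtext("start_date"),
                "sponsor": root.findtext("sponsors/lead_sponsor/agency"),
                "countries": [c.text for c in root.findall("location_countries/country")],
                "conditions": [c.text for c in root.findall("condition")],
            }


def make_demo_archive():
    """Build a tiny in-memory zip with one made-up record, for illustration only."""
    record = """<clinical_study>
      <id_info><nct_id>NCT00000000</nct_id></id_info>
      <sponsors><lead_sponsor><agency>Example Pharma</agency></lead_sponsor></sponsors>
      <start_date>March 2016</start_date>
      <condition>Lung Cancer</condition>
      <location_countries><country>Denmark</country></location_countries>
    </clinical_study>"""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("demo/NCT00000000.xml", record)
    buffer.seek(0)
    return buffer


studies = list(iter_studies(make_demo_archive()))
print(studies[0]["nct_id"], studies[0]["countries"])
```

To run it on the real data, pass the path of the downloaded zip file to `iter_studies` instead of the demo archive.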
Let’s see what kind of insights we can glean from this data.
First, it would be nice to show the number of clinical studies conducted in all parts of the world throughout the years.
Click here to check the interactive map created with plotly.
Since ClinicalTrials.gov provides information about US-based clinical trials, it is no wonder that most studies come from the U.S.
Apart from this, the U.S. pharmaceutical market is the world’s biggest national market.
Now let’s see which nations participate in clinical trials most actively.
To figure this out, I will take the number of studies started in 2016 for each country and divide it by the country’s population.
You can check the map here.
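The per-capita calculation could look roughly like the sketch below. The record layout (`start_date`, `countries`), the assumption that the start date ends with a four-digit year, the scaling to trials per million inhabitants, and the country names and figures are all hypothetical choices of mine for illustration.

```python
from collections import Counter


def trials_per_million(studies, population, year):
    """Count studies started in `year` per country, scaled per million inhabitants."""
    counts = Counter()
    for study in studies:
        # start_date is assumed to end with a four-digit year, e.g. "March 2016".
        if study["start_date"] and study["start_date"].endswith(str(year)):
            # A multi-country study counts once for each country it lists.
            counts.update(set(study["countries"]))
    return {
        country: counts[country] * 1_000_000 / population[country]
        for country in counts
        if country in population
    }


# Made-up records and populations, for illustration only.
demo_studies = [
    {"start_date": "March 2016", "countries": ["Aland", "Borduria"]},
    {"start_date": "June 2016", "countries": ["Aland"]},
    {"start_date": "May 2015", "countries": ["Borduria"]},
]
demo_population = {"Aland": 1_000_000, "Borduria": 4_000_000}

rates = trials_per_million(demo_studies, demo_population, 2016)
print(rates)
```

Dividing by population is what lets a small country outrank a large one even when its absolute number of trials is much lower.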
Unexpectedly, the U.S. loses its leadership here.
Denmark seems to have more trials per person than any other country.
Such a large number of studies is probably the result of multiple initiatives by the Danish government to establish Denmark as a preferred location for clinical trials.
Another example is a map showing the number of lung cancer trials.
Despite the U.S. being the leader, we can see that China is one of the main places for cancer research.
The major cause for this might be that China is the world’s largest consumer and producer of tobacco.
Here comes Machine Learning
While working in clinical research I was particularly interested in how pharmaceutical companies determine the direction of research.
Most of the companies try to focus on a specific niche and develop products targeted at a specific group of diseases.
Using principal component analysis (PCA), it is possible to visualize the research endeavors of the top 50 companies.
Each company is represented by a vector where each value is the number of studies of a particular disease performed by the company.
I also normalized the values and used PCA to be able to visualize the data.
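The pipeline described above (count vectors per company, normalization, PCA) might be sketched as follows with NumPy. This is not necessarily how the original visualization was produced: the choice to normalize each row to proportions, the SVD-based PCA, and the company and disease names are assumptions made for the example.

```python
import numpy as np


def company_matrix(records, companies, diseases):
    """Rows are companies, columns are diseases, cells are study counts."""
    matrix = np.zeros((len(companies), len(diseases)))
    for sponsor, condition in records:
        if sponsor in companies and condition in diseases:
            matrix[companies.index(sponsor), diseases.index(condition)] += 1
    return matrix


def pca_2d(matrix):
    """Project rows onto their first two principal components."""
    # Normalize each row to proportions so large and small companies are
    # comparable (one possible normalization, chosen here as an assumption).
    row_sums = matrix.sum(axis=1, keepdims=True)
    shares = matrix / np.where(row_sums == 0, 1, row_sums)
    centered = shares - shares.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Made-up sponsors and (sponsor, condition) pairs, for illustration only.
companies = ["VaxCo A", "VaxCo B", "CardioCo A", "CardioCo B"]
diseases = ["Influenza", "Heart Failure", "Asthma"]
records = (
    [("VaxCo A", "Influenza")] * 3 + [("VaxCo A", "Asthma")]
    + [("VaxCo B", "Influenza")] * 2
    + [("CardioCo A", "Heart Failure")] * 3 + [("CardioCo A", "Asthma")]
    + [("CardioCo B", "Heart Failure")] * 2
)

coords = pca_2d(company_matrix(records, companies, diseases))
for name, (x, y) in zip(companies, coords):
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

Companies with similar disease profiles end up close together in the projected plane, which is what makes the clusters visible in a scatter plot.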
I definitely encourage you to follow this link to check the visualization (WARNING: it might take some time to open).
This way we can see which companies share the same interests in clinical research.
Here you can see that some companies form separate clusters.
For example, the yellow dots (Sanofi Pasteur, MedImmune LLC and Novartis Vaccines) are the companies that work predominantly on vaccines.
The violet ones at the bottom (Boston Scientific Corporation, St. Jude Medical and Medtronic Cardiac Rhythm and Heart Failure) are famous for developing products for cardiology.
Unsurprisingly, Genentech and Hoffmann-La Roche are also pretty close together.
I believe that this kind of analysis is a great way to get a bigger picture of the world of clinical trials.
And undoubtedly, the potential of machine learning in clinical research is difficult to overestimate.
Please share your ideas on how machine learning can be applied to clinical research in the comments below.