Using predictive modelling to identify those most likely to benefit from a given disease-modifying therapy would not only improve outcomes at an individual level, but also enable predictive enrichment to increase the power of clinical trials. Jean-Pierre Falet, MD, CM, McGill University, Montreal, Canada, discusses work using deep learning to estimate treatment effect for patients with progressive multiple sclerosis (MS). Baseline clinical and imaging data from 700 participants in the ORATORIO (NCT01194570) and OLYMPUS (NCT00087529) trials of anti-CD20 therapy were used to train a causal inference network to predict the rate of disability progression on active treatment and placebo, and thus predict treatment response. In a training dataset, the model was able to accurately separate responders and non-responders. The model was also validated using data from the ARPEGGIO trial (NCT02284568), which assessed the efficacy of laquinimod, an agent with a different mechanism of action to that of the anti-CD20 antibodies on which the model was trained. The model was able to identify responders to laquinimod. Dr Falet comments on the potential impact of such technology in future clinical trials and ongoing work to optimize the model. This interview took place at the American Academy of Neurology 2022 Congress in Seattle, WA.
Transcript (edited for clarity)
In order to train this model… Well, first we have to design it. So we had to use what’s called the causal inference framework. This is a framework built through the language of causal inference, which is really key to answering these types of questions. And these are not traditional machine learning problems in the sense that we don’t have an outcome that we can train a network to predict. We don’t know if a person will respond to a treatment and we can never know that because in the real-world setting, we only ever give a treatment or a control or another treatment to a patient. So, we have to use a very specific framework to do this. And in order to do this, we’ve built a neural network that has a specific architecture that’s able to tackle this challenge.
The dataset that we then use to train this network is built on two clinical trials, ORATORIO and OLYMPUS, which studied ocrelizumab and rituximab against placebo. These are randomized controlled trial data that we were able to pool together because both trials studied anti-CD20 antibodies with similar mechanisms of action. And this dataset is around a thousand patients. We took part of it to train the model and we left out part of it to test how well it can generalize to unseen patients. The data that we have available about the participants in these trials are the baseline characteristics, so what they have before the treatment is initiated. These are common metrics that are recorded as part of clinical trials, such as age, sex, height, weight, disability scores, time from symptom onset, and a variety of MRI metrics that are extracted from automatic segmentation, such as T2 lesion volume, gadolinium-enhancing lesion counts and normalized brain volume.
And the network is trained to predict how fast the disability will progress on the active arm, so the anti-CD20 antibody, or on the control. Through our causal inference framework, we then predict for each individual, given their characteristics, how well they will respond to the anti-CD20 antibody. And so this is how the network is trained. When we validate the model, we again use causal inference metrics to see how well the network is able to separate people who are more responsive from people who are less responsive. And we see that the network is able to rank responders quite well in terms of how well they respond. So we can start with a hazard ratio, for instance, in the range of 0.7 to 0.8 when we look at the test samples, and we can go down to a hazard ratio of 0.4 to 0.5, and also increase the significance on the [inaudible] test, by finding a subgroup of responders that is much more responsive to the anti-CD20 antibodies than the group as a whole.
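As a rough illustration of the idea described above (not the network used in the study), the sketch below fits one outcome model per arm on synthetic data and takes the difference between the two predictions as each person's estimated treatment effect. The single feature `x`, the simulated progression rates, and every function name are assumptions made for the example, and a least-squares line stands in for the neural network.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def simulate(rng, n):
    """Synthetic trial: (x, treated, progression_rate) per participant.
    Assumption: treatment slows progression more when x is high."""
    rows = []
    for _ in range(n):
        x = rng.random()
        treated = rng.random() < 0.5          # 1:1 randomization
        benefit = 0.8 * x if treated else 0.0
        rows.append((x, treated, 1.0 - benefit + rng.gauss(0, 0.1)))
    return rows

def arm(rows, treated):
    """Features and outcomes for one arm of the trial."""
    xs = [x for x, t, _ in rows if t == treated]
    ys = [y for _, t, y in rows if t == treated]
    return xs, ys

rng = random.Random(0)
train, test = simulate(rng, 400), simulate(rng, 400)

# One outcome model per arm, fitted on the training split only.
treated_model = fit_line(*arm(train, True))
control_model = fit_line(*arm(train, False))

def estimated_effect(x):
    """Predicted rate on treatment minus predicted rate on control.
    More negative means a larger predicted benefit."""
    (a1, b1), (a0, b0) = treated_model, control_model
    return (a1 + b1 * x) - (a0 + b0 * x)

def observed_difference(rows):
    """Mean rate in the treated arm minus mean rate in the control arm."""
    _, y1 = arm(rows, True)
    _, y0 = arm(rows, False)
    return sum(y1) / len(y1) - sum(y0) / len(y0)

# Rank held-out participants by predicted benefit, then compare the
# observed arm difference in the top quarter with the whole test set.
ranked = sorted(test, key=lambda row: estimated_effect(row[0]))
top_quarter = ranked[: len(ranked) // 4]
diff_all = observed_difference(test)
diff_top = observed_difference(top_quarter)
```

Because no individual is ever observed on both arms, the ranking can only be checked at group level. Here that check is the arm difference inside the top-ranked subgroup (`diff_top`) against the whole held-out set (`diff_all`), which loosely mirrors the comparison of hazard ratios described above.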
So this is what we’ve done. We’ve also shown that this network, which can predict responders to anti-CD20 antibodies, can predict responders to another class of medication. We did this on the ARPEGGIO trial, which tested laquinimod against placebo. And laquinimod has a very different mechanism of action from anti-CD20 antibodies. We did this to see if the predictors that are learned on the anti-CD20 antibody dataset also generalize to another class of medication. Because in the real world, where we’d like to deploy this, we’d like to see if a network that’s trained on existing data can generalize to novel drugs that are to be tested. And in fact, it was able to accurately rank responders in the laquinimod dataset as well. And this suggests that there might be common predictors, irrespective of drug mechanisms.
So finally, the other metric that we’ve computed comes from a simulation study where we simulated a very short clinical trial, a one-year-long clinical trial of anti-CD20 antibodies in the future, to see how well using predictive enrichment could reduce the sample size that would be needed to detect a significant difference. And we’ve shown that with our model, if we were to select the most responsive individuals above a certain cutoff, we could reduce by about six times the number of patients that need to be randomized in order to detect an effective medication, which proves the concept that we’re trying to show here.
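To give a feel for why enrichment shrinks a trial, the sketch below uses the standard Schoenfeld approximation for the number of events a two-arm, 1:1 time-to-event trial needs. This is not the simulation described above. The hazard ratios 0.75 and 0.45 are illustrative midpoints of the ranges quoted earlier in the interview, and the significance level, power and function name are assumptions.

```python
import math
from statistics import NormalDist

def required_events(hazard_ratio, alpha=0.05, power=0.8):
    """Schoenfeld approximation for a two-arm trial with 1:1 allocation
    and a two-sided log-rank test: events needed to detect hazard_ratio."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)

# Illustrative values only: midpoints of the hazard ratio ranges quoted
# in the interview for the test samples and for the enriched subgroup.
events_unselected = required_events(0.75)
events_enriched = required_events(0.45)
reduction_factor = events_unselected / events_enriched
```

The required number of events scales with the inverse square of the log hazard ratio, so a stronger effect in the selected subgroup reduces the requirement sharply. The formula counts events rather than randomized patients and ignores the cost of screening out non-selected individuals, so its output is not expected to match the roughly sixfold figure from the simulation.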
There are different aspects to this. The first thing that can make this work move forward is to have more data about different drug classes. So here we’ve shown generalization from anti-CD20 antibodies to laquinimod. But to really understand how the predictors from one class generalize to another class, we need many different classes of medication, including medications that are thought to be more immunomodulatory and medications that target neurodegenerative mechanisms, to see if the predictors learned on the immunomodulatory class could generalize to the neurodegenerative targets.
And this is ongoing work to collect data that can answer this question. The other aspect that we’re working on is trying to increase the dataset size. We had a large dataset by the standards of progressive MS, over a thousand patients, but in the world of deep learning, where these are very complex models with a lot of parameters, you need a lot of data to learn these models. And in an attempt to increase the data, we make use of the relapsing-remitting data, which is much more widely available. There are many more clinical trials that were run in relapsing-remitting MS than in the progressive MS subtypes.
We can actually make use of that data through a process that’s called transfer learning, which means that we can learn the parameters initially on the larger dataset composed of the relapsing-remitting MS patients, and hope that these parameters, these predictors, will transfer to the progressive MS task, whose data is then used for fine-tuning. So we learn a more robust model on a larger dataset in a related task, and then transfer what we’ve learned to a harder task with less data. And we hope that this will improve the robustness and the performance of our neural network. And that’s something that we’re working on right now.
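The pretrain-then-fine-tune pattern can be sketched on a toy problem. The example below is only an analogy: a two-parameter linear model replaces the neural network, the "source" and "target" tasks are synthetic, and every name, learning rate and step count is an assumption chosen for illustration.

```python
import random

def train(data, w, b, steps, lr):
    """Full-batch gradient descent on mean squared error for y = w * x + b."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def make_data(rng, n, w_true, b_true, noise=0.1):
    """Synthetic (x, y) pairs from a noisy line."""
    data = []
    for _ in range(n):
        x = rng.random()
        data.append((x, w_true * x + b_true + rng.gauss(0, noise)))
    return data

def mse(data, w, b):
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

rng = random.Random(0)
source = make_data(rng, 500, 2.0, 0.5)        # large, related task
target = make_data(rng, 8, 2.2, 0.4)          # small task of interest
target_test = make_data(rng, 200, 2.2, 0.4)   # held-out target data

# Pretrain on the large source task, then fine-tune briefly on the target.
w_pre, b_pre = train(source, 0.0, 0.0, steps=300, lr=0.5)
w_ft, b_ft = train(target, w_pre, b_pre, steps=20, lr=0.1)

# Baseline: the same short training budget on the target, from scratch.
w_scratch, b_scratch = train(target, 0.0, 0.0, steps=20, lr=0.1)

error_fine_tuned = mse(target_test, w_ft, b_ft)
error_scratch = mse(target_test, w_scratch, b_scratch)
```

With the same small budget on the target task, the warm-started model begins close to a good solution, while the model trained from scratch does not. Whether that advantage carries over from relapsing-remitting to progressive MS data is the open question raised in the interview.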
The other aspect that we’re working on is to use the images themselves, the MRI sequences, to learn from. So in this initial study, we’ve focused on readily available MRI metrics that are extracted from automatic segmentation: T2 lesion volume, gadolinium-enhancing lesion count and normalized brain volume. But these are human-defined in some ways, and are not necessarily the most predictive features in the images. There might be more subtle predictors, more complex interactions between different parts of the brain MRI, that might be more predictive of treatment response.
So using these sequences, the voxel-level data itself, as an input to the network is something that we hope will increase performance. And to do this, we extend our neural network to include a convolutional neural network at its input, which processes the images in a data-driven fashion to learn predictors of response automatically, in what we call an end-to-end approach. We’ve recently shown that this improves performance on a related task, treatment response in relapsing-remitting MS. And we’re trying to extend this to the progressive MS task that I’m going to talk about at the AAN.
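For readers unfamiliar with what a convolutional input stage does, the sketch below applies a single small kernel to a toy 2-D grid and pools the result into one number. It is a hand-written illustration, not the model described above. The 5x5 "slice", the kernel values and the function names are all assumptions, and a real network would learn many kernels from 3-D voxel data rather than use a fixed one.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the operation a convolutional layer applies."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(feature_map):
    """Element-wise rectification: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]

def global_max_pool(feature_map):
    """Collapse a feature map to its single strongest response."""
    return max(max(row) for row in feature_map)

# Toy 5x5 "slice" with a bright 2x2 patch standing in for a lesion.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
# Hand-picked kernel (assumption); a trained network learns its kernels.
blob_kernel = [[1, 1], [1, 1]]

feature_map = relu(conv2d(image, blob_kernel))
feature = global_max_pool(feature_map)
```

In an end-to-end setup, pooled features like `feature` would be passed on to the outcome-prediction layers, and the kernels would be adjusted during training so that the extracted features are the ones most useful for predicting response.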