By LUKE OAKDEN-RAYNER
In surprising news this week, CMS (the Centers for Medicare & Medicaid Services) in the USA approved the first reimbursement for AI-augmented medical care. Viz.ai have a deep learning model which identifies signs of stroke on brain CT and automatically contacts the neurointerventionalist, bypassing the first read normally performed by a general radiologist.
From their press material:
Viz.ai demonstrated to CMS a significant reduction in time to treatment and improved clinical outcomes in patients suffering a stroke. Viz LVO has been granted a New Technology Add on Payment of up to $1,040 per use in patients with suspected strokes.
This is enormous news, and marks the start of a totally new era in medical AI.
Especially that price tag!
Doing it tough
It is widely known in the medical AI community that this has been a troubled marketplace for developers. The majority of companies have developed putatively useful AI models, but have been unable to sell them to anyone. This has led to many predictions that we are going to see a crash amongst medical AI startups, as capital runs out and revenue can’t take over. There have even been suggestions that a medical “AI winter” might be coming.
To be clear, this was never a problem with the technology. Deep learning works, and there are lots of ways it can be applied usefully in medicine. It was an alignment problem: the people who procure medical technology (typically CIOs) are motivated by business needs, not by how useful a model is.
The strongest business incentive is money, earning more or spending less, and proving that AI models can help here has been really difficult.
Most researchers and developers have focused on medical outcomes, like diagnostic performance or lives saved. But even if a model saves lives, it might not impress a CIO because healthcare providers have no inbuilt incentive to help people. Gross, I know, but medicine is full of perverse incentives. Nations and employers care about health and wellbeing (because it improves productivity and is generally popular among constituents), but hospitals (both public and private) care about something else.
They care about reimbursement.
Money, that’s what I want
Reimbursement is how medicine incentivises actually helping people. A central payer, whether a government or an insurance company, decides what medical management is cost-effective to improve health.
When a test or treatment is reimbursed, then healthcare providers get paid to use it. All of a sudden, CIOs are really excited. Pay some money to a company, get as much or more money back for using the product.
Does it work?
Well, I’ve spoken about mammography CAD before, an old form of AI intended to assist in detecting breast cancer. This became popular in the 00s, when CMS decided to reimburse CAD-aided mammography tests. A provider would get about $10 more if they used CAD than if they did “standard” reading.
Within a decade almost every screening mammogram in America was read with CAD assistance.
But, you say, maybe they just used it because it was amazing?
Nope. It didn’t work.
In fact, nobody else uses it. I’ve never found the exact numbers, but CAD use outside the USA is practically non-existent. Why? Cos it doesn’t work, and you don’t get paid for it.
Just think about that. Medicare has spent hundreds of millions (billions?) on a technology which didn’t work, driving widespread use. Financial incentives are powerful and dangerous things*.
Time is brain
So, financial incentives are the big deal. Life or death for new technologies. So far, modern medical AI (by which I mostly mean deep learning) has received dozens of FDA clearances, but there has been almost no financial incentive to use these products.
So what is ContaCT, and how did Viz.ai get CMS to reimburse its use?
Viz.ai received FDA clearance in early 2018 for a deep learning system that can detect blockages in the large blood vessels that supply the brain, on CT scans. This system was an interesting break from the dozens of pure diagnostic systems that startups were producing at the time, in that it was intended purely for triage and fast response. If it saw a blockage, it directly contacted the specialist who could fix the problem, skipping the radiologist who would normally read the image first.
Viz.ai claim that by reducing the time for a specialist to review the CT scan of possible blockages, they prevent long delays during which time more and more brain cells are dying from a lack of blood. They have published a few papers on the topic (here and here) and had to provide a fair bit more to CMS to justify this claim.
The CMS document that describes the decision to reimburse ContaCT is 40 pages long, but is well worth a read if this whole topic is of interest. There is a lot in there, with plenty of back and forth between CMS and Viz.ai across many topics (including several that I have seen raised on Twitter). I’ve uploaded the document here (extracted from a longer 2000+ page document on other CMS decisions).
CMS requires that applicants prove the technology produces “substantial clinical improvement”. So what did Viz.ai provide?
They show several things:
- faster time to notification of the clot-busting specialist
- faster time to transfer from peripheral hospital to a central hospital where the relevant procedure can be performed
- faster time to clot-busting procedure
These things alone are interesting, but rely purely on our existing knowledge that delays lead to worse brain injuries (as the saying goes, “in strokes, time is brain”). But Viz.ai didn’t stop there. They actually did the thing I always harp on about. They showed outcomes.
- Improved modified Rankin score (mRS) at discharge
- Improved NIH Stroke Score (NIHSS) at day 5
- Improved mRS at day 90
These outcomes show that these patients did better than patients without ContaCT. These scores are widely used in stroke trials and summarise degree of damage/disability following a stroke**. So that is awesome, finally we have evidence of a clinical improvement for a radiology AI system!
Not everyone was impressed with the evidence provided to CMS when this story hit the webs.
Hugh and Ahmed raise two main points.
- that the time-saving comes from cutting out the radiologist and getting the neuro-interventionalist to review the CT scan directly
- that the sample size is pretty small
I’m going to take off my skeptical hat and disagree with both of them!
Several people argued that this isn’t actually an AI intervention at all (or even a technological one), and that all they are doing is changing the care pathway. I find this claim dubious – it relies on the idea that stroke management would be better if a neuro-interventionalist (abbreviated to INR from here on) read every CT angiogram performed for a possible stroke.
There is a problem with this. INRs are rare. These are subspecialists. In my state in Australia, population 1.7 million, we have four of them. Across the whole of the US, there were previously estimated to be 200-400 INRs, although those figures are quite old.
In the US there are over 1,000,000 stroke admissions per year (~850,000 from Medicare alone in 2010). There is no way these busy INRs can review all those scans.
This is where the AI comes in. If the AI picks up a possible blockage, the INR is contacted. According to Viz.ai, their ContaCT system detects ~90% of blockages, and will exclude around 90% of the patients who don’t have a blockage. So instead of reviewing a million scans a year, the INRs only need to review something on the order of 100,000 (the exact figure depends on how many of those scans truly show a blockage). Much more achievable with the limited workforce.
So, yes, the innovation here is that the INR sees the scan before the radiologist, but it only works because the AI system cuts out the majority of the scans.
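As a rough sketch of the arithmetic behind that workload estimate: the number of scans flagged for INR review depends on the sensitivity and specificity quoted above, and also on the fraction of scans that truly contain a blockage. The prevalence values below are illustrative assumptions, not figures from Viz.ai or CMS, and the function name is made up for this example.

```python
def flagged_scans(total_scans, prevalence, sensitivity=0.9, specificity=0.9):
    """Expected number of scans the AI would send on to a specialist."""
    with_blockage = total_scans * prevalence
    without_blockage = total_scans - with_blockage
    true_positives = with_blockage * sensitivity             # real blockages caught
    false_positives = without_blockage * (1 - specificity)   # false alarms
    return true_positives + false_positives


if __name__ == "__main__":
    # Prevalence values are assumptions chosen only to show the sensitivity
    # of the estimate; they are not taken from the CMS document.
    for prevalence in (0.0, 0.05, 0.10):
        n = flagged_scans(1_000_000, prevalence)
        print(f"prevalence {prevalence:.0%}: about {n:,.0f} scans flagged")
```

When true blockages are rare the total sits near 100,000, and at an assumed 10% prevalence it is closer to 180,000. Either way, the large majority of scans never reach the INR.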
Then we come to the complaint about sample size. I’m normally all about criticising studies for small sample sizes, and it is true their clinical outcomes results (the mRS and NIHSS results) were in 43 patients. But they did provide a lot more data on the other outcomes. Across 3 additional sites, they show that another 80 or so patients had a statistically significantly shorter time to puncture than controls. They also show that their entire database of real-world cases where ContaCT was used, almost 5,000 patients, achieved the same time-to-notification as they had in the original study.
In combination, these results are all reassuring. It is also worth noting that they are currently carrying out a large multi-centre study and we will see a larger sample size for the outcomes results in the near future. Sure, I would prefer to see that before reimbursement, but I’m not shocked that the decision was made this way.
Seriously though. $1000?
The announcement that Medicare would reimburse providers up to $1000 per use of the AI model was by far the most controversial part, and for good reason. AI models cost pretty much nothing to run. The CT scan itself, which determines if a patient can be treated, is about $1000. Does CMS think that this AI is as useful as CT scanning in stroke?
Well, no. Of course not.
This whole thing is a bit weird, but essentially CMS have tried to work with the business model of Viz.ai, which is unlike any other medical technology business model. Viz.ai charge a yearly subscription to deploy and maintain their AI system.
I don’t know the actual pricing for ContaCT, but the document repeatedly refers to a cost of $25,000 per annum. In this example, they say that the reimbursement cost is designed to cover the subscription. If there are 25 patients in the year, then they reimburse $1000 per patient. If there are 500 patients (much more likely), they reimburse $50 per patient.
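The cost-covering logic described here amounts to dividing the subscription by the number of patients. A minimal sketch, using the $25,000 figure the document refers to. The function name is hypothetical, and the real CMS calculation may well involve more than this.

```python
def cost_covering_payment(annual_subscription, patients_per_year):
    """Per-patient payment that would exactly cover a flat annual subscription."""
    if patients_per_year <= 0:
        raise ValueError("patients_per_year must be positive")
    return annual_subscription / patients_per_year


if __name__ == "__main__":
    # The two volumes used in the text: a very low-volume site and a more typical one.
    for patients in (25, 500):
        payment = cost_covering_payment(25_000, patients)
        print(f"{patients} patients/year -> ${payment:,.0f} per patient")
```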
Do note that the actual payment seems to be fixed for a year at a time. So it is absolutely true that in 2021 each use of ContaCT will be reimbursed $1000. At the end of the 2021 financial year CMS will look at how many claims there were and revise the payment (down, presumably). So it is possible that some high volume hospital will make out like bandits this year and be reimbursed a million dollars for a 25k subscription (i.e. if they use the AI system 1000 times). Not sure if there are safeguards against that.
The reason the press release is talking about $1000 is that this is the cap on reimbursement per patient. So if a hospital scans fewer than 25 patients per year, they cannot recoup all their costs and will be out of pocket.
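To make the two extremes concrete, here is a back-of-the-envelope sketch of a hospital's net position for the year. It assumes a flat $25,000 subscription and a fixed payment of $1000 per use (the rounded numbers used above). The function is hypothetical and ignores any safeguards CMS may apply.

```python
def hospital_net_position(uses_per_year, payment_per_use=1_000, annual_subscription=25_000):
    """Total reimbursement for the year minus the subscription cost."""
    return uses_per_year * payment_per_use - annual_subscription


if __name__ == "__main__":
    # Volumes are illustrative: below break-even, at break-even, and very high.
    for uses in (10, 25, 1_000):
        net = hospital_net_position(uses)
        print(f"{uses} uses: net ${net:,}")
```

Under these assumptions a hospital breaks even at 25 uses and is out of pocket below that, while at 1000 uses it collects $1,000,000 against a $25,000 bill.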
If anything, this approach is conservative. No matter how much the system is used, no matter how much value it generates, it only costs $25,000 per year. This is not the runaway profit that many imagined for medical AI (although broad coverage of hospitals would still be incredibly lucrative).
What this model does do, however, is produce guaranteed revenue, which is a huge step forward in this challenging space. Winter is averted, maybe.
This is a massive deal.
I hadn’t mentioned this yet, but honestly I didn’t see this coming and neither did many others I have spoken to. I thought we were probably years away from reimbursement for AI, and that it would probably start in mammography.
While the exact funding mechanism is a bit strange, startups now have a clear path to follow to generate revenue. If this doesn’t stabilise the market, I don’t know what could.
That isn’t to say it will be easy to follow in Viz.ai’s footsteps here. It still remains to be seen if this decision by CMS will translate into widespread adoption (and this may hinge to some extent on the results of their large trial). It is also true that Viz hit on a formula here which is unusual. This new pathway works because there is no obvious risk – if the model misses a stroke, the patient still receives the current standard of care, which is review by a radiologist. Thus far I haven’t been able to come up with another use case that this would work in. Maybe if you can think of one, let me know in the comments or on Twitter.
But even if it is not a simple path to follow, at least it is a path. I am certainly re-evaluating my expectations as well. More reimbursements, for less restrictive tasks, may be just around the corner.
This is where it started, folks.
* For an interesting little discussion on how this happened, Joshua Fenton summarised the extensive lobbying effort led by Silicon Valley Congresswoman Anna Eshoo in an editorial (unfortunately paywalled) for JAMA, where he asked if we should stop spending 1 in every 10,000 dollars in US healthcare on a failed technology. Spoiler: we still do.
** The mRS and NIHSS scores aren’t perfect by any means, but are pretty broadly accepted as endpoints for this sort of study.
Luke Oakden-Rayner is a radiologist in South Australia, undertaking a Ph.D. in Medicine with the School of Public Health at the University of Adelaide. This post originally appeared on his blog here.