Hacker News

I agree that the level of risk/consequence is higher for radiology misses, but I wonder if radiologists are already missing things because of simplification for human feasibility. Things like LI-RADS and BI-RADS are so simple from a computer science perspective. I wouldn't even call them algorithms, just simple checkbox decision making.

This tendency to simplify is everywhere in radiology: when looking for a radial head fracture, we're taught to examine the cortex for discontinuities, look for an elbow joint effusion, evaluate the anterior humeral line, etc. But what if there's some feature (or combination of features) that is beyond human perception? Maybe the radioulnar joint space is a millimeter wider than it should be? Maybe the soft tissues are just a bit too dense near the elbow? Just how far does the fat pad have to be displaced to indicate an effusion? Probably the best "decision function" is a non-linear combination of all these findings. Oh, but we only have one minute to read the radiograph before moving on to the next one.
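For what it's worth, the "non-linear combination of findings" idea is easy to sketch as a toy model. Everything below is invented for illustration (feature names, weights, the interaction term); it's a sketch of the concept, not a validated classifier:

```python
# Toy sketch: combining several weak radiographic signs into one
# non-linear decision function. All features, weights, and the
# interaction term are hypothetical, chosen only to illustrate the idea.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical features for one radiograph:
# [joint_space_widening_mm, soft_tissue_density, fat_pad_displacement_mm]
x = np.array([1.1, 0.4, 2.3])

# Linear weights plus an interaction term (widening AND effusion together)
# make the decision boundary non-linear in the raw findings.
w = np.array([0.8, 1.5, 0.6])
b = -2.0
interaction = 0.9 * x[0] * x[2]

p_fracture = sigmoid(w @ x + b + interaction)  # a probability in (0, 1)
```

A real system would learn the weights (and far richer interactions) from labeled data; the point is only that the combined score can exceed threshold even when no single sign does.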

Unfortunately, as someone noted below, advances in medicine are glacially slow. I think change is only going to come in the form of lawsuits. Imagine a future where a patient and her lawyer can get a second opinion from an online model: "Why did you miss my client's proximal scaphoid fracture? We uploaded her radiographs and GPT-4 found it in 2 seconds." If and when these types of lawsuits occur, malpractice insurers are going to push for radiologists to use AI.

Regarding other tasks performed by radiologists: some radiologists do more than dictate studies, but they are generally the minority. The vast majority of radiologists read images for big money without ever meeting the patient or the provider who ordered the study. In the most extreme cases, radiologists read studies after the acute intervention has already been performed. This happens a lot in IR - we get called about a bleed, review the imaging, take the patient to angiography, and then get paged by diagnostic radiology in the middle of the case.

Orthopedists have already wised up to the disconnect between radiology reimbursement and the work involved in MR interpretation versus surgery. At least two groups, including the "best orthopedic hospital in the country," employ their own in-house radiologists so that they can capture part of the imaging revenue. If GPT-4 can offer summative reads without feature simplification, and prior to intervention, why not have the IR or the orthopedist sign off on the GPT-4 report?



1a. Since we know the sensitivity, specificity, and inter-rater reliability of LI-RADS and BI-RADS, we can easily determine how many cases we are missing. Your suggestion that we are potentially 'missing' cases with these two algorithms misunderstands the point of both: with LI-RADS we are primarily optimizing specificity, to avoid biopsy and establish a radiologic diagnosis of HCC. With BI-RADS it's a combination of both, and we have great sensitivity. We don't need to be diagnosing more incidentalomas.

1b. With respect to the simplicity of LI-RADS: if you are strictly following the major criteria only, it's absolutely simple. It was designed to assist the general radiologist so they do not have to hedge (LR-5 = cancer). If you are practicing in a tertiary care cancer center (i.e. one providing locoregional therapy and transplant, where accurate diagnosis matters), it is borderline negligent not to apply ancillary features (while optional, LR-4 triggers treatment, as you would know from your practice). Ancillary features, and accurate lesion segmentation across multiple sequences that are not accurately linked on the Z-axis, remain an unsolved problem and are non-trivial to solve and integrate from a CS perspective. (I too have a CS background; while my interest is in language models, my colleagues working on multi-sequence segmentation have had less than impressive results even using the latest diffusion-model techniques, although better than U-Net; refer to Junde Wu et al. from Baidu for their results.) As you know, in medicine it is irrefutable that increased/early diagnosis does not necessarily lead to improved patient outcomes. Several biases result from this, and in fact we have routinely demonstrated that overdiagnosis harms patients and that early diagnosis does not benefit overall survival or mortality.

2a. Again, a fundamental misunderstanding of how radiology and AI work, and in fact the reason the two clinical decision algorithms you mentioned were developed. First off, we generally have an overdiagnosis problem rather than an underdiagnosis one. You bring up a specifically challenging radiographic diagnosis (scaphoid fracture); if there is clinical suspicion for scaphoid injury, it would be negligent not to pursue advanced imaging. Furthermore, even assuming your hypothetical GPT-4 or any ViLM has enough sensitivity (in reality they don't; see Stanford AIMI's and Microsoft's separate work on chest x-rays for more detail), you are ignoring specificity. Overdiagnosis HARMS patients.
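The overdiagnosis point falls straight out of Bayes' rule: at low prevalence, even a reasonably specific reader flags mostly false positives. A quick sketch with made-up numbers (not drawn from any cited study):

```python
# Positive predictive value from sensitivity, specificity, and prevalence
# (Bayes' rule). The 95%/90%/1% figures below are illustrative only.
def ppv(sens, spec, prev):
    tp = sens * prev              # true-positive rate in the population
    fp = (1 - spec) * (1 - prev)  # false-positive rate in the population
    return tp / (tp + fp)

# A 95%-sensitive, 90%-specific reader at 1% prevalence:
rare = round(ppv(0.95, 0.90, 0.01), 3)   # about 0.088, i.e. ~11 false
                                         # positives for every true one
# The same reader at 50% prevalence is far more useful:
common = round(ppv(0.95, 0.90, 0.50), 3)
```

This is why specificity, not sensitivity, dominates the patient-level harm calculus for screening-style AI reads.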

2b. Sensitivity and specificity are always a tradeoff, by strict definition. For your second example, radial head fracture: every radiologist should be looking at the soft tissues; it takes 5 seconds to window if the bone looks normal, and I am still reporting these within 1-2 minutes. Fortunately, this can also be clinically correlated, and a non-displaced radial head fracture that is 'missed' or 'occult' can be followed up in 1 week if there is persistent pain, with ZERO (or almost zero) adverse outcomes, as management is conservative anyway. We do not have to 'get it right' for every diagnosis on every study the first time; that's not how any field of medicine works, and again it would be detrimental to patient outcomes. All of the current attempts at AI readers have demonstrably terrible specificity, hence why they are not heavily used even in research settings; it's not just inertia. As an aside, the anterior humeral line is not a sign of radial head fracture.
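The tradeoff described here is mechanical: on any fixed set of scores, moving the decision threshold raises one metric and lowers the other. A toy illustration with invented scores and labels:

```python
# Illustrative only: hypothetical model scores and ground-truth labels
# for eight studies (1 = fracture present).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0,   0,   0,    1,   0,   1,   1,   1]

def sens_spec(threshold):
    """Sensitivity and specificity at a given decision threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and not y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    return tp / (tp + fn), tn / (tn + fp)

lenient = sens_spec(0.30)  # catches every fracture, more false positives
strict = sens_spec(0.65)   # cleaner calls, but misses an occult fracture
```

Sweeping the threshold traces the ROC curve; where you sit on it is a clinical choice about which error costs more, not a model property.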

2c. Additionally, if you were attempting to build such a system, a ViLM is hardly the best approach. It's just sexy to say GPT-4, but 'conventional' DL/ML is still the way to go if you have a labelled dataset, and it has higher accuracy than some abstract zero-shot model not trained on medical images.

3. Regarding lawsuits: we've had breast computer-aided diagnosis for a decade now, and there have been no lawsuits, at least none major enough to garner attention. It is easy to explain why: 'I discounted the AI finding because I reviewed it myself and disagreed.' In fact, that is the American College of Radiology guidance on using breast CAD. A radiologist should NOT change their interpretation solely based on a CAD finding they find discordant, due to the aforementioned specificity issues and the harms of overdiagnosis. What you should do (and what those of us practicing in these environments do) is give a second look to the areas identified by CAD.

4. Regarding other tasks, this is unequivocally changing. In most large centres you don't have IR performing biopsies. I interviewed at 8 IR fellowships and 4 body imaging fellowships, and in all of them this workload was handled by diagnostic radiologists. We also provide fluoroscopic services; I think you are referring to a dying trend where IR does a lot of them. Cleveland Clinic actually has nurses/advanced practice providers doing this. Biopsies are a core component of diagnostic training per ACGME guidelines. It is dismissive to say the vast majority of radiologists read images for big money without ever reviewing the clinical chart; I don't know any radiologist who would read a complex oncology case without reviewing the treatment history. How else are you assessing for complications without knowing what's been done? I don't need to review the chart on easy cases, but that's also not what you want a radiologist for. You could sign a normal template for 90% of reports, or 98% of CT pulmonary embolism studies, without looking at the images and be correct. That's not why we're trained and do fellowships in advanced imaging; it's for the 1% of cases that require competent interpretation.

5. Regarding orthopedists, the challenge here is that it is hard for a radiologist to provide an accurate enough interpretation, without the clinical history, for the single or few pathologies a specific orthopedist deals with. For example, a shoulder specialist looks at the MRI for every one of their patients in clinic; as a general radiologist, my case volumes are far lower than theirs. My job on these reports is to triage patients to the appropriate specialty (i.e. flag the case as abnormal for referral to ortho), who can then correlate with physical exam maneuvers and adjust their ROC curves based on arthroscopic findings. I don't have that luxury. Fortunately, that is also not why you employ an MSK radiologist; our biggest role is contributing to soft tissue and malignancy characterization. I've worked with some very renowned orthopedists in the US, and as soon as you get out of the wheelhouse of the 5 ligaments they care about, they rely heavily on our interpretations.

Additionally, imaging findings in MSK do not equal disease. In a recent study of asymptomatic individuals, more than 80% had hip labral tears. This is why the clinical context is so important. I don't have numbers on soft tissue thickening as an isolated sign of radial head fracture, but it would be of very low yield; in the very infrequent case of a radial head fracture without a joint effusion, I mention the soft tissues and, as above, follow up in 1 week to see evolution of the fracture line if it was occult. That's a far better situation than immobilizing every child because of a possible fracture suggested by soft tissue swelling.

With respect to the best orthopaedic hospital in the country, presumably HSS: they employ radiologists because that is the BEST practice for the BEST patient outcomes/care. It's not solely/mostly because of the money. EVERY academic/cancer center employs MSK radiologists.

6. Respectfully, the reason not to have IR sign off the GPT-4 report is that you are not trained in advanced imaging of every modality. See point 1b: if you aren't investing your time staying up to date on liver imaging because you are mastering your interventional craft, you may be unaware of several important advances from the past few years.

7. With respect to hidden features, there are better ones to talk about than soft tissue swelling. There is an entire field devoted to this (radiomics and texture analysis), and the results have been underwhelming, apart from a few very select, small studies showing questionable benefit that sits very low on the evidence tree.

To summarize: radiology can be very, very hard. We do not train solely to diagnose the simple things a junior resident can pick up (a liver lesion with APHE and washout); we train for the nuanced and hard cases. We also do not optimize for 'accurate' detection on every indication and every study type; there are limitations to each imaging modality, and the consequences of missed/delayed diagnosis vary depending on the disease process being discussed, similarly with overdiagnosis and overtreatment. 'Hidden features' have so far been underwhelming in radiology, or we would use them.



