I heard about the Society for Imaging Informatics in Medicine's (SIIM) Scientific Conference on Machine Intelligence in Medical Imaging (C-MIMI) on Twitter. It was attractively priced, easy to get to, and the first radiology conference I'd seen on machine learning, a subject I'm interested in, so I went. Since it was organized on short notice, I was expecting a small conference.
I almost didn’t get a seat. It was packed.
The conference had real nuts-and-bolts presentations and discussions on machine learning (ML) in healthcare imaging. Most models were convolutional neural networks (CNNs/ConvNets), but a few random forests (RFs) and support vector machines (SVMs) sneaked in, particularly in hybrid models alongside a CNN (cf. Microsoft). The comments that follow assume some facility with ConvNets.
Some consistent threads throughout the conference:
- Most CNNs were pretrained on ImageNet with the final fully connected (FC) layer removed, then retrained on radiology data with a new FC classifier layer placed at the end (see the first sketch after this list).
- Most CNNs kept ImageNet's standard three-channel RGB input despite the source images being grayscale. The significance and importance of this is uncertain.
- Limiting input matrices to grids smaller than the native image size is inherited from the ImageNet competitions (and from legacy computational power). The resulting loss of resolution is a limiting factor in medical imaging applications, potentially worked around by multi-scale CNNs.
- There is no central data repository providing a good "ground truth" on which to develop improved medical imaging models.
- Data augmentation methods are commonly used to compensate for the small number of available cases (see the second sketch after this list).
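To make the first point concrete, here is a minimal transfer-learning sketch in Keras (the stack most attendees reported using). The base model choice, class count, and the channel-replication trick for grayscale input are my illustrative assumptions, not any presenter's exact recipe.

```python
# Minimal transfer-learning sketch: ImageNet-pretrained CNN with its
# original classifier head removed, plus a new FC layer for a radiology task.
import numpy as np
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

NUM_CLASSES = 2  # e.g., normal vs. abnormal (assumption)

# include_top=False drops the original ImageNet FC classifier.
base = InceptionV3(weights="imagenet", include_top=False,
                   input_shape=(299, 299, 3))
base.trainable = False  # freeze the pretrained convolutional features

x = GlobalAveragePooling2D()(base.output)
outputs = Dense(NUM_CLASSES, activation="softmax")(x)  # new FC classifier head
model = Model(base.input, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy")

# Grayscale workaround: replicate the single channel three times so the
# image fits the three-channel RGB input the network inherits from ImageNet.
gray = np.random.rand(1, 299, 299, 1).astype("float32")  # stand-in image
rgb_like = np.repeat(gray, 3, axis=-1)
print(model.predict(rgb_like).shape)  # (1, NUM_CLASSES)
```

And a minimal augmentation sketch in the same stack. The transform parameters are illustrative assumptions, and note that flips can silently change laterality in medical images, so they deserve domain review.

```python
# Data augmentation sketch: generate transformed variants of scarce cases.
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    rotation_range=10,        # small rotations
    width_shift_range=0.05,   # slight translations
    height_shift_range=0.05,
    zoom_range=0.1,
    horizontal_flip=True,     # caution: flips change laterality
)

images = np.random.rand(8, 256, 256, 1)  # stand-in grayscale batch
labels = np.zeros(8)
aug_images, _ = next(augmenter.flow(images, labels, batch_size=8))
print(aug_images.shape)  # (8, 256, 256, 1)
```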
Keith Dreyer, DO, PhD, gave an excellent lecture about the trajectory of machine learning in imaging and how it will be an incremental process, with AI growth narrower in scope than projected and chiefly limited by applications. At this time, CNN creation and investigation is principally an artisanal product with limited scalability. A recurring theme was "What is ground truth?", which in different instances means different things (pathology-proven, followed through time, or pathognomonic imaging appearance).
There was an excellent educational session from the FDA's Berkman Sahiner. The difference between certifying a Class II and a Class III device may keep radiologists working longer than expected! A Class II device, like CAD, identifies a potential abnormality but does not make a treatment recommendation, and therefore only requires a 510(k) application. A Class III device, such as an automated interpretation program that produces diagnoses and treatment recommendations, will require a more extensive application including clinical trials, plus new validation for any material change. One important insight (there were many) was that the FDA requires training and test data to be kept separate. I believe this means that simple cross-validation is neither acceptable nor sufficient for FDA approval or certification. Adaptive systems may be a particularly challenging area for regulation: as with the ONC, significant changes to the algorithm's software will require a new certification/approval process.
Industry papers were presented by HK Lau of Arterys, Xiang Zhou of Siemens, Xia Li of GE, and Eldad Elnekave of Zebra Medical. The Zebra Medical presentation was impressive, citing their use of the Google Inception V3 model and a false-color contrast-limited adaptive histogram equalization (CLAHE) algorithm, which not only provides high image contrast with low noise but also gets around the three-channel RGB issue. The reported statistics for their CAD program were impressive: 94% accuracy versus 89% for a radiologist.
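For the curious, here is a rough sketch of CLAHE preprocessing using OpenCV. The false-color scheme shown (different enhancement strengths per channel) is my assumption about one plausible way to fill all three channels; Zebra's exact recipe wasn't described.

```python
# CLAHE preprocessing sketch with OpenCV; the false-color channel-stacking
# scheme is an illustrative assumption, not Zebra Medical's actual recipe.
import cv2
import numpy as np

gray = (np.random.rand(512, 512) * 255).astype(np.uint8)  # stand-in image

# Contrast-limited adaptive histogram equalization over 8x8 tiles.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
equalized = clahe.apply(gray)

# One plausible false-color scheme: a different enhancement per channel,
# which also fills the three-channel RGB input a pretrained CNN expects.
strong = cv2.createCLAHE(clipLimit=4.0, tileGridSize=(8, 8)).apply(gray)
three_channel = np.dstack([gray, equalized, strong])
print(three_channel.shape)  # (512, 512, 3)
```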
Scientific papers were presented by Matthew Chen, Stanford; Synho Do, Harvard; Curtis Langlotz, Stanford; David Golan, Stanford; Paras Lakhani, Thomas Jefferson; Panagiotis Korfiatis, Mayo Clinic; Zeynettin Akkus, Mayo Clinic; Etka Bullar, U Saskatchewan; Mahmudur Rahman, Morgan State U; and Kent Ogden, SUNY Upstate.
Ronald Summers, MD, PhD, from the NIH gave a presentation on the work from his lab in conjunction with Holger Roth, detailing specific CNN approaches to lymph node detection, anatomic level detection, vertebral body segmentation, pancreas segmentation, and colon polyp screening with CT colonography (the last with a high false-positive rate). In his experience, deeper models performed better. His lab also converts unstructured radiology reports into structured reports through ML techniques.
Abdul Halabi of NVIDIA gave an impressive presentation on the supercomputer-like DGX-1 GPU cluster (five deliveries to date, the fifth to Mass. General, a steal at over $100K) and the new Pascal architecture in the P4 and P40 GPUs: a 60x speedup on AlexNet versus the original GPU configuration from 2012. Very impressive.
Sayan Pathak of Microsoft Research and the InnerEye team gave a good presentation in which he demonstrated that an RF is really just a two-layer DNN, i.e., a sparse two-layer perceptron (a toy sketch of the idea follows). Combining this with a CNN (dNDF.NET), it beat the latest version of GoogLeNet in the ImageNet arms race. However, since one needs to solve for both structures simultaneously, the computation is expensive (long and intense).
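Here is a toy illustration of that equivalence. The construction is mine, not Pathak's slides: layer one evaluates the split tests with step activations, and a second sparse layer ANDs them into one-hot leaf indicators; a forest is then just a sum of such networks.

```python
# Toy demo: a depth-2 decision tree rewritten as a sparse perceptron
# with step (Heaviside) activations. My construction, for illustration only.
import numpy as np

step = lambda z: np.heaviside(z, 0.0)

def tree(x):
    # Reference tree with arbitrary thresholds and leaf values.
    if x[0] > 0.5:
        return 3.0 if x[1] > 0.7 else 2.0
    return 1.0 if x[1] > 0.3 else 0.0

# Layer 1: one unit per internal node, h_i = step(w.x - threshold).
W1 = np.array([[1.0, 0.0],   # node 0: x0 > 0.5
               [0.0, 1.0],   # node 1: x1 > 0.3 (left subtree)
               [0.0, 1.0]])  # node 2: x1 > 0.7 (right subtree)
b1 = np.array([-0.5, -0.3, -0.7])

# Layer 2: one unit per leaf; +/-1 weights AND the routing conditions.
W2 = np.array([[-1, -1,  0],   # left-left:   h0=0, h1=0
               [-1,  1,  0],   # left-right:  h0=0, h1=1
               [ 1,  0, -1],   # right-left:  h0=1, h2=0
               [ 1,  0,  1]])  # right-right: h0=1, h2=1
b2 = np.array([0.5, -0.5, -0.5, -1.5])
leaf_values = np.array([0.0, 1.0, 2.0, 3.0])

def perceptron(x):
    h = step(W1 @ x + b1)       # split decisions
    leaves = step(W2 @ h + b2)  # exactly one leaf fires
    return leaf_values @ leaves

for x in np.random.rand(1000, 2):
    assert tree(x) == perceptron(x)
print("tree and sparse two-layer perceptron agree")
```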
Closing points were the following:
- Most developers are currently using Python with TensorFlow, with or without Keras; fewer use Caffe with models from the Model Zoo.
- The common pipeline is DICOM -> NIfTI -> DICOM: convert to NIfTI for processing, then back (see the sketch after this list).
- De-identification of data is a problem, even more so when considering longitudinal follow-up.
- Matching accuracy against the radiologist's report may matter less than matching actual patient outcomes.
- There was a lot of interest in organizing a competition to advance medical imaging, cf. Kaggle.
- Radiologists aren’t obsolete just yet.
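As a concrete anchor for the pipeline bullet above, here is a minimal sketch of the DICOM-to-NIfTI leg using SimpleITK. The directory path is a placeholder, and the return trip to DICOM (which requires re-attaching per-slice metadata) is deliberately omitted.

```python
# Sketch of the DICOM -> NIfTI conversion step using SimpleITK;
# "/path/to/dicom/series" is a placeholder for a real series directory.
import SimpleITK as sitk

reader = sitk.ImageSeriesReader()
dicom_names = reader.GetGDCMSeriesFileNames("/path/to/dicom/series")
reader.SetFileNames(dicom_names)
volume = reader.Execute()  # 3D volume with geometry from the DICOM headers

sitk.WriteImage(volume, "series.nii.gz")  # NIfTI out, ready for ML tooling
```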
It was a great conference. An unexpected delight. Food for your head!