The coming computer vision revolution

3 layer (7,5,3 hidden layers) neural network created in R using the neuralnet package.

 

Nothing of him that doth fade
But doth suffer a sea-change
Into something rich and strange.

– Shakespeare, The Tempest 1.2.396-401

I’m halfway through auditing Stanford’s CS231n course – Convolutional Neural Networks for Visual Recognition.

Wow. Just Wow. There is a sea-changing paradigm shift that is happening NOW –  we probably have not fully realized it yet.

We are all tangentially aware of CV applications in our daily lives – Facebook’s ability to find us in photos, optical character recognition (OCR) of our address on postal mail, that sort of thing. But these algorithms were rule-based expert systems grounded in supervised learning methods. Applications were largely one-off for a specific, single task. They were expensive, complicated, and somewhat error prone.

So what changed? First, a little history. In the early 1980s I had a good friend obtaining an MS in comp sci, all atwitter about "Neural Networks". Back then they went nowhere. Too much processing/memory/storage required, too difficult to tune, computationally slow. Fail.

Then:

1999 – Models beginning with SIFT and ending with SVM (support vector machine) deformable parts models. The best achieved only about 74% accuracy.

2006 – Restricted Boltzmann Machines, through layer-wise pre-training, make deep neural networks trainable with backpropagation.

2012 – AlexNet: deep learning applied to the ImageNet classification competition achieves nearly a 2X reduction in error rate compared to the earlier SVM methods.

2015 – ResNet: a deep learning system achieves a 4.5X reduction in error rate compared to AlexNet, and roughly an 8X reduction compared to the old SVM models.

In practical terms, what does this mean? On a data set with 1,000 different categories (ImageNet), ResNet picks the single correct item about 80% of the time, and includes the correct item in its list of the five most probable answers 96.4% of the time. People are typically believed to achieve about 95% accuracy identifying the correct image. It's clear that the computer is not far off.

2012 was the watershed year, with the first application – and win – of a CNN in this competition, and the improvement was significant enough that it sparked further refinement and development. That work is still going on – the ResNet result was released only in December 2015! This is clearly an area of active research, and further improvements are expected.

The convolutional neural network is a game-changer and will likely approach, and perhaps exceed, human accuracy in computer vision and classification in the near future. That's a big deal. As this is a medical blog, the applications to healthcare are obvious – radiology, pathology, dermatology, and ophthalmology, for starters. But the CNN may also be useful for the complicated process problems I've developed here on the blog – the flows themselves naturally resemble networks, so why not model them as such? Why is it a game changer? Because the model is probably universally adaptable to visual classification problems and, once trained, potentially cheap.

 

I'll write more on this in the coming weeks – I've been inching towards deep learning models (but lagging in blogging about them), and there is no reason to wait any more. The era of the deep learning neural network is here.

A quick look back at the blog

I’ve been blogging occasionally over the last few years, and there is no denying that this is a (very) niche blog mostly focusing on the interaction between healthcare, technology, machine learning and process analytics as they relate to operations.

Despite that I’m delighted that the blog had the sum total of 7384 views (!) from 4782 unique visitors (!!) suggesting that the average visitor reads 1.54 posts (!!!).

Realizing that some of the folks who land here do so by accident or are bots, the average number of posts read by real people is likely much higher, which is gratifying.

Most traffic, of course, comes from the USA. A big hi to the one person who read me from Tanzania.

 

The most popular posts on the blog are the following:

  1. What big data visualization analytics can learn from Radiology
  2. What medicine can learn from Wall Street, Part 1 (& related posts)
  3. Mentoring, Compassion, Curing and Healing
  4. The danger of choosing the wrong metric
  5. Why does everything work in vitro but not in vivo?
  6. Black Swans, Antifragility, Six Sigma & healthcare operations.

Catching up with the “What medicine can learn from Wall St.” series

The “What medicine can learn from Wall Street” series is getting a bit voluminous, so here’s a quick recap of where we are up to so far:

Part 1 – History of analytics – a broad overview which reviews the lagged growth of analytics driven by increasing computational power.

Part 2 – Evolution of data analysis – correlates specific computing developments with analytic methods and discusses pitfalls.

Part 3 – The dynamics of time – compares and contrasts the opposite roles and effects of time in medicine and trading.

Part 4 – Portfolio management and complex systems – lessons learned from complex systems management that apply to healthcare.

Part 5 – RCM, predictive analytics, and competing algorithms – develops the concept of competing algorithms.

Part 6 – Systems are algorithms – discusses ensembling in analytics and relates operations to software.


 

What are the main themes of the series?

1.  That healthcare lags behind Wall Street in computation, efficiency, and productivity; and that we can learn where healthcare is going by studying Wall Street.

2.  That increasing computational power allows for more accurate analytics, with a lag.  This shows up first in descriptive analytics, then allows for predictive analytics.

3.  That overfitting data and faulty analysis can be dangerous and lead to unwanted effects.

4.  That time is a friend in medicine, and an enemy on Wall Street.

5.  That complex systems behave complexly, and modifying a sub-process without considering its effect upon other processes may have “unintended consequences.”

6.  That we compete through systems and processes – and ignore that at our peril as the better algorithm wins.

7.  That systems are algorithms – whether soft or hard coded – and we can ensemble our algorithms to make them better.


 

Where are we going from here?

– A look at employment trends on Wall Street over the last 40 years and what it means for healthcare.

– More emphasis on the evolution from descriptive analytics to predictive analytics to prescriptive analytics.

– A discussion for management on how analytics and operations can interface with finance and care delivery to increase competitiveness of a hospital system.

– Finally, tying it all together and looking towards the future.

 

All the best to you and yours and great wishes for 2016!

 

 

Further Developing the Care Model – Part 3 – Data generation and code

Returning to the care model discussed in parts one and two, we can begin by defining our variables.


Each sub-process variable is named for its starting and ending sub-process. We will define the mean time for each sub-process in minutes and add a component of time variability. You will note that the variability is skewed – somewhat shorter times are possible, but disproportionately longer times are as well. This matches real life: in a well-run operation, mean times may sit close to their lower limits – since these are physical, real-world processes, there may simply be a physical constraint on how quickly anything can be done. However, problems, complications, and miscommunications can extend that time well beyond what any of us would like – for those of us who have had real-world hospital experience, does this not sound familiar?

Because of this, we will choose a gamma distribution to model our processes. With shape \kappa and rate \theta , its density is

                              f(t;\kappa,\theta) = \frac{\theta^{\kappa}\,t^{\kappa-1}e^{-\theta t}}{\Gamma(\kappa)}, \quad t>0, \qquad \text{where } \Gamma(\kappa) = \int_{0}^{\infty} {t^{\kappa-1}e^{-t}dt}

The gamma distribution is useful because it deals with continuous time data, and we can skew it through its shape parameter Kappa (\kappa) and rate parameter Theta (\theta). We will use R's rgamma(N,\kappa,\theta) function to generate the raw draws, and then use a multiplier to stretch each distribution and an offset to shift it along the X-axis. The gamma distribution also respects an absolute lower time limit – I consider this a feature, not a flaw.
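As a quick sanity check (my own sketch, using the sub-process 1-2 parameters from the code below), the mean of a gamma draw with shape 1.9 and rate 3.8 is 1.9/3.8 = 0.5; multiplied by 10 and offset by 4.8, that gives a mean time of roughly 9.8 minutes – close to the 10-minute mean we specified for sub-process 1-2 in part 2:

mean(rgamma(1e5, shape = 1.9, rate = 3.8) * 10 + 4.8)   # roughly 9.8 minutes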

It is generally recognized that a probability density (kernel density) plot, as opposed to a histogram, is more accurate and less prone to distortions related to the number of samples (N). A plot of these distributions looks like this:

The R code to generate this distribution, graph, and our initial values dataframe is as follows:

seed <- 3559
set.seed(seed, kind = NULL, normal.kind = NULL)
n <- 16384   ## 2^14 samples (a power of 2, which suits the density() call below)
## Gamma shape (k) and rate (theta) parameters, plus a multiplier (s) and offset (o)
## in minutes, one entry per sub-process
k <- c(1.9, 1.9, 6, 1.9, 3.0, 3.0, 3.0, 3.0, 3.0)
theta <- c(3.8, 3.8, 3.0, 3.8, 3.0, 5.0, 5.0, 5.0, 5.0)
s <- c(10, 10, 5, 10, 10, 5, 5, 5, 5)
o <- c(4.8, 10, 5, 5.2, 10, 1.6, 1.8, 2, 2.2)
## Simulated times for each sub-process ("prosess" is intentionally misspelled)
prosess1 <- (rgamma(n, k[1], theta[1]) * s[1]) + o[1]
prosess2 <- (rgamma(n, k[2], theta[2]) * s[2]) + o[2]
prosess3 <- (rgamma(n, k[3], theta[3]) * s[3]) + o[3]
prosess4 <- (rgamma(n, k[4], theta[4]) * s[4]) + o[4]
prosess5 <- (rgamma(n, k[5], theta[5]) * s[5]) + o[5]
prosess6 <- (rgamma(n, k[6], theta[6]) * s[6]) + o[6]
prosess7 <- (rgamma(n, k[7], theta[7]) * s[7]) + o[7]
prosess8 <- (rgamma(n, k[8], theta[8]) * s[8]) + o[8]
prosess9 <- (rgamma(n, k[9], theta[9]) * s[9]) + o[9]
## Kernel density estimates, each evaluated at 16384 points
d1 <- density(prosess1, n = 16384)
d2 <- density(prosess2, n = 16384)
d3 <- density(prosess3, n = 16384)
d4 <- density(prosess4, n = 16384)
d5 <- density(prosess5, n = 16384)
d6 <- density(prosess6, n = 16384)
d7 <- density(prosess7, n = 16384)
d8 <- density(prosess8, n = 16384)
d9 <- density(prosess9, n = 16384)
## Probability density plot of all nine sub-processes
plot.new()
plot(d9, type = "n", main = "Probability Densities",
     xlab = "Process Time in minutes", ylab = "Probability",
     xlim = c(0, 40), ylim = c(0, 0.26))
legend("topright",
       c("process 1", "process 2", "process 3", "process 4", "process 5",
         "process 6", "process 7", "process 8", "process 9"),
       fill = c("brown", "red", "blue", "green", "orange", "purple",
                "chartreuse", "darkgreen", "pink"))
lines(d1, col = "brown")
lines(d2, col = "red")
lines(d3, col = "blue")
lines(d4, col = "green")
lines(d5, col = "orange")
lines(d6, col = "purple")
lines(d7, col = "chartreuse")
lines(d8, col = "darkgreen")
lines(d9, col = "pink")
## Collect the density x-values, density y-values, and raw draws into one data frame
ptime <- c(d1[1], d2[1], d3[1], d4[1], d5[1], d6[1], d7[1], d8[1], d9[1])
pdens <- c(d1[2], d2[2], d3[2], d4[2], d5[2], d6[2], d7[2], d8[2], d9[2])
ptotal <- data.frame(prosess1, prosess2, prosess3, prosess4, prosess5,
                     prosess6, prosess7, prosess8, prosess9)
names(ptime) <- c("ptime1", "ptime2", "ptime3", "ptime4", "ptime5",
                  "ptime6", "ptime7", "ptime8", "ptime9")
names(pdens) <- c("pdens1", "pdens2", "pdens3", "pdens4", "pdens5",
                  "pdens6", "pdens7", "pdens8", "pdens9")
names(ptotal) <- c("pgamma1", "pgamma2", "pgamma3", "pgamma4", "pgamma5",
                   "pgamma6", "pgamma7", "pgamma8", "pgamma9")
pall <- data.frame(ptotal, ptime, pdens)

 

Where the relevant term is rgamma(n,\kappa, \theta).  We’ll use these distributions in our dataset.

One last concept needs to be discussed: the probability of each sub-process' occurrence. Each sub-process has a percentage chance of happening – some a 100% certainty, others occurring in as few as 5% of cases. This reflects the real-world reality of what happens – once a test is ordered, there's a 100% certainty of the patient showing up for the test, but not 100% of those patients will actually get the test. Some cancel due to contraindications, others can't tolerate it, others refuse, etc. The percentages that are <100% reflect those probabilities and essentially act as a probability-weighted switch applied to the beginning of the term that describes that sub-process. We're evolving first toward a simple generalized linear equation similar to that put forward in this post. I think it's going to look somewhat like this:

But we'll see how well this model fares as we develop it and compare it to some others. The x terms will likely represent the probabilities between 0 and 1.0 (100%).
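As a minimal sketch of that probabilistic switch (my illustration, not code from the post; the 30% probability is an assumed value), each off-pathway term can be multiplied by a Bernoulli draw:

p_sub <- 0.30                                 # assumed: this sub-process occurs in 30% of cases
sub_time <- (rgamma(n, 3.0, 5.0) * 5) + 1.6   # an off-pathway time, reusing prosess6's parameters above
occurs <- rbinom(n, size = 1, prob = p_sub)   # probabilistic switch: 1 = happens, 0 = does not
sub_time_effective <- occurs * sub_time       # the term contributes time only when it occurs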

For an EMR-based approach, we would assign a UID (medical record number plus 5-6 extra digits, helpful for encounter numbers). We would 'disguise' the UID by adding or subtracting a constant known only to us and then performing a mathematical operation on it. For our purposes here, however, we do not need to do that.
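If we did want that disguise step, a purely illustrative sketch might look like the following – the constant and the operation are hypothetical, since the post does not specify them, and this is simple obfuscation rather than true de-identification:

disguise_uid <- function(uid, secret = 24601) (uid + secret) * 3   # hypothetical secret constant and operation
disguise_uid(123456780001)   # reversible by anyone who knows the scheme, opaque at a glance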

We’ll  head on to our analysis in part 4.

 

Programming notes in R:

1.  I experimented with for loops and different configurations of apply here, and after a few weeks of experimentation decided I really couldn't improve upon the repetitive but simple code above. The issue is that the density function returns a list of seven components, so it is not as easy as defining a matrix, and the length of the data frame changes. I'm sure there is a way around this, but for the purposes of this illustration it is beyond our needs. Email me at contact@n2value.com if you have working code that does it better!
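One possible direction, sketched here rather than offered as a definitive improvement (it assumes the k, theta, s, and o vectors defined above): generate the nine processes with Map, and keep the density objects in a list instead of forcing them into the data frame, which sidesteps the seven-component problem entirely.

prosess <- Map(function(ki, ti, si, oi) rgamma(n, ki, ti) * si + oi,
               k, theta, s, o)               # list of nine simulated time vectors
names(prosess) <- paste0("pgamma", seq_along(prosess))
dens <- lapply(prosess, density, n = 16384)  # nine density objects kept in a list
ptotal <- as.data.frame(prosess)             # 16384 x 9 data frame of raw times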

2.  For the density function, the number of evaluation points should be a power of 2 (R rounds n up to a power of 2 internally). Choosing 16384 (2^14) meets that goal, and drawing the same number of samples keeps every column of the data frame the same length.

3.  In variable names above, prosess is an intentional misspelling.

 

Why does everything work In Vitro but not In Vivo?

Once, I was bitten by the Neurosurgery bug. (Thanks, Dr. Ojemann!) Before I became a radiologist, I researched vasospasm in subarachnoid hemorrhage (SAH). It was a fascinating problem, unfortunately with very real effects for those afflicted. Nimodipine and the old "triple-H" therapy were treatment mainstays, and many neurosurgeons added their own 'special sauce' – the treatment du jour. In Vitro (in the lab), experimental interventions held great promise for this terrible complication, but nearly all would fail when applied in clinical practice, In Vivo (in real life).

As physicians, we look at disease and try to find a “silver bullet” which will treat that disease 100% of the time with complete efficacy and no side effects. Using Occam’s Razor, the simplest & most obvious solution is often the best.

Consider a disease cured by a drug, as in figure 1. Give the drug, get the desired response. The drug functions as a key in the lock, opening up the door.

[Figure 1 – image via drawingnow.com]

This is how most carefully designed In Vitro experiments work. But take the treatment out of the laboratory, and it fails. Why?

The carefully controlled lab environment is just that – controlled. You set up a simple process, and get your result. However, the In Vivo environment is not so simple – competing complex processes maintaining physiologic homeostasis, at the cellular bio-chemical level – interact with your experiment & confound the results. And the number of disease processes that involve a simple direct cure dwindle with time – the previous generations of scientists have culled those low-hanging fruit already!  You are left with this:

[Figure – n2value.com]

 

Consider three locked doors, one after the other.  You can't open the second without opening the first, and you can't open the third without opening up the first and second.  Here, we have a good therapy, which will cure the disease process, represented by opening up door #3.  But the therapy cannot get to Door #3 – it's blocked by Doors #1 and #2.

Considering the second system, which more closely approximates what we find in real life, an efficacious drug or treatment exists but cannot reach the disease-impacting pathway, because it is "locked out" by the body's other systems. Non-exhaustively: drug elimination, enzymatic drug inactivation, or feedback pathways counteracting the drug's effect – it works, but the body's own homeostatic mechanisms compensate!

Experimentally though, we are not taught to think of this possibility – instead preferring a single agent with identifiable treatment results.  However, many of these easy one-item solutions have already been discovered. That’s why there has been a decrease in the number of novel synthetic chemical drug discoveries lately, as opposed to growth in biologics.  Remember monthly new antibiotic releases?  How often do you see new antibiotics now?

There is a tremendous opportunity to go back and revisit compounds that were initially discarded for reasons other than toxicity, to see if there are new or synergistic effects when combined with other therapy. Randomized controlled trials would be too large and costly to perform a priori on such compounds – but using EHR data mining, cross-validated longitudinal trials could be designed from existing patient data sets, and some of these unexpected effects could be elucidated after the fact! Then a smaller, but focused, prospective study could be used to confirm the suspected hypothesis. Big data analytics has great promise in teasing out these relationships, and the same techniques can be applied to non-pharmacologic interventions and decisions in patient care throughout medicine. In fact, the answers may already be there – we just may not have recognized them!

P.S.  Glad to be back after a long hiatus.  Life happens!

Black Swans, Antifragility, Six Sigma and Healthcare Operations – What medicine can learn from Wall St Part 7



I am an admirer of Nassim Nicholas Taleb – a mercurial options trader who has evolved into a philosopher-mathematician. The focus of his work is on the effects of randomness, how we sometimes mistake randomness for predictable change, and how we fail to prepare for randomness by excluding outliers in statistics and decision making. These "black swans" arise unpredictably and cause great harm, amplified by the 'fragile' systems we have put into place.

Perhaps the best example of a black swan event is the period of financial uncertainty we have lived through during the last decade. A quick recap: the 2008 global financial crisis was caused by a bubble in US real estate assets. This in turn arose from legislation mandating lower lending standards (subprime, Alt-A) and facilitating securitization of those loans – the proverbial passing of the 'hot potato'. These mortgages were packaged into derivatives named collateralized debt obligations (CDOs), using statistical models to gauge the default risk of the underlying loans. Loans more likely to default were blended with loans less likely to default, yielding an overall package that was statistically unlikely to default. However, as the owners of these securities found out, the statistical models that made them look safe were based on a short sample period with very few defaults. The models indicated that the financial crisis was a 25-sigma (standard deviation) event that should happen only once in a span of years with well over a hundred zeroes in it (cf. Wolfram Alpha).

Of course, the default events happened within the first five years of these securities' existence, proving the calculation woefully inadequate.
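For a rough sense of that scale – my own sketch; the original post displayed the full figure as an image – the one-sided tail probability of a 25-sigma move under a normal model can be computed directly in R:

pnorm(-25)   # about 3e-138, so the implied once-in-N-years waiting time has well over 100 zeroes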

The problem with major black swans is that they are sufficiently rare and impactful that it is difficult to plan for them – global pandemics, the Fukushima reactor accident, and the like. By designing robust systems that expect perturbation, you can mitigate their effects when they do occur, and shake off the more frequent minor black (grey) swans – perturbations that occur occasionally (but more often than you expect); 5-10 sigma events that are disruptive but not devastating (like local disease outbreaks or power outages).

Taleb classifies how things react to randomness into three categories: Fragile, Robust, and Anti-Fragile.  While the interested would benefit from reading the original work, here is a brief summary:

1.     The Fragile consists of things that hate, or break, from randomness.  Think about tightly controlled processes, just-in-time delivery, tightly scheduled areas like the OR when cases are delayed or extended, etc…
2.     The Robust consists of things that resist randomness and try not to change.  Think about warehousing inventories, overstaffing to mitigate surges in demand, checklists and standard order sets, etc…
3.     The Anti-Fragile consists of things that love randomness and improve with serendipity.  Think about cross-trained floater employees, serendipitous CEO-employee hallway meetings, lunchroom physician-physician interactions where the patient benefits.

In thinking about Fragile / Robust / Anti-Fragile, be cautious about injecting bias into the meanings. After all, we tend to avoid breakable objects, preferring things that are hardy or robust. So there is a natural tendency to consider fragility 'bad', robustness 'good', and anti-fragility therefore 'great!' Not true – when we approach these categories from an operational or administrative viewpoint.

Fragile processes and systems are those prone to breaking. They hate variation and randomness and respond well to six-sigma analyses and productivity/quality improvement. I believe that fragile systems and processes are those that will benefit the most from automation and technology. Removing human input and interference decreases cycle time and defects. While the fragile may be prone to breaking, that is not necessarily bad. Think of the new entrepreneur's mantra – 'fail fast'. Agile/SCRUM development, most common in software (but perhaps useful in healthcare?), relies on rapid iteration to adapt to a moving target. Fragile systems and processes cannot be avoided – instead they should be highly optimized with the least human involvement. They need careful monitoring (daily? hourly?) to detect failure, at which point a ready team can swoop in, fix whatever caused the breakage, re-optimize if necessary, and restore the system to functionality. If a fragile process breaks too frequently and causes significant disruption, it probably should be made into a robust one.

Robust systems and processes are those that resist failure due to redundancy and relative waste.  These probably are your ‘mission critical’ ones where some variation in the input is expected, but there is a need to produce a standardized output.  From time to time your ER is overcome by more patients than available beds, so you create a second holding area for less-acute cases or patients who are waiting transfers/tests.  This keeps your ER from shutting down.  While it can be wasteful to run this area when the ER is at half-capacity, the waste is tolerable vs. the lost revenue and reputation of patients leaving your ER for your competitor’s ER or the litigation cost of a patient expiring in the ER after waiting 8 hours.    The redundant patient histories of physicians, nurses & medical students serve a similar purpose – increasing diagnostic accuracy.  Only when additional critical information is volunteered to one but not the other is it a useful practice.  Attempting to tightly manage robust processes may either be a waste of time, or turn a robust process into a fragile one by depriving it of sufficient resilience – essentially creating a bottleneck.  I suspect that robust processes can be optimized to the first or second sigma – but no more.

Anti-fragile processes and systems benefit from randomness, serendipity, and variability.  I believe that many of these are human-centric.  The automated process that breaks is fragile, but the team that swoops in to repair it – they’re anti-fragile.  The CEO wandering the halls to speak to his or her front-line employees four or five levels down the organizational tree for information – anti-fragile.  Clinicians that practice ‘high-touch’ medicine result in good feelings towards the hospital and the unexpected high-upside multi-million dollar bequest of a grateful donor 20 years later – that’s very anti-fragile.  It is important to consider that while anti-fragile elements can exist at any level, I suspect that more of them are present at higher-level executive and professional roles in the healthcare delivery environment.  It should be considered that automating or tightly managing anti-fragile systems and processes will likely make them LESS productive and efficient.  Would the bequest have happened if that physician was tasked and bonused to spend only 5.5 minutes per patient encounter?  Six sigma management here will cause the opposite of the desired results.

I think a lot more can be written on this subject, particularly from an operational standpoint.   Systems and processes in healthcare can be labeled fragile, robust, or anti-fragile as defined above.  Fragile components should have human input reduced to the bare minimum possible, then optimize the heck out of these systems.  Expect them to break – but that’s OK – have a plan & team ready for dealing with it, fix it fast, and re-optimize until the next failure.  Robust systems should undergo some optimization, and have some resilience or redundancy also built in – and then left the heck alone!  Anti-fragile systems should focus on people and great caution should be used in not only optimization, but the metrics used to manage these systems – lest you take an anti-fragile process, force it into a fragile paradigm, and cause failure of that system and process.  It is the medical equivalent of forcing a square peg into a round hole.  I suspect that when an anti-fragile process fails, this is why.

Skeptical about competing algorithms?

Someone commented to me that the concept of competing algorithms was very science-fictiony and hard to take at face value outside of the specific application of high frequency trading on Wall Street.  I can understand how that could be argued, at first glance.

However, consider that systems are algorithms (you may want to re-read Part 6 of the What Medicine can learn from Wall Street series).  We have entire systems (in some cases, departments) set up in medicine to handle the process of insurance billing and accounts receivable.  Just when our billing departments seem to get very good at running claims, the insurers implement a new system or rule set which increases our denials.  Our billers then adapt to that change to return to their earlier baseline of low denials.

Are you still sure that there are no competing algorithms in healthcare?  They are hard-coded in people and processes not soft-coded in algorithms & software.

If you are still not sure, consider legacy retailers who are selling commodity goods.  If everyone is selling the same item at the same price, you can only beat your competition by successful internal processes that give you increased profitability over your competitors, allowing you to out-compete them.  You win because you have better algorithms.

Systems are algorithms.  And algorithms compete.

What medicine can learn from Wall Street part 6 – Systems are algorithms

Systems trading on Wall Street in the early days (pre-1980s) was done by hand or by laborious computation. Systems traded off indicators – hundreds of indicators exist, but most are either trend or anti-trend. Trending indicators range from the ubiquitous and time-honored moving average to the MACD and beyond; anti-trend indicators tend to be based on oscillators such as the relative strength index (RSI). In a trending market the moving average will do well, but it will get chopped around in a non-trending market, with frequent wrong trades. The oscillator solves some of this problem, but in a strongly trending market it tends to underperform and miss the trend. Many combinations of trend and anti-trend systems were tried, with little success in developing a consistent model that could handle changing market conditions from trend to anti-trend (consolidation) and back.

The shift towards statistical models in the 2000’s (see Evidence-Based Technical Analysis by Aronson) provided a different way to analyze the markets with some elements of both systems.  While I would argue that mean reversion has components of an anti-trend system, I’m sure I could find someone to disagree with me.  The salient point is that it is a third method of evaluation which is neither purely trend or anti-trend.

Finally, the machine learning algorithms that have recently become popular give a fourth method of evaluating the markets. This method is neither trend, anti-trend, nor purely statistical (in the traditional sense), so it provides additional information and diversification.

Combining these models through ensembling might have some very interesting results.  (It also might create a severely overfitted model if not done right).
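As a purely illustrative sketch of that ensembling idea – not the post's method, and assuming hypothetical data frames train and test with a numeric outcome y and the randomForest package installed – prediction averaging can be as simple as:

library(randomForest)                          # assumes the randomForest package is available
fit_lm <- lm(y ~ ., data = train)              # statistical (linear) model
fit_rf <- randomForest(y ~ ., data = train)    # machine-learning model
ens <- 0.5 * predict(fit_lm, newdata = test) +
       0.5 * predict(fit_rf, newdata = test)   # simple equal-weight prediction average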

Sidebar:  I believe that the market trades in different ways at different times.  It changes from a technical market, where predictive price indicators are accurate, to a fundamental market, driven by economic data and conditions, to a psychologic market, where ‘random’ current events and investor sentiment are the most important aspects.  Trending systems tend to work well in fundamental markets, anti-trend systems work well in technical or psychologic markets, statistical (mean reversion) systems tend to work well in technical or fundamental markets, and I suspect machine learning might be the key to cracking the psychologic market.  What is an example of a psychologic market?  This – the S&P 500 in the fall of 2008 when the financial crisis hit its peak and we were all wondering if capitalism would survive.

40% Drop in the S&P 500 from August – November during the 2008 financial crisis.

By the way, this is why you pay a human to manage your money, instead of just turning it over to a computer.  At least for now.

So why am I bringing this up? I'm delving more deeply into queueing and operations theory these days, wondering if it would be helpful in developing an ensemble model – part supervised learning (statistics), part unsupervised (machine) learning, part queueing-theory algorithms. Because of this, I'm putting this project on hold. But it did make me think about the algorithms involved, and I had an aha! moment that is probably nothing new to industrial engineering types or operations folks who are also coders.

Algorithms – like an ensemble model composed of three separate models: a linear model (supervised learning), a machine learning model (unsupervised learning), and a rule-based model (queueing theory) – are software-coded rule sets. However, the systems we put in place in physical space are really just the same thing. The policies, procedures, and operational rule sets that exist in our workplace (e.g. the hospital) are hard-coded algorithms made up of flesh and blood, equipment, and architecture, operating in an analogue of computer memory – the wards and departments of the hospital.

If we only optimize for one value (profit, throughput, quality of care, whatever), we may miss the opportunity to create a more robust and stable model.  What if we ensembled our workspaces to optimize for all parameters?
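A toy sketch of what "optimize for all parameters" could mean in practice (my illustration only; the metrics, scaling, and weights are assumed): normalize each value to a 0-1 scale and blend them into a single weighted objective that can be compared across scenarios.

metrics <- c(profit = 0.62, throughput = 0.80, quality = 0.71)  # assumed values, each scaled to 0-1
weights <- c(profit = 0.3, throughput = 0.3, quality = 0.4)     # assumed relative priorities
sum(weights * metrics)                                          # one blended objective to compare scenarios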

The physical systems we have in place – which stem from policies, procedures, management decisions, and workspace and workflow design – are a real-life representation of a complex algorithm we have created or, more accurately, that has grown largely organically, to serve the function of delivering care in the hospital setting.

What if we looked at this system as such and then created an ensemble model to fulfill the triple (quad) aim?

How powerful that would be.

Systems are algorithms.  

Follow up to “The Etiquette of Help”

Superior mesenteric angiogram demonstrating a right colonic bleed (image: Mark Ong, ganyfd.com).

I came across this wonderful piece by Bruce Davis MD on Physician’s Weekly.com about “The Etiquette of Help”. How do you help a colleague emergently in a surgical procedure where things go wrong? As proceduralists, we are always cognizant that this is a possibility.

“Any Surgeon to OR 6 STAT. Any Surgeon to OR 6 STAT.

 

No surgeon wants to hear or respond to a call like that. It means someone is in deep kimchee and needs help right away.”

 

I was called about an acute lower GI bleed with a strongly positive bleeding scan. I practice in a resort area, and an extended family had come here with their patriarch, a man in his late 50’s. (Identifying details changed/withheld – image above is NOT from this case). He had been feeling woozy in the hot sun, went to the men’s room, evacuated a substantial amount of blood, and collapsed.

 

As an interventional radiologist, I was asked to perform an angiogram and embolize the bleeder if possible. The patient was brought to the cath lab; I gained access to the right femoral artery, and then consecutively selected the celiac, superior mesenteric, and inferior mesenteric arteries to evaluate abdominal blood supply. The briskly bleeding vessel was identifiable in the right colonic distribution as an end branch off the ileocolic artery. I guided my catheter, and then threaded a smaller micro-catheter through it, towards the vessel that was bleeding.

 

When you embolize a vessel, you are cutting off blood flow. Close off too large a region, and the bowel will die. Also, collateral vessels in the colon will resupply the bleeding vessel, so you have to be precise.

 

Advancing a microcatheter under fluoroscopy to an end vessel is slow, painstaking work requiring multiple wire exchanges and contrast injections. After one injection, I asked my assisting scrub tech to hand me back the wire.

“Sir, I’m sorry. I dropped the wire on the floor.”

“That’s OK. Just open up another one.”

“Sir, I’m sorry. That was the last one in the hospital.”

“There’s an art to coming in to help a colleague in trouble. Most of us have been in that situation, both giving and receiving help. A scheduled case that goes bad is different from a trauma. In trauma, you expect the worst. Your thinking and expectations are already looking for trouble. In a routine case, trouble is an unwelcome surprise, and even an experienced surgeon may have difficulty shifting from routine to crisis mode.”

 

We inquired how quickly we could get another wire. It would take hours, if we were lucky. The patient was still actively bleeding and requiring increasing fluid and blood support to maintain pressure. After a few creative attempts at solving this problem, it was clear that it was not going to be solved by me, today, in that room. It was time to pull the trigger and make the call the interventionalist dreads – the call to the surgeon.

 

The general surgeon came down to the angio suite and I explained what was happening. I marked the bowel with a dye to assist him in surgery, and sent the patient with him to the OR. The patient was operated on within 30 minutes from leaving my cath lab, and OR time was perhaps 45 minutes. After the procedure was done the surgeon remarked to me that it was one of the easiest resections ever, as he knew exactly where to go from my work.  The surgeon never said anything negative to me, and we had a very good working relationship thereafter.

“The first thing to remember when stepping into a bad situation is that you are the cavalry. You didn’t create the situation, and recriminations and blame have no place in the room. You need to be the calm center to a storm that started before you got involved. Sometimes that’s all that is needed. A fresh perspective, a few focused questions, and the operating surgeon can calm down and get back on track.”

 

I saw the patient the next day, sitting up with a large smile on his face. He explained to me how happy he was that he had come here for vacation, that it was the trip of a lifetime for him, and that he was looking forward to attending his youngest daughter’s wedding later that year. He told me he lived in a rural Midwest area, hours from a very small hospital without an interventionalist, and if this had happened at home, well, who knows?

 

If I had not objectively assessed my inability to finish the case because of equipment issues, well, who knows?

 

If I had been prideful and unwilling to accept my limitations at that time, well, who knows?

 

If I had been more concerned with my reputation or what my partners would think, well, who knows?

 

I sincerely hope that my patient has enjoyed many years of happiness with his family in his bucolic rural Midwestern home. I will never see him again, but I do think of him from time to time.

Further developing the care model – part 2 – definitions

We’ve gone about as far as we can go in theoretical terms with the process model.   The next step is to create a training data set on which to do further experiments and get further insights about combining process and statistics.

Let’s define the variables and the dataset we will be using for this project.

1.  Each encounter with the entire process (all sub-processes from start to finish) requires a unique identifier (UID).   A single patient could go through the process more than once, so a UID is necessary.  This can be as simple as taking their MRN and adding a four digit trailing number identifying how many times through the process.

2.  For each sub-process, time is measured in minutes.  Using start and stop times/dates has some added benefits but is more complex to carry out, as anyone who has ever done so will recognize (non-synced internal clocks providing erroneous time/date data, especially after power outages/surges).

3.  The main times are the pathway times – sub-processes 1-2, 2-3, 3-4, 4-5, and 5-6.
1-2 Reflects the time it takes the physician to order the study and patient transport to come for the patient.
2-3 Reflects transport time from the ED to CT holding.
3-4 Reflects time of nursing evaluation of the patient’s appropriateness for CT imaging.
4-5 Reflects the time bringing the patient into the imaging room and scanning, and sending the study to the PACS system.
5-6 Reflects the time for the radiologist to react to the study being available, interpret the study, and dictate a preliminary result in a format the ED physician can use.

4.  When an interaction occurs along the inner lines we need to account for these in a realistic way.  The boolean variable built into the process will take care of whether the interaction is present or not.  The effect of the off-pathway interaction is to lengthen the time of the main pathway sub-processes.  For example:  Patient arrives in CT holding and nurse identifies a creatinine of 1.9 which needs further information for contrast imaging.  She phones the ED doctor (4 min) and then phones the Radiologist to approve the study based upon that information (2 min).  These phone calls are part of the overall time in subprocess 3-4 for this UID.   To evaluate the time process 3-4 takes without these phone calls, simply subtract the two inner processes.
Or in other words: Process 3-4 (theoretical) = Process 3-4 (actual) - (Process 1-3 + Process 3-5)

5.  This table will represent potential times for each part of the process, chosen at random but with some basis in fact.

 

Process   Mean Time    Variability
1-2       10 minutes   -5 / +30 minutes
2-3       15 minutes   -5 / +10 minutes
3-4       15 minutes   -10 / +15 minutes
4-5       15 minutes   -5 / +30 minutes
5-6       20 minutes   -10 / +40 minutes
1-3       5 minutes    -3 / +10 minutes
1-4       5 minutes    -3 / +10 minutes
1-5       5 minutes    -3 / +10 minutes
3-5       5 minutes    -3 / +10 minutes
3-6       5 minutes    -3 / +10 minutes

Next post, we’ll begin coding this in an R language data frame.