Further Developing the Care Model – Part 3 – Data generation and code

Returning to the care model discussed in parts one and two, we can begin by defining our variables.


Each sub-process variable is named for its starting and ending sub-processes.  We will define the mean time for each sub-process in minutes, and add a component of time variability.  You will note that the variability is skewed – some shorter times exist, but disproportionately longer times are possible.  This matches real life: in a well-run operation, mean times may sit close to the lower limits – as these represent physical (occurring in the real world) processes, there may simply be a physical constraint on how quickly anything can be done!  However, problems, complications, and miscommunications may extend that time well beyond what we would all like it to be – for those of us who have had real-world hospital experience, does this not sound familiar?

Because of this, we will choose a gamma distribution to model our processes:

                              f(t;\kappa,\theta) = \frac{\theta^{\kappa}\, t^{\kappa-1} e^{-\theta t}}{\Gamma(\kappa)}, \qquad \Gamma(\kappa) = \int_{0}^{\infty} u^{\kappa-1}e^{-u}\,du

The gamma distribution is useful because it handles continuous, strictly positive time data, and we can skew it through its shape parameter kappa (\kappa) and rate parameter theta (\theta).  We will use the R function rgamma(n, \kappa, \theta) to generate a base distribution of positive values, and then use a multiplier (scale) and an offset (shift) to position each distribution along the X-axis.  The gamma distribution also respects an absolute lower time limit – I consider this a feature, not a flaw.
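As a quick sanity check on how the multiplier and offset work (using the first sub-process's parameter values from the code below), the implied mean and minimum time can be computed directly:

## Sketch: implied mean and hard lower bound for the first sub-process,
## using shape kappa = 1.9, rate theta = 3.8, multiplier s = 10, offset o = 4.8
## (values taken from the code below).
kappa <- 1.9; theta <- 3.8; s <- 10; o <- 4.8
mean_minutes <- (kappa / theta) * s + o  # gamma mean = shape/rate, then scale and shift
min_minutes  <- o                        # gamma support starts at zero, so the offset is the floor
c(mean = mean_minutes, minimum = min_minutes)  # about 9.8 and 4.8 minutes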

It is generally recognized that a probability density (kernel density) plot, as opposed to a histogram, is more accurate and less prone to distortions related to bin choice and the number of samples (N).  A plot of these distributions looks like this:

The R code to generate this distribution, graph, and our initial values dataframe is as follows:

seed <- 3559
set.seed(seed, kind = NULL, normal.kind = NULL)
n <- 16384  ## 2^14 samples (a power of 2 -- see programming note 2 below)
## Initialize the per-process parameters: shape (k), rate (theta),
## multiplier (s), and offset (o) for the nine sub-processes
k     <- c(1.9, 1.9, 6, 1.9, 3.0, 3.0, 3.0, 3.0, 3.0)
theta <- c(3.8, 3.8, 3.0, 3.8, 3.0, 5.0, 5.0, 5.0, 5.0)
s     <- c(10, 10, 5, 10, 10, 5, 5, 5, 5)
o     <- c(4.8, 10, 5, 5.2, 10, 1.6, 1.8, 2, 2.2)
## Simulated sub-process times in minutes ("prosess" is intentional -- see note 3)
prosess1 <- (rgamma(n, k[1], theta[1]) * s[1]) + o[1]
prosess2 <- (rgamma(n, k[2], theta[2]) * s[2]) + o[2]
prosess3 <- (rgamma(n, k[3], theta[3]) * s[3]) + o[3]
prosess4 <- (rgamma(n, k[4], theta[4]) * s[4]) + o[4]
prosess5 <- (rgamma(n, k[5], theta[5]) * s[5]) + o[5]
prosess6 <- (rgamma(n, k[6], theta[6]) * s[6]) + o[6]
prosess7 <- (rgamma(n, k[7], theta[7]) * s[7]) + o[7]
prosess8 <- (rgamma(n, k[8], theta[8]) * s[8]) + o[8]
prosess9 <- (rgamma(n, k[9], theta[9]) * s[9]) + o[9]
## Kernel density estimates for each sub-process
d1 <- density(prosess1, n = 16384)
d2 <- density(prosess2, n = 16384)
d3 <- density(prosess3, n = 16384)
d4 <- density(prosess4, n = 16384)
d5 <- density(prosess5, n = 16384)
d6 <- density(prosess6, n = 16384)
d7 <- density(prosess7, n = 16384)
d8 <- density(prosess8, n = 16384)
d9 <- density(prosess9, n = 16384)
## Set up the axes, then overlay each density curve
plot.new()
plot(d9, col = "brown", type = "n", main = "Probability Densities",
     xlab = "Process Time in minutes", ylab = "Probability",
     xlim = c(0, 40), ylim = c(0, 0.26))
legend("topright",
       c("process 1", "process 2", "process 3", "process 4", "process 5",
         "process 6", "process 7", "process 8", "process 9"),
       fill = c("brown", "red", "blue", "green", "orange", "purple",
                "chartreuse", "darkgreen", "pink"))
lines(d1, col = "brown")
lines(d2, col = "red")
lines(d3, col = "blue")
lines(d4, col = "green")
lines(d5, col = "orange")
lines(d6, col = "purple")
lines(d7, col = "chartreuse")
lines(d8, col = "darkgreen")
lines(d9, col = "pink")
## Collect the density time grids (x), density values (y), and the raw draws
ptime <- c(d1[1], d2[1], d3[1], d4[1], d5[1], d6[1], d7[1], d8[1], d9[1])
pdens <- c(d1[2], d2[2], d3[2], d4[2], d5[2], d6[2], d7[2], d8[2], d9[2])
ptotal <- data.frame(prosess1, prosess2, prosess3, prosess4, prosess5,
                     prosess6, prosess7, prosess8, prosess9)
names(ptime)  <- c("ptime1", "ptime2", "ptime3", "ptime4", "ptime5",
                   "ptime6", "ptime7", "ptime8", "ptime9")
names(pdens)  <- c("pdens1", "pdens2", "pdens3", "pdens4", "pdens5",
                   "pdens6", "pdens7", "pdens8", "pdens9")
names(ptotal) <- c("pgamma1", "pgamma2", "pgamma3", "pgamma4", "pgamma5",
                   "pgamma6", "pgamma7", "pgamma8", "pgamma9")
pall <- data.frame(ptotal, ptime, pdens)

 

The relevant term throughout is rgamma(n, \kappa, \theta).  We'll use these distributions in our dataset.

One last concept needs to be discussed: the probability of each sub-process occurring.  Each sub-process has a percentage chance of happening – some are a 100% certainty, others occur in perhaps 5% of cases.  This reflects the real-world reality of what happens – once a test is ordered, there's a 100% certainty of the patient showing up for it, but not 100% of those patients will actually get the test.  Some cancel due to contraindications, others can't tolerate it, others refuse, etc.  The percentages that are <100% reflect those probabilities and essentially act as a probabilistic (rather than strictly binary) switch applied to the beginning of the term that describes that sub-process.  We're evolving first toward a simple generalized linear equation similar to that put forward in this post.  I think it's going to look somewhat like this:

[Figure: proposed generalized linear equation.]  But we'll see how well this model fares as we develop it and compare it to some others.  The x terms will likely represent the probabilities between 0 and 1.0 (100%).
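As a minimal sketch of how such an occurrence probability might enter the data generation (the 30% figure and the use of rbinom() below are illustrative assumptions, not parameters of the model):

## Sketch: gate a sub-process time by its probability of occurring.
## p_occur = 0.30 is an illustrative value only, not one of the model's parameters.
p_occur <- 0.30
occurs  <- rbinom(n, size = 1, prob = p_occur)  # 1 if the sub-process happens, 0 if not
gated_prosess3 <- occurs * prosess3             # contributes zero time when it does not occur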

For an EMR-based approach, we would assign a UID (medical record number plus 5-6 extra digits, helpful for encounter numbers).  We would 'disguise' the UID by adding or subtracting a constant known only to us and then performing a mathematical operation on it.  However, for our purposes here, we do not need to do that.
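Purely as an illustration of the idea (the constant and the operation below are made up for the example and are not a recommendation for real de-identification):

## Sketch: a toy 'disguised' UID.  The added constant (48213) and the multiplication
## are illustrative placeholders; real de-identification should follow institutional policy.
mrn       <- 1234567                                  # example medical record number
encounter <- 42                                       # example encounter number
uid       <- as.numeric(paste0(mrn, sprintf("%05d", encounter)))
disguised <- (uid + 48213) * 7                        # secret constant, then a simple operation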

We'll head on to our analysis in part 4.

 

Programming notes in R:

1.  I experimented with for loops and different configurations of apply with this, and after a few weeks of experimentation decided I really can't improve upon the repetitive but simple code above.  The issue is that the density function returns a list of seven components, so it is not as easy as defining a matrix, and the shape of the resulting data frame changes.  I'm sure there is a way around this, but for the purposes of this illustration it is beyond our needs; one possible lapply-based sketch is shown after these notes.  Email me at contact@n2value.com if you have working code that does it better!

2.  For the density function, n is rounded up to a power of 2 (for the FFT), so by choosing 16384 (2^14) samples we keep the number of density points equal to the number of samples, which makes the resulting data frame symmetric.

3.  In variable names above, prosess is an intentional misspelling.
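For what it's worth, here is one possible lapply-based sketch of the same workflow, offered in the spirit of note 1 above; it assumes the k, theta, s, o, and n values defined earlier and is untested against the original output:

## Sketch: build the same data frame with lapply instead of repeated assignments.
proc_list <- lapply(1:9, function(i) rgamma(n, k[i], theta[i]) * s[i] + o[i])
dens_list <- lapply(proc_list, density, n = 16384)
ptotal <- as.data.frame(setNames(proc_list, paste0("pgamma", 1:9)))
ptime  <- as.data.frame(setNames(lapply(dens_list, `[[`, "x"), paste0("ptime", 1:9)))
pdens  <- as.data.frame(setNames(lapply(dens_list, `[[`, "y"), paste0("pdens", 1:9)))
pall   <- data.frame(ptotal, ptime, pdens)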

 

Black Swans, Antifragility, Six Sigma and Healthcare Operations – What medicine can learn from Wall St Part 7


I am an admirer of Nassim Nicholas Taleb – a mercurial options trader who has evolved into a philosopher-mathematician.  The focus of his work is on the effects of randomness, how we sometimes mistake randomness for predictable change, and how we fail to prepare for randomness by excluding outliers in statistics and decision making.  These “black swans” arise unpredictably and cause great harm, amplified by systems we have put into place which are ‘fragile’.

Perhaps the best example of a black swan event is the period of financial uncertainty we have lived through during the last decade.  A quick recap: the 2008 global financial crisis was caused by a bubble in US real estate assets.  This in turn stemmed from legislation mandating lower lending standards (subprime, Alt-A) and facilitating securitization of these loans, enabled by the proverbial passing of the ‘hot potato’.  These mortgages were packaged into derivatives named collateralized debt obligations (CDOs), using statistical models to gauge default risks in the loans.  Loans more likely to default were blended with loans less likely to default, yielding an overall package that was statistically unlikely to default.  However, as owners of these securities found out, the statistical models that made them appear unlikely to default were based on a short sample period with low defaults.  The models indicated that the financial crisis was a 25-sigma (standard deviation) event that should only happen once in:

[Figure: a number with well over a hundred zeroes] years (c.f. Wolfram Alpha).

Of course, the default events happened in the first five years of their existence, proving that calculation woefully inadequate.
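A quick back-of-the-envelope check in R, assuming the Gaussian model the quants relied on and treating the 25-sigma move as a daily event (both assumptions are illustrative, and the Gaussian one is exactly what is being criticized here):

## Sketch: how often a 25-sigma daily move 'should' occur under a Gaussian model.
p_daily <- pnorm(-25)           # probability of a -25 standard deviation daily move
years   <- 1 / p_daily / 365    # expected waiting time, in years
years                           # on the order of 1e135 years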

The problem with major black swans is that they are sufficiently rare and impactful that it is difficult to plan for them: global pandemics, the Fukushima reactor accident, and the like.  By designing robust systems that expect perturbations, you can mitigate their effects when they occur and shake off the more frequent minor black (grey) swans – perturbations that occur occasionally (but more often than you expect), 5-10 sigma events that are disruptive but not devastating (like local disease outbreaks or power outages).

Taleb classifies how things react to randomness into three categories: Fragile, Robust, and Anti-Fragile.  While the interested reader would benefit from the original work, here is a brief summary:

1.  The Fragile consists of things that hate randomness, or break from it.  Think about tightly controlled processes, just-in-time delivery, and tightly scheduled areas like the OR when cases are delayed or extended.
2.  The Robust consists of things that resist randomness and try not to change.  Think about warehousing inventories, overstaffing to mitigate surges in demand, checklists and standard order sets.
3.  The Anti-Fragile consists of things that love randomness and improve with serendipity.  Think about cross-trained floater employees, serendipitous CEO-employee hallway meetings, and lunchroom physician-physician interactions where the patient benefits.

In thinking about Fragile/Robust/Anti-Fragile, be cautious about injecting bias into the meaning.  After all, we tend to avoid breakable objects, preferring things that are hardy or robust.  So there is a natural tendency to consider fragility ‘bad’, robustness ‘good’, and anti-fragility therefore ‘great!’  Not true – at least not when we approach these categories from an operational or administrative viewpoint.

Fragile processes and systems are those prone to breaking.  They hate variation and randomness and respond well to six-sigma analyses and productivity/quality improvement.  I believe that fragile systems and processes are those that will benefit the most from automation and technology: removing human input and interference decreases cycle time and defects.  While the fragile may be prone to breaking, that is not necessarily bad.  Think of the new entrepreneur's mantra – ‘fail fast’.  Agile/Scrum development, most common in software (but perhaps useful in healthcare?), relies on rapid iteration to adapt to a moving target.  Fragile systems and processes cannot be avoided – instead they should be highly optimized with the least human involvement.  They need careful monitoring (daily? hourly?) to detect failure, at which point a ready team can swoop in, fix whatever has caused the breakage, re-optimize if necessary, and restore the system to functionality.  If a fragile process breaks too frequently and causes significant disruption, it probably should be made into a Robust one.

Robust systems and processes are those that resist failure due to redundancy and relative waste.  These probably are your ‘mission critical’ ones where some variation in the input is expected, but there is a need to produce a standardized output.  From time to time your ER is overcome by more patients than available beds, so you create a second holding area for less-acute cases or patients who are waiting transfers/tests.  This keeps your ER from shutting down.  While it can be wasteful to run this area when the ER is at half-capacity, the waste is tolerable vs. the lost revenue and reputation of patients leaving your ER for your competitor’s ER or the litigation cost of a patient expiring in the ER after waiting 8 hours.    The redundant patient histories of physicians, nurses & medical students serve a similar purpose – increasing diagnostic accuracy.  Only when additional critical information is volunteered to one but not the other is it a useful practice.  Attempting to tightly manage robust processes may either be a waste of time, or turn a robust process into a fragile one by depriving it of sufficient resilience – essentially creating a bottleneck.  I suspect that robust processes can be optimized to the first or second sigma – but no more.

Anti-fragile processes and systems benefit from randomness, serendipity, and variability.  I believe that many of these are human-centric.  The automated process that breaks is fragile, but the team that swoops in to repair it – they’re anti-fragile.  The CEO wandering the halls to speak to his or her front-line employees four or five levels down the organizational tree for information – anti-fragile.  Clinicians that practice ‘high-touch’ medicine result in good feelings towards the hospital and the unexpected high-upside multi-million dollar bequest of a grateful donor 20 years later – that’s very anti-fragile.  It is important to consider that while anti-fragile elements can exist at any level, I suspect that more of them are present at higher-level executive and professional roles in the healthcare delivery environment.  It should be considered that automating or tightly managing anti-fragile systems and processes will likely make them LESS productive and efficient.  Would the bequest have happened if that physician was tasked and bonused to spend only 5.5 minutes per patient encounter?  Six sigma management here will cause the opposite of the desired results.

I think a lot more can be written on this subject, particularly from an operational standpoint.  Systems and processes in healthcare can be labeled fragile, robust, or anti-fragile as defined above.  Fragile components should have human input reduced to the bare minimum possible, and then be optimized to the hilt.  Expect them to break – but that's OK – have a plan and a team ready for dealing with it, fix it fast, and re-optimize until the next failure.  Robust systems should undergo some optimization, have some resilience or redundancy built in, and then be left the heck alone!  Anti-fragile systems should focus on people, and great caution should be used not only in optimization but also in the metrics used to manage these systems – lest you take an anti-fragile process, force it into a fragile paradigm, and cause failure of that system and process.  It is the medical equivalent of forcing a square peg into a round hole.  I suspect that when an anti-fragile process fails, this is why.

Skeptical about competing algorithms?

Someone commented to me that the concept of competing algorithms was very science-fictiony and hard to take at face value outside of the specific application of high frequency trading on Wall Street.  I can understand how that could be argued, at first glance.

However, consider that systems are algorithms (you may want to re-read Part 6 of the What Medicine can learn from Wall Street series).  We have entire systems (in some cases, departments) set up in medicine to handle the process of insurance billing and accounts receivable.  Just when our billing departments seem to get very good at running claims, the insurers implement a new system or rule set which increases our denials.  Our billers then adapt to that change to return to their earlier baseline of low denials.

Are you still sure that there are no competing algorithms in healthcare?  They are hard-coded in people and processes not soft-coded in algorithms & software.

If you are still not sure, consider legacy retailers who are selling commodity goods.  If everyone is selling the same item at the same price, you can only beat your competition by successful internal processes that give you increased profitability over your competitors, allowing you to out-compete them.  You win because you have better algorithms.

Systems are algorithms.  And algorithms compete.

What medicine can learn from Wall Street part 6 – Systems are algorithms

Systems trading on Wall Street in the early days (pre-1980s) was done by hand or by laborious computation.  Systems traded off indicators – hundreds of indicators exist, but most are either trend or anti-trend.  Trending indicators range from the ubiquitous and time-honored moving average to the MACD, etc.  Anti-trend indicators tend to be based on oscillators such as the relative strength index (RSI).  In a trending market, the moving average will do well, but it will get chopped around in a non-trending market with frequent wrong trades.  The oscillator solves some of this problem, but in a strongly trending market it tends to underperform and miss the trend.  Many combinations of trend and anti-trend systems were tried, with little success in developing a consistent model that could handle changing market conditions from trend to anti-trend (consolidation) and back.
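For readers who have not seen a trend indicator in code, here is a minimal sketch of a moving-average crossover signal on made-up random-walk prices (the window lengths and the price series are illustrative assumptions, not a trading recommendation):

## Sketch: a simple trend-following signal (moving-average crossover) in base R.
## Prices are a simulated random walk, purely for illustration.
set.seed(1)
price <- 100 + cumsum(rnorm(250))
sma <- function(x, w) as.numeric(stats::filter(x, rep(1 / w, w), sides = 1))  # trailing moving average
signal <- ifelse(sma(price, 10) > sma(price, 50), "long", "flat")             # trend vs. no-trend stance
tail(signal)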

The shift towards statistical models in the 2000s (see Evidence-Based Technical Analysis by Aronson) provided a different way to analyze the markets, with some elements of both systems.  While I would argue that mean reversion has components of an anti-trend system, I'm sure I could find someone to disagree with me.  The salient point is that it is a third method of evaluation which is neither purely trend nor anti-trend.

Finally, the machine learning algorithms that have recently become popular give a fourth method of evaluating the markets.  This method is neither trend, anti-trend, nor purely statistical (in the traditional sense), so it provides additional information and diversification.

Combining these models through ensembling might have some very interesting results.  (It also might create a severely overfitted model if not done right).

Sidebar:  I believe that the market trades in different ways at different times.  It changes from a technical market, where predictive price indicators are accurate, to a fundamental market, driven by economic data and conditions, to a psychologic market, where ‘random’ current events and investor sentiment are the most important aspects.  Trending systems tend to work well in fundamental markets, anti-trend systems work well in technical or psychologic markets, statistical (mean reversion) systems tend to work well in technical or fundamental markets, and I suspect machine learning might be the key to cracking the psychologic market.  What is an example of a psychologic market?  This – the S&P 500 in the fall of 2008 when the financial crisis hit its peak and we were all wondering if capitalism would survive.

40% drop in the S&P 500 from August to November during the 2008 financial crisis.

By the way, this is why you pay a human to manage your money, instead of just turning it over to a computer.  At least for now.

So why am I bringing this up?  I’m delving more deeply into Queuing & operations theory these days, wondering if it would be helpful in developing an ensemble model – part supervised learning(statistics), part unsupervised (machine) learning, part Queue Theory algorithms.  Because of this, I’m putting this project on hold.  But it did make me think about the algorithms involved, and I had an aha! moment that is probably nothing new to Industrial Engineering types or Operations folks who are also coders.

Algorithms are software-coded rule sets – think of an ensemble model composed of three separate models: a linear model (supervised learning), a machine learning model (unsupervised learning), and a rule-based model (queueing theory).  However, the systems we put in place in physical space are really just the same thing.  The policies, procedures, and operational rule sets that exist in our workplace (e.g. the hospital) are hard-coded algorithms made up of flesh and blood, equipment and architecture, operating in an analogue of computer memory – the wards and departments of the hospital.

If we only optimize for one value (profit, throughput, quality of care, whatever), we may miss the opportunity to create a more robust and stable model.  What if we ensembled our workspaces to optimize for all parameters?

The physical systems we have in place, which stem from policies, procedures, management decisions, and workspace and workflow design, are a real-life representation of a complex algorithm we have created, or more accurately have allowed to grow largely organically, to serve the function of delivering care in the hospital setting.

What if we looked at this system as such and then created an ensemble model to fulfill the triple (quad) aim?

How powerful that would be.

Systems are algorithms.  

Quick Post on Systems vs. Statistical Learning on large datasets

"Bp-6-node-network" by JamesQueue - Own work. Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons - https://commons.wikimedia.org/wiki/File:Bp-6-node-network.jpg#mediaviewer/File:Bp-6-node-network.jpgThe other day I attended a Webinar on Big Data vs. Systems Theory hosted by the MIT Systems design & management group which offers free, and usually very good, webinars.  I recommend them to anyone interested in data driven management using systems and processes.  The specific lecture was “Move Over, Big Data! – How Small, Simple Models can yield Big insights” given by Dr. Larson.  The lecture was very good – it discussed some of the pitfalls we might be likely to fall into with large data sets, and how algorithmic evaluation can alternatively get us to the same place, but in a different way.

Great points raised within the lecture were:
  • Always consider the average as a distribution (i.e. with a confidence interval), and compare it to its median to avoid some of the pitfalls of averages.
  • Outliers are easy to dismiss as noncontributory – but when your outlier causes significant effects on your function (i.e. ‘black swans’), you'd better include it!
  • Averages experienced by one population may be different than averages experienced by another (a bit more sophisticated than the N=1 concept).

There was a neat discussion of queues, with Little's Law cited: L = λW, where L is the time-averaged number of customers in the system, λ is the average arrival rate, and W is the mean time a customer spends in the system.  M/M/k queue notation was cited, and Dr. Larson's Queue Inference Engine (using a Poisson arrival distribution) was reviewed.  You can find some more information about the Queue Inference Engine here.  The point was that small models are an alternative means of sussing out big data, rather than simply using statistical regression.  I'll admit to not knowing much about queueing theory and Markov chains, but I can see some interesting applications in combination with large datasets, much along the lines of an ensemble model, but including queueing theory as part of the ensemble.  Unfortunately, as Dr. Larson noted, much like the linear models we have been approaching, serial or networked queues require difficult math with many terms.  The question yet to be answered is: can we provide the best of both worlds?
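As a quick worked illustration of Little's Law (the arrival rate and wait time below are made-up numbers, not figures from the webinar):

## Sketch: Little's Law, L = lambda * W.
lambda <- 12      # illustrative: 12 patient arrivals per hour
W      <- 0.75    # illustrative: mean time in the system, in hours (45 minutes)
L      <- lambda * W
L                 # on average, 9 patients in the system at any moment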

Further developing the care model – theoretical to applied – part 1

Consider an adult patient who has presented to the ER for abdominal pain.  The ER doctor suspects appendicitis, so the next step is a CT scan to “r/o appendicitis.”  There is an assumption that the patient has already had labs drawn and run upon presentation to the ER (probably as a rapid test).  [Figure: ER-to-CT process map.]

First, the ER doctor has to decide to order the CT study, and write the order.  We’ll assume a modern CPOE system to take out the intervening steps of having the nurse pick up the order, sign off, and then give it to the HUC to call the order to the CT technologist.  We’ll also assume that the CPOE system automatically contacts patient transport and lets them know that there is a patient ready for transport.  Depending on your institution’s HIMSS level, these may be a lot of assumptions!

Second, patient transport needs to pick up the patient and bring them to the CT holding area (from the hallway to a dedicated room).

Third, the nurse (or a second technologist / tech assistant) will assess this patient and make sure that they are a proper candidate for the procedure.  This involves taking a focused history, making sure there is no renal compromise that would be made worse by the low osmolar contrast (LOCA) used in a CT scan, ensuring that IV access is satisfactory for the LOCA injection (or establishing it if it is not), and ensuring that the patient does not have a contrast allergy that would be a contraindication to the study.

Fourth, the CT technologist gets the patient from holding, places them on the CT gantry, hooks up the contrast, and protocols the patient, and then scans.  Once the scan finishes, the patient returns to holding, and the study posts to the PACS system for interpretation by the M.D. radiologist.

Fifth, the radiologist physician sees the study pop up on their PACS (picture archiving and communication system), interprets the study, generates a report (usually by dictating into voice recognition software these days), proofreads it, and then approves the report.  If there is an urgent communication issue, the radiologist will personally telephone the ER physician; if not, ancillary staff on both sides usually notice the report is completed and alert the ER physician to review it when he or she has time.

Sixth, the ER physician sees the radiologist’s report.  She or he then takes all the information on the patient, including that report, laboratory values, physical examination, patient history, and outside medical records and synthesizes that information to make a most likely diagnosis and exclude other diagnoses.  It is entirely possible that the patient may go on to additional imaging, and the process can repeat.

In comparison to the prior model where all interactions were considered, we can use a bit of common sense to get the number of interacting terms down.  The main rate limiting step is the ordering ER physician – the process initiates with that physician’s decision to get CT imaging.  It is possible for that person to exceed capacity.  Also, there are unexpected events which may require immediate discussion and interaction between members of the team – ER physician to either radiology physician, radiology nurse, or radiology technologist.  Note that the radiology physician and the radiology nurse can both interact with the ER physician both before (step 1) and after (step 6) the study, because of the nature of patient care.

An astute observer may note that there is no transport component for the patient back to the ER from radiology holding.  This is because the patient has already been assessed by the ER physician, and further testing, disposition, etc. are pending the information generated by the CT scan.  While the patient certainly needs care, where that care is given during the assessment process (for a stable patient) is not critical.  It could be that the patient goes from CT holding to dialysis, or to another testing area, etc.  Usually the next ordered test, consult, or disposition hinges on the CT results and will be entered via CPOE, where the patient and ER physician need not be in the same physical space to execute.

From practical experience, ER physician – CT technologist interactions are most common and usually one-sided.  (please take this patient first, I want the study done this way, etc…)  ER physician – nurse interactions are uncommon and usually unidirectional (nurse to physician – this patient is in renal failure, we can’t use LOCA, etc…).  ER Physician and radiology physician interactions are even less common but bidirectional.  (‘This patient is confounding – how can we figure this out?’ vs. ‘your patient has a ruptured aortic aneurysm and will die immediately without surgical interaction!’)

Next post we will modify our generalized linear model and begin assembling a dataset to test our assumptions.

Developing a simple care delivery model further – dependent interactions

Let's go back to our simple generalized linear model of care delivery from this post: [Figure: simplified ER process.]

With its resultant Generalized Linear Function:

[Figure: generalized linear function for the simplified ER process.]

This model, elegant in its simplicity, does not account for the inter-dependencies in care delivery.  A more true-to-life revised model is:

[Figure: ER process with interdependencies.]  Here there are options for back-and-forth pathways depending on new clinical information, denoted in red.

A linear model that takes into account these inter-dependencies would look like this:

[Figure: revised generalized linear function including the interaction terms.]
Including these interactions, we go from 4 terms to 8 – and this is an overly simplified model!  By drilling down into an aspect of the healthcare delivery process in a typical PI/Six Sigma environment, it's not hard to imagine identifying well over four points of contact/patient interaction, each with its own set of interdependencies.  Imagine a process with 12-15 sub-processes, most of those sub-processes each having on average six interdependencies, and then the possibility of multiple interdependencies among the processes.  This doesn't even account for an EMR dataset where the number of columns could be 350 or more.  Quickly, your ‘simple’ linear model is looking not so simple, with easily over 100 terms in the equation, which also makes the model difficult to solve.  Not to despair!  There are ways to take this formula with a high number of terms and create a more manageable model as a reasonable approximation.  The mapping and initial modeling of the care process is of greatest utility from an operational standpoint, to allow for understanding and to guide interpretation of the ultimate data analysis.

I am a believer that statistical computational analysis can identify the terms which are most important for the model.  By inference, these inputs will have the most effect upon outcome, and can guide management to where precious effort, resources, and time should be guided to maximize outcomes.
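To make the idea concrete, here is a minimal sketch of how such a term-importance analysis might look in R.  Everything in it – the variable names, the simulated data, and the choice of a single interaction – is an illustrative assumption, not the care-model dataset itself:

## Sketch: flag influential terms in a linear model with an interaction.
## All data here are simulated placeholders.
set.seed(42)
reg  <- rgamma(500, 2, 0.5)   # e.g., a registration-to-triage time
test <- rgamma(500, 3, 0.4)   # e.g., a testing time
los  <- 5 + 1.2 * reg + 0.8 * test + 0.3 * reg * test + rnorm(500, 0, 2)
fit  <- lm(los ~ reg * test)  # main effects plus the reg:test interaction
summary(fit)                  # large, significant coefficients flag the important terms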

 

What medicine can learn from Wall Street – Part 3 – The dynamics of time

This is a somewhat challenging post, with cross-discipline correlations and some unfamiliar terminology and concepts.  There is a payoff!

You can recap part 1 and part 2 here. 

The crux of this discussion is time.  Understanding the progression towards shorter and shorter time frames on Wall Street enables us to draw parallels and differences in medical care delivery, particularly pertaining to processes and data analytics.  This is relevant because some vendors tout real-time capabilities in health care data analysis, which are possibly not as useful as one thinks.

In trading, the best profit is a riskless one: a profit that occurs by simply being present, is reliable and reproducible, and exposes the trader to no risk.  Meet arbitrage.  Years ago, it was possible for the same security to trade at different prices on different exchanges, as there was no central marketplace.  A network of traders could execute a buy of a stock for $10 in New York and then sell those same shares on the Los Angeles exchange for $11.  If one imagines a 1000-share transaction, a $1 profit per share yields $1000.  It was made by the head trader holding up two phones to his head and saying ‘buy’ into one and ’sell’ into the other.*  These relationships could be exploited over longer periods of time and represented an information deficit.  However, as more traders learned of them, the opportunities became harder to find as greater numbers pursued them.  This price arbitrage kept prices reasonably similar before centralized, computerized exchanges and data feeds.

As information flow increased, organizations became larger and more effective, and time frames for executing profitable arbitrages decreased.  This led traders to develop simple predictive algorithms, like Ed Seykota did, detailed in part 1.  New instruments re-opened the profit possibility for a window of time, which eventually closed.  The development of futures, options, indexes, all the way to closed exchanges (ICE, etc…) created opportunities for profit which eventually became crowded.  Since the actual arbitrages were mathematically complex (futures have an implied interest rate, options require a solution of multiple partial differential equations, and indexes require summing instantaneously hundreds of separate securities) a computational model was necessary as no individual could compute the required elements quickly enough to profit reliably.  With this realization, it was only a matter of time before automated trading (AT) happened, and evolved into high-frequency trading with its competing algorithms operating without human oversight on millisecond timeframes.

The journey from daily prices, to ever shorter intervals over the trading day, to millisecond prices was driven by the availability of good data and reliable computing that could be counted on to act on those flash prices.  A game of location (geographical arbitrage) turned into a game of speed (competitive pressure on geographical arbitrage), then into a game of predictive analytics (proprietary trading and trend following), then into a more complex game of predictive analytics (statistical arbitrage), and was ultimately turned back into a game of speed and location (high-frequency trading).

The following chart shows a probability analysis of an ATM straddle position on IBM.  This is an options position; it is not important to understand the instrument, only to understand what the image shows.  For IBM, the expected price range at one standard deviation (+/- 1 s.d.) is plotted below.  As time (days) increases along the X-axis, the expected range widens and becomes less accurate.

credit: TD Ameritrade
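As a rough illustration of why that band widens, here is a sketch assuming a simple Gaussian, square-root-of-time volatility model with made-up numbers (this is not the methodology behind the chart above):

## Sketch: +/- 1 s.d. expected price range widening with time,
## under a simple sqrt-of-time volatility assumption (illustrative numbers only).
S0    <- 160         # illustrative starting price
sigma <- 0.25        # illustrative annualized volatility (25%)
days  <- 1:30
range_1sd <- S0 * sigma * sqrt(days / 252)          # widens as the square root of time
head(cbind(day = days, lower = S0 - range_1sd, upper = S0 + range_1sd))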

Is there a similar corollary for health care?

Yes, but.

First, recognize the distinction between the simpler price-time data which exists in the markets, vs the rich, complex multivariate data in healthcare.  

Second, assuming a random walk hypothesis, security price movement is unpredictable, and at best can only be calculated so that the next price will fall in a range defined by a number of standard deviations according to one's model, as seen in the picture above.  You cannot make this argument in healthcare.  This is because the patient's disease is not a random walk.  Disease follows prescribed pathways and natural histories, which allow us to make diagnoses and implement treatment options.

It is instructive to consider clinical decision support tools.  Please note that these tools are not a substitute for expert medical advice (and my mentioning them does not imply endorsement).  See Esagil and Diagnosis Pro.  If you enter “abdominal pain” into either of these tools, you'll get back a list of 23 differentials (woefully incomplete) in Esagil and 739 differentials (more complete, but too many to be of help) in Diagnosis Pro.  But this is a typical presentation to a physician – a patient complains of “abdominal pain” and the differential must be narrowed.

At the onset, there is a wide differential diagnosis.  The possibility that the pain is a red herring and the patient really has some other, unsuspected, disease must be considered.  While there are a good number of diseases with a pathognomonic presentation, uncommon presentations of common diseases are more frequent than common presentations of rare diseases.

In comparison to the trading analogy above, where expected price movement is generally restricted to a quantifiable range based on the observable statistics of the security over a period of time, for a de novo presentation of a patient, this could be anything, and the range of possibilities is quite large.

Take, for example, a patient that presents to the ER complaining “I don’t feel well.”  When you question them, they tell you that they are having severe chest pain that started an hour and a half ago.  That puts you into the acute chest pain diagnostic tree.

[Figure: the reverse (narrowing) decision tree for acute chest pain.]

With acute chest pain, there is a list of differentials that needs to be excluded (or ‘ruled out’), some quite serious.  A thorough history and physical is done, taking 10-30 minutes.  Initial labs are ordered (5-30 minutes if done as a rapid in-ER test, longer if sent to the main laboratory), an EKG and CXR (chest X-ray) are done for their speed (10 minutes each), and the patient is sent to CT for a CTA (CT angiogram) to rule out a PE (pulmonary embolism).  This is a useful test because it will not only show the presence or absence of a clot, but will also allow a look at the lungs to exclude pneumonias, effusions, dissections, and malignancies.  Estimate that the wait time for the CTA is at least 30 minutes.

The ER doctor then reviews the results (5 minutes): troponins are negative, excluding a heart attack (MI); the CT scan eliminates PE, pneumonia, dissection, pneumothorax, effusion, and malignancy in the chest; the chest X-ray excludes fracture; and the normal EKG excludes arrhythmia, gross valvular disease, and pericarditis.  The main diagnoses left are GERD, pleurisy, referred pain, and anxiety.  The ER doctor goes back to the patient (10 minutes): the patient doesn't appear anxious and reports no stressors, so a panic attack is unlikely.  No history of reflux, so GERD is unlikely.  No abdominal pain component, and labs were negative, so abdominal pathologies are unlikely.  Point tenderness is present on the physical exam at the costochondral junction – and the patient is diagnosed with costochondritis.  The patient is then discharged with a prescription for pain control (30 minutes).
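Adding up the rough estimates quoted above (taking midpoints of the ranges and treating the steps as strictly sequential, both of which are simplifying assumptions) gives a feel for the total time on the clock:

## Sketch: rough door-to-discharge time from the estimates quoted above,
## using range midpoints and assuming strictly sequential steps (real work overlaps).
steps <- c(history_physical = 20, labs = 15, ekg = 10, cxr = 10,
           cta_wait = 30, md_review = 5, reassessment = 10, discharge = 30)
sum(steps)   # about 130 minutes, i.e. a bit over two hours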

Ok, if you’ve stayed with me, here’s the payoff.

As we proceed down the decision tree, the number of possibilities narrows in medicine.

In comparison, with price-time data the range of potential prices increases as you proceed forward in time.

So, in healthcare the potential diagnosis narrows as you proceed down the x-axis of time.  Therefore, time is both one’s friend and enemy – friend as it provides for diagnostic and therapeutic interventions which establish the patient’s disease process; enemy as payment models in medicine favor making that diagnostic and treatment process as quick as possible (when a hospital inpatient).

We'll continue this in part IV and compare its relevance to portfolio trading.

*As an aside, the phones in trading rooms had a switch on the handheld receiver – you would push it in to talk.  That way, the other party would not know that you were conducting an arbitrage!  They were often slammed down and broken by angry traders – one of the manager's jobs was to keep a supply of extras in his desk, and they were not hard-wired but plugged in by a jack expressly for that purpose!

**Yes, for the statisticians reading this, I know that there is an implication of a Gaussian distribution that may not be proven.  I would suspect the successful houses have adjusted for this and instituted non-parametric models as well.  Again, this is not a trading, medical, or financial advice blog.

 

Processes and Modeling – a quick observation

Is it not somewhat obvious to the folks reading this blog that this:

[Figure: simplified ER process map.]

 

Is the same thing as this:

[Figure: the corresponding generalized linear model.]  While I might be skewered for oversimplifying the process (and it is oversimplified – greatly), the fundamental principles are the same.  LOS, for those not familiar with the term, is Length of Stay, also known as Turnaround Time (the former is usually measured in days, the latter in minutes or hours).

Out of curiosity, is anyone reading this blog willing to admit they are using something similar, or have tried to use something similar and failed?  I would love to know people’s thoughts on this.

Productivity in medicine – what’s real and what’s fake?

Let’s think about provider productivity.  As an armchair economist, I apologize to any PhD economists who feel I am oversimplifying things.

Why is productivity good?  It has enabled the standard of living to increase over the last 200 years.  Economic output is tied to two variables: the number of individuals producing goods, and how many goods and services each can produce – productivity.  Technology supercharges productivity.  Fifty-person platform companies now outproduce the corporations of 40 years ago, which took a small army of people to achieve a lower output.  We live better lives because of productivity.

We strive for productivity in health care.  More patients seen per hour, more patients treated.  Simple enough.  But productivity focused on N(#) of patients seen per hour does not necessarily maintain quality of care as that metric increases.  A study of back office workers in banking validated that when the workers were overloaded, they sped up, but the quality of their work decreased (defects).  Banking is not healthcare, granted, but in finance defects are pretty quickly recognized and corrected [“Excuse me, but where is my money?”].  As to patient outcome, defects may take longer to show up and be more difficult to attribute to any one factor.  Providers usually have a differential diagnosis for their patient’s presenting complaints.   A careful review of the history and medical record can significantly narrow the differential.  Physician extenders can allow providers to see patients more effectively, with routine care shunted to the extender.  However, for a harried clinician, testing can also be used as a physician extender of sorts.  It increases diagnostic accuracy, at a cost to the patient (monetary and time) and the payor (monetary).  It is hardly fraudulent.  However, is it waste?  And since it usually requires a repeat visit, is it rework?  Possibly yes, to both.

The six-minute per encounter clinician who uses testing as a physician extender will likely have higher RVU production than one who diligently reviews the medical record for a half-an-hour and sees only 10 patients a day.  But who is providing better care?  If outcomes are evaluated, I would suspect that there is either no difference between the two or a slight outcome measure favoring the higher testing provider.  An analysis to judge whether the cost/benefit ratio is justified would probably be necessary.  Ultimately, if you account for all costs on the system, the provider that causes more defects, waste, and re-work is usually less efficient on aggregate, even though individually measured productivity may be high.  See: ‘The measure is the metric‘.  Right now, insurers are data mining to see which providers have best outcomes and lowest costs for specific disease processes, and will steer patients preferentially to them (Aetna CEO, keynote speech HIMSS 2014).

One of my real concerns is that we are training an entire generation of providers in this volume-oriented, RVU-production approach.  These folks may be high performers now, but when the value shift comes, these providers are going to have to re-learn a whole new set of skills.  More worrisome, there are entire practices that are being optimized under six sigma processes for greatest productivity.  Such a practice will have a real problem adapting to value-based care, because it represents a cultural shift.  It might affect the ability of a health system to pivot from volume to value, with resulting loss of competitiveness.

In the volume to value world, there are two types of productivity:

  • Fake productivity: High RVU generators who do so by cost shifting, waste, re-work, defects.
  • True productivity: Consistent RVU generators who follow efficient testing, appropriate # of follow-up visits, and have the good outcomes to prove it.

I am sure that most providers want to work in the space of real productivity – after all, it represents the ideal model learned as students.   Fake productivity is simply a maladaptive response to external pressures, and shouldn’t be conflated with True productivity.