Quick Post on Systems vs. Statistical Learning on large datasets

"Bp-6-node-network" by JamesQueue - Own work. Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons - https://commons.wikimedia.org/wiki/File:Bp-6-node-network.jpg#mediaviewer/File:Bp-6-node-network.jpgThe other day I attended a Webinar on Big Data vs. Systems Theory hosted by the MIT Systems design & management group which offers free, and usually very good, webinars.  I recommend them to anyone interested in data driven management using systems and processes.  The specific lecture was “Move Over, Big Data! – How Small, Simple Models can yield Big insights” given by Dr. Larson.  The lecture was very good – it discussed some of the pitfalls we might be likely to fall into with large data sets, and how algorithmic evaluation can alternatively get us to the same place, but in a different way.

Great points raised within the lecture were:
Always consider the average as a distribution (i.e., with a confidence interval), and compare it to the median to avoid some of the pitfalls of averages.
Outliers are easy to dismiss as noncontributory – but when an outlier has significant effects on your function (i.e., a 'black swan'), you had better include it!
Averages experienced by one population may differ from averages experienced by another (a bit more sophisticated than the N = 1 concept).

There was a neat discussion of queues, with Little's Law cited: L = λW, where L is the time-averaged number of customers in the system, λ is the average arrival rate, and W is the mean time a customer spends in the system.  The M/M/k queue notation was also cited, and Dr. Larson's Queue Inference Engine (which uses a Poisson arrival process) was reviewed.  You can find some more information about the Queue Inference Engine here.  The point was that small models are an alternative means of sussing out big data, rather than simply using statistical regression.  I'll admit to not knowing much about queueing theory and Markov chains, but I can see some interesting applications in combination with large datasets – much along the lines of an ensemble model, but with queueing theory as part of the ensemble.  Unfortunately, as Dr. Larson noted, much like the linear models we have been approaching, serial or networked queues require difficult math with many terms.  The question yet to be answered is: can we provide the best of both worlds?
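To make the queueing ideas concrete, here is a minimal sketch in R (my own illustration, not from the lecture) applying Little's Law and the standard M/M/1 formulas to a hypothetical imaging queue; the arrival and service rates are invented.

    # Little's Law and basic M/M/1 queue metrics -- illustrative numbers only.
    lambda <- 4   # average arrival rate (patients per hour) -- hypothetical
    mu     <- 5   # average service rate (patients per hour) -- hypothetical

    rho <- lambda / mu          # server utilization
    W   <- 1 / (mu - lambda)    # mean time in the system (hours) for an M/M/1 queue
    L   <- lambda * W           # Little's Law: mean number of customers in the system

    c(utilization = rho, mean_time_hours = W, mean_in_system = L)

With these made-up rates, the imaging queue runs at 80% utilization and holds four patients on average – small numbers, but the same bookkeeping scales to real arrival data.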

Further developing the care model – theoretical to applied – part 1

Consider an adult patient who has presented to the ER for abdominal pain.  The ER doctor suspects appendicitis, so the next step is a CT scan to "r/o appendicitis."  There is an assumption that the patient has already had labs drawn and run upon presentation to the ER (probably a rapid test).

ER_CT_process

First, the ER doctor has to decide to order the CT study, and write the order.  We’ll assume a modern CPOE system to take out the intervening steps of having the nurse pick up the order, sign off, and then give it to the HUC to call the order to the CT technologist.  We’ll also assume that the CPOE system automatically contacts patient transport and lets them know that there is a patient ready for transport.  Depending on your institution’s HIMSS level, these may be a lot of assumptions!

Second, patient transport needs to pick up the patient and bring them to the CT holding area (from the hallway to a dedicated room).

Third, the nurse (or a second technologist / tech assistant) will assess this patient and make sure that they are a proper candidate for the procedure.  This involves taking a focused history, making sure there is no renal compromise that would be made worse by the low-osmolar contrast agent (LOCA) used in a CT scan, ensuring that IV access is satisfactory for the LOCA injection (or establishing it if it is not), and ensuring that the patient does not have a contrast allergy that would be a contraindication to the study.

Fourth, the CT technologist gets the patient from holding, places them on the CT gantry, hooks up the contrast, protocols the patient, and then scans.  Once the scan finishes, the patient returns to holding, and the study posts to the PACS system for interpretation by the M.D. radiologist.

Fifth, the radiologist sees the study pop up on their PACS (picture archiving & communication system), interprets the study, generates a report (usually by dictating into voice recognition software these days), proofreads it, and then approves the report.  If there is an urgent communication issue, the radiologist will personally telephone the ER physician; if not, ancillary staff on both sides usually notice the report is completed and alert the ER physician to review it when he or she has time.

Sixth, the ER physician sees the radiologist’s report.  She or he then takes all the information on the patient, including that report, laboratory values, physical examination, patient history, and outside medical records and synthesizes that information to make a most likely diagnosis and exclude other diagnoses.  It is entirely possible that the patient may go on to additional imaging, and the process can repeat.

In comparison to the prior model where all interactions were considered, we can use a bit of common sense to get the number of interacting terms down.  The main rate limiting step is the ordering ER physician – the process initiates with that physician’s decision to get CT imaging.  It is possible for that person to exceed capacity.  Also, there are unexpected events which may require immediate discussion and interaction between members of the team – ER physician to either radiology physician, radiology nurse, or radiology technologist.  Note that the radiology physician and the radiology nurse can both interact with the ER physician both before (step 1) and after (step 6) the study, because of the nature of patient care.

An astute observer may note that there is no transport component for the patient back to the ER from radiology holding.  This is because the patient has already been assessed by the ER physician, and more testing, disposition, etc… is pending the information generated by the CT scan.  While the patient certainly needs care, where that care is given during the assessment process (for a stable patient) is not critical.  It could be that the patient goes from CT holding to dialysis, or to another testing area, etc…  Usually the next ordered test, consult, or disposition hinges on the CT results and will be entered via CPOE, where the patient and ER physician need not be in the same physical space to execute.

From practical experience, ER physician – CT technologist interactions are the most common and usually one-sided ("please take this patient first," "I want the study done this way," etc…).  ER physician – nurse interactions are uncommon and usually unidirectional, nurse to physician ("this patient is in renal failure, we can't use LOCA," etc…).  ER physician – radiology physician interactions are even less common but bidirectional ("This patient is confounding – how can we figure this out?" vs. "Your patient has a ruptured aortic aneurysm and will die immediately without surgical intervention!").
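As a preview of that next post, here is a hedged sketch, in R formula notation, of what the reduced model might look like if we keep only the interactions described above.  The variable names (tat for turnaround time, plus order, transport, assess, scan, read, and review for the six steps) are hypothetical placeholders for a dataset that does not exist yet.

    # Reduced model: main effects for the six steps, plus only the clinically
    # plausible interactions (ER physician with technologist, nurse, and radiologist).
    # All variable names are placeholders until a real dataset is assembled.
    reduced_formula <- tat ~ order + transport + assess + scan + read + review +
                             order:scan + order:assess + order:read

    # Once data exist, the fit might look something like:
    # fit <- glm(reduced_formula, family = Gamma(link = "log"), data = er_ct)
    # summary(fit)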

Next post we will modify our generalized linear model and begin assembling a dataset to test our assumptions.

Further Development of a care model

Let's go back to our care model expanded upon in our prior post.  As alluded to, once interdependencies are considered, things get complicated fast.  This might not be as apparent in our four-stage ER care delivery model, but consider a larger process with six stages, with each stage able to interact with every other.  See the figure below:

6proc

For this figure, this is the generalized linear model with first-order (pairwise) interactions:

Complex GLM

A 23-term generalized linear model is probably not going to help anyone and is too unwieldy, so something needs to be done to get to the heart of the matter and create a model that is reasonably simple and approximates this process well.  The issue of multicollinearity is also relevant here.  So, the next step is to get the number of terms down to what matters.  This would probably be best served by a shrinkage technique or a dimension reduction technique.

Shrinkage:  The LASSO immediately comes to mind, because its coefficient shrinkage can drive some coefficients to exactly zero, effectively performing variable selection depending on lambda.  A ridge regression wouldn't apply the same parsimony to the equation – it shrinks coefficients but keeps all the terms, which doesn't help us simplify.  It has been pointed out to me that there is a technique called elastic net regularization which combines features of both the LASSO and ridge regression – it seems worth a look.

Dimension Reduction:  First use principal component analysis to identify the combinations of terms that explain the most variation among the predictors, and then use partial least squares, which also takes the response into account.  A brief sketch of both approaches follows below.
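Here is a minimal sketch in R of what that pruning might look like, using the glmnet package for the LASSO, ridge, and elastic net, and base R for PCA.  The design matrix X and response y are simulated stand-ins – we have not assembled the care-delivery dataset yet – so treat this as an illustration of the mechanics rather than a result.

    # Simulated stand-in data: 200 observations of 6 "stages" and their pairwise
    # interactions (roughly mirroring the figure above).  Not real care-delivery data.
    set.seed(42)
    n <- 200
    stages <- matrix(rnorm(n * 6), ncol = 6, dimnames = list(NULL, paste0("S", 1:6)))
    X <- model.matrix(~ .^2 - 1, data = as.data.frame(stages))   # 6 mains + 15 interactions
    y <- X[, "S1"] + 0.5 * X[, "S1:S2"] + rnorm(n)               # only a few terms truly matter

    library(glmnet)
    lasso <- cv.glmnet(X, y, alpha = 1)     # LASSO: can zero out coefficients entirely
    ridge <- cv.glmnet(X, y, alpha = 0)     # ridge: shrinks but keeps every term
    enet  <- cv.glmnet(X, y, alpha = 0.5)   # elastic net: a compromise between the two

    # Number of nonzero coefficients retained at the cross-validated lambda
    sapply(list(lasso = lasso, ridge = ridge, enet = enet),
           function(fit) sum(coef(fit, s = "lambda.min") != 0))

    # Dimension reduction: principal components of the same design matrix
    pca <- prcomp(X, scale. = TRUE)
    summary(pca)$importance[, 1:5]          # variance explained by the first few components

In principle, partial least squares (for example, the pls package) would replace that last step when we want the components chosen with the response in mind rather than from the predictors alone.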

At this point, we probably have gone about as far as we can on a theoretical basis, and need to proceed on a more applied basis.  That will be a subject of future posts.

Thanks to Flamdrag5 for clarifying my thoughts on this post.

 

Some thoughts on Revenue Cycle Management, predictive analytics, and competing algorithms

After some reflection, this is clearly Part 5 of the “What medicine can learn from Wall Street” series.

It occurred to me while thinking about the staid subject of revenue cycle management (RCM) that this is a likely hotspot for analytics.   First, there is data – tons of data.  Second, it is relevant – folks tend to care about payment.

RCM is the method by which healthcare providers get paid: it begins with patient contact, continues through evaluation and treatment, and ends with submitting the charges by which we are ultimately paid under contractual obligation.  Modern RCM goes beyond billing to include marketing, pre-authorization, completeness of the medical record to decrease denials, and 'working' the claims until payment is made.

Providers get paid by making claims.  Insurers 'keep the providers honest' through claim denials when claims are not properly 1) pre-authorized, 2) documented, or 3) medically indicated (etc.).  There is a tug of war between the two entities, which usually results in a relationship that ranges from grudging wariness to outright war (with contracts terminated and legal challenges fired off).  Providers profit by extracting the maximum payment they are contractually allowed; insurers profit by denying payment so that they can earn investment returns on their pool of reserves.  Typically, the larger the reserve pool, the larger the profit.
Insurers silently fume at 'creative coding,' where a change of coding rules causes a procedure/illness previously paid at a lower level to now be paid at a much higher level.  Providers seethe at 'capricious' denials, which require staff work to provide whatever documentation is requested (perhaps relevant, perhaps not), and at 'gotcha' downcoding due to a single missing piece of information.  In any case, there is plenty of work for the billing & IT folks on either side.

Computerized revenue cycle management seems like a solution until you realize that the business model of either entity has not changed, and now the same techniques on either side can be automated.  Unfortunately, if the other guy does it, you probably need to too – here’s why.

We could get into this scenario: a payor (insurer), when evaluating claims, decides that there is more spend ($) on a particular ICD-9 diagnosis (or ICD-10, if you prefer) than expected and targets it for claim denials.  A provider would submit claims for this group, be denied on many of them, re-submit, be denied again, and then either start 'working' the claims to gain value from them or, with a sloppy, lazy, or limited billing department, simply let them go (with the resultant loss of the claim).  That would be a 3-12 month process.  However, a provider using descriptive analytics (see part 1) on, say, a weekly or daily basis would be able to see that something was wrong more quickly – probably within three months – and gear up for quicker recovery.  A determined (and aggressive) payor could shift its denial strategy to a different ICD-9, and something similar would occur.  After a few cycles of this, if the provider was really astute, it might data-mine the denials to identify which codes were being denied and set up a predictive algorithm to compare new denials against its old book of business.  This would identify statistical anomalies in new claims and could alert the provider to the algorithm the payor was using to target claims for denial.  By anticipating these denials, and either re-coding them or providing superior documentation to force the payor to pay (negating the beneficial effects of the payor's claim-denial algo), claims are paid in a timely and expected manner.  I haven't checked out some of the larger vendors' RCM offerings, but I suspect this is not far in the offing.
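A minimal sketch of the kind of monitoring I have in mind, in R, under an assumed data layout: a table of claim counts and denials per ICD-9 code for a baseline period and the current period, with each code's current denial rate tested against its own history.  The data frame, its columns, and the numbers are all hypothetical.

    # Hypothetical denial monitoring by ICD-9 code: compare each code's current
    # denial rate against its historical baseline and flag statistical anomalies.
    claims <- data.frame(
      icd9     = rep(c("540.9", "780.79", "786.50"), each = 2),
      period   = rep(c("baseline", "current"), times = 3),
      n_claims = c(400, 120, 900, 250, 700, 200),
      n_denied = c(24, 9, 45, 14, 35, 52)          # the last code's denials have spiked
    )

    flag_code <- function(code, data) {
      b   <- data[data$icd9 == code & data$period == "baseline", ]
      cur <- data[data$icd9 == code & data$period == "current", ]
      test <- prop.test(c(cur$n_denied, b$n_denied), c(cur$n_claims, b$n_claims))
      data.frame(icd9 = code,
                 baseline_rate = b$n_denied / b$n_claims,
                 current_rate  = cur$n_denied / cur$n_claims,
                 p_value       = test$p.value)
    }

    do.call(rbind, lapply(unique(claims$icd9), flag_code, data = claims))
    # A large jump in denial rate with a small p-value suggests the code is being
    # targeted and is worth anticipating with better documentation or re-coding.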

I could see a time when a very aggressive payor (perhaps under financial strain) strikes back with an algorithm designed to deny some, but not all, claims on a semi-random basis to 'fly under the radar' and escape the provider's simpler detection algorithms.  A more sophisticated algorithm based upon anomaly detection techniques could then be used to identify these denials…  This seems like a nightmare to me.  Once things get to this point, it's probably only a matter of time until these games are addressed by the legislature.

Welcome to the battles of the competing algorithms.  This is what happens in high-frequency trading.  Best algorithm wins, loser gets poorer.

One thing is sure: in negotiations, the party who holds and evaluates the data holds the advantage.  The other party will forever be negotiating from behind.

P.S.  As an aside, with the ultra-low short-term interest rates after the 2008 financial crisis, the time value of money is near all-time lows.  Delayed payments are an annoyance, but apart from cash flow there is no real advantage to delaying payments.  Senior managers who lived through or studied the higher short-term interest rates of the 1970s-1980s will recall the importance of managing the 'float' and good treasury/receivables operations.  Changing economic conditions could make this even more of a hot topic.

Developing a simple care delivery model further – dependent interactions

Let's go back to our simple generalized linear model of care delivery from this post:

simplified ER Process

With its resultant Generalized Linear Function:

GLM

This model, elegant in its simplicity, does not account for the inter-dependencies in care delivery.  A more true-to-life revised model is:

ER Process with interdependencies

Where there are options for back-and-forth pathways depending on new clinical information, denoted in red.

A linear model that takes into account these inter-dependencies would look like this:

GLM2
Including these interactions, we go from 4 terms to 8.  And this is an overly simplified model!  By drilling down, in a typical PI/Six Sigma environment, into an aspect of the healthcare delivery process, it's not hard to imagine identifying well over four points of contact/patient interaction, each with its own set of interdependencies.  Imagine a process with 12-15 sub-processes, with most of those sub-processes each having on average six (6) interdependencies.  And then there is the possibility of multiple interdependencies among the processes…  This doesn't even account for an EMR dataset where the number of columns could be… 350?  Quickly, your 'simple' linear model is looking not so simple, with easily over 100 terms in the equation, which also makes the model difficult to solve.  Not to despair!  There are ways to take this formula with a high number of terms and create a more manageable model as a reasonable approximation!  The mapping and initial modeling of the care process is of greatest utility from an operational standpoint, to allow for understanding and to guide interpretation of the ultimate data analysis.
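As a rough illustration of how quickly the terms pile up, here is a short R sketch that counts main effects plus all pairwise interactions (the exact counts in the figures above differ slightly because only some interactions are drawn); the sub-process data are dummy values.

    # Number of terms with p main effects plus all pairwise interactions:
    # p + p*(p-1)/2, before any higher-order effects or EMR columns are added.
    count_terms <- function(p) p + p * (p - 1) / 2

    count_terms(4)    # the 4-step ER model with every pairwise interaction: 10 terms
    count_terms(6)    # the 6-stage process: 21 terms
    count_terms(15)   # 15 sub-processes: 120 terms and climbing

    # The same count via a design matrix on dummy data:
    p <- 6
    dummy <- as.data.frame(matrix(rnorm(10 * p), ncol = p))
    ncol(model.matrix(~ .^2, data = dummy))   # intercept + main effects + pairwise interactions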

I am a believer that statistical computational analysis can identify the terms which are most important for the model.  By inference, these inputs will have the most effect upon outcome, and can guide management to where precious effort, resources, and time should be guided to maximize outcomes.

 

What Medicine can learn from Wall Street – Part 4 – Portfolio Management and complex systems

attrib: Asy Arch

Let's consider a single security trader.

All they trade is IBM.  All they need to know is that security and the indexes that include it.  But once they start trading another security, such as Cisco (CSCO), while they have a position in IBM, they have a portfolio.  Portfolios behave differently – profiting or losing on an aggregate basis from the combination of movements in multiple securities.  For example, if you hold 10,000 shares each of IBM and CSCO, and IBM appreciates by a dollar while CSCO loses a dollar, you have no net gain or loss.  That's called portfolio risk.

Everything in the markets is connected.  For example, if you're an institutional trader with a large (1,000,000+ share) position in IBM, you know that you can't sell quickly without tanking the market.  That's called execution risk.  Also, once the US market closes (less of a concern these days than 20 years ago), there is less liquidity.  Imagine you are this large institutional trader, at home at 11 pm.  A breaking news story develops about a train derailment of toxic chemicals near IBM's research campus causing fires.  You suspect that it destroyed all of their most prized experimental hardware, which will take years to replace.  Immediately, you know that you have to get out of as much IBM as possible to limit your losses.  However, when you get to your trading terminal, the first bid in the market is $50 lower than that afternoon's price, for a minuscule 10,000 shares.  If you sell at that price, the next price will be even lower, for a smaller amount.  You're stuck.  However, there is a relationship between IBM and the general market called beta, which describes how the stock moves with the market as a whole.  Since you cannot get out of your IBM directly, you sell a calculated number of S&P futures short in the open market to simulate a short position in IBM.  You're going to take a bath, but not as bad a bath as the folks who went to bed early and didn't react to the news.

A sufficiently large portfolio with >250 stocks will approximate broader market indexes (such as the S&P 500 or Russell indexes), depending upon composition.  Its beta will be in the 0.9-1.1 range, with a beta of 1.0 meaning the portfolio moves in line with the market.  Traders attempt to improve upon this expected rate of return through strategic buys and sells of the portfolio components.  Any extra return above the expected rate of return of the underlying index is alpha.  Alpha is what you pay managers for, instead of just purchasing the Vanguard S&P 500 index fund and forgetting about it.  It's said that most managers underperform the market indexes.  A discussion of Modern Portfolio Theory is beyond the scope of this blog, but you can go here for more.
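For the curious, beta and alpha are usually estimated by regressing the portfolio's returns on the index's returns.  Here is a small R sketch on simulated daily returns – the numbers are invented, and, as always on this blog, this is not investment advice.

    # Estimate beta and (annualized) alpha by regressing portfolio returns on
    # market returns.  Simulated data, purely illustrative.
    set.seed(7)
    n_days   <- 250
    mkt_ret  <- rnorm(n_days, mean = 0.0003, sd = 0.01)                # "index" daily returns
    port_ret <- 0.0001 + 0.95 * mkt_ret + rnorm(n_days, sd = 0.004)    # a roughly 0.95-beta portfolio

    fit   <- lm(port_ret ~ mkt_ret)
    beta  <- coef(fit)["mkt_ret"]       # slope: co-movement with the market
    alpha <- coef(fit)["(Intercept)"]   # daily return not explained by the market

    c(beta = unname(beta), annualized_alpha = unname(alpha) * 252)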

So, excepting an astute manager delivering alpha (or an undiversified portfolio), the larger and more diversified the portfolio is, the more it behaves like an index and the less dependent it is upon the behavior of any individual security.  Also, without knowing the exact composition of the portfolio and its proportions, its overall behavior can be pretty opaque.

MAIN POINT: The portfolio behaves as its own process – the sum of the interactions of its constituents.

 

Courtesy Arnold C.

I postulate that the complex system of healthcare delivery behaves like a multiple-security portfolio.  It is large, complex, and, without a clear understanding of its constituent processes, potentially opaque.  The individual components of care delivery summate to form an overall process of care delivery.  The overarching hospital, outpatient, and office care delivery process is a derivative process – it integrates multiple underlying sub-processes.

We trace, review, and document these sub-processes to better understand them.  Once understood, metrics can be established and process improvement tools applied.  The PI team is called in, and a LEAN/Six Sigma analysis performed.  Six sigma process analytics typically focus on one sub-process at a time to improve its efficiency.  Improving a sub-process’ efficiency is a laudable & worthwhile goal which can result in cost savings, better care outcomes, and reduced healthcare prices.  However, there is also the potential for Merton’s ‘unintended consequences‘.

Most importantly, the results of the Six Sigma PI need to be understood in the context of the overall enterprise – the larger complex system.  Optimizing a sub-process while causing a bottleneck in the larger enterprise process is not progress!
This is because a choice of the wrong metric, or overzealous overfitting, may, while improving the individual process, create a perturbation in the system (a 'bottleneck') whose negative effects are, confoundingly, more problematic than the fix.  Everyone thinks that they are doing a great job, but things get worse, and senior management demands an explanation.  Thereafter, a lot of finger-pointing occurs.  These effects are due to dependent variables or feedback loops that exist in the system's process.  Close monitoring of the overall process will help in identifying unintended consequences of process changes.  I suspect most senior management folks will recall a time when an overzealous cost-cutting manager decreased in-house transport to the point where equipment idled and LOS increased – i.e., the 0.005% saved by the patient-transport re-org cost the overall institution 2-3% until the problem was fixed.
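A toy version of that transport example, sketched in R with invented capacities, shows the arithmetic: overall throughput is limited by the slowest stage, so 'saving' on transport can idle a far more expensive scanner.

    # Toy bottleneck arithmetic -- all capacities (patients per hour) are invented.
    capacity <- c(ordering = 8, transport = 6, ct_scan = 5, interpretation = 7)
    min(capacity)                 # system throughput is set by the slowest stage: 5/hr
    names(which.min(capacity))    # current bottleneck: "ct_scan"

    # After an overzealous cut to the transport team:
    capacity["transport"] <- 3
    min(capacity)                 # throughput falls from 5 to 3 patients per hour
    # The scanner now sits idle 40% of the time and LOS rises across the enterprise,
    # even though the transport line item, viewed in isolation, looks cheaper.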

There is a difference between true process improvement and goosing the numbers.  I’ve written a bit about this in real vs. fake productivity and my post about cost shifting.  I strongly believe it is incumbent upon senior management to monitor middle management & prevent these outcomes.  Well thought out metrics and clear missions and directives can help.  Specifically – senior management needs to be aware that optimization of sub-processes exists in the setting of the larger overall process and that optimization must also optimize the overall care process (the derivative process) as well.   An initiative that fails to meet both the local and global goals is a failed initiative!

It’s the old leaky pipe analogy – put a band-aid on the pipe to contain one leak, and the increased pressure in the pipe causes the pipe to burst somewhere else, necessitating another band-aid.  You can’t patch the pipe enough – too old.  The whole pipe needs replacement.  And the sum of repairs over time exceeds the cost of simply replacing it.

I’m not saying that process improvement is useless – far from it, it is necessary to optimize efficiency and reduce waste to survive in our less-than-forgiving healthcare business environment.  However, consideration of the ‘big picture’ is essential – which can be mathematically modeled.  The utility of modeling is to gain an understanding of how the overall complex process responds to changes – to avoid unintended consequences of system perturbation.

What medicine can learn from Wall Street – Part 3 – The dynamics of time

This is a somewhat challenging post, with cross-discipline correlations and some unfamiliar terminology and concepts.  There is a payoff!

You can recap part 1 and part 2 here. 

The crux of this discussion is time.  Understanding the progression toward shorter and shorter time frames on Wall Street enables us to draw parallels to, and differences from, medical care delivery, particularly pertaining to processes and data analytics.  This is relevant because some vendors tout real-time capabilities in health care data analysis – possibly not as useful as one thinks.

In trading, the best profit is a riskless one – a profit that occurs simply by being present, is reliable and reproducible, and exposes the trader to no risk.  Meet arbitrage.  Years ago, it was possible for the same security to trade at different prices on different exchanges, as there was no central marketplace.  A network of traders could buy a stock for $10 in New York and then sell those same shares on the Los Angeles exchange for $11.  On a 1,000-share transaction, a $1 profit per share yields $1,000.  It was made by the head trader holding up two phones to his head and saying 'buy' into one and 'sell' into the other.*  These relationships could be exploited over longer periods of time and represented an information deficit.  However, as more traders learned of them, the opportunities became harder to find as greater numbers pursued them.  This price arbitrage kept prices reasonably similar before centralized, computerized exchanges and data feeds.

As information flow increased, organizations became larger and more effective, and the time frames for executing profitable arbitrages decreased.  This led traders to develop simple predictive algorithms, as Ed Seykota did, detailed in part 1.  New instruments re-opened the profit possibility for a window of time, which eventually closed.  The development of futures, options, and indexes, all the way to closed exchanges (ICE, etc.), created opportunities for profit which eventually became crowded.  Since the actual arbitrages were mathematically complex (futures have an implied interest rate, options require solving multiple partial differential equations, and indexes require summing hundreds of separate securities instantaneously), a computational model was necessary, as no individual could compute the required elements quickly enough to profit reliably.  With this realization, it was only a matter of time before automated trading (AT) happened, and it evolved into high-frequency trading, with competing algorithms operating without human oversight on millisecond timeframes.

The journey from daily prices, to ever-shorter intervals within the trading day, to millisecond prices was driven by the availability of good data and reliable computing that could be counted on to act on those flash prices.  What was once a game of location (geographical arbitrage) turned into a game of speed (competitive pressure on geographical arbitrage), then into a game of predictive analytics (proprietary trading and trend following), then into a more complex game of predictive analytics (statistical arbitrage), and was ultimately turned back into a game of speed and location (high-frequency trading).

The following chart shows a probability analysis of an ATM (at-the-money) straddle position on IBM.  This is an options position; it is not important to understand the instrument, only to understand what the image shows.  For IBM, the expected range of prices at one standard deviation (+/- 1 s.d.) is plotted below.  As time (days) increases along the X axis, the expected range widens – that is, the forecast becomes less precise.**

credit: TD Ameritrade
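As a rough sketch of why that cone widens, here is a bit of R that computes the +/- 1 s.d. price band under the usual assumption that returns are Gaussian and their dispersion scales with the square root of time (see the post's second footnote for the caveat); the price and volatility figures are invented.

    # Expected +/- 1 s.d. price range widening with the time horizon.
    # Assumes Gaussian returns scaling with sqrt(time); numbers are illustrative only.
    spot       <- 185        # hypothetical IBM price
    annual_vol <- 0.20       # hypothetical annualized volatility (20%)
    days       <- 1:45       # trading days ahead

    sd_move <- spot * annual_vol * sqrt(days / 252)   # one-standard-deviation move
    band <- data.frame(days, lower = spot - sd_move, upper = spot + sd_move)
    round(band[c(1, 5, 10, 20, 45), ], 2)
    # The band is narrow tomorrow and wide six weeks out -- the opposite of the
    # narrowing differential diagnosis discussed below.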

Is there a similar corollary for health care?

Yes, but.

First, recognize the distinction between the comparatively simple price-time data that exists in the markets and the rich, complex, multivariate data in healthcare.

Second, assuming a random walk hypothesis, security price movement is unpredictable; at best, one can only calculate that the next price will fall within a range defined by some number of standard deviations according to one's model, as seen in the picture above.  You cannot make this argument in healthcare, because the patient's disease is not a random walk.  Disease follows prescribed pathways and natural histories, which allow us to make diagnoses and implement treatment options.

It is instructive to consider clinical decision support tools.  Please note that these tools are not a substitute for expert medical advice (and my mention does not imply endorsement).  See Esagil and Diagnosis Pro.  If you enter "abdominal pain" into either of these tools, you'll get back a list of 23 differentials (woefully incomplete) in Esagil and 739 differentials (more complete, but too many to be of help) in Diagnosis Pro.  But this is a typical presentation to a physician – a patient complains of "abdominal pain," and the differential must be narrowed.

At the onset, there is a wide differential diagnosis.  The possibility that the pain is a red herring and the patient really has some other, unsuspected, disease must be considered.  While there are a good number of diseases with a pathognomonic presentation, uncommon presentations of common diseases are more frequent than common presentations of rare diseases.

In comparison to the trading analogy above, where expected price movement is generally restricted to a quantifiable range based on the observable statistics of the security over a period of time, for a de novo presentation of a patient, this could be anything, and the range of possibilities is quite large.

Take, for example, a patient that presents to the ER complaining “I don’t feel well.”  When you question them, they tell you that they are having severe chest pain that started an hour and a half ago.  That puts you into the acute chest pain diagnostic tree.

Reverse Tree

With acute chest pain, there is a list of differentials that needs to be excluded (or 'ruled out'), some quite serious.  A thorough history and physical is done, taking 10-30 minutes.  Initial labs are ordered (5-30 minutes if run as rapid, in-ER tests; longer if sent to the main laboratory), an EKG and CXR (chest X-ray) are done for their speed (10 minutes each), and the patient is sent to CT for a CTA (CT angiogram) to rule out a PE (pulmonary embolism).  This is a useful test because it will not only show the presence or absence of a clot, but will also allow a look at the lungs to exclude pneumonias, effusions, dissections, and malignancies.  Estimate that the wait time for the CTA is at least 30 minutes.

The ER doctor then reviews the results (5 minutes): troponins are negative, excluding a heart attack (MI); the CT scan eliminates PE, pneumonia, dissection, pneumothorax, effusion, and malignancy in the chest; the chest X-ray excludes fracture; and the normal EKG excludes arrhythmia, gross valvular disease, and pericarditis.  The main diagnoses left are GERD, pleurisy, referred pain, and anxiety.  The ER doctor goes back to the patient (10 minutes); the patient doesn't appear anxious and has no stressors, so a panic attack is unlikely.  No history of reflux, so GERD is unlikely.  No abdominal pain component, and labs were negative, so abdominal pathologies are unlikely.  Point tenderness is present on the physical exam at the costochondral junction – and the patient is diagnosed with costochondritis.  The patient is then discharged with a prescription for pain control (30 minutes).
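Here is a toy R sketch of that narrowing: a hypothetical differential list pruned as each result comes back, with rough elapsed times taken from the estimates above.  The lists, the exclusion mapping, and the times are illustrative only, not clinical guidance.

    # A toy model of the chest-pain workup: each result excludes part of the
    # differential, and the candidate list shrinks as elapsed time grows.
    differential <- c("MI", "PE", "pneumonia", "dissection", "pneumothorax",
                      "effusion", "malignancy", "fracture", "arrhythmia",
                      "valvular disease", "pericarditis", "GERD", "pleurisy",
                      "referred pain", "anxiety", "costochondritis")

    exclusions <- list(                      # result -> diagnoses it rules out (toy mapping)
      "troponin negative" = "MI",
      "CTA negative"      = c("PE", "pneumonia", "dissection", "pneumothorax",
                              "effusion", "malignancy"),
      "CXR negative"      = "fracture",
      "EKG normal"        = c("arrhythmia", "valvular disease", "pericarditis"),
      "history & exam"    = c("anxiety", "GERD", "pleurisy", "referred pain")
    )
    minutes <- c(30, 60, 10, 10, 40)         # rough elapsed time added by each step

    elapsed <- 0
    for (i in seq_along(exclusions)) {
      elapsed      <- elapsed + minutes[i]
      differential <- setdiff(differential, exclusions[[i]])
      cat(names(exclusions)[i], ": t =", elapsed, "min,",
          length(differential), "diagnoses remain\n")
    }
    differential   # what is left at the end: costochondritis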

Ok, if you’ve stayed with me, here’s the payoff.

As we proceed down the decision tree, the number of possibilities narrows in medicine.

In comparison, with price-time data the range of potential prices increases as you proceed forward in time.

So, in healthcare the list of potential diagnoses narrows as you proceed along the x-axis of time.  Time is therefore both one's friend and one's enemy – friend, because it provides for the diagnostic and therapeutic interventions that establish the patient's disease process; enemy, because payment models in medicine favor making that diagnostic and treatment process as quick as possible (when a hospital inpatient).

We'll continue this in part IV and compare its relevance to portfolio trading.

*As an aside, the phones in trading rooms had a switch on the handheld receiver – you would push it in to talk.  That way, the other party would not know that you were conducting an arbitrage!  The receivers were often slammed down and broken by angry traders – one of the manager's jobs was to keep a supply of extras in his desk, and they were not hard-wired but plugged in by a jack expressly for that purpose!

trader's phone

**Yes, for the statisticians reading this, I know that there is an implication of a Gaussian distribution that may not be proven.  I would suspect the successful houses have adjusted for this and instituted non-parametric models as well.  Again, this is not a trading, medical, or financial advice blog.

 

The danger of choosing the wrong metric : The VA Scandal

The Veterans Affairs scandal has been newsworthy lately.  The facts about the VA scandal will be forthcoming in August, but David Brooks made some smart inferences back on May 16th on NPR's Week in Politics:

BROOKS: Yeah, he's (Shinseki) in hot water. He's been there since the beginning. So I don't know if I'd necessarily want to bet on him. But, you know, I do have some sympathy for the VA. It's obviously not a good thing to doctor and cook the books, but you – there is a certain fundamental reality here, which is the number of primary care visits over the last three years at this place rose 50 percent. The number of primary care physicians rose nine percent.
And so there’s just a backlog, and if you put a sort of standard in place that you have to see everybody in 14 days but you don’t provide enough physicians to actually do that, well, people are going to start cheating. And so there is a more fundamental problem here than just the cheating.

An administrative failure was made by mandating that patients be seen within 14 days without providing the staffing to do so.  The rule, designed to promote a high level of care, had 'unintended consequences.'  However, I do have some sympathy for an institution that depends on procurement from Congress for funding, in a political process where funds can be yanked, redistributed, or earmarked based on political priorities.

More concerning, multiple centers may have been complicit in covering up the impossibility of fulfilling the mandate, and whistleblowers were actively retaliated against.

I need to disclaim here that I both trained and worked at the VA as a physician.  I have tremendous respect for the veterans who seek care there, and I had great pride working there and in being in a place to give service to these men and women who gave service to us.  The level of care in the VA system is generally thought to be good, by myself and others.

As I’ve written before in The Measure is the Metric and Productivity in Medicine – what’s real and what’s fake?, the selection of metrics is important because those metrics will be followed by the organization, particularly if performance evaluations and bonuses are tied to the metrics.  Ben Horowitz, partner at Andreessen Horowitz, astutely notes the following from his experience as CEO at Opsware and an employee at HP (1):

At a basic level, metrics are incentives.  By measuring quality, features, and schedule and discussing them at every staff meeting, my people focused intensely on those metrics to the exclusion of other goals.  The metrics did not describe the real goals and I distracted the team as a result.

And if he didn’t get the point across clearly enough (2):

Some things that you will want to encourage will be quantifiable, and some will not.  If you report on the quantitative goals and ignore the qualitative ones, you won't get the qualitative goals, which may be the most important ones.  Management purely by numbers is sort of like painting by numbers – it's strictly for amateurs.
At HP, the company wanted high earnings now and in the future.  By focusing entirely on the numbers, HP got them now by sacrificing the future…
By managing the organization as though it were a black box, some divisions at HP optimized the present at the expense of their downstream competitiveness.  The company rewarded managers for achieving short-term objectives in a manner that was bad for the company.  It would have been better to take into account the white box.  The white box goes beyond the numbers and gets into how the organization produced the numbers.  It penalizes managers who sacrifice the future for the short-term and rewards those who invest in the future even if that investment cannot be easily measured.

I'll have to wait until the official report on the VA scandal is released before commenting on why the failure occurred.  However, it does look to me like a case of managing the organization as a black box, as Ben Horowitz explained so adeptly.  His writing is recommended.

1. Ben Horowitz, The Hard Thing About Hard Things, HarperCollins, 2014, p. 132.

2. Ibid., pp. 132-133.

 

What Big Data visualization analytics can learn from radiology

As I research part III of the "What Healthcare can learn from Wall Street" series – which is probably going to turn into a Part III, Part IV, and Part V – I was thinking about visualization tools in big data and how to use them to analyze large data sets rapidly (relatively speaking), whether by a human or by a deep, unsupervised-learning-type algorithm.  It occurred to me that we radiologists have been doing this for years.
If you have ever watched a radiologist reading at a PACS station (a high-end computer system which displays images quickly) you will see them scroll at a blindingly fast speed through a large series of multiple anatomic images to arrive at a diagnosis or answer a specific question.  [N.B. if you haven’t, you really should – it’s quite cool!]  Stacked upon each other, these images assemble a complete anatomic picture of the area of data acquisition.

What the radiologist is doing while going over the images is comparing the expected appearance of a reference standard to that visualized image to find discrepancies.  The data set looks like THIS:

CT scan segmentation

voxel

It's important to understand that each pixel on the screen represents not a point but a volume, called a voxel.  The reconstruction algorithms can sometimes over- or under-emphasize the appearance of a voxel, so the data is usually reconstructed in multiple axes.  This improves diagnostic accuracy and confidence.

Also, the voxel is not a boolean (binary) zero or one variable – it is a scalar corresponding to a grey-scale value.

So, in data science thinking, what a radiologist is doing is examining a four-dimensional space (X, Y, Z, and voxel grayscale) for relevant patterns and deviations from those patterns (essentially a subtractive algorithm).  A fifth dimension can be added by including changes over time (comparison to a previous similar study at some prior point in time).
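To put the 'subtractive algorithm' idea into code, here is a minimal R sketch that compares a current voxel volume against a reference volume and flags voxels deviating beyond a threshold.  The arrays are random stand-ins; real use would require registered, intensity-normalized volumes.

    # Subtractive comparison of a study volume to a reference volume.
    # The arrays below are random stand-ins, not real CT data.
    set.seed(3)
    dims      <- c(64, 64, 32)                                   # x, y, z of a small toy volume
    reference <- array(rnorm(prod(dims)), dim = dims)
    current   <- reference + array(rnorm(prod(dims), sd = 0.1), dim = dims)
    current[30:34, 30:34, 15:17] <- current[30:34, 30:34, 15:17] + 3   # an implanted "finding"

    deviation <- current - reference                             # the subtraction step
    flagged   <- which(abs(deviation) > 2, arr.ind = TRUE)       # voxels beyond the threshold

    nrow(flagged)     # how many voxels were flagged (the 5 x 5 x 3 implanted block)
    head(flagged)     # their x, y, z coordinates
    # A prior exam would add the fifth, comparative dimension: repeat the subtraction
    # against the earlier study and examine how the flagged region changes over time.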

Rapid real-time pattern recognition in five variables on large data sets.  Done successfully day-in and day-out visually by your local radiologist.

 

Initial evaluation of a complex data set can give you something like this multiple scatter plot which I don’t find too useful:

Multiple scatter plots

Now, this data set, to me with my orientation and training, becomes much more useful:

3D dataset

A cursory visual inspection yields a potential pattern (the orange circles), which to me suggests a possible model, drawn in blue.

visuallyevaluated

That curve looks parabolic, which suggests a polynomial linear model might be useful for describing that particular set of data, so we can model it like this and then run the dataset in R to prove or disprove our hypothesis.
Polynomial Linear Model
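A minimal sketch, assuming hypothetical vectors x and y standing in for the circled points, of how that second-degree polynomial model could be fit and checked in R:

    # Fit a second-degree polynomial linear model to a hypothetical parabolic pattern.
    set.seed(11)
    x <- runif(100, 0, 10)                       # stand-in for the visually selected points
    y <- 2 + 1.5 * x - 0.2 * x^2 + rnorm(100)    # a parabolic relationship plus noise

    fit <- lm(y ~ poly(x, 2, raw = TRUE))        # y = b0 + b1*x + b2*x^2
    summary(fit)                                 # do the linear and quadratic terms hold water?

    plot(x, y)
    curve(predict(fit, newdata = data.frame(x = x)), add = TRUE, col = "blue")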
So, what I'm suggesting here is that by visually presenting complex data in a format of up to five dimensions (three axes X, Y, and Z; a grayscale value for each point corresponding to a normalized value; and a fifth, comparative dimension), complex patterns can be discovered visually, potentially quickly and on a screening basis, and then appropriate models can be tested to discover whether they hold water.  I'll save the nuts and bolts of this for a later post, but when a large dataset (like an EHR) is evaluated, dimension reduction operations can focus attention on fewer variables and put the data into a more visualization-friendly form.

And I’m willing to bet even money that if an analyst becomes intimately familiar with the dataset and visualization, as they spend more time with it and understand it better, they will be able to pick out relationships that will be absolutely mind-blowing.

Processes and Modeling – a quick observation

Is it not somewhat obvious to the folks reading this blog that this:

simplified ER Process

 

Is the same thing as this:

GLM

While I might be skewered for oversimplifying the process (and it is oversimplified – greatly), the fundamental principles are the same.  LOS, for those unfamiliar with the term, is Length of Stay, also known as Turnaround Time (the former is usually measured in days, the latter in minutes or hours).

Out of curiosity, is anyone reading this blog willing to admit they are using something similar, or have tried to use something similar and failed?  I would love to know people’s thoughts on this.