{"id":12851,"date":"2015-08-12T11:30:55","date_gmt":"2015-08-12T15:30:55","guid":{"rendered":"http:\/\/n2value.com\/blog\/?p=12851"},"modified":"2015-08-31T16:43:59","modified_gmt":"2015-08-31T20:43:59","slug":"further-developing-the-care-model-part-3-data-generation-and-code","status":"publish","type":"post","link":"https:\/\/n2value.com\/blog\/further-developing-the-care-model-part-3-data-generation-and-code\/","title":{"rendered":"Further Developing the Care Model &#8211; Part 3 &#8211; Data generation and code"},"content":{"rendered":"<p>Returning to our care model that discussed in <a href=\"http:\/\/n2value.com\/blog\/further-developing-the-care-model-theoretical-to-applied-part-1\/\">parts one<\/a> and <a href=\"http:\/\/n2value.com\/blog\/further-developing-the-care-model-part-2-definitions\/\">two<\/a>, we can begin by defining our variables.<\/p>\n<p><a href=\"http:\/\/n2value.com\/blog\/wp-content\/uploads\/2015\/08\/Kappa-and-Theta-for-gamma.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-12877\" src=\"http:\/\/n2value.com\/blog\/wp-content\/uploads\/2015\/08\/Kappa-and-Theta-for-gamma.jpg\" alt=\"n2value\" width=\"685\" height=\"224\" srcset=\"https:\/\/n2value.com\/blog\/wp-content\/uploads\/2015\/08\/Kappa-and-Theta-for-gamma.jpg 685w, https:\/\/n2value.com\/blog\/wp-content\/uploads\/2015\/08\/Kappa-and-Theta-for-gamma-300x98.jpg 300w\" sizes=\"auto, (max-width: 685px) 100vw, 685px\" \/><\/a><\/p>\n<p>Each sub-process variable is named for\u00a0 its starting sub-process and ending sub-process.\u00a0 We will define mean time for the sub-processes in minutes, and add a component of time variability.\u00a0 You will note that the variability is skewed &#8211; some shorter times exist, but disproportionately longer times are possible.\u00a0 This coincides with real-life: in a well-run operation, mean times may be close to lower limits &#8211; as these represent physical (occurring in the real world) processes, there 
may simply be a physical constraint on how quickly you can do anything!\u00a0 However, problems, complications and miscommunications may extend that time well beyond what we all would like it to be &#8211; for those of us who have had real-world hospital experience, does this not sound familiar?<\/p>\n<p>Because of this, we will choose a <strong>gamma distribution<\/strong> to model our processes:<\/p>\n<pre>                              <img src='https:\/\/s0.wp.com\/latex.php?latex=%5CGamma%28a%29+%3D+%5Cint_%7B0%7D%5E%7B%5Cinfty%7D+%7Bt%5E%7Ba-1%7De%5E%7B-t%7Ddt%7D+&#038;bg=ffffff&#038;fg=000000&#038;s=4' alt='\\Gamma(a) = \\int_{0}^{\\infty} {t^{a-1}e^{-t}dt} ' title='\\Gamma(a) = \\int_{0}^{\\infty} {t^{a-1}e^{-t}dt} ' class='latex' \/><\/pre>\n<p>The gamma distribution is useful because it deals with continuous time data, and we can skew it through its shape parameter Kappa (<img src='https:\/\/s0.wp.com\/latex.php?latex=%5Ckappa&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\kappa' title='\\kappa' class='latex' \/>) and its second parameter Theta (<img src='https:\/\/s0.wp.com\/latex.php?latex=%5Ctheta&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\theta' title='\\theta' class='latex' \/>).\u00a0 We will use the R function rgamma(N,<img src='https:\/\/s0.wp.com\/latex.php?latex=%5Ckappa&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\kappa' title='\\kappa' class='latex' \/>, <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Ctheta&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\theta' title='\\theta' class='latex' \/>) &#8211; note that rgamma() treats its second positional argument as a rate, i.e. 1\/scale &#8211; to generate a base distribution of small positive values, concentrated between zero and 1, and then use a multiplier (slope) and offset (Y-intercept) to position the distributions along the X-axis.\u00a0 The gamma distribution is also bounded below by zero, so it respects the absolute lower time limit &#8211; I consider this a feature, not a flaw.<\/p>\n<p>It is generally recognized that a probability density (kernel) plot, as opposed to a histogram, is more accurate and less prone to distortions 
related to the number of samples (N).\u00a0 A plot of these distributions looks like this:<a href=\"http:\/\/n2value.com\/blog\/wp-content\/uploads\/2015\/08\/care-model-3.jpeg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-12888\" src=\"http:\/\/n2value.com\/blog\/wp-content\/uploads\/2015\/08\/care-model-3.jpeg\" alt=\"Property N2value.com\" width=\"610\" height=\"484\" srcset=\"https:\/\/n2value.com\/blog\/wp-content\/uploads\/2015\/08\/care-model-3.jpeg 610w, https:\/\/n2value.com\/blog\/wp-content\/uploads\/2015\/08\/care-model-3-300x238.jpeg 300w\" sizes=\"auto, (max-width: 610px) 100vw, 610px\" \/><\/a><\/p>\n<p>The R code to generate these distributions, the graph, and our initial values data frame is as follows:<\/p>\n<blockquote><p>seed &lt;- 3559<br \/>\nset.seed(seed,kind=NULL,normal.kind = NULL)<br \/>\nn &lt;- 16384 ## 2^14 samples; now let&#8217;s initialize variables<br \/>\nk &lt;- c(1.9,1.9,6,1.9,3.0,3.0,3.0,3.0,3.0)<br \/>\ntheta &lt;- c(3.8,3.8,3.0,3.8,3.0,5.0,5.0,5.0,5.0)<br \/>\ns &lt;- c(10,10,5,10,10,5,5,5,5) ## one multiplier per process<br \/>\no &lt;- c(4.8,10,5,5.2,10,1.6,1.8,2,2.2)<br \/>\nprosess1 &lt;- (rgamma(n,k[1],theta[1])*s[1])+o[1]<br \/>\nprosess2 &lt;- (rgamma(n,k[2],theta[2])*s[2])+o[2]<br \/>\nprosess3 &lt;- (rgamma(n,k[3],theta[3])*s[3])+o[3]<br \/>\nprosess4 &lt;- (rgamma(n,k[4],theta[4])*s[4])+o[4]<br \/>\nprosess5 &lt;- (rgamma(n,k[5],theta[5])*s[5])+o[5]<br \/>\nprosess6 &lt;- (rgamma(n,k[6],theta[6])*s[6])+o[6]<br \/>\nprosess7 &lt;- (rgamma(n,k[7],theta[7])*s[7])+o[7]<br \/>\nprosess8 &lt;- (rgamma(n,k[8],theta[8])*s[8])+o[8]<br \/>\nprosess9 &lt;- (rgamma(n,k[9],theta[9])*s[9])+o[9]<br \/>\nd1 &lt;- density(prosess1, n=16384)<br \/>\nd2 &lt;- density(prosess2, n=16384)<br \/>\nd3 &lt;- density(prosess3, n=16384)<br \/>\nd4 &lt;- density(prosess4, n=16384)<br \/>\nd5 &lt;- density(prosess5, n=16384)<br \/>\nd6 &lt;- density(prosess6, n=16384)<br \/>\nd7 &lt;- density(prosess7, n=16384)<br \/>\nd8 &lt;- 
density(prosess8, n=16384)<br \/>\nd9 &lt;- density(prosess9, n=16384)<br \/>\nplot.new()<br \/>\nplot(d9, col=\"brown\", type = \"n\", main=\"Probability Densities\", xlab = \"Process Time in minutes\", ylab=\"Probability\", xlim=c(0,40), ylim=c(0,0.26))<br \/>\nlegend(\"topright\", c(\"process 1\",\"process 2\",\"process 3\",\"process 4\",\"process 5\",\"process 6\",\"process 7\",\"process 8\",\"process 9\"), fill=c(\"brown\",\"red\",\"blue\",\"green\",\"orange\",\"purple\",\"chartreuse\",\"darkgreen\",\"pink\"))<br \/>\nlines(d1, col=\"brown\")<br \/>\nlines(d2, col=\"red\")<br \/>\nlines(d3, col=\"blue\")<br \/>\nlines(d4, col=\"green\")<br \/>\nlines(d5, col=\"orange\")<br \/>\nlines(d6, col=\"purple\")<br \/>\nlines(d7, col=\"chartreuse\")<br \/>\nlines(d8, col=\"darkgreen\")<br \/>\nlines(d9, col=\"pink\")<br \/>\nptime &lt;- c(d1[1],d2[1],d3[1],d4[1],d5[1],d6[1],d7[1],d8[1],d9[1])<br \/>\npdens &lt;- c(d1[2],d2[2],d3[2],d4[2],d5[2],d6[2],d7[2],d8[2],d9[2])<br \/>\nptotal &lt;- data.frame(prosess1,prosess2,prosess3,prosess4,prosess5,prosess6,prosess7,prosess8,prosess9)<br \/>\nnames(ptime) &lt;- c(\"ptime1\",\"ptime2\",\"ptime3\",\"ptime4\",\"ptime5\",\"ptime6\",\"ptime7\",\"ptime8\",\"ptime9\")<br \/>\nnames(pdens) &lt;- c(\"pdens1\",\"pdens2\",\"pdens3\",\"pdens4\",\"pdens5\",\"pdens6\",\"pdens7\",\"pdens8\",\"pdens9\")<br \/>\nnames(ptotal) &lt;- 
c(\"pgamma1\",\"pgamma2\",\"pgamma3\",\"pgamma4\",\"pgamma5\",\"pgamma6\",\"pgamma7\",\"pgamma8\",\"pgamma9\")<br \/>\npall &lt;- data.frame(ptotal,ptime,pdens)<\/p>\n<p>&nbsp;<\/p><\/blockquote>\n<p>Where the relevant term is rgamma(n,<img src='https:\/\/s0.wp.com\/latex.php?latex=%5Ckappa&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\kappa' title='\\kappa' class='latex' \/>, <img src='https:\/\/s0.wp.com\/latex.php?latex=%5Ctheta&#038;bg=ffffff&#038;fg=000000&#038;s=0' alt='\\theta' title='\\theta' class='latex' \/>).\u00a0 We&#8217;ll use these distributions in our dataset.<\/p>\n<p>One last concept needs to be discussed: the probability of each sub-process&#8217;s occurrence.\u00a0 Each sub-process has a percentage chance of happening &#8211; some a 100% certainty, others a fairly low 5% of cases.\u00a0 This reflects real-world reality &#8211; once a test is ordered, the patient will reliably show up for it, but not 100% of patients will actually get the test.\u00a0 Some cancel due to contraindications, others can&#8217;t tolerate it, others refuse, etc&#8230;\u00a0 The percentages that are &lt;100% reflect those probabilities and essentially act as a probabilistic (Bernoulli) switch applied to the beginning of the term that describes that sub-process.\u00a0 We&#8217;re evolving first toward a simple generalized linear equation similar to that put forward <a href=\"http:\/\/n2value.com\/blog\/further-development-of-a-care-model\/\">in this post<\/a>.\u00a0 I think it&#8217;s going to look somewhat like this:<\/p>\n<p><a href=\"http:\/\/n2value.com\/blog\/wp-content\/uploads\/2014\/09\/Modified-GLM.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-12366\" src=\"http:\/\/n2value.com\/blog\/wp-content\/uploads\/2014\/09\/Modified-GLM.png\" alt=\"N2Value.com\" width=\"546\" height=\"123\" 
srcset=\"https:\/\/n2value.com\/blog\/wp-content\/uploads\/2014\/09\/Modified-GLM.png 546w, https:\/\/n2value.com\/blog\/wp-content\/uploads\/2014\/09\/Modified-GLM-300x67.png 300w\" sizes=\"auto, (max-width: 546px) 100vw, 546px\" \/><\/a>But we&#8217;ll see how well this model fares as we develop it and compare it to some others.\u00a0 The x terms will likely represent the probabilities between 0 and 1.0 (100%).<\/p>\n<p>For a EMR based approach, we would assign a UID (medical record # plus 5-6 extra digits, helpful for encounter #&#8217;s).\u00a0 We will &#8216;disguise&#8217; the UID by adding or subtracting a constant known only to us and then performing a mathematical operation on it. However, for our purposes here, we would not need to do that.<\/p>\n<p>We&#8217;ll\u00a0 head on to our analysis in part 4.<\/p>\n<p>&nbsp;<\/p>\n<blockquote><p>Programming notes in R:<\/p>\n<p>1.\u00a0 I experimented with for loops and different configurations of apply with this, and after a few weeks of experimentation, decided I really can&#8217;t improve upon the repetitive but simple code above.\u00a0 The issue is that the density function returns a list of 7 variables, so it is not as easy as defining a matrix, as the length of the data frame changes.\u00a0 I&#8217;m sure there is a way to get around this, but for the purposes of this illustration it is beyond our needs.\u00a0 Email me at mailto:contact@n2value.com if you have working code that does it better!<\/p>\n<p>2.\u00a0 For the density function, the number of samples must be a power of 2.\u00a0 So by choosing 16384 (2^14) we meet that goal.\u00a0 Setting N to that number makes the data frame more symmetric.<\/p>\n<p>3.\u00a0 In variable names above, prosess is an intentional misspelling.<\/p><\/blockquote>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Returning to our care model that discussed in parts one and two, we can begin by defining our variables. 
Each sub-process variable is named for\u00a0 its starting sub-process and ending sub-process.\u00a0 We will define mean time for the sub-processes in minutes, and add a component of time variability.\u00a0 You will note that the variability is [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"New N2value post: Further Developing the Care Model - Part 3 - Data generation and code #bigdata #hcldr #workflow","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false},"version":2}},"categories":[4,2,6,5],"tags":[],"class_list":["post-12851","post","type-post","status-publish","format-standard","hentry","category-data-science","category-healthcare","category-process-analytics","category-workflow"],"jetpack_publicize_connections":[],"aioseo_notices":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4mtfP-3lh","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts\/12851","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http
s:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/comments?post=12851"}],"version-history":[{"count":43,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts\/12851\/revisions"}],"predecessor-version":[{"id":12916,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts\/12851\/revisions\/12916"}],"wp:attachment":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/media?parent=12851"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/categories?post=12851"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/tags?post=12851"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}