{"id":12377,"date":"2014-09-12T11:35:41","date_gmt":"2014-09-12T15:35:41","guid":{"rendered":"http:\/\/n2value.com\/blog\/?p=12377"},"modified":"2014-09-11T20:50:45","modified_gmt":"2014-09-12T00:50:45","slug":"quick-post-on-systems-vs-statistical-learning-on-large-datasets","status":"publish","type":"post","link":"https:\/\/n2value.com\/blog\/quick-post-on-systems-vs-statistical-learning-on-large-datasets\/","title":{"rendered":"Quick Post on Systems vs. Statistical Learning on large datasets"},"content":{"rendered":"<p><a href=\"http:\/\/n2value.com\/blog\/wp-content\/uploads\/2014\/09\/Bp-6-node-network.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-12381\" src=\"http:\/\/n2value.com\/blog\/wp-content\/uploads\/2014\/09\/Bp-6-node-network.jpg\" alt=\"&quot;Bp-6-node-network&quot; by JamesQueue - Own work. Licensed under Creative Commons Attribution-Share Alike 3.0 via Wikimedia Commons - https:\/\/commons.wikimedia.org\/wiki\/File:Bp-6-node-network.jpg#mediaviewer\/File:Bp-6-node-network.jpg\" width=\"384\" height=\"214\" srcset=\"https:\/\/n2value.com\/blog\/wp-content\/uploads\/2014\/09\/Bp-6-node-network.jpg 384w, https:\/\/n2value.com\/blog\/wp-content\/uploads\/2014\/09\/Bp-6-node-network-300x167.jpg 300w\" sizes=\"auto, (max-width: 384px) 100vw, 384px\" \/><\/a>The other day I attended a webinar on Big Data vs. Systems Theory hosted by the <a title=\"MIT Systems design and management program\" href=\"http:\/\/sdm.mit.edu\/\" target=\"_blank\">MIT Systems design &amp; management group<\/a>, which offers free, and usually very good, <a href=\"http:\/\/sdm.mit.edu\/voices\/webinars.html\" target=\"_blank\">webinars<\/a>. \u00a0I recommend them to anyone interested in data-driven management using systems and processes. \u00a0The specific lecture was \u201cMove Over, Big Data! &#8211; How Small, Simple Models can yield Big insights\u201d given by Dr. Larson. 
\u00a0The lecture was very good &#8211; it discussed some of the pitfalls we are likely to fall into with large data sets, and how algorithmic evaluation can get us to the same place in a different way.<\/p>\n<blockquote><p>Great points raised within the lecture were:<br \/>\nAlways consider the average as a distribution (i.e. a confidence interval), and compare it to the median to avoid some of the pitfalls of averages.<br \/>\nOutliers are easy to dismiss as noncontributory &#8211; but when an outlier has a significant effect on your function (e.g. \u2018black swans\u2019) you\u2019d better include it!<br \/>\nAverages experienced by one population may be different from averages experienced by another.\u00a0 (A bit more sophisticated than the N=1 concept.)<\/p><\/blockquote>\n<p>There was a neat discussion of queues, with Little\u2019s law cited &#8211; L = lambda W, where L is the time-average number of customers in the system, lambda is the average arrival rate and W is the mean time a customer spends in the system. \u00a0M\/M\/k queue notation was also cited. \u00a0Dr. Larson\u2019s Queue Inference Engine (using a Poisson distribution) was reviewed.\u00a0 You can find some more information about the <a title=\"Queue Inference Engine\" href=\"http:\/\/ocw.mit.edu\/courses\/engineering-systems-division\/esd-86-models-data-and-inference-for-socio-technical-systems-spring-2007\/lecture-notes\/lec12a.pdf\" target=\"_blank\">Queue Inference Engine here<\/a>.\u00a0 The point was that small models are an alternative means of sussing out big data, rather than simply using statistical regression. \u00a0I\u2019ll admit to not knowing much about queueing theory and <a title=\"Markov Chains - Wikipedia\" href=\"https:\/\/en.wikipedia.org\/wiki\/Markov_chain\" target=\"_blank\">Markov chains<\/a>, but I can see some interesting applications in combination with large datasets. 
\u00a0Much along the lines of an <a title=\"Developing a simple care delivery model further \u2013 dependent interactions\" href=\"http:\/\/n2value.com\/blog\/developing-a-simple-care-delivery-model-further-dependent-interactions\/\" target=\"_blank\">ensemble model<\/a>, but including queueing theory as part of the ensemble\u2026 \u00a0Unfortunately, as Dr. Larson noted, much like in the linear models we have been approaching, serial queues or networked queues <a title=\"Backpressure routing\" href=\"https:\/\/en.wikipedia.org\/wiki\/Backpressure_routing\" target=\"_blank\">require difficult math with many terms<\/a>. \u00a0 The question yet to be answered is &#8211; can we provide the best of both worlds?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The other day I attended a webinar on Big Data vs. Systems Theory hosted by the MIT Systems design &amp; management group, which offers free, and usually very good, webinars. \u00a0I recommend them to anyone interested in data-driven management using systems and processes. \u00a0The specific lecture was \u201cMove Over, Big Data! &#8211; How Small, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"New N2value post: Systems thinking vs. 
Statistical Learning - http:\/\/wp.me\/p4mtfP-3dD","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false},"version":2}},"categories":[4,6,5],"tags":[],"class_list":["post-12377","post","type-post","status-publish","format-standard","hentry","category-data-science","category-process-analytics","category-workflow"],"jetpack_publicize_connections":[],"aioseo_notices":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4mtfP-3dD","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts\/12377","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/comments?post=12377"}],"version-history":[{"count":4,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts\/12377\/revisions"}],"predecessor-version":[{"id":12382,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts\/12377\/revisions\/12382"}],"wp:attachment":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/media?parent=12377"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/categories?post=12377"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/tags?post=12377"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}