{"id":12972,"date":"2016-02-09T14:30:11","date_gmt":"2016-02-09T19:30:11","guid":{"rendered":"http:\/\/n2value.com\/blog\/?p=12972"},"modified":"2016-02-09T14:30:11","modified_gmt":"2016-02-09T19:30:11","slug":"the-coming-computer-vision-revolution","status":"publish","type":"post","link":"https:\/\/n2value.com\/blog\/the-coming-computer-vision-revolution\/","title":{"rendered":"The coming computer vision revolution"},"content":{"rendered":"<figure id=\"attachment_12975\" aria-describedby=\"caption-attachment-12975\" style=\"width: 679px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/n2value.com\/blog\/wp-content\/uploads\/2016\/02\/Rplot.jpeg\" rel=\"attachment wp-att-12975\"><img loading=\"lazy\" decoding=\"async\" class=\"size-full wp-image-12975\" src=\"http:\/\/n2value.com\/blog\/wp-content\/uploads\/2016\/02\/Rplot.jpeg\" alt=\"3 layer (7,5,3 hidden layers) neural network created in R using the neuralnet package. \" width=\"679\" height=\"657\" srcset=\"https:\/\/n2value.com\/blog\/wp-content\/uploads\/2016\/02\/Rplot.jpeg 679w, https:\/\/n2value.com\/blog\/wp-content\/uploads\/2016\/02\/Rplot-300x290.jpeg 300w\" sizes=\"auto, (max-width: 679px) 100vw, 679px\" \/><\/a><figcaption id=\"caption-attachment-12975\" class=\"wp-caption-text\">3 layer (7,5,3 hidden layers) neural network created in R using the neuralnet package.<\/figcaption><\/figure>\n<p>&nbsp;<\/p>\n<p style=\"text-align: center;\"><em>Nothing of him that doth fade<br \/>\nBut doth suffer a sea-change<br \/>\nInto something rich and strange.<\/em><\/p>\n<p style=\"text-align: right;\">&#8211; Shakespeare, The Tempest 1.2.396-401<\/p>\n<p>I\u2019m halfway through auditing Stanford\u2019s CS231n course \u2013 <strong><a href=\"http:\/\/cs231n.stanford.edu\/\" target=\"_blank\">Convolutional Neural Networks for Visual Recognition<\/a><\/strong>.<\/p>\n<p>Wow. Just Wow. 
There is a sea-changing paradigm shift happening NOW &#8211; we probably have not fully realized it yet.<\/p>\n<p>We are all tangentially aware of CV applications in our daily lives \u2013 Facebook\u2019s ability to find us in photos, optical character recognition (OCR) of the addresses on our postal mail, that sort of thing. But these algorithms were rule-based expert systems grounded in supervised learning methods. Applications were largely one-offs built for a single, specific task. They were expensive, complicated, and somewhat error-prone.<\/p>\n<p>So what changed?\u00a0 First, a little history. In the early 1980s I had a good friend obtaining an MS in comp sci, all atwitter about \u201cNeural Networks\u201d. Back then they went nowhere. Too much processing\/memory\/storage required, too difficult to tune, computationally slow. Fail.<\/p>\n<p>Then:<\/p>\n<blockquote><p>1999 \u2013 Feature-based models, beginning with SIFT &amp; ending with SVM (support vector machine) deformable parts models. The best was only 74% accurate.<\/p>\n<p>2006 \u2013 Restricted Boltzmann Machines apply backpropagation to allow training of deep neural networks.<\/p>\n<p>2012 \u2013 AlexNet: deep learning applied to the ImageNet classification competition achieves nearly a 2X increase in accuracy over earlier SVM methods.<\/p>\n<p>2015 \u2013 ResNet: a deep learning system achieves a 4.5X increase in accuracy compared to AlexNet and an 8X increase compared to the old SVM models.<\/p><\/blockquote>\n<p>In practical terms, what does this mean? On a dataset with 1000 different classes (<a href=\"http:\/\/www.image-net.org\/\" target=\"_blank\">ImageNet<\/a>), ResNet identifies the correct item outright (top-1) about 80% of the time, and places the correct item among its 5 most probable guesses (top-5) 96.4% of the time. Humans are typically estimated to achieve about 95% top-5 accuracy on the same task. 
It\u2019s clear that the computer is not far off.<\/p>\n<p>2012 was the watershed year, with the CNN\u2019s first application to \u2013 and win of \u2013 the competition, and the improvement was significant enough that it sparked additional refinements and development. That work is still going on &#8211; the <a href=\"http:\/\/image-net.org\/challenges\/LSVRC\/2015\/results\" target=\"_blank\">ResNet example was just released in December 2015<\/a>! This is clearly an area of active research, and further improvements are expected.<\/p>\n<p>The convolutional neural network is a game-changer and will likely approach, and perhaps exceed, human accuracy in computer vision and classification in the near future. That\u2019s a big deal.\u00a0 As this is a medical blog, the applications to healthcare are obvious &#8211; radiology, pathology, dermatology, and ophthalmology for starters.\u00a0 But the CNN may also be useful for the complicated process problems I&#8217;ve developed here on the blog &#8211; the flows themselves naturally resemble networks.\u00a0 So why not model them as such?\u00a0 Why is it a game changer?\u00a0 Because the model is probably universally adaptable to visual classification problems and, once trained, potentially cheap to run.<\/p>\n<p>I\u2019ll write more on this in the coming weeks \u2013 I\u2019ve been inching towards deep learning models (but lagging in blogging about them), and there is no reason to wait any longer. The era of the deep learning neural network is here.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&nbsp; Nothing of him that doth fade But doth suffer a sea-change Into something rich and strange. &#8211; Shakespeare, The Tempest 1.2.396-401 I\u2019m halfway through auditing Stanford\u2019s CS231n course \u2013 Convolutional Neural Networks for Visual Recognition. Wow. Just Wow. 
There is a sea-changing paradigm shift that is happening NOW &#8211;\u00a0 we probably have not fully [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"New N2Value.com post: The coming #computervision revolution #cs231n #bigdata #neuralnetworks","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false},"version":2}},"categories":[22],"tags":[23],"class_list":["post-12972","post","type-post","status-publish","format-standard","hentry","category-computer-vision","tag-cnn"],"jetpack_publicize_connections":[],"aioseo_notices":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p4mtfP-3ne","jetpack_sharing_enabled":true,"jetpack_likes_enabled":true,"_links":{"self":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts\/12972","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/comments?post=12972"}],"version-history":[{"count":5,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts\/12972\/revisions"}],"predecessor-version":[{"id":12979,"href":"https
:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/posts\/12972\/revisions\/12979"}],"wp:attachment":[{"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/media?parent=12972"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/categories?post=12972"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/n2value.com\/blog\/wp-json\/wp\/v2\/tags?post=12972"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}