{"id":352,"date":"2011-10-03T15:00:10","date_gmt":"2011-10-03T22:00:10","guid":{"rendered":"http:\/\/codecurmudgeon.com\/wp\/?p=352"},"modified":"2015-08-06T11:06:00","modified_gmt":"2015-08-06T18:06:00","slug":"going-with-the-flow-in-static-analysis","status":"publish","type":"post","link":"https:\/\/codecurmudgeon.com\/wp\/2011\/10\/going-with-the-flow-in-static-analysis\/","title":{"rendered":"Going with the Flow in Static Analysis"},"content":{"rendered":"<p><a href=\"http:\/\/codecurmudgeon.com\/wp\/wp-content\/uploads\/2011\/10\/RomeRiver.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/codecurmudgeon.com\/wp\/wp-content\/uploads\/2011\/10\/RomeRiver-300x182.jpg\" alt=\"\" title=\"RomeRiver\" width=\"300\" height=\"182\" class=\"alignright size-medium wp-image-445\" srcset=\"https:\/\/codecurmudgeon.com\/wp\/wp-content\/uploads\/2011\/10\/RomeRiver-300x182.jpg 300w, https:\/\/codecurmudgeon.com\/wp\/wp-content\/uploads\/2011\/10\/RomeRiver-1024x622.jpg 1024w, https:\/\/codecurmudgeon.com\/wp\/wp-content\/uploads\/2011\/10\/RomeRiver.jpg 1075w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/a> As part of my ongoing series about <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-65\">Static Analysis<\/span> issues I want to talk about the relationship between the traditional static method and the newer dynamic or <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-29\">flow analysis<\/span> method. People seem to misunderstand how the techniques relate and what each is good at. 
In particular, many seem to think that <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-29\">flow analysis<\/span> is a replacement for the traditional static method, which couldn&#8217;t be more wrong.<\/p>\n<p>For the sake of having a simple term for each method, I&#8217;ll refer to the older &#8220;static&#8221; method of <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-65\">static analysis<\/span> as &#8220;pattern-based&#8221; and the newer method as &#8220;flow-based&#8221;. This is somewhat of a misnomer in that both types are really based on patterns, but it seems to be a fairly common way of referring to the two methods. If the terms I use bother you, feel free to do a search-and-replace in your head when reading. I&#8217;m not too worried at this point about a strict technical explanation of each, but rather about their relationship. The goal is to have terms that differentiate these two particular types of <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-65\">static analysis<\/span>. Of course there are other types of <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-65\">static analysis<\/span> as well, but I&#8217;ll leave that for another day.<\/p>\n<p>Let me begin by saying that there is in fact a very strong relationship between pattern-based and flow-based <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-65\">static analysis<\/span>, at least at an academic level. In almost every situation there is a set of pattern-based rules that, if followed, would prevent the occurrence of the issue being found by the flow-based rule. 
Given the nature of how <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-29\">flow analysis<\/span> works, it <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-10013\">can<\/span> never find all possible paths through an application. This makes it a good idea to start programming in a more proactive way to prevent the possibility of issues you\u2019re concerned about.<\/p>\n<p>For example, in security, one of the <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-51\">basic<\/span> problems is using tainted data. Somewhere in the application between getting data from the user and operating on the data, you need to check if the data is safe. Depending on how far apart the operations are, it <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-10013\">can<\/span> be extremely difficult, if not impossible, to check every possible path. Security code scanners that rely on flow-based analysis attempt to find possible paths between user input and uses of the input that allow tainted data to be operated on. They <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-10013\">can<\/span> never find every possible path even if you let them run for an incredibly long time.<\/p>\n<p>Instead, if you restructure your code so that input validation is done at the moment of input, then you don\u2019t have any paths to chase, and you don\u2019t have to worry about tainted data in your application. Flow-based tools won\u2019t find anything anymore, because you won&#8217;t have any unprotected paths. This is sometimes a more difficult sell for developers, since it doesn\u2019t provide them with a single broken piece of code that needs to be fixed. 
Rather, it tells them that the way they\u2019re writing code now could be improved \u2013 a bitter pill to swallow.<\/p>\n<p>However, applying this same principle to things like memory corruption, resource consumption, etc. <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-10013\">can<\/span> make the program far more robust than chasing possible paths ever could. <\/p>\n<p>An excellent methodology is to start with flow-based analysis and fix the low-hanging fruit. Once you have compliance with your flow-based rule set, then review what you\u2019re doing with flow and compare it to pattern-based <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-65\">static analysis<\/span>. Determine as best you <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-10013\">can<\/span> how to apply pattern-based <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-65\">static analysis<\/span> to catch potential problems before they happen, and put that into place. This moves you from reacting to issues in your software to a more preventative stance. <\/p>\n<p>There are those who say that flow-based analysis <i>is<\/i> preventative, but it&#8217;s still symptom-driven &#8211; namely trying to find the openings and bugs you left in your code. <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-64\">Pattern-based analysis<\/span>, when deployed properly, <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-10013\">can<\/span> be used to address the root problems. In our tainted data example, this means changing our coding style so that we don&#8217;t have paths where data could be tainted &#8211; root problem handled.<\/p>\n<p>Essentially, flow-based analysis finds real bugs in possible paths. 
When you get a message from it, you just decide whether you care about that path or not. Pattern-based analysis, on the other hand, tells you about the potential for a bug, not necessarily about the existence of a bug. Again, with our security example, flow-based says &#8220;you used tainted data&#8221; where pattern-based says &#8220;this data could be tainted before use&#8221;. <\/p>\n<p>When compared, you <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-10013\">can<\/span> see that flow-based analysis is a great way to find low-hanging fruit, because it&#8217;s looking for bugs instead of you doing it. On the other hand, because it works by guessing (flow fans hate the &#8220;guessing&#8221; term) at possible paths through your code, it will always be, by its very nature, incomplete.<\/p>\n<p><span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-64\">Pattern-based analysis<\/span>, on the other hand, requires restructuring your code and behavior if you want to achieve its full value. Some code, such as working legacy code, is not well suited to such change. <\/p>\n<p>Used together, you have a very powerful solution that is much more robust than either technique on its own.<\/p>\n<p>[Disclaimer]<br \/>\nAs a reminder, I work for <a href=\"http:\/\/www.parasoft.com\/jsp\/capabilities\/static_analysis.jsp?itemId=547\" target=\"company\">Parasoft<\/a>, a company that among other things makes <span class=\"explanatory-dictionary-highlight\" data-definition=\"explanatory-dictionary-definition-65\">static analysis<\/span> tools. 
This is however my personal blog, and everything said here is my personal opinion and in no way the view or opinion of Parasoft or possibly anyone else at all.<br \/>\n[\/Disclaimer]<\/p>\n","protected":false},"excerpt":{"rendered":"<p>As part of my ongoing series about Static Analysis issues I want to talk about the relationship between the traditional static method and the newer dynamic or flow analysis method. People seem to misunderstand how the techniques relate and what each is good at. 
In particular, many seem to think that flow analysis is a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"nf_dc_page":"","_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[4,7],"tags":[137,19,20],"class_list":["post-352","post","type-post","status-publish","format-standard","hentry","category-security","category-software-development","tag-security","tag-softwaredevelopment","tag-staticanalysis"],"_links":{"self":[{"href":"https:\/\/codecurmudgeon.com\/wp\/wp-json\/wp\/v2\/posts\/352","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/codecurmudgeon.com\/wp\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/codecurmudgeon.com\/wp\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/codecurmudgeon.com\/wp\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/codecurmudgeon.com\/wp\/wp-json\/wp\/v2\/comments?post=352"}],"version-history":[{"count":33,"href":"https:\/\/codecurmudgeon.com\/wp\/wp-json\/wp\/v2\/posts\/352\/revisions"}],"predecessor-version":[{"id":3224,"href":"https:\/\/codecurmudgeon.com\/wp\/wp-json\/wp\/v2\/posts\/352\/revisions\/3224"}],"wp:attachment":[{"href":"https:\/\/codecurmudgeon.com\/wp\/wp-json\/wp\/v2\/media?parent=352"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/codecurmudgeon.com\/wp\/wp-json\/wp\/v2\/categories?post=352"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/codecurmudgeon.com\/wp\/wp-json\/wp\/v2\/tags?post=352"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}