Just a reminder for those who aren’t aware – I maintain a list here I like to call the “SQL Injection Hall of Shame“. There was a quiet period at the first of the year, but now we seem to be back at it. I’ve added a couple of updates – one a large breach that was probably SQL injection and one a small one in healthcare that was for sure.
I’m doing a free seminar next week in the DC area: “Development Testing can help you comply with government regulations and security guidelines”. Entry is free and you can register here.
This will be an informative lunch seminar on Thursday, May 16th from 10am to 12pm at FCN, Inc in Reston, VA. During this event we will be discussing trends, strategies and best practices for NIST compliance.
Discover how to best utilize your company investments to deliver compliance throughout your organization. Participate in a presentation by industry expert Arthur Hicken as he facilitates a discussion on how to continuously integrate software quality into the development process with Parasoft’s comprehensive Development Testing platform.
What you will learn:
Consistently apply static analysis, unit testing, peer code review, coverage analysis, runtime error detection, etc.
Accurately and objectively measure productivity and application quality
Drive the development process in the context of business expectations – for what needs to be developed as well as how it should be developed
Gain real-time visibility into how the software is being developed and where it is satisfying expectations
Reduce costs and risks across the entire SDLC
Following the presentation, Parasoft will demonstrate its development testing solutions for C/C++, Java and .NET applications.
Hope to see you there. If you’ve always wanted to meet the CodeCurmudgeon in person, sign up here.
SSH is a wonderful tool that will let you do all kinds of amazing things – not to mention that it does them securely. However, sometimes when you’re trying to automate steps, or are performing the same steps repeatedly on a trusted machine, the frequent retyping of your password can be a pain. Worse still, if you’re writing a script, you certainly don’t want to hardcode passwords into it for others to grab. In this case, what you can do is use SSH keys to secure your connection.
How to do this differs depending on the operating system of the source machine, i.e. the machine you are SSHing from. Suppose you have two machines, the local one (your laptop) and the remote one (some server, e.g. my.server.com). To SSH from the laptop to the server without needing a password, perform these steps:
On the local machine:
% ssh-keygen -t rsa
Either put in a passphrase or just hit return twice to skip. Note that using a passphrase makes it more secure, but makes automation tricky.
This produces a file called id_rsa.pub in a subfolder called .ssh underneath your home directory. Now you need to transfer that file to the remote server. Note that you’ll need your password to perform this step, and to avoid trouble we’ll rename the file during transfer.
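From a Unix-like local machine, the transfer can be done with scp (USERNAME and my.server.com are placeholders from the example above – substitute your own):

```shell
# Copy the public key to the remote server, renaming it during the transfer
# so it doesn't collide with any key already on the server.
scp ~/.ssh/id_rsa.pub USERNAME@my.server.com:id_rsa.pub.mylaptop
```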
Now we need to add the id_rsa.pub key to the proper file on the remote machine (my.server.com). Note that if you don’t already have a .ssh folder on the server, you can just create it – or better yet, run the ssh-keygen command there, as above.
Use your favorite SSH tool and check its documentation. Here I’ve used PuTTYgen; it should be wherever you installed PuTTY, probably something like c:\bin. It is a graphical program for managing keys on Windows with PuTTY.
Select “SSH2 (RSA)” as the type of key (at the bottom of the screen)
Select “generate” and follow the instructions. It wants you to move your mouse around in a box for a while to generate randomness. Then it makes a key.
Select “save private key” and either give it a passphrase, or ignore it when it tells you to think about using a passphrase, then save your private key to disk somewhere. Note that using a passphrase makes it more secure, but makes automation tricky.
Select “save public key” and save it to disk somewhere.
In the normal PuTTY window select load to pull in the profile you want to add the key to. Go to Connection and put your ID in the “auto-login username” box, i.e. your Unix login name.
In the SSH Auth section select the browse button, go to where you stored the private key file, and select it. Then go to the “Session” category and select save.
Now you need to take the public key stuff and add it to your ~/.ssh/authorized_keys file on the ssh server machine. If you’re using putty you have pscp that you can use. It’s in the same dir where you put your putty executable.
c:\> cd dir_with_public_key_file
c:\> pscp putty_public_key_file USERNAME@my.server.com:id_rsa.pub.mylaptop
Now connect to the remote system using SSH so you can add your public key to its authorized keys file – i.e. use ssh or PuTTY. After you’re connected, edit the file you put there, id_rsa.pub.mylaptop or whatever you called it.
Remove the first line of the file that says “BEGIN SSH2 PUBLIC KEY”
Remove the last line of the file that says “END SSH2 PUBLIC KEY”
Remove the line that says “Comment: ”
At the beginning of the first line insert “ssh-rsa ”
At the end of the last line, after the =, put something that identifies what the key is, for future reference – i.e. your user/machine name. For example, instead of “=” put “= email@example.com”.
Now there are probably 4 lines in this file, and they all need to be joined into one line. Plus if joining creates spaces they will need to be removed.
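If OpenSSH’s ssh-keygen is available on the server, its -i (import) flag does this whole conversion automatically – running `ssh-keygen -i -f id_rsa.pub.mylaptop` prints the key as a single OpenSSH-format line. A self-contained sketch of the round trip (the /tmp/demo_key paths are only for illustration):

```shell
# Make a throwaway key, export it to the SSH2/RFC 4716 format that
# PuTTYgen saves, then convert it back to the one-line OpenSSH format.
rm -f /tmp/demo_key /tmp/demo_key.pub
ssh-keygen -t rsa -N "" -f /tmp/demo_key -q
ssh-keygen -e -f /tmp/demo_key.pub > /tmp/demo_key.rfc4716  # export: SSH2 format
ssh-keygen -i -f /tmp/demo_key.rfc4716                      # import: one "ssh-rsa ..." line
```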
Now you can append this to the ~/.ssh/authorized_keys file:
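For example (assuming the converted key is still named id_rsa.pub.mylaptop in your home directory):

```shell
# Append the (now one-line) public key and tighten permissions -- with
# StrictModes, sshd may ignore the file if its permissions are too loose.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
cat ~/id_rsa.pub.mylaptop >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```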
Years ago the biggest challenge in static code analysis was trying to find more and more interesting things. In our original CodeWizard product back in the early 90’s, we had 30 some-odd rules based on items from Scott Meyers‘ book “Effective C++“. It was what I like to think of as “Scared Straight” for programmers. Since then static analysis researchers have constantly worked to push the envelope of what can be detected, including adding newer techniques like data flow analysis to expand what static analysis can do.
It seems to me that the biggest challenge currently is not to keep adding new weaknesses to detect, although I hope at some point we can return to that. Today one of the most common hurdles people run into is trying to make sense of the results they get. Although people do say “I wish static analysis would catch ____” (name your favorite unfindable bug), it’s far more common to hear “Wow, I have way too many results, static analysis is noisy” and “static analysis is full of false positives”. Some companies have gone so far as to put a triage step into their workflow, having recognized that the results they produce aren’t exactly what developers need.
So while what developers seem to be saying is that there are too many results, the most common statements are “it’s noisy” or, even more commonly, “it’s a false positive”. But what IS a false positive in static analysis?
The term or idea of a false positive seems to have two meanings. In the simplest sense, it means that the message that a rule was violated is incorrect, or in other words the rule was not violated – the message was false. Sometimes developers fall into the trap of labeling any error message they don’t like as a false positive, but this isn’t really correct. They may label it a false positive because they simply don’t agree with the rule, they may label it because they don’t understand how it applies in this situation, or they may label it because they don’t think it’s important in general or in this particular case.
False positives and “true” positives need to be understood in the context of two different kinds of static analysis. One is pattern-based static analysis, which also includes metrics. The other is flow-based static analysis. One thing to remember is that pattern-based static analysis doesn’t have false positives – if a violation message is wrong, it’s really a bug in the rule, because the rule should not be ambiguous. If the rule doesn’t have a clear pattern to look for, it’s a bad rule.
This doesn’t mean that every violation a rule reports is a bug, which is important to note. A violation simply means that the pattern was found, indicating a weakness in the code – a susceptibility, if you will, to having a bug. Once you recognize the difference between a bug and a potential bug or weakness, you realize that it’s worth looking for these patterns precisely because they are dangerous to your code.
When I look at a violation, I ask myself: does this apply to my code, or doesn’t it? If it applies, I fix the code; if it doesn’t apply, I suppress it. It’s best to suppress it in the code directly so that it’s visible and others can look at it, and you won’t end up having to review it a second time.
It’s a bad idea not to suppress the violation explicitly, because then you will constantly be reviewing the same violation. Think of using a spell checker but never adding your words to its dictionary. The beauty of in-code suppression is that it’s independent of the engine. Anyone can look at the code and see that it has been reviewed and that this pattern is deemed acceptable in this code.
In flow analysis you have to address false positives because it is in the nature of flow analysis to have them. Flow analysis cannot avoid false positives for the same reason that unit testing cannot generate perfect unit test cases: the analysis has to make determinations about the expected behavior of the code, and sometimes there are too many options to know what is realistic, or you simply don’t have enough information about what is happening in other parts of the system.
The important thing here is that the true false positive is once again something that is just completely wrong. For example, the tool you’re using says you’re dereferencing a null pointer. If you look at the code and see that this is actually impossible, then you have a false positive.
If, on the other hand, you simply aren’t worried about nulls in this piece of code because they’re handled elsewhere, then the message, while not important, is not a false positive. The messages range from “true and important” through “true and unimportant” and “true and improbable” to “untrue” – there is a lot of variation, and they get handled differently.
There is a common trap here as well. As in the null example above, you may believe that a null value cannot make it to this point, but the tool found a way to make it happen. You want to be very sure to check and possibly to protect against this if it’s important to your application.
It’s critical to understand that there is both power and weakness in flow analysis. The power of flow analysis is that it goes through the code, tries to find hot spots, and looks for problems around those hot spots. The weakness is that it only explores some number of steps around the code it’s testing, like a star pattern.
The problem is that if you start thinking you’ve cleaned all the code because your flow analysis is clean, you are fooling yourself. Really, you’ve found some errors and you should be grateful for that.
In addition to flow analysis you should really think about using runtime error detection. Runtime error detection allows you to find much more complicated problems than flow analysis can find, and you have the confidence that the condition actually occurred, simply because it did. Runtime error detection doesn’t have false positives in the way that static analysis does.
One other important context to consider: is it worth the time? My approach to false positives is this: if it takes 3 days to fix a bug, it’s better to spend 20 minutes looking at a false positive, as long as I can tag it and never have to look at the same issue again. It’s a matter of viewing it in the right context. Take threads, for example. Problems with threads are dramatically difficult to discover – if you want to find an issue related to threads it might take you weeks to track it down. I’d prefer to write the code in such a way that the problems cannot occur in the first place, i.e. move my process from detection to prevention.
Your runtime rule set should closely match your static analysis rule set. They can find the same kinds of problems, but the runtime analysis has a massive number of paths available to it, because at runtime stubs, setup, initialization, etc. are not a problem. The only limit is that it checks only the paths your test suite happens to execute.
It’s very important that static analysis doesn’t overrun people. This means that you need to make sure that the number of errors you report to people is reasonable – if you start to overrun them with errors you’ll have a problem. One of the most common problems with static analysis is starting with a set of rules that is too big. The first step in static analysis is to set up the right set of rules. Make sure that each of the rules is something that your developers will be glad they got a message about, rather than annoyed. You want the reaction to be “wow, I’m glad the tool found that for me”, not “ugh, another stupid static analysis error that doesn’t matter.”
It’s better to start with a single rule that everyone thinks is wonderful than have a comprehensive rule set that checks for everything under the sun. Pay attention to feedback on static analysis rules you’re using. If you find a lot of people are questioning the same rule, or that it frequently has suppressions, it’s a good idea to review that rule and decide whether it’s really appropriate at this time. The rule-of-thumb you can use is “would I fix the code if I found this error?” If not, the configuration needs to change.
There are other configuration options that will affect the perception of false positives, and thus the adoption rate, besides just the particular rules you choose. Some rules might be appropriate for new code, but not for legacy code. It’s important to set up cut-off dates for legacy code that match your policy. If you have a policy that you only touch the lines of code related to a bug, then your static analysis needs to match that. If you have a policy that says “fix all static analysis violations when you work on a file”, then your configuration should match that.
A simple trap to avoid is checking things you aren’t planning on fixing. This sounds obvious, but the mistake is surprisingly common. For example, there is a rule that you really like and you would like to have code compliant with it someday. Right now you’re not willing to stop a release, or fight with the team, etc., to get compliance, but you really want to use the rule someday. The best idea is to wait until that day. For now, enable only rules you care about; then no one gets overwhelmed and you don’t send a mixed message about static analysis.
Another common trap is running on legacy code where you don’t plan on fixing violations, or even won’t allow fixes because of the possibility of problems introduced by changing the code. Put simply, don’t bother testing something you’re not going to fix.
Static analysis when deployed properly doesn’t have to be a noisy unpleasant experience. Make sure your deployment goes right.