As part of my ongoing series about Static Analysis issues I want to talk about the relationship between the traditional static method and the newer dynamic or flow analysis method. People seem to misunderstand how the techniques relate and what each is good at. In particular, many seem to think that flow analysis is a replacement for non-dynamic analysis, which couldn’t be more wrong.
For the sake of having simple terms to identify both methods, I’ll refer to the older “static” method of static analysis as “pattern-based” and the newer flow-based method as “flow-based”. This is somewhat of a misnomer in that both types are really based on patterns, but it seems to be a fairly common way of referring to the two methods. If the terms I use bother you, feel free to do a search-and-replace in your head when reading. I’m not too worried at this point about a strict technical explanation of each; I’m more interested in their relationship. The goal is simply to have terms that differentiate these two particular types of static analysis. Of course there are other types of static analysis as well, but I’ll leave that for another day.
Let me begin by saying that there is in fact a very strong relationship between pattern-based and flow-based static analysis, at least at an academic level. In almost every situation there is a set of pattern-based rules that would allow you to code in such a way that would prevent the occurrence of the issue being found by the flow-based rule. Given the nature of how flow analysis works, it can never find all possible paths through an application. This makes it a good idea to start programming in a more proactive way to prevent the possibility of issues you’re concerned about.
For example, in security, one of the basic problems is using tainted data. Somewhere in the application between getting data from the user and operating on the data, you need to check if the data is safe. Depending on how far apart the operations are, it can be extremely difficult if not impossible to check every possible path. Security code scanners that rely on flow-based analysis attempt to find possible paths between user input and uses of the input that allow tainted data to be operated on. They can never find every possible path even if you let them run for an incredibly long time.
Instead, if you restructure your code so that input validation is done at the moment of input, then you don’t have any paths to chase, and you don’t have to worry about tainted data in your application. Flow-based tools won’t find anything anymore, because you won’t have any unprotected paths. This is sometimes a more difficult sell for developers, since it doesn’t provide them with a single broken piece of code that needs to be fixed. Rather it tells them that the way they’re writing code now could be improved – a bitter pill to swallow.
However applying this same principle to things like memory corruption, resource consumption, etc. can make the program far more robust than chasing possible paths ever could.
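To make the tainted-data idea concrete, here is a minimal sketch in Python of what “validate at the moment of input” can look like. Everything in it is hypothetical and only for illustration: the names `Username`, `read_username`, `build_greeting`, and the username rule itself are my assumptions, not part of any particular tool or codebase.

```python
import re


class ValidationError(ValueError):
    """Raised when user input fails validation at the boundary."""


# Assumed rule for this sketch: 1-32 letters, digits, or underscores.
_USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,32}")


class Username:
    """A username that has already passed validation."""

    def __init__(self, raw: str):
        if not _USERNAME_PATTERN.fullmatch(raw):
            raise ValidationError(f"invalid username: {raw!r}")
        self.value = raw


def read_username(raw: str) -> Username:
    """Single entry point for user-supplied names: check here, once."""
    return Username(raw)


def build_greeting(user: Username) -> str:
    # Downstream code is written against Username, not str, so by
    # convention no unchecked string is supposed to reach this point.
    return f"Hello, {user.value}!"


if __name__ == "__main__":
    print(build_greeting(read_username("alice_01")))
```

Python will not enforce the `Username` annotation at runtime, so in practice you would pair this style with a type checker or a pattern-based rule that flags raw strings being passed where a validated type is expected. The point is that the check lives in one place, and there are no long paths between input and use left to chase.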
An excellent methodology is to start with flow-based analysis and fix the low-hanging fruit. Once you have compliance with your flow-based rule set, review what you’re doing with flow and compare it to pattern-based static analysis. Determine as best you can how to apply pattern-based analysis to catch all potential problems before they happen, and put that into place. This moves you from reacting to issues in your software to a more preventative stance.
There are those who say that flow-based analysis is preventative, but it’s still symptom driven – namely trying to find the openings and bugs you left in your code. Pattern-based analysis, when deployed properly, can be used to address the root problems. In our tainted data example, this means changing our coding style so that we don’t have paths where data could be tainted – root problem handled.
Essentially, flow-based analysis finds real bugs in possible paths. When you get a message from it, you just decide whether you care about that path or not. Pattern-based analysis, on the other hand, tells you about the potential for a bug, not necessarily about the existence of a bug. Again, with our security example, flow-based says “you used tainted data” where pattern-based says “this data could be tainted before use”.
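As a rough illustration of the pattern-based side, here is a toy rule written with Python’s standard `ast` module. It does no path tracing at all. It simply flags every `input()` call that is not passed directly to a `validate()` call. Both the rule and the `validate` name are assumptions I made up for this sketch; real tools ship far richer rule sets.

```python
import ast
from typing import List


def _is_call_to(node: ast.AST, name: str) -> bool:
    """True if node is a plain call like name(...)."""
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )


def unvalidated_inputs(source: str) -> List[int]:
    """Return line numbers of input() calls not wrapped directly in validate()."""
    tree = ast.parse(source)

    # First pass: collect input() calls that are direct arguments of validate().
    wrapped = set()
    for node in ast.walk(tree):
        if _is_call_to(node, "validate"):
            for arg in node.args:
                if _is_call_to(arg, "input"):
                    wrapped.add(arg)

    # Second pass: every other input() call violates the pattern.
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if _is_call_to(node, "input") and node not in wrapped
    )


SAMPLE = '''\
name = validate(input("name: "))
age = input("age: ")
print(name, age)
'''

if __name__ == "__main__":
    for line in unvalidated_inputs(SAMPLE):
        print(f"line {line}: this data could be tainted before use")
```

Notice that the rule complains about the second line of the sample whether or not `age` ever reaches anything dangerous. That is the trade-off described above: it reports the potential for a bug rather than a proven one, but it does not depend on finding the path.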
When compared, you can see that flow-based analysis is a great way to find low-hanging fruit, because it’s looking for bugs instead of you doing it. On the other hand, because it works by guessing (flow fans hate the “guessing” term) at possible paths through your code, it will always be by its very nature incomplete.
Pattern-based analysis on the other hand requires restructuring your code and behavior if you want to achieve its full value. Some code is not well suited to such change, such as working legacy code.
Used together you have a very powerful solution that is much more robust than either technique on its own.
As a reminder, I work for Parasoft, a company that among other things makes static analysis tools. This is however my personal blog, and everything said here is my personal opinion and in no way the view or opinion of Parasoft or possibly anyone else at all.
I have a love-frustration relationship with open source software. I could never say I hate it, because I don’t. I am however painfully aware of not only how many bad open-source projects are out there, but how many almost-great ones there are that come tantalizingly close to making the grade. We still haven’t reached the point, for example, where you can put a Linux desktop in the hands of the average consumer without good sysadmin backup. Contrast that with the millions upon millions of Windows systems in the hands of the technologically challenged that continue to work. I wish it weren’t so, but it is.
Ubuntu has made strides in this area with their 10,000 paper cuts project but there is still a long way to go. Many open-source projects remain too geeky and too buggy for mainstream success. On the other hand, there are some astounding successes that continue to give me hope. Without going into problems inherent in open-source vs traditional proprietary development (I’ll leave that for another day) I’d at least like to mention the applications that have made my life better, or have even changed the world. Don’t worry there is a poll at the end for you to choose your own favorite. As always you can sound off in the comments as well.
This is my top 10 list of open source that really works. The criteria are that it works well but doesn’t require you to be an uber-geek, or at least not a sysadmin. You might be an uber-geek in your area of expertise, such as video or database. I expect a few people to complain about the criteria, but I think one of the biggest problems for the open-source movement, especially Linux and Android, is that they can’t gain traction with the mainstream because they not only have the option for radical configuration, but the requirement for the same. Most people have software to solve some particular need, not to play with the software itself. Yes, I know there is a group out there that loves playing with stuff for its own sake, and that’s OK – there’s nothing wrong with it. But recognize that it’s a minority position and that most people just want stuff to work.
Many will disagree with the list, but that’s not really possible – it’s a list of MY favorites, not yours. 😉 Feel free to mention yours in the comments though – maybe I’ll change my mind. There’s a poll at the bottom.
Top Ten Open Source Projects
In no particular order
Apache web server
Who can deny that Apache has changed the world? It’s a really great, really powerful web server. But it also just works out of the box. You’ve gotta love it. I’ll bundle Tomcat into this since I almost always use them together as do many others. You can use them separately if you want.
Linux
Who could miss putting Linux on a list like this? Currently my favorite flavor is Ubuntu. They have a simple installation, streamlined updates, a commitment to fixing annoyances, and more. They work well from desktop to server to virtualized JeOS environments (meaning Just enough Operating System). (More on JeOS at a later date).
MySQL
Having a good, powerful yet simple database available without massive cost and unencumbered by crazy licensing is probably one of the unsung heroes of the modern web. Without data-driven websites we’d all still be reading static text articles. Again, MySQL can be tweaked ad nauseam, but also works well essentially out of the box for people who don’t want to waste time. I hope that Oracle keeps this gem alive. [Update 2011-10-04] Oracle has released performance updates to MySQL. [/Update]
Eclipse
In the bad-old days we used to have a variety of expensive, annually updated development environments on Windows. On Unix it was mostly command line – open four windows, one each for edit, compile, run, debug. Tools had to work hard to integrate, and tool vendors made difficult choices about what environments to support.
Eclipse, being not only free but open with a well-designed API, let us move from working on our development environment to working on the projects we wanted. I used to spend a lot of time keeping my Emacs (with VI plugin!) working with all its crazy plugins. My favorite Eclipse flavor happens to be MyEclipse because it has so much useful stuff built into it. Probably most of it I could find somewhere, but this way I can just install what I need in one shot and it just works.
Standard disclaimers about working for Parasoft aside, I never leave home without Jtest plugged into my Eclipse for testing my Java code even though it’s not open source.
Bugzilla
You can’t build great software without having a good bug-tracking system, and while Bugzilla is open-source / free, it remains one of the best. It has enough features for most organizations, is light-weight, and easy to use. It also has a well-published API with all kinds of nifty clients and plugins being created for it. For example, I have one on my iPad that in some ways is better than the web interface.
WordPress
What’s the web without blogging, self-publishing, storefronts, etc.? WordPress makes it easy for everyone to start a site without being an HTML expert. I have been using it after doing things by hand for years, and I’m starting to rethink some of the other projects I’m working on. I know some will think Drupal at this point. It certainly has a large following as well, perhaps even larger. But I prefer WordPress for its absolute usability. I’ve played with Drupal and was just never comfortable deploying it for real life. For most people, WordPress may be a better choice. I know it is for me.
VLC
I’m one of those people who has my computers integrated into my home entertainment system. I prefer living without silly disks and other dinosaur media. Years ago I moved away from music CDs, and I’m close to being done with DVD and blu-ray. The great thing about VLC is that it will play any video format you have. Really, anything. No messing around with plugins, codecs, video frame rates, and all that geeky stuff. Just right-click your video file and “open with VLC” and you’re off and running.
I have no desire to continually reconvert my videos to different formats to accommodate new devices, and VLC lets me just watch what I want. Again, it even works on iPhone and iPad, for all those who think you have to have Apple format video through iTunes.
Plex / XBMC
Plex is just a Mac fork of XBMC. From my experience, they seem to have improved on the original, but I haven’t spent enough time on the XBMC/Windows side to really be sure. This is the media-server equivalent of what VLC does for a single device. Basically you point it to the drive(s) where you store your videos, music and photos. It sets up a server which can speak DLNA so that your “usual” media server clients can use it, like Playstation, Samsung TVs, Xbox, and more. You can also run a hard video line from your computer to your TV for an even better experience. This is one of those things that will ultimately lead to people cutting the cable. It has a plugin architecture and people are continually adding what they call “channels” which are really wrappers around existing web-based content such as Comedy Central, CNN or Al Jazeera. You’ve got to try it to really understand, but it’s amazing.
For those who need high-power image editing and manipulation, this is your GNU alternative to those expensive programs out there. Maybe this one isn’t for the faint of heart, but for those familiar with image editing, it’s no problem to use. I’ve moved almost completely to Gimp and don’t expect to pay for other image editing programs in the future.
Gimp
Firefox & Thunderbird
Firefox and Thunderbird, both from Mozilla, are definitely starting to feel dated where once they seemed cutting edge – sorry guys. I hope you push back to the front. But I cannot discount the contribution they’ve made to the world. I was once a dedicated Firefox user, but now it just looks clunky compared to others. If it weren’t for Firefox we’d probably all be stuck with lousy browsers, where instead we have several choices now. Thunderbird let us escape from the horrible enterprise email monoliths before full Ajax web-based email clients made it easy to live without a local mail client. Kudos, Mozilla.
Open Source Honorable Mentions
Projects that just didn’t quite make the top ten, although many would probably make the top 20. You’ll notice a recurring theme that I left many out because they’re just too wonky or complicated for everyday use or for “regular” people.
Apache Commons
I love Apache Commons. It has more cool useful libraries than I can count that keep me from reinventing the wheel. However I suspect that it’s not used nearly as often as it could be. I almost put this in the top 10, but decided it’s probably too limited in use as well as scope – only developers feel this. Or maybe end-users should count as well? I can’t decide. But I can’t live without it either.
OpenOffice
I have to admit I’m a fan of anything that keeps the world from a single provider for office tools. But my experience with OpenOffice is frequently that it’s not quite there yet. There are some annoyances in conversion to/from the MS Office files that I have to use in my everyday life, and this means I just can’t rely on it. I wish I could. But I see improvements being made, such as a native OSX client rather than relying on an Xserver, so I have hope.
Hudson / Jenkins
Continuous integration and build automation are very useful to software development. I find these tools useful for automation in general as well – goodbye Cron! It used to be just Hudson and then there was a split. Honestly I don’t know which one is a better choice right now – feel free to voice your opinion. I’m still using Hudson because I’ve gotten used to it. Probably you can’t go wrong with either.
Maven
Maven is a big improvement over the “make” that we all used to use to build our software. It’s really the next generation of Ant, which is a great thing in itself. This one comes with a caveat though – there is a certain religious fervor that can cling to some Maven users. Using Maven can lead to lost productivity if pushed to extremes. And the “convention over configuration” mantra is a nice idea, but really the same is true if you simply do things in a standard Eclipse configuration. In practice it means “if you totally reconfigure all your development projects and source layout and builds to do what we think is good, you won’t have to reconfigure them”. We used to call that “my way or the highway”. Caveat emptor. Used properly, Maven will make your life better; used indiscriminately, it will be painful.
Android
Android isn’t really free, and at least the current release isn’t really open. But still, it’s a nifty idea. I’ll ignore for a moment the potential IP issues until they get resolved in court one way or another, but having a strong mobile OS to compete and drive innovation helps everyone. Awesome.
Audacity
Audacity has a lot going for it. The times I’ve tried it I’ve always suspected it would probably do what I need, but it was tough to figure out. I think it’s still just too wonky for regular people. I suspect it will remain that way, as simple audio editing is becoming more and more available, even on our smartphones. Those who do heavy audio editing may disagree – let me know.
VNC
VNC is a remote desktop technology that works on Unix including Mac. It comes in a lot of varieties such as RealVNC and TightVNC – on the Mac it’s actually baked in as the native remote desktop. It’s a great idea, but obviously there are more people using Microsoft’s remote desktop, so I couldn’t put it in the top ten. But I use it all the time – maybe Microsoft will give up their proprietary ways and switch, but holding your breath is probably not a safe bet.
If you work with video files, such as ripping your movie collection, converting it to play on your Playstation, Xbox, PSP, iPod, smartphone, etc. then this is for you. It’s a powerful, full-featured open-source video conversion program, but it suffers from nearly terminal geekiness. Out-of-the-box settings yield mediocre results compared to what a really good video file should have. Going beyond that requires an extreme amount of esoteric knowledge, and even at that it can be tricky to repeat at a later date. If you know video you can really enjoy this; if you’re a beginner you might get lost.
We’ve all used Wikipedia. This is the software behind it. Lots of great things are being done around the web with wikis, but even without all that, you only need to look at Wikipedia to see how amazing this can be and has been. There are still some core issues that need to be worked out with the idea of a canonical encyclopedia coming from a wild open community.
As you’ve probably figured out by now, there is some really great stuff coming out of Apache and Sourceforge. I haven’t covered Google much here because they’re more on the free application side in many cases, though they do manage a lot of open-source as well – it’s worth a look both as an end-user (Gmail, etc.) and as a developer.
I recently added an iPad to my technology arsenal. I’ve been using a MacBook Air for several years now and I really like the size & weight, but compared to the iPad it’s enormous and has a short battery life. I get a lot of use out of my iPhone, so I know what the iPad can do. Some continue to insist the iPad is just a toy.
For me, the big question mark is the keyboard. If I’m carrying an iPad and a notebook while traveling, it seems a bit ridiculous. If I carry an iPad and a keyboard, isn’t that just a notebook? Is there any real advantage to such a thing? Well I decided to give it a try and see what happens.
I travel a lot, so based on things like email, eBooks, and in-flight entertainment the iPad was a no-brainer, it can do them all very well. But I was wondering how useful I can actually make it in a software company.
If you’re going to try to use the iPad as a replacement device, there are a few categories of apps you’re going to be interested in. I’ve broken out a few of these categories and selected some basic apps to see how well they will work. I’ll go into more detail on each after I’ve got some real use.
I checked out a list of 30 Business Apps and while it has some interesting ideas, it didn’t cover some specific software development needs. I’m purposely ignoring the entertainment category such as music, video and games, as it’s well covered in many other places.
I bought a couple of office suites for comparison purposes. I know, this sounds crazy, but it’s still less than the price for Microsoft Office on a desktop. My basic needs are typical word-processing, simple spreadsheet, and presentations. I also have some apps for note-taking and planning.
I happen to be a fan of the Mac office suite with Numbers, Pages, and Keynote. Keynote especially is a great improvement over PowerPoint. So I bought each of those apps. I had already purchased a Keynote remote app for my iPhone that I haven’t had the chance to try out yet.
However, our office and clients remain largely Microsoft (MSFT) based, so I bought the QuickOffice apps as well. After initially giving it a pass, I decided to get DocsToGo too. There are one or two other major office suites for the iPad; I may try them at some point, but probably only if I’m missing something. I’ll be shaking them all down as much as I can in the coming weeks and I’ll give the long and short of it here.
There are a few other tools I use frequently in the office, such as a voice recorder to take dictation for creating presentations, training materials, whitepapers, articles, etc. At the moment this is centered heavily around the built-in memo recorder on the iPhone, and possibly Siri will help but I don’t know yet. I’ve also got Dragon Dictation but mostly it’s something I record and then later work from manually. I’ve got a few other audio recording apps as well. I’ll give a full shakedown in the near future.
Another useful office tool is having some kind of scanner software based on using the camera in the iPad. OCR on top of that is really the icing on the cake. I’ve got a couple of these installed, at the moment GeniusScan and JotNot Pro. I’ll see if I can figure out which is best.
For note taking I’ve frequently used simple text editors or a blank page in the word processor. While both work, neither is well-suited to the task. Ideally I should be able to type, write with my finger or stylus, and draw simple things to help illustrate the topic at hand. I should be able to save, edit, and share the document created. With that in mind, I’ve got a few note-takers installed like PenUltimate, but the one I have high hopes for is Note Taker HD.
I’ve also got an app that lets me try to study/plan/organize based on putting 3×5 cards on a cork-board. It looks really great, but I’m not yet convinced it’s actually useful or sustainable.
This is a category that some office users won’t need. If you’re an iPad in the office and at home kind of person, you can probably skip this set. My initial set includes the apps for the airlines I use, just in case I need them. I also have the TSA app for airport information. The bulk of my travel information comes from TripIt which I have found very useful on the iPhone.
I also take a GPS on the road with me via my iPhone, so I haven’t listed it as a necessary item for the iPad, even though the bigger screen makes for easy mapping.
I also already have some file sharing apps to transfer files on and off my device. Mostly I use iDisk since I have MobileMe, but long-term I will be doing something else. I’ve installed Box.net since they have the free 50GB offer running right now. I may get a small DropBox for comparison with that. And my latest favorite in this area is FileBrowser, which lets me do normal file system browsing on remote computers, such as my desktop.
I’ve got a few others as well, some of which have already been deleted for lack of usefulness.
For software development I have a decent Bugzilla client called iBzilla and a nice SVN source control client called CodeViewer 2, and I bought a few code editors I will be trying out, including Textastic, Koder, and of course, what computer is complete without VI?
One could argue that database is part of the development tools category, but I think it’s big enough to warrant separate treatment. I do a lot of database work, so again I already had iPhone apps for connection to various DBs such as Oracle and MySQL. I updated them to iPad versions. Mostly I use Navicat. I’ll discuss this in a separate post, but if anyone has suggestions for good DB apps I’d be happy to check them out.
I also have Bento for quick and dirty db stuff, but I currently don’t use it much. If I find a way to leverage it for work I’ll let you know.
Social / Web stuff
For social and web stuff I have the usual suspects, Twitter, WordPress, Polldaddy, LinkedIn, GoToMeeting, WebEx, Instant messengers, etc. This covers blogging, posting info, messaging, as well as video conferencing.
Geek stuff (SysAdmin)
I also have the usual array of geek utilities like DNS tools such as nslookup, VNC for remote login, and a terminal client that includes SSH support. There is some overlap between these so I’ll try to narrow it down to what you actually need with what’s good and bad about each.
Other iPad stuff
There are a few other things that are interesting. For example, I am making more use of the Kindle application now for books. I do have a Kindle and there are times I prefer it, but that’s normally for when I am not carrying the iPad.
I find the Kindle device great for reading, but the larger iPad better for reference books. One other benefit is that most technical manuals are much cheaper on Kindle than in print, and it’s very convenient to have them with you when you need them.
I’m going to assume that at least conceptually all of the ideas here are equally useful for other tablets. This will depend of course on having the apps you need available on the platform of your choice.
I could certainly borrow a tablet from a friend and do a similar experiment, but it’s something that takes time to do in depth, so we’ll see. If any of you are interested in a similar idea, let’s talk.
Updates on real world experience will be coming in the near future as I shake down each category.
Did you ever buy something, only to find out that it just wasn’t quite right for you? I don’t mean the usual buyer’s remorse over a large purchase, like a new car. I mean you bought a sports car, and somehow missed the fact that you like to haul your motorcycle to the desert on weekends. Oops!
Not surprisingly, you’ll find people do this frequently with small purchases, for example apps for your phone. You’re hoping for a specific utility, you read a description, it sounds right so you buy it. It might even seem to work OK in simple tests. I had this happen to me recently with a small external microphone I bought for my smartphone to do audio recording. It worked for a couple of minutes, but when I tried to actually use it, the audio was garbled or non-existent for much of the recording. Argh!
Frequently, this is exactly what happens when people decide to buy development tools. They take advice from someone who has used the tool individually, or in a limited environment. When they try to test the tool, perhaps in a pilot program, everything appears fine. Then when deployment begins so do the problems. False positives, configuration problems, poor workflow… the list is seemingly endless and sadly too familiar.
What happens is that the selection process for the tool is inadequate. Most POCs (proof-of-concept) that I see are really simple bake-offs. Someone has an idea in mind of what they think a tool should do and they create the good old checklist of features. Sometimes this is done with the help of a single vendor – a recipe for disaster. Other products are categorized based on the checklist, rather than looked at holistically to see what else they have to offer.
In addition, this methodology fails to take into account the biggest costs and most likely hurdles to success. In order to select the right tool, you have to take into account how it will work in your world.
If for example your developers spend their days in Eclipse, and you select a tool that doesn’t run in Eclipse, then you force them to spend time opening a second tool, possibly dealing with extraneous configuration. Not to mention that when they get the results, they’re not in the place they’re needed – the code editor.
Such issues compound over time and people, carrying a tremendous burden with them. For example, about 10 years ago people got enamored with the idea of doing batch testing for things like static analysis, and then emailing the results back to developers. While this may be the simplest way to set up static analysis, it’s nearly the worst way to deal with the results. You don’t need error messages in your email client, you need them in your editor. (See my earlier post on What Went Wrong with Static Analysis?)
These are just a couple of ways you can run into trouble. I’m doing a webinar at Parasoft about this on September 30th; registration is free. Stop by and check it out if you get a chance.