Tag Archives: opensource

Open-Source Project Activity Demystified

Open-source projects are spread across a wide spectrum of maturity and activity. When choosing to use open-source it’s important to select a project that has lots of active contributors and recent development unless you’re expecting to take on the project development yourself.

Project activity can be gauged from the statistics that GitHub provides. Projects are often started by a single individual who has a particular problem they want or need to solve. Once the software is “working” the project can stagnate. A few select projects reach a critical mass where multiple contributors work to keep the project up to date, fix bugs and add features, producing a large, useful, popular project.

Open-source activity basics

Here we will compare a small, semi-active project, Netflix curator, with an active, popular one, Angular.js, to see how you can tell the difference. First, there are three basic statistics at the top of every GitHub project: Watch, Star and Fork.


Watch is the number of people who have added the project to their watchlist. This gives them updates about the project and is an indication of the number of people who care about changes to the code, rather than just use the project.

Star is the number of people who find a project interesting and want to indicate that. It also adds a bookmark for favorite projects.

Fork is the number of people who have copied the repository to their own account, usually with the intention of adding their own changes to it. Often such people don’t actually contribute, but the count shows a level of interest in contributing.

Notice that the very popular and active Angular.js project has over ten times as many watchers as Netflix curator. As for Forks, Angular.js has an even bigger margin over Netflix curator – almost one thousand times as many forks.
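To make this kind of comparison repeatable, here is a minimal Python sketch that computes how many times larger each statistic is for one project than for another. It assumes the field names used by the GitHub REST API repository object (subscribers_count for Watch, stargazers_count for Star, forks_count for Fork). The function name and the sample numbers are made up for illustration and are not the real figures for either project.

```python
from typing import Dict

# Assumed mapping from the labels on the GitHub page to the field names
# in the GitHub REST API repository object.
FIELDS = {
    "watch": "subscribers_count",
    "star": "stargazers_count",
    "fork": "forks_count",
}


def popularity_ratios(big: Dict[str, int], small: Dict[str, int]) -> Dict[str, float]:
    """Return how many times larger each statistic is in `big` than in `small`."""
    ratios = {}
    for label, field in FIELDS.items():
        # Treat a count of zero as one to avoid dividing by zero.
        denominator = max(small.get(field, 0), 1)
        ratios[label] = big.get(field, 0) / denominator
    return ratios


# Hypothetical numbers for illustration only.
popular = {"subscribers_count": 3000, "stargazers_count": 30000, "forks_count": 9000}
quiet = {"subscribers_count": 200, "stargazers_count": 400, "forks_count": 10}

for label, ratio in popularity_ratios(popular, quiet).items():
    print(f"{label}: {ratio:.0f}x")
```

The dictionaries could just as well be filled from the JSON that GitHub returns for each repository; they are hard-coded here so the sketch runs without network access.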

Contributors

A second area to look at is the “Graphs” tab, which presents information about contributors, frequency of code changes, etc. in graphical form. The graphs below show the contributors to each project.

Graph of top contributors to angular.js project
Angular.js top contributors

Notice that the top 4 contributors to Angular.js each have tens of thousands of commits. The list of significant contributors is quite large, which not only provides a wealth of ideas for new features but also reduces risk when a contributor leaves the project.

In contrast, the top 4 contributors to Netflix curator quickly drop to fewer than 100 commits – again a difference of almost one thousand times. If the main contributor leaves, or grows bored and moves on to something else, the project stagnates completely – if you want anything you’ll need to do it yourself.

Graph of top contributors to netflix curator project
Netflix curator top contributors
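The risk of depending on a single contributor can be put into rough numbers. The sketch below is one possible way to do it, not a standard metric: it takes a list of per-contributor commit counts and reports the share held by the top contributor, plus how many contributors it takes to account for half of all commits. The function names and the commit counts are hypothetical.

```python
from typing import List


def top_contributor_share(commit_counts: List[int]) -> float:
    """Fraction of all commits made by the single most active contributor."""
    total = sum(commit_counts)
    if total == 0:
        return 0.0
    return max(commit_counts) / total


def contributors_for_half(commit_counts: List[int]) -> int:
    """Smallest number of contributors who together made at least half the commits."""
    total = sum(commit_counts)
    if total == 0:
        return 0
    running = 0
    # Walk from the most active contributor downwards.
    for n, count in enumerate(sorted(commit_counts, reverse=True), start=1):
        running += count
        if running * 2 >= total:
            return n
    return len(commit_counts)


# Hypothetical commit counts, not taken from either project.
one_person_project = [400, 30, 10, 5, 5]
shared_project = [900, 800, 700, 600, 500, 500]

for name, counts in [("one-person", one_person_project), ("shared", shared_project)]:
    share = top_contributor_share(counts)
    print(f"{name}: top share {share:.0%}, half the commits from {contributors_for_half(counts)} people")
```

A project where one person accounts for most of the commits is the one most likely to stall if that person moves on.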

Code change frequency

Next we can look at the frequency of code change. Netflix curator exhibits a common tendency: a project stagnates once the single original contributor has the basics of the desired functionality working.

Graphs of code update frequency for netflix curator project
Netflix curator code update frequency

A larger set of contributors with more ideas and free time helps to keep a project vibrant as you can see with the Angular.js project. Studies have shown that larger and more complex open-source projects tend to attract more developers.

Graphs for code update frequency for angular.js project
Angular.js code update frequency
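What the two frequency graphs show can also be checked from a plain list of weekly commit totals. The following is a rough sketch under stated assumptions: the list is ordered oldest week first, the 12-week idle threshold is an arbitrary choice, and both the function names and the sample data are invented for illustration.

```python
from typing import List


def weeks_since_last_commit(weekly_commits: List[int]) -> int:
    """Count the trailing weeks with zero commits (list is oldest week first)."""
    idle = 0
    for count in reversed(weekly_commits):
        if count > 0:
            break
        idle += 1
    return idle


def looks_stagnant(weekly_commits: List[int], idle_threshold: int = 12) -> bool:
    """Flag a project whose most recent `idle_threshold` weeks had no commits."""
    return weeks_since_last_commit(weekly_commits) >= idle_threshold


# Hypothetical data: an early burst of work followed by silence,
# and a project with steady activity every week.
early_burst = [20, 35, 18, 6, 2] + [0] * 20
steady = [15, 22, 9, 30] * 6

print("early burst stagnant:", looks_stagnant(early_burst))
print("steady stagnant:", looks_stagnant(steady))
```

The threshold is worth tuning to the kind of project: a small, finished utility may legitimately go quiet for months, while a framework tracking fast-moving platforms should not.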

Network / Project forks

Finally, we can check the network graphs to see how many people are forking the project and doing something new with it, which is a telling indicator of how many people are really interested in the project and want to do their own thing with it. Note here that we have only a couple of forks for Netflix curator that were never merged back in,

graph showing how many forks there are for netflix curator project
Netflix curator network forks

while the Angular.js project has too many forks to display.

message that there are too many forks to display
angular.js network forks

At any given time you can quickly see which repositories are most active by checking https://bithub-ranking.com. An explanation of the GitHub statistical graphs is available at https://help.github.com/articles/about-repository-graphs/.

While most typical open-source projects won’t make the most-popular list, doing a bit of investigation into the health of an open-source project can help make sure that the code you’re using will be maintained and updated to keep up with emerging technologies for years to come.

Open Source Security Webinar

Don’t tell me, I don’t want to know!

I’m doing a webinar on open source security with Parasoft and my friends at Protecode about how to make sure that the open source you’re including in your application is secure. It seems like some people want to take the attitude of “don’t tell me, I don’t want to know what can happen”.

The truth is that open source has the same kinds of vulnerabilities that your own source code does. Does this mean that you should avoid it? Of course not. Join the webinar and we’ll explain in detail how to make sure you’ve got the latest patches, how to make use of the National Vulnerability Database from the US government, and how to make sure that there are no other vulnerabilities lurking in the code you rely on.

We’re holding a couple of sessions, so you should be able to join no matter what timezone you’re in. Sign up for free. Hope to see you there.

When: Wednesday June 18th 2014 at 9am EDT, 6:30pm IST (India), 3:00pm CET (Central Europe), 2:00pm (UK)
Repeat: Wednesday June 18th 2014 at 2pm EDT, 11:00am PDT

My Favorite Open Source software

I have a love-frustration relationship with open source software. I could never say I hate it, because I don’t. I am however painfully aware of not only how many bad open-source projects are out there, but how many almost-great ones there are that come tantalizingly close to making the grade. We still haven’t reached the point, for example, where you can put a Linux desktop in the hands of the average consumer without good sysadmin backup. Contrast that with the millions upon millions of Windows systems in the hands of the technologically challenged that continue to work. I wish it weren’t so, but it is.

Ubuntu has made strides in this area with their One Hundred Papercuts project but there is still a long way to go. Many open-source projects remain too geeky and too buggy for mainstream success. On the other hand, there are some astounding successes that continue to give me hope. Without going into problems inherent in open-source vs traditional proprietary development (I’ll leave that for another day) I’d at least like to mention the applications that have made my life better, or have even changed the world. Don’t worry, there is a poll at the end for you to choose your own favorite. As always you can sound off in the comments as well.

This is my top 10 list of open source that really works. The criteria are that it works well but doesn’t require you to be an uber-geek, or at least not a sysadmin. You might be an uber-geek in your area of expertise, such as video or database. I expect a few people to complain about the criteria, but I think one of the biggest problems for the open-source movement, especially Linux and Android, is that they can’t gain traction with the mainstream because they not only have the option for radical configuration, but the requirement for the same. Most people have software to solve some particular need, not to play with the software itself. Yes, I know there is a group out there that loves playing with stuff for its own sake, and that’s OK – there’s nothing wrong with it. But recognize that it’s a minority position and that most people just want stuff to work.

Many will disagree with the list, but that’s not really possible – it’s a list of MY favorites, not yours. 😉 Feel free to mention yours in the comments though – maybe I’ll change my mind. There’s a poll at the bottom.

Top Ten Open Source Projects

In no particular order

Apache web server
Who can deny that Apache has changed the world? It’s a really great, really powerful web server. But it also just works out of the box. You’ve gotta love it. I’ll bundle Tomcat into this since I almost always use them together as do many others. You can use them separately if you want.

Ubuntu Linux
Who could miss putting Linux on a list like this? Currently my favorite flavor is Ubuntu. They have a simple installation, streamlined updates, a commitment to fixing annoyances, and more. They work well from desktop to server to virtualized JeOS environments (meaning Just enough Operating System). (More on JeOS at a later date).

MySQL
Having a good, powerful yet simple database available without massive cost and unencumbered by crazy licensing is probably one of the unsung heroes of the modern web. Without data-driven websites we’d all still be reading static text articles. Again, MySQL can be tweaked ad nauseam, but also works well essentially out of the box for people who don’t want to waste time. I hope that Oracle keeps this gem alive. [Update 2011-10-04]Oracle has released performance updates to MySQL[/Update]

Eclipse
In the bad-old days we used to have a variety of expensive, annually updated development environments on Windows. On Unix it was mostly command line – open four windows, one each for edit, compile, run, debug. Tools had to work hard to integrate, and tool vendors made difficult choices about what environments to support.

Eclipse, being not only free but open with a well-designed API, let us move from working on our development environment to working on the projects we wanted. I used to spend a lot of time keeping my Emacs (with VI plugin!) working with all its crazy plugins. My favorite Eclipse flavor happens to be MyEclipse because it has so much useful stuff built into it. Probably most of it I could find somewhere, but this way I can just install what I need in one shot and it just works.

Standard disclaimers about working for Parasoft aside, I never leave home without Jtest plugged into my Eclipse for testing my Java code even though it’s not open source.

Bugzilla
You can’t build great software without having a good bug-tracking system, and although Bugzilla is open-source / free, it remains one of the best. It has enough features for most organizations, is lightweight, and easy to use. It also has a well-published API with all kinds of nifty clients and plugins being created for it. For example, I have one on my iPad that in some ways is better than the web interface.

WordPress
What’s the web without blogging, self-publishing, storefronts, etc.? WordPress makes it easy for everyone to start a site without being an HTML expert. I have been using it after doing things by hand for years, and I’m starting to rethink some of the other projects I’m working on. I know some will think Drupal at this point. It certainly has a large following as well, perhaps even larger. But I prefer WordPress for its absolute usability. I’ve played with Drupal and was just never comfortable deploying it for real life. For most people, WordPress may be a better choice. I know it is for me.

VLC
I’m one of those people who has my computers integrated into my home entertainment system. I prefer living without silly disks and other dinosaur media. Years ago I moved away from music CDs, and I’m close to being done with DVD and Blu-ray. The great thing about VLC is that it will play any video format you have. Really, anything. No messing around with plugins, codecs, video frame rates, and all that geeky stuff. Just right-click your video file and “open with VLC” and you’re off and running.

I have no desire to continually reconvert my videos to different formats to accommodate new devices, and VLC lets me just watch what I want. Again, it even works on iPhone and iPad, for all those who think you have to have Apple format video through iTunes.

Plex / XBMC
Plex is just a Mac fork of XBMC. From my experience, they seem to have improved on the original, but I haven’t spent enough time on the XBMC/Windows side to really be sure. This is the media-server equivalent of what VLC does for a single device. Basically you point it to the drive(s) where you store your videos, music and photos. It sets up a server which can speak DLNA so that your “usual” media server clients can use it, like Playstation, Samsung TVs, Xbox, and more. You can also run a hard video line from your computer to your TV for an even better experience. This is one of those things that will ultimately lead to people cutting the cable. It has a plugin architecture and people are continually adding what they call “channels”, which are really wrappers around existing web-based content such as Comedy Central, CNN or Al Jazeera. You’ve got to try it to really understand, but it’s amazing.

Gimp
For those who need high-power image editing and manipulation, this is your GNU alternative to those expensive programs out there. Maybe this one isn’t for the faint of heart, but for those familiar with image editing, it’s no problem to use. I’ve moved almost completely to Gimp and don’t expect to pay for other image editing programs in the future.

Firefox & Thunderbird
Firefox and Thunderbird, both from Mozilla, are definitely starting to feel dated where once they seemed cutting edge – sorry guys. I hope you push back to the front. But I cannot discount the contribution they’ve made to the world. I was once a dedicated Firefox user, but now it just looks clunky compared to others. If it weren’t for Firefox we’d probably all be stuck with lousy browsers where instead we have several choices now. Thunderbird let us escape from the horrible enterprise email monoliths before full Ajax web-based email clients made it easy to live without a local mail client. Kudos, Mozilla.

Open Source Honorable Mentions

Projects that just didn’t quite make the top ten, although many would probably make the top 20. You’ll notice a recurring theme that I left many out because they’re just too wonky or complicated for everyday use or for “regular” people.

Apache Commons
I love Apache Commons. It has more cool useful libraries than I can count that keep me from reinventing the wheel. However I suspect that it’s not used nearly as often as it could be. I almost put this in the top 10, but decided it’s probably too limited in use as well as scope – only developers feel this. Or maybe end-users should count as well? I can’t decide. But I can’t live without it either.

OpenOffice
I have to admit I’m a fan of anything that keeps the world from a single provider for office tools. But my experience with OpenOffice is frequently that it’s not quite there yet. There are some annoyances in conversion to/from the MS Office files that I have to use in my everyday life, and this means I just can’t rely on it. I wish I could. But I see improvements being made, such as a native OS X client rather than relying on an X server, so I have hope.

source control. There are a lot of good programs out there, but for most projects this will not only do the job, but do it well. By the way, it’s free of course. Why do people still pay for that big heavy source control program? You know the one I’m talking about.

Hudson / Jenkins
Continuous integration and build automation are very useful to software development. I find these tools useful for automation in general as well – goodbye Cron! It used to be just Hudson and then there was a split. Honestly I don’t know which one is a better choice right now – feel free to voice your opinion. I’m still using Hudson because I’ve gotten used to it. Probably you can’t go wrong with either.

Maven
Maven is a big improvement over “make” that we all used to use to build our software. It’s really the next generation of Ant, which is a great thing in itself. This one comes with a caveat though – there is a certain religious fervor that can cling to some Maven users. Using Maven can lead to lost productivity if pushed to extremes. And the “convention over configuration” mantra is a nice idea, but really the same is true if you simply do things in a standard Eclipse configuration. In practice it means “if you totally reconfigure all your development projects and source layout and builds to do what we think is good, you won’t have to reconfigure them”. We used to call that “my way or the highway”. Caveat emptor. Used properly, Maven will make your life better; used indiscriminately, it will be painful.

Android
Android isn’t really free, and at least the current release isn’t really open. But still, it’s a nifty idea. I’ll ignore for a moment the potential IP issues until they get resolved in court one way or another, but having a strong mobile OS to compete and drive innovation helps everyone. Awesome.

Audacity
Audacity has a lot going for it. The times I’ve tried it I’ve always suspected it would probably do what I need, but it was tough to figure out. I think it’s still just too wonky for regular people. I suspect it will remain that way, as simple audio editing is becoming more and more available, even on our smartphones. Those who do heavy audio editing may disagree – let me know.

VNC
VNC is a remote desktop technology that works on Unix, including Mac. It comes in a lot of varieties such as RealVNC and TightVNC – on the Mac it’s actually baked in as the native remote desktop. It’s a great idea, but obviously there are more people using Microsoft’s remote desktop, so I couldn’t put it in the top ten. But I use it all the time – maybe Microsoft will give up their proprietary ways and switch, but holding your breath is probably not a safe bet.

Handbrake
If you work with video files, such as ripping your movie collection, converting it to play on your Playstation, Xbox, PSP, iPod, smartphone, etc. then this is for you. It’s a powerful, full-featured open-source video conversion program, but it suffers from nearly terminal geekiness. Out-of-the-box settings yield mediocre results compared to what a really good video file should have. Going beyond that requires an extreme amount of esoteric knowledge, and even at that it can be tricky to repeat at a later date. If you know video you can really enjoy this; if you’re a beginner you might get lost.

MediaWiki
We’ve all used Wikipedia. This is the software behind it. Lots of great things are being done around the web with wikis, but even without all that, you only need to look at Wikipedia to see how amazing this can be and has been. There are still some core issues that need to be worked out with the idea of a canonical encyclopedia coming from a wild open community.

As you’ve probably figured out by now, there is some really great stuff coming out of Apache and SourceForge. I haven’t covered Google much here because they’re more on the free application side in many cases, though they do manage a lot of open-source as well – it’s worth a look both as an end-user (Gmail, etc.) and as a developer.