Category Archives: Software Development

AI Smackdown – Siri vs Eliza

John McCarthy © by eschipul
John McCarthy, creator of LISP and father of modern artificial intelligence (AI), died today. He was one of the first guys to work seriously on getting computers to do things that you wouldn’t normally have expected computers to do, like play chess or have a conversation.

One of the most common AI applications programmers have played with over the years is Eliza. Like others at the time, one of my first attempts at writing an interesting piece of software in my youth was to alter an Eliza program written in BASIC (as taken from the pages of Creative Computing, for all you old-timers). Eliza is a simple software psychoanalyst created in the 60’s by Joseph Weizenbaum. While Eliza wasn’t written by McCarthy, she was certainly a product of his influence on the industry.

Ever since the new iPhone came out with Siri, the digital assistant, I’ve been thinking about comparing her to Eliza. I don’t often get to play with AI here at Parasoft, and the death of McCarthy brought the topic to my mind again. I’ve put together a small comparison of questions and answers from Siri and Eliza. Note that with Siri, your mileage may vary; she doesn’t always answer the same way. I’ve obviously skipped most of the “useful” things that Siri can do, as it would be an unfair comparison for poor old Eliza, who doesn’t have all the modern advantages that Siri does.

It’s interesting because while they share a common logical bond, their personalities are very different. Siri was designed to be helpful and amusing. Eliza was specifically designed not to be helpful, in that she’s supposed to be a therapist and lead you to your own conclusions rather than give you answers. Eliza isn’t an assistant; she’s an attempt to play the “imitation game”, also known as the Turing test.

For the samples below I used a Java version of Eliza that you can try for yourself on the web.

Samples:
Q: Can you help me?
E: You believe I can help you don’t you ?
S: You can say things like… (A list of sample queries)

Q: What are you doing?
E: Why do you ask ?
S: Just thinking

Q: Who is your favorite composer?
E: Why are you concerned over my favorite composer?
S: I guess I don’t have one.

Q: Are you married?
E: Why are you interested in whether I am married or not ?
S: We were talking about you, not me.

Q: What is your name?
E: I am not interested in names.
S: My name? It’s Siri.

Q: You are beautiful
E: What makes you think I am beautiful?
S: Oh, Stop.

Q: I love you
E: Perhaps in your fantasies we love each other.
S: Oh, Stop.

Q: How are you?
E: Does that question interest you?
S: I am well ____ (name)

Q: What do you think?
E: What is it you really wanted to know?
S: It’s nice of you to ask, but it doesn’t really matter what I think
or I think differently.

Q: Good night
E: I’m not sure I understand you fully.
S: Ah… it’s still only _____ (fill in the time)
or other answers depending on the time of day.

Q: Can you send a message to my brother?
E: Tell me more about your family.
S: I don’t know who your brother is, To whom shall I send it, I need a contact name, phone number, or email address.
or What do you want to say to ____ (your brother)?
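
Eliza’s side of these exchanges comes from a very simple trick: look for a keyword, then hand your own words back with the pronouns swapped. Here is a minimal sketch of that idea in Python. It is not Weizenbaum’s script and not the Java version I used for the samples; the rule table and the names `reflect` and `respond` are made up for illustration.

```python
import re

# Hypothetical pronoun swaps; a real Eliza script has a much longer table.
REFLECTIONS = {
    "i": "you", "me": "you", "my": "your", "am": "are",
    "you": "I", "your": "my", "are": "am",
}

# Hypothetical rules, checked in order, so earlier keywords win.
RULES = [
    (re.compile(r"\b(?:brother|sister|mother|father|family)\b", re.I),
     "Tell me more about your family."),
    (re.compile(r"^can you (.*)", re.I),
     "You believe I can {0}, don't you?"),
    (re.compile(r"^are you (.*)", re.I),
     "Why are you interested in whether I am {0} or not?"),
]

DEFAULT_REPLY = "I'm not sure I understand you fully."


def reflect(fragment):
    """Swap first- and second-person words so the text can be echoed back."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(text):
    """Return the reply of the first rule whose pattern matches the input."""
    cleaned = text.strip().rstrip("?.!")
    for pattern, template in RULES:
        match = pattern.search(cleaned)
        if match:
            captured = [reflect(group) for group in match.groups()]
            return template.format(*captured)
    return DEFAULT_REPLY
```

With these rules, `respond("Can you help me?")` echoes the request back as a question, and anything that matches no keyword falls through to the default reply, much like the “Good night” sample above.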

I’m not sure about all of the reasons they’re so different. Sure, the state of the art in AI has come a long way since the 60’s. Or is it just that men’s expectations of women have changed? I was tempted to write that perhaps people are more friendly or helpful now than in the 60’s, but that’s ridiculous. Perhaps it’s only that computers are now more helpful and friendly than they were. Is it possible that Eliza’s seeming bad mood had something to do with her obvious handicaps in memory and CPU? Maybe she was aware of this, and it caused her to be ill-tempered? In any case, Eliza comes across as a bit cynical, while Siri is much more light-hearted most of the time. Siri’s mood can definitely change, as you can see from some of the answers.

It occurs to me that it would be funny to get Siri to talk to Eliza – would Eliza help Siri, or would Siri end up making Eliza more friendly?

So if your computer was nice to you today, thank John McCarthy.

[Update: I added a few more links and minor clarifications, as well as AI resources]

Here’s a list of my favorite fiction books about killer AI.

Some resources on artificial intelligence (AI):

Artificial Intelligence: The Basics

Artificial Intelligence for Humans, Volume 1: Fundamental Algorithms

Artificial Intelligence in the 21st Century (Computer Science)

The Artificial Intelligence Revolution: Will Artificial Intelligence Serve Us Or Replace Us?

Books on AI at Amazon

Your Two Cents About What Went Wrong With Static Analysis

I’ve gotten a lot of interesting feedback on the What Went Wrong with Static Analysis? post. So many people had their ideas about what was working, what wasn’t, and how to address it, that I thought I’d give people a chance to give their two cents.

I’ve created a poll with some basic issues as listed in the post and in various comments on it. Feel free to vote – there is a place to add something if it’s not already on the list. After it’s been up for a bit I’ll post some results and commentary as applicable.


Dennis Ritchie… Father of C, UNIX, and Much, Much More

I just posted a brief note about Dennis Ritchie at the Parasoft Blog. You can read about this amazing man who helped create C and Unix. Our thanks to him.

Reprinted below:


Dennis Ritchie – The creator of Unix and C

Dennis Ritchie, co-creator of the C programming language and UNIX operating system, died this week (2011). Back in the early days of Parasoft, we used to refer to “C” as “K&R C” (for “Kernighan and Ritchie C”). In fact, like lots of other long-time C programmers, many Parasoft veterans still have the classic C Programming Language book sitting on their bookshelves today.

Although he’s hardly a household name, Ritchie has had a tremendous influence on the software development community. Where might we be today if we didn’t have the luxury of building on his foundations?
On the language side, consider all the languages that were derived from C. Without C, there’s no C++…without which there’s no Java, no C#, and no Objective-C. An enormous amount of the software that we use every day was built on those languages.

And on the OS side, think of all the things that stemmed from UNIX. Without UNIX, there’s no Linux. No Mac OS X. No Solaris. And without Linux, where would the open source community be? Would we have Android? What would the mobile device market look like? The server market?

In many ways, his vision was eerily similar to that of Steve Jobs: have it do what you really need it to do… and no more. It’s the epitome of elegant engineering.

If you compare C to Java, one of the defining differences is that Java is a very rich language. It has a built-in library that will cover pretty much anything you can think of. C has none of this—but it’s fantastically fast. It takes a lot less code to do something in C than it does in Java, VB, C# and the like. That’s really why C is still so popular. It’s a great balance between being close to the computer (and thus efficient) and being human understandable. The newer languages are more human understandable, but the trade off is that they’re rather inefficient compared to C.

The UNIX kernel is the same way. Amazingly, Thompson and Ritchie’s UNIX kernel was only 64K, smaller than the current Linux keyboard driver! UNIX truly respects the concept of having layers in an operating system. At the core, there’s just a kernel that runs the computer. Services lie on top of that. Networking is separate. Hard drives are an add-on (not core to the OS). And the GUI is a very high-level layer. This separation enables extreme efficiency. For example, while moving Windows to a new chip tends to open a can of worms, it’s actually quite simple with UNIX.

A few remarkable quips from Ritchie:

“I am not now, nor have I ever been, a member of the demigodic party.”

“UNIX is very simple, it just needs a genius to understand its simplicity.”

“C is quirky, flawed, and an enormous success.”

As Jon “Maddog” Hall, executive director of Linux International, tweeted: “…all programmers owe him a moment of silence.”

For a nice tribute to Ritchie, see the special Dr. Dobb’s newsletter.

What is Static Analysis… and What is it Good For?

As I talk to people about static analysis, I get a lot of questions and it seems that static analysis means different things to different people. The definitions people use for static analysis can be any or all of the things in this list:

  • Peer Review / Manual Code Review / Code Inspection
  • Pattern-based code scanners
  • Flow-based code scanners
  • Metrics-based code scanners
  • Compiler / build output

A working definition of static code analysis is the analysis of computer software that is performed without actually executing the software being tested. I’d like to talk briefly about each of these techniques, when/where to use it, and why it’s helpful.

Perhaps the oldest version of static analysis is metrics-based code scanning, where we look at things like complexity or even simply the number of lines or methods in a file. Pattern-based code scanners are what some think of as the traditional static analysis technique. A more modern offshoot of this is flow-based code scanners, which look at paths through the code and say “Oh, this could happen to you or that could happen to you”. The last, which most people don’t think about, is the output from your compiler or your build process, which is a very valuable thing.

Peer Code Review

Let’s start with what we would call peer code review, manual code review, or code inspection. The idea is that humans look over each other’s shoulders to check whether the code is doing what you are trying to do. And there’s some really cool stuff out there to help you do this more efficiently. What you don’t want code review to be is checking syntax, or in fact anything that could be checked by an automated tool.

What we want is, at some point, to get other eyeballs on the code beyond those of the person who wrote it, so that one person’s decision to do something a certain way doesn’t become self-sustaining simply because nobody else ever looked at it.

Peer code review will help you find problems early, especially functional problems. The most important part of peer code review is having access to a mentoring process where other people look over your shoulder and can give you feedback like “you know… I would do that differently for this case. Here is a better way to do it”.

You learn from code review because you benefit from the experience of others. In addition it helps you learn the code base because you are looking at other pieces of an application rather than just your own.

Pattern based analysis

Pattern based static analysis is finding a specific pattern in your code. This could be a good pattern, meaning something you want in your code, or a bad pattern, meaning something you don’t want in your code (bugs).

For example, I may have a branding issue. I want to make sure my code prints out a copyright statement. In this case the pattern is that certain code exists where I expect it to be, such as the footer of a web page.

Or the pattern might be a bad one, such as code that doesn’t free up resources when it’s done, causing memory leaks.

It may also be a formatting issue: where the curly braces go, where underscores are used, or how names are capitalized. But we have to remember that it’s not just syntax problems; these are things that could really cause a bug, for example when we try to internationalize our product, or that could cause performance issues.
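
To show how little machinery a basic pattern check needs, here is a minimal sketch in Python. It is not how any particular commercial tool works; the `Rule` class, the two rules, and the `scan` function are assumptions made up for illustration. One rule is a “good” pattern that must be present (a copyright notice) and the other is a “bad” pattern that must be absent (tab indentation, standing in for a formatting rule).

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern
    required: bool  # True: must appear at least once; False: must never appear


# Hypothetical rules for illustration only.
RULES = [
    Rule("copyright-notice", re.compile(r"Copyright \d{4}"), required=True),
    Rule("no-tab-indent", re.compile(r"^\t+", re.MULTILINE), required=False),
]


def scan(source, rules=RULES):
    """Return (rule name, line number or None) for every violation found."""
    violations = []
    for rule in rules:
        matches = list(rule.pattern.finditer(source))
        if rule.required:
            # A required pattern is violated once, for the whole file.
            if not matches:
                violations.append((rule.name, None))
        else:
            # A forbidden pattern is violated at each place it appears.
            for match in matches:
                line_no = source.count("\n", 0, match.start()) + 1
                violations.append((rule.name, line_no))
    return violations
```

A page whose footer contains something like “Copyright 2011 Example Corp” and uses no tab indentation comes back with an empty list; anything else comes back with the rule name and, for forbidden patterns, the line to look at.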

Additionally, the really cool thing about pattern based static analysis is that it improves the developers themselves. For example, a developer writes code for accessing a database. The code includes a try/catch block, but the developer forgot to free up resources with a finally block.

The pattern based static analysis tool catches it and lets the developer know. After a few times with this warning, the developer will learn from it and start writing code with a finally block to avoid the violation (and the nagging from the tool).
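
Here is roughly what that looks like, sketched in Python, where catch is spelled `except`. The `FakeConnection` class is a hypothetical stand-in for a real database connection, there only so the example runs on its own.

```python
class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    def __init__(self):
        self.closed = False

    def query(self, sql):
        if "bad" in sql:
            raise ValueError("query failed")
        return ["row"]

    def close(self):
        self.closed = True


def fetch_leaky(conn, sql):
    # The violation: close() is skipped whenever query() raises.
    try:
        rows = conn.query(sql)
        conn.close()
        return rows
    except ValueError:
        return []


def fetch_safe(conn, sql):
    # The fix: the finally block runs whether or not query() raises.
    try:
        return conn.query(sql)
    except ValueError:
        return []
    finally:
        conn.close()
```

Both functions return an empty list when the query fails, which is exactly why the leak is easy to miss: only `fetch_safe` leaves the connection closed afterwards.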

In other words, the tool is actually teaching the developer by suggesting a best practice over and over again. Ideally, you encapsulate the intelligence of your best developers into pattern based rules.

Pattern based static analysis is not just to check syntax and code formatting. It’s designed to save you time, not to “take time”. Some people say they don’t have enough time to perform static analysis, but generally, you don’t have enough time to skip it.

Flow Analysis

Flow analysis is the idea that instead of looking for a specific pattern in a particular file or class, we are going to look for a pattern based on trying to follow a particular path through the application. But rather than run the application, it simulates the application execution.

Flow analysis looks at possible paths of the logic and then manipulates the data to see if the bad pattern appears. For instance, it might try to inject bad data to see if it causes a problem, such as a SQL injection.

The paths are hypothetical in that they may or may not actually occur when you use the application, but they are at least possible. The cool thing is that it finds real bugs in your application.
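
A toy version of the idea can be sketched in a few lines of Python. This is only an illustration under big simplifying assumptions, not how a real flow analysis engine is built: the “program” is a made-up list of tuples, and `find_tainted_sinks` is a hypothetical name. It walks every path through the program without running it, tracks which variables hold untrusted input, and reports any path where untrusted data reaches a sensitive call such as a SQL query.

```python
def find_tainted_sinks(block, tainted=frozenset(), path=()):
    """Walk every path through `block`; yield (path, variable) for each tainted sink."""
    if not block:
        return
    stmt, rest = block[0], block[1:]
    kind = stmt[0]
    if kind == "input":
        # Data from outside the program is untrusted.
        yield from find_tainted_sinks(rest, tainted | {stmt[1]}, path)
    elif kind == "assign":
        _, target, source = stmt
        # Taint follows the data from source to target.
        new = tainted | {target} if source in tainted else tainted - {target}
        yield from find_tainted_sinks(rest, new, path)
    elif kind == "sanitize":
        yield from find_tainted_sinks(rest, tainted - {stmt[1]}, path)
    elif kind == "sink":
        if stmt[1] in tainted:
            yield (path, stmt[1])
        yield from find_tainted_sinks(rest, tainted, path)
    elif kind == "if":
        _, label, then_block, else_block = stmt
        # Both branches are possible, so both are followed.
        yield from find_tainted_sinks(then_block + rest, tainted, path + (label + "=true",))
        yield from find_tainted_sinks(else_block + rest, tainted, path + (label + "=false",))
    else:
        raise ValueError("unknown statement kind: " + kind)


# Hypothetical program: user input is only cleaned up on one branch.
PROGRAM = [
    ("input", "name"),
    ("if", "validated", [("sanitize", "name")], []),
    ("assign", "query", "name"),
    ("sink", "query"),
]

findings = list(find_tainted_sinks(PROGRAM))
```

For this program the only finding is the path where `validated` is false, which is the hypothetical path where unsanitized input ends up in the query.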

One of the things that flow analysis can find is uncaught exceptions. This may not always be a problem, because sometimes you handle the exception another way. For example, web application servers commonly have a wrapper to catch all exceptions. This is important because an uncaught exception acts basically the same way as a system exit.

Sometimes you have application stability issues, and very frequently it is related to unhandled exceptions. In such a case, flow analysis is a big help in improving your application.

API misuse is another common source of problems that flow analysis can find. Where an API is not well understood or is poorly documented, misuse can lead to memory leaks or corruption.

With security, it finds some potential types of problems for you that you can start to work on. It’s a great first pass, but it’s not as powerful as pattern based analysis for preventing issues and for giving thorough coverage, because it is limited to the hypothetical paths that the testing tool can figure out.

Metrics

Metrics serve two goals: one is to understand what’s going on in the code, and the other is to find possible problems. They do this by measuring something in the code. Sometimes when people talk about metrics, they mean KLOC, cyclomatic complexity, number of methods or classes, things like that.

Metrics can point you to potentially dangerous design issues, which is very helpful. They are generally more useful at the design level than at the debugging level.

When tools started doing static analysis about 20 years ago, there were a lot of metrics in place. When people had a bug in the field and couldn’t reproduce it, they tried to use metrics to suggest where the problem might be, then used a debugger to check the area suggested by the metrics.

The problem is that sometimes it gives you a good idea and sometimes it doesn’t. It really depends on what metrics you are looking for. If you’re using metrics to try and find bugs, it can be difficult and time-consuming. But if you’re trying to use them to understand your application, they actually end up telling you things.

So let’s assume you have a metric that measures the number of lines in each file in your application and you start to notice giant files. It probably means that the design is not as good as it should be, because components should be very discrete, with known inputs and known outputs. When files get large, they probably have a lot of complicated logic in the middle of them. Typically that is a good time to look at them, refactor, and break them down.
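
As a rough illustration, here is a minimal sketch in Python that measures Python source using the standard `ast` module. The names `measure` and `flag_large` and the 500-line threshold are assumptions made up for the example, and the complexity number is only a crude approximation of cyclomatic complexity (it counts branching constructs and adds one).

```python
import ast

# Constructs that add a decision point; a deliberately incomplete list.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.BoolOp, ast.IfExp)


def measure(source):
    """Return simple size and complexity metrics for a piece of Python source."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    functions = [n for n in nodes if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))]
    branches = sum(isinstance(n, BRANCH_NODES) for n in nodes)
    return {
        "lines": len(source.splitlines()),
        "functions": len(functions),
        "complexity": branches + 1,  # rough approximation only
    }


def flag_large(files, max_lines=500):
    """`files` maps a file name to its source text; return the names over the limit."""
    return sorted(name for name, src in files.items() if measure(src)["lines"] > max_lines)
```

The numbers won’t find a bug for you, but a file that keeps showing up in `flag_large` is a reasonable candidate for the refactoring conversation described above.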

Compiler / build output

You should think of compiler warnings as a useful form of static analysis. Internally we set a policy many years ago that our products must compile without compiler warnings. It turns out that many of the compiler warnings are traceable to real problems in the field. At best, they mask real problems buried within your code. If you think that you can ignore compiler warnings, you’re assuming you know as much about the language as a compiler programmer. They put compiler warnings in place because they think about the code in terms of how the language is supposed to be used. If they give you a warning, it means they’re concerned that the code won’t operate properly. It’s best to pay attention to such warnings.
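
The same policy can be sketched for Python, whose compiler also emits warnings. This is a minimal illustration and not our internal build setup; `compile_warnings` and `build_is_clean` are hypothetical names, and the example assumes CPython 3.8 or later, where comparing against a literal with `is` produces a SyntaxWarning at compile time.

```python
import warnings


def compile_warnings(source, filename="<snippet>"):
    """Compile `source` without running it and return the compiler's warning messages."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        compile(source, filename, "exec")
    return [str(w.message) for w in caught]


def build_is_clean(source):
    """The policy: any compiler warning at all fails the build."""
    return not compile_warnings(source)
```

Under that assumption, a snippet containing `x is 1` fails the check while the same snippet written with `x == 1` passes, which matches the kind of real problem a compiler warning tends to point at.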

All of these types of static analysis can be valuable in improving your code and your development process, and even your developers as I discussed. I’ll go more in depth on these techniques in future posts.

[Disclaimer]
As a reminder, I work for Parasoft, a company that among other things makes static analysis tools. This is however my personal blog, and everything said here is my personal opinion and in no way the view or opinion of Parasoft or possibly anyone else at all.
[/Disclaimer]
