2013-07-30
Dynamic Program Analysis

In this post, I’ll go through dynamic program analysis. It complements static code analysis, which was covered in the previous post. If you develop software for a living, this is an area you must know!

What is it?

When a software program is analyzed during execution, it’s called dynamic program analysis. The program is fed with suitable inputs (such as test data, a sequence of API calls or user interactions), while its execution is monitored from several viewpoints. Interactive debugging and testing belong to this same area, but today I’ll only go through the automated means of analysis.

The following list covers the analysis categories I found during a brief round of market research:

  1. Tracing: When you monitor your program’s execution flow, you are tracing. Typically you do this by adding log statements at the right places in your program. This process is called instrumentation, and it can be done in various ways: A) by writing additional code into the source files by hand, B) by using a separate tool to do that for you, C) by letting a compiler add the statements during compilation, D) by using a separate tool that transforms the original executable into an instrumented one, or E) by letting a virtual machine modify the executable just before execution.

  2. Latencies: When a user searches for something in your application, are the results shown within an acceptable time? Augment the trace output with timing information, and you can find this out. The most common latency measurement targets are UI and database operations, API calls and isolated algorithms.

  3. Code Coverage: If you want to know how well your program is tested, you want to know your code’s coverage. Coverage may be inspected from various viewpoints: how many of the program’s functions have been called, whether those functions have been called with a fair set of values, and whether all the execution paths have been taken.

  4. Errors: Analysis falls into this category when it’s set to detect possible exceptions, crashes and memory corruption. I bet you want to catch that segfault in your tests, rather than getting angry emails from your customers?

  5. Security Flaws: How secure is your program, really? Feed attack input into it, and see how it behaves! This is what various security analysis tools do when they hammer your application with things like SQL injection strings.

  6. Memory: Memory requirements may sneak up on you if you don’t monitor your memory consumption. When using languages that don’t manage memory on your behalf, memory leaks are especially nasty to deal with. In the optimal case, you make memory monitoring part of your test suite, and get warnings when you exceed your predefined thresholds.

  7. CPU: You typically want to measure CPU consumption on devices that have limited battery power, or at least when the whole system starts to choke while running your application. Also, if your non-optimized code is running in the core of a vast data center, you are not doing the environment any favors. CPU consumption is typically measured either by sampling the execution, or by listening to certain runtime events. After this, timing is calculated for various parts of your code based on the gathered samples. Tracing may also be utilized for the same purpose, though the program execution might get much slower and distort the results.

  8. I/O: Forgetting to close file and socket references in your long-running program is usually a recipe for failure. As with CPU and memory, it’s best to ensure proper I/O usage before your code is shipped.

  9. Parallelization: Modern systems make your programs run faster by parallelizing the program execution. In case your program isn’t optimized for this, you may want to use tools that analyze the execution and propose how you could change the code.
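
To make the first two categories (and a bit of the sixth) more concrete, here is a minimal sketch of hand-written instrumentation, i.e. option A from the tracing item, in Python. It is only an illustration under my own assumptions: the names `TRACE`, `traced`, `search` and `peak_memory_bytes` are made up for this example, and real tools do far more than this. The decorator logs function entry and exit and attaches a latency to the exit record, while the helper uses the standard library’s `tracemalloc` module to report peak memory allocated during a call.

```python
import functools
import time
import tracemalloc

# Hypothetical in-memory trace sink; a real setup would more likely write to a log.
TRACE = []


def traced(func):
    """Record entry, exit and latency (in seconds) for each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        TRACE.append(("enter", func.__name__, None))
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # The exit record is written even if func raises an exception.
            elapsed = time.perf_counter() - start
            TRACE.append(("exit", func.__name__, elapsed))
    return wrapper


@traced
def search(items, needle):
    """Toy stand-in for the 'user searches for something' case."""
    return [item for item in items if needle in item]


def peak_memory_bytes(func, *args, **kwargs):
    """Run func and return (result, peak bytes allocated by Python during the call)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    hits, peak = peak_memory_bytes(search, ["apple", "banana", "grape"], "ap")
    print("hits:", hits)
    print("peak bytes:", peak)
    for event, name, elapsed in TRACE:
        print(event, name, elapsed)
```

A test suite could compare the recorded latency or the peak value against a predefined threshold and fail when it is exceeded. Keep in mind that the instrumentation itself adds overhead, so the numbers are only indicative.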

Tools

There seem to be fewer tools for dynamic than for static analysis, although there are still plenty. Again, several tools combine functionalities from different categories, and some even add static analysis features into the mix.

Tracing and latency measurements seem to be less well productized, probably due to their ad-hoc nature. The Software and Programmer Efficiency Research Group lists various tools to visualize both traces and latencies. Another tool called jtracert takes a documentation-oriented approach to trace visualization by creating UML sequence diagrams. The picture below shows a tool called Tuning Fork on the left, and jtracert on the right:

Trace visualization

JProfiler is an all-rounder for various analysis tasks for Java. It monitors CPU, memory and I/O usage and shows the results nicely. In case you lean towards .NET, you might want to give dotTrace a try. For C/C++, Insure++ seems like an interesting choice, and if you prefer open source, Valgrind is certainly worth a look.

dotCover and Clover can help you measure your code coverage. Both of them belong to a large family of associated products, though in different ecosystems (JetBrains vs. Atlassian).

An example of security tooling is Parasoft’s SOA solution. As with most other products, this one also bundles several functionalities into the same tool. Parallelization is well illustrated by the Pareon product, which promises to optimize your code for multicore architectures.

Conclusion

As with static analyzers, you should use some of the mentioned tools to your advantage. They typically focus on certain programming environments, which will affect your available choices. JVM, .NET and C/C++ are quite well represented, but many others have much less to offer, at least on the commercial tooling front.