300x250 AD TOP

Search This Blog

Paling Dilihat

Powered by Blogger.

Sunday, March 6, 2016

Advanced Debugging

One of the most important steps in writing your code is making sure it can withstand any designed input, be it data or actions, so other than unit testing your code, you can also protect your code from other developers or even your own misuse, this is called robust programming and while it can help you to reduce the amount of bugs, almost every developer spends a lot of time in front of the computer when a bug is discovered.

Debugging is in common for all programmers in all languages, sometimes the easiest way to solve a problem is to attach a debugger and see a use case the code did not cover and fix it, the problem with this method is that unless you wrote everything you're not really sure if it will solve the problem, a bug is not just an error in the program, it could also be mishandling the state or data the software is supposed to process or something that was changed previously which the current code depends on.

Most of us encountered that software or API that if you make the slightest mistake, pass an invalid parameter, null or leave a field empty, the thing crashes like a racing car going 300mph straight into a wall. This can be frustrating for both developers and users.
Writing code that executes correctly, responds correctly and in a timely manner, while it might be more time consuming and requires more thinking and planning will no doubly benefit you when you spend the minimum effort of later debugging it.

To improve your code is relatively easy, when writing tests, also test for invalid parameters, make sure your code can fail gracefully and display an appropriate and helpful error rather than the famous Object reference not set to an instance of an object for example.

When you do find the need to debug your code, your debugging methodology, be it tests, asserts, logging and validations,  will help you to pinpoint the problem in a minimum amount of time/effort, not only that, as long as you have good testing practices you can fix bugs and refactor with the relative confidence the tests provide without fearing the house of cards that software without tests can suffer from.


Provide Meaningful Errors

A good error message will not only tell you that something failed to perform but also what and where, Usually you can figure out the problem of null reference exception by looking at the inner exception and the line number, but sometimes this requires to get a memory dump or even having a debugger attached.

Why not make life easier? For API that is accessible, make sure the data you get is the data you can operate on, otherwise throw a meaningful exception if possible and the current object state.


Logging

One of the best ways of enabling the developers to debug a production environment is via logging. If you log the flow and data, you can go over the code with that information in mind and see where things start to go wrong.


What to Log

In short, everything that will be needed to trace the code at the right logging level so it won't interfere with the normal operation of the application and its performance. You are allowed to make performance degrading logging in debugging environment and if done right, will not be executed when the particular level is turned off.

Examples:

Debug Level

bad:
entered into x function

good:
entered into x function with y parameter

better:
entered into x function with y parameter, object state is z.

Error Level

bad:
an error occurred

good:
an error occurred while executing x function with y parameter

better:
an error occurred while executing x function with y parameter, stack trace and current object state z.

Another thing to consider is that if you have multiple configurations, it's important to log the configuration on application start, if you have multiple versions or plan to deploy upgrades/updates in the future is to log also the file versions (could be done with FileVersionInfo).


Logging Levels

To separate the wheat from the chaff you should divide your logging into a logging levels. What it means is that you can get a rough look at the fatal errors, which will tell you there is a problem, then you can add the errors to see what failed and where, add warnings to see if recoverable errors occurred, if you need to get more information you can go into the info level and see what users are doing in the system and if that is not enough, look into the trace logs where you log decisions or even debug to see the actual data and functions that are being executed.

To surmise:

- Debug - log information that inside functions, information that will help you to understand what the function executed and on what data, for example the data that led to a condition being executed or not.

- Trace  - log information that describes what the application is doing behind the scenes, what its using to do it and how, for example, a file is being written to disk.

- Info - log information that the application is doing that the user sees, for example, user is searching for a keyword.

- Warn - log recoverable errors, for example, a much need and flagged as existing cache file is missing, the data can be retrieved again, so it's not an error, but that data should have been cached already.

- Error - log unrecoverable errors, processes the user can retry and perhaps will succeed.

- Fatal - log unrecoverable errors which no retry will help and the application must terminate.

The difference between Debug and Trace in .NET System.Diagnostics namespace is that Debug is compiled out in Release builds, so you can include absurd debugging information there, such as whole request headers.


Session vs State vs Method Calls

When logging data for multi user application, such as web application it's important to know which message belongs to which session. Some web applications maintain state across requests by using a session object, so a user can make state modifications and then expects to see results on their next request.

When logging object state, if the object has multiple instances, it's important to log the object id so we can differentiate between them, on top of that, if you're logging changes in state to objects, you can track through logs what changed these states and find the source of your problem.

When logging method executions, include the arguments passed, any warnings that object states are not "right" and output.


Exceptions and Stack Trace

Exceptions, while not debug logging per-say, are a great tool to provide meaningful errors to other developers, exceptions are just regular classes which inherit from Exception that can be thrown, they can include data, messages and the inner exception that occurred.

It's very important to maintain the inner exceptions as sometimes the origin of the error can be traced through it.

If you're using try..catch to just log an error and propagate it, use throw, like so:

try
{
    ...
}
catch (Exception ex)
{
    ...
    throw;
}

or if you don't care about the stack trace, you should at least preserve the inner exception like so:

try
{
     ...
}
catch (Exception ex)
{
     ...
     throw new Exception ("more data", ex)
}

Sometimes when debugging its very beneficial to preserve or get the stack trace as it can show you who called the method you're having trouble with. The stack trace can be easily retrieved with the Exception.StackTrace property or with the StackTrace class if you don't have an exception.

If you’re not handling exceptions, you should at least handle application errors and get a stack dump, for that you can attach a global error handler to AppDomain.UnhandledExceptions for general unhandled exceptions,  Application.DispatcherUnhandledException for WPF UI thread exceptions and Application_Error in a web application.


Security

While logging is a great tool to debug it also pose severe security risks, say your bank's software provider is logging usernames and passwords so they can debug the user log in process. say a hacker gets their hands on that log file, the debug level was left on by mistake and they know your username/password/token. Game over.

Be very careful with what you log and always provide a way to turn it off in case it's sensitive information.


Storage

To be useful the log messages needs to be stored for later on some medium.

- Event Log - Microsoft invested a lot in making the event log powerful, the performance of writing to the event log is very good, if you need your logs to be available for later processing by Event Log compatible tools, this is the way to go. In .NET you can access the event log directly via EventLog class, but I do recommend using a logging library to loose the coupling between your application and its logging method.

- Text/XML files - all logging libraries support files logging, formatting messages, timestamps etc', most of them even support log file rotations, which means that a file size is either limited by a number of messages or by bytes written, once a file is over the limit, its renamed and a new file is created, you can even set it up so that old files will be deleted.

- Database - some logging libraries support database logs, you can use SQL servers to store logs and later run statistics on them. in any case, you should not store them in the same database as your application as this will affect performance.

As most things in life, usually the simpler answer is the best one.

Whichever logging solution you select, make sure it's working with your requirements, if, for example, you'll be logging messages from a multithreaded application, make sure its logging whole messages and not mixing them with one another (bad) or crashing (worse) - make sure the library is threadsafe.


Analysis

While logs do contain a wealth of information about your application, sifting through 100MB (good case) or 1TB (bad case) of log records to find a bug or a problem is a lot of work, for that reason there are log analysis tools, Splunk, Logstash, Sumo and SawMill to name a few.


Helping the Debugger

Sometimes writing code that can help the debugger help you is worth the effort, overriding ToString can give you extra information when you dump the object to logs or when you look at a list of objects in the debugger, on the other hand, [DebuggerDisplay] attribute only affects the debugger and not the application, perhaps you have something else planned for ToString.

Sometimes you want the application to behave a bit differently when a debugger is attached to ease the pain of debugging, you can do that with Debugger.IsAttached, if you have a need to start a debugging session you can do that with Debugger.Launch and then break with Debugger.Break().

One example of using it is to debug the startup event for web applications(Application_OnStart) which is a bit harder to catch on IIS, as a side note, another trick is to attach the debugger and save web.config, it will recycle the application and you'll get your breakpoint.

Visual Studio debugger is very powerful, you can watch, you can drill into and you can break, conditionally or unconditionally.

.NET framework enables a few debugging mechanisms through attributes, some of them are only affected by Just My Code feature in the debugger, in short, Just My Code is a debugger feature used to save time so only the developer's code will be shown in the debugger, it hides call stacks and prevents the debugger from stepping into code not part of the developer's code.

DebuggerDisplayAttribute - we've seen this, it helps you to see what your object contains without actually drilling down into each one, this is especially helpful when debugging lists and you're trying to see their contents without drilling into each and one of them.

DebuggerBrowsableAttribute - used to control visibility of a property or field in the debugger's watch, this is useful for hiding certain items in your objects.

DebuggerHiddenAttribute - used to tell the debugger not to step into the marked method or property, no breakpoints will be hit, this is especially helpful when having utility methods that you never want to step into and then out, saving you precious debugging time, the call stack will not show anything, this attribute will also hide code from Code Coverage.

DebuggerNonUserCodeAttribute - used to hide classes/structs/methods and properties from the debugger, they will not be shown in the call stack, nor will the debugger step into them, this attribute will also hide code from Code Coverage.

DebuggerStepperBoundaryAttribute - used to continue running after user steps through a DebuggerNonUserCodeAttribute.

DebuggerStepThroughAttribute - used on class, struct, method, prevents the debugger from stepping into this method, a breakpoint will be hit if Just My Code is disabled.

DebuggerTypeProxyAttribute - used on assembly, class and struct, allowes the developer to design their own debugger view for types. this is especially useful for complex classes where having certain information in the debugger window is useful, this allows more flexibility than DebuggerDisplayAttribute.

DebuggerVisualizerAttribute - used on assembly, class and struct, provides a visualizer for a certain type, example usage would be to display the image for an image type instead of just the width/height/image bytes.


Symbols

Symbols are represented on Windows in PDB files, which are one of the greatest help to debugging on windows, on .NET it tells you source filenames and lines, which can help you to locate the source of the error/exception.


Logging Libraries

Common.Logging - provides an abstraction logging library, which on top you can get specific loggers for log4net and NLog for example.

Log4net - this is actually one of the most famous ones, there are many flavors of that library for many languages, log4j, log4javascript and log4cpp.

NLog - log4net and NLog provide similar functionality, the major difference is that NLog is more active.

Enterprise Library Logging and Semantic Logging - Part of Microsoft Patterns & Practices team, Microsoft's Logging Implementation.

ObjectGuy Logging Framework - One of the more popular and lightweight logging framework.

Here's a small comparison between some of them.


Trace Listeners

While full blown logging framework provide good and elaborate ways to log messages in your application, sometimes Debug and Trace messages are enough but you do want to capture these messages other than inside the debugger, this is where Trace Listeners come in, there are Event Log, File, Console, Delimited, XML and the Default, which is writing to the debugger log.

If you want to implement multiple logging levels in combination with the built-in Debug/Trace classes, you can use TraceSwitchs and if you want the advanced version of Trace, you should use TraceSource, which support multiple levels and traces.


Protecting against cryptic errors

Cryptic errors, Null References, Exceptions which tell you nothing, "input error", "field validation error". I hate them, so should you, they make you dig through the call stack, attach a debugger, read states, add watches, look at flows, etc' etc', this is not fun programming, let's see what we can do about them.

All code should verify its input, it should make sure that the data is ready to operate on, if you're worried about performance, you can code the validations so they can be compiled out, conditionals, preprocessor, choose.

The simplest example that comes to mind is a function that divides a number. throw an exception when the denominator is zero for example.

But that's too simple, why is the denominator zero? is it a configuration issue? is it coming from a different operation? is it a result of a malfunction? That's what we would really like to know.

Another thing to keep in mind is what if the denominator is not coming from an argument but rather a state, how did the state get there? who changed it? why is an invalid value allowed? why didn't we throw an error when the denominator was set to 0 in the first place?

When developing methods, classes, properties, one should always ask themselves "what if" questions and try to protect the objects from invalid state by throwing the right error as early as possible.

Choosing between Asserts and Exceptions should be straight forward, if the errors should be checked and handled on production, use Exceptions, if the errors are mostly development, use Debug.Assert or Trace.Assert. When compiling a release version, only Trace remains, this is right for all Trace vs Debug methods.


Why I shouldn’t use Asserts?

I’ll play the devil advocate here, asserts are, well, cool development helpers, but they – in essence, create a different code that execute for debug/release builds, some properties can cause an internal object state change and if observed, can change the outcome. In essence, this is a different behavior in two builds that are supposed to be the same code.

Never use asserts to validate user input.

Never use asserts to validate environment issues, such as memory allocation, network and database and file system issues, these are to be expected and handled with exceptions.


Debugging Methods

Debugging production is not the same as debugging a development build, which has all the symbols and debug messages, debugging web applications is not the same as desktop applications and so on.

Debugging development is relatively easy, you have all the symbols, you know which line an error occurred, you can attach a debugger and drill down through the call stack, examine variables and see the current state of almost everything related.

But what do we do on production?

That's why exceptions, call stack and logs are important. but without context they might mean something you didn't think about, so always log file version, configurations, environment settings, anything which might affect your application.

You should also try to divide you logs to meaningful modules, perhaps even classes if you enable tracing, so you'll know how you got where you did. If your object can be called from multiple paths, perhaps even including where the call came from. I've been chasing a bug more than once, thinking it came from one place but actually coming from a different place.

Another thing you can include in your logs is stack trace, this is especially easy when doing AOP where you can hook an interceptor on all methods and see who called what and even time it.

Other than knowing what to write to logs, you should know how to read these logs, otherwise it's just wasted disk space. make sure your log reader can filter, search and if its a web application, trace all the calls for the same session id.

If you find errors or warnings, don't just hang on the ones you see, filter the entire log for similar issues, if you wrote your log messages correctly, you'll most likely find other warnings or errors which might seem unrelated at first but could have a relationship.


Tools

Debugging tools are very important, but use the appropriate tool for the right debugging job, dump / memory tracing / performance analysis, web or desktop.

First and foremost is the Visual Studio debugger, which has an extensive set of debugging tools, with the most basic ones such as breakpoints and watches, and continuing with call stack, threads, tasks and immediate. It can open a dump file and load symbols, with this set of tools, debugging is easy.

Next in line is the windbg which can be used anywhere with just simple .net helpers, sos.dll, psscor2.dll and mscordacwks.dll.

Getting a memory dump is just a matter of right clicking the process in Task Manager and selecting Create Dump File.

If you're having memory issues, the simplest way would be to get a dump, open in Visual Studio and look at Heap View or use 3rd party tools like ANTS Memory Profiler or dotTrace.

The Visual Studio Performance Profiler can be run on both the development machine and as a standalone on production (but note it might slow down the overall performance), even on web applications, you should always test performance on your development machine or test machine, production should be the last resort and on 99% of cases is not needed.

Debugging multithreaded applications is easy, you can freeze/thaw threads, you can switch threads and on a higher level you can do the same with Tasks.


“An ounce of prevention is worth a pound of cure.” ― Benjamin Franklin

The idea of asserts is that you can choose to run these checks for every method, if an invalid value is provided, you have a few choices, exceptions, asserts and Contracts can help you to tell the developer if they made a mistake, like uninitialized properties, invalid variables passed into your methods, this is called Defensive Programming, the basic premise is that you write code as if everyone is trying to get you, paranoid level. In my opinion, when looking at methods that are not 100% clear what they are doing and which parameters are acceptable ones, contracts and asserts can really make things clear, if you see a line such as assert(x != null), you’re definitely going to need x.

Defensive or Robust Programming can help reduce bugs, security vulnerabilities and wrong outcome, it can help you to avoid hours of debugging by catching problems when they occur, think of it as a sanity check for every method.

Exceptions – Throw exceptions when things are not designed to work the way they do, for custom exceptions if possible, include the culprit so further analysis is possible, part of your unit tests is to check that bad conditions are indeed throwing exceptions.

Debug.Asserts – Assert as much as needed, check all parameters that can crash or make a method misbehave, check assumptions in the middle of the methods, always include an assert message that explains the error, it should be enough to understand the problem and how to fix it as asserts are helpers for developers, if you log, include the state that caused the exception, if you have a struct that defines the bad behavior, its relatively simple to point to the data that caused the assert.

Trace.Asserts – If an application state can cause damage to user data, its important to leave these checks even in production code, for this you can use Trace.Assert as it doesn’t get compiled out in release builds, the biggest difference between an unhandled exception and Trace.Assert is that the user can elect to ignore the assert while the unhandled exception will terminate the application.


Logging Performance

You might encounter performance issues in your application with regard to logging if you log too much, after all, each log record requires execution of code other than your application's logic.  When debugging you can turn on all the logging levels but turn off the debug and trace levels in production.

Another thing to remember is that if your log method looks anything like so:

void Log(string message, params object[] args);
Log("writing to file", objects.Where(i=>i > 1).ToArray());

it will go into the call and execute the Where(i=>i > 1).ToArray() even if the log is not written.


You can fix this by having your logging methods accept a lambda and have a more complex Log method, similar to this:

Log("writing to file", ()=>{return objects.Where(i=>i > 1).ToArray()});

You'll end up with code that doesn't execute if the logging level is not enabled and you don't have to check the log level every time you log a message.

If your code is using Debug and Trace built-in functionality (Write / Assert), Debug is compiled out of the code in release builds, if you have special logs for debug builds you can mark your code with Conditionals and preprocessor.

What are Conditionals? Let's say we define the symbol LIVE:


Project Properties / Build / Conditional compilation symbols

Essentially this:

[Conditional("LIVE")]
void sendDetailsToServer()
{
}

will be compiled in. but if you remove the symbol from the list, the compiler will not compile sendDetailsToServer and all the calling code.

Another option is to use the preprocessor, it is more primitive, surrounding any code in #if, #endif could be a bit exhaustive since you also need to surround the calling code as well.


Performance Logging

Debugging issues is not only bugs, it can also be performance issues, while performance analysis can help you to pinpoint the location of the bottlenecks in your application, it's important to give the outside world some performance metrics of your general operations.

If you look at the Windows Performance Monitor, you will see many such performance metrics, they are called Performance Counters, some of the values there are indicated as a total and some as per-interval. .NET provides access to the Performance Monitor through PerformanceCounter class, on the plus side of working with Performance Counters, anyone can add them and measure the application's performance, an administrator can put an alert on a high value and you get the added benefit of making your application system administrator friendly.

In the end, they can use your metrics with your guidance to detect problems or potential problems before they become a catastrophe.

Another thing that can help you to look for problems in the log is including how much time a certain operation took, you can use StopWatch to time your methods.


Contracts

Code Contracts allows developers to do static validations that methods are indeed getting valid parameters and producing a valid output, code contracts are not new, Microsoft has recently open sourced it so you can take a look.

To use code contracts you need to download a package that hooks to the compilation process and does static analysis of the code and its contracts.

Eventually you will get a list of contract violations in the TODO pane which will indicate potential problems.

The static code analysis has some problems though, it's very slow and sometimes it can give false positives.

If you choose not to use the static code analysis, you can still benefit from it by hooking the Contract.ContractFailed event which will give you an indication there is a problem.

I can't really recommend it, it looks cool but I don't have experience with any large scale project using it, so feel free to explore the Contract class, faq and documentation.


Summary

Debugging is part of every developer's life, while it is important and useful way to fix bugs it's more important to avoid bugs as much as possible, unit tests are a great development tool, they allow you to easily fix and refactor your code without being overly cautious. but as much as we would like to find a magic bullet, there isn't one really, so my moto is prepare for the worst, log everything that could help you diagnose a problem or indicate a suspicious state.

In the end, no article is going to make you a good programmer, this is only an introduction to a nice set of tools to get started, test, log, debug.
Tags: , , , , , , , , , , , , , , ,

0 comments:

Post a Comment