Sunday, March 6, 2016

Advanced Debugging

One of the most important steps in writing code is making sure it can withstand any input it was designed for, be it data or actions. Beyond unit testing, you can also protect your code from misuse by other developers, or even by yourself. This is called robust programming, and while it helps reduce the number of bugs, almost every developer still spends a lot of time in front of the computer when a bug is discovered.

Debugging is common to all programmers in all languages. Sometimes the easiest way to solve a problem is to attach a debugger, find the use case the code did not cover and fix it. The problem with this method is that unless you wrote everything yourself, you can't be sure the fix really solves the problem. A bug is not just an error in the program; it can also be mishandling of the state or data the software is supposed to process, or an earlier change to something the current code depends on.

Most of us have encountered that software or API where the slightest mistake (an invalid parameter, a null, an empty field) makes the thing crash like a racing car going 300mph straight into a wall. This can be frustrating for both developers and users.
Writing code that executes correctly and responds correctly and in a timely manner may take more time, thought and planning, but it will no doubt benefit you later, when debugging it takes minimal effort.

Improving your code is relatively easy: when writing tests, also test for invalid parameters, and make sure your code fails gracefully and displays an appropriate, helpful error rather than, for example, the famous Object reference not set to an instance of an object.

When you do need to debug your code, your debugging methodology (tests, asserts, logging and validations) will help you pinpoint the problem with minimal time and effort. Moreover, as long as you have good testing practices, you can fix bugs and refactor with the relative confidence the tests provide, without fearing the house of cards that software without tests can become.


Provide Meaningful Errors

A good error message tells you not only that something failed but also what failed and where. You can usually figure out a null reference exception by looking at the inner exception and the line number, but sometimes that requires a memory dump or even an attached debugger.

Why not make life easier? For any accessible API, make sure the data you get is data you can operate on; otherwise throw a meaningful exception that includes, if possible, the current object state.
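
A minimal sketch of that idea, assuming a hypothetical Account class (the type and its members are invented for illustration): input is validated up front, and the exception says what failed, for which object, and what the state was.

```csharp
using System;

// Hypothetical type used only for illustration.
public class Account
{
    public string Id { get; }
    public decimal Balance { get; private set; }

    public Account(string id, decimal balance)
    {
        // Fail early with the parameter name instead of a later NullReferenceException.
        if (id == null) throw new ArgumentNullException(nameof(id));
        Id = id;
        Balance = balance;
    }

    public void Withdraw(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount), amount,
                "Withdrawal amount must be positive.");

        if (amount > Balance)
            // Include the current object state so the message alone explains the failure.
            throw new InvalidOperationException(
                $"Cannot withdraw {amount} from account '{Id}': balance is {Balance}.");

        Balance -= amount;
    }
}
```

A caller that logs the exception message now sees the account id, the requested amount and the balance without attaching a debugger.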


Logging

One of the best ways of enabling the developers to debug a production environment is via logging. If you log the flow and data, you can go over the code with that information in mind and see where things start to go wrong.


What to Log

In short, log everything needed to trace the code, at the right logging level, so logging won't interfere with the normal operation and performance of the application. Performance-degrading logging is acceptable in a debugging environment and, if done right, will not execute when that particular level is turned off.

Examples:

Debug Level

bad:
entered into x function

good:
entered into x function with y parameter

better:
entered into x function with y parameter, object state is z.

Error Level

bad:
an error occurred

good:
an error occurred while executing x function with y parameter

better:
an error occurred while executing x function with y parameter, stack trace and current object state z.

Another thing to consider: if you have multiple configurations, log the configuration on application start. If you have multiple versions or plan to deploy upgrades/updates in the future, also log the file versions (this can be done with FileVersionInfo).
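
As a rough sketch of such a startup line, assuming a hypothetical StartupInfo helper (the class name and the selection of fields are my own choice), FileVersionInfo can be combined with a few environment values:

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

// Hypothetical helper; log its output once when the application starts.
public static class StartupInfo
{
    public static string Describe()
    {
        Assembly assembly = typeof(StartupInfo).Assembly;

        // Assembly.Location can be empty (for example in single-file deployments),
        // so fall back to "unknown" instead of failing at startup.
        string location = assembly.Location;
        string fileVersion = string.IsNullOrEmpty(location)
            ? "unknown"
            : FileVersionInfo.GetVersionInfo(location).FileVersion;

        return $"assembly={assembly.GetName().Name}, " +
               $"assemblyVersion={assembly.GetName().Version}, " +
               $"fileVersion={fileVersion}, " +
               $"os={Environment.OSVersion}, machine={Environment.MachineName}";
    }
}
```

Something like Trace.TraceInformation(StartupInfo.Describe()) at the top of the entry point is enough to get this into the log.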


Logging Levels

To separate the wheat from the chaff, divide your logging into logging levels. You can start with a rough look at the fatal errors, which tell you there is a problem, then add the errors to see what failed and where, and add warnings to see whether recoverable errors occurred. If you need more information, go to the info level and see what users are doing in the system. If that is not enough, look into the trace logs, where you log decisions, or even debug, to see the actual data and functions being executed.

To summarize:

- Debug - log information from inside functions that will help you understand what the function executed and on what data, for example the data that led to a condition being executed or not.

- Trace - log information that describes what the application is doing behind the scenes, what it's using to do it and how, for example, a file being written to disk.

- Info - log what the application is doing that the user sees, for example, the user is searching for a keyword.

- Warn - log recoverable errors, for example, a much-needed cache file that is flagged as existing is missing. The data can be retrieved again, so it's not an error, but it should have been cached already.

- Error - log errors the application could not recover from on its own, in processes the user can retry and perhaps succeed.

- Fatal - log unrecoverable errors which no retry will help and the application must terminate.

The difference between Debug and Trace in the .NET System.Diagnostics namespace is that Debug calls are compiled out of Release builds, so you can include absurd amounts of debugging information there, such as whole request headers.
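
A small illustration of that difference, using a hypothetical HandleRequest method (the method and its parameters are made up; only the Debug and Trace calls are real API):

```csharp
using System.Diagnostics;

public static class DebugVersusTrace
{
    public static void HandleRequest(string path, string headers)
    {
        // Debug.* methods are marked [Conditional("DEBUG")]: in a Release build
        // this call, including the string concatenation, is removed by the compiler.
        Debug.WriteLine("Request headers for " + path + ": " + headers);

        // Trace.* methods are marked [Conditional("TRACE")]. The TRACE symbol is
        // normally defined by default in both Debug and Release configurations,
        // so this call usually stays.
        Trace.WriteLine("Handling request for " + path);
    }
}
```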


Session vs State vs Method Calls

When logging data for a multi-user application, such as a web application, it's important to know which message belongs to which session. Some web applications maintain state across requests by using a session object, so a user can make state modifications and then expect to see the results on their next request.

When logging object state, if the object has multiple instances, it's important to log the object id so we can differentiate between them. On top of that, if you log changes to object state, you can track through the logs what changed it and find the source of your problem.

When logging method executions, include the arguments passed, the output, and any warnings that object states are not "right".
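
To make the three points above concrete, here is a sketch with a hypothetical Order class (all names are assumptions). It tags each log line with the session, the instance id, the method with its argument, and the state change:

```csharp
using System;
using System.Threading;

// Hypothetical domain object used only for illustration.
public class Order
{
    private static int nextInstanceId;

    // Unique per-instance id so log lines from different instances can be told apart.
    public int InstanceId { get; } = Interlocked.Increment(ref nextInstanceId);
    public string Status { get; private set; } = "New";

    // Returns the log line describing the state change; a real application
    // would hand this to its logging library instead of returning it.
    public string ChangeStatus(string sessionId, string newStatus)
    {
        string oldStatus = Status;
        Status = newStatus;
        return $"[session={sessionId}] Order#{InstanceId}.ChangeStatus(newStatus={newStatus}): {oldStatus} -> {newStatus}";
    }
}
```

Filtering the log by the session value or by the Order# id then reconstructs what happened to one user or one object.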


Exceptions and Stack Trace

Exceptions, while not debug logging per se, are a great tool for providing meaningful errors to other developers. Exceptions are just regular classes that inherit from Exception and can be thrown; they can include data, messages and the inner exception that occurred.

It's very important to maintain the inner exceptions as sometimes the origin of the error can be traced through it.

If you're using try..catch to just log an error and propagate it, use throw, like so:

try
{
    ...
}
catch (Exception ex)
{
    ...
    throw;
}

or if you don't care about the stack trace, you should at least preserve the inner exception like so:

try
{
     ...
}
catch (Exception ex)
{
     ...
     throw new Exception("more data", ex);
}

Sometimes when debugging it's very beneficial to preserve or capture the stack trace, as it can show you who called the method you're having trouble with. The stack trace is easily retrieved from the Exception.StackTrace property, or with the StackTrace class if you don't have an exception.

If you're not handling exceptions, you should at least handle application errors and get a stack dump. For that you can attach a global error handler to AppDomain.UnhandledException for general unhandled exceptions, Application.DispatcherUnhandledException for WPF UI thread exceptions and Application_Error in a web application.
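
A minimal sketch of the general-purpose handler and of the StackTrace class. The CrashLogging class and its method names are hypothetical, and writing through Trace.TraceError is just one possible sink:

```csharp
using System;
using System.Diagnostics;

public static class CrashLogging
{
    // Call once at startup, before anything else can throw.
    public static void Install()
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
        {
            // ExceptionObject is typed as object; ToString() on an Exception
            // includes the message, inner exceptions and stack traces.
            Trace.TraceError("Unhandled exception (terminating={0}): {1}",
                e.IsTerminating, e.ExceptionObject);
        };
    }

    // Captures the current call stack when there is no exception to ask.
    public static string WhoCalledMe()
    {
        // Skip this method's own frame; no file/line info is requested.
        return new StackTrace(1, false).ToString();
    }
}
```

The handler only observes the failure; it does not prevent the process from terminating.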


Security

While logging is a great debugging tool, it also poses severe security risks. Say your bank's software provider logs usernames and passwords so they can debug the user login process. Say the debug level was left on by mistake and a hacker gets their hands on that log file: now they know your username/password/token. Game over.

Be very careful with what you log and always provide a way to turn it off in case it's sensitive information.
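
One way to reduce that risk is to mask sensitive values before they reach the log. This is only a sketch under assumptions: LogRedactor is a hypothetical helper and the list of sensitive key names is invented, so adapt it to your own data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LogRedactor
{
    // Assumed list of sensitive field names; extend it for your own data.
    private static readonly HashSet<string> SensitiveKeys =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "password", "token", "secret" };

    // Formats key/value pairs for logging, masking the values of sensitive keys.
    public static string Format(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join(", ", fields.Select(f =>
            f.Key + "=" + (SensitiveKeys.Contains(f.Key) ? "***" : f.Value)));
    }
}
```

Masking by key name is a safety net, not a guarantee; the safest sensitive value is the one that is never passed to the logger at all.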


Storage

To be useful, log messages need to be stored on some medium for later.

- Event Log - Microsoft invested a lot in making the event log powerful, and the performance of writing to it is very good. If you need your logs to be available for later processing by Event Log compatible tools, this is the way to go. In .NET you can access the event log directly via the EventLog class, but I recommend using a logging library to loosen the coupling between your application and its logging method.

- Text/XML files - all logging libraries support logging to files, formatting messages, timestamps, etc. Most of them also support log file rotation: file size is limited either by number of messages or by bytes written, and once a file is over the limit, it's renamed and a new file is created. You can even set it up so that old files are deleted.

- Database - some logging libraries support database logs; you can use SQL servers to store logs and later run statistics on them. In any case, you should not store them in the same database as your application, as this will affect performance.

As with most things in life, the simpler answer is usually the best one.

Whichever logging solution you select, make sure it meets your requirements. If, for example, you'll be logging messages from a multithreaded application, make sure the library is thread-safe: it should log whole messages rather than mixing them with one another (bad) or crashing (worse).
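
To show what "logging whole messages" means in practice, here is a deliberately simple sketch of a hypothetical SynchronizedLogger. Real logging libraries do this (and more) for you; the class is only meant to illustrate the requirement.

```csharp
using System;
using System.IO;

public sealed class SynchronizedLogger
{
    private readonly TextWriter writer;
    private readonly object gate = new object();

    public SynchronizedLogger(TextWriter writer)
    {
        if (writer == null) throw new ArgumentNullException(nameof(writer));
        this.writer = writer;
    }

    public void Write(string message)
    {
        // Build the full line first, then write it under a lock so lines from
        // different threads are never interleaved.
        string line = DateTime.UtcNow.ToString("o") + " " + message;
        lock (gate)
        {
            writer.WriteLine(line);
        }
    }
}
```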


Analysis

While logs contain a wealth of information about your application, sifting through 100MB (good case) or 1TB (bad case) of log records to find a bug or a problem is a lot of work. For that reason there are log analysis tools: Splunk, Logstash, Sumo and SawMill, to name a few.


Helping the Debugger

Sometimes it's worth the effort to write code that helps the debugger help you. Overriding ToString gives you extra information when you dump the object to logs or look at a list of objects in the debugger. The [DebuggerDisplay] attribute, on the other hand, affects only the debugger and not the application, which is useful if you have something else planned for ToString.

Sometimes you want the application to behave a bit differently when a debugger is attached, to ease the pain of debugging; you can check for that with Debugger.IsAttached. If you need to start a debugging session, you can do that with Debugger.Launch() and then break with Debugger.Break().

One example is debugging the startup event of web applications (Application_OnStart), which is a bit harder to catch on IIS. As a side note, another trick is to attach the debugger and save web.config; this recycles the application and you'll hit your breakpoint.
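
A sketch of the Launch/Break combination, assuming a hypothetical waitForDebugger flag that you would read from configuration or an environment variable (the StartupDebugging class is invented as well):

```csharp
using System.Diagnostics;

public static class StartupDebugging
{
    // Call at the very start of the code you cannot otherwise reach in time.
    public static void BreakIfRequested(bool waitForDebugger)
    {
        if (!waitForDebugger) return;

        if (!Debugger.IsAttached)
        {
            // Asks the system to start a debugger and attach it to this process.
            Debugger.Launch();
        }

        if (Debugger.IsAttached)
        {
            // Behaves like a breakpoint on this line.
            Debugger.Break();
        }
    }
}
```

Guarding the call behind a flag keeps the prompt from ever appearing for ordinary users.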

The Visual Studio debugger is very powerful: you can watch, drill into and break, conditionally or unconditionally.

The .NET Framework enables a few debugging mechanisms through attributes; some of them only take effect when the debugger's Just My Code feature is enabled. In short, Just My Code saves time by showing only the developer's own code in the debugger: it hides other code in call stacks and prevents the debugger from stepping into it.

DebuggerDisplayAttribute - we've seen this, it helps you see what your object contains without actually drilling down into it. This is especially helpful when debugging lists and you're trying to see their contents without drilling into each and every item.

DebuggerBrowsableAttribute - used to control visibility of a property or field in the debugger's watch, this is useful for hiding certain items in your objects.

DebuggerHiddenAttribute - used to tell the debugger not to step into the marked method or property, no breakpoints will be hit, this is especially helpful when having utility methods that you never want to step into and then out, saving you precious debugging time, the call stack will not show anything, this attribute will also hide code from Code Coverage.

DebuggerNonUserCodeAttribute - used to hide classes/structs/methods and properties from the debugger, they will not be shown in the call stack, nor will the debugger step into them, this attribute will also hide code from Code Coverage.

DebuggerStepperBoundaryAttribute - used to continue running after user steps through a DebuggerNonUserCodeAttribute.

DebuggerStepThroughAttribute - used on class, struct, method, prevents the debugger from stepping into this method, a breakpoint will be hit if Just My Code is disabled.

DebuggerTypeProxyAttribute - used on assembly, class and struct, allows the developer to design their own debugger view for types. This is especially useful for complex classes where having certain information in the debugger window is useful, and it allows more flexibility than DebuggerDisplayAttribute.

DebuggerVisualizerAttribute - used on assembly, class and struct, provides a visualizer for a certain type, example usage would be to display the image for an image type instead of just the width/height/image bytes.
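
A short sketch of three of these attributes on a hypothetical Customer class (the class and its members are assumptions; the attributes themselves live in System.Diagnostics):

```csharp
using System.Collections.Generic;
using System.Diagnostics;

// The debugger shows a one-line summary instead of the type name;
// ",nq" asks it to print the string without quotes.
[DebuggerDisplay("Customer {Name,nq} ({Orders.Count} orders)")]
public class Customer
{
    public string Name { get; set; }

    public List<string> Orders { get; } = new List<string>();

    // Hidden from the debugger's variable windows; still a normal field in code.
    [DebuggerBrowsable(DebuggerBrowsableState.Never)]
    private readonly object syncRoot = new object();

    // ToString is independent of DebuggerDisplay and is what ends up in logs.
    public override string ToString() => $"Customer(Name={Name}, Orders={Orders.Count})";

    // The debugger steps over this helper instead of into it.
    [DebuggerStepThrough]
    public void AddOrder(string orderId)
    {
        lock (syncRoot) { Orders.Add(orderId); }
    }
}
```

None of these attributes changes how the program runs; they only change what the debugger shows and where it stops.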


Symbols

Symbols are stored on Windows in PDB files, which are one of the greatest aids to debugging on Windows. On .NET they give you source filenames and line numbers, which help you locate the source of an error/exception.


Logging Libraries

Common.Logging - provides a logging abstraction, on top of which you can plug in specific loggers such as log4net and NLog.

Log4net - one of the most famous ones; there are many flavors of this library for other languages, such as log4j, log4javascript and log4cpp.

NLog - log4net and NLog provide similar functionality; the major difference is that NLog is more actively developed.

Enterprise Library Logging and Semantic Logging - Microsoft's logging implementation, from the Microsoft Patterns & Practices team.

ObjectGuy Logging Framework - one of the more popular lightweight logging frameworks.

Here's a small comparison between some of them.


Trace Listeners

While full-blown logging frameworks provide good and elaborate ways to log messages in your application, sometimes Debug and Trace messages are enough, but you still want to capture them somewhere other than inside the debugger. This is where Trace Listeners come in: there are Event Log, File, Console, Delimited and XML listeners, plus the Default, which writes to the debugger log.

If you want to implement multiple logging levels in combination with the built-in Debug/Trace classes, you can use TraceSwitch, and if you want the advanced version of Trace, you should use TraceSource, which supports multiple levels and traces.
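
A minimal TraceSource sketch. The source name "MyApp", the event ids and the messages are assumptions, and the level is hard-coded here although it would normally come from configuration:

```csharp
using System.Diagnostics;

public static class Tracing
{
    // Only events at Warning severity or worse pass the source's filter.
    public static readonly TraceSource Source = new TraceSource("MyApp", SourceLevels.Warning);

    public static void Demo()
    {
        // Filtered out: Information is less severe than the Warning level.
        Source.TraceEvent(TraceEventType.Information, 1, "Cache warmed up");

        // Passed to every listener attached to the source.
        Source.TraceEvent(TraceEventType.Error, 2, "Could not open file {0}", "data.bin");
        Source.Flush();
    }
}
```

Attaching a listener, for example Tracing.Source.Listeners.Add(new TextWriterTraceListener("app.log")), decides where the surviving events end up.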


Protecting against cryptic errors

Cryptic errors, null references, exceptions that tell you nothing, "input error", "field validation error". I hate them, and so should you. They make you dig through the call stack, attach a debugger, read states, add watches, look at flows, etc. This is not fun programming, so let's see what we can do about them.

All code should verify its input and make sure the data is ready to operate on. If you're worried about performance, you can write the validations so they can be compiled out, using conditionals or the preprocessor, your choice.

The simplest example that comes to mind is a function that divides a number: throw an exception when the denominator is zero.

But that's too simple. Why is the denominator zero? Is it a configuration issue? Is it coming from a different operation? Is it the result of a malfunction? That's what we would really like to know.

Another thing to keep in mind: what if the denominator comes not from an argument but from state? How did the state get there? Who changed it? Why is an invalid value allowed? Why didn't we throw an error when the denominator was set to 0 in the first place?

When developing methods, classes and properties, always ask yourself "what if" questions and try to protect your objects from invalid state by throwing the right error as early as possible.

Choosing between asserts and exceptions should be straightforward: if the errors should be checked and handled in production, use exceptions; if the errors are mostly development mistakes, use Debug.Assert or Trace.Assert. When compiling a release version, only Trace remains; this holds for all Trace vs Debug methods.
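
For the denominator example, a sketch of "throw as early as possible" could look like this. RatioCalculator is a hypothetical class; the point is that the setter rejects the invalid value, so the stack trace leads to whoever set it.

```csharp
using System;

public class RatioCalculator
{
    private int denominator = 1;

    public int Denominator
    {
        get { return denominator; }
        set
        {
            // Reject the invalid state where it is introduced, so the stack trace
            // points at whoever set it rather than at a later division.
            if (value == 0)
                throw new ArgumentOutOfRangeException(nameof(value), value,
                    "Denominator must not be zero.");
            denominator = value;
        }
    }

    public int Divide(int numerator)
    {
        // Safe: the setter guarantees the denominator is never zero.
        return numerator / denominator;
    }
}
```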


Why shouldn't I use Asserts?

I'll play the devil's advocate here. Asserts are, well, cool development helpers, but in essence they create different code that executes in debug and release builds. Some properties cause an internal object state change when read, so observing them in an assert can change the outcome. The result is different behavior in two builds that are supposed to be the same code.

Never use asserts to validate user input.

Never use asserts to validate environment issues, such as memory allocation, network, database and file system issues; these are to be expected and should be handled with exceptions.
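
The "different behavior in two builds" problem is easiest to see when the asserted expression has a side effect. A sketch with a hypothetical AssertPitfall class:

```csharp
using System.Collections.Generic;
using System.Diagnostics;

public static class AssertPitfall
{
    // Bad: the Dequeue call lives inside the assert, so a Release build
    // (where Debug.Assert calls are removed) never dequeues the item.
    public static int CountAfterBadAssert(Queue<string> queue)
    {
        Debug.Assert(queue.Dequeue() != null, "Queue returned a null item.");
        return queue.Count;
    }

    // Better: perform the side effect unconditionally, assert only on the result.
    public static int CountAfterGoodAssert(Queue<string> queue)
    {
        string item = queue.Dequeue();
        Debug.Assert(item != null, "Queue returned a null item.");
        return queue.Count;
    }
}
```

The first method returns a different count depending on the build configuration; the second returns the same count in both.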


Debugging Methods

Debugging production is not the same as debugging a development build, which has all the symbols and debug messages, debugging web applications is not the same as desktop applications and so on.

Debugging development is relatively easy, you have all the symbols, you know which line an error occurred, you can attach a debugger and drill down through the call stack, examine variables and see the current state of almost everything related.

But what do we do on production?

That's why exceptions, call stacks and logs are important. But without context they might mean something you didn't think about, so always log the file version, configuration, environment settings and anything else that might affect your application.

You should also try to divide your logs into meaningful modules, perhaps even classes if you enable tracing, so you'll know how you got where you did. If your object can be called from multiple paths, perhaps even include where the call came from. More than once I've chased a bug thinking it came from one place when it actually came from another.

Another thing you can include in your logs is the stack trace. This is especially easy when doing AOP, where you can hook an interceptor on all methods and see who called what, and even time it.

Other than knowing what to write to logs, you should know how to read them, otherwise it's just wasted disk space. Make sure your log reader can filter and search and, if it's a web application, trace all the calls for the same session id.

If you find errors or warnings, don't fixate on the ones you see; filter the entire log for similar issues. If you wrote your log messages well, you'll most likely find other warnings or errors which might seem unrelated at first but could have a relationship.


Tools

Debugging tools are very important, but use the right tool for the job: dump analysis, memory tracing or performance analysis, web or desktop.

First and foremost is the Visual Studio debugger, which has an extensive set of debugging tools, from the most basic ones such as breakpoints and watches to call stack, threads, tasks and the immediate window. It can open a dump file and load symbols. With this set of tools, debugging is easy.

Next in line is WinDbg, which can be used anywhere with just a few simple .NET helpers: sos.dll, psscor2.dll and mscordacwks.dll.

Getting a memory dump is just a matter of right clicking the process in Task Manager and selecting Create Dump File.

If you're having memory issues, the simplest way would be to get a dump, open it in Visual Studio and look at the Heap View, or use 3rd party tools like ANTS Memory Profiler or dotTrace.

The Visual Studio Performance Profiler can run both on the development machine and standalone on production (but note it might slow down overall performance), even for web applications. You should always test performance on your development or test machine first; production should be the last resort and in 99% of cases is not needed.

Debugging multithreaded applications is easy, you can freeze/thaw threads, you can switch threads and on a higher level you can do the same with Tasks.


“An ounce of prevention is worth a pound of cure.” ― Benjamin Franklin

The idea of asserts is that you can choose to run these checks for every method. If an invalid value is provided, you have a few choices: exceptions, asserts and Contracts can all tell developers that they made a mistake, such as uninitialized properties or invalid variables passed into your methods. This is called Defensive Programming; the basic premise is that you write code as if everyone is out to get you, paranoid level. In my opinion, when a method does not make it 100% clear what it does and which parameters are acceptable, contracts and asserts can really clarify things: if you see a line such as assert(x != null), you're definitely going to need x.

Defensive or Robust Programming can help reduce bugs, security vulnerabilities and wrong outcomes. It can save you hours of debugging by catching problems when they occur; think of it as a sanity check for every method.

Exceptions - throw exceptions when things are not working the way they were designed to. For custom exceptions, include the culprit if possible so further analysis is possible. Part of your unit tests is to check that bad conditions do indeed throw exceptions.

Debug.Asserts - assert as much as needed: check all parameters that can crash a method or make it misbehave, and check assumptions in the middle of methods. Always include an assert message that explains the error; since asserts are helpers for developers, it should be enough to understand the problem and how to fix it. If you log, include the state that caused the assert. If you have a struct that defines the bad behavior, it's relatively simple to point to the data that caused the assert.

Trace.Asserts - if an application state can cause damage to user data, it's important to keep these checks even in production code. For this you can use Trace.Assert, as it doesn't get compiled out in release builds. The biggest difference between an unhandled exception and Trace.Assert is that the user can elect to ignore the assert, while the unhandled exception will terminate the application.
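
Putting the three together in one hypothetical method (Pricing.ApplyDiscount and its rules are invented for illustration): exceptions guard what callers pass in, Debug.Assert checks an internal assumption during development, and Trace.Assert keeps an invariant check in release builds.

```csharp
using System;
using System.Diagnostics;

public static class Pricing
{
    public static decimal ApplyDiscount(decimal price, decimal percent)
    {
        // Caller mistakes that must be handled in production: exceptions.
        if (price < 0)
            throw new ArgumentOutOfRangeException(nameof(price), price,
                "Price must not be negative.");
        if (percent < 0 || percent > 100)
            throw new ArgumentOutOfRangeException(nameof(percent), percent,
                "Percent must be between 0 and 100.");

        decimal result = price - price * percent / 100m;

        // Internal assumption, checked in Debug builds only; the detail
        // message carries the state that caused the assert.
        Debug.Assert(result <= price, "Discounted price exceeds the original price.",
            "price={0}, percent={1}, result={2}", price, percent, result);

        // Internal invariant that also stays in Release builds.
        Trace.Assert(result >= 0, "Discounted price is negative.");
        return result;
    }
}
```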


Logging Performance

You might encounter performance issues in your application with regard to logging if you log too much; after all, each log record requires executing code other than your application's logic. While debugging you can turn on all the logging levels, but in production turn off the debug and trace levels.

Another thing to remember is that if your log method looks anything like so:

void Log(string message, params object[] args);
Log("writing to file", objects.Where(i=>i > 1).ToArray());

it will go into the call and execute the Where(i=>i > 1).ToArray() even if the log is not written.


You can fix this by having your logging methods accept a lambda and have a more complex Log method, similar to this:

Log("writing to file", () => objects.Where(i => i > 1).ToArray());

You'll end up with code that doesn't execute if the logging level is not enabled and you don't have to check the log level every time you log a message.
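
A sketch of what such a "more complex Log method" could look like. LazyLogger, its LogLevel enum (ordered as in this article, with Debug as the most detailed level) and the in-memory Lines list are all assumptions made for the example:

```csharp
using System;
using System.Collections.Generic;

public enum LogLevel { Debug, Trace, Info, Warn, Error, Fatal }

public class LazyLogger
{
    public LogLevel MinimumLevel { get; set; } = LogLevel.Info;

    // Stand-in for a real sink such as a file or a logging library.
    public List<string> Lines { get; } = new List<string>();

    public void Log(LogLevel level, string message, Func<object> argument)
    {
        // Check the level first; the delegate runs only when the message is kept.
        if (level < MinimumLevel) return;
        Lines.Add(level + ": " + message + " " + argument());
    }
}
```

The cost that remains for a disabled level is creating the delegate and one comparison, which is usually far cheaper than evaluating the query.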

If your code uses the built-in Debug and Trace functionality (Write / Assert), Debug is compiled out in release builds. If you have special logs for debug builds, you can mark your code with Conditionals or the preprocessor.

What are Conditionals? Let's say we define the symbol LIVE:


Project Properties / Build / Conditional compilation symbols

Essentially this:

[Conditional("LIVE")]
void sendDetailsToServer()
{
}

will be called as usual. But if you remove the symbol from the list, the compiler leaves out every call to sendDetailsToServer (including the evaluation of its arguments), although the method itself is still compiled into the assembly.

Another option is to use the preprocessor. It is more primitive: surrounding code with #if / #endif can be a bit exhausting, since you also need to surround the calling code.
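
Both options side by side, as a sketch. The Telemetry class and its Sent counter are hypothetical, and the #define at the top stands in for the project-level symbol so the file is self-contained:

```csharp
#define LIVE   // normally set in the project's conditional compilation symbols

using System.Diagnostics;

public static class Telemetry
{
    public static int Sent;

    // Calls to this method are omitted by the compiler when LIVE is not defined
    // at the call site. Conditional methods must return void.
    [Conditional("LIVE")]
    public static void SendDetailsToServer(string details)
    {
        Sent++;
    }

    public static void Run()
    {
        SendDetailsToServer("started");   // kept only when LIVE is defined

#if LIVE
        // Preprocessor alternative: this whole region disappears without LIVE.
        Sent++;
#endif
    }
}
```

With the attribute, call sites stay clean; with #if, every call site has to carry the directive itself.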


Performance Logging

Debugging is not only about bugs; it can also be about performance issues. While performance analysis can help you pinpoint the bottlenecks in your application, it's also important to give the outside world some performance metrics for your general operations.

If you look at the Windows Performance Monitor, you will see many such performance metrics; they are called Performance Counters. Some of the values are reported as a total and some per interval. .NET provides access to the Performance Monitor through the PerformanceCounter class. On the plus side, anyone can add Performance Counters and measure the application's performance, an administrator can put an alert on a high value, and you get the added benefit of making your application system administrator friendly.

In the end, they can use your metrics with your guidance to detect problems or potential problems before they become a catastrophe.

Another thing that can help you look for problems in the log is including how long a certain operation took; you can use Stopwatch to time your methods.
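
A small sketch of timing an operation for the log, using a hypothetical Timed.Run helper (the name and the message format are assumptions):

```csharp
using System;
using System.Diagnostics;

public static class Timed
{
    // Runs the action and returns a log-ready line with the elapsed time.
    public static string Run(string operationName, Action action)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return operationName + " took " + stopwatch.ElapsedMilliseconds + " ms";
    }
}
```

For example, Timed.Run("load customers", () => LoadCustomers()) would wrap a LoadCustomers method of your own and return a line ready to be written at the trace level.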


Contracts

Code Contracts allow developers to statically validate that methods are indeed getting valid parameters and producing valid output. Code contracts are not new, but Microsoft has recently open sourced them so you can take a look.

To use code contracts you need to download a package that hooks into the compilation process and does static analysis of the code and its contracts.

Eventually you will get a list of contract violations in the TODO pane which will indicate potential problems.

The static code analysis has some problems though: it's very slow and sometimes it gives false positives.

If you choose not to use the static code analysis, you can still benefit from contracts by hooking the Contract.ContractFailed event, which will give you an indication that there is a problem.

I can't really recommend it. It looks cool, but I don't have experience with any large scale project using it, so feel free to explore the Contract class, FAQ and documentation.


Summary

Debugging is part of every developer's life. While it is an important and useful way to fix bugs, it's more important to avoid bugs as much as possible. Unit tests are a great development tool; they allow you to fix and refactor your code easily without being overly cautious. But as much as we would like to find a magic bullet, there really isn't one, so my motto is: prepare for the worst and log everything that could help you diagnose a problem or indicate a suspicious state.

In the end, no article is going to make you a good programmer; this is only an introduction to a nice set of tools to get started: test, log, debug.

Tuesday, June 24, 2014

Performance, Profiling and Optimization 101

This article is common sense sprinkled with personal opinion and experience. Don’t follow everything to the letter, and remember:

There is an exception to every rule

In my opinion, optimization is for one of two cases: the customer demands it, or you’re doing it for fun. We’ll look at both attitudes in this article. Some optimizations are more of a challenge than a real source of income: the performance gains could be great but benefit no one beyond a pat on our own backs. I enjoy both attitudes, as some of my past articles show. I’ll optimize string formatting for fun, and I’ll check whether I can accelerate an entire processing job for business if my customer demands the minimal latency a computer can possibly achieve. In that case I will tell them which process will take how many developer resources and what the maximum possible benefit is, so they can decide whether that second is worth 2 days of development time.

Though everyone knows the value of tests, they carry special weight in performance tuning: they show not only that we improved performance but also that the outcome is still correct. Performance tuning makes things more complicated and it is sometimes hard to keep everything working together, so an outside inspector (tests) helps keep your mistakes to a minimum.

What is performance analysis?

- From Wikipedia: profiling is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or frequency and duration of function calls. The most common use of profiling information is to aid program optimization.

What is software optimization? 


- From Wikipedia: software optimization  is the process of modifying a software system to make some aspect of it work more efficiently or use fewer resources. In general, a computer program may be optimized so that it executes more rapidly, or is capable of operating with less memory storage or other resources, or draw less power.

Let us assume

  • The more work you do, the more resources it will take, where resources are any combination of CPU time, IO (network/disk) and memory.
  • For the same amount of data, a few big chunks load faster than many small chunks, where chunks can be files on disk or network, the two major reasons are latency and overhead.
  • Don’t optimize unless you have to! Optimization is often idealized in the developer’s eyes: write this optimized, it’s more efficient; write that optimized, it uses fewer resources. Most of the time optimizations make things more complicated and less maintainable, and you don’t want that overhead when a real bug comes in and you need to figure out what went wrong. Optimization without profiling is often time wasted. Think like a businessman: if you don’t have the extra cash, don’t spend it, and if you're considering a loan, be prepared to pay the interest (delays).
  • For many server applications it’s cheaper to throw more hardware at the problem than to optimize.
  • For many client applications the customer’s time is more expensive than developer time! If you’re offering a free application, slowness might encourage your users to abandon it. If it’s a paid application, think of your customers: their time is not free, and if their productivity is lowered, they won’t want to spend much time/money on your application. When thinking about customer time, multiply the number of users by the time wasted: is it worth more than the developer time and potential income?
  • Understand before optimizing. Always understand what the application is trying to accomplish before you try to optimize. The big picture may show you that the whole process isn't necessary to achieve the end result, and now you've cut a significant amount of time and resources. Algorithms are often the bottleneck; is there a more efficient one?
  • Your hardware has finite resources. 

Donald Knuth wrote in “Structured Programming with go to Statements”: “premature optimization is the root of all evil”.

  • It is hard to be certain where a bottleneck will be in a production system. You might assume that a certain piece of code slows things down, but in reality so many variables affect the system that the piece you thought was the problem may be negligible in the grand scheme of things. It’s also inefficient to optimize tasks whose slowness you don’t care about and tasks that execute rarely, unless those tasks need to provide an answer in real time.
  • Design with performance in mind. It’s much cheaper to develop something that withstands, or almost withstands, the planned performance requirements and then optimize, than to develop a cheap solution that requires much more work later to reach the designed performance. Find your balance between time to market and performance. I know this looks like a contradiction to premature optimization (...), but it really isn’t. Assume you have a web application that needs to display 1000’s of records in a grid: planning to display 10 and optimizing later will waste time; infinite scroll or another common practice is much more suitable.
  • Do a preliminary profiling as early as possible for the busiest parts so you’ll have a big picture of what smells and needs another peek. Do your profiling later when the system goes beta/production so you’ll know where the real bottlenecks are.
  • Working, Correctly, Fast. Your development flow should be first to make things work, then make them work correctly, and only then look for optimizations. It’s easier and more accurate to do performance optimization after everything works correctly.
  • Optimization usually complicates rather than simplifies. Optimizing code for performance usually requires caches, dictionaries, arrays and other less readable methodologies, that’s why it should be done at the end, so that the first working code will actually work correctly, complexity usually defies readability and ease of maintenance.
  • Waste not, want not. Don’t waste resources, don’t over-calculate stuff, don’t store things you don’t need, don’t retrieve data you’re not going to use. This will leave you with more resources for the stuff that really needs it. When you load more data than you need, IO has to work harder (how much depends on latency and throughput), which in turn uses more CPU and sometimes more memory. In the end, you pay a high price for doing unnecessary work.
  • Don’t waste memory resources. High memory usage leads to swapping, so don’t cache what you don’t need or might rarely use; a cache also needs to be maintained so it won’t become stale. Overusing memory also leaves less memory for the file system cache, which can make the whole system slow. One of the first things I check when I have a slow server is how much memory is available and whether SQL Server is running without a hard memory limit. If you suspect your program is misbehaving, profile it!
  • Profile, Optimize and Validate. Repeat as needed. Don’t assume anything is faster or slower, and don’t assume your improvement actually improves anything; under the right circumstances slow things can be fast and vice versa. Validate, I can’t stress that enough.
  • Don’t keep your code if it’s only a minuscule improvement. Most of us are sentimental about the code we write and feel that something we spent an hour or more on shouldn’t go to waste. But remember that optimizations usually impair readability and maintainability, so keeping the code will do more harm than good. If you’re really attached to it, open a graveyard blog and put it on display; this way others can learn and you get to keep it.
  • Why code metrics are not enough. Code metrics can be great for detecting complexity and maintainability hotspots in your code, but they can’t measure performance: the fact that your code is complex or simple doesn't mean it's slow or fast. A lot of performance bottlenecks can only be detected from the data that goes through the program’s pipeline, which is why it's even more important to check performance with real-world data rather than a mockup.
  • Understand Big O notation for algorithmic use. Big O notation describes how the running time (or memory use) of a function grows as its input grows. O(1) means the cost stays constant regardless of input size, O(n) means the cost grows linearly with the input, and O(n²) means that doubling the input roughly quadruples the work.
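
To make the Big O point concrete, here is a minimal sketch that counts how many elements a linear search (O(n)) and a binary search (O(log n)) inspect in the worst case. The function names and the 1,024-element array are made up for illustration, and counting steps is only a rough stand-in for real timing.

```javascript
// Count how many elements each search inspects before it finds the target.
function linearSearchSteps(sorted, target) {
  let steps = 0;
  for (const value of sorted) {
    steps++;
    if (value === target) break;
  }
  return steps;
}

function binarySearchSteps(sorted, target) {
  let steps = 0;
  let low = 0;
  let high = sorted.length - 1;
  while (low <= high) {
    steps++;
    const mid = (low + high) >> 1; // middle index, rounded down
    if (sorted[mid] === target) break;
    if (sorted[mid] < target) low = mid + 1;
    else high = mid - 1;
  }
  return steps;
}

const data = Array.from({ length: 1024 }, (_, i) => i); // 0..1023, already sorted
const worstCase = data[data.length - 1]; // last element: worst case for linear search

console.log("linear search steps:", linearSearchSteps(data, worstCase));
console.log("binary search steps:", binarySearchSteps(data, worstCase));
```

On this input the linear search inspects all 1,024 elements while the binary search needs at most 11, and the gap widens as the input grows.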

How to get performance improvements?

  • Tweaking. Usually minuscule gains. Reordering commands might affect CPU cache performance and speed things up, and SIMD instructions can even double the performance of certain operations, but unless they execute in big loops or on very big data, these tweaks will probably save a second here and there. If you see a significant amount of work being done on blocks of data, like matrix/vector calculations, you should look into SIMD programming. That is completely outside the scope of this article and you’ll probably need to go native with C++, but there is also a .NET implementation.
  • Memory access - Understanding reference/value. All references are 32 or 64 bits (depending on architecture). Primitives and structs are value types, and calling a method with a parameter by value copies its contents; this is a micro-optimization, but in large loops it can affect performance. In case you didn't know, strings are passed by reference but are also immutable, which means that any manipulation of a string creates a new string and leaves the old one behind; strings are a nice meal for the garbage collector. In any case, the time spent copying structs is directly related to their size and, worst case, you can always pass them by reference with ref.
  • Algorithmic. Usually big improvements, in memory and/or time. For example, different sorting algorithms, search trees, etc.
  • Object reuse. Object pools were created because allocating memory and creating objects is time consuming (not to mention destroying them). When many objects are created and destroyed, it's probably going to be more efficient to use an object pool: you request an object and, when you’re done with it, return it to the pool.
  • Exceptions. Entering a try block is cheap, but throwing an exception is expensive in terms of CPU resources: the runtime has to capture a stack trace and unwind the stack to find a matching handler, which is why exceptions are advertised as a low performance mechanism. In extreme cases where performance is critical, it might be wrong in a design sense but right in a performance sense to use other error handling methods. Then again, if you have such significant performance losses due to exceptions, you should reconsider your design anyway.
  • Caching. Processing power costs money, memory costs money, time costs money. Caching can be a balance between the three, but it can also take your application in the wrong direction. Don’t cache everything; maintaining a dirty cache is as bad as having no cache. A distributed cache can be a scale out solution but it can also be a design pitfall. With a cache, design for failure: never assume something is already in the cache or fresh enough, and keep a timestamp or other marks to make sure your cache is not stale.
  • Deferred execution. If you don’t need real time handling, don’t do real time. Processing things now is costly and most servers are not busy 24/7. If you can postpone your reporting until the server is idle, you’ll look more professional than if you let your users wait for an undetermined amount of time. Don’t promise that the report will run in real time and then freeze your system until it finishes, and don’t display a due time and regularly miss it. It's unprofessional.
  • Serial execution/Queues. If your application cannot run concurrently without using too many locks and creating random deadlocks (which also means it's misbehaving), it's sometimes more beneficial to use queued execution. Sometimes it's even faster to execute jobs serially, as none of them has to wait on a lock.
  • Push vs Polling. Both have their pros and cons. Push is usually faster and in some sense less resource intensive, since the server is not being queried every so often, but it also keeps a connection open all the time. You should consider both options and decide based on how much delay you're willing to accept and how many resources you're willing to invest. Push can be more complicated to implement, but solutions such as SignalR/Socket.IO/WCF callbacks provide enough infrastructure to never use that excuse again.
  • Casts. Casts take time, not a lot, but in some cases they can affect performance. You should use Generics or Interfaces to make the program more readable and not worry about casts too much.
  • Accuracy. If you only need float, don't use double. This has varying results on different architectures: sometimes double is faster, sometimes it's slower. I've contemplated whether I should list it as an optimization at all, but if your code is heavy on floating point operations you should at least look at it.
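
As an illustration of the object reuse item above, here is a minimal object pool sketch. The `ObjectPool` class, its `create`/`reset` callbacks and the point objects are hypothetical names chosen for this example, not an API from any library; a real pool would usually also cap its size.

```javascript
// A minimal object pool: `create` builds a new object, `reset` clears one for reuse.
class ObjectPool {
  constructor(create, reset) {
    this.create = create;
    this.reset = reset;
    this.free = [];   // objects waiting to be reused
    this.created = 0; // how many objects were actually allocated
  }

  acquire() {
    if (this.free.length > 0) return this.free.pop();
    this.created++;
    return this.create();
  }

  release(obj) {
    this.reset(obj); // never hand out an object that still holds old state
    this.free.push(obj);
  }
}

const pool = new ObjectPool(
  () => ({ x: 0, y: 0 }),
  (point) => { point.x = 0; point.y = 0; }
);

for (let i = 0; i < 1000; i++) {
  const point = pool.acquire();
  point.x = i;
  point.y = i * 2;
  pool.release(point); // return it when done, as described above
}

console.log("objects allocated:", pool.created);
```

The loop uses a point 1,000 times but allocates only one object, because each point is returned before the next one is requested. Whether that beats plain allocation in your runtime is something to measure, not assume.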

Where to optimize?



This is Visual Studio Profiler. I've made a little test program that compares several sorting algorithms. We can clearly see which ones take more CPU, while the faster ones don't even show up. This is just to show you how easy it is to look for performance bottlenecks with this tool.
Note the views at the top: each view helps you determine the most probable sources of bottlenecks, and each one has a different purpose, so explore all of them.

Internet Explorer



IE 10 Profiler on http://demos.dojotoolkit.org/demos/mobileCharting/demo.html
What should we look at?

  • setAttribute is executed 51k times and takes 329ms. Perhaps it's affecting the UI's performance? Is there something we can optimize in that function or the calling functions?
  • elementFromPoint is not executed too many times but still takes 316ms. What does it do? Does it have internal loops?
  • hideChartView is executed only twice but takes 192ms. What else does it do besides hiding a chart view?
At first glance it doesn't look like this web application is taking too much CPU, so unless our customer demands it, we will probably not optimize anything.

But wait, we should also look at the call tree:


Here we can actually see which function is calling which. It helps us understand the call tree of some of the hotspots, and we see that the most significant slowdown happens when the mouse is moving: something is rendering, creating a rectangle and setting stroke and fill. With some more digging, perhaps it's even possible to speed up the execution.

Chrome


While IE provides a basic profiler, Chrome has a more robust one. I recommend going over the documentation from Google for Network, Timeline, Javascript and Memory.

Chrome can also show you how CSS selectors slow down your application, it is part of the style recalculation.


Javascript Profiling


Timeline - Events Profiling


YSlow


Yahoo's YSlow is a nice plugin. It gives you a general performance overview: whether you loaded too many files, where you link your styles and other common issues. It doesn't give you a thorough knowledge of the application, but it's still needed information for optimizing your application/website.




There are many more tools that can help you diagnose and optimize your websites and applications, but I've come to like Chrome and Visual Studio and for now they are exactly what I needed.

Methods of profiling:

  • Sampling - Sampling periodically takes snapshots of the stacks of the currently executing threads. By statistically analyzing which functions are caught most often in these snapshots, you can see which methods take the most CPU resources.
  • Instrumentation - Instrumentation inserts interception points in the code so that every function call is measured. This is more accurate than statistical sampling, but it also slows down execution significantly. It should be used when the system being profiled is under load and it's not known how many resources the application actually has at its disposal: with unstable CPU resources, sampling results are skewed by whether or not there is load on the system, and fast methods can look like slow ones. It's also useful in multithreaded applications where some threads affect the performance of other threads, though concurrency profiling might be more suitable.
  • Performance Counters - Application performance can also be monitored with Windows Performance Monitor. The administrator can assign triggers to values you know have a bad effect on performance, so you leave that to the administrator instead of implementing your own. It's very useful in server environments where these performance counters can be collected and analyzed later.
  • Memory Profiling - Performance issues may arise as more and more memory is allocated and the system goes into swapping, but that's not the only concern: each allocation incurs an overhead, so it is usually better to allocate more than you need right now than to allocate many very small chunks. Like everything, the key is balance. Visual Studio Profiler allows you to see which functions allocated memory, how many allocations they made and their total size.
  • Resource Contention (concurrency) Profiling - When developing multithreaded applications, sometimes it's not clear why a certain function is slow. You may see that a lot of time is wasted on locks, but to understand how the interaction between the threads is hampering the application, you can use Resource Contention (concurrency) profiling, which visualizes how the application behaves.
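
To show what an interception point looks like, here is a toy version of instrumentation: a wrapper that counts and times every call to a function. The `instrument` helper and the `stats` object are hypothetical names for this sketch; real profilers inject this kind of probe for you and use far more precise clocks than `Date.now()`.

```javascript
// Wrap `fn` so that every call is counted and timed under `name` in `stats`.
function instrument(name, fn, stats) {
  stats[name] = { calls: 0, totalMs: 0 };
  return function (...args) {
    const start = Date.now();
    try {
      return fn.apply(this, args);
    } finally {
      // Runs even if fn throws, so failed calls are measured too.
      stats[name].calls++;
      stats[name].totalMs += Date.now() - start;
    }
  };
}

const stats = {};
const square = instrument("square", (n) => n * n, stats);

let sum = 0;
for (let i = 1; i <= 100; i++) sum += square(i);

console.log("sum:", sum);
console.log("calls:", stats.square.calls, "total:", stats.square.totalMs + "ms");
```

The wrapper itself adds work to every call, which is the slowdown the instrumentation item describes.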

Understanding trade-offs

There is no “best” solution, there are trade-offs between memory efficiency, high performance, and readability and maintainability. It is rare to have more than two, so in cost-effective planning you should decide which is more important to you for each module.
  • List vs Dictionary. A list is usually more memory efficient, but pulling non-sequential information from it is usually slower, not to mention a search, which is O(n). An ideal dictionary is close to O(1), but nothing is really ideal: adding a value to a list is quicker than adding it to a dictionary, as a dictionary has to maintain internal state for its keys.
  • Buffering. Buffering data can do two things: it can prepare a larger block for an IO operation, or it can collect incoming blocks so the work is done in one large operation instead of multiple small ones. Buffering is usually done to reduce overheads at the cost of delay and memory. It can be used in many ways: you can buffer a number of messages you want to pass to the client and then push them in one big bulk, or you can buffer incoming information so you execute a parser only once for several messages.
  • Filtering/Grouping. If you do store a dataset in the cache, consider breaking it down into the groups your queries are going to use. Storing multiple groups can be more efficient for later processing, but it can also add latency, in which case it ends up less efficient.
  • Precompute. Precomputing results can save time if these results will be used multiple times, partial precomputing can also help, again, take into account the problems stale data might create.
  • Caching. Caching can help with static or partially static data especially when using a slow storage medium, this might give you a rough idea of what to expect when you think about caching:
Read 1 MB sequentially from memory .....     250,000 ns = 250 µs
Read 1 MB sequentially from SSD* .......   1,000,000 ns = 1 ms
Disk seek ..............................  10,000,000 ns = 10 ms
Read 1 MB sequentially from disk .......  20,000,000 ns = 20 ms
Send packet CA->Netherlands->CA ........ 150,000,000 ns = 150 ms

Note that the last line is about a single packet round trip, not 1 MB.
  • Hashing. Let's think of this scenario: you want to find duplicate files in a file system based on content. You could compare each file's content to every other file's content, which gives the most reliable results. Let's think of optimizations for that program. First, we remove all the 0 size files from the comparison, because they have no content to compare. Then we compare only files of the same size; if the sizes don't match, the files are not the same anyway. Now we’re left with groups of files of the same size. But what do we do with these 100 MB files? We have 20 of them, and comparing every pair (190 pairs of 100 MB each) is the same as comparing 19 GB of data! We can save some time by hashing all the files, which reads only 2 GB, then comparing the hashes and comparing file contents only where the hashes match. So now we read a minimum of 2 GB and already know whether there’s a chance anything will match; we can also decide whether the hash is enough for us or we want to compare the actual files. Now, a bigger question: should we optimize the hash function? Well, does it hash slower than your disk drive can read? How many comparisons like that are we going to execute each day? One? Not worth it. All day? Perhaps worth it. Is it critical that we compare everything in the shortest amount of time? Could be worth it. Who is paying for that? You get the point.
  • Parallelism. For a single user it's probably more efficient to use their entire CPU with all its cores. But consider a server where you optimized a certain operation to use all the cores: a 2nd user comes along and also gets all the cores, and the performance effects are unpredictable, especially if the application is designed to create more and more threads. Context switching is a real issue when using too many threads. I would advise against using multiple threads for the same request in web applications. Instead, you can use a queuing mechanism and keep the heavy lifting on a different server.
  • Inlining. Inlining has a minuscule performance gain. With that in mind, inlining is a technique the JIT compiler uses to speed things up for very small methods. When a program calls a method, the parameters have to be pushed onto the stack and the processor jumps to the method’s body. Inlining saves some of these steps by placing the method’s body directly in the calling method. For very small methods it can actually speed things up; for medium to large methods the gain is negligible. .NET 4.5 introduced a flag to request inlining (AggressiveInlining), while previous frameworks only supplied a flag to prevent it (NoInlining).
  • Lambda/LINQ limitations. Lambdas and LINQ (which usually uses lambdas) are great, I love them, especially for RAD, but lambdas are invoked through delegates, and the JIT compiler generally does not inline delegate calls. Take that into account if you’re processing many records. You should also consider using CompiledQuery for your LINQ.
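
The duplicate-file scenario from the Hashing item can be sketched as follows. This is only an illustration under stated assumptions: files are modeled as in-memory `{ name, bytes }` objects (a hypothetical shape, there is no real file IO here), and `fnv1a` is a small non-cryptographic hash standing in for whatever hash you would really use, such as SHA-256.

```javascript
// FNV-1a (32-bit): a small non-cryptographic hash, used here only as a stand-in.
function fnv1a(bytes) {
  let hash = 0x811c9dc5;
  for (const b of bytes) {
    hash ^= b;
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

function sameBytes(a, b) {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) if (a[i] !== b[i]) return false;
  return true;
}

// files: array of { name, bytes } where bytes is a Uint8Array (hypothetical shape).
function findDuplicates(files) {
  // Step 1: skip empty files and group the rest by size.
  const bySize = new Map();
  for (const file of files) {
    if (file.bytes.length === 0) continue;
    const group = bySize.get(file.bytes.length) || [];
    group.push(file);
    bySize.set(file.bytes.length, group);
  }

  const duplicates = [];
  for (const group of bySize.values()) {
    if (group.length < 2) continue; // a unique size cannot have a duplicate

    // Step 2: hash each file once instead of comparing every pair.
    const byHash = new Map();
    for (const file of group) {
      const key = fnv1a(file.bytes);
      const candidates = byHash.get(key) || [];
      // Step 3: compare contents only when the hashes match.
      const match = candidates.find((c) => sameBytes(c.bytes, file.bytes));
      if (match) duplicates.push([match.name, file.name]);
      else candidates.push(file);
      byHash.set(key, candidates);
    }
  }
  return duplicates;
}

const text = (s) => Uint8Array.from(s, (ch) => ch.charCodeAt(0));
const found = findDuplicates([
  { name: "a.txt", bytes: text("hello") },
  { name: "b.txt", bytes: text("world") },
  { name: "c.txt", bytes: text("hello") },
  { name: "d.txt", bytes: text("hi") },
  { name: "e.txt", bytes: text("") },
]);
console.log(found);
```

Here only a.txt and c.txt are reported as a pair. The byte-by-byte check in step 3 is the "do we trust the hash" decision from the text: drop it if the hash alone is good enough for you.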

Most probable performance hotspots:

  • Loops/For/for each/recursion. Any loop has the potential to occupy the CPU and use a lot of memory. You should check how many objects are being processed and how much time each iteration takes.
  • IO (Disk/Network/Database). Disk/Network/Database IO is many times slower than memory access; see if you can load the data in the background or load less data.
  • UI. UI updates take time: each layout has to be calculated and each control has to draw itself.
  • Program flow. Make sure a 2nd command is not executed when it depends on the success of the first one and the first one failed; you are just throwing resources away. When retrying something, don’t discard the work that was already done if you can avoid it.

Most probable memory hogs

  • Indexes, Hash-tables, Lists, Arrays. Containers of any sort hold data by definition; check whether the data is really needed. Offload it to a cache if it gives you performance benefits but you still want it out of the application's memory. Be careful with distributed or out-of-process caches: network latency and throughput take away the benefit of storing large collections, as they need to be serialized and deserialized to go over the network. This is also important when using the ASP.NET state service.
  • RAW data, file contents (audio, video, text files). Most probably they don’t need to be stored in memory, the file system cache does a great job of using the unused memory.

Basic SQL Optimizations

  • IO is expensive! You can minimize the amount of data being transferred inside the engine by using index includes and using the smallest possible temporary tables. Table variables are suitable for 1000 records or less in most cases.
  • Functions are expensive! Especially when they run on each row, reconsider every function! Also, in that context, avoid joining on expressions, filtering by expressions and selecting with expressions, if you do, check the execution plans and make sure these queries are not too heavy on the CPU and not falling back to table scans.
  • Avoid comparing different data types, some of them will cause a scan, it might be better to have an indexed computed column.
  • Consider using Persisted Computed Columns with indexes and consider executing your queries on them instead of expressions, persisted computed columns execute the function on row update and also update their associated indexes like regular columns which makes them very efficient for a data-warehouse application. 
  • Execution plans are expensive! Use query parameters or stored procedures.
  • Cross joins are expensive! A cross join is a Cartesian product: each row in table ‘a’ is joined with each row in table ‘b’. That’s a times b rows!
  • Avoid Cursors! Cursors are slow, sometimes they are the only way to go but for many problems there are faster options than cursors.
  • Indexes can make your life either heaven or hell depending on your understanding of them. Indexes can help you quickly retrieve data, but each index adds overhead to each update, insert and delete. Use index includes (extra columns stored in the index for quicker retrieval, which avoids a Key Lookup) when the benefits are greater than the downsides; index includes duplicate the data, which means more memory and IO. You can even add indexes on persisted computed columns and have indexed views (schemabinding).
  • Partitioning. Partitioning is an old trick: RAID 0 stripes data across disks without redundancy, and RAID 5 stripes data with parity across at least 3 disks. While SATA3/SAS removed most of the controller bottlenecks, common disk drives are still relatively slow; by splitting your database files across multiple disks, you get more IO/s for your database. Datacenters are also using SSDs more and more, which speeds things up even further.
  • Memory. SQL Server assumes all the server’s memory belongs to it. It might free some memory when the server is in a really bad situation, but from my experience that’s usually too little too late and can bring a server to its knees. It's best to tell it how much memory to use: reserve about 1 GB for Windows, and even more if the server is not a dedicated database server. Ideally you should have at least enough memory for all the active database tables plus enough memory for their most common queries. These days memory is cheap enough that saving money on it doesn't pay off but actually causes you to waste money on a slow server.
  • Avoid negative comparisons. Negative comparisons will most likely cause a scan since they usually can’t use an index. Consider your negative comparisons and the table sizes, and check your execution plans!
  • Retrieve only what you need, join only what you have to. Retrieving more adds overhead, which is usually CPU/memory/network/disk waste; save these resources for operations that need them.
  • Avoid Dynamic SQL, use prepared statements if you have to. All queries are compiled to execution plans; if the query text contains changing values, a new plan has to be created each time the query executes instead of reusing the previous plan.
  • Avoid Like comparisons, especially with leading wildcards. Like with a leading wildcard causes a scan because the index can't be used to seek. If you must search for text, use full text search or avoid leading wildcards.
  • Learn to read execution plans! I can’t stress that enough!
  • Once your database has been running for a while, get the top 10 most expensive queries to get a clue about what's taking most of your resources.
  • SQL Profiler is a great tool for knowing what's going on in real time. If I have performance issues, I set it up to show everything above 100ms. Bear in mind that running a profiler on a server is not without risks; I've had to restart the SQL Server service once after a profiler froze and got the database to misbehave.
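
Why query parameters help with plan reuse can be illustrated with a toy model. The `ToyPlanCache` class below is purely hypothetical: it keys "plans" by the exact query text, which is a simplification and not how a real database engine implements its plan cache. The table and column names are made up as well.

```javascript
// Toy model: a plan cache keyed by the exact query text.
// Real engines are more sophisticated; this only shows why changing literals defeat reuse.
class ToyPlanCache {
  constructor() {
    this.plans = new Map();
    this.compilations = 0;
  }

  // `params` is ignored by the toy; in a real driver the values travel separately from the text.
  execute(queryText, params = {}) {
    if (!this.plans.has(queryText)) {
      this.compilations++; // the expensive step we want to avoid repeating
      this.plans.set(queryText, { compiledFor: queryText });
    }
    return this.plans.get(queryText);
  }
}

const adHoc = new ToyPlanCache();
const parameterized = new ToyPlanCache();

for (let id = 1; id <= 100; id++) {
  // Literal glued into the text: every query looks new.
  adHoc.execute("SELECT Name FROM Users WHERE Id = " + id);
  // Same text every time, only the parameter value changes.
  parameterized.execute("SELECT Name FROM Users WHERE Id = @id", { id });
}

console.log("ad hoc compilations:", adHoc.compilations);
console.log("parameterized compilations:", parameterized.compilations);
```

In this model the ad hoc version compiles 100 plans and the parameterized version compiles one. Your execution plans and the top expensive queries will tell you how this plays out on a real server.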

Basic HTML/Javascript Optimizations

  • Most expensive is IO and UI! Every time you show/hide/create/delete an element, the browser recalculates where everything should be. It's called a reflow.
  • Big/Multiple loops are slow. Use associative arrays if you need a dictionary and regular arrays if you want fast iteration over the elements (for..in is slower).
  • On old browsers, DOM updates/parsing is very slow, innerHTML on IE is faster than on Chrome and Firefox.
  • Avoid updating elements in ways that cause reflows. Padding on IE6-IE8 has the highest performance penalty (x4). DOM manipulations generally trigger reflows, and multiple DOM updates that each create a reflow range from slow to terrible. Absolutely positioned objects carry only their own penalty rather than affecting the whole document.
  • Updating elements is faster before they are part of the DOM for the same reason - reflows.
  • CSS wildcards are slower, use the most selective selectors.
  • Avoid IFRAMEs. A browser in a browser has a penalty/overhead similar to having a new browser window open. Also, IFRAMEs block onload until they are done.
  • Nested DIVs are slower in certain circumstances, as there are deeper elements to reflow.
  • Mind your scope! Local scope is faster than global scope, so declare local variables with var.
  • Avoid using nested properties in loops, they are not optimized. Make local variables and use them instead.
  • Always measure! Don’t assume anything in the browser, there are too many variables!
var start = new Date();
// ... the code being measured goes here ...
console.log("executed in " + (new Date() - start) + "ms");

  • Use CDNs where needed. Browsers limit how many concurrent connections they open per domain, so loading resources from multiple domains can overcome that limitation. In addition, CDNs can provide a server that is closer to the browser, which can serve resources faster.
  • Avoid eval(s), not only for performance issues but also security.
  • Minify your Javascript and CSS files.
  • Know when to use setInterval and recursive setTimeout. setInterval raises an event every x ms, but if the browser is a bit slow, the events pile up and the page becomes unresponsive while it executes them. Another option is to use setTimeout and, as part of the function, call setTimeout again; this way you tell the browser how much time to rest between events instead.
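
The recursive setTimeout pattern from the last item can be sketched like this. The `repeatWithRest` helper is a hypothetical name, and its `schedule` parameter exists only so the sketch can be exercised without real timers; in a page you would leave it at its default.

```javascript
// Run `task`, then rest for `restMs` before running it again.
// Unlike setInterval, the rest period starts only after `task` has finished,
// so a slow task cannot cause callbacks to pile up.
function repeatWithRest(task, restMs, schedule = setTimeout) {
  let stopped = false;

  function tick() {
    if (stopped) return;
    task();
    schedule(tick, restMs); // re-arm the timer from inside the callback
  }

  schedule(tick, restMs);
  return function stop() {
    stopped = true;
  };
}

// Usage in a page (left commented out so the sketch does not keep running):
// const stop = repeatWithRest(() => console.log("poll"), 1000);
// ...later, when polling is no longer needed:
// stop();
```

The trade-off is that the period is no longer fixed: each cycle takes the task's own duration plus the rest time.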

C# Optimizations

  • Most expensive is IO and UI!
  • C# is very fast, in some conditions competitive with C++.
  • Use the data types you need. Avoid structs as method parameters or use ref if applicable: structs are values, so a copy is created for every function call.
  • Avoid dynamic. While this is not directly about performance, and dynamic is technically a static type, it's not checked at compile time; dynamic types make your code less readable and could introduce problems.
  • Avoid Reflection. Accessing type information at runtime is slow; if you have to, cache accessors and learn about dynamic proxies.
  • Avoid casting. Generics are type safe and avoid unneeded boxing/unboxing.
  • Avoid COM, prefer managed code.
  • Use Dictionaries for fast retrieval, avoid searching/linq on lists in critical sections, compare with CompiledQuery(ies).
  • StringBuilder. Use StringBuilder if you have a long list of manipulations; this can save both CPU time and memory, as strings are immutable.
  • If using multithreading, understand the different locking mechanisms: use busy waits (spinning) for very quick operations and blocking waits for long ones.
  • If creating/destroying many/large objects, understand garbage collection, generations, Dispose/Finalize, SuppressFinalize and the Large Object Heap. While garbage collection provides simple memory management for .NET applications, it's important to know that reference rich and large objects carry a certain penalty: reference rich objects need more work to unreference, and large objects are stored in the Large Object Heap, which is unable to relocate objects and therefore doesn't free memory as often. While garbage collection could be a performance hit, it should only be optimized if collection times are affecting performance significantly.
  • Like everything else, measure! (Stopwatch)
  • Defer IO execution until the system is idle or until you must. IO takes CPU time, but it also uses system interrupts, which sometimes lock the whole system for the duration of the interrupt; more IO means less CPU for other tasks.
  • Virtual methods/Interfaces. They have a performance penalty of about 10%. I would not recommend avoiding them, as most of the time the readability and maintainability they provide far outweigh the penalty. On the other hand, hiding a property with the 'new' keyword might have a higher penalty.
  • Use System.Diagnostics.Debug if you want the compiler to remove the calls when compiling in Release, and System.Diagnostics.Trace if you want to keep them.
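
The "measure!" and "use Dictionaries for fast retrieval" items apply in any language. To stay with the language of the timing snippet earlier in this article, here is a minimal JavaScript sketch that times a list search against a map lookup. The `measure` helper, the `users` data and the run count are assumptions made for this example; in C# the same role would be played by Stopwatch and Dictionary.

```javascript
// Use a high resolution clock when available, otherwise fall back to Date.now().
const now = typeof performance !== "undefined"
  ? () => performance.now()
  : () => Date.now();

// Time `fn` over several runs and keep the best result, to reduce noise.
function measure(label, fn, runs = 5) {
  let best = Infinity;
  let result;
  for (let i = 0; i < runs; i++) {
    const start = now();
    result = fn();
    best = Math.min(best, now() - start);
  }
  console.log(label + ": " + best.toFixed(3) + "ms");
  return { result, best };
}

const users = Array.from({ length: 50000 }, (_, id) => ({ id, name: "user" + id }));
const usersById = new Map(users.map((u) => [u.id, u])); // the "dictionary"

const viaList = measure("list search", () => users.find((u) => u.id === 49999));
const viaMap = measure("map lookup", () => usersById.get(49999));
```

Both calls return the same user; which one is faster, and by how much, is exactly what the measurement is for, so no numbers are claimed here.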

Guidelines for designing a new application/feature

  • Ask/Decide on minimum requirements. Don’t optimize prematurely but design for requirements!
  • Prefer library/framework methods, as most of them have already been optimized, but don’t follow blindly: when you profile, also look at the libraries.
  • Avoid unnecessary function calls, and read "necessary" as "necessary right now".
  • Prefer positive checks rather than negative in SQL. 
  • Learn to use threading mechanisms, Tasks, Actions, Async.
  • Don’t waste!

Ask yourselves:

  • Is the execution order optimal?
  • Does it have to run now?
  • Am I discarding data?
  • Is the application doing too much work at this point in time? How can I minimize it?
  • Does the code trigger too many actions? Can I combine them into a single execution?
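
Combining many small actions into a single execution is the buffering idea from the trade-offs section. A minimal sketch, assuming a hypothetical `Batcher` class whose `flush` callback stands in for the expensive operation (a network push, a parser run, a database write):

```javascript
// Collect items and hand them to `flush` in one call once `size` items are buffered.
class Batcher {
  constructor(size, flush) {
    this.size = size;
    this.flush = flush; // the expensive operation, called once per batch
    this.items = [];
  }

  add(item) {
    this.items.push(item);
    if (this.items.length >= this.size) this.drain();
  }

  // Send whatever is buffered, even if the batch is not full yet.
  drain() {
    if (this.items.length === 0) return;
    const batch = this.items;
    this.items = [];
    this.flush(batch);
  }
}

const sent = [];
const batcher = new Batcher(3, (batch) => sent.push(batch));

for (let i = 1; i <= 7; i++) batcher.add(i);
batcher.drain(); // don't forget the leftovers

console.log(sent);
```

Seven items turn into three calls here instead of seven. The cost is the delay before a buffered item is sent and the memory it occupies while it waits.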






