300x250 AD TOP

Search This Blog


Paling Dilihat

Powered by Blogger.

Sunday, June 24, 2012

Finding differences between files

Whenever you're working on a big project with multiple developers, its always important to keep yourself updated with the latest code changes, scrum meetings are great for that but don't provide the insight into someone's implementation, so whether its time for you to pull/update changes or push/commit changes, its always important to go over all changes that were made, by your coworkers or yourself, its a great way to remind you of everything you did and if you've forgot something or just see someone duplicating code you remember seeing someplace else.

Working with Tortoise versioning (SVN or GIT), I've come across two diff/merging programs, TortoiseMerge and KDIFF3, I'm sure there are plenty of tools, to each his own.

First, lets take a look at TortoiseMerge.

tortoisemerge with whitespaces and inline diff word-wise

The original view is full of dots, lines and arrows, you can't really see the forest for the trees, so lets start by removing the Whitespaces and Inline diff word-wise.

Good, now we can see that only the r changed. 

On with KDIFF3

Not much difference except its not showing the line as deleted on the original side but only that it changed and what changed.

I personally prefer KDIFF3, it looks cleaner to me and when doing merges I can see the original, their version and my version and it makes my life easier figuring out exactly what and how to merge. another plus in my eyes is its ability to compare whole directories.

Keyboard shortcuts for TortoiseMerge when merging are Ctrl-F8 for the next conflict and Ctrl-F9 and Ctrl-F10 for selecting either theirs or mine changes.

And for KDIFF3, Ctrl-PageDown for the next conflict and Ctrl-1,2,3 for selecting base, their or my changes.

To summaries, knowing what's been changed and reviewing your own changes is one of the most important things you need to do as a programmer. Use any diff tool you're comfortable with, be consistent with it and you'll see the quality of code you and the team produce go up because everyone is keeping tabs on everyone.


Tuesday, June 19, 2012

Internet Connection Sharing (ICS) Scripts

I've been trying to make my laptop an access point but with no success, Connectify and VirtualRouter either throw an error or act like they work but they don't.

So, I've decided to write a script, I've been taking pieces from here and there and here's what I've got.

At first I've tried to do everything with WMI, but it didn't work with VPN connections that are not for all users and come to think about it, WMI uses the system account, so it doesn't have access to my connections, so I'm executing route and ipconfig and parsing their output.

The biggest challenge was finding out which connection I should share, I wanted it to select the right interface automatically, so if I have an active VPN it will use that connection. The only assumption I had while coding these methods is that the last interface to go up was the one with the internet and its either a coincidence or by design, it was showing first in the route print command. so far its been working, but if it is a coincidence and not by design, I might need to replace it with a more consistent function, I couldn't find anything about how to find which connection was the active/internet/VPN one, but if you think about it, you just need to find the default route, and see which IP is on which interface and you're done.

Then, from getting the interface name, I could get the connection name via ipconfig /all or use the API from "HNetCfg.HNetShare.1" -> EnumEveryConnection -> NetConnectionProps -> Name/DeviceName, you can see an example from Microsoft in ShowConnections.vbs

And then all I needed to do was update the IP range (if it was changed), start the hostednetwork, set static ips on the virtual interface and start ICS.

netsh interface ip set address name="Connection 2" source=static addr= mask=
netsh interface ip set dns name="Connection 2" source=static addr=none register=PRIMARY
netsh interface ip set wins name="Connection 2" source=static addr=none

netsh wlan set hostednetwork mode=allow ssid=MyAccessPoint key=mypasskey
netsh wlan start hostednetwork

Starting ICS is a 2 step process, first you start the private network, then the public.

ics.vbs "Public Connection" "Private Connection" true

While working out some problems in the script, it has come to my attention that most of the things I was able to do with ipconfig is possible to do through the "HNetCfg.HNetShare.1" object, I guess if I'll have spare time, I'll change it, but its good enough for now. 

For stopping the virtual router, I needed to change back the virtual adapter's IP to DHCP, stop the hostednetwork and stop ICS.

netsh wlan set hostednetwork mode=disallow
netsh wlan stop hostednetwork

netsh interface ip set address name="Connection 2" source=dhcp
netsh interface ip set dns name="Connection 2" source=dhcp register=PRIMARY
netsh interface ip set wins name="Connection 2" source=dhcp

ics.vbs "Public Connection" "Private Connection" false

I've included a status, basically its executing

netsh wlan show hostednetwork

and the results are similar to


Hosted network settings
    Mode                   : Allowed
    SSID name              : "MyAccessPoint"
    Max number of clients  : 100
    Authentication         : WPA2-Personal
    Cipher                 : CCMP

Hosted network status
    Status                 : Started
    BSSID                  : xx:xx:xx:xx:xx:xx
    Radio type             : 802.11x
    Channel                : x
    Number of clients      : 1
        xx:xx:xx:xx:xx:xx        Authenticated

You can find the scripts here:

The scripts have been tested on Windows 7 only!

I would like to thank the following for posting their work and making my life easier:
Jerry Lees - for his Registry read/write
Richard Mueller - for leading me in the right direction
Big_Daddy - Whoever you are, thank you for this.

Also, interesting links that I came across during the search:
WIFI API - http://managedwifi.codeplex.com/
Microsoft - About the Wireless Hosted Network - http://msdn.microsoft.com/en-us/library/dd815243(v=vs.85).aspx
How to Change the IP Range for the Internet Connection Sharing DHCP service - http://support.microsoft.com/kb/230148


Thursday, June 14, 2012

The Pragmatic Programmer: From Journeyman to Master - Book Review

The Pragmatic Programmer is a must book for any developer wishing to take his profession to the next level.

The book shows you a way of thinking that will make your coding more robust, quicker, more stable and overall better quality.

The key concepts in the book are (affected by my own views):

- Learn and try new things, IDEs, Journals, magazines, books, blogs, tools, programming languages.
- Take responsibility for your code, communicate your problems, listen to the customer's desires, don't be afraid to be wrong, don't be too proud to accept criticism.
- Know what you're doing, don't guess and don't put out any fires, fix the underlying problems.
- Write good enough code, you can aim for perfect, but remember that as much as we want to, no one writes bug-free code, that doesn't say you shouldn't test and check your code.
- DRY (don't repeat yourself), be it code, documentation, etc'
- Decouple your modules.
- Write flexible code, for example, make it relatively easy to switch from mysql to postregsql.
- Learn the difference between Tracer bullets and prototypes and how and when to use each one.
- Learn to use mini-languages, scripting languages in your code, so if something changes often, is should be easy to maintain.
- Learn to estimate, data sizes, coding time, design, algorithm performance etc'.
- Learn to use tools of the trade, IDE, command line, scripting languages, code generators, pick a few and learn most if not all their features so you'll be more productive with your tools.
- Use source code control (SVN, VSS, GIT etc'), as long as you know what's been done in your code, it will be easier to solve and trace bugs, who did what for learning opportunities etc'.
- Learn to debug, visualize data, visualize program flow, throwing exceptions, error handling, know what's happening, not just what's causing the problem.
- Check resource usage.
- Design for concurrency, learn what threads are, how do they work, accessing local memory, shared memory, locks, collections, queues, what is a memory barrier, racing conditions, deadlocks, lock-free implementations.
- Understanding the O() notation.
- Refactoring
- Testing
- Better understanding of the design process, specifications, analysis.
- How to be a team of pragmatic programmers, no broken windows, no boiled frogs, DRY, good communication ,decoupling and automation.
- Automate everything, builds, tests, deployment, code generation.
- Testing - validation, verification, performance, test data.

The Pragmatic Programmer: From Journeyman to Master


Amazon AWS 101 Seminar

Yesterday the team and I had Amazon's AWS 101 Seminar, it was a whole day introduction to the services Amazon's cloud is offering.

The seminar was done by Jean-Pierre Legoaller while Tiago Henriques and Guillem Veiga were answering questions.

If you're planning to use Amazon's AWS and have an opportunity to go through their seminar, go for it, they told us about the services offered, some pricing, architecture and provided a nice workshop pdf with a detailed walk-through so we'd feel more comfortable with it.

I knew about AWS before the seminar but didn't really work with it in my daily routine and now I have a good common language with our IT.

Some of the basics covered were EC2, instance vs ebs, RDS, S3 and the command line tools and there was an introduction to VPC, CloudFormation, CloudFront, SES, SQS, ElastiCache and DynamoDB.

I'm happy to share with you some links we went over in the course:

AWS Presentations Channel on slideshare

AWS What's New 

AWS Blog

AWS Architecture Center - great resource for learning the possibilities with AWS!


Monday, June 11, 2012

SQL Query Engine Without Dynamic Queries

There are many ways to implement query engines, dynamic SQL, stored procedures with finite amount of IFs, dynamically creating stored procedures for each query  and a solution I'm writing here.

(I'm sure there are other solutions, I'll be glad to hear about them...)

Dynamic SQL is probably the fastest if you need to execute a one time query, but if multiple actions execute different queries the sql query engine will be very busy with building query plans and executing them, if you need more performance, sometimes the amount of indexes you'll need to create is significant and then the updates and deletes become very slow.

The problem with stored procedure is either your one stored procedures is going to be a maintenance nightmare or if you're creating stored procedures dynamically, its going to be a nightmare to write the code that maintains these stored procedures.

And then again, there's the indexes issue which must be addressed.

I've figured out another way of creating dynamic queries, its not very fast since its using cross joins, but its very quick to set up, so you can get started with low volume tables (< 100k records).

So lets start with our data structure:

1. The Entities table

  EntityId int  IDENTITY(1,1) NOT NULL,
  Name varchar(255) default NULL,
  Phone varchar(100) default NULL,
  Email varchar(255) default NULL,
  Address varchar(255) default NULL,
  City varchar(50) default NULL,
  Zip varchar(10) default NULL,
  State varchar(50) default NULL,
  Country varchar(50) default NULL,
  BirthDate varchar(50) default NULL

2. The Query table

    QueryId int NOT NULL IDENTITY (1, 1),
    Name varchar(100) NOT NULL

3. The Query Items table

create table QueryItems
    QueryItemId int not null identity (1,1),
    QueryId int,
    GroupId tinyint,
    AttributeCode tinyint,
    Value nvarchar(255)

The Entities table is pretty straight forward and so is the Queries table.

The QueryItems table has GroupId for queries with groups of items (each group is ANDed, each item inside a group is ORed), the AttributeCode is what this item is asking for, 0 for Country, 1 for Birthday Month and 2 for Email Contains queries, so all possible combinations of these attributes, groups and values will produce a correct answer.

And for the query engine part:

WITH cte AS 
   SELECT EntitiesRulesGroups.EntityId, EntitiesRulesGroups.QueryId, MatchingUserGroups.EntityId AS MatchingEntityId
   FROM (
    SELECT EntityId, QueryId, GroupId
    FROM Entities
     SELECT DISTINCT QueryId, GroupId
     FROM QueryItems
     ) AS QueryItemsGroups
    ) EntitiesRulesGroups
    SELECT EntityId, QueryId, GroupId, Sum(MatchBit) AS MatchBit
         SELECT EntityId, QueryItems.QueryId, QueryItems.GroupId, 
       /*Country*/            (CASE WHEN ((AttributeCode = 0) AND (Entities.Country = QueryItems.Value))                      THEN 1 ELSE 0 END) +
       /*Birthday Month*/   (CASE WHEN ((AttributeCode = 1) AND (datepart(MONTH, Entities.BirthDate) = QueryItems.Value)) THEN 1 ELSE 0 END) +
       /*email contains*/   (CASE WHEN ((AttributeCode = 2) AND (Entities.email like '%' + QueryItems.Value + '%'))          THEN 1 ELSE 0 END) 
            ) AS MatchBit
         FROM Entities
     CROSS JOIN QueryItems
    ) as Matching
   WHERE (MatchBit > 0)
   GROUP BY EntityId, QueryId, GroupId
   ) AS MatchingUserGroups
    ON MatchingUserGroups.EntityId = EntitiesRulesGroups.EntityId AND MatchingUserGroups.QueryId = EntitiesRulesGroups.QueryId AND MatchingUserGroups.GroupId = EntitiesRulesGroups.GroupId

select EntityId, QueryId
from Entities
cross join Queries
where not exists
    SELECT *
    FROM cte
    WHERE cte.EntityId = Entities.EntityId AND cte.QueryId = Queries.QueryId AND MatchingEntityId IS NULL

So what's happening here?

First, we're writing a CTE that will cross check all entities against all query items, so we're creating a big table that tells us if an entity is valid for a specific query item.

In the actual query, we're generating a cross on all entities and all queries and then we're asking the CTE if there is not match in the list.

So why aren't we checking if there is a record instead of there is no nulls?

Well, if there is a record will only tell us if there is a match between a query item and an entity, but there is no way to detect if there are no matches for the ANDed groups except asking if there was a failure to compare a group and an entity.

Of curse its a slow query, its a cross join on multiple tables and there is no way for the query optimizer to do anything, the amount of comparison that is done its almost as comparing each item in code rather than in SQL, but the advantage of it is that everything is done inside the SQL without cursors so data access is actually faster.

There are ways to improve this engine, I'm showing here only the basics for better understanding:

1. Caching the results and only update the cache on changes to either a specific entity against all queries or for a specific query against all entities, then querying the cache is very fast.
If the cache ever gets out of sync, its relatively quick to do a rebuild of the results. You should read up on SQL Merge.
2. Building a pre-calculated temp tables for the queries and the entities so a cast or convert is done only once for an entity <> query item pair.
3. Use with (nolock) where applicable.

I've tested this method with 500k entities (and their own entities - up to 10 sub entities) and 1k queries, the performance was about 500ms for a query to all entities and entity to all queries and a total cache rebuild was about 5 minutes, your results may vary based on too many things to count.

Tags: , ,

Wednesday, June 6, 2012

Unused/Duplicate Indexes in SQL Server

One of the most common tasks when maintaining a database still under development is cleanup. 

Cleaning up unused/duplicate indexes is important, say you created a great index in the past for a particularly slow query, two months after that, there was a redesign for that section of the code making your old index obsolete. 

Unused/duplicate indexes waste space, more so, they slow down updates and can even slow down queries if the query engine chooses a very bad index as its datasource, we don't want that in our pristine database.

So now what? in a large database going after every table, every query and every index is sometimes impractical and narrowing the number of indexes you need to verify could be the only way to go.

There are two almost exact ways of doing it, one is with Microsoft SQL Server Management Studio and the other is via query.

In the Management Studio, go to Object Explorer, right click the desired database, Reports, Standard Reports, Index Usage Statistics.

You'll get something similar to this:

So what are all these numbers? What should you look at?

The most important information to you are the number of seeks, scans and updates. The last user seek, scan, lookup times.

The number of user Seeks, Scans and Updates are important to know which index is most/least active, but what if the most active index is also a one deprecated a month ago and nobody noticed it? that's why the dates are there.

Note that Index Usage Statistics only shows indexes that were actually touched by the system, either by updates or by queries.

To find really dead indexes, you should also take a look at Index Operational Statistics, look for indexes without any Range Scans or Singleton Lookups.

In the table below I've added an index but didn't bother to query by it, so the last time time that index was touched was on creation, the table was only touched by queries not requiring it so it doesn't show up in the Index Usage Statistics.

Another very important thing to notice are indexes with excessive scans and no seeks, indexes are there to reduce the load on the database, its cool that the database can use them also by their includes to shrink down the subset of the data to work with, but a lot of the time its an indication there's a rogue query on the loose, catch it before it becomes an issue.

In my opinion the information piece missing most from this report is the size of the index, as much as its bad practice to keep an unused index, for example, if its 1k in size and it doesn't slow down any updates its a waste of your time and resources to even look at it (unless its your task to clean up and not to enhance performance).

Use common sense, if the index is on a table you don't know what it does, check with your team mates who last handled that table before you make ANY decisions, could be a report table for the bookkeeping department, executed once a year and when they need it all of the sudden they are doing over night because the reports didn't came out in time... measure twice, cut once.

Martin Thwaites has a nice article in CodeProject, he wrote a query (I Just added the last write time) that shows you all this info and the size of the index, thanks Martin!

    sch.name + '.' + t.name AS [Table Name], 
    i.name AS [Index Name], 
    ISNULL(user_updates,0) AS [Total Writes], 
    ISNULL(user_seeks + user_scans + user_lookups,0) AS [Total Reads],
    s.last_user_scan ,
    ISNULL(user_updates,0) - ISNULL((user_seeks + user_scans + user_lookups),0) AS [Difference],  
    p.reserved_page_count * 8.0 / 1024 as SpaceInMB
FROM sys.indexes AS i WITH (NOLOCK)  
    LEFT OUTER JOIN sys.dm_db_index_usage_stats AS s    WITH (NOLOCK) ON s.object_id = i.object_id  AND i.index_id = s.index_id  AND s.database_id=db_id()  AND objectproperty(s.object_id,'IsUserTable') = 1  
    INNER JOIN        sys.tables                    AS t    WITH (NOLOCK) ON i.object_id = t.object_id  
    INNER JOIN        sys.schemas                    AS sch    WITH (NOLOCK) ON t.schema_id = sch.schema_id  
    LEFT OUTER JOIN sys.dm_db_partition_stats    AS p    WITH (NOLOCK) ON i.index_id = p.index_id and i.object_id = p.object_id
WHERE (1=1)
    --AND ISNULL(user_updates,0) >= ISNULL((user_seeks + user_scans + user_lookups),0) --shows all indexes including those that have not been used  
    --AND ISNULL(user_updates,0) - ISNULL((user_seeks + user_scans + user_lookups),0)>0 --only shows those indexes which have been used  
    --AND i.index_id > 1            -- Only non-first indexes (I.E. non-primary key)
    --AND i.is_primary_key<>1        -- Only those that are not defined as a Primary Key)
    --AND i.is_unique_constraint<>1 -- Only those that are not classed as "UniqueConstraints".  
ORDER BY [Table Name], [index name]

Tags: ,

Tuesday, June 5, 2012


So you want to use XML in your application?

What should you use? there are XmlDocument, XDocument, XmlReader? XmlWriter? Help!

Lets make some order in this mess. 

What do you need to do? 

Are your XML files very big? use XmlReader/XmlWriter, but you'll have to write more code than the other options, think of it as reading a text file with a little help.

Are you used to XPath and pressed for time? you can use XmlDocument, it might make your life easy, just remember its impossible to serialize XmlDocuments.

Do you like LINQ? use XDocument/XElements, they are also serializable so its a good option if you're using WCF and need to pass XML elements.

Personally I prefer XDocument/XElement, I can serialize them when I need to, I'm used to LINQ queries so its very easy for me to query XML documents and it supports explicit casting so I don't need to deal with the whole primitive to string and back conversions.

You can find the demo project at:

And you can't really look at these without doing some benchmarking, you may see it in the project code, but I'll just give you the highlights, XmlReader/XmlWriter are the fastest, then XDocument and last is XmlDocument.

I've included some extensions which helped me in the past with XML documents containing enums and byte arrays (or binary data):

/// <summary>
/// Converts XAttribute to enum
/// </summary>
/// <typeparam name="T">Enum type</typeparam>
/// <param name="xattribute">XAttribute object</param>
/// <param name="defaultValue">return this value if conversion fails</param>
/// <returns>defaultValur or converted enum value</returns>
public static T ToEnum<T>(this XAttribute xattribute, Enum defaultValue) where T : struct, IConvertible
    if (!typeof(T).IsEnum)
        throw new ArgumentException("T must be an enum");

    return (T)((xattribute == null || (string.IsNullOrEmpty(xattribute.Value))) ? defaultValue : Enum.Parse(typeof(T), xattribute.Value));

/// <summary>
/// Converts XElement to enum
/// </summary>
/// <typeparam name="T">Enum type</typeparam>
/// <param name="xelement">XElement object</param>
/// <param name="defaultValue">return this value if conversion fails</param>
/// <returns>defaultValur or converted enum value</returns>
public static T ToEnum<T>(this XElement xelement, Enum defaultValue) where T : struct, IConvertible
    if (!typeof(T).IsEnum)
        throw new ArgumentException("T must be an enum");

    return (T)((xelement == null || (string.IsNullOrEmpty(xelement.Value)))  ? defaultValue : Enum.Parse(typeof(T), xelement.Value));

/// <summary>
/// Converts XAttibute's value to byte array from base64 encoded string
/// </summary>
/// <param name="xattribute">XAttribute object</param>
/// <returns>byte array if successful, null if not</returns>
public static byte[] ToByteArray(this XAttribute xattribute)
    if (string.IsNullOrEmpty(xattribute.Value))
        return null;

    return Convert.FromBase64String(xattribute.Value);

/// <summary>
/// Converts XElement's value to byte array from base64 encoded string
/// </summary>
/// <param name="xelement">XElement object</param>
/// <returns>byte array if successful, null if not</returns>
public static byte[] ToByteArray(this XElement xelement)
    if (string.IsNullOrEmpty(xelement.Value))
        return null;

    return Convert.FromBase64String(xelement.Value);


Sunday, June 3, 2012


Since RedGate started charging for .NET Reflector I've been looking for an alternative, I use it mostly to understand how a specific class is implemented, one example is the ConcurrentDictionary I've checked a few weeks ago but I don't use it in my daily tasks so it doesn't really justify purchasing it even if its "only" $70-$190.

Pretty quick I've found ILSpy, I know there are many features it doesn't have over reflector, but its enough for me. Give it a try if you're also looking for an alternative.


Friday, June 1, 2012

Explicit Operators

So, you want to be able to cast your class into other classes, but you need to be able to control the casting process.

Good, we have just what the doctor ordered, explicit operators.

I did a quick demo project to explain how it works, I have a CSV file with names, numbers, dates and every time I order the records by a different column, which is a different type: string, int, DateTime.

parser.Rows.OrderBy(i => (string)i.Cell[0])

parser.Rows.OrderBy(i => (int?)i.Cell[1])

parser.Rows.OrderBy(i => (DateTime?)i.Cell[2])

And the class:

        /// <summary>
        /// Parsed Cell
        /// </summary>
        public class Cell
            /// <summary>
            /// Column Number of cell
            /// </summary>
            public int ColumnNumber;

            /// <summary>
            /// Text value of cell
            /// </summary>
            public string Value;

            /// <summary>
            /// Explicit cast to int
            /// </summary>
            /// <param name="cell"></param>
            /// <returns></returns>
            public static explicit operator int(Cell cell)
                return Convert.ToInt32(cell.Value);
            /// <summary>
            /// Explicit cast to int?
            /// </summary>
            /// <param name="cell"></param>
            /// <returns></returns>
            public static explicit operator int?(Cell cell)
                int retval;
                if (int.TryParse(cell.Value, out retval))
                    return retval;

                return null;

            /// <summary>
            /// Explicit cast to string
            /// </summary>
            /// <param name="cell"></param>
            /// <returns></returns>
            public static explicit operator string(Cell cell)
                return cell.Value;

            /// <summary>
            /// Explicit cast to DateTime
            /// </summary>
            /// <param name="cell"></param>
            /// <returns></returns>
            public static explicit operator DateTime(Cell cell)
                return DateTime.Parse(cell.Value);

            /// <summary>
            /// Explicit cast to DateTime?
            /// </summary>
            /// <param name="cell"></param>
            /// <returns></returns>
            public static explicit operator DateTime?(Cell cell)
                DateTime retval;
                if (DateTime.TryParse(cell.Value, out retval))
                    return retval;

                return null;

And for the demo project:

You can generate your own test data with this nice website:


Quick Open File for Visual Studio

Have you ever worked on a big solution? sometimes with many projects?

Going back and forth in the solution explorer could waste a lot of your time, if you're a bit tired just before getting a version out its even worse, not to mention getting you out of focus.

I'm a big fan of Quick Open File, I think you will too when you try it.

Just remember Ctrl-K O and type the file you're looking for.


Update 2012-08-22: You can always use the Navigate To, or Ctrl+, key combination:

I've discovered this while looking for a similar solution in VS2012, its showing too much information, but its better than digging through the solution files and you can always type the extension.