6 more things C# developers should (not) do

If you are wondering what a C# developer should and should not do, this post is for you. As a continuation of my previous post, 8 Most common mistakes C# developers make, I decided to write another article about the things C# developers should always be aware of.

1. Try to avoid using the “ref” and “out” keywords.

You rarely really have to use the “ref” or “out” keywords. Avoid them as much as possible, because using them often means that your method tries to do too much (and therefore breaks the Single Responsibility Principle). They can also make your method less readable. To avoid ref/out it is usually enough to create a class that represents the multiple return values and to return an instance of this class as the result of the method. There are obviously situations where ref/out cannot be avoided, such as when calling a TryParse method, but these should be treated as exceptions to the rule.
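As a minimal sketch of the idea, the snippet below shows the same operation written once with “out” parameters and once with a small result class. The names DivisionResult, Calculator, DivideWithOut and Divide are hypothetical and exist only for this illustration.

```csharp
using System;

// Hypothetical result type that bundles the two values which would
// otherwise be handed back through "out" parameters.
public sealed class DivisionResult
{
    public DivisionResult(int quotient, int remainder)
    {
        Quotient = quotient;
        Remainder = remainder;
    }

    public int Quotient { get; }
    public int Remainder { get; }
}

public static class Calculator
{
    // Variant with "out": the caller has to declare two variables up front.
    public static void DivideWithOut(int dividend, int divisor, out int quotient, out int remainder)
    {
        quotient = dividend / divisor;
        remainder = dividend % divisor;
    }

    // Variant returning one object that carries both values.
    public static DivisionResult Divide(int dividend, int divisor)
    {
        return new DivisionResult(dividend / divisor, dividend % divisor);
    }
}
```

With the second variant the call site reads as `var result = Calculator.Divide(17, 5);` and the values are available as `result.Quotient` and `result.Remainder`.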

2. Do not use OrderBy before Where

The above statement might sound very obvious, but some developers still tend to forget about it. Let’s take a look at the following code:

var allCitiesInPoland =
    allCitiesInTheWorld.OrderBy(x => x.Name)
                       .Where(x => x.Country == Country.Poland);

In this hypothetical scenario the collection of all cities in the world (according to some statistics: 2,469,501) is sorted first, and only then are the cities from Poland (897) selected. The result is correct; the only problem is that it is inefficient. Compare it with the following code:

var allCitiesInPoland =
    allCitiesInTheWorld.Where(x => x.Country == Country.Poland)
                       .OrderBy(x => x.Name);

Here we first select only the cities from Poland and then sort this significantly smaller collection alphabetically by name.
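To make the difference visible on a small scale, here is a rough sketch that counts how often the sort key is read in each variant. The sample data and the names OrderingDemo, SortThenFilter and FilterThenSort are assumptions made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderingDemo
{
    // Hypothetical sample data: (city name, country) pairs.
    private static readonly (string Name, string Country)[] Cities =
    {
        ("Warsaw", "Poland"), ("Berlin", "Germany"), ("Krakow", "Poland"),
        ("Paris", "France"), ("Madrid", "Spain"), ("Rome", "Italy"),
    };

    // OrderBy first: the key selector runs for every city before filtering.
    public static (List<string> Names, int KeyReads) SortThenFilter()
    {
        int keyReads = 0;
        var names = Cities
            .OrderBy(c => { keyReads++; return c.Name; })
            .Where(c => c.Country == "Poland")
            .Select(c => c.Name)
            .ToList();
        return (names, keyReads);
    }

    // Where first: only the cities that passed the filter are sorted.
    public static (List<string> Names, int KeyReads) FilterThenSort()
    {
        int keyReads = 0;
        var names = Cities
            .Where(c => c.Country == "Poland")
            .OrderBy(c => { keyReads++; return c.Name; })
            .Select(c => c.Name)
            .ToList();
        return (names, keyReads);
    }
}
```

Both methods return the same names in the same order, but the first one evaluates the sort key for every city in the collection, while the second one only does so for the cities that passed the filter.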

3. Always use Properties instead of public variables

With getters and setters you can prevent callers from accessing the member variables directly. You can also explicitly restrict who may set the values (for example with a private setter), which protects your data from accidental changes. What is more, properties make it much easier to validate your data.
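A short sketch of both benefits, assuming a hypothetical Person class: Name can only be set from inside the class, and the Age setter rejects invalid values.

```csharp
using System;

public class Person
{
    private int age;

    public Person(string name, int age)
    {
        Name = name;
        Age = age; // goes through the validating setter below
    }

    // Readable by everyone, but only the class itself can assign it.
    public string Name { get; private set; }

    // The setter validates the value before storing it.
    public int Age
    {
        get { return age; }
        set
        {
            if (value < 0)
            {
                throw new ArgumentOutOfRangeException(nameof(value), "Age cannot be negative.");
            }
            age = value;
        }
    }
}
```

With a public field instead of the Age property, nothing would stop a caller from writing `person.age = -5;`.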

4. Take advantage of string.IsNullOrEmpty() and string.IsNullOrWhiteSpace()

Instead of:

if (name != null && name != string.Empty)
{
    [...]
}

It’s better to use:

if (!string.IsNullOrEmpty(name))
{
    [...]
}

Sometimes we want to make sure that a string is not only non-null and non-empty, but also does not consist solely of whitespace. In such a situation we could of course use the following code:

if (!string.IsNullOrEmpty(name.Trim())) // caution: throws NullReferenceException if name is null
{
    [...]
}

But why not use something that is already provided by the .NET Framework (version 4.0 and higher) and that also handles null:

if (!string.IsNullOrWhiteSpace(name))
{
    [...]
}

The example above can be treated as a more general principle. Always check whether something you are about to implement already exists as a ready-made method in the framework you use. Make sure that you have gone through all the methods of .NET classes such as String or System.IO.Path. This can save you time that you would otherwise spend reinventing the wheel.
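As a small illustration of leaning on the framework, the sketch below uses string.IsNullOrWhiteSpace and Path.GetExtension instead of hand-written checks and substring logic. The class FrameworkHelpers and its two methods are hypothetical names chosen for this example.

```csharp
using System;
using System.IO;

public static class FrameworkHelpers
{
    // IsNullOrWhiteSpace covers null, "" and whitespace-only input in one call,
    // so Trim() is only reached when name is known to be non-null.
    public static string Greet(string name)
    {
        if (string.IsNullOrWhiteSpace(name))
        {
            return "Hello, stranger!";
        }
        return "Hello, " + name.Trim() + "!";
    }

    // Path already knows how to find the extension of a file name,
    // so no manual LastIndexOf/Substring code is needed.
    public static string ExtensionOf(string fileName)
    {
        return Path.GetExtension(fileName);
    }
}
```

For example, `FrameworkHelpers.Greet(null)` and `FrameworkHelpers.Greet("   ")` both fall back to the default greeting, and `FrameworkHelpers.ExtensionOf("report.final.pdf")` returns ".pdf".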

5. Understand the difference between First() and Single()

Always remember that First() returns the very first item of the sequence and throws InvalidOperationException if the sequence is empty, whereas Single() returns the only item in the sequence and throws InvalidOperationException both when the sequence is empty and when it contains more than one item.
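The behaviour described above can be checked with a small sketch like the following. The helper class FirstVersusSingle and its methods are hypothetical; they simply report either the returned item or the name of the exception that was thrown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstVersusSingle
{
    // Runs the query and reports its result, or the exception type if it threw.
    private static string Try(Func<int> query)
    {
        try
        {
            return query().ToString();
        }
        catch (InvalidOperationException)
        {
            return "InvalidOperationException";
        }
    }

    public static string FirstOf(IEnumerable<int> items)
    {
        return Try(() => items.First());
    }

    public static string SingleOf(IEnumerable<int> items)
    {
        return Try(() => items.Single());
    }
}
```

For a sequence such as { 1, 2, 3 }, FirstOf reports "1" while SingleOf reports the exception, because the sequence has more than one item. For an empty sequence both report the exception, and for a one-item sequence both return that item.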

6. Do not blindly use List

There are cases where using a List is simply not recommended from a performance point of view. You should be aware of the existence of e.g. HashSet or SortedSet, which can considerably improve performance, especially for large sets of items. To show how big the difference in performance between HashSet and List can be, let’s consider the following piece of code:

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;

    class Program
    {
        static void Main(string[] args)
        {
            const int COUNT = 100000;

            // Fill a HashSet with COUNT integers.
            HashSet<int> hashSetOfInts = new HashSet<int>();
            Stopwatch stopWatch = new Stopwatch();
            for (int i = 0; i < COUNT; i++)
            {
                hashSetOfInts.Add(i);
            }

            // Time COUNT lookups in the HashSet.
            stopWatch.Start();
            for (int i = 0; i < COUNT; i++)
            {
                hashSetOfInts.Contains(i);
            }
            stopWatch.Stop();

            Console.WriteLine(stopWatch.Elapsed);

            stopWatch.Reset();

            // Fill a List with the same integers.
            List<int> listOfInts = new List<int>();
            for (int i = 0; i < COUNT; i++)
            {
                listOfInts.Add(i);
            }

            // Time COUNT lookups in the List (each Contains is a linear scan).
            stopWatch.Start();
            for (int i = 0; i < COUNT; i++)
            {
                listOfInts.Contains(i);
            }
            stopWatch.Stop();

            Console.WriteLine(stopWatch.Elapsed);
            Console.Read();
        }
    }

After executing it you will see the difference:
100000 ‘Contains’ operations using HashSet: 0.002 sec
100000 ‘Contains’ operations using List: 28.74 sec

Of course HashSet and SortedSet are not a silver bullet for every possible scenario. HashSet should be used when you care about lookup performance (especially if you know that you will operate on a large set of items) but do not care about the order. SortedSet is a bit slower than HashSet, but it gives you a sorted collection and its lookups are still much faster than those of a List.
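Apart from speed, the three collections also differ in what they store. The following sketch, built on a hypothetical CollectionChoice class and made-up input, loads the same numbers into each of them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CollectionChoice
{
    // Hypothetical input containing duplicates in no particular order.
    private static readonly int[] Input = { 5, 3, 9, 3, 1, 5 };

    // List keeps every item, including duplicates, in insertion order.
    public static List<int> AsList()
    {
        return new List<int>(Input);
    }

    // HashSet drops duplicates; the enumeration order is not guaranteed.
    public static HashSet<int> AsHashSet()
    {
        return new HashSet<int>(Input);
    }

    // SortedSet drops duplicates and enumerates in ascending order.
    public static SortedSet<int> AsSortedSet()
    {
        return new SortedSet<int>(Input);
    }
}
```

Here the List holds all six items, both sets hold the four distinct values, and only the SortedSet guarantees that they come back as 1, 3, 5, 9.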

Use List when you want to iterate through the collection. Iterating through all the items in a List is generally much faster than iterating through a set (unless you call methods such as Contains inside the loop).

If you know any other thing that C# developers should or should not do, feel free to share it with us and leave a comment below!

P.S. Be aware that the guidelines presented in this post (and in the previous one) will not suddenly make you a great C# programmer. What makes someone a good developer is everyday compliance with the basic principles of creating high-quality code, which I described in detail in the post Top 9 qualities of clean code.


19 comments

  1. Nice article but I would not say that enumeration of all items in a List is much faster than in a SortedSet. As far as I know SortedSet is implemented using red-black trees so it can be traversed in O(n) time like all other types of trees. Exactly the same complexity applies to iterating a List which should be iterated faster but probably no more than just a few times regardless of the number of items. Something to investigate I think 🙂

    1. Definitely Enumeration of list is faster than SortedSet.

      class Program
      {
          static void Main(string[] args)
          {
              const int COUNT = 100000;
              SortedSet<int> hashSetOfInts = new SortedSet<int>();
              Stopwatch stopWatch = new Stopwatch();
              for (int i = 0; i < COUNT; i++)
              {
                  hashSetOfInts.Add(i);
              }

              stopWatch.Start();
              foreach (int r in hashSetOfInts)
              {
                  int p = r;
              }
              stopWatch.Stop();

              Console.WriteLine(stopWatch.Elapsed);

              stopWatch.Reset();
              List<int> listOfInts = new List<int>();
              for (int i = 0; i < COUNT; i++)
              {
                  listOfInts.Add(i);
              }

              stopWatch.Start();
              for (int i = 0; i < COUNT; i++)
              {
                  int p = listOfInts[i];
              }
              stopWatch.Stop();

              Console.WriteLine(stopWatch.Elapsed);
              Console.Read();
          }
      }

  2. string.IsNullOrEmpty(name.Trim())

    if name is null, that piece of code throws ArgumentNullException

    1. .Trim() is throwing the NullReferenceException. string.IsNullOrEmpty() is not even being called.

      1. I use a collection of extension methods where I have a public static bool IsNullOrEmpty(this string)
        that is ArgumentNullException-safe, as well as an extension method called TrimExtended that is also ArgumentNullException-safe

    2. Why would you call name.Trim() when you are not sure whether name is null?
      How about this? if(!string.IsNullOrEmpty(name)) name.Trim()

      OR
      You can use null checker:
      string.IsNullOrEmpty(name?.Trim())

  3. im surprised no one commented on #1
    if anything this is the one that struck me as odd and opinion

    please don’t assume those are bad keywords
    the only reason to say that is to not understand how they work.
    here is a detailed article on how out and ref works ,
    http://yoda.arachsys.com/csharp/parameters.html

    for new programmers yes i wouldn’t advise just using ref , Out is fairly safe even for new people though and pretty unique to c#

    out has its use’s in fact TryParse solves problems that otherwise would make you jump thru hoops and curse the gods of c#

    ref is useful as well and you are using ref all over the place anyways even in c#
    myclass a; well a that’s a ref-erence

  4. these articles are really helpful, yet it brought a doubt to my mind…
    why are there so many types of collections? lists, hashsets, sortedset, arraylists, etc etc.
    is there any place you can recommend me so i can learn where and when to use each?

    been using list(kinda exclusively) until now, because it’s pretty confortable to use, but i might’ve been making my programs really sluggish because of that

    thank you

  5. Regarding #1: It seems to me that if you’re using a new class to represent multiple outputs, you’re dealing with the readability problem, but not with the multiple responsibilities problem. That being said, I think multiple output parameters – or their equivalent in the form of a result object – can be appropriate in contexts where there are strong performance considerations which would make a single-responsibility-per-function approach slow, for example when a single iteration through an enumerable needs to handle multiple tasks piece by piece without starting over.

  6. What’s wrong with ref and out ?
    When passing a massive datatable via a function, you can bypass a very long memory copy event.

    And sometimes people want to change the original object, not a copy of it, in different functions.

    1. Reference types, by default, are passed as pointers. The memory copy would usually only occur with a value type. If your datatable is a struct, you are doing it wrong.

  7. Regarding point 6, would you go so far to say that, if you don’t need an order and all your elements are unique, that choosing a set instead of a list is a form of information/implementation hiding?

    For example if you have a public method part of an API that returns a list, is the order of that list a part of the general contract yes or no (even if there is no logical order).

    If the answer is yes, isn’t it better to always use a set instead of list when order is not important and the collection contains only unique elements?

    I mean, the worst thing that could happen is people misusing the API by thinking there is a guaranteed order which can result in breaking multiple applications/frameworks when you change the order when implementing a faster/better algorithm.

  8. Your html encoding is all messed up. It sure is hard to read the code that has &amp; and &lt; all over the place. I recommend you fix it!

  9. For point #2 I generally agree, but here is a counter example:

    If the compare function is fast (and hence sorting is fast) and the where function is very costly, and considering IEnumerables are lazy, then doing OrderBy before the Where may be faster. E.g. if you did
    people.OrderBy(p => p.Age).Where(p => simulateTheUniverseWithout(p).Improvement > 0).First(p => p.Name.First() != 'Z')
    Then, assuming that simulating the universe is expensive, that most people don’t have names that start with Z, and that most people are bad for the universe, this will be faster than doing the Where before the OrderBy.
