While working with (young) C# programmers I’ve noticed that some mistakes are being repeated by almost every one of them. These are mostly the mistakes, which once you point them, are quite easy to remember. However, if a developer is not aware of them, they can cause many problems with the efficiency and quality of the developed software. Therefore, I decided to gather the 8 most common mistakes made.
1. String concatenation instead of StringBuilder
String concatenation works in such a way that every time when you add something to a string, a new address in the memory is being allocated. The previous string is copied to a new location with the newly added part. This is inefficient. On the other hand we have StringBuilder which keeps the same position in the memory without performing the copy operation. Thanks to the strings’ appending by means of StringBuilder the process is much more efficient, especially in case of hundreds of append operations.
//INCORRECT List values = new List(){"This ","is ","Sparta ","!"}; string outputValue = string.Empty; foreach (var value in values) { outputValue += value; }
//CORRECT StringBuilder outputValueBuilder = new StringBuilder(); foreach (var value in values) { outputValueBuilder.Append(value); }
2. LINQ – ‘Where’ with ‘First’ instead of FirstOrDefault
A lot of programmers find a certain set of elements by means of ‘Where’ and then return the first occurrence. This is inappropriate, because the ‘First’ method can be also applied with the ‘Where’ condition. What’s more, it shouldn’t be taken for granted that the value will always be found. If “First” is used when no value is found, an exception will be thrown. Thus, it’s better to use FirstOrDefault instead. When using FirstOrDefault, if no value has been found, the default value for this type will be returned and no exception will be thrown.
//INCORRECT List numbers = new List(){1,4,5,9,11,15,20,21,25,34,55}; return numbers.Where(x => Fibonacci.IsInFibonacciSequence(x)).First();
//PARTLY CORRECT return numbers.First(x => Fibonacci.IsInFibonacciSequence(x));
//CORRECT return numbers.FirstOrDefault(x => Fibonacci.IsInFibonacciSequence(x));
3. Casting by means of ‘(T)’ instead of ‘as (T)’ when possibly not castable
It’s common that software developers use simple ‘(T)’ casting, instead of ‘as (T)’. And usually it doesn’t have any negative influence because casted objects are always castable. Yet, if there is even a very slight probability that an object can be under some circumstances not castable, „as (T)” casting should be used. See Difference Between C# Cast Syntax and the AS Operators for more details.
//INCORRECT var woman = (Woman)person;
//CORRECT var woman = person as Woman;
4. Not using mapping for rewriting properties
There are a lot of ready and very powerful C# mappers (e.g. AutoMapper). If a few lines of code are simply connected with rewriting properties, it’s definitely a place for a mapper. Even if some properties aren’t directly copied but some additional logic is performed, using a mapper is still a good choice (mappers enable defining the rules of rewriting properties to a big extend).
5. Incorrect exceptions re-throwing
C# programmers usually forget that when they throw an exception using „throw ex” they loose the stack trace. It is then considerably harder to debug an application and to achieve appropriate log messages. When simply using „throw” no data is lost and the whole exception together with the stack trace can be easily retrieved.
//INCORRECT try { //some code that can throw exception [...] } catch (Exception ex) { //some exception logic [...] throw ex; }
//CORRECT try { //some code that can throw exception [...] } catch (Exception ex) { //some exception logic [...] throw; }
6. Not using ‘using’ for objects disposal
Many C# software developers don’t even know that ‘using’ keyword is not only used as a directive for adding namespaces, but also for disposing objects. If you know that a certain object should be disposed after performing some operations, always use the ‘using’ statement to make sure that the object will actually be disposed.
//the below code: using(SomeDisposableClass someDisposableObject = new SomeDisposableClass()) { someDisposableObject.DoTheJob(); } //does the same as: SomeDisposableClass someDisposableObject = new SomeDisposableClass(); try { someDisposableObject.DoTheJob(); } finally { someDisposableObject.Dispose(); }
7. Using ‘foreach’ instead of ‘for’ for anything else than collections
Remember that if you want to iterate through anything that is not a collection (so through e.g. an array), using the ‘for’ loop is much more efficient than using the ‘foreach’ loop. See Foreach vs For Performance for more details.
8. Retrieving or saving data to DB in more than 1 call
This is a very common mistake, especially among junior developers and especially when using ORMs like Entity Framework or NHibernate. Every DB call consumes some amount of time and therefore it’s crucial to decrease the amount of DB calls as much as possible. There are many ways to do so:
- Using fetching (Eager Loading)
- Enclosing DB operations in transactions
- In case of a really complex logic, just moving it to the DB by building a stored procedure
It goes without saying that there are hundreds of other types of mistakes made by C# programmers. If you know any interesting ones or you want to share your opinion about the subject, feel free to leave a comment below!
If you would like to read more about common mistakes of C# developers read this post.
As for the first one – it’s not always better to use StringBuilder for string concatentation. It is better if you concat a lot, but using StringBuilder just to replace
var fullName = firstName + ” ” + lastName;
is not more efficient.
Use StringBuilder if you have a really large number of strings to concat, or number of concatentations in unknown at compile time. For very short concats (like the one above), string concatenation is faster and better (StringBuilder is an object that needs to be instantiated – remember that).
For long concatenations, where you exactly know what you need to concat (like in your example), it’s better to use string.Concat() method (or string.Join() if you need a delimiter between concatenated strings).
Just my 3 cents 😉
As a matter of fact the “string” is also a class, therefore it needs to be instantiated. Neverthless you are right. Using a StringBuilder for concatenation first name and last name is like using a hammer for sticking a pin. StringBuilder is not the golden hammer (http://sourcemaking.com/antipatterns/golden-hammer) but it’s good to know about it.
P.S. For easy concatenation is worth to think about string.Format(“{0} {1}”, firstName, lastName);
I didn’t say string is not a class – just pointed out that StringBuilder also needs to be instantiated. If you concatenate only two strings, single new string is created, so number of newly created objects in both solutions is 1 – no need to use StringBuilder.
I guess the main problem you wanted to point out in #1 is the fact that those people don’t know that C# string is immutable. If they knew that, they’d look for alternatives and would find StringBuilder and/or various string methods (at least I hope they would).
humm most of these points are matters of taste (in or for a) context
so i dont believe this was a well put together post but live and learn… i.e. just because you understand something doesn’t mean you can verbally express your thoughts well , or even teach
i want to point out something everyone seems to be dancing around
about the stringbuilder class and that is its context of use
the best reason to use stringbuilder is because you dont want rampant garbage collection , you know strings are immutable … so you know what that means for … garbage collection right…?
example
we have 2 contexts
1)we are building a string just once or very infrequently
2) we are building/rebuilding a string in say a update situation or simulation looping situation
in the case of 1 it really matters little so readabilty is the best choice
in the case of 2 we have a serious matter of GC consideration
which takes priority over readability in this case stringbuilder should be used as it can be used to avoid rampant Garbage Collection
this is one place were regular garbage collection can be a problem in this context but in other contexts it is negligible
this
changes the nature of the mistake simply to newer programmers have not been taught or may not understand context were they should be using string builder.
I see there are a few issues with the original list! lol However, don’t get discouraged, this is an awesome discussion for someone recently picking up C# from years of Java – like me!
I also prefer to use strong.Format() for simple text formatting. I feel it separates the string from the variables nicely and I don’t loose any sleep over the possible microseconds I could be loosing performance-wise.
I fully agree 🙂
Unless MS has changed the implementation of string.Format in .NET 4.5, it uses StringBuilder internally, so string.Format is actually just “syntactic sugar” for using StringBuilder.
1. Could be replaced with one line
outputValue = string.Join(” “, values);
It’s also better since you wouldn’t have to remember to leave a trailing space after each item in values.
Thanks for reading my post and sharing your thoughts. You are fully right that there are other alternatives to a simple string concatenation as well (like string.join or string.format). Comparing all these alternatives is a subject for a separate post (and many people already wrote such posts) and its usually very subjective to a certain context. What I wanted to achieve when creating this point of my post was to simply make developers aware that they should not blindly use ‘+’ operators because it’s not applicable in case when there are lots of append operations. Differences in performance between such alternatives of StringBuilder like string.join or string format (which is indeed based inside on StringBuilder) are very tiny and usually depending on the context and it was not the goal which I wanted to achieve when writing down this point.
IMHO you should change the title from ‘8 Most common mistakes C# developers make’ to ‘8 things which I think some developers do wrong’. Saying that everything you listed are mistakes which young (assumed inexperienced) developers makes it look like that whoever makes any of such ‘mistakes’ is young and inexperienced developer. In other words I am sure that you want to help developers with your article but to me it came of a bit arrogant and condescending. My thoughts on your points:
1. Already answered by other comments, you can use String.Join to make it more readable (and probably just as efficient)
2. If you are not aware of FirstOrDefault then Where/First is just as good. IMHO not knowing that there is a method in the framework is not a mistake. There are hundreds/thousands of nice methods inside .NET framework for which any developer could say ‘I wish I knew that before’.
3. I agree that every developer should know the difference between ‘(T)x’ and ‘x as T’
4. I guess that by ‘rewriting’ you mean copying values from/to properties. This is very context related ‘mistake’ – if you do not use DTOs then there is not that many places where tools like AutoMapper are helpful. I could also say something like ‘you should always use OR mapper if you are dealing with the database’ but I would not say it is a mistake if someone does not use it (because its not a mistake).
5. I agree that every developer should know about exception re-throwing
6. I agree that every developer should know about what some basic keywords (such as ‘using’) do
7. That is just a micro-optimization and if you are going that way why not include collections as well (http://codebetter.com/patricksmacchia/2008/11/19/an-easy-and-efficient-way-to-improve-net-code-performances/). I prefer foreach in almost all scenarios except in some specific cases (such as doing reverse for loop through collection in order to remove values).
8. Same as 4, depends on the application and the context.
First of all sorry for the late response to your comment – unfortunately I did not
have time to do this earlier. Secondly I really appreciate that you found interest in my
post and that you found time to share your views and opinions about it.
Below I have put my answers to your points:
1.The general thought here was that StringBuilder is better
than simple string concatenation when we have multiple append operations. By “multiple”
I meant “a lot” but probably I should have put more emphasis on the fact that I
am talking about large numbers of elements.
There are obviously other alternatives as well like
string.join, that’s fully true. What’s important though is not to use (due to
the lack of knowledge or laziness) simple string concatenations when we have large
numbers of append operations.
2. That’s true that there are hundreds other methods in .NET
framework as well. Yet there are some methods that C# developers use quite
often, therefore it is relevant that they understand that there might be some
alternatives to what they currently use.
What I have meant with suggesting the usage of
FirstOrDefault are the situations when it’s possible that null will be returned
and it should not make the application crashes. If course it could be handled
by means of exceptions but I prefer if exceptions handle errors and not just
some exceptional paths. I think we should not abuse exceptions.
4.The problem that I found with rewriting properties
(especially if you have an architecture that distinguishes between entities and
business models) is that 1) it’s a lot of writing, 2) it’s easy to ommit some
property that you want to rewrite from one object to another. Generally it’s
not necessarily a problem when you have only a few properties, but it might be
when you have many of them. So let’s say you have an entity Person and a
business model PersonModel. PersonModel is enriched with some additional
properties like FullName, Age, etc. but the rest, let’s say, 10 properties
should be simply copied. In this case you can either rewrite the properties
line by line or use some mapper which makes it in my opinion much cleaner.
7.What I meant here is not to blindly use foreach. There are
some developers who forget that for-loop exist at all though there are indeed
some scenarios when the usage of for-loop is preferred.
8.Can you provide me with the context when having more than
1 call to a DB is a better solution?
Actually string builder is very efficient for a larger data set of strings that need to be concatenated. For a very definite set, StringBuilder is an overhead and we can use string.join method.
I would prefer the below model to throwing exceptions (incase of Exception ex)
try{
}
catch(Exception){
throw;
}
recent development in JIT produces same IL code for both foreach and for loop. I prefer foreach since i don’t need to worry about the length and index out of range issues.
Sure, you’re gonna go with “catch(Exception){ …”
Now you want to e.g. log the exception’s .Message property. (I.e. “//some exception logic […]”) Then what?
using the as (T) will not always work how you would expect, check out Eric Lippert’s post on the matter http://blogs.msdn.com/b/ericlippert/archive/2009/10/08/what-s-the-difference-between-as-and-cast-operators.aspx
The article you use to support point 7 is 9 years old – the C# and JIT compiler has changed a lot since then! Can you back up your point with a more recent example? (Of course as Dalibor points out it’s a micro-optimisation anyways so can hardly be considered a ‘mistake’)
2. FirstOrDefault has its place, but in your example with integers using FirstOrDefault is a bad idea: suppose your list doesn’t contain any fibonacci numbers. Then the function will return the default value of zero and the caller may act on the wrong premise that zero IS a fibonacci number. And throwing an exception constitues a strong contract with the caller, so it’s definitely better in this case.
7. for is almost never faster than foreach. Even in cases when it is, the difference is negligible in majority of practical applications. BTW, “not a collection, e.g. an array” is somewhat sloppy and vague definition.
8. Very debatable, if you are talking about relational sql DBs. In many scenarios the opposite is true: manipulating data in multiple calls results in better perfromance. It allows to shorten the transaction duration that minimizes blocking, and allows to use simpler queries increasing the chance that the DB engine will use an optimal execution plan.
Thanks for sharing your point of view. Some of your thoughts have been already answered by me in replies to other comments, so, trying not to break the DRY rule, I will write about the point that have not been mentioned in other comments:
You wrote: “not a collection, e.g. an array” is somewhat sloppy and vague definition.
My answer: You are right that an array might also be treated as a collection of objects. But because it’s a post for C# developers and in C# language Collections (System.Collections) and Arrays are two different concepts, I made such a division.
You can read more about the differences between Collections and Arrays in C# in this article:
http://msdn.microsoft.com/en-us/library/9ct4ey7x(v=vs.90).aspx
Using ‘as’ definitely has its place and I do so quite often, but if the code following the type check requires that the cast succeed, then having the code throw an InvalidCastException can be desirable. It’s certainly much better than using ‘as’, then not checking for null and throwing a NullReferenceException.
Another thing is that for value types the as T would not work, in cases it’s obvious that only reference types will be used then we can utilize as T otherwise not.
I was going to go into detail about the problems with this article but previous posters have already pointed out all the problems I noticed. I just want to add that I feel like articles like this are dangerous. While points 5 and 6 are, in my opinion, absolutely best practices that should be followed in 99% of cases the rest of the items are dependent on what you are doing, how much data you are working with, and a host of other considerations. To tell a developer “always use for instead of foreach” will generate a bunch of developers who don’t understand why other than they read it on some blog.
Thanks – good list.
Step 5 – The sentence should be “they lose the stack trace” not “they loose the stack trace”.
Thanks for reading my post and for pointing out this typo 🙂
I completely disagree with the First vs FirstOrDefault issue. You should only use FirstOrDefault if you know it’s possible that you won’t find a match and that you’re going to handle things differently. Otherwise, if you always expect there to be at least one match, then you should use First() and let the exception be thrown. It’s much easier to debug First() throwing an exception than random null object exceptions.
Thanks for reading the post and sharing your thoughts, I really appreciate this.
What I have meant in fact with suggesting the usage of FirstOrDefault are the situations when it’s possible that null will be returned and it should not make the application crashes. If course it could be handled by means of exceptions but I prefer if exceptions handle errors and not just
some exceptional paths. I think we should not abuse exceptions. But you are right that I should have explained it better in the post.
Thanks for reading the post and sharing your thoughts, I really appreciate this.
What I have meant in fact with suggesting the usage of FirstOrDefault are the situations when it’s possible that null will be returned
and it should not make the application crashes. If course it could be handled by means of exceptions but I prefer if exceptions handle errors and not just
some exceptional paths. I think we should not abuse exceptions.But you are right that I should have explained it better in the post.
Thanks for reading the post and sharing your thoughts, I really appreciate this.
What I have meant in fact with suggesting the usage of FirstOrDefault are the situations when it’s possible that null will be returned
and it should not make the application crashes. If course it could be handled by means of exceptions but I prefer if exceptions handle errors and not just
some exceptional paths. I think we should not abuse exceptions.But you are right that I should have explained it better in the post.
Good article although point 7 is just wrong. There is almost no performance difference between for and foreach, even for arrays, and foreach is more readable (unless you need to do something with the index), so it’s almost always better to use foreach. Also, how is array NOT a collection?
1. You seem to be repeating mostly misinformation. In the exact scenario you described, your first code sample would actually likely be extremely more performant. Also this type of code would nearly never likely exist. More realistically it would be something like
outputValue = a + b + c + d;
The compiler and string interning is pretty magically when it comes to this code.
As others noted, the real magic string method is string.Join.
string.Join is unequivocally the best method. It is the fastest overall, faster than everything else.
The only reason to use stringbuilder is for code readability and also using AppendFormat possibly mixed in with Appends. If you only need .Append() you should be using string.Join.
If you’re getting into strings that have thousands or millions of characters then you should be using textwriters and stuff like that as opposed to requiring huge stack allocations.
Agreed with Dalibor and others. Most of the points here are very subjective and, as always, there are a lot of scenarios when completely different advices can be given. As always, the most correct advice is “it depends”.
I’ve commented a bit broader on my blog –
9th most common mistake – using “always” and “never”
> 3. Casting by means of ‘(T)’ instead of ‘as (T)’ when possibly not castable
So instead of a type cast exception telling you exactly what went wrong you are going to inject a null reference exception at a later date? That’s not an improvement, that’s a landmine.
The ONLY time an “as T” cast is acceptable is if it is immediately followed by a null check.
Personally I use the rule that there is only 2 cases where you should use an as…
1. When returning a value from a property
2. When used with a coalesce – var str = obj as string ?? string.Empty;
Personally, I’d love it if you were forced to put a coalesce with as, if you really wanted null you could say
obj as string ?? null;
and it could lead to being able to use it with primatives –
obj as int ?? -1;
Thanks for sharing your point of view.
You are right that “as T” should be followed by a null-check, otherwise it might be hard to debug the program appropriately. In my opinion when using “as T” and complying with the rule of immediate null-checking we make our code cleaner, more readable and easier to debug, then when using the prefix-cast with exception throwing approach.
There are some articles that go deeper into this subject:
http://www.codingware.com/2011/02/difference-between-c-cast-syntax-and-as.html
http://blog.gfader.com/2010/08/avoid-type-casts-use-operator-and-check.html
http://www.techgalaxy.net/Docs/Dev/5ways.htm
Thanks for sharing your point of view.
You are right that “as T” should be followed by a null-check, otherwise it might be hard to debug the program appropriately. In my opinion when using “as T” and complying with the rule of immediate null-checking we make our code cleaner, more readable and easier to debug, then when using the prefix-cast with exception throwing approach.
There are some articles that go deeper into this subject:
http://www.codingware.com/2011/02/difference-between-c-cast-syntax-and-as.html
http://blog.gfader.com/2010/08/avoid-type-casts-use-operator-and-check.html
http://www.techgalaxy.net/Docs/Dev/5ways.htm
Thanks for sharing your point of view.
You are right that “as T” should be followed by a null-check, otherwise it might be hard to debug the program appropriately. In my opinion when using “as T” and complying with the rule of immediate null-checking we make our code cleaner, more readable and easier to debug, then when using the prefix-cast with exception throwing approach.
There are some articles that go deeper into this subject:
http://www.codingware.com/2011/02/difference-between-c-cast-syntax-and-as.html
http://blog.gfader.com/2010/08/avoid-type-casts-use-operator-and-check.html
http://www.techgalaxy.net/Docs/Dev/5ways.htm
Thanks for reading my post and sharing your point of view.
You are right that “as T” should be followed by a
null-check, otherwise it might be hard to debug the program appropriately. In
my opinion when using “as T” and complying with the rule of immediate
null-checking we make our code cleaner, more readable and easier to debug, then
when using the prefix-cast with exception throwing approach.
There are some articles that go deeper into this subject:
http://www.codingware.com/2011/02/difference-between-c-cast-syntax-and-as.html
http://blog.gfader.com/2010/08/avoid-type-casts-use-operator-and-check.html
http://www.techgalaxy.net/Docs/Dev/5ways.htm
Thanks for reading my post and sharing your point of view.
You are right that “as T” should be followed by a
null-check, otherwise it might be hard to debug the program appropriately. In
my opinion when using “as T” and complying with the rule of immediate
null-checking we make our code cleaner, more readable and easier to debug, then
when using the prefix-cast with exception throwing approach.
There are some articles that go deeper into this subject like this one:
http://www.techgalaxy.net/Docs/Dev/5ways.htm
One more thing is that for value types the as T would not work, in cases it’s obvious that only reference types will be used then we can utilize as T otherwise not.
> When using FirstOrDefault, if no value has been found, the default value for this type will be returned and no exception will be thrown.
That is not a good thing. That is sloppy programming done by people who would rather paper over problems instead of actually fixing them.
The developers down the road are going to hate you when they find out the reason their code was silently failing.
I use FirstOrDefault; the next line is
If (varObject != null )
Which is proper checking. How is it sloppy programming to run a linq query that might yield no results?
And yet for some code it means something has gone quite wrong, and you should stop, if you have a null value. So why reduplicate the null check when First() will do it for you?
Thanks for reading my post and sharing your point of view.
You are right that “as T” should be followed by a
null-check, otherwise it might be hard to debug the program appropriately. In
my opinion when using “as T” and complying with the rule of immediate
null-checking we make our code cleaner, more readable and easier to debug, then
when using the prefix-cast with exception throwing approach.
There are some articles that go deeper into this subject:
http://www.techgalaxy.net/Docs/Dev/5ways.htm
http://www.codingware.com/2011/02/difference-between-c-cast-syntax-and-as.html
http://blog.gfader.com/2010/08/avoid-type-casts-use-operator-and-check.html
Quote: “That is sloppy programming done by people who would rather paper over problems instead of actually fixing them”….
I understand your remark that hiding possible null reference execptions is wrong, and giving you as a programmer a hard time in you day to day maintenenace chores.
However, as much as you like production code to be your playground, I wonder how you would react when for example your local ATM machine thows you an exception instead of dishing out your money. Nobody was papering over anything here, but I bet you would be really annoyed if this would happen in real life!
This highlights the real core of the “sloppyness”. With the rush to market mindset there’s almost no considereration what a program (or method) really should do. Discussions bloated with oneliners and one sided remarks do not really contribute in highlighting the real problems.
I’m talking about pre- and posconditions, and invariants. The real sloppyness can be found here.
It is not a question if a linq query should or should not contain a First i.s.o a FirstOrDefault, but whether the returned data is allowed to contain a null value in the first place.
When in does, then obviously you should have an explicit scenario coded.
If it isn’t allowed, simply blowing up and let the linq query throw isn’t a complete solution either. You still have to write additional code to properly deal with this situation.
Dealing with the situation in a gracefull way, and defining how an application or usecase should proceed is more than often overlooked, or plastered in ad an afterthought.
No, but on the other hand, if my banking software couldn’t find the entity to which I was transferring money I wouldn’t want it to blindly press on with taking money out of my account anyway.
> String concatenation instead of StringBuilder
You got that one wrong too, but at least you get half-credit.
You should be using String.Join so that it can pre-allocate a buffer of the correct size.
It’s true that string.join is an alternative for StringBuilder (and there are many other alternatives which I could write about if it was a post focused purely on the comparison between different string append possibilities).
What is though not fully true is that string.join is better than StringBuilder in all possible cases.
When you do a string.join, the runtime has to: 1: Allocate memory for the resulting string 2: copy the contents of the first string to the beginning of the output string 3: copy the contents of the second string to the end of the output string.If you do two joins, it has to copy the data twice, and so on.StringBuilder allocates one buffer with space to spare, so data can be appended without having to copy the original string. As there is space left over in the buffer, the appended string can be written into the buffer directly. Then it just has to copy the entire string once, at the end.
Therefore whether to use StringBuilder, string.join, string.format or any other alternative is a good subject for a separate post. What is though true (and it was my point when writing about this) is that developers should not use “+” operator for appending string in case of big amounts of strings to concatenate. Probably, and here I agree, I should have listed down all the most popular alternatives instead of refraining myself to StringBuilder.
I actually disagree with a lot of these
1. Yes you should use StringBuilder for long or indeterminate number of joins, but it can also overused. When dealing with very short number of strings it’s actually faster to just concat the strings (The benefit of StringBuilder doesn’t outweigh the cost of setting it up). And for the next few the performance is so minimally better it makes no difference. And StringBuilder does have one disadvantage, it’s uglier.
2. I agree with using .First(x => …) instead of .Where(x => …).First() but the decision to use First or FirstOrDefault is a completely different discussion. If you are going to assume that it was found (so you don’t check if it was afterwards) FirstOrDefault will (assuming it’s nullable) make you throw a NullReferenceException. Using First will return a more meaningful error.
3. I’m really sick of people claiming this. (T)obj is so much better 90% of the time. Like before, as leads to meaningless null reference exceptions possibly nowhere near the problem is. While (T) throws a meaningful InvalidCast where the problem happens.
But people say, “Oh but you should immediately say “if (x == null)” which prevents that, but people don’t always do it, they think, “It will always work”. And let’s not forget you can say if (x is T) on old style casts.
As is meant to perform better, but I doubt you could ever find a situation where it’s actually noticeable and the dangers aren’t worth it.
7. The article is pretty old, so I don’t know if that’s true anymore, but foreach reads nicer and allows for better abstraction.
7. The article is not true nowadays. JIT generates exactly the same code for both iterations (http://lj.rossia.org/users/steinkrauz/300537.html)
How i wish i knew about “StringBuilder” several years ago, in fact “the benefit of StringBuilder do outweigh the cost of setting it up”.
I had severe memory issues, but after making use of the StringBuilder the issues are all gone.
Thanks Pawel!
The comments in the Foreach vs For article are all I think about this one.
🙂
great, agree
In Point 3 you’re suggesting to use as (T) instead of (T) but the article you’re referencing is suggesting the opposite…
http://gen5.info/q/2008/06/13/prefix-casting-versus-as-casting-in-c/
” Prefix-cast should be the default cast operator on your fingertips”
Number 9: Thinking that performance is everything. You should remember: “Premature optimization is the root of all evil.” (Donald Knuth)
Thanks for reading my post and for writing down this famous quote. I fully agree that we should remember about the fact that “Premature optimization is the root of all evil”, but I would also add a sentence “Don’t always excuse yourself by abusing Premature optimization and start thinking about the Design For Performance principle as well”. What I mean here is that I have no doubts that Premature optimization is a very good principle, but it’s applicable mostly to the situations when it requires a lot of energy and time of a developer to optimize the system before it’s really needed (before bottlenecks were actually found). Design For Performance says about having a certain set of good code practices to keep code optimized “out of the box”. Such practices once learnt, do not require a developer to sit hours on optimizing the solution, they are just done automatically and “eat” a similar amount of time as the normal development would take.
First of all: I liked your article a lot because 5 actually solved a problem for me (I just never took the time to look deeper into it – so I wasn’t aware I could use “throw” without an exception). The way you wrote it, you might get a warning, though – because you declare a variable that might not get used (depending on what the actual exception handling code looks like). However, you can also say “catch (SomeException)” – without assigning the exception to a variable. I guess that would be the best way (unless, of course, you really do need the exception for your exception handling).
So, to elaborate a “little” on premature optimization: I commented with that well-known quote as a reminder primarily due to your “common mistakes” 1, 7 and 8 which in my opinion violate that “rule”, or “guideline” (“do not optimize prematurely”) for no good reason:
There’s various ways of putting together strings, and it’s good to be aware of all of them and while it doesn’t hurt to also be aware of the performance implications, I wouldn’t really use performance to drive decisions in most cases.
Personally, I use string.Format most of the time, string.Join for cases like your example and StringBuilder when I have some complex logic for building longer strings, usually with some conditionals in-between (“complex”). The reason I’m using string.Format is mainly for globalization (make it easy to localize) and code readability. string.Join is most elegant when you build a string list from a list of items. Honestly, for all of the cases, I don’t care about performance at all – and no one should unless they see in the profiler that this particular snippet of code is exactly the one that’s pulling their whole app down.In other words: If that code either kills your web server, or makes your user wait 5 seconds instead of 1, or you realized that your app is killing the battery of your mobile device, or you notice that the garbage collector goes crazy, is when you should start optimizing. If it’s just a millisecond, or ten, or even 100, in some operation that occurs only every once in a while – why bother? If you wanted the best possible performance you wouldn’t use C# in first place and go with hand-optimized assembly code instead (and yes, there are cases where you’ll have to consider that … fortunately, they are very, very rare; and of course, there’s a few options between C# and hand-optimized assembly code) ;-)Good luck trying to gain a significant performance boost from changing from += to StringBuilder … but of course, if we’re talking about millions of items you might get lucky and have an optimization that is actually noticeable without using a stopwatch.And that’s exactly why it’s good to avoid premature optimization. Having coding conventions (and that’s what you are basically suggesting) that are driven only by performance considerations is actually the worst kind of premature optimization one could think of. It’s an “optimization” done even before you were looking at the actual problem you’re trying to solve.Same with for and foreach. What matters here is whether you care about the index or just about the actual objects / collection. Using “for” if you don’t need the index is just bloating your code for no good reason. Deciding “for vs. foreach” due to performance considerations compromises readability / expressiveness of your code => bad idea.With database calls … again, it really depends: Personally, I use stored procedures by convention because it simplifies my development process and creates a clear separation between logic in C#-code and logic in SQL-code. I just don’t like mixing SQL and C# in one file (and that’s a beneficial convention for more reasons than just personal preference). This does also have performance benefits but that’s not the reason why I’m doing it.Sometimes, the most elegant solution – even when using stored procedures – is doing some things in C# and other things in SQL / Stored Procedures. In other words: You get some data from the database, then do some processing of that data in C# (that would be cumbersome to do if you only had SQL), then get some more data from the database. In those (rare) cases, I live with the performance penalty I have to pay for having multiple calls to the database for a single operation … until I find out that this specific operation creates a performance bottleneck. Then, I either put it all into the stored procedure, or all into the C#-logic; or find a way of still having some parts of the logic in C# and others in the stored procedure but do it in a more optimized way. Or I let the whole thing run in C# directly on the database (if that’s an option for the project in question).To be honest, so far, I’ve never ever run into a situation, where the overhead of calling into the database was even noticeable. And if it’s noticeable for you, you might question why your application server and your database server have such a terrible network connection (or you are really doing something very very very bad with your code, kind of like intentionally DoS-ing your db-server) ;-)Most of the time, when performance of database related operations is an issue, we’re talking about seconds, or in some cases minutes that certain complex stored procedures take to complete! However, sometimes, combining something in a way that allows for SQL optimization does give you significant benefits compared to calling very similar code (with very similar data) many times (and losing the context you could use for optimization that way).TL;DR: It all depends on a lot of factors and in most cases, performance isn’t the one that I would look at to make a decision.
You might be interested in a pretty good talk Jonathan Blow (developer of “Braid” and “The Witness”) gave on develop independent games. Mind you: Games are a category of software where performance really really matters. Games are realtime systems and have to deal with all kinds of constraints. And still: The main part of the talk is about *not* optimizing – at least not for performance.
How to program independent games
http://bit.ly/JBProgrammingIndependentGames
In many cases, foreach makes the code a lot more clear to understand that a simple for. And you’re refering to a 9-year-old article about compiler performance. I bet a lot has changed since then.
Number 7 is a classic example of premature optimization. Which leads me to believe that you have fallen for the most common programming error throughout most any language:
Not writing for maintenance.
Thanks for reading my post and sharing your thought regarding point 7. I fully see your point, but I think you overinterpret the meaning of Premature Optimization. Prematura Optimization says about not spending hours on thinking out how we can tune performance upfront, but doing this only when some bottlenecks will be found. But it does not give a right not to focus on performance at all in the earlier phases, especially when it does not cost much from a developer. There is principle called Design For Performance, which in fact says, shortly speaking, about not abusing Premature Optimization. If there are some code practices which do not require big effort from developer’ perspective (or are just a matter of some habit of a developer), but can influence performance then they should be used. You can read more about Design For Performance principle here:
http://c2.com/cgi/wiki?DesignForPerformance
And about the comparison between these two:
http://blogs.msdn.com/b/ericgu/archive/2005/03/23/401057.aspx
I agree with all the points, except 7. It is a classical example of “Premature optimization”. I think you should put this mistake instead of 7.
Mike- thanks for reading my post and sharing your thoughts. I have already responded to the point you made by answering to Kirk Wood, who also mentioned the Premature Optimization issue in his comment.
Thanks for the post!
1. As pointed out, StringBuilder is NOT always better. Blindly using StringBuilder is as much a mistake as blinding concating your strings.
2. Dropping the Where in favor of the lambda predicate on First/FirstOrDefault I’ll agree with. The absolute about using FirstOrDefault I won’t. As pointed out your example is one scenario where I absolutely cannot agree. If you post “newbie mistakes” type articles, you’d best be very sure your examples are correct and thoroughly explained, because newbies can’t figure out when you blunder so badly.
3. The description here was pretty horrible. What does it mean to be “possibly not castable”? As others have pointed out, what you really mean is you can in some way handle the cast failing. Otherwise, the strict cast that throws an exception IS the correct way to code this.
4. Mapping is something that isn’t all that common, and in many cases a mapping framework is extreme overkill and a likely cause of performance issues. It’s a tool you should be aware of (though I’d debate whether or not newbies should be aware of them), but not using it is hardly a mistake.
7. You’re advocating a micro-optimization that’s unlikely to actually make any difference in the overall performance but is much more likely to result in bugs (off by one errors, etc.), especially by newbies. Probably the worst advice in this article.
there are instances where using First is correct and there are instances where using FirstOrDefault is correct, neither is generally more correct than the other.
Good post Pawel. While one or 2 points might be debated under particular circumstances, there’s one thing I think is dead wrong. “In case of a really complex logic, just moving it to the DB layer by building a stored procedure” implies that embedded SQL is a good idea. It’s not. Unless it executes every time with the same predicate and parameter values, it will compile every time it hits the DB server. This takes a finite amount of time that is eliminated if you put even your simple SQL into a stored procedure. Even better, as a stored procedure, unless you wrote it to recompile every time it runs (necessary when there is a wide variation in the rowcounts of different result sets, 1.e., 100 one time and 100,000 the next), the execution plan will stay loaded for as long as there is available memory and that page is not needed by a new process, thus gaining even more efficiency.
This may not make much difference if you have a service that consumes a few hundred or a few thousand requests per hour, but scale that up to a few hundred thousand per hour and tens of millions per day and now you are talking about significant time if your SQL is embedded vs. a stored proc.
Moreover, a good developer would use the same TRY…CATCH logic in the stored proc as he would use in the application to capture and store the errors at the source, rather than in the application log.
Robert – thanks for complimenting my posts and for sharing your thoughts. There is though one slight misunderstanding of what you have interpreted from my blog post: I have used the term DB layer (so not the data access layer- DAL or data logic layer- DLL, but database layer) and by this I meant in fact the database itself. Probably I should have written simply “stored procedure in a DB” to avoid any misunderstandings. Anyway – what you wrote about the difference in having SQL logic in DAL/DLL and DB is a very interesting subject and I think that you could write a nice blog post out of this 😀
// EVEN
BETTER
try
{
//doing something specific with SomeValue
}
catch (Exception
ex)
{
//some exception logic […]
throw new Exception(
string.Format(“FAIL!
HERES’S SOME VALUE TO HELP FIND OUT WHY: {0}”,
SomeValue),
ex);
}
// gives you
// 1.) nested stack trace,
// 2.) something to tell the client
// 3.) something to duplicate/fix the issue
// call it defensive programming if you must,
// but truth is computations generally don’t fail,
// what fails is another part of the system out of your control
// writes data incorrectly or inconsistently// or you’re uncovering a heretofore unseen edge case
IMHO: If not logic in catch, bad practice :
try
{
// Logic…
}
catch (Exception ex)
{
throw; // ONLY throw, not logic for logging or another
}
Better only
// Logic…
and the exception up to level above.
Though, as a .Net developer, you might be much more interested in clean and easier to read code more, than in several microseconds.
Which is the same as with foreach vs for. Yes, for may be more faster, but foreach states the intention of the code way more clearer than a for.
I agree with most of the points you made, however some are misleading or downright incorrect. As this would require a lengthy post, I’ve decided to put the answer on my blog instead, for whoever’s interested: http://zeckul.wordpress.com/2013/01/09/re-8-most-common-mistakes-c-developers-make/
great article
Interesting. I’m not sure I agree with all of it, but it’s definitely thought-provoking. I think what really needs to happen with C# (and please tell me if such a thing already exists) is for there to be some kind of “bible” which tells us what’s going on internally when a lot of the library structures are used – especially things in System.Collections.* – and how many of the loops and so forth (not to mention LINQ) are compiled. I think such a reference would help programmers decide which methods to use when writing code, so that it becomes more efficient.
Thanks!
Remove the post. To many wrong ideas for such a small post.
I have a question about string concatenation…can we get around the boxing problem by using
string x = string.Concat(a,b,c);
Always wanted to know, but couldn’t find the answer.
This is inefficient? … Dude .. code C++
Concering the performance difference between for and foreach, it all depends in each particular case. Foreach is not used on collections. It is used on Enumerables. And that’s different. An enumerable may not be eligible for use with the for statement at all. Moreover an enumerable’s items may not exist as sequential elements in memory, or may not exist at memory at all prior to their first access. An enumerable may create objects “on the fly”, as you request them.
A “for” loop may (or may not) be slightly faster in case of arrays, but as a .NET developer (meaning, a developer on a managed language) gaining a few milliseconds here and there is not crucial. Optimizing the program in the technique or algorithm level will be extremely more effective than optimizing it in the IL. One of the most common optimizations I do, is using Dictionaries, instead of just Collections, where keyed access to objects is needed. It certainly takes some time to build an index over your data using a dictionary [actually O(n) time], but saves extremelly more time later when you access the data, especially when the data are cross referencing each other by their key.
I could say, not using index on keyed data, and using just collections, is worse than using a foreach instead of a for.
Also, the difference between (T)Val and Val as T, again is not about choosing what is best. It’s all about choosing what fits best to what you want to do.
Val as T will return null if Val is not castable to T. (Τ)Val will just throw an exception.
If it is within your program’s logic to expect a Val that is not castable to T, and you want to perform certain actions in such a case then Val as T is a good choice. You check the result and compare it to null and you do whatever you want to do.
On the other hand, if you always expect a value that is castable to T, then (T)Val MAY be better choice. If you get an exception there it is a good indicator that something has gone wrong with your program that needs to be fixed.
Hey Pawel, thank you for this interesting post! Even though I think I’m an experienced C# developer I found some good hints in your Top 8 😉 Do you think you could do another issue with the followups? “8 more things C# developers should think about” 😉
Hi Florian – thanks for reading my post and the words of appreciation 🙂 I think that your idea is indeed very good, because there are many other things that could be mentioned by me as well. I will definitely work on a followup post in the nearby future.
Reading about strings concatenation, I remember I read this article some months ago…
http://www.dotnetperls.com/stringbuilder-performance
One comment: #3 is a bit misguided. I’ve know of programmers who doesn’t even know how to do a “unsafe cast”, which has a benefit of throwing a early InvalidCastException, whereas whe using operator “as”, if you don’t check for nulls, will throw a NullReferenceException. I think the latter is a worse mistake — I prefer exceptions that describe the problem.
I discussed this in an article here (http://blog.brunobrant.net/2011/08/careful-with-c-as-operator.html).
I would like to add some more common mistakes,
1. Using. ToString() instead Convert.String()
We should not use.ToString() blindly, since it will throw null exception, where Convert.ToString() will not throw the exception instead i will return empty string.
2. LinQ- Use Count() instead of Any()
Count will loop through all the items available in the collection whereas Any() will check items in the collection, if it finds first element then it will return.
eg: Incorrect
If(Coll.Count()>0)
{
// Statement
}
Correct
If(Coll.Any())
{
// Statement
}
If we don’t do any mistakes, we can’t learn.Thank you Pawel for great list. I think every c# developer who read this article will never do above mistakes in future. So i wish these 8 mistakes will disappear from “common mistakes” category in future:-)
and 3 is not a mistake to not know about some 3rd party tool
id rather they know about the adapter pattern
and or the other patterns that apply to this situation
[Difference Between C# Cast Syntax and the AS Operators] link is not working anymore
8. Very interesting would be nice to run a test against multiple singletons vs one large result set.
Mr. Bejger did you check the discussion/comments on the article at code project about Foreach vs For?
Automapper is good but I’m not sure it is a simple “mistake” to not use it. Automapper uses reflection to find properties, in some cases you might decide to do your mapping manually for performance.
nice tutorial…very good information….
Avoid using “var”.
Use “var” everywhere, except in those places that it cannot be used.
Great post. Just ran into the String.Concat with a C# Dev Team when analyzing their app. We also found Excessive SQL, Bad DB Connection Handling, High GC caused by loading too much data and wrong configured worker thread pools as their main problems. I wrote my own little blog about that: http://apmblog.compuware.com/2015/01/14/c-performance-mistakes-top-problems-solved-december/
Programmers should avoid linq expressions inside loops
You’re assuming that the goal of software engineering is efficiency, and therefore it is the job of the engineer to ensure optimal performance at all costs.
This is not entirely true, at least not all of the time.
Your job is to engineer something that works, that works well and that plays nicely with other engineered solutions, considers future maintenance costs, complexity… the list goes on and on. I will say that there is generally a minimum performance requirement for any engineering project, but optimizing beyond this requirement at the expense of any other concerns such as clarity, simplicity, terseness or just friendliness to the eyes and brains of other people is actually sort of a newbie mistake.
If you go so far as to intentionally over-complicate your designs with cryptic variable naming, multiple statements on the same line, excessive method chaining,
using weird stuff because it’s “better” or writing your own versions of the provided library methods I actually think you might be a super villain.
Some days ago I wrote an article about mistake number 8 🙂
It is funny to see someone mentioning it as a common mistake between programmers
This article is so fantastic!!! Unfortunately, after a live of fooling around C# and .NET I recognize my self in half of these 🙂 … if only I read this 5-6 years ago… 🙂
Thanks man – love it!
Thanks for sharing! This is why we keep sharing our experiences. Stay tuned!
Here’s another article on for vs foreach vs while loops, and which looping structure is faster when: http://cc.davelozinski.com/c-sharp/for-vs-foreach-vs-while