Magic Strings – No More!

How many times have you had to work with some Legacy code? Was it readable? Silly question… But what was the main problem? If the code was written in an unclear way and you did not have any idea how to refactor it please visit my post under this LINK. I have recently stumbled across another problem: magic strings. And boy, I was surprised by human ingenuity once again! Imagine having a class whose source code you cannot change due to technical or business reasons. It looks like this:

class Worlflow
{
    public int Type; 
    public string CurrentStep;

    // workflow data
}

At first glance it seems to be ok, but when you find a method dependent on this class that looks like this:

bool CanUserAbort(Worlflow workflow)
{
    if (workflow.CurrentStep != "Cld")
    {
        if (workflow.Type == 0 && workflow.CurrentStep == "St")
        {
            return false;
        }
        if (workflow.Type == 0 && workflow.CurrentStep == "Pd")
        {
            return false;
        }
        if (workflow.Type == 0)
        {
            return true;
        }
        if (workflow.Type == 1 && workflow.CurrentStep == "Acc")
        {
            return false;
        }
        if (workflow.Type == 1)
        {
            return true;
        }
    }
    return false;
}

you may quickly change your mind. Especially if you have to modify this method due to Business Rule change or any other reason. Just to be sure that everyone is on the same page, the main problems are:

  • CurrentStep is compared with some strange looking strings,
  • Type is at best unclear, not to say it does not give a clue about its information,
  • method structure suggests that it has been incrementally changed and other programmers were mainly adding new ‘if’ statements rather than refactoring the existing code,

Enum but not Enum

Obviously both workflow.Type and workflow.CurrentStep should be replaced by enums (instead of string and integer). But since we cannot change Workflow class this solution is unavailable. One of the most common solutions is to wrap one Workflow class in another that would act as a layer translating information between Workflow class and the rest of the application. But this is not the pattern I would like to share with you today. Instead I will create classes that would act like enums and will translate data from string and int to more reliable and readable form. How to do this? Here is a list of things that must be done:

  • Since we do not create an instance of enum every time we use it there would be no point creating another instance of whatever we are going to use – so there will be some statics involved as well as a private constructor.
  • As we need to pass data and be able to customize our “enums” we will need to pass a parameter in the constructor.
  • Although workflow.CurrentStep and workflow.Type will be a different type from what we are about to create, we can always override ‘==’ and ‘!=’ operators.

The outcome of this assumptions may look like this:

class Step
{
    public static readonly Step Accepted = new Step("Acc");
    public static readonly Step Send = new Step("St");
    public static readonly Step Paid = new Step("Pd");
    public static readonly Step Closed = new Step("Cld");

    private string _step;
    private Step(string step)
    {
        _step = step;
    }

    public bool Equals(string name)
    {
        return _step.Equals(name);
    }

    public static bool operator ==(string name, Step obj)
    {
        return obj.Equals(name);
    }

    public static bool operator !=(string name, Step obj)
    {
        return !(obj.Equals(name));
    }
}
class WorkflowType
{
    public static readonly WorkflowType Purchase = new WorkflowType(0);
    public static readonly WorkflowType Refound = new WorkflowType(1);

    private int _type;
    private WorkflowType(int type)
    {
        _type = type;
    }

    public bool Equals(int type)
    {
        return _type.Equals(type);
    }

    public static bool operator ==(int type, WorkflowType obj)
    {
        return obj.Equals(type);
    }

    public static bool operator !=(int type, WorkflowType obj)
    {
        return !(obj.Equals(type));
    }
}

And these new classes can be used in CanUserAbort method like this:

bool CanUserAbort(Worlflow workflow)
{
    if (workflow.CurrentStep != Step.Closed)
    {
        if (workflow.Type == WorkflowType.Purchase && workflow.CurrentStep == Step.Send)
        {
            return false;
        }
        if (workflow.Type == WorkflowType.Purchase && workflow.CurrentStep == Step.Paid)
        {
            return false;
        }
        if (workflow.Type == WorkflowType.Purchase)
        {
            return true;
        }
        if (workflow.Type == WorkflowType.Refound && workflow.CurrentStep == Step.Accepted)
        {
            return false;
        }
        if (workflow.Type == WorkflowType.Refound)
        {
            return true;
        }
    }
    return false;
}

As you can see, all strings and integers were replaced by our custom ‘enum’ classes. On could also change multiple ‘if’ statements to switch/case structure.  But let’s not stop here. After a short refactor we get:

bool CanUserAbort(Worlflow workflow)
{
    if (workflow.CurrentStep == Step.Closed) 
    { 
        return false; 
    }
    if (workflow.Type == WorkflowType.Purchase)
    {
        return !(workflow.CurrentStep == Step.Send || workflow.CurrentStep == Step.Paid);
    }
    if (workflow.Type == WorkflowType.Refound)
    {
        return workflow.CurrentStep != Step.Accepted;
    }
    return false;
}

Which looks way better than the original method and is incomparably more readable. This approach gives us significant benefits:

  • Workflow class has not been changed so there is no way we can cause errors in other parts of the application.
  • Since all strings and integers were removed, there is no real danger of introducing a bug to the application by misspelling a single item.
  • All comparisons are now made in an easily readable form that is closer to Business Rules in the application rather than physical data implementation.
  • Equals method is way safer option to compare two strings than ‘==’ operator.

Of course one can argue that this solution brings some additional calculation overhead. But to be fair this overhead is negligible in the scope of the entire application. And an application that uses magic strings is probably also not very optimized.

Additionally we can  extend  ‘Enum’ classes by inheriting from  IEquatable<T> and IComparable Interfaces that will allow to use CompareTo() and Equals() methods throughout the system – but that is a subject for another article.

There is also another solution. We can use static class with constant string fields like this:

public static Step
{
    public const string Accepted = "Acc";
    public const string Send = "St";
    public const string Paid = "Pd";
    public const string Closed = "Cld";
}

This will also allow the refactored CanUserAbort method to look exactly like in previous example. The drawback is that you will not be able to do something like this:

public void ProceedStep(Step step){…}

And instead it would still look like this:

public void ProceedStep(string step){…}

Which brings us back to original problem with magic strings in random places in our code.

What about real Enums?

Well we can always use real C# enums. Consider the following Enums:

public enum WorkflowType 
{
    Purchase = 0,
    Refound = 1
}
public enum Step
{
    Acc = 0,
    St,
    Pd,
    Cld
}

And we can use them in CanUserAbort method like this:

bool CanUserAbort(Worlflow workflow)
{
    if (workflow.CurrentStep == Step.Cld.ToString())
    {
        return false;
    }
    if (workflow.Type == (int)WorkflowType.Purchase)
    {
        return !(workflow.CurrentStep == Step.St.ToString() || workflow.CurrentStep == Step.Pd.ToString());
    }
    if (workflow.Type == (int)WorkflowType.Refound)
    {
        return workflow.CurrentStep != Step.Acc.ToString();
    }
    return false;
}

It is usually a good idea to set one enum value as default (Acc = 0) because adding next value on top of all enum values may mess up the application if default value is not explicitly defined.  Although this solution will work as well as all previous ones I personally (and subjectively) object to it. There are several reasons:

  • To compare two string values we must use ToString method every time we do the comparison.
  • Step enum has strange values like Step.Cld, Step.St, Step.Pd or Step.Acc which are no more descriptive than magic strings from the original method.
  • In order to be compared with integer WorkflowType enum must be casted to integer each time the comparison takes place.
  • In this particular scenario enums do not bring much help and the code is still pretty unclear.

Conclusion

Usually there are many different ways of programming in C#, but just because some options exist does not mean that they should be used. Always consider not only short term consequences (e.g. development speed) of creating code but also long term ones (e.g. maintainability). From my own experience most (if not all) things that we neglect or leave behind with ‘To Do: refactor ’ comment will eventually come back and bite you in the ass. So deal with them at once. Now go and clean up your code until it is not too late and it becomes Legacy code. And please let me know how it went.

Tags: , , ,

17 comments

  1. I’m pretty anti-enums, they’re just setting you up for failure later. At this point I always prefer files full of static fields if i need “constants”. The number one advantage is you can have as rich of “constants” as you want. Most people that use enums eventually end up with [Description(“I have spaces”) MyEnum = 1. This works OK but once you need one more piece of data, everything falls apart.

    Your step class is pretty silly. You could have easily just used

    public static readonly string Accepted = “Acc”;
    public static readonly string Send = “St”;

    Why add all that other code? (Exactly the same for WorkflowType)

    Honestly this is still all magic strings. From the outside this is just slightly prettier looking, but more code and minimal to zero actual improvement to maintainability.

    The only way you can truly improve the code in this example is replace it with a simple finite state machine. The state machine at each step would provide the available actions, and the Refound-Accepted state would return Abort is an available transition.

    1. static fields are still represented string types, so comparisons are not as elegant or maintainable, as the author pointed out…

    2. Hi, to be honest, the solution that you proposed:

      public static readonly string Accepted = “Acc”;
      public static readonly string Send = “St”;

      is hiding the problem instead of dealing with it (Yes – CanUserAbort() method will look exactly the same as mine, but here are some drawbacks). Consider creating NextStep() method – using your solution it would look something like this:

      public string NextStep(string step){…}

      You take one “magic” string and return other “magic” string. And this pattern will continue through the entire application. Solution I presented above allows you to create method NextStep() that will look like:

      public Step NextStep(Step step){…}

      which encapsulates “magic” string to comparable class that can be used like Enum in your part of the application and still be compared with strings that come from different sources.

      1. In both cases these are merely putting lipstick on the pig.

        The problem gets solved by using a state machine, not burying where the strings are typed into a .cs file.

        1. There are multiple solutions to “magic” string problems. It doesn’t’ matter if we have those reference strings in Data Base, .cs file, resources project, .config file, etc. These solutions (in almost every case) boil down to mechanism that translates string value to custom class or enum. And state machine (by default) does not solve “magic” string problems.

          The point of this article was to present one solution to “magic” string problem, show its advantages and disadvantages. This article is not about managing workflows and ways in which they should be treated.

          1. Then this article just has a poor illustration because what is shown the wrong problem is solved. This is actually one of the most common things done in software development, solving the wrong problem.

          2. Actually dotnetchris the most common problem in software development is not reading/understanding required specifications, use cases, and any other documentation related to business or technological site of project – but that is just my opinion.

            The problem stated at the beginning of this article was to get reid of the “magic” strings. “Workflow” was used as one of examples where this problem can be encountered. Code patterns presented in this article are examples of how one can solve problem stated at e beginning of the article.

            If you still think that this is poor solution to stated problem – fine, I don’t mind it – point out any obvious errors and/or present us with better solution – that is the point of this blog posts – to discuss topics and share knowledge.

            But please (and I really mean it) stick to the point and stay (more or less) on topic of this article when you comment it. And do not make argument on certain idea that at best is very loosely related to article.

          3. Except you didn’t show any patterns except for perhaps a zero value facade. The simplest metric to apply to determing the value of a pattern is the number of lines of code deleted. In this case you added tons of lines of code resulting in taking bad code and making it appear better. Now there’s more code to maintain with no equity gained.

            And I completely disagree with you about use cases, for the most part use cases are entirely meaningless. Who cares if you misunderstood a use case? It should be picked up in acceptance testing and should be resolved in minutes or hours in a well designed system. If addressing a use case requires a substantial amount of work this is a clear symptom of a functionally designed system. A well designed system should encapsulate the volatility of a business. All systems I design thrive on change, I absolutely love change. It doesn’t even matter if the change was for an elaborate set of use cases they created only yesterday.

            In all systems have a set of core use cases that facilitate the creation of dozens or hundreds of business use cases. In general the core use case count really sits around 5 total core use cases. Even some of the most elaborate systems won’t exceed a dozen core use cases. Any system that is even approaching these double digit numbers should be decomposed into multiple systems anyway.

          4. Let’s agree that we disagree on this topic, because this conversation has become a bit pointless and definitely off topic.

            I have presented a well know “Type safe Enum” pattern who is also mentioned in several .NET articles:
            http://www.codeproject.com/Articles/301419/A-different-type-of-enums
            http://blog.falafel.com/introducing-type-safe-enum-pattern/
            http://possiblythemostboringblogever.blogspot.com/2012/06/type-safe-enum-pattern-in-c.html

            And also java article:
            http://www.javacamp.org/designPattern/enum.html

            It is a pattern used to work with legacy code – so no – it is not suitable for every project. The aim of the above article was to show its advantages and disadvantages and to compare it to few other solutions, which it did.

            I wish you very best in your future projects and I’m looking forward to your comments in my upcomming articles.

          5. Let’s agree that we disagree on this topic, because this conversation has become a bit pointless and definitely off topic.

            I have presented a well know “Type safe Enum” pattern who is also mentioned in several .NET articles:
            http://www.codeproject.com/Articles/301419/A-different-type-of-enums
            http://blog.falafel.com/introducing-type-safe-enum-pattern/
            http://possiblythemostboringblogever.blogspot.com/2012/06/type-safe-enum-pattern-in-c.html

            And also java article:
            http://www.javacamp.org/designPattern/enum.html

            It is a pattern used to work with legacy code – so no – it is not suitable for every project. The aim of the above article was to show its advantages and disadvantages and to compare it to few other solutions, which it did.

            I wish you very best in your future projects and I’m looking forward to your comments in my upcomming articles.

  2. Why not just subclass Step and add a CanUserAbort to each type of step? seems like a much nicer way to group functionality to me.

    1. That’s still not a great solution. Why should you be checking IF that’s an allowable action? If this code was a state machine it would return Abort is a valid transition.

      Any code that is a true workflow that doesn’t behave as a state machine is just fundamentally wrong. Most business applications are just developed terribly, they have no design (picking webapi and sql server isn’t a design, those are constraints), they lack state machines, they lack the strategy pattern, code is duplicated enmass because there’s no usage of actual design.

      If a system is built as CRUD, it should be generated with tools like lightswitch, but most people don’t have slightest idea how to build a system that is not CRUD. CRUD is the absence of design. If you truly have CRUD, it should be implemented exactly once and that’s where things fall apart. Nearly no person understands how to write detail agnostic code.

      1. I have a feeling you misunderstoof my suggestion, it was to use the strategy pattern for the workflow engine states.

        1. Suppose we have the following contrived state machine

          ….. -> [ Refound-Accepted ] -> [ Abort ]

          You could implement Refound-Accepted and Abort as strategies. However what is missing is that directed arrow between Refound-Accepted & Abort. From the way this question is constructed CanUserAbort is fundamentally that directed arrow.

          That’s why the original code illustration of if-else-soup sucks. It’s about the worst possible implementation conceivable for the directed arrow in a state machine.

Comments are closed.