Principles of Coding

In my ~15 years as a software engineer/lead in global tech corporations, I made 100s if not 1000s of Code Review s. I captured here a list of common patterns in code reviews holding them up from completion and causing 10+, sometimes 20+ review iterations before the code can be checked in.

I expect these to be beneficial especially for the junior developers but also experienced developers as a baseline and a refresher. Like all principles and best practices in software engineering discipline, there could be cases not to follow these, but the reason should be a good one and thought through.  So lets dive in.

1. Parameter Validation:

All public methods and constructors need to have parameter validation. Even better, when invalid parameters are detected, parameter name and value, should be logged.

2. Dependencies and Dependency Injection:

Take dependencies as Interfaces (not concrete classes) through class constructors explicitly. That allows your components to be testable as you can easily pass a mock implementation of the interface to your class.

Use a IoC container for object composition. This makes it easy to manage dependencies in a central location (composition root).

3. Logging:

Logging is super important. All unexpected errors (and high informational value non-errors) should be logged with enough logging metadata and information to allow us properly classify, diagnose and root cause issues happening with the code we wrote when it is used in the wild in production.

Wrap your code with performance counters to be able to monitor its metrics in Production.

4. Asynchronous Code

All I/O calls (service call, disc I/O read etc.) should be async. All async methods should be awaited ie. await FooAsync(); instead of making a blocking wait on them ie. FooAsync.Wait() , FooAsync.Result.

Making a blocking wait on an async method defeats the whole purpose of asynchrony and introduces potential deadlock issues.

Making synchronuous calls on I/O requests unnecessarily blocks the Core while I/O request is executed and introduces scalability, perf bottlenecks which can be hard to fix after the fact once these issues accumulate over time.

5. Parallel Processing

For the code that does not need to be sequentially run, leverage parallelism in your code but balance this with an upper bound on Max Degree of Parallelism. Unbounded parallelism/ over parallelism also may result in side effects.

If in doubt, measure the execution time and CPU of your code before and after and optimize the level of parallelism.

6. Code Reuse

Do not reinvent the wheel and contribute to shared code. Search code locations in your team’s repository that contains shared, reusable code to see if the api you want to implement exists, if it does use it, if not then add it to a suitable shared project so it can be reused by other engineers/teams if it has potential benefit for other teams.

7. Small Code Reviews

Send small code reviews that can ideally be completed within 5 iterations. Talk through unclear comments with the reviewer in person.

8. Engage in Design Reviews Early On

Major software and system level architectural changes should go through a Design Review before the code review is sent. Before coding starts.

Hope this helps 🙂 Happy coding.

 

Object Graph Traversal by Recursive Reflection

Every software engineer faces with this problem of traversing an object graph and processing it, at least once in his or her career :). If you are a software engineer and did not have to solve this puzzle yet, believe me that day will eventually come… If you are not a software engineer and still you had to solve this problem please send me the details, I would be very curious about how you got yourself into that situation :))

In my case, I needed to implement a public .net api as part of a public SDK in Github, basically needed to implement my own Serializer, that takes an object as a parameter, traverse its nodes and end nodes (properties) and process these as it traverses them.I put no type, interface or inheritance constraint on the input object, the type of the input object is base type of all base types “object”. That makes the api super flexible and powerful, basically you are telling the client code, send me whatever type of object you want and I am going to process it, but on the other side it is a very bold statement because the object can really be anything, we have no compile time knowledge or run time type constraints of its type, nor do we know its object graph and hierarchical structure. It can be a simple object with simple properties directly under the object root, or it can be a complex object with a depth of many layers (practically infinite) with all kinds of complex /nested/composite properties, these nested complex properties can be reference type (a class) or value type (ie. struct) and there may be a full or partial recursive loops within the object graph, to name a few of the interesting points.

So this is an object traversal/processing problem where I need to visit every intermediate and end node property of the input object and process these properties. The output of this processing logic could not be “log something to command line/logging infrastructure” either, the output generated by processor had to be accumulated and returned to the caller.

I solved this problem by recursive reflection. For more interested, I implemented a depth first traversal with structural and multiple recursion technique. You can have a look at different types of recursion in this good and short wiki article (https://en.wikipedia.org/wiki/Recursion_(computer_science)#Types_of_recursion).

The public facing API is called Flatten. It takes an object as I mentioned before, processes all of its intermediate and end nodes, lets call that function EndNodeProcessor.

The algorithm passes each node it visits to EndNodeProcessor function, in that way the responsibility of traversing the object graph and processing of each node is clearly separated between the recursive algorithm and the EndNodeProcessor function.

EndNodeProcessor’s exact signature depends on what do you want to do with the exact node and what kind of information you need from the algorithm to be passed in, as a minimum you need the nodes to be passed so EndNodeProcessor should take the node as a parameter, in my case I also needed the Type information of the current node, so that’s the second parameter.

The return type of your EndNodeProcessor also defines the return type of the public method that does traversal and node processing. (In my case it is called Flatten).

Lets say EndNodeprocessor takes each visited node as a parameter, processes them and returns an object of type of T back to the recursive algorithm, the algorithm then collects these objects and persists them in a collection. This collection is then returned back from the public api. In my case this collection is Dictionary<string, T>. In my case just an IEnumerable<T> was not enough, I also needed a way to recompose the original object back so I associated each output with a key of string type in a Dictionary <string, T>.

So the signature of EndNodeProcessor is: T EndNodeProcessor (object Node, Type type)

Note that  as mentioned, the Type parameter is something specific to my requirements, as a minimum it needs to take the node as object parameter. The return type T is a pseudo concept, you would need to decide what exact type you want the processor to return and replace the T with that exact type. I specifically left out the actual implementation of the EndNodeProcessor function because what it does with the passed in Node object is specific to the problem you want to solve and is decoupled from the traversal algorithm.

Let’s look at the code:

public static Dictionary<string, T> Flatten(object root)
{

if (root == null) return null;

//Here within the public method we instantiate Dictionary which will hold the output values generated by the NodeProcessor.

Dictionary<key, value> propertyDictionary = new Dictionary<key, value>();

//We create a HashSet to keep track of antecedents during object traversal. This is to detect complete (all the way back to root node) or partial recursive loops (back to an internal node) within object graph and terminate the recursion. If we do not do that  we will end up an infinite loop during recursion, will run out of stack memory and cause a Stack Overflow exception eventually. We like StackOverflow as a web site, we do not want any stack over flows in any other shape or form 🙂

HashSet<object> antecedents = new HashSet<object>(new ObjectReferenceEqualityComparer());

//We will pass the propertyDictionary to the private overloaded recursive method which is also called Flatten, the recursive Flatten method will update this Dictionary by reference. Dictionary is a reference type therefore stored on the heap and so any updates to the dictionary during the recursion will persist after all recursion ends and the private Flatten method returns.

return Flatten(propertyDictionary, root, string.Empty, antecedents);
}

private static bool Flatten(
Dictionary<string, EntityProperty> propertyDictionary,
object current,
string objectPath,
HashSet<object> antecedents)
{

// Recursion Termination Condition: if passed in object is null, return. This heuristically means that we have processed an end node property and we are already at the bottom of the object graph. There is no deeper layer to process.

if (current == null)
{
return true;
}

// If we reach here, that means we are either in the middle of the object graph or we are actually at the deepest layer, at the end node property but we have not yet processed it. We need to make a decision. We first get the type of the current node.

Type type = current.GetType();

// Then we assume we are at an end node property and attempt to process it by passing the current object to our NodeProcessor along with its type.
T output = EndNodeProcessor(current, type);

// If EndNodeProcessor managed to process the current object we infer it was an end node property. This logic could also be put inside the EndNodeProcessor directly.

if (output != null)
{

// We update the passed in PropertyDictionary with the output generated by the NodeProcessor.
propertyDictionary.Add(objectPath, output);

// Recursion Termination Condition: We processed an end node and now we can return from this recursion.

return true;
}

// If we reach this part of the code there are 2 options. Either we are in an intermediate node in the object graph (ie. not an end node property), therefore our EndNodeProcessor returned null. Or we are actually in an end node but the EndNodeProcessor failed to process the request. we need to make a decision. First thing we check is if we are in an intermediate node.

IEnumerable<PropertyInfo> propertyInfos = type.GetProperties();

if (!propertyInfos.Any())
{

// If we are here then the current node does not have any child nodes (properties in our example) of its own then it means that we are actually on an end node property but EndNodeProcessor failed to process that node. This is an unexpected failure so we terminate overall method execution at this point by throwing an exception.
throw new SerializationException(string.Format(CultureInfo.InvariantCulture, UnsupportedPropertyTypeForEntityPropertyConversion, type, objectPath));
}

//Recursive Reference Detection: If we are at this stage in the code we are at an intermediate node in the object graph. We need to make sure we detect recursive loops in the passed in object and avoid going into infinite recursive traversal within this part of the object graph.

bool isAntecedent = false;

// If current type is a value type ie. we are at a struct type intermediate property then there is no danger of recursive reference, recursive reference by definition only happens on reference type s (ie. a child reference type property and its parent are pointing to the same point in memory, therefore creating a circular loop.)

if (!type.IsValueType)
{

// antecedents is a custom HashSet<object> which I initialized with my own EqualityComparer implementation (ObjectReferenceEqualityComparer()).  ObjectReferenceEqualityComparer uses purely the object references to detect object equality and avoids any overloaded GetHashCode() implementations (if any) of input objects. The hashset contains all of the child properties of the current parent property. We update this hashset also recursively as we traverse in the object graph. We add all child nodes under the current node to the hashset as we traverse recursively in this part of the object graph where current node becomes the root.  And we remove all of the nodes from the hashset after we process them.

If any time during this traversal, antecdents.Contains(current) returns true, it means that we saw this reference before within ths sub tree and hence we detected a recursive reference. So we terminate. We could here decide what other things we do with recursive references, may be take an option to alter that (ie. skip those as opposed to terminate the execution etc.). In fact Circular Reference Detection is actually a topic of its own, so I may write up another article just about it to go over that part of the solution
if (antecedents.Contains(current))
{
throw new SerializationException(string.Format(CultureInfo.InvariantCulture, RecursiveReferencedObject, objectPath, type));
}

// we detect that current node is not in our hashset and so we add it to the hashset. We also set isAntecedent flag to true because we are at this stage in the code in an internal node so it is antecedent to child properties.

antecedents.Add(current);
isAntecedent = true;
}

// This is where we apply multiple recursion, we call this private Flatten method on all of the child properties of the current node.

bool success = propertyInfos
.Where(propertyInfo => !ShouldSkip(propertyInfo))
.All(propertyInfo =>
{
return Flatten(
propertyDictionary,
propertyInfo.GetValue(current, index: null),
string.IsNullOrWhiteSpace(objectPath) ? propertyInfo.Name : objectPath + propertyNameDelimiter + propertyInfo.Name,
antecedents);
});

// Once we are here in the code, we completed processing all child nodes (and their child nodes and their child nodes, all the way to their end nodes ) that are under the current property in the object graph. Since we completed the processing all nodes under this sub tree of the object starting from the current node downwards, we can safely remove the current node now from the antecedents hashset, there is no risk of circular reference beyond this ppoint in relatio to the current node.

if (isAntecedent)
{
antecedents.Remove(current);
}

// Here we return this is also the exit point when all intermediate and end node properties under the object root is traversed and we processed the entire object graph successfully.

return success;
}