I am a fan of Foote and Yoder’s Big Ball of Mud paper – a fan in the sense that I think it is a correct observation of how most software is developed, though I do not like that fact.

So I have been pondering this – donning my amateur physicist’s hat, I have contemplated some of the issues.

## Turnaround

Let’s say that we have a team of developers pushing the project forward. Let’s call them Sisyphus. They have a development velocity, v. The ball of mud has a radius, r, which is comparable to the number of features in the product. The angular velocity, ω, is proportional to the frequency of revolution – let’s denote this as the version number.

ω = v/r

That is, as the radius grows, then – given the velocity is constant – the angular velocity reduces, which in turn means that the versions will be harder to crank out.

## Work

Work, W, is force, F, times the distance, s, which in turn is the velocity times the time passed.

The force is mass times acceleration, and the mass is density, ρ, times the volume, V.

W = F Δs

Δs = v t

F = m a = ρV a

The volume of a sphere is V = 4π r^{3}/3

Thus the work is proportional to the radius of the sphere cubed.
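Putting the pieces together – holding the acceleration a, the velocity v, the time t, and the density ρ constant – the derivation can be sketched as:

```latex
W = F\,\Delta s
  = \rho V a \cdot v t
  = \rho \left(\tfrac{4}{3}\pi r^{3}\right) a v t
  \;\propto\; r^{3}
```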

### The interpretation

If something requires twice the number of features, then the work required will be 2^{3} = 8 times greater or run at 1/8 the velocity.

If – on the other hand – this could be done in two disjoint balls of mud, it would only require twice the amount of work.

More generally, we could probably divide the features of the big ball of mud into two smaller balls having different radii, R and r.

The volume of the original ball with radius R+r will be:

V = 4π/3 (R+r)^{3} = 4π/3 (R^{3} + 3R^{2}r + 3Rr^{2} + r^{3})

With the volumes of each of the lesser balls as:

V_R = 4π/3 R^{3} and V_r = 4π/3 r^{3}

Their combined volume is effectively smaller than that of the larger ball, as the cross terms 3R^{2}r + 3Rr^{2} vanish. With R ≥ r these terms amount to at least 6r^{3}, so we are actually saving more than 6 times the volume of the smallest ball.

If we split the big ball into 80% and 20% – which is likely – then the combined volumes of those two smaller balls would save 48% of the original volume.
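As a quick sanity check of the arithmetic (a sketch, with the radii normalised so that the original ball has radius 1 and the common 4π/3 factor dropped):

```python
# Splitting a ball of radius 1 into two balls with an 80/20 split of the radius.
R, r = 0.8, 0.2

original = (R + r) ** 3        # volume of the single big ball
split = R ** 3 + r ** 3        # combined volume of the two smaller balls
savings = original - split     # the cross terms 3R²r + 3Rr² that disappear

print(f"{split:.2f} of the original volume, saving {savings:.0%}")
```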

If volume is comparable to complexity, then we will save complexity and development effort in keeping the balls as small as possible. Nothing new there.

This is just another argument for low coupling and high cohesion. But it is also a reminder that we must look at our code from a different perspective from time to time. Are classes adhering to the Single Responsibility Principle?

## Disjoint Communication

If the Big Ball of Mud is split into several smaller balls, they would probably need to communicate. This would have been some of the intrinsic complexity we shaved off.

Let us try to add it in the most stupid form possible: a complete graph, in which each and every ball knows about and communicates with each and every other ball.

Renaming the radii so that R is the radius of the original Big Ball of Mud, divide it into n = R/r balls of approximately the same radius, r. Then, taking the added complexity into account, we get n(n-1)/2 additional entanglements. If each entanglement contributes complexity comparable to the volume of one small ball, this leaves us with:

V = n · 4π/3 (R/n)^{3} + n(n-1)/2 · 4π/3 (R/n)^{3} = 4π/3 R^{3} (n+1)/(2n^{2})

which is about the original volume divided by 2n.
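A quick numerical check of that approximation (a sketch; costing each link at one small-ball volume is my assumption, and the 4π/3 factor is dropped):

```python
# Relative complexity of splitting a unit ball into n equal balls of radius 1/n,
# with complete-graph communication costed at one small-ball volume per link.
def relative_complexity(n):
    balls = n * (1 / n) ** 3                # n balls, each of volume (1/n)^3
    comm = n * (n - 1) / 2 * (1 / n) ** 3   # n(n-1)/2 links between them
    return balls + comm                     # = (n+1)/(2n^2), roughly 1/(2n)

for n in (2, 4, 8, 16):
    print(n, relative_complexity(n), 1 / (2 * n))
```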

As a complete graph is just the big ball of mud inverted, note that a minimum of n-1 connections (a spanning tree) suffices for a fully connected graph. That is, the refactored ball of mud could – at best – save an additional factor of n.

### Different distributions

What if the balls cannot be split using an equal distribution? Perhaps we should contemplate a more natural distribution such as a power law.

#### Doubling down

Divide the original radius into n different sizes, each half the length of the previous but with twice as many balls.

Dividing into n=4 different sizes we get:

1·1/4 + 2·1/8 + 4·1/16 + 8·1/32 = 4·1/4 = 1

That is, we get 2^{4}-1 = 15 balls, the biggest of radius R/4.

The general formula for volume per size is now:

V(c,r) = 4/3 π c r^{3}

Where c is the number of balls with the radius, r.

As the communication is no longer of equal size, we have to do some more work. The rectangle spanned by the two communicating parties’ radii seems a fitting proposition.

We end up with the following formula for the added complexity due to full communication needs, summing the rectangle over every pair of balls:

Vcomm = Σ_{i<j} r_{i} r_{j}

As R is a factor of all radii, we see that Vcomm is an area in R^{2}, whereas the volume, V, is a volume in R^{3}. This means that we shouldn’t be allowed to add them, but as we’re not concerned with this type of detail we add them anyway – noticing that the effect is dependent upon the initial radius, R.

Using the relative complexity as the yardstick, we can plot the effect of R and n. Remembering that we’re getting 3 balls for n=2, 7 balls for n=3, and 15 balls for n=4.

From the graph it looks as though the initial radius doesn’t matter as much.

Furthermore, it seems that the best value would be to split into 2 or 3 differently sized smaller balls. Going beyond that seems futile, as the gain from splitting the largest remaining ball would probably be bigger – given that the initial radius doesn’t matter as much.

Without full communication needs the relative complexity would be even lower.

## Conclusion

While it is human nature to hoard development into a single monolithic big ball of mud, it does pay off to break the monolith down to a certain degree – whether for sanity, savings, or speed.

The notion of equal distribution – which corresponds to dissection into primitive obsession – is luckily reverted by the doubling-down distribution, which is likely to be a more natural division of elements.

It does seem counterintuitive to work at a constant pace and still grind to a halt. Yet this seems to be a logical conclusion of the above calculations. If not to a complete standstill, then at least turning so slowly that it seems like a complete stop.

I didn’t consider friction in the above calculations. Friction would add to the forces at play, and thus to the work – but not in a positive way.

The question now is: Will a Big Ball of Mud collapse under its own weight if not cared for?