**Asymptotic complexity and the invisible constant factor**

Today I’m going to explain what lies behind the asymptotic complexity of algorithms when it comes to measuring performance on modern computer hardware. Recall that asymptotic analysis is based on the idealized sequential RAM model. Complexity describes, from a mathematical point of view, how well an algorithm scales as N grows. If two algorithms solving the same task have different complexities, that DOES NOT mean one will necessarily run slower or faster than the other. Computer architecture is more complex than the RAM model: memory hierarchies, branch prediction strategies, prefetching techniques, pipelining and other hardware factors are not taken into account in this model. The constant factors hidden in Big-Oh notation are not fixed but variable, and popular books on algorithms do not always explain this fact clearly. While Big-Oh yields many theoretically useful insights, it cannot by itself predict performance. Read on to find out why.

**1. The constant hidden even in simple arithmetic operations is not a constant at all**

Setting the sequential random-access model against real hardware, let’s try to add two numbers without a calculator, using only our brains: 1+2, 2+10, 34+55. A piece of cake! So far so good. Now try 34523+119987. Feel the difference? The latter takes much longer to work out. This analogy shows where the difference in cost comes from. In the RAM model adding two numbers **always takes constant time**, but on real computer hardware the constant factor varies. On top of this, some instructions are slower than others: take addition and division as an example. Both are O(1), yet their constant factors differ considerably.
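The mental-arithmetic analogy can be sketched in code. Below is a toy grade-school adder (illustrative only, not how hardware actually adds) whose step count grows with the number of digits, even though "add two numbers" is a single constant-time operation in the RAM model:

```python
def schoolbook_add(a: str, b: str) -> tuple[str, int]:
    """Add two non-negative decimal numbers digit by digit,
    returning the sum and the number of single-digit steps taken."""
    a, b = a.zfill(len(b)), b.zfill(len(a))  # pad to equal length
    carry, digits, steps = 0, [], 0
    for da, db in zip(reversed(a), reversed(b)):
        s = int(da) + int(db) + carry        # one "unit" of mental work
        carry = s // 10
        digits.append(str(s % 10))
        steps += 1
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits)), steps

print(schoolbook_add("1", "2"))            # ('3', 1): one digit, one step
print(schoolbook_add("34523", "119987"))   # ('154510', 6): six digits, six steps
```

The number of steps grows with operand length, which is exactly why 34523+119987 feels harder than 1+2 despite both being "one addition".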

**2. Some algorithms require more instructions than others**

For instance, some algorithms execute fewer instructions per pass (copies, exchanges, shifts, and so on). This, too, contributes to the hidden constant, which is why such algorithms tend to run faster in some cases.

**3. Computer memory hierarchy gets in the way of analysis**

In complexity analysis, computer memory is treated as a linear array with uniform access times, and the superior algorithm is the one that executes fewer instructions. In reality, the assumption that every memory access has equal cost does not hold.
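A rough sketch of this effect, using nothing beyond the standard library: summing the same list in sequential versus shuffled order executes essentially identical instructions but touches memory in very different patterns. (CPython’s interpreter overhead makes the gap smaller than it would be in C, and exact timings are machine-dependent.)

```python
import random
import time

N = 2_000_000
data = list(range(N))

seq_order = list(range(N))      # sequential indices: cache-friendly
rand_order = seq_order[:]
random.shuffle(rand_order)      # scattered indices: cache-hostile

def timed_sum(order):
    """Sum data in the given index order, returning (total, seconds)."""
    start = time.perf_counter()
    total = 0
    for i in order:
        total += data[i]
    return total, time.perf_counter() - start

seq_total, seq_t = timed_sum(seq_order)
rand_total, rand_t = timed_sum(rand_order)
assert seq_total == rand_total  # same instruction count, same result
print(f"sequential: {seq_t:.3f}s  shuffled: {rand_t:.3f}s")
```

Instruction counting alone predicts equal running times here; any measured difference comes purely from the memory access pattern.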

**Conclusion**

I haven’t even mentioned compilers and their optimizations. All of these factors form the hidden constant in Big-Oh notation that muddies the analysis. Asymptotic analysis is built upon a mathematical model that is machine-independent and therefore fragile: hidden constants can impact an algorithm’s practical scalability very heavily. Because of this, performance benchmarks may yield the opposite result, and on some data an algorithm with better complexity may run much slower than its supposedly inferior rival. The most vicious enemy is the memory hierarchy of modern CPUs. Yes, Big-Oh still matters, but you should handle it with care. In the next post I’ll try to show this with real examples.

In my book (Soubhik Chakraborty and Suman Kumar Sourabh, “A Computer Experiment Oriented Approach to Algorithmic Complexity”, Lambert Academic Publishing, 2010), as well as in several papers and reviews, I have talked about conceptually building a weight-based statistical bound that allows mixing operations of different types, and then estimating the bound over a finite range by running computer experiments in which numerical values are supplied to these weights. For example, the time of an operation (CPU time) can be taken as its weight. The credibility of the statistical bound estimate (I have called it empirical O, written as O with a subscript emp) depends on the proper design and analysis of the computer experiment, in which the response variable is a complexity such as time.

Interesting: I’ve seen some papers on the net similar to the ones you describe above. Unfortunately, those methods are almost unknown to the public.

Here are some papers where I have talked about statistical bounds and their empirical estimates:

1. Chakraborty, S. and Sourabh, S. K., On Why an Algorithmic Time Complexity Measure can be System Invariant rather than System Independent, Applied Mathematics and Computation, Vol. 190(1), 2007, pp. 195-204 (Elsevier Science)

2. Singh, N. K. and Chakraborty, S., Partition Sort and its Empirical Analysis, CIIT-2011, CCIS 250, pp. 340-346, 2011 (Springer-Verlag Heidelberg)

3. Singh, N. K. and Chakraborty, S., A Statistical Approach to the Relative Performance Analysis of Sorting Algorithms in Computational Science, Engineering and Information Technology, ACM International Conference Proceedings, edited by N. Meghanathan and M. Wozniak, CCSEIT-2012, Coimbatore, India, pp. 57-62, 2012

4. Chakraborty, S., Modi, D. N. and Panigrahi, S., Will the Weight-based Statistical Bounds Revolutionize the IT?, International Journal of Computational Cognition, Vol. 7(3), 2009, pp. 16-22, http://www.yangsky.com/ijcc/pdf/ijcc731.pdf (Yang’s Scientific Research Institute, USA)

5. Singh, N. K. and Chakraborty, S., Partition Sort versus Quick Sort: A Comparative Average Case Analysis with Special Emphasis on Parameterized Complexity, in N. Meghanathan et al. (eds.), Advances in Computing and Information Technology, AISC 177, pp. 107-113, Springer-Verlag Berlin Heidelberg, 2012

…and here are some reviews where statistical bound estimates have been briefly discussed:

1. Chakraborty, S., Review of the book Design and Modeling of Computer Experiments by K. T. Fang, R. Li and A. Sudjianto (Chapman and Hall, 2006), published in Computing Reviews, Feb 12, 2008

2. Chakraborty, S., Review of the book Computational Complexity: A Conceptual Perspective (1st Ed.) by O. Goldreich (Cambridge University Press, N.Y., 2008), published in Computing Reviews, April 27, 2009

3. Chakraborty, S., Review of the book Methods in Algorithmic Analysis by V. Dobrushkin (Chapman and Hall, 2009), published in Computing Reviews, June 11, 2010