Marc Snir presented a high-level picture of the current status of technologies for supercomputing, the current challenges threatening the continuation of advancements in CMOS-based components and the possible tracks for coping up with such challenges in his talk outlined in [Snir, 2014]. Among the top challenges in CMOS-based manufacturing of integrated circuits (IC) are limitations in increasing the clock speed of systems implemented using current hardware implementations, lack of proven commercial grade technologies which could readily replace current computing technologies, limitations on manufacturing technologies for producing IC components with ever decreasing dimension which do not necessarily promise considerable increase in performance, and the demand for higher energy supply to power up proposed advancements in CMOS-based technologies. Some possible future tracks for supercomputing technologies mentioned by the author which could pave the way for a new Moore’s observation are new models of computing which are massively task-parallel in nature, esoteric computing device technologies which currently has a non-commodity status, further advancements in the software domain, i.e. algorithms for optimization, data structures, software architectures, task management strategies.
Investigating new computing models for long term solution
A good balance between the research and development in long term solutions and short term patches is key to securing continuous advancements in solving hard computational problems and a continuous healthy economy which rely on computing devices. The investigation of new computing models is one long term approach to solving the current limitations of Turing model-based computing devices. One unconventional computing model actively investigated is the quantum computing model. The investigation of this model can be attributed to two motivations. The first motivation is as a response to the steady decrease in dimension of IC components. Current manufacturing technologies can now produce ICs with billions of transistors in the scale of a few nanometers [Courtland, 2013, Cavin et al., 2012]. As the dimension of transistors decrease, quantum mechanical effects such as interference between nanoscale components become more highly manifest. This effects could affect the stability of transistors during information processing which could result to variable outputs. Such effects are also investigated when proposing new designs for computing components in the nanoscale level [Smith et al., 2016]. Quantum computing aims to utilize some of these effects, i.e. quantum interference, linear superposition of states, entanglement of quantum states, to gain speedup in solving computational problems [Nielsen and Chuang, 2010]. The second motivation comes from the need for simulating quantum mechanical systems. In [Feynman, 1982] physicist Richard Feynman posed the question of how to simulate quan- tum mechanical systems and in which he also proposed to use a computing device which exhibit quantum mechanical effects. Simulation of quantum mechanical systems on Von- neumann model-based computers will require an exponential amount of time due to the exponential increase in computational paths with respect to the number of quantum bits. In [Feynman, 1985] Feynman presented a proposed schema of a quantum mechanical sim- ulator. Computation on a quantum computer is parallel in nature due to the possibility of quantum mechanical systems to be in a linear superposition of all of its possible states. Quantum operators on the other hand are linear which implies that an operation applied to a quantum mechanical system in a linear superposition state equates to operation on the system’s individual states. Early fundamental quantum algorithms which provided sub-linear to exponential speedup relied heavily on operations on quantum superposition states [Deutsch and Jozsa, 1992, Coppersmith, 1994, Shor, 1994, Grover, 1996]. The mas- sive parallelism in the quantum computing model makes the model’s simulation (at least computation wise) suitable to be implemented as general purpose computation on graphics processing units (GPGPU). Data-wise, efficient simulation of computation in a quantum computer will require massive memory for storing exponential amount of data with low latency as not to cancel the speedup in computation. One caveat in quantum computing is the susceptibility of quantum mechanical systems to the destructive effect of decoherence [Chuang et al., 1995, Zurek, 2002]. A quantum system in a superposition state tends to decohere into a definite classical state in a very short amount time. The decoherence time of a system further decreases as the number of quantum bits in the system increases. This limits the scaling and operation time of a quantum computer. Different technological implementations of a quantum computer also differs in their decoherence time [Hughes et al., 2004].
Another path currently being investigated for possible future alternative computing device technology is biological computing. This model focuses on computing using biological structures such as unicellular organisms, DNA and ATP. Wet lab investigations aim to discover if these structures are truly capable of performing information processing with high energy efficiency and can truly store large amount of information. An E.coli cell has been investigated in comparison to a CMOS-based information processor of same scale (1 micrometer) in terms of energy efficiency and data storage capacity [Cavin et al., 2012]. The E.coli cell utilizes its DNA as long-term memory (data storage), some proteins and RNA molecules for short-term and logic units, receptor proteins for sensing both outside and inside the cell, and ribosomes for outputting computation result and communicating with other cells in the population. The analysis has identified that the logic units of an E.coli cell has count of at least a million. Accesses to its memory (DNA) can also be executed in parallel where data retrieved can be processed also in parallel. An E.coli cell thus support massive parallel computation. It would be interesting to simulate the computation of an E.coli cell as a computer using GPGPU for the parallel computation and the CPU for the control flow. It will be a more complex task to simulate the operation of an E.coli cell as its biological operating system also supports inter-cellular communication simultaneous to its internal operation. Data re- ceived from the population also serve as parameter to its internal operation. Some challenges that comes to mind in implementing a full-scale computer using a population of unicellular organisms such as E.coli cells is the control of each cell for computation and the control of intercellular communication for cooperative computation. It is interesting to investigate if a master-slave architecture can be applied to such unicellular organism population-based computer wherein a singe cell will act as a master for facilitating the computation and dis- tribution of tasks to its slave cells. Slave cells will mainly act as parallel computing devices which could interact with other cells in the population.
One recent study in biocomputing is that of D. Nicolau et. al in [Nicolau et al., 2016]. Their group was able to synthesize a book-sized biological supercomputer which utilizes a network of lithographically constructed channels for encoding combinatorial problems. Computation in this biocomputer is implemented using independent biological agents which can explore the synthesized network in a massively parallel fashion. In the study, they used the following criteria in identifying candidate biological agents: (1) can be produced in large quantity, low cost, (2) self-propelled, (3) operates independently for exploration of the network, (4) with small dimensions for use in high-density networks (large problem instances), and (5) move rapidly for fast computation. They investigated the effectiveness and efficiency of the biocomputer in solving a small instance of the Subset Sum problem (SSP) which is an NP-Complete problem. Results of the study has shown that the SSP can be solved using their biocomputer in polynomial time only. The results of the study is interesting such that technologies for synthesizing the network is currently available, the independent agents can be produced in a cost-effective manner, and that the proposed physical implementation of a biocomputer can solve an NP-Complete problem with massive parallelism. The proposed biological supercomputer may be deemed as an alternative to a quantum computer which is currently plagued with challenges with regards to its scalability and stability in carrying out computation.
Thoughts about the paper
The paper in [Snir, 2014] provides a good outline of the current computing device technologies and the challenges which hampers their advancement in terms of computing, energy and cost efficiency. An interesting point in the paper is the question of whether we really need continuous increase in supercomputing performance. The answer to this question may be influenced by several factors like ever occurring computational challenges which hit the performance wall of current technologies, the continuing search for polynomial-time solutions to exponentially hard computational problems and the ever striving economy of computing devices. It is of my utmost interest to witness the approach of a shift from the current computing models to new computing models with unexpected physical implementations, just like the shift from the mechanical computer to CMOS-based computer.
[Cavin et al., 2012] Cavin, R. K., Lugli, P., and Zhirnov, V. V. (2012). Science and Engineer- ing Beyond Moore’s Law. Proceedings of the IEEE, 100(Special Centennial Issue):1720–1749.
[Chuang et al., 1995] Chuang, I. L., Laflamme, R., Shor, P. W., and Zurek, W. H. (1995). Quantum computers Quantum Computers, Factoring, and Decoherence. Science, 270(5242):1633–1635.
[Coppersmith, 1994] Coppersmith, D. (1994). An Approximate Fourier Transform Useful in Quantum Factoring. Technical report, IBM Research Division.
[Courtland, 2013] Courtland, R. (2013). The Status of Moore’s Law: It’s Complicated.
[Deutsch and Jozsa, 1992] Deutsch, D. and Jozsa, R. (1992). Rapid Solution of Problems by Quantum Computation. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences, 439(1907):553–558.
[Feynman, 1982] Feynman, R. (1982). Simulating Physics with Computers. International Journal of Theoretical Physics, 21(6/7):467–488.
[Feynman, 1985] Feynman, R. (1985). Quantum Mechanical Computers. Foundations of Physics, 16(6):507–531.
[Grover, 1996] Grover, L. K. (1996). A fast quantum mechanical algorithm for database search. In Proceedings, STOC 1996, pages 212–219.
[Hughes et al., 2004] Hughes, R., Doolen, G., Awschalom, D., Caves, C., Chapman, M., Clark, R., Cory, D., DiVincenzo, D., Ekert, A., Hammel, P. C., Kwiat, P., Lloyd, S., Milburn, G., Orlando, T., Steel, D., Vazirani, U., Whaley, K. B., and Wineland, D. (2004). A quantum information science and technology roadmap.
[Nicolau et al., 2016] Nicolau, D. V., Lard, M., Korten, T., van Delft, F. C. M. J. M., Pers- son, M., Bengtsson, E., M ̊ansson, A., Diez, S., Linke, H., and Nicolau, D. V. (2016). Parallel computation with molecular-motor-propelled agents in nanofabricated networks. Proceedings of the National Academy of Sciences of the United States of America, 113(10):2591–6.
[Nielsen and Chuang, 2010] Nielsen, M. A. and Chuang, I. L. (2010). Quantum Computation and Quantum Information. Cambridge University Press, Cambridge, 1st edition.
[Shor, 1994] Shor, P. W. (1994). Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer. In Proceedings of the 35th Annual Symposium on Foundations of Computer Science, pages 124–134.
[Smith et al., 2016] Smith, L. W., Al-Taie, H., Lesage, A. A. J., Thomas, K. J., Sfigakis, F., See, P., Griffiths, J. P., Farrer, I., Jones, G. A. C., Ritchie, D. A., Kelly, M. J., and Smith, C. G. (2016). Effect of Split Gate Size on the Electrostatic Potential and 0.7 Anomaly within Quantum Wires on a Modulation-Doped <math display=”inline”> <mrow> <mi>GaAs</mi> <mo>/</mo> <mi>AlGaAs</mi> </mrow> </math> Heterostructure. Physical Review Applied, 5(4):044015.
[Snir, 2014] Snir, M. (2014). The Future of Supercomputing. In Proceedings of the 28th ACM international conference on Supercomputing – ICS ’14, volume 25, pages 261–262, New York, New York, USA. ACM Press.
[Zurek, 2002] Zurek, W. H. (2002). Decoherence and the Transition from Quantum to Classical Revisited. Los Alamos Science, (27):1–24.