Tutorials

Architecture of a Modern Computer

Although most bioinformaticians use a variety of software packages for data analysis, those programs cannot be designed or used efficiently without understanding the properties of the underlying machine. The next picture shows a computer as seen by a user. The user interacts with it through a keyboard and a monitor, but the real action takes place inside the large featureless box.

How does the computing machine process a user’s requests? The answer is best understood by first looking at the circuitry inside a computer. A simplified description of the circuitry focuses on two components: the hard drive and the motherboard. The hard drive, drawn as a cylindrical unit, stores all of the user’s data, whereas the motherboard contains the chips that perform the computing. From time to time, data are pulled from the hard drive into the motherboard and analyzed, and the results are then sent back to the hard drive for permanent storage.

Within the motherboard, the same pattern repeats in the form of memory (RAM) and the processor. The memory is where data are stored temporarily after being brought in from the hard drive, whereas the processor performs the computing. A user who could look inside the processor would find the same fractal pattern at a smaller scale: storage in the form of cache memory, and computing units called arithmetic logic units (ALUs). Modern multi-core processors take the pattern one step further by incorporating a cache inside each core in addition to the global cache memory of the processor. Finally, the ALU itself is designed with registers for holding data and a core logic block for number crunching.
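The nesting described above can be written down as a simple ordered list. The Python sketch below is only an illustration: the level names and the helper functions `path_to_alu` and `hops_to_alu` are hypothetical labels chosen for this example, and the model assumes, as a simplification, that a value passes through every level on its way to the ALU.

```python
# Storage levels ordered from the ALU outward, following the text above.
# The names are illustrative labels, not hardware identifiers.
HIERARCHY = [
    "registers",     # inside the ALU
    "core_cache",    # cache private to one core
    "shared_cache",  # global cache shared by all cores of the processor
    "ram",           # memory on the motherboard
    "hard_drive",    # permanent storage, farthest from the ALU
]


def path_to_alu(level):
    """Return the levels a value passes through on its way to the ALU."""
    index = HIERARCHY.index(level)
    # Walk inward, from the given level down to the registers.
    return HIERARCHY[index::-1]


def hops_to_alu(level):
    """Count the moves between adjacent levels needed to reach the registers."""
    return HIERARCHY.index(level)


if __name__ == "__main__":
    print(" -> ".join(path_to_alu("hard_drive")))
    print(hops_to_alu("ram"))
```

In this simplified picture, a value stored on the hard drive makes four moves before the ALU can use it, while a value already in a register makes none.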

When you see this repetition of storage and processing units at various scales, you can think of the computing machine as a solar system. The ALU (shown in black) sits at the center and performs all computing, whereas the hard drive is located at the farthest point from the ‘sun’. In that picture, computing becomes a repeated cycle of dragging data from storage to the ALU and then sending the results back to the hard drive for permanent storage.

Suppose a user wants to add two numbers, 5 and 7. The numbers are stored on the hard drive, which is far, far away from the ALU. To add them once, it may make sense to carry the numbers to the ALU, do the addition, and send the result, 12, back to the hard drive. If that addition is immediately followed by adding 3 to the sum, however, 12 has to be carried from the hard drive all the way to the ALU again. Then, after the addition of 3, the result is sent to the hard drive once more. That is not efficient at all, because communication between the hard drive and the motherboard is relatively slow.

Therefore, a central goal of efficient algorithm design is to minimize the total time that data spend moving between places by making intelligent use of temporary storage such as RAM, cache and registers. Naturally, an algorithm that works well with a slow hard drive may not be the best choice if that drive is replaced with more expensive but faster storage, such as a solid-state drive (SSD).
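The 5 + 7 + 3 example can be turned into a toy cost model. The Python sketch below is not a simulation of real hardware: the `ToyMachine` class, its method names and the values in `TRANSFER_COST` are all assumptions made up for illustration. It only counts how much "travel" each strategy causes when every transfer to or from the hard drive is assumed to be far more expensive than a transfer to or from a register.

```python
# Relative cost of one transfer between the ALU and each storage level.
# These numbers are made-up assumptions for illustration, not measurements.
TRANSFER_COST = {"register": 1, "hard_drive": 1_000_000}


class ToyMachine:
    """Hypothetical machine that charges a cost for every load and store."""

    def __init__(self, hard_drive):
        self.levels = {"register": {}, "hard_drive": dict(hard_drive)}
        self.cost = 0

    def load(self, level, name):
        """Bring a value from a storage level to the ALU and charge for it."""
        self.cost += TRANSFER_COST[level]
        return self.levels[level][name]

    def store(self, level, name, value):
        """Send a value from the ALU to a storage level and charge for it."""
        self.cost += TRANSFER_COST[level]
        self.levels[level][name] = value


def add_via_hard_drive():
    """Send every intermediate result back to the hard drive."""
    m = ToyMachine({"a": 5, "b": 7, "c": 3})
    m.store("hard_drive", "sum", m.load("hard_drive", "a") + m.load("hard_drive", "b"))
    m.store("hard_drive", "total", m.load("hard_drive", "sum") + m.load("hard_drive", "c"))
    return m.levels["hard_drive"]["total"], m.cost


def add_via_register():
    """Keep the intermediate sum in a register; only the final result goes back."""
    m = ToyMachine({"a": 5, "b": 7, "c": 3})
    m.store("register", "sum", m.load("hard_drive", "a") + m.load("hard_drive", "b"))
    m.store("hard_drive", "total", m.load("register", "sum") + m.load("hard_drive", "c"))
    return m.levels["hard_drive"]["total"], m.cost


if __name__ == "__main__":
    print(add_via_hard_drive())
    print(add_via_register())
```

Both functions arrive at 15. Under the assumed costs, the first strategy makes six hard-drive transfers (cost 6000000), whereas the second makes four hard-drive transfers and two register transfers (cost 4000002). The exact numbers mean nothing; the point is that keeping the intermediate sum close to the ALU removes two of the slow trips.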