Understanding hardware infrastructure requires a different analytical mindset than designing software. Algorithms are abstractions of the underlying computing system and are usually presented in a mathematical language. In contrast, hardware is described in physical terms, and factors such as physical location, type of interconnect, and speed all come into consideration when modeling the performance gained by replacing or upgrading hardware.
In the hardware world, the best way to visualize a complex data-intensive computation is through an analogy with the flow of water. During the course of computation, data flows from its original location through millions of intelligent pipes and channels and is gradually morphed into the output. A carefully crafted solution removes barriers and creates as many parallel channels as possible, so that data can flow very rapidly to its final form.
For example, RAM is much closer to the ALU, and the programmer can reduce the time spent moving numbers back and forth between the hard drive and the ALU by temporarily copying them into RAM. On the other hand, RAM is expensive, so it is not possible to have as much of it as there is hard-drive space. RAM therefore needs to be used efficiently, and that is where clever algorithms and data structures help.
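As a minimal sketch of this idea (the file name data.bin and the chunk size are assumptions for illustration), the C program below sums numbers stored on disk by reading them into a RAM buffer in large chunks rather than one value at a time, so the ALU mostly works on data that is already in memory:

```c
#include <stdio.h>
#include <stdlib.h>

/* Sum doubles stored in a binary file. Each fread() pulls many values
 * across the slow disk-to-RAM boundary at once, so the inner loop
 * operates entirely on numbers already sitting in RAM. */
int main(void) {
    FILE *fp = fopen("data.bin", "rb");   /* hypothetical input file */
    if (!fp) { perror("fopen"); return 1; }

    enum { CHUNK = 1 << 16 };             /* 65,536 doubles per disk read */
    double *buf = malloc(CHUNK * sizeof *buf);
    if (!buf) { fclose(fp); return 1; }

    double sum = 0.0;
    size_t n;
    while ((n = fread(buf, sizeof *buf, CHUNK, fp)) > 0) {
        for (size_t i = 0; i < n; i++)    /* data is now in RAM */
            sum += buf[i];
    }

    printf("sum = %f\n", sum);
    free(buf);
    fclose(fp);
    return 0;
}
```

Reading one value per fread() call would produce the same answer but force far more trips to the slow storage layer; the buffer is exactly the "copy it into RAM first" strategy described above.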
RAM is still not fast enough, and processing speed can be increased further by adding even more expensive cache memory closer to the ALU. However, because cache is expensive, it is smaller than RAM and holds an even smaller amount of data. The registers within the CPU are the fastest of all and can be used cleverly to speed up processing dramatically. However, registers hold only a tiny amount of data, and optimizing their use by hand is very difficult, so most programmers leave that part to the compiler.
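A common illustration of how cache behavior affects speed (this example is a sketch, not taken from the text) is summing the same matrix in two different traversal orders. The row-major version walks through memory sequentially and reuses each cache line it fetches; the column-major version jumps across memory and touches a new cache line on almost every access, forcing many more trips back to RAM for the same arithmetic. In both versions the running sum is kept in a register by the compiler.

```c
#include <stdio.h>

#define N 1024

static double a[N][N];

/* Row-major traversal: consecutive accesses are adjacent in memory,
 * so each cache line fetched from RAM is fully used. */
double sum_row_major(void) {
    double s = 0.0;                      /* kept in a register by the compiler */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

/* Column-major traversal: consecutive accesses are N doubles apart,
 * so almost every access lands on a different cache line. */
double sum_col_major(void) {
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];
    return s;
}

int main(void) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] = 1.0;
    /* Identical results, very different cache behavior. */
    printf("%f %f\n", sum_row_major(), sum_col_major());
    return 0;
}
```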
Overall, one can make code efficient by visualizing the entire movement of data from the hard disk through main memory, caches, and registers and back to disk, and by thinking about how the algorithm operates within that framework.