Distributed shared memory is a memory architecture in which the memories are physically separate but can be addressed as one logically shared address space. Shared memory, by contrast, is a memory architecture in which the memory is physically shared and is addressed as one shared address space.
There are two issues to consider regarding the terms shared memory and distributed memory. One is what they mean as programming abstractions, and the other is what they mean in terms of how the hardware is actually implemented.
In the past there were true shared memory cache-coherent multiprocessor systems. The processors communicated with each other and with shared main memory over a shared bus. This meant that any access from any processor to main memory would have equal latency. Today these types of systems are not manufactured. Instead there are various point-to-point links between processing elements and memory elements (this is the reason for non-uniform memory access, or NUMA). However, the idea of communicating directly through memory remains a useful programming abstraction, so in many systems the sharing is handled by the hardware and the programmer does not need to insert any special directives. Some common programming techniques that use this abstraction are OpenMP and Pthreads.
Distributed memory has traditionally been associated with processors performing computation on local memory and then, once that is done, using explicit messages to exchange data with remote processors. This adds complexity for the programmer, but simplifies the hardware implementation because the system no longer has to maintain the illusion that all memory is actually shared. This type of programming has traditionally been used with supercomputers that have hundreds or thousands of processing elements. A commonly used technique is MPI.
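The "compute on local memory, then exchange explicit messages" pattern can be sketched roughly as follows. This is only an illustration, not MPI: Python threads and `queue.Queue` objects stand in for separate processors and the network, and the names `worker` and `sum_of_squares` are made up for this example.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    # Receive this worker's private chunk as an explicit message.
    chunk = inbox.get()
    # Compute only on local data; nothing here is shared with other workers.
    partial = sum(x * x for x in chunk)
    # Send the partial result back as an explicit message.
    outbox.put((rank, partial))


def sum_of_squares(data, n_workers=4):
    # One inbox per worker plays the role of a point-to-point link (assumption).
    inboxes = [queue.Queue() for _ in range(n_workers)]
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], results))
        for rank in range(n_workers)
    ]
    for t in threads:
        t.start()
    # Scatter: each worker gets its own copy of one slice of the data.
    for rank in range(n_workers):
        inboxes[rank].put(list(data[rank::n_workers]))
    # Gather: collect exactly one message per worker, in whatever order they arrive.
    partials = dict(results.get() for _ in range(n_workers))
    for t in threads:
        t.join()
    # Reassemble the partial results into one solution.
    return sum(partials.values())
```

Calling `sum_of_squares(range(10))` splits the ten numbers across four workers and adds up the four partial sums. The extra bookkeeping (scatter, gather, message tags such as `rank`) is the programmer-side complexity mentioned above.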
However, supercomputers are not the only systems with distributed memory. Another example is GPGPU programming, which is available for many desktop and laptop systems sold today. Both CUDA and OpenCL require the programmer to explicitly manage sharing between the CPU and the GPU (or other accelerator in the case of OpenCL). This is largely because when GPU programming started, the GPU and CPU memories were separated by the PCI bus, which has a very long latency compared to performing computation on the locally attached memory. So the programming models were developed assuming that the memory was separate (or distributed) and that communication between the two processing elements (CPU and GPU) had to be explicit. Now that many systems have GPU and CPU elements on the same die, there are proposals to give GPGPU programming an interface that is more like shared memory.
One option uses a single address space. Systems based on this concept, otherwise known as shared-memory systems, allow processor communication through variables stored in a shared address space.
The other alternative employs a scheme in which each processor has its own memory module. Such a distributed-memory system (cluster) is constructed by connecting the components with a high-speed communications network. Processors communicate with each other over the network.
The architectural differences between shared-memory systems and distributed-memory systems have implications for how each is programmed. With a shared-memory multiprocessor, different processors can access the same variables. This makes referencing data stored in memory similar to traditional single-processor programs, but adds the complexity of keeping shared data consistent. A distributed-memory system introduces a different problem: how to distribute a computational task to multiple processors with distinct memory spaces and reassemble the results from each processor into one solution.
Distributed computing is a way of combining the processing power of thousands of small computers (e.g., PCs) to solve very complex problems that are too large for traditional supercomputers, which are very expensive to build and run.
In shared memory systems, communication of data values between processors is by way of memory, supported by hardware in the memory interface. Interfacing many processors may lead to long and variable memory latency. The distinguishing characteristic of distributed shared memory is that communication is done in software by data transmission instructions, so that the machine-level instruction set has send/receive instructions as well as read/write. The long and variable latency of the interconnection network is not associated with memory and may be masked by software that assembles and transmits long messages. However, to move an intermediate datum from its producer to its consumer, a distributed shared memory machine ideally sends it to the consumer as soon as it is produced, while a shared memory system stores it in memory to be picked up by the consumer when it is needed.
Shared memory can be a useful programming model for multithreaded programs. The threads run on the same machine and all communicate through a common address space.
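A minimal sketch of threads communicating through a common address space, using Python's `threading` module rather than Pthreads or OpenMP. The function name `count_with_threads` and the `counter` dictionary are made up for this illustration. The lock is there because every thread reads and writes the same variable.

```python
import threading


def count_with_threads(n_threads=4, increments=10_000):
    # One object that lives in the address space shared by every thread.
    counter = {"value": 0}
    # Guards the read-modify-write below so that no update is lost.
    lock = threading.Lock()

    def work():
        for _ in range(increments):
            with lock:
                counter["value"] += 1

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # No messages were exchanged: the threads cooperated purely through memory.
    return counter["value"]
```

Here no data is ever sent anywhere; each thread simply updates the same variable, and the only extra work for the programmer is protecting that variable from concurrent updates.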
Distributed shared memory applies to separate machines that have no physically shared memory. It uses software techniques to let these machines share (what seems to be) a common address space anyway.
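As a toy illustration of that idea, the sketch below gives callers plain `read` and `write` operations on one address space while a software layer quietly turns each access into a message to the node that owns the address. Everything here is an assumption made for the example: the `Node` and `ToyDSM` classes, the page size, and the round-robin placement of pages. Real distributed shared memory systems also need caching and a coherence protocol, which this sketch leaves out.

```python
class Node:
    """One machine with its own private memory (address -> value)."""

    def __init__(self):
        self.local = {}

    def handle(self, message):
        # A message is a tuple (operation, address, value).
        op, addr, value = message
        if op == "write":
            self.local[addr] = value
            return None
        # Addresses never written read as 0 in this toy model.
        return self.local.get(addr, 0)


class ToyDSM:
    """Presents several nodes' private memories as one address space."""

    def __init__(self, n_nodes, page_size=4):
        self.nodes = [Node() for _ in range(n_nodes)]
        self.page_size = page_size

    def _home(self, addr):
        # Pages are assigned to nodes round-robin (an arbitrary choice).
        page = addr // self.page_size
        return self.nodes[page % len(self.nodes)]

    def read(self, addr):
        # The caller sees a memory read; underneath it is a message.
        return self._home(addr).handle(("read", addr, None))

    def write(self, addr, value):
        self._home(addr).handle(("write", addr, value))
```

With `dsm = ToyDSM(2)`, a call such as `dsm.write(5, "b")` looks like an ordinary store to the caller, but the value ends up only in the private memory of the second node.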