Ques : Describe Advantages of Data Distribution

Ans :

The primary advantage of distributed database systems is the ability to share and
access data in a reliable and efficient manner.

Data sharing and Distributed Control
The geographical distribution of an organization can be reflected in the distribution of
the data; if a number of different sites are connected to each other, then a user at one
site may be able to access data that is available at another site. The main advantage
here is that the user need not know the site from which data is being accessed. Data
can be placed at the site close to the users who normally use that data. Local
control of data allows local policies on the use of that data to be established and
enforced. A global database administrator (DBA) is responsible for the entire system.
Generally, part of this responsibility is given to the local administrator, so that the
local DBA can manage the local DBMS. In a distributed banking system, for example, a
user can retrieve his or her account information from any branch office. To the user,
the system appears to be a single centralized database.
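
The idea that a user need not know where data is stored can be sketched roughly as follows. This is a minimal illustration only: the site names, account numbers, and the DistributedBank class are hypothetical, and a real DDBMS would use a proper global catalog and network calls rather than in-memory dictionaries.

```python
# Hypothetical data: each branch site stores its own accounts locally.
SITE_DATA = {
    "delhi": {"A-101": 5000},
    "mumbai": {"A-202": 7500},
}


class DistributedBank:
    """Toy front end that hides which site holds an account."""

    def __init__(self, sites):
        self.sites = sites
        # Global catalog mapping account number -> name of the owning site.
        self.catalog = {
            account: site
            for site, accounts in sites.items()
            for account in accounts
        }

    def balance(self, account):
        site = self.catalog[account]      # locate the owning site
        return self.sites[site][account]  # fetch the value from that site


bank = DistributedBank(SITE_DATA)
print(bank.balance("A-202"))  # the caller never names a site
```

The caller asks only for an account; the lookup of the owning site happens inside the front end, which is what makes the system look like a single database.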

The primary advantage of accomplishing data sharing by means of data distribution is
that each site is able to retain a degree of control over data stored locally. Depending
upon the design of the distributed database system, each local administrator may have
a different degree of autonomy. This is often a major advantage of distributed
databases. In a centralized system, the database administrator of the central site
controls the database and, thus, no local control is possible.

Reflects organizational structure
Many organizations are distributed over several locations. If an organization has
many offices in different cities, its databases are naturally distributed
over these locations. Such an organization may keep a database at each branch office
containing details of the staff who work at that location, the local properties that are
for rent, and so on. The staff at a branch office make local inquiries against that
database. The company headquarters may wish to make global inquiries that
access data at all or several of the branches.


Improved Reliability
In a centralized DBMS, a server failure terminates the operations of the DBMS.
However, a failure at one site of a DDBMS, or a failure of a communication link
making some sites inaccessible, does not make the entire system inoperable.
Distributed DBMSs are designed to continue to function despite such failures. In
particular, if data are replicated in several sites, a transaction needing a particular data
item may find it at several sites. Thus, the failure of a site does not necessarily imply
the shutdown of the system.

The failure of one site must be detected by the system, and appropriate action may be
needed to recover from the failure. The system must no longer use the services of the
failed site. Finally, when the failed site recovers or is repaired, mechanisms must be
available to integrate it smoothly back into the system. The recovery from failure in
distributed systems is much more complex than in a centralized system.
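
The three steps described above (detect the failure, stop using the failed site, reintegrate it after repair) can be sketched as follows. This is a simplified illustration under stated assumptions: the Site and ReplicatedStore classes are hypothetical, every site holds a full copy of the data, and reintegration simply copies state from a live replica. Real systems need far more careful recovery protocols.

```python
class SiteDown(Exception):
    """Raised when a site cannot be reached."""


class Site:
    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.up = True

    def read(self, key):
        if not self.up:
            raise SiteDown(self.name)
        return self.data[key]


class ReplicatedStore:
    def __init__(self, sites):
        self.sites = sites
        self.failed = set()  # names of sites currently believed to be down

    def read(self, key):
        for site in self.sites:
            if site.name in self.failed:
                continue  # do not use a site known to have failed
            try:
                return site.read(key)
            except SiteDown:
                self.failed.add(site.name)  # detection: remember the failure
        raise SiteDown("no replica available for " + repr(key))

    def reintegrate(self, site, source):
        # Bring the repaired site up to date from a live replica
        # before allowing it to serve requests again.
        site.data = dict(source.data)
        site.up = True
        self.failed.discard(site.name)


a = Site("A", {"x": 1})
b = Site("B", {"x": 1})
store = ReplicatedStore([a, b])

a.up = False                          # simulate a failure of site A
value = store.read("x")               # served by B once A is found to be down
failed_after_read = set(store.failed)
store.reintegrate(a, b)               # A is repaired and rejoins the system
```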

Improved availability
The data in a distributed system may be replicated so that it exists at more than one
site. Thus, the failure of a node or a communication link does not necessarily make the
data inaccessible. The ability of the system to continue operating despite the
failure of one site results in increased availability, which is crucial for database
systems used in real-time applications. For example, loss of access to data at an
airline may result in the loss of potential ticket buyers to competitors.
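
The effect of replication on availability can be illustrated with a little arithmetic. The sketch below assumes that sites fail independently and that each site is reachable with the same probability; both are simplifying assumptions, and the figure 0.95 is purely illustrative rather than taken from any real system.

```python
def availability(p_site_up, replicas):
    """Probability that at least one of `replicas` copies is reachable,
    assuming independent failures with the same per-site probability."""
    return 1 - (1 - p_site_up) ** replicas


for n in (1, 2, 3):
    print(n, round(availability(0.95, n), 6))
```

Under these assumptions, each additional copy multiplies the chance that the data item is unreachable by the per-site failure probability, so availability rises quickly with the first few replicas.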

Improved performance
Because data is located near the site of greatest demand, and given the inherent
parallelism of multiple copies, database access may be faster in a distributed
database than with a remote centralized database. Furthermore, since each site
handles only a part of the entire database, there may be less contention for CPU
and I/O services than in a centralized DBMS.

Speedup Query Processing
A query that involves data at several sites can be split into sub-queries. These
sub-queries can be executed in parallel by several sites. Such parallel sub-query
evaluation allows faster processing of a user's query. Where data is replicated,
queries may be sent to the least heavily loaded sites.
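
A rough sketch of parallel sub-query evaluation is shown below. The site names, the staff fragments, and the load figures are hypothetical, and threads stand in for separate machines; in a real DDBMS each sub-query would run on its own site and the results would travel over the network.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical per-site fragments of a "staff" relation: (name, salary).
FRAGMENTS = {
    "london": [("Ann", 30000), ("Bob", 42000)],
    "glasgow": [("Cara", 51000)],
    "bristol": [("Dev", 28000), ("Eve", 47000)],
}


def sub_query(site, min_salary):
    """Sub-query run at one site: select local staff above a salary."""
    return [name for name, salary in FRAGMENTS[site] if salary > min_salary]


def global_query(min_salary):
    """Send the sub-query to every site in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(FRAGMENTS)) as pool:
        parts = pool.map(lambda site: sub_query(site, min_salary), FRAGMENTS)
        return sorted(name for part in parts for name in part)


def least_loaded(replica_load):
    """Pick the replica site with the fewest queued requests."""
    return min(replica_load, key=replica_load.get)


result = global_query(40000)
target = least_loaded({"london": 7, "glasgow": 2, "bristol": 5})
print(result, target)
```

The global query is answered by merging the partial results, and the least_loaded helper shows one simple way a replicated query might be routed to the site with the shortest queue.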

Economics
It is now generally accepted that it costs less to build a system of smaller computers
with the equivalent power of a single large computer, so it is often more cost-effective
to obtain separate computers. A second potential cost saving may occur where
geographically remote access to distributed data is required. In such cases the cost
of transmitting data across the network for updates must be weighed against the cost
of local access, and it may be more economical to partition the application and
perform the processing locally at each site.

Modular growth
In distributed environments, it is easier to expand. New sites can be added to the
network without affecting the operations of other sites, as they are somewhat
independent. This flexibility allows an organization to expand gradually. Adding
processing and storage power to the network can generally result in better handling of
ever-increasing database size. In contrast, a centralized DBMS would require
changes to both hardware and software as the database grows, and a more powerful
DBMS would have to be procured.
