Disadvantages Of Data Mining

Abstract: As more organizations independently collect different types of data, and as corporations and the public grow more aware of the need to keep sensitive data private, there is a definite demand for data mining services that work even when the data owners refuse to provide their data directly. In the past, data owners applied techniques such as random perturbation before sharing data with a third-party data miner. Even these techniques, however, remain vulnerable to privacy violations. This paper discusses a completely different approach. Each data owner derives association rules locally, sanitizes them if necessary, and sends them to a third-party data miner. The data miner collects local…
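The local step described in the abstract (derive rules at the owner's site, sanitize them, share only the rules) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function names `local_rules` and `sanitize`, the brute-force enumeration of itemsets of size 2 and 3, the thresholds, and the "drop any rule that mentions a sensitive item" policy are all assumptions made for the example.

```python
from itertools import combinations


def local_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Derive rules (antecedent -> consequent) from one site's own transactions.

    Hypothetical helper: brute-force over itemsets of size 2 and 3,
    which is only practical for small illustrative data.
    """
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]

    def support(itemset):
        # Fraction of local transactions that contain every item in itemset.
        return sum(itemset <= b for b in baskets) / n

    items = sorted(set().union(*baskets))
    rules = []
    for size in (2, 3):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            s = support(itemset)
            if s == 0 or s < min_support:
                continue
            for consequent in combo:
                antecedent = itemset - {consequent}
                confidence = s / support(antecedent)
                if confidence >= min_confidence:
                    rules.append((tuple(sorted(antecedent)), consequent, s, confidence))
    return rules


def sanitize(rules, sensitive_items):
    """Assumed policy: withhold every rule that mentions a sensitive item."""
    return [
        r for r in rules
        if r[1] not in sensitive_items and not (set(r[0]) & sensitive_items)
    ]


# Only the sanitized rules, never the raw transactions, leave the site.
site_data = [
    ["bread", "milk"],
    ["bread", "milk"],
    ["bread", "milk", "hiv_test"],
    ["eggs"],
]
shared = sanitize(local_rules(site_data, 0.25, 0.7), {"hiv_test"})
```

Each rule is a tuple of antecedent items, consequent item, support, and confidence. A real deployment would likely use a proper frequent-itemset algorithm such as Apriori and a more careful sanitization policy.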
Under HIPAA, passed by the US Congress, unauthorized disclosure of patients' personal health information can be a criminal offense, and health institutions are responsible for protecting that information. Several perturbation techniques turned out either to be ineffective at preserving privacy or to produce data of little use for data mining. This technique requires structural information about the entire data set to be shared with the data miner in order to produce meaningful…
PROBLEM STATEMENT
One major disadvantage of data perturbation techniques is that they can often destroy the integrity of the data. Here we assume that the data miner has no direct access to any of the user data.
• Distributed data mining deals with data partitioned in one of two ways: horizontally or vertically.
• Horizontal Partitioning: Horizontal partitioning assumes that different sites collect the same kind of information about different entities, for example supermarkets collecting transaction information about their own customers. The data to be mined is therefore the union of the data at the sites.
• Vertical Partitioning: Vertical partitioning assumes that different sites or organizations gather different information about the same set of entities or people, for example hospitals and insurance companies collecting data about the same people, whose records can be linked. The data to be mined is therefore the join of the data at the sites.
• Each site is capable of generating association rules from its own local (private) data.
In this paper, we address the following issues: 1) What information should be shared between the sites and the data…
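The difference between the two partitioning schemes can be made concrete with a small sketch. The helper names `horizontal_union` and `vertical_join`, the record layouts, and the `person_id` key are hypothetical. The code pools raw records only to show which data set is conceptually being mined in each case; in the setting assumed here, the data miner never sees that pooled data.

```python
def horizontal_union(site_a, site_b):
    """Same attributes, different entities: the mined data is the union of rows."""
    return site_a + site_b


def vertical_join(site_a, site_b, key):
    """Different attributes, same entities: the mined data is the join on a shared key."""
    index = {row[key]: row for row in site_b}
    # Keep only entities known to both sites and merge their attributes.
    return [{**row, **index[row[key]]} for row in site_a if row[key] in index]


# Horizontal: two supermarkets record the same fields for different customers.
store_1 = [{"customer": "c1", "items": ["bread", "milk"]}]
store_2 = [{"customer": "c2", "items": ["eggs"]}]
all_transactions = horizontal_union(store_1, store_2)

# Vertical: a hospital and an insurer hold different fields about the same people.
hospital = [
    {"person_id": 1, "diagnosis": "flu"},
    {"person_id": 2, "diagnosis": "asthma"},
]
insurer = [{"person_id": 1, "plan": "basic"}]
linked_records = vertical_join(hospital, insurer, key="person_id")
```

In the horizontal case every row already has the full set of attributes, so each site can mine rules on its own rows. In the vertical case no single site sees all attributes of an entity, which is why the question of what to share with the data miner becomes harder.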
