In the Age of Big Data, Can Objectivity Be Achieved Without Privacy Protection and Ownership?

This blog post examines whether objectivity can be achieved in the age of big data when issues of privacy protection and data ownership remain unresolved.

 

People generate enormous amounts of data every day without even realizing it. On Facebook, a social networking service (SNS), hundreds of billions of posts accumulate each month, while messenger apps such as WhatsApp, Zalo, and Telegram carry hundreds of billions of messages daily. Beyond these posts and messages, search histories, purchase records, web page visits, blog posts, and more are all accumulated as data. According to MIT Sloan School of Management, 250 quintillion bytes of information are generated daily through mobile devices, SNS, and online commerce. This overwhelming volume of data is called big data, and the term is evolving beyond simply meaning large data sets to encompass the process of analyzing and using them.
Big data is not an entirely new concept. Numerous successful cases already exist, with Google, Netflix, and Apple being prime examples. Google analyzed the search frequency of terms like "fever" and "cough" to forecast flu outbreaks faster than the U.S. Centers for Disease Control and Prevention (CDC). It also statistically compared billions of documents to improve the accuracy of its automatic translation system. Netflix developed its Cinematch service, which analyzes members’ viewing history to recommend movies that match their preferences. Apple’s voice command software, Siri, is another example. When a user asks Siri a question, the question is transmitted to Apple’s servers, where artificial intelligence algorithms analyze it and send an answer back to the user. These algorithms are built on vast amounts of data: as questions accumulate on Apple’s servers, the database grows and Siri’s responses become increasingly sophisticated. In this way, big data makes it possible to discover new information, and depending on how that information is processed, it can create enormous value. So is it enough to simply apply big data analysis wherever we can? Not quite. Significant challenges remain to be addressed.
The first major challenge in big data analysis is the infringement of individual privacy. The ultimate aspiration of big data is to encompass everything, that is, to treat everything that happens everywhere as data. Converting everything around us into data and widening the scope of what is collected can indeed bring significant change and progress. For example, imagine a service that measures a person’s biological information and behavioral patterns in real time and, based on this data, sounds an alarm before hypertension or a stroke occurs. The data such a service could draw on is extremely diverse: food intake, real-time blood pressure, frequency of bathroom visits, gait, sleep patterns, and more. If a simple chip could be implanted in the body to collect this data, transmit it to an analysis center, and predict disease, would such a service be accepted in practice? The prospect of predicting illness is certainly appealing, but most people would be reluctant to let all their actions and physical information be analyzed. In that example individuals can at least choose whether to provide their information, yet many people are already providing data without realizing it. People post on social media simply to communicate, but those posts are analyzed to gauge consumer demand, track reactions to products, and identify areas for improvement. Such practices have become commonplace. However, just as portrait rights mean that photos taken without consent must not be used, the fact that analysts can access social media posts does not mean they may collect and analyze them freely. Given these privacy concerns, legal clarification or explicit agreements among individuals, sites, and analytics companies are necessary.
Beyond privacy, the blurring of data ownership is a significant problem that cannot be ignored. People cannot know how long companies retain their personal information. It is also unclear to what extent companies may reprocess personal data, or whether individuals have the right to have personal information held by companies completely deleted. If personal data is transferred through cloud services from companies like Google and Apple to data centers operated by their U.S. headquarters, who owns that data? Disputes between individuals and companies over data ownership and usage rights will not end easily. Moreover, data crosses borders. Questions such as who owns data and how much information should be disclosed therefore need to be discussed not just within a single country, but on a global scale.
Finally, big data analysis cannot be perfectly objective. In the past, data was scarce, so analysts had to rely on certain assumptions. Now that vast amounts of data are available, many of those assumptions are no longer needed. This leaves less room for subjectivity than before, but it does not mean big data analysis is purely quantitative and objective. Subjectivity enters from the very first step, when the analyst decides which data to handle based on the subject or purpose of the analysis. Even if all the desired data is collected, the raw data inevitably contains outliers and unnecessary values, and subjectivity intervenes again in judging which of these to discard when refining the data for the actual analysis. The analyst’s subjectivity comes into play most of all in the critical step of deciding which finding in the analysis is the most significant. This intrusion of subjectivity risks distorting the true meaning of the original data and undermining the fundamental purpose of big data analysis. In other words, rather than extracting valuable information from the original data in order to classify and predict more accurately, the analyst may simply arrive at results that match their own views. It is therefore inappropriate to trust blindly that big data analysis is quantitative and objective simply because it rests on large volumes of data. We must recognize the inherent subjectivity of analysis and look for ways to enhance objectivity.
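To make the point about outliers concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the purchase amounts are invented, and the function names (keep_all, drop_by_zscore, drop_by_iqr, summarize) and the thresholds are hypothetical choices, not a method taken from any company mentioned above. It only shows that three common, defensible cleaning rules applied to the same numbers can lead to three different averages.

```python
import statistics

# Hypothetical daily purchase amounts; the values are invented for illustration.
purchases = [12, 14, 15, 15, 16, 17, 18, 19, 21, 24, 38, 95]


def keep_all(values):
    """Rule 1: treat every value as valid and remove nothing."""
    return list(values)


def drop_by_zscore(values, threshold=2.0):
    """Rule 2: drop values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)
    return [v for v in values if abs(v - mean) / spread <= threshold]


def drop_by_iqr(values, k=1.5):
    """Rule 3: drop values outside the fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def summarize(values):
    """Return the mean of the data after each cleaning rule is applied."""
    rules = (keep_all, drop_by_zscore, drop_by_iqr)
    return {rule.__name__: statistics.mean(rule(values)) for rule in rules}


if __name__ == "__main__":
    for name, average in summarize(purchases).items():
        print(f"{name}: mean = {average:.2f}")
```

On this invented data set, keeping everything gives an average of about 25.3, the z-score rule removes only the largest value and gives 19.0, and the interquartile-range rule removes the two largest values and gives 17.1. None of the three rules is wrong, so the choice among them is a judgment call by the analyst, which is exactly where subjectivity slips in.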
Big data analysis is gaining attention as a powerful tool across various fields, and successful cases seem to point to a bright future for it. Companies captivated by these advantages are rushing to embrace big data analysis. However, big data analysis comes with the aforementioned issues: invasion of personal privacy, data ownership and usage rights, and the problem of analytical objectivity. Without concurrent efforts to resolve these issues, big data analysis will inevitably hit its limits.

 

About the author

Writer

I'm a "Cat Detective": I help reunite lost cats with their families.
I recharge over a cup of café latte, enjoy walking and traveling, and expand my thoughts through writing. By observing the world closely and following my intellectual curiosity as a blog writer, I hope my words can offer help and comfort to others.