Research on Big Data and Privacy

Introduction

Information has become such a significant commodity that owning and managing it is now analogous to power. In the digital age, information connects people from all over the world and enables virtually seamless real-time interaction. At the same time, the Internet has given humanity the ability to access vast amounts of information quickly. As a consequence of profound digitalization, anyone with a smartphone can learn breaking news from another country, access scientific publications, or buy goods not produced locally. Nevertheless, every action has consequences, and in exchange for such a wide range of opportunities, an individual pays with his or her privacy. Every day, people give away vast amounts of information about themselves and their online activities to data aggregators, including large companies, government agencies, and banks. Such information is immensely valuable for analysis because it provides insight into an individual’s consumer, credit, and civic habits. As a result, in the age of profound digitalization, the concept of big data is not compatible with the ideas of personal privacy and anonymity on the Web. This report aims to support this argument through academic research on personal cybersecurity and big data. The following material offers a relevant summary for students and others interested in privacy research in today’s world.

The Big Data Model

Historically, information about a person and their intentions was a strategic weapon that allowed rulers to adjust management strategies in advance. For example, in monarchical countries, information about underground revolutionary organizations or planned terrorist acts helped to suppress such sentiments preventively and stabilize society. In the modern world, such information is still potentially important for national security, but even more relevant is the use of data for sociological analysis. Every time an individual accesses the Internet, regardless of the type of device, he or she leaves behind some information about his or her actions. As the actual owner of personal data, the user gives it up, knowingly or unknowingly, in exchange for comfort. Through the vast network of web-based services developed to simplify an individual’s life, information about location, movements, credit cards, habits, and preferences ceases to be the personal property of the user. This broad layer of data that forms a person’s virtual portrait is commonly referred to as big data (Oussous et al., 2018). This practice is ambiguous and has both advantages and disadvantages.

First of all, big data should be defined technically so that it is easier to understand. The term big in this phrase refers to the size of a data set that can be neither stored nor processed by classical computing technologies, and two consequences follow from this. First, the digital development of society reached a point where content overtook form. More specifically, people already knew how large the data to be collected could be, but there was no adequate way to handle it. Second, in order to use big data, it was necessary to create qualitatively new computing systems that could process information at high speed and produce the resulting outputs and models. This leads to the idea that big data has specific characteristics by which any information can be classified as such. These characteristics form a framework called the four V’s of big data (Marr, 2020). The information must have a critically large volume, a high rate of new data generation (velocity), variety, and trustworthiness (veracity). The last trait is crucial to emphasize because if companies collect unreliable, false information about the user, it will lead to technical and reputational problems: colossal storage reserves will be filled with useless information, and if mismanaged, this data will be used for erroneous forecasting.

The trend toward big data is a relatively new phenomenon associated with the overall development of the Internet. As soon as most people became everyday users of the Web, large companies and agencies began to consider using their information to optimize their operations. In the case of banks, big data makes it possible to analyze a citizen’s credit history and draw conclusions about his or her trustworthiness. So, when a person tries to take out a new loan, the bank can refuse if it considers him or her a member of a risk group. Not only banks but also government agencies are using the concept of big data to manage human resources. Through an in-depth study of societal attitudes and trends, authorities can adjust socio-economic policies and inhibit the development of negative civic sentiment. At the same time, the use of big data in the public sector allows for more efficient management of a city’s technical resources, as is done in London. An advanced street video surveillance system makes it possible to collect and analyze socio-demographic information about city areas and create a predictive fire risk model (Yang & Liu, 2017). In this way, it is possible to reduce the likelihood of property damage and loss of life. Finally, the biggest consumer of personal data is social media, which collects literally all user actions and decisions on the Web. Every time an individual performs an action or writes a text, it is collected by the servers of media holdings (Mindruta, 2019). As a result, social networks can not only personalize ads for an individual but also change their algorithms to better meet the needs of the average user.

Thus, summarizing the concept of big data shows how strategically advantageous it is. The use of big data is not limited to the illustrative examples of banks, government agencies, and social media but is instead becoming an increasingly prominent trend in today’s information market. At the same time, several notable trends are visible in the industry. First of all, vast amounts of data are concentrated in the hands of just a few companies, which means it is appropriate to speak of an oligopoly in the market (Opher et al., 2016). As a consequence, personal user data is concentrated within these companies, while only a minor part of it ends up publicly available on the Internet. Second, as information has become a liquid commodity, there is a noticeable trend toward more partnerships for exchanging it (McKinsey Analytics, 2018). Companies interested in building a complete portrait of an individual tend to buy some data from other aggregators. As a result, big data is a strategically important component of the digital society, allowing for optimized and personalized algorithms.

Advantages and Disadvantages of Big Data

There is no one-size-fits-all judgment on how appropriate and ethical the use of personal data is. Reducing an entire phenomenon to a single answer about its rightness or wrongness is not a sound strategy. On the contrary, big data as a phenomenon should be critically examined from both the positive and the negative angle to create an academically sound account.

On the one hand, big data is an incredible achievement of humanity, demonstrating both the sophistication of technological thought and human ingenuity. As noted earlier, information has always been an excellent tool for learning and development. In the era of the digital revolution, the use of information has reached its apogee, so the big data model can now be found almost everywhere. This is not surprising, given the number of benefits of using information in this way. First, big data gives companies a decisive advantage over competitors because it can increase customer loyalty and engagement. Second, big data technologies are usually paired with high-speed processing, which modernizes operations and improves operational efficiency (Marr, 2020). Third, big data in banking helps to build a lending culture in which financially vulnerable people do not end up in a debt trap. Fourth, extensive data analysis allows for predicting and modeling situations, which benefits science, national security systems, and analytical agencies. From all of the above, it is clear that the use of big data in all spheres is justified by the desire to simplify and systematize life. Thus, people receive the expected comfort, and businesses increase profits.

In contrast, the use of big data can be fraught with serious risks. One risk is the technical imperfection of existing systems, which must meet high requirements and standards for data use. More specifically, the collection and processing of enormous data sets require not only vast amounts of storage but also the training of service staff (Rombaut, 2020). As a result, the use of big data is complex and challenging for companies and agencies. However, a much bigger problem is the unresolved issue of personal security and anonymity on the Web. When an individual uses online shopping, social networking, or other web-based services, usage data is recorded on the servers of the companies responsible. Although such data does not become public, its possession by others may seem like an unethical practice. In particular, an individual may choose not to share information about his or her search queries with organizations, and in a free Internet environment, he or she is entitled to do so. In reality, the opposite happens, and hundreds of gigabytes of useful information per person are stored and processed by companies. As a result, there are two severe risks in using the big data model: technical complexity and the lack of user anonymity and privacy.

The Impact of Big Data on Privacy

Now that it is clear what big data is and why it should not be eliminated from modern society, it is important to discuss its effects on each individual’s privacy. This is a sensitive issue, since no society in the world has yet found a straightforward solution to the ethical dilemma of security and comfort versus anonymity. The problem should therefore be considered objectively, without emotional coloring. To do this, it is first necessary to define the concept of privacy.

Privacy in the traditional interpretation should be seen as an inalienable right of any citizen, providing a guarantee of autonomy and protection of human dignity. Privacy is not unique to the Internet, because in real life, too, one is not inclined to tell all one’s secrets to others. Thus, the individual draws boundaries between the personal and the public. In the virtual environment, it is not easy to draw such boundaries on one’s own because, unlike physical interaction with people, the virtual environment does not submit to the will of the individual. For instance, even if a user does not communicate with other people on social networks, post personal photos, or fill out profiles with personal information, this is not enough to ensure complete anonymity. Any record of actions on the Web, such as clicks on links, time spent on a site, and the number of lines read, becomes data that the individual cannot control. Thus, anyone who continues to use the Internet must recognize the failure of the model of complete data privacy. It is therefore worthwhile to be aware in advance of the possible negative consequences of voluntarily handing over data to corporations and agencies.

One of the most significant vulnerabilities that compromise user privacy is the technical imperfection of security methods. A user may not mind that all data about his or her use of the Internet is available to organizations. This position is reasonable if the individual is interested in the technological development of favorite companies and in better-tuned personalized advertising. In such a case, the user voluntarily agrees to share his or her data and to have actions tracked, including through cookies (Degeling et al., 2018). The user’s only expectation is that the stored information will be protected transparently and kept secure. In practice, however, the situation is often the opposite. Numerous cases of personal data leakage and storage hacking are known (Amthul, 2020). As a result, attackers gain access to sensitive information and can use it to blackmail an individual. This scenario raises questions about the ethics of data collection and, more importantly, about why databases are not inviolable.

It is essential to discuss the problem of insufficient security in the context of the smart home and the Internet of Things. Today’s technology has advanced to the point where electrical devices inside the home can be controlled directly from a phone (Domb, 2019). For example, a person can set a schedule to turn on the coffee machine, open the blinds, lock the door, or adjust the brightness of the lights. Since all of these actions are linked in a single ecosystem, data about a person’s habits can be collected on the manufacturer’s servers. Hacking into such databases means that the perpetrator gains access to sensitive mechanisms and can control the home remotely. In the long term, this may enable cybercrime aimed at physically harming the victim.

Another negative factor affecting user privacy is the possibility of discrimination against specific categories of people at the institutional level. More specifically, modern banking, hotel, and airline reservation systems often rely on automated decision-making through machine learning (Dushimimana et al., 2020). This means that, in order to approve or deny a user’s application, the systems analyze large amounts of data to assess the trustworthiness and integrity of a citizen. As a result, if a person does not have a good credit history or has violated contracts, he or she may receive a low trust rating. Consequently, the person may be unable to take out a new loan for a desired purchase or to buy an airline ticket. Therefore, in this context, it is appropriate to talk about discriminatory practices, which, on the one hand, serve the improvement of society in the long term but, on the other hand, create difficulties for some groups. Moreover, if information about the low rating becomes public, it will further undermine the privacy of the citizen, since people around him or her will be aware of the individual’s supposed unreliability.
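The way an automated system might turn a person's history into a trust rating can be illustrated with a minimal sketch. It assumes a simple logistic scoring rule; the feature names, weights, bias, and approval threshold are all hypothetical and are not taken from any real bank or from the cited study.

```python
import math

# Hypothetical weights: negative history lowers the score,
# a longer credit history raises it. These values are illustrative only.
WEIGHTS = {
    "late_payments": -0.8,
    "contract_violations": -1.2,
    "years_of_history": 0.3,
}
BIAS = 1.0
APPROVAL_THRESHOLD = 0.5  # hypothetical cut-off for approval


def trust_score(applicant):
    """Return a trust rating between 0 and 1 using a logistic function."""
    z = BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def decide(applicant):
    """Approve the application only if the rating reaches the threshold."""
    return "approve" if trust_score(applicant) >= APPROVAL_THRESHOLD else "deny"


reliable = {"late_payments": 0, "contract_violations": 0, "years_of_history": 5}
risky = {"late_payments": 3, "contract_violations": 1, "years_of_history": 1}

print(decide(reliable), round(trust_score(reliable), 2))
print(decide(risky), round(trust_score(risky), 2))
```

Even in this toy form, the decision depends entirely on recorded past behavior, which is why a person with a poor history can be refused automatically without any human review.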

It is worth noting that complete de-anonymization of big data is not a popular prospect, so companies often invest in encryption and anonymization mechanisms. Information about user actions is depersonalized so that, even if there is a leak, it should be impossible to determine which data belongs to whom (Dimitrova, 2020). In practice, however, modern technology allows de-anonymization with high accuracy, so matching data to individuals is not a particularly challenging task for criminals. In other words, no matter how much companies advertise complete protection of the data they collect, this is not an absolute guarantee, and one day the entire set of information about a particular user may become public.
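One common way such data matching can work is a linkage attack, in which depersonalized records are joined with a separate public data set on shared attributes. The following is a minimal sketch under stated assumptions: all records, names, and the choice of ZIP code, birth year, and gender as matching fields are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

# Hypothetical "anonymized" activity records: names are removed,
# but quasi-identifiers (ZIP code, birth year, gender) remain.
anonymized_records = [
    {"zip": "10001", "birth_year": 1985, "gender": "F", "searches": ["loan rates"]},
    {"zip": "10001", "birth_year": 1990, "gender": "M", "searches": ["used cars"]},
    {"zip": "94105", "birth_year": 1985, "gender": "F", "searches": ["flights"]},
]

# Hypothetical public auxiliary data that includes names.
public_directory = [
    {"name": "Alice Example", "zip": "10001", "birth_year": 1985, "gender": "F"},
    {"name": "Bob Example", "zip": "10001", "birth_year": 1990, "gender": "M"},
    {"name": "Carol Example", "zip": "94105", "birth_year": 1985, "gender": "F"},
    {"name": "Dana Example", "zip": "94105", "birth_year": 1985, "gender": "F"},
]

QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def quasi_key(record):
    """Combine the quasi-identifier fields into a single lookup key."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link_records(anonymized, auxiliary):
    """Map each anonymized record index to a name when the match is unique."""
    names_by_key = defaultdict(list)
    for person in auxiliary:
        names_by_key[quasi_key(person)].append(person["name"])

    matches = {}
    for index, record in enumerate(anonymized):
        candidates = names_by_key.get(quasi_key(record), [])
        if len(candidates) == 1:  # exactly one person fits: re-identified
            matches[index] = candidates[0]
    return matches


reidentified = link_records(anonymized_records, public_directory)
print(reidentified)
```

In this sketch, the first two records are re-identified because only one person in the public data shares their attributes, while the third remains ambiguous because two people match. This suggests why removing names alone does not guarantee anonymity.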

In this regard, it is worth recalling Apple’s recent release of the iOS 14.5 operating system, which has significantly changed the collection of big data. The company gave users of its devices the ability to choose which apps may track them and collect and analyze their data (Hardwick, 2021). Thus, the social network Facebook, for example, cannot collect data on website visits and online activities without user consent, which limits its ability to target ads. The best demonstration of public attitudes toward this innovation is recent analytics suggesting that 96% of users did not allow apps to track their actions (Hardwick, 2021). In other words, ordinary people have little interest in sharing their data under the current big data model and are opting for privacy.

Conclusion

In conclusion, it should be reiterated that there is a severe imbalance between personal privacy and the concept of big data at the current stage of society’s development. Individuals are often interested in the security of their own data, so there are two paths for them. First, they can detach themselves completely from web services or use operating system features that partially or entirely prevent the collection of user data. Second, they can voluntarily consent to the transfer and tracking of their data if it is in their interest. Regardless of the choice, every individual wants sensitive information about him or her to be securely protected. Practice shows the opposite: data often leaks out of companies’ storage and becomes public. In turn, this can lead to blackmail or social discrimination against individuals. Thus, in today’s world of advanced and pervasive digital technology, it is not appropriate to claim that complete privacy on the Web is possible.

References

Amthul, N. (2020). Persistent data exposure is a much riskier problem in today’s remote world. Security. Web.

Degeling, M., Utz, C., Lentzsch, C., Hosseini, H., Schaub, F., & Holz, T. (2018). We value your privacy… now take some cookies: Measuring the GDPR’s impact on web privacy. Web.

Dimitrova, Z. (2020). Data anonymization techniques and best practices: A quick guide. Record Evolution. Web.

Domb, M. (2019). Smart home systems based on the internet of things. Internet of Things (IoT) for automated and smart applications. Intechopen. Web.

Dushimimana, B., Wambui, Y., Lubega, T., & McSharry, P. E. (2020). Use of machine learning techniques to create a credit score model for airtime loans. Journal of Risk and Financial Management, 13(8), 180.

Hardwick, T. (2021). Analytics suggest 96% of users leave app tracking disabled in iOS 14.5. MacRumors. Web.

Marr, B. (2020). What are the 4 V’s of big data? BM&Co. Web.

McKinsey Analytics. (2018). Analytics comes of age. Web.

Mindruta, R. (2019). The top social media monitoring tools. Brandwatch.

Opher, A., Chou, A., Onda, A., & Sounderrajan, K. (2016). The rise of the data economy: Driving value through the internet of things data monetization. Web.

Oussous, A., Benjelloun, F. Z., Lahcen, A. A., & Belfkih, S. (2018). Big Data technologies: A survey. Journal of King Saud University-Computer and Information Sciences, 30(4), 431-448.

Rombaut, V. (2020). Top 5 problems with big data — and how to solve them. Piesync. Web.

Yang, Z., & Liu, Y. (2017). Using statistical and machine learning approaches to investigate the factors affecting fire incidents. Web.