All you need to know about data scraping and what the NDPC should do

Omoleye Omoruyi

In Nigeria, data scraping, the automated extraction of information from websites and online platforms, is becoming increasingly popular. This is due to the country’s large and growing population, as well as its increasing internet penetration. As a result, there is a wealth of data available online that can be scraped for valuable insights or exploited with malicious intent.

In today’s digital age, where data has become a valuable asset driving innovation and economic growth, the collection and use of data also raise concerns about privacy, security, and ethical practices. That makes data scraping an important conversation.

The big deal about data scraping

...using specialised tools or software, data scrapers navigate through web pages, capture specific data points, and compile them into structured datasets for analysis or further use.
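The navigate, capture and compile steps described above can be illustrated with a minimal sketch. It uses only Python's standard-library `html.parser`; the sample page, the `product`, `name` and `price` class names, and the `ProductParser` and `scrape` helpers are all hypothetical and chosen purely for illustration. A real scraper would first download the page over HTTP.

```python
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download this over HTTP.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Phone A</span><span class="price">120000</span></li>
  <li class="product"><span class="name">Phone B</span><span class="price">95000</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects the name and price of every <li class="product"> element."""

    def __init__(self):
        super().__init__()
        self.rows = []       # one dict per product found so far
        self._field = None   # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "product":
            self.rows.append({})          # a new product starts here
        elif tag == "span" and css_class in ("name", "price") and self.rows:
            self._field = css_class       # the next text belongs to this field

    def handle_data(self, data):
        if self._field:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape(html):
    """Turn raw HTML into a structured list of records."""
    parser = ProductParser()
    parser.feed(html)
    return [{"name": r["name"], "price": int(r["price"])} for r in parser.rows]


records = scrape(SAMPLE_HTML)
```

Here `records` is a list of dictionaries, one per product, ready to be written to a spreadsheet or database for analysis.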

Data scraping is a common practice in the tech industry, and it can be used for a variety of purposes, such as market research, product development, and fraud detection.

However, data scraping can also be used for malicious purposes. For example, it can be used to collect personal information without consent, which can be a privacy violation, to generate spam or phishing emails and, more recently, to train AI tools.


On the business side, data scraping can overload websites and servers. This can make websites slow down or even crash. It can also be used to attack websites and steal data.
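One common way for a well-behaved scraper to avoid overloading a site is to honour the site's robots.txt file and pause between requests. The sketch below shows the idea using Python's standard-library `urllib.robotparser`. The robots.txt content, the `example-bot` user agent, the example.com URLs and the helper functions `build_rules`, `polite_fetch_plan` and `crawl` are all assumptions made for illustration, and the `fetch` argument stands in for whatever download function a real scraper would use.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; a real scraper would download it from the site.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /private/
"""


def build_rules(robots_txt):
    """Parse robots.txt text into a rules object."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def polite_fetch_plan(rules, urls, user_agent="example-bot", default_delay=1.0):
    """Return the URLs the site allows and the delay to wait between requests."""
    delay = rules.crawl_delay(user_agent) or default_delay
    allowed = [u for u in urls if rules.can_fetch(user_agent, u)]
    return allowed, delay


def crawl(rules, urls, fetch, sleep=time.sleep):
    """Fetch only the allowed URLs, pausing between requests."""
    allowed, delay = polite_fetch_plan(rules, urls)
    results = []
    for url in allowed:
        results.append(fetch(url))
        sleep(delay)  # spread requests out instead of hammering the server
    return results


rules = build_rules(ROBOTS_TXT)
allowed, delay = polite_fetch_plan(
    rules,
    ["https://example.com/products", "https://example.com/private/data"],
)
```

In this sketch the `/private/` URL is filtered out and the scraper waits two seconds between requests, as the sample robots.txt asks. Scrapers that ignore such rules are the ones most likely to slow a site down.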

Also, data scraping can be used to manipulate markets. For example, a data scraper could collect data on stock prices and then use that data to make trades that would benefit them financially.

A little history

Data scraping has been around almost as long as there have been websites. The first instance of web scraping was recorded in 1993, when Matthew Gray developed the World Wide Web Wanderer to measure the size of the web.

The World Wide Web Wanderer was a software program that crawled the web, collecting data about the number of websites and the number of links between websites.

A crawler of this kind starts by selecting a website to crawl, for example at random. It then follows all of the links on that website to other websites. It continues crawling websites and following links until it reaches a predetermined number of websites or links.

Data collected by crawlers of this kind is used by a wide range of organisations, including businesses, governments, and academic institutions.
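The crawl loop described above can be sketched in a few lines. This is not Matthew Gray's original code; it is a minimal illustration that runs over a small, hypothetical in-memory link graph (`LINKS`) instead of the live web, and the `crawl` function and its parameters are assumptions made for this example.

```python
import random
from collections import deque

# Hypothetical link graph standing in for the web: site -> sites it links to.
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example", "d.example"],
    "c.example": ["a.example"],
    "d.example": [],
}


def crawl(links, max_sites, start=None):
    """Crawl breadth-first until max_sites sites have been visited.

    Returns the list of crawled sites and the number of links seen.
    """
    if start is None:
        start = random.choice(sorted(links))  # pick a starting site at random
    queue = deque([start])   # sites waiting to be crawled
    seen = {start}           # sites already queued, to avoid repeats
    crawled = []
    link_count = 0
    while queue and len(crawled) < max_sites:
        site = queue.popleft()
        crawled.append(site)
        for target in links.get(site, []):
            link_count += 1
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return crawled, link_count


sites, link_total = crawl(LINKS, max_sites=3, start="a.example")
```

Starting from `a.example` with a limit of three sites, the sketch visits `a.example`, `b.example` and `c.example` and counts five links along the way. A real crawler would replace the dictionary lookup with a page download and link extraction.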

Who are the data scrapers?

Businesses, with a special focus on social media companies.

Unless you read the privacy policy and the terms and conditions of social media platforms, and decide not to use any of them, you are part of the privacy concerns many individuals and groups have raised about how these platforms use our data.


Some people worry that social media platforms are collecting too much data about us (to the point of knowing us better than the people around us) and that they are using this data for inappropriate purposes. Others worry that social media platforms are not doing enough to protect our privacy and that our data could be used to harm us.

Of course, scraping publicly available data is generally legal, even if it feels like an invasion of privacy. If your profile is public, it is almost certain that your data has been scraped. This data includes any public demographic information used for advertising, such as age, race, gender, location, interests and ethnicity.

Third-party companies don’t need your permission to collect your data, since technically, they’re not liable for what you willingly post on social media.

But to curb what many call an ‘invasion of privacy’, there has been increasing regulation of social media platforms in the United States and around the world. This regulation is designed to protect our privacy and to ensure that social media platforms are transparent about how they collect and use our data. Countries like Nigeria are only now catching up with such regulation.

AI platforms. In the realm of artificial intelligence (AI), data plays a critical role in training and improving machine learning models. So, your data is most likely in the hands of one AI tool or another, unless you do not use social media, or the internet at all.

It is this AI conversation that led Elon Musk to restrict the number of tweets each category of user can see per day, even though some argue that he only wants more people to pay to use the platform.

Yet, as AI technology continues to develop, AI platforms will continue to scrape data. This will allow them to train more accurate AI models, which will have a significant impact on a variety of industries.

Government agencies. State actors use data scraping to collect data about their citizens, businesses, and the economy. This data can be used to make better policy decisions, such as allocating resources and enforcing regulations.

On the flip side, they can use it to monitor opposition, as in the case of #EndSARS in Nigeria; to gain intelligence on their adversaries and cause chaos in another country; to manipulate public opinion; or to steal trade secrets.

Hackers.

Hackers scrape data using a variety of techniques, including:

  • Web scraping: Hackers can use web scraping to collect data such as usernames, passwords, credit card numbers, and other sensitive information.
  • API scraping: APIs are software interfaces that allow programs to communicate with each other. Hackers can abuse a website’s APIs, including ones that were never meant to be public, to extract data in bulk.
  • Social engineering: Social engineering is the process of tricking people into giving up their personal information. Hackers can use social engineering techniques to trick people into clicking on malicious links, providing their personal information, or giving up their passwords.
  • Malware: Malware is software that is designed to harm computers. Hackers can use malware to scrape data from computers by installing it on the computer without the user’s knowledge.

Hardly any good can come out of the Nazareth of hackers. But what benefits are there in data scraping?

Data scraping enables businesses, researchers, and analysts to access a wealth of information available on the internet. It allows them to gather data on market trends, consumer preferences, competitor analysis, pricing information, and much more. This valuable data can drive strategic decision-making, inform product development, and identify emerging opportunities.

By leveraging data scraping, organisations can gain a competitive edge in their respective industries. The ability to gather real-time, comprehensive, and accurate data allows companies to stay informed about market dynamics, adapt quickly to changing trends, and identify gaps or areas for innovation.

There are several other benefits, but striking a balance between data accessibility and privacy rights is critical. Respecting privacy regulations and obtaining appropriate consent are essential to ensure ethical data scraping practices.

How should the NDPC play in this field?

According to the NDPC: “A Data Controller must seek this consent either in writing or by any other action through which the Data Subject knows he is giving consent. There are exceptions where duly constituted authorities can process data without consent in the public interest or where private organisations may have a lawful and cogent basis (albeit rebuttable) for data processing. These exceptions are without prejudice to the principles of data protection. Hence every data controller whether acting in public interest or in private interest can be held to account under the NDPR.”

The Nigeria Data Protection Commission (NDPC) can ensure that data scraping is done ethically by:

  • Establishing clear guidelines for data scraping: The NDPC should establish clear guidelines for data scraping, specifying what data can be scraped, how it can be scraped, and how it can be used. These guidelines should be designed to protect the privacy of individuals and to ensure that data scraping is used for legitimate purposes.
  • Providing training and education: The NDPC should provide training and education to data scrapers about the NDPC’s guidelines and the rules and regulations governing data scraping. This training would help to ensure that data scrapers are aware of their legal obligations and that they are using data scraping in a responsible manner.
  • Monitoring data scraping activity: The NDPC should monitor data scraping activity to identify and take action against data scrapers who are violating the NDPC’s guidelines or the law. This monitoring would help to deter data scrapers from engaging in illegal or harmful activities.
  • Working with other government agencies: The NDPC should work with other government agencies, such as the Nigerian Communications Commission (NCC) and the Economic and Financial Crimes Commission (EFCC), to share information and coordinate enforcement efforts. This collaboration would help to ensure that data scraping is regulated in a consistent and comprehensive manner.

Also, the NDPC can create a public awareness campaign to educate people about data scraping and the risks associated with it. This campaign would help people to understand their rights and how to protect their privacy when it comes to data scraping.

To make this better, the Commission can provide a complaint mechanism for people who believe that their privacy has been violated by data scraping. This mechanism would allow people to report data scrapers who are violating the law or the Commission’s guidelines.

On a final note, the NDPC should review its guidelines regularly to ensure that they are up to date and reflect the latest developments in data scraping. This would help to ensure that the guidelines are effective in protecting the privacy of individuals.


