By A Damodaran
In 2006, when the British mathematician and data aficionado Clive Humby said ‘Data is the new oil’, he was criticised for being flippant about both data and oil. Yet, even today, his remark continues to be one of the favourite quotes at data science gatherings.
The data flow dynamic
India’s time-tested public databases (including the consumer databases of the National Sample Survey) have the limitation of being ‘time-specific’. Contrast this with the databases maintained by e-commerce and streaming platforms. These entities collect, store and process data at the individual consumer level, in real time. They utilise their dynamic datasets to create novel business opportunities through new products and services.
Ideally, for a country-wide data economy to emerge, there must be robust machinery that catalyses the generation, processing and analysis of real-time data. Datasets so created need to be made available to different categories of users through organised data markets. The financial economist Mark Garman refers to such systems as ‘micro-market structures’.
India’s data scenario is riddled with contradictions. On the one hand, we have large public databases that have not been updated for years; on the other, we see clear signs of a vibrant ‘data economy’ in evolution.
India’s evolving data economy
India’s progress on the digitisation front has opened the doors for the emergence of its data economy. According to the global data platform Statista, India’s population of digital buyers was 289 million in 2021. India’s e-commerce market is estimated to reach $350 billion by 2030, which would be five times the e-commerce market size in 2022. The success of the Unified Payments Interface (UPI) and the ensuing tempo in digital transactions open up possibilities for rapid growth of consumer transactions data. Meanwhile, the advent of the Open Network for Digital Commerce (ONDC) promises the emergence of a rich reservoir of data on consumer preferences. What further strengthens the scenario is the not-so-widely advertised fact that India is currently ranked as the 13th-largest data-centre market in the world.
Today, a variety of players operate in India’s evolving data ecosystem. At the data-providing end, one sees large data-generating platforms such as Amazon, Flipkart and Swiggy co-existing with small data-providing companies that scrape data from the web and turn it into structured datasets.
Data users in India, too, are a heterogeneous lot, ranging from corporates that source analysed data from large data providers to SMEs and start-ups that buy raw or semi-processed data from smaller providers and utilise Software-as-a-Service (SaaS) tools to fine-tune it to meet the specific needs of their customers.
However, India does not enjoy the presence of robust, home-grown data trading platforms. Most data-related transactions are performed ‘over-the-counter’, on a fixed-price basis. Many small data providers are not able to access a wider network of data-seekers.
Challenges and prospects
The growth trajectory of India’s data economy in the coming years will depend centrally on how rapidly we increase internet coverage and how swiftly we are able to incorporate our large unorganised sector into the data economy fold. It is noteworthy that the Digital Personal Data Protection Act 2023 does not prohibit data extraction from websites. Nevertheless, the absence of easily accessible trading platforms could act as a serious constraint for R&D institutions with limited budgets in accessing quality datasets at competitive prices. The situation could change once the India Data Sets Programme (IDSP) becomes fully operational and its large corpus of non-personal data is open for use by the country’s public institutions.
At a more fundamental level, for India to transform into a mature data economy, it is essential that our emerging data sources are optimally utilised with the aid of ‘data processing infrastructure’ that enables large datasets to be matched. For instance, transactions-related data emerging from the Open Network for Digital Commerce (ONDC), if matched with NSS consumer expenditure datasets, can go a long way in creating large, dynamic consumer databases. However, for this to happen, it is essential that our prowess in machine learning and data integration is well harnessed.
Viewed this way, Artificial Intelligence (AI) is both a pre-condition and a consequence of India’s evolving data economy.
The author is Distinguished Professor, ICRIER-Prosus Centre for Internet and Digital Economy
Disclaimer: Views expressed are personal and do not reflect the official position or policy of Financial Express Online. Reproducing this content without permission is prohibited.