By Aniruddh Nigam
In a previous post, we discussed why the concept of community data rights has the potential to be a framing device for claims of distributive justice in the digital economy, both in using data for the public good and in furthering Indian developmental interests.
This potential is, however, trapped by the tangled web woven by the Report of the Committee of Experts on Non-Personal Data (NPD Committee Report) constituted by the Ministry of Electronics and Information Technology. The report asserts clear intentions but suffers from a lack of clarity in key definitions and a lack of detail in its institutional vision of the digital economy.
It may be a positive first step on issues such as recognising community data rights, but it should be put into perspective. This means two things: first, the report is merely the first step and the maturity or sophistication of Indian policy on non-personal data is not self-evident, and second, it must be followed by other steps to truly realise the proposed vision of the report. In a worrying development for both propositions, news reports have claimed that the Joint Parliamentary Committee on the Personal Data Protection Bill has been considering expanding the scope of the Bill to the regulation of non-personal data.
This has been assessed by some, rightly, to be an unwise idea. Key questions have been raised about the scope of data sharing under the policy, the potential costs of this policy for data-oriented businesses and the desirability of a unified regulator for both personal and non-personal data. Many of these questions remain unanswered and sound policy would require striking complex balances between many stakeholders, who must be consulted in the process. More importantly, the key question to ask is whether the lofty goals of the NPD Committee Report, such as realising community data rights, are likely to be achieved by the rushed inclusion of non-personal data within the PDP Bill.
Need for clarity on desired outcomes in the NPD Committee Report
A good example of this is the concept of community data rights, which is the stated principled basis for the Report’s measures of creating data sharing mandates and enabling greater data sharing. The goals of this policy are quite ambitious: that data is put to uses that benefit communities, that communities have greater agency over how non-personal data about them is used and that the economic benefits of the digital economy are distributed amongst communities. Achieving these goals requires better theoretical clarity about the definition of ‘communities’ and the kinds of control that are envisaged, as well as several practical measures to create a data economy and institutions that are oriented to these ends. Merely legislating for community data rights is unlikely to make them real.
What would it take, then, to realise the concept of community data rights? The goal of this policy cannot be simply making more datasets available on a data exchange. It would be myopic to think of the problem as purely about a scarcity of datasets or Excel sheets. Enabling the use of data for the good of proximately situated communities requires building the right institutions, norms, incentives and mechanisms around data sharing. The identification of these factors requires a clarity of desired outcomes that is currently lacking in the NPD Committee Report.
This clarity of outcomes is even more necessary considering the potential costs of these policies. The financial incentive to build a data-driven business is intricately tied to the ability to obtain a competitive advantage through the use of data. Rash policy can strike a sub-optimal balance and damage economic productivity and start-ups in the digital economy. This makes it key to ask what the proposed societal benefits of this policy are, and what practical measures it would take to realise them.
Big picture of developmental interests driving ‘community data rights’
There are a few goals tied to the idea of ‘community data rights’ which are worthy of examination, and which reveal the long road that is still to be traversed. First, that enabling greater data sharing would realise ‘community data rights’ by spurring innovation and putting data to uses which benefit communities. Fostering innovation in developing artificial intelligence (AI), however, is a broader proposition than merely sharing data. The data should not just be available; it should also be standardised and of high quality, there should be enough people with the capability and competence to use it, and there should be economic incentives for those people to build local solutions. This goal has to be conceptualised as being more ambitious than the creation of a few local travel scheduling apps: it requires building a digital economy that enables and rewards innovation without necessarily achieving scale.
This needs significant investment in human capital, organisational capacity and perhaps, financial subsidies – particularly, in a way that does not concentrate capacity in a few Tier-I cities. This is possible only if fostering innovation and enabling the use of AI are a core focus of economic planning and policy.
Second, that this would allow emerging domestic businesses to compete with entrenched monopolies in the digital economy. The question to ask here is whether it would be enough for an upstart e-commerce platform to have, for example, access to sales data from Amazon. It would be myopic to think that access to data alone would bring about a level playing field. Such an upstart would be competing not only against the significant investment that a platform like Amazon has made in its product teams across the world, but also against the bundled logistics services and network effects that entrenched players already enjoy. Enabling true competition, then, would require not just data sharing mandates but also legislation on data portability and interoperability through personal data laws, disintermediation of platforms and logistics layers, measures in competition law and work on scaling open platforms for digital commerce. The closed systems within which data collection is currently embedded must be systematically unlocked through holistic reform of the digital economy, and to focus only on enabling sharing of non-personal data would be to miss the forest for the trees.
Third, that this would give communities a greater role and collective agency in how data generated by them is used, as well as a greater share of the economic benefits of this data. There are muddy questions about the definition of a ‘community’ – take a simple, sectoral use-case of public transit data generated in a city. The definition of a ‘community’ could be the general population of the city, or only the people who use public transport, or it could be more granular with the community defined based on localities. The mechanisms for exercise of this agency would have to be effective at the correct scale.
Further, participatory governance depends on the continued engagement of the community. Given low general awareness, the low priority of non-personal data governance in people’s lives and the likelihood of voter fatigue, it is a formidable task to imagine effective participatory governance in dispersed urban communities. More bureaucratic forums and committees cannot solve this, and systems need to be built which can govern for the community, in its interest.
There are proposed models for commons-based governance which address some of these challenges, like different models of data stewardship and data trusts, but these must be evaluated through on-ground pilots. If the objective is also to redistribute the economic benefits of data to communities, then it requires not just governance mechanisms but also the development of benefit-sharing frameworks and the structuring of financial incentives for data providers and data users. This form of institution-building and norm formation is a weighty and, inevitably, lengthy task requiring sustained engagement from civil society as well as government.
Crafting a multi-pronged approach for realisation of ‘community data rights’
It is clearly a mammoth task that lies ahead for the Indian government if it wishes to follow through on its proposed goals in the NPD Committee Report. In its rush to challenge the skewed nature of the global digital economy and legislate for non-personal data in the PDP Bill, the government runs the risk of being penny wise but pound foolish. To effectively develop a policy that can reform the digital economy in India’s developmental interests, it is important to take a measured, incremental, and multi-pronged approach to policymaking. This would require, in the context of non-personal data, a focus on some key pillars, such as institutions, norms and incentives.
Institutions – such as data stewards, regulators and technology solution providers – are essential for the effective functioning of this vision. The role of institutions here is to enable repeatable and trusted data sharing, ensure the enforcement of rules, protect against reidentification harms, develop standards, protocols and best practices and safeguard the interests of people and communities.
Importantly, in the context of community data rights, the institution through which the collective agency of the community will be exercised will determine the shape taken by this policy. It would be folly to treat the State as the proxy of ‘communities’ generally for the purpose of this policy and ignore the hard work of institution building necessary to realise the optimal form of this vision. Instead, this requires experiments with localised and sectoral pilots evaluating different forms of data stewardship, combined with a consultative approach to developing a regulatory structure for non-personal data.
Norms, such as the adoption of common data standards and protocols, interoperability standards, the operation of data exchanges, participatory governance norms and the broader norms about the use of data in other processes, would also be key to these benefits playing out. For example, the accessibility of data is often touted as enabling better-informed policymaking, but this does not always realistically align with the importance given to data-based insights in policy processes.
Finally, in the absence of a proper incentive structure, this policy runs the risk of stopping at the creation of some databases of non-personal data, while crippling emerging businesses and failing to realise any benefits of AI for the community in developing local solutions. Incentives need to be structured for data providers to part with their data where possible, perhaps through developing targeted, sectoral collaborations to achieve specific goals. Incentives also need to be structured for data stewards to act as responsible guardians of data, for users of data to develop local solutions and for ordinary individuals to participate in the mechanisms for community-based governance.
These are just some of the many issues which must be confronted in developing a non-personal data governance framework that actually works to benefit communities. The holistic nature of reform required for this policy to bear fruit will take time and sustained engagement. It would be wrong to believe that merely inserting some legal provisions can achieve these goals, especially when many key questions are unanswered and the institutions responsible for this policy are vaguely conceptualised. In enacting this framework, which has significant implications for the digital economy and for economic well-being, the government must be certain that it has picked a winning horse. Perhaps to know this, it is important that for now, it holds its horses.
Aniruddh Nigam is a Research Fellow at Vidhi Centre for Legal Policy.
Vidhispeaks is a fortnightly column on law and policy curated by Vidhi. The views expressed are of the fellow and do not reflect the views of Vidhi or Bar & Bench.