This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
The recently released
In recent years, there has been a proliferation of health research worldwide. Health research contributes to the understanding of disease, the improvement of healthcare systems, the development of new medicines and treatments, and technologies aimed at bettering health and healthcare (
With the growth of health research in South Africa came the need to address various ethical concerns in health research, align with international standards, protect research participants, and ensure the proper conduct of health research. In 2015, the Department of Health (DoH) released the second edition of the
Health research has further progressed with the advancement of genome sequencing, which led to genomics research and the use of large datasets. The availability of health research data, which could have huge positive impacts on population health, led to calls for datasets, materials, processes, protocols, findings, results, and software to be made more accessible (
The newly released
Given that advancements in technology have allowed science to become more “open,” open science must be viewed as distinct from the previous
A common definition of open science, put forward by
Central to the implementation of open science are the FAIR Guiding Principles, which are applicable to scientific data management and stewardship (
Open science has not only been promoted by the AOSP in various strategies and reports (
Given the importance of open science, one would expect it to appear in most government documents. However, in South Africa, a focus on open science has been lacking, and it has not featured in many recent and relevant publications—such as the
The Draft Guidelines are intended to provide minimum standards for undertaking ethical and responsible research in South Africa (
In what follows, I analyze various problematic aspects of the Draft Guidelines, specifically in relation to open science—namely, the failure to consider open science, the definition of open data, the importance of comprehensive definitions, the matter of privacy and consent, and the failure to provide proper guiding principles for open access data—and point towards potential solutions, where relevant.
Open data, which is explicitly referred to in the Draft Guidelines, is regarded as a “sub-set” of open science (
In recent years, there has been a push for open science in South Africa, and the concept has featured in two government documents: the 2019 STI White Paper and the 2022 Draft National Open Science Policy. However, the Draft Guidelines focus on only one aspect of open science, namely open data, and do not mention open science at all. Therefore, the Draft Guidelines do not promote government policies and strategies intended to further research in South Africa and make it more open and accessible.
However, it should be noted that, without expressly stating so, the Draft Guidelines do appear to point towards open science. The Draft Guidelines recognize that the sharing of data has the potential to
Generally, definitions of open data denote that such data must be freely accessible to be used and re-used by anyone (
The Draft Guidelines rely on the definition of “open data” provided in the Draft National Policy on Data and Cloud, which defines it as “data that is made freely available to everyone for use, re-use and republishing as they wish, subject to ensuring protection of privacy, confidentiality and security in line with the Constitution” (
The provision of definitions serves to assist in providing a common understanding of key terms, thereby lessening the chance of ambiguity and misinterpretation, and ensuring consistent implementation (
The Draft Guidelines lack definitions relevant to open access data, and only contain a definition of “open data” (defined above). However, had the Draft Guidelines placed this within the broader concept of open science, a definition of such would have been beneficial. Notwithstanding this, there are other definitions relevant to open access and data in research that are pertinent to include. For example, the Draft National Open Science Policy defines “open access” as “a set of principles and a range of practices through which research outputs are distributed online, free of cost or other access barriers” (
Additionally, the Draft Guidelines seem to make fundamental errors in basic definitions. The terms “open data” and “open access” are not synonymous and should therefore be distinguished. However, the Draft Guidelines refer to “open access,” “open data,” and “open access data” and appear to conflate these three terms—which causes confusion regarding what is being referred to (
A further point to note is the difference between the two definitions of “open data”: one provided in the Draft National Open Science Policy and the other in the Draft National Policy on Data and Cloud (and utilized in the Draft Guidelines). Both definitions refer to data that is freely available to all and can be used and shared, although the Draft National Policy on Data and Cloud refers to re-use and republishing (
Given the above, I suggest that the Draft Guidelines consider revising the definitions provided in relation to open access data. The inclusion of additional relevant definitions—such as open science and open access—as well as the provision of a comprehensive and integrated definition of open data will serve to provide greater clarity when interpreting the Draft Guidelines.
Central to health research is the sharing of data and results. Increased access to such data serves to streamline the research process, making it more efficient and participatory by lessening duplication as well as the costs associated with the creation, transfer, and re-use of data (
The Draft Guidelines also note that although many participants may not want to publicize their health and genetic data, there are some that do and there should be no obstacles to prevent participants, who wish to share their data in an identifiable manner, from doing so—provided that all foreseeable harms resulting from identification are negligible and understood by participants (
Given the complexities of health and genomics research, as well as the potential risks involved, consent is vital in all health research involving human participants. The Draft Guidelines provide for three types of consent—specific (or narrow) consent, tiered (or differentiated) consent, and broad consent (
A potential legal and ethical pathway for an open consent model for genomics research and open access databases in South Africa has already been established (
The Draft Guidelines deal with what they refer to as “guiding principles for open access”. The Draft Guidelines provide that because the Draft National Policy on Data and Cloud supports open access to data, there is a need for guiding principles for health research. Contrary to what is stated in the Draft Guidelines, it is not only the Draft National Policy on Data and Cloud that supports open access to data. Other policies and reports, such as the Draft National Open Science Policy (
Before examining each of the guiding principles for open access in the Draft Guidelines, it should be noted that some of the principles in the Draft Guidelines come from the
Data curation is important in promoting open access (and thereby open science) in research as it maintains the integrity and value of open data. However, the concept of curation is broad and multifaceted, ranging from the selection of data to its management (
The Draft Guidelines mention the preservation of data with “acknowledged long-term value” (
Although data management plans tend to focus on active research, and long-term data curation deals with the preservation, maintenance, and accessibility of data after the research has been completed (
To provide greater clarity to researchers, I suggest that the Draft Guidelines amend this principle to be more in line with the Draft National Open Science Policy. There are two possible ways in which this can be achieved: (1) the Draft Guidelines amend the current principle to “strategies for long-term data curation are required”; or (2) the Draft Guidelines remove the current principle and combine it with principle (4) regarding data management plans, which is discussed below. I suggest that each of the guiding principles for managing open access data provided in the Draft Guidelines contain an explanation in order to expand on the principle and provide proper, and more detailed, guidance to researchers. Therefore, in terms of (1), the Draft Guidelines can explain that detailed long-term data curation may not be required for all research projects, and it depends on the research. In terms of (2), the Draft Guidelines can specify that long-term data curation be included as part of the data management plan—in line with the Draft National Open Science Policy—or, where required and depending on certain factors like the nature of the research and the type of data collected, long-term data curation be detailed separately.
The principle relating to reasonable first use in the Draft Guidelines was adopted from the UK Concordat on Open Research Data (
I suggest that the Draft Guidelines remove reference to the right to reasonable first use, and instead focus on ownership. In South Africa, the current position is that the data generator can acquire ownership of the data (
It is important that the Draft Guidelines differentiate data ownership from copyright in datasets. While ownership of data is governed by property law—as found in South Africa’s common law—copyright in a dataset is governed by intellectual property law—specifically the
Being the data owner will assist in giving researchers the confidence that they have the right to openly share their data—thereby promoting open access and open science. As such, I suggest that this principle be replaced with the following: “Data generators, as owners of the data, should be encouraged to openly share their data”. This revised principle should explain: (1) the position on ownership of data in South African law; (2) the fact that ownership and intellectual property rights should not be confused; and (3) how data generators should promote open access and open science by sharing their data. Additionally, recognition should be given to indigenous people, in line with the CARE Principles. The Draft National Open Science Policy acknowledges that the CARE Principles deal with research that is not unethical or exploitative, and where the design of data ecosystems ensures that indigenous people benefit from such research (
The Draft Guidelines provide that the openness of research data may be limited if there are “sound reasons” for doing so (
In South Africa, publicly funded research is governed in part by the
A data management plan is a formal document that details how data will be handled throughout a research project. It addresses the data to be gathered during a research project, its management, analysis, and storage, as well as measures for sharing and preserving data once the research is complete (
Additionally, the POPIA Code of Conduct for Research contains the relevant information that researchers must include in their research protocol. A research protocol is defined as “documentation that outlines the plan of a research study” and is inclusive of a data management plan (
The Draft Guidelines state that the use of secondary data should be governed by legal, ethical, and regulatory frameworks that protect personal information, but fail to expand on what these frameworks are. For example, POPIA and the POPIA Code of Conduct for Research are specifically designed for this purpose, but are not mentioned in this section of the Draft Guidelines. Without concrete guidance and clarity, the guiding principles for open access data provided in the Draft Guidelines fall short.
Furthermore, it is not just the secondary use of data that is important. The initial processing of data must adhere to data protection laws. Section 13(1) of POPIA requires that personal information be collected for a “specific, explicitly defined and lawful purpose”. Section 15(1) of POPIA allows for the further processing of personal information, provided that it is compatible with the purpose for which it was originally collected. Therefore, if data was initially collected for research, any subsequent use of the data for research is allowed in terms of POPIA. Further, where personal information is used for
The POPIA Code of Conduct for Research is mentioned in the Draft Guidelines in terms of privacy and confidentiality of participants, and offers a means to ensure that researchers are compliant with POPIA (
The Draft Guidelines, while providing a principle regarding the protection of personal information, only consider secondary use of data (and not initial use) and fail to define the “legal, ethical and regulatory frameworks” that are applicable. This means that there is a lack of guidance regarding this important aspect of research, which could lead to a contravention of the provisions in POPIA. To remedy this, I suggest that the Draft Guidelines revise this guiding principle as follows: “the use and re-use of data should be governed by legal, ethical, and regulatory frameworks that promote the protection of personal information”. Additionally, I suggest that the Draft Guidelines: (1) provide for both the initial use, as well as the re-use, of data; and (2) make reference to POPIA and the POPIA Code of Conduct for Research. However, the Draft Guidelines should ensure that they state the law as it exists, rather than attempting to engage in an interpretive exercise.
The final guiding principle for managing open access data in the Draft Guidelines provides that use of secondary data should acknowledge its sources and comply with the terms of access and use. This principle in the Draft Guidelines is taken from the UK Concordat on Open Research Data (
While the Draft Guidelines refer to the “use of secondary data,” most other policies and strategies in South Africa dealing with open science, open access, and open data refer to re-use. Although POPIA and the POPIA Code of Conduct for Research do not specifically require acknowledgement of the data source, acknowledging sources promotes transparency (a lawful ground for the processing of personal information in POPIA) and is good practice.
The Draft National Open Science Policy, while not specifically referring to “secondary use,” does refer to “re-use” and permits data to be used and re-used freely without restriction, and without the need to acknowledge sources (
As good practice, I suggest that the Draft Guidelines amend this guiding principle to read as follows: “The re-use of data should include appropriate acknowledgement of the sources and adhere to the terms of access and use”. It is important for the Draft Guidelines to clarify what is meant by this guiding principle and what is required of researchers in this regard.
In determining guiding principles for open access data, the Draft Guidelines rely solely on the Draft National Policy on Data and Cloud to the exclusion of other relevant policies and documents. However, open data—as the Draft Guidelines define it—cannot be viewed in isolation, and regard must be had to the broader concept of open science. Open science and its related terms—such as open access and open data—feature in several government policies and strategies and offer potential pathways for the open sharing of data. Many of the existing policies and strategies do not provide concrete guidance on open science or open access, but rather call for the establishment of a policy or framework to govern the field (
The ICT Policy White Paper aims to utilize Information and Communication Technologies (ICTs) to reduce poverty and inequality in South Africa (
The Draft National Policy on Data and Cloud aims to promote the socio-economic value of data and create an enabling environment for the data ecosystem to flourish through
The AOSP recognizes that the shift to open science is necessary (
Among the policy intents of the STI White Paper is ensuring that South Africa’s knowledge system is open, diverse, and responsive (
Importantly, the Draft National Open Science Policy specifically provides guiding principles for open science in South Africa (
Based on the above, it is essentially only the Draft National Open Science Policy that explicitly provides guidelines for open science in South Africa. Although useful, it is clear that these guidelines are broad and are not tailored to the specific area of health research. Nevertheless, I suggest that the Draft Guidelines place greater reliance on the various government policies and strategies in existence as they are essential in the realization of open science in South Africa. The Draft Guidelines should be cautioned against adopting principles from other jurisdictions, as was done through reliance on the UK Concordat on Open Research Data (
The AOSP highlights that, in adapting to open science, Africa should do so in its own way and based on its own priorities, rather than following other jurisdictions (
The guiding principles for managing open access data provided by the Draft Guidelines lack concrete guidance on a pathway for the use and sharing of open access data in health research in some respects. These guiding principles appear more as values that have little to do with promoting openness and access, and rather focus on the protection and limitation of such data. As such, there is certainly room for improvement, specifically in terms of the guiding principles for managing open access data. Below, I provide consolidated suggestions for improving the Draft Guidelines based on my analysis above.
The suggestions for principle (1) are as follows: (1) provide a definition of “curation” in order to provide clarity to researchers; (2) remove reference to data curation in terms of POPIA as it does not appear in the Act; and (3) clarify how long-term value will be determined, or acknowledge that in the context of health research, it is likely that all data will be valuable in the long-term. The Draft Guidelines can explain that detailed long-term data curation may not be required for all research projects, and it depends on the research. Alternatively, this principle can be combined with principle (4) regarding data management plans below, in which case the Draft Guidelines can specify that long-term data curation be included as part of the data management plan or, where required and depending on certain factors like the nature of the research and the type of data collected, long-term data curation be detailed separately.
The suggestions for principle (2) are as follows: (1) remove reference to the right to reasonable first use, and instead focus on ownership; (2) explain the position on ownership of data in South African law; (3) differentiate data ownership from copyright in datasets; (4) promote the open sharing of data by data generators; and (5) give recognition to indigenous people and their data in terms of the CARE Principles.
The suggestions for principle (3) are as follows: (1) elaborate on situations when the openness of research data may be restricted, what these sound reasons are, and how they will be implemented; and (2) provide an explanation under this guiding principle that contains a caveat listing instances where openness may be restricted.
The suggestions for principle (4) are as follows: (1) require that data management plans, where applicable, describe how data used in research will be made open; and (2) make specific reference to the POPIA Code of Conduct for Research, which contains requirements for research protocols.
The suggestions for principle (5) are as follows: (1) provide for both the initial use, as well as the re-use, of data; and (2) make reference to POPIA and the POPIA Code of Conduct for Research as the “legal, ethical and regulatory frameworks” that are applicable. The Draft Guidelines should be cautioned against interpreting the law, and should rather state the law as it exists.
The suggestions for principle (6) are as follows: (1) remove reference to “secondary data” and replace it with “re-use”; and (2) clarify what is meant by this guiding principle and what is required of researchers in this regard.
In addition to the guiding principles for open access data, there are further considerations that I suggest the Draft Guidelines take into account: (1) avoid placing sole reliance on the Draft National Policy on Data and Cloud and adopting principles from the UK Concordat on Open Research Data that may not apply in South Africa in their current form; (2) explicitly mention open science and expand on its importance in health research; (3) develop a comprehensive definition of “open data” that takes into account other definitions provided by the Draft National Open Science Policy and the ICT Policy White Paper; (4) provide other definitions relevant to open access and data in research, such as “open science” and “open access,” and differentiate between “open data,” “open access,” and “open access data”; (5) provide a potential pathway for open consent to further open science; (6) retain the previous provision in the 2015 DoH Ethics Guidelines regarding blanket consent to allow for the possibility of open consent; (7) refer to other South African government documents that deal with open science, open access, and open data to bolster the Draft Guidelines; and (8) include reference to South African legislation, where relevant. I also suggest that each of the guiding principles for managing open access data provided in the Draft Guidelines be accompanied by an explanation in order to expand on the principle and provide proper, and more detailed, guidance to researchers.
Health and genomics research in South Africa have a vital role to play in bettering the health of the population through an increased understanding of various diseases and the ability to develop more effective treatments and advance healthcare and technologies. However, their full potential cannot be realized if data and resources are not open and accessible to others. The Draft Guidelines serve to guide researchers in conducting health research in an ethical and responsible manner. Although the Draft Guidelines set the benchmark for health research in South Africa and are invaluable in certain respects, the inclusion of open access databases in the Draft Guidelines requires improvement. By relying on only one draft government policy, namely the Draft National Policy on Data and Cloud, and overlooking other relevant drafts, such as the Draft National Open Science Policy, the Draft Guidelines cannot provide a comprehensive and context-specific pathway for open access data in research. Additionally, and from a policy perspective, the Draft Guidelines have an obligation to consider, and align with, principles of open science. By failing to expressly do so, the Draft Guidelines fall short in this regard.
While the inclusion of open access in the Draft Guidelines, especially in the context of health research, is a positive step towards open science and the transformation of the research landscape in South Africa, there is room for improvement. Specifically, the Draft Guidelines should: (1) include reference to open science and its importance in South Africa; (2) add comprehensive definitions for clarity, such as “open science” and “open access”; (3) consider the pathway for open access databases in South Africa by relying on an open consent model; and (4) have regard to the guiding principles for open access data and ensure that detailed guidance is provided to researchers, with reference being made to other relevant South African legislation and policy. The Draft Guidelines can also place reliance on existing policies and strategies that deal with open science and open access in order to align the Draft Guidelines with national imperatives. The implementation of these suggestions will serve to strengthen the Draft Guidelines and their position on open access databases.
AG: Formal Analysis, Writing–original draft, Writing–review and editing, Investigation.
The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. Work on this article was supported by the U.S. National Institute of Mental Health and the U.S. National Institutes of Health (Award Number U01MH127690) under the Harnessing Data Science for Health Discovery and Innovation in Africa (DS-I Africa) program.
The author would like to acknowledge the Google PhD Fellowship Program. The author would also like to thank Donrich Thaldar for his useful comments on this manuscript.
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
The content of this article is solely my responsibility and does not necessarily represent the official views of the U.S. National Institute of Mental Health or the U.S. National Institutes of Health.