Author: Michael Kalichman, 2001
Contributors: P.D. Magnus, Dena Plemmons
Updates: Michael Kalichman, 2016
Questions to be asked to foster data integrity:
Prompted by concern about numerous cases of research misconduct, the Department of Health and Human Services (1990) convened a workshop on data management. The workshop highlighted the many ways in which research depends fundamentally on responsible data management. Several resources provide comprehensive reviews of good data management practices (e.g., Macrina, 2014; Mays and Macrina, 2014) and of recordkeeping in particular (e.g., Kanare, 1985; NIH Office of the Director, 2008).
Data management in research is rarely regulated, except:
Because data collection can be repetitious, time-consuming, and tedious, there is a temptation to underestimate its importance.
However, adequate planning and preparation can:
The best model for recordkeeping will not be the same for all areas of research.
However, nearly all types of research include records that could reasonably and usefully be kept in bound lab notebooks.
Nominal records would include:
Research ownership typically passes from the funder of the research (e.g., a federal agency or a private funder) to the University or institution, not to the research investigators.
Although the products of research involve creative contributions to new knowledge, the resulting data are in effect no different from the routine products of employees in any other private or public institution.
Equipment, materials and reagents, and the resulting data all belong to the institution in which they are purchased or produced, even though scientists customarily speak and act as though these were their own.
The issue of institutional ownership becomes especially salient if:
In practice, even though the University or institution has legal standing to make decisions about what can or will be done with research data, it does not typically do so.
Absent an explicit agreement or ruling to the contrary, the principal investigator (PI) has primary responsibility for decisions about the collection, use, and sharing of data.
The quality of data supporting published work is moot if the data are lost or discarded.
Retaining records of research is necessary not only for the purpose of research, but to:
What can be retained, and for how long, depends in part on the nature of the products of research.
Some materials, such as thin sections for electron microscopy, cannot be kept indefinitely because of degradation.
It is also impractical to store extraordinarily large volumes of primary data.
At minimum, enough data should be retained to reconstruct what was done.
Original data are the responsibility of the principal investigator (PI) and should be kept in her or his lab or office.
Although most researchers expect that graduating students may take copies of their research records, student or postdoctoral researchers should assume, unless told otherwise, that their original data will stay with the PI.
If regulations or other considerations preclude researchers from taking copies, then the PI has a responsibility to make this clear to the research group before work begins.
Any stored data will be rendered useless if there are insufficient records to locate and identify the material in question.
Ease of access must be balanced against security, for instance, if the study involves human subjects with a reasonable expectation of confidentiality.
Although the institution is the legal owner of the data, it is usually the responsibility of the principal investigator to ensure that records are stored in a secure, accessible fashion.
Under current National Institutes of Health (NIH, 2015) and National Science Foundation (NSF, 2005) requirements, research records must be maintained for at least three years after the last expenditure report.
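As a rough illustration of the arithmetic behind this minimum, the following Python sketch computes the earliest date on which the three-year period would have elapsed. The function name, the default of three years, and the idea of counting from the calendar date of the last expenditure report are assumptions made for this example; they are not taken from NIH or NSF guidance, and the result is only a lower bound, not a recommended disposal date.

```python
from datetime import date


def earliest_disposal_date(last_expenditure_report: date, min_years: int = 3) -> date:
    """Return the first date on which the minimum retention period has elapsed.

    Hypothetical helper: assumes the period is counted in whole calendar
    years from the date of the last expenditure report.
    """
    target_year = last_expenditure_report.year + min_years
    try:
        return last_expenditure_report.replace(year=target_year)
    except ValueError:
        # Report dated February 29 and the target year is not a leap year:
        # use March 1, the later (more conservative) choice.
        return date(target_year, 3, 1)


# Example: a final expenditure report dated June 30, 2016.
print(earliest_disposal_date(date(2016, 6, 30)))
```

A longer institutional requirement could be represented by passing a larger `min_years` value, for example `earliest_disposal_date(date(2016, 6, 30), min_years=5)`.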
Federal regulations or institutional guidelines may require that data be retained for longer periods. However, these formal requirements are minimal constraints. Decisions about retention of records should take into account:
Federal agencies, particularly the NIH (2003) and NSF (2010), have made funding contingent on plans to share research data and products, particularly after publication.
An open data policy reflects positively on those who share and benefits science by increasing the likelihood of new insights, collaboration, and reciprocal sharing.
Although sharing of data is generally in the best interests of science and the individual, it is clear that such sharing can place an individual scientist at risk: