Archival Data


-Archival data are any data that are collected prior to the beginning of the research study.

-The data may or may not contain HIPAA identifiers.

-The data are also the primary source (as opposed to a secondary source, where the data were already analyzed for another publication).

-Archival data generally fall into the following categories:

  • Public data sets
  • Private data sets
  • Private records


Public Data Sets
  • Various government agencies and academic institutions make the data they collect available to the public for research purposes. Use of a public data set does not involve human subjects and therefore does not require IRB review as long as all of the following criteria are met:
    • Research will NOT involve merging any of the data sets in such a way that individuals might be identified
    • Researcher will NOT enhance the public data set with identifiable, or potentially identifiable data
    • Researcher will NOT use a “restricted” data set*
    • Researcher will use a Public Data Set that is included on the IRB-HSR list of approved Public Data Sets
    • Researcher will NOT use data from the NIH GWAS (Genome-Wide Association Studies) data repository
    • The data host does NOT have any requirements beyond a Data Use Agreement (for example, required membership in a consortium or society in order to access the data). If there are additional requirements, submit a Determination of Human Subject Research Form so the appropriate type of submission for your project can be determined.


*Restricted data set: special files distributed by federal agencies and research organizations, on which use restrictions are imposed. These files often contain data such as Social Security numbers, names, or extensive life history markers that might enable an unauthorized user to identify a participant.
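
The six criteria above work as a checklist: a single failed criterion means the use may need IRB review. The sketch below encodes that logic for illustration only. The class, field, and function names are hypothetical, and it does not replace a determination by the IRB-HSR.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PublicDataSetUse:
    """A researcher's answers to the six criteria (field names are hypothetical)."""
    merges_data_to_identify_individuals: bool
    adds_identifiable_data: bool
    uses_restricted_data_set: bool
    on_irb_hsr_approved_list: bool
    uses_nih_gwas_repository: bool
    host_requires_more_than_dua: bool


def unmet_criteria(use: PublicDataSetUse) -> List[str]:
    """Return a short description of each criterion the planned use fails."""
    problems = []
    if use.merges_data_to_identify_individuals:
        problems.append("merging data sets might identify individuals")
    if use.adds_identifiable_data:
        problems.append("identifiable data would be added to the data set")
    if use.uses_restricted_data_set:
        problems.append("the data set is restricted")
    if not use.on_irb_hsr_approved_list:
        problems.append("the data set is not on the IRB-HSR approved list")
    if use.uses_nih_gwas_repository:
        problems.append("the data come from the NIH GWAS repository")
    if use.host_requires_more_than_dua:
        problems.append(
            "the data host has requirements beyond a Data Use Agreement; "
            "submit a Determination of Human Subject Research Form"
        )
    return problems


def meets_all_criteria(use: PublicDataSetUse) -> bool:
    """True only when every one of the six criteria is satisfied."""
    return not unmet_criteria(use)


# Example: an approved public data set whose host also requires society membership.
planned_use = PublicDataSetUse(
    merges_data_to_identify_individuals=False,
    adds_identifiable_data=False,
    uses_restricted_data_set=False,
    on_irb_hsr_approved_list=True,
    uses_nih_gwas_repository=False,
    host_requires_more_than_dua=True,
)
for problem in unmet_criteria(planned_use):
    print("Not met:", problem)
```

Returning the list of failed criteria, rather than a bare yes or no, shows the researcher which condition prompted the need for further review.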



Private Data Sets

Private data sets may include (but are not limited to):

  • data collected previously by another researcher for another study,
  • data collected by another agency for evaluative or research purposes,
  • your own data that you collected for a previous study.


Private data sets generally require permission to access the data, and the IRB-HSR will need to know that you will obtain (or have already obtained) proper permission to access the data. 

Unless you already own the rights to the data, you will need to obtain permission from the “owner” to use it for your research, even if you have access to the data as part of your profession.


The need for IRB approval for the use of data from a private data set depends on the identifiability of the data. For additional information, see Activities that Require IRB Review.


Private Records

Private records are data that were not collected with the intent to conduct research, but instead exist for the purpose of collecting information on individuals for the individual’s own sake. For example, student records, medical records, and credit histories are private records that are maintained by agencies rather than by the individual but contain personal information about the individual. Some of these records are collected by government agencies and are accessible to the public by law; those records fall under the public data sets category. Private records are governed by privacy laws and regulations, so they require special permission to access as well as additional safeguards for using the data. This section looks specifically at student records and medical records, two types of private records that are regularly requested for research.

The need for IRB approval for the use of data from private records depends on the identifiability of the data. For additional information, see Activities that Require IRB Review.