What is a Data Management Plan: A Complete Guide

Team Airbyte
June 27, 2025

A data management plan (DMP) is more than a formal document; it’s a foundation for responsible data stewardship. Whether you're working in clinical data management, enterprise analytics, or an academic research project, a DMP outlines how the data you collect will be organized, stored, secured, and shared across its full lifecycle.

Funding agencies such as the National Institutes of Health, the National Science Foundation, and international research funders increasingly require a management and sharing plan as part of grant proposals.

As data becomes central to everything from scientific disciplines to business operations, building a clear, actionable plan is essential. A well-structured DMP reduces compliance risk, enables long-term access, supports reproducible outcomes, and improves discoverability through platforms like Dryad, Zenodo, or the California Digital Library.

What is a Data Management Plan and Why Is It So Important?

A data management plan is a formal document that describes how data will be collected, structured, stored, preserved, and ultimately made available for data sharing and re-use. It covers everything from file formats and naming conventions to how existing data will be curated alongside new data.

DMPs are essential in contexts where research data management, clinical studies, or enterprise-level data governance are in play. These plans ensure that data management requirements from funding agencies are met, while enabling other researchers to understand, validate, and re-use findings.

Agencies like the NIH now require a data management and sharing plan, and platforms like the DMPTool offer specific guidance for creating DMPs that meet funder and institutional expectations, including templates aligned with the NY DMP model.

From enabling alignment with existing standards to supporting reproducibility across scientific disciplines, a well-managed DMP helps your research project stay on track. It can also clarify the handling of physical collections, curriculum materials, or software, and it supports long-term archiving through repositories such as the UK Data Service or the California Digital Library.

Ultimately, a DMP is a living document that evolves with your project and helps your team manage data responsibly, effectively, and transparently.

What Are the Key Components of an Effective Data Management Plan?

A strong DMP outlines how data will be handled from start to finish and how it supports the broader goals of your research project or business initiative. Below are the essential elements to include.

Data Types and Formats

Specify how much data will be generated and the data format expected for each type. Clarify whether you're working with new data or existing data, and describe the file formats (e.g., CSV, JSON, NetCDF) to be used. This is foundational to choosing appropriate storage, analysis, and archiving solutions.
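
One way to make this section concrete is to keep the data inventory in a machine-readable form. The sketch below is a minimal illustration, not a prescribed format: the `DatasetEntry` fields, the dataset names, and the size estimates are all hypothetical placeholders to replace with your own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One row of a hypothetical DMP data inventory."""
    name: str
    file_format: str     # e.g. "CSV", "JSON", "NetCDF"
    is_new: bool         # True for newly generated data, False for existing data
    estimated_gb: float  # rough volume estimate, in gigabytes


def total_volume_gb(entries):
    """Sum the estimated volume across all inventory entries."""
    return sum(entry.estimated_gb for entry in entries)


def formats_in_use(entries):
    """Return the sorted list of distinct file formats the plan must support."""
    return sorted({entry.file_format for entry in entries})


# Illustrative values only; replace with your project's own estimates.
inventory = [
    DatasetEntry("survey_responses", "CSV", True, 0.5),
    DatasetEntry("sensor_readings", "NetCDF", True, 120.0),
    DatasetEntry("reference_catalog", "JSON", False, 2.0),
]

print(total_volume_gb(inventory))  # 122.5
print(formats_in_use(inventory))   # ['CSV', 'JSON', 'NetCDF']
```

Totals like these feed directly into storage and archiving decisions later in the plan.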

Data and Metadata Format

Define the data and metadata format standards you’ll follow, referencing existing standards like Dublin Core, ISO 19115, or schema.org. Establish naming conventions early to promote uniformity and ease of access. These best practices improve interoperability and reduce confusion when other researchers attempt to reuse or validate your research data.
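
A naming convention is easier to keep when it can be checked automatically. The sketch below assumes a hypothetical pattern of `<project>_<datatype>_<YYYYMMDD>_v<NN>.<ext>`; the pattern and the example file names are illustrative assumptions, not a standard.

```python
import re

# Hypothetical convention: <project>_<datatype>_<YYYYMMDD>_v<NN>.<ext>
# Example: "soilstudy_sensor_20250627_v01.csv"
FILENAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_"
    r"(?P<datatype>[a-z0-9]+)_"
    r"(?P<date>\d{8})_"
    r"v(?P<version>\d{2})\."
    r"(?P<ext>[a-z0-9]+)$"
)


def parse_filename(filename):
    """Return the named parts of a conforming file name, or None if it does not conform."""
    match = FILENAME_PATTERN.match(filename)
    return match.groupdict() if match else None


print(parse_filename("soilstudy_sensor_20250627_v01.csv"))
print(parse_filename("Final Data (2).csv"))  # None
```

A check like this can run before files are deposited, so that non-conforming names are caught while they are still cheap to fix.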

Data Collection and Storage

Explain how the data collected will be validated and stored securely. Will it be digitized from physical collections, pulled from sensors, or manually entered? Describe storage architecture, backup frequency, redundancy, and how you’ll safeguard sensitive data.
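
One common way to validate that stored or backed-up files have not changed is to record a checksum for each file and compare it later. The following is a minimal sketch using SHA-256 from Python's standard library; the function names and the manifest layout are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's relative path (POSIX style) to its SHA-256 checksum."""
    root = Path(directory)
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """Return names from the old manifest that are now missing or have a different checksum."""
    return sorted(
        name
        for name, checksum in old_manifest.items()
        if new_manifest.get(name) != checksum
    )
```

Building a manifest at deposit time and again after each backup or migration gives a simple integrity check: an empty result from `changed_files` means every original file is still present and unchanged.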

Access and Sharing Policies

Detail your data sharing strategy: who gets access, under what conditions, and when. Include information about embargoes, licenses, or access limits due to patent reasons or intellectual property concerns. If sharing via the UK Data Service, Dryad, or other repositories, note how you will meet their management and sharing plan expectations.

Data Archiving and Reuse

Explain your strategy for data archiving, including repository selection and retention duration. Note if your research results will be deposited alongside other materials, such as code or documentation. Indicate how other researchers will be able to discover, cite, and re-use your datasets.
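
Retention duration is easier to enforce when it is expressed as a date rather than a phrase. The sketch below computes when a retention period ends; the ten-year and five-year figures are illustrative assumptions, since actual retention periods depend on your funder and institution.

```python
from datetime import date


def retention_end(project_end, retention_years):
    """Return the date on which the retention period expires."""
    target_year = project_end.year + retention_years
    try:
        return project_end.replace(year=target_year)
    except ValueError:
        # project_end is Feb 29 and the target year is not a leap year.
        return project_end.replace(year=target_year, day=28)


def may_dispose(project_end, retention_years, today):
    """True only once the retention period has fully elapsed."""
    return today > retention_end(project_end, retention_years)


# Illustrative values only.
print(retention_end(date(2025, 6, 30), 10))  # 2035-06-30
print(retention_end(date(2024, 2, 29), 5))   # 2029-02-28
```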

Compliance and Legal Considerations

List all relevant legal and ethical obligations, such as GDPR, HIPAA, intellectual property rights, or National Science Foundation guidance. Include links to specific guidelines or internal compliance frameworks. If applicable, identify roles responsible for ensuring your DMP remains current and compliant.

How to Create a Data Management Plan (Step-by-Step)

Creating a comprehensive data management plan doesn’t have to be overwhelming, especially when guided by clear steps and supported by tools like DMPTool, the NY DMP template, or institutional platforms. Whether you're preparing a data management plan for an NIH submission or building a framework for internal projects, these steps apply across disciplines and sectors.

1. Define Project Objectives and Data Needs

Start by clarifying your research or business goals. What scientific data, clinical records, physical collections, or curriculum materials will you collect? Understanding your use case informs all other aspects of your management plan and helps structure how you'll manage data and support data sharing.

2. Identify Data Types, Sources, and Formats

Categorize the data collected and list where it comes from — be it sensors, interviews, APIs, or existing data. Include data format details and note whether data is static or dynamic. This step helps align your DMP with repository requirements such as those from the UK Data Service or California Digital Library.

3. Plan Storage, Backup, and Access

Determine where your data will be stored (e.g., cloud, on-prem, hybrid) and how it will be protected. Implement backup strategies that align with data management requirements and funding agencies' security expectations. Address how compliant data access will be granted to team members and collaborators.

4. Assign Roles and Responsibilities

Clarify who is responsible for which tasks, such as managing metadata, updating the DMP, or overseeing data archiving. Define roles for both data creators and stewards. This improves accountability and aligns with institutional data management practices.

5. Define Metadata and Documentation Standards

Choose a metadata format (e.g., Dublin Core, ISO 19115) and describe your naming conventions. Explain how you'll capture contextual details that allow other researchers to understand and re-use the research data. Tie this to any existing standards used in your field.
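
As an illustration, a Dublin Core record can be kept as a simple key-value structure and checked before deposit. The sketch below uses the 15 element names of the Dublin Core Metadata Element Set. Dublin Core itself treats every element as optional, so the "required" list here is a hypothetical project-level rule, and the record values are placeholders.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical project rule: the fields this team chooses to require.
REQUIRED_BY_PROJECT = ("title", "creator", "date", "format", "identifier", "rights")


def validate_record(record):
    """Report keys that are not Dublin Core elements and required fields left empty."""
    unknown = sorted(set(record) - DUBLIN_CORE_ELEMENTS)
    missing = [key for key in REQUIRED_BY_PROJECT if not record.get(key)]
    return {"unknown": unknown, "missing": missing}


# Placeholder values for illustration only.
record = {
    "title": "Soil moisture sensor readings",
    "creator": "Example Research Group",
    "date": "2025-06-27",
    "format": "text/csv",
    "identifier": "example-dataset-001",
    "rights": "CC BY 4.0",
}

print(validate_record(record))  # {'unknown': [], 'missing': []}
```

Capturing the record in a structured form also makes it straightforward to export in whatever layout a chosen repository expects.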

6. Outline Sharing, Reuse, and Preservation Policies

Decide how and when you’ll share the data, whether via public repositories or controlled-access platforms. Provide your sharing plan, including embargo periods, licensing, and documentation for other materials. Describe how you’ll ensure long-term access to and re-use of data and metadata.

7. Document Legal and Regulatory Considerations

List any policies or laws (GDPR, HIPAA, IRB, etc.) that apply. Reference specific guidance from your institution or research funders. If applicable, describe how your data management plan (DMP) supports legal compliance.

8. Choose a Format and Tool

Use tools like the DMPTool, DMPonline, or your institution’s templates to structure your plan. Make sure the plan aligns with your grant proposals, is version-controlled, and references appropriate additional resources. Tools like Airbyte can help automate key parts of your DMP implementation.

Streamlining Your DMP Workflow with the Right Resources

Whether you're submitting a data management plan to meet an NIH requirement or building a DMP for internal use, the right templates and tools can simplify the process. These resources help standardize documentation, support compliance, and reduce the risk of overlooking key components in your data management planning process.

DMPTool and DMPonline

The DMPTool is a widely used, funder-aligned platform offering step-by-step prompts tailored to agencies like NIH, NSF, and DOE. Researchers at U.S. institutions can log in with their credentials and access templates that meet data management and sharing policy requirements. For international teams, DMPonline offers similar functionality and links directly to funder requirements.

NY DMP and Institutional Templates

Universities often provide their own resources, such as the NY DMP from NYU’s research services or versions from Stanford and Columbia. These templates help researchers meet specific guidelines and comply with local data policies. They may include boilerplate sections for managing sensitive data, citing additional resources, and aligning with existing standards.

Repositories and Planning Integration

Data platforms like Dryad, Zenodo, and the Open Science Framework support integrated DMP workflows. These tools bridge the gap between planning and data archiving, automating steps for submission, documentation, and license selection. They help ensure your plan supports long-term access, reproducibility, and data sharing best practices.

Operational Integration with Airbyte

Airbyte helps make your data management plan operational. With automated ingestion, schema control, CDC, and backup monitoring, Airbyte ensures that your plan is consistently executed across environments. It complements the documentation by helping teams actually manage data and meet their management and sharing plan commitments.

Real-World Applications of a Data Management Plan

Academic Research

In universities, DMPs are often required by funding agencies and used to evaluate grant proposals. A strong plan ensures that research data is discoverable, preserved, and reusable by other researchers across scientific disciplines.

Clinical Research

Clinical projects rely on DMPs to protect sensitive data, manage intellectual property, and comply with federal and ethical standards. They often support structured retention and secure data archiving aligned with funder requirements.

Enterprise Use

In commercial environments, DMPs help teams align on data management practices, ensure proper governance, and enable data sharing across tools, vendors, and internal departments. Plans often define rules for data collected from customers or sensors.

Machine Learning

ML teams use DMPs to document dataset lineage, describe metadata formats, and ensure that software and other materials are reproducible. This helps meet expectations around ethical model development and auditability.

How Airbyte Supports Data Lifecycle Planning

Airbyte makes it easier to execute a data management plan by offering automated pipelines that align with the management and sharing plan requirements of many funders.

  • Automated Ingestion: Move existing data and new data into structured environments using 600+ connectors.
  • Schema and Metadata Enforcement: Apply your metadata format, manage schema evolution, and track versions.
  • Retention and Compliance: Configure syncs, logs, and access controls to match retention policies in your DMP.
  • Repository Integration: Send datasets to archives like Dryad, UK Data Service, or institutional S3 buckets with complete metadata and licensing.

Making Your DMP a Living Document

A data management plan is not a one-time requirement — it’s a living document that helps your research project evolve responsibly. When implemented thoughtfully and supported by the right tools, a DMP ensures that data collected, metadata, and supporting files remain useful and secure well beyond the project’s end.

By using platforms like DMPTool and Airbyte together, teams can satisfy funder requirements, align with existing standards, and reduce risk across the board. DMPs enable better science, smoother collaboration, and greater public trust in research results.

Optimize your data management plan with Airbyte—automate ingestion, ensure compliance, and integrate with repositories seamlessly.

Frequently Asked Questions

Can a data management plan be updated during a project?

Yes. A DMP should evolve as your project or research data scope changes, particularly when handling sensitive data, switching repositories, or adopting new data sources.

Who is responsible for maintaining the DMP?

Typically, a PI, data steward, or institutional data officer oversees updates — though team collaboration is essential to meeting compliance and data management goals.

What happens if a DMP is deemed inadequate?

Funding sources may reject the plan, delay funding, or request revisions to align with their specific guidance, especially around data sharing, retention, and intellectual property rights.
