For tutorials and how-to guides for data formatting, uploading, and other ODC actions, click here to go to the Tutorials page.
The Open Data Commons for Spinal Cord Injury is a cloud-based community-driven repository to store, share, and publish spinal cord injury research data.
There are several challenges for scientific reproducibility and bench-to-bedside translation. For example, only research and data that are published actually get disseminated, a phenomenon known as publication bias. Published research reflects only a small fraction of all data collected, and data that do not lead to publication are largely ignored, hidden away in filing cabinets and hard drives. This results in an abundance of inaccessible scientific data known as “dark data”. Even when research is disseminated, it is usually in the form of summary reports of aggregated data (e.g. averages across individual subjects) such as scientific articles. The fact that the individual subject-level data are inaccessible further contributes to dark data.
The spinal cord injury (SCI) community created the ODC-SCI to mitigate dark data in SCI research. The ODC-SCI also aims to increase transparency with individual-level data, enhance collaboration, facilitate advanced analytics, and conform to increasing mandates by funders and publishers to make data accessible. Members of the ODC-SCI have access to a private digital lab space managed by the PI or multi-PIs for dataset storage and sharing. The PIs can share their labs’ datasets with the registered members of the ODC-SCI community and make their datasets public and citable. The ODC-SCI implements stewardship principles that scientific data be made FAIR (Findable, Accessible, Interoperable and Reusable) and has been widely adopted by the international SCI research community.
You can read more about the development of the ODC-SCI in our papers (Nielson et al., 2014; Ferguson et al., 2014; Callahan et al., 2017; Fouad et al., 2020).
Sharing research is critical for scientific progress. Current approaches to data sharing in scientific communities primarily include direct sharing (i.e. via email) between individuals, upload of data as supplementary materials in a publication, or minimally-regulated sharing through personal websites or social media platforms. However, these options do not make the shared data FAIR: data is not readily findable, not broadly accessible, and almost never interoperable and reusable. The best way to publish data is using data repositories which offer capabilities specifically oriented to the goal of sharing data. The ODC-SCI is currently the only community-driven data repository for SCI research. This specific focus allows us to align the repository with FAIR data principles and the SCI community’s needs. Sharing data through the ODC-SCI unlocks latent potential in research data through FAIR principles, enabling the SCI community to better tackle its many challenges.
More directly, sharing and publishing data with the ODC-SCI can benefit you in a number of ways:
We have several different authentication steps and account types in the ODC-SCI to help protect unpublished data. You gain more access and functions as you are approved to the next account type.
Requirement: Email verification upon signing up
Functions: As a registered user, you can explore and access (download) published datasets.
Requirement:
Functions: As a general member, you can receive datasets that other ODC-SCI members share directly with you, including their unpublished datasets (feature under development).
Requirement:
Functions: As a full member, you will be able to upload, share, release, and publish your datasets. Additionally, you can explore and access unpublished datasets that have been released into the Community data space.
We have several different authentication steps and therefore different account types in the ODC-SCI to help protect unpublished data. You gain more access and functions as you are approved to the next account type. For more information, please see “What are the different account types on the ODC-SCI?”
There are several authentication steps and account types on the ODC-SCI to help protect unpublished data (see "What are the different account types on the ODC-SCI?" section). Additionally, the ODC-SCI is organized into different spaces that a dataset can belong to. Each space offers different levels of user access, and the movement of data from one space to another is always at the discretion of the lab’s PI. The general flow of data and the user access is as follows:
A dataset is initially uploaded into the Personal space. Though the dataset must be uploaded into a specific lab, the dataset will only be visible to the original uploader and the lab’s PI.
The PI can choose to Share a dataset with the rest of the Lab. Doing so moves a dataset from the Personal space to the Lab Space. Any dataset in the Lab space can be found and accessed by other Lab Members belonging to the same lab that the dataset was uploaded to.
The PI can also choose to Release a dataset to the ODC-SCI Community space. Any dataset in the Community space can be found and accessed by General Members of the ODC-SCI, even if they do not belong to the same lab that the dataset was uploaded to. If you want to use a dataset in the Community space, you must adhere to the terms in the data use agreement.
PIs control the full process
On the ODC-SCI, the PI has full control of their dataset up to the point of publication. The PI can move the dataset between the Personal, Lab, and Community spaces as they wish. They can also publish a dataset from any data space at any time. Once a dataset is published and in the Public space, however, it can only be removed if the DOI is rescinded, which requires an appeal to the ODC-SCI oversight committee.
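For readers who think in code, the movement rules above can be sketched roughly as follows. This is only an illustration and not how the ODC-SCI is implemented: the names `Space` and `move_dataset` are hypothetical, lab managers designated by the PI are left out, and the review steps required for publication are not modeled.

```python
from enum import Enum


class Space(Enum):
    """The four data spaces named in this FAQ (hypothetical model)."""
    PERSONAL = "Personal"
    LAB = "Lab"
    COMMUNITY = "Community"
    PUBLIC = "Public"


def move_dataset(current: Space, target: Space, requested_by_pi: bool) -> Space:
    """Return the dataset's new space, or raise if the move is not allowed."""
    if not requested_by_pi:
        # Assumption: only the PI is modeled as being able to move data.
        raise PermissionError("only the lab's PI can move a dataset between spaces")
    if current is Space.PUBLIC:
        # A published dataset stays public unless its DOI is rescinded on appeal.
        raise ValueError("a published dataset cannot be moved out of the Public space")
    return target


# The PI shares a dataset with the lab, then pulls it back to Personal.
space = move_dataset(Space.PERSONAL, Space.LAB, requested_by_pi=True)
space = move_dataset(space, Space.PERSONAL, requested_by_pi=True)
```

The point of the sketch is the asymmetry: moves among Personal, Lab, and Community are reversible, while a move into Public is not.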
FAIR stands for Findable, Accessible, Interoperable and Reusable. FAIR establishes a framework for data sharing and defines a set of recommendations developed by FORCE11 (The Future of Research Communications and e-Scholarship) for successful data dissemination. FAIR data encompasses the principles of:
The ODC-SCI has been developed to follow the FAIR principles with tools and functionalities designed for the SCI community’s needs to ensure the success of data sharing with the ODC-SCI.
Yes, in the ODC-SCI, the dataset, dataset metadata, and data dictionary undergo quality checks for proper formatting and completeness. The checks ensure that the data is Interoperable and Reusable. Some quality checks are performed during the upload of the dataset, ensuring a minimal level of quality to all private and public datasets in the ODC-SCI. The check during the upload process is automatic without human oversight since the upload is handled privately within the user’s account. When data is released to the Community data space or submitted for publication, further checks will be conducted by the ODC-SCI Data Team to ensure that the released or published dataset meets FAIR standards.
Any registered lab member can upload data to the ODC-SCI lab they belong to. The PI or any lab members they designate (as lab managers on the ODC-SCI) can share, release, and publish datasets. For more information, see: “What are the different account types on the ODC-SCI?”
Any spinal cord injury (SCI)-associated data, from any species, that can be disseminated in a spreadsheet (CSV) format are accepted. This includes in vivo and in vitro data. Human data must be de-identified prior to upload to the ODC-SCI.
We encourage publishing primary data: minimally processed data that provides the most flexibility and usefulness for additional analysis. Importantly, primary data is not always raw data but may have some minimal transformation to make the data more directly usable.
If the data has been processed, we recommend explaining the methodology in a dataset-associated methodology document which you can upload alongside the dataset.
Yes. Before uploading a dataset to the ODC-SCI, your dataset must be formatted into a Tidy data format.
See: “ODC data structure: Tidy data format” for more information.
See: “How to prepare/format data for uploading” for step-by-step instructions.
This section provides additional information about the concept of Tidy data. Click here for specific instructions on how to prepare and format your data for upload to the ODC.
Data structured for upload to the ODC must follow a specific format known as the Tidy format. The format was developed as a data structure to facilitate big data analytics and standardize the way data values are organized across datasets even across different disciplines and origins. The implementation of the Tidy format has significantly reduced the time scientists spend reformatting data, facilitated the ease of data exploration and analysis, and streamlined the development of analytical tools. Furthermore, the standardized data structure ultimately promotes FAIR data sharing (Findable, Accessible, Interoperable, and Reusable) by improving the Interoperability of the data.
Importantly, Tidy data can look quite different from data formatted for human readability. A Tidy dataset is organized with:
(A) Visual reference for Variables and Observations. (B) An example where the same subject can have multiple rows (i.e. Observations).
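As a rough illustration of case (B), the sketch below reshapes a small "wide" table (one row per subject, one column per time point) so that each subject and time point pair becomes its own row. The column names and values are hypothetical, and whether your own dataset should be reshaped this way depends on the ODC formatting instructions, which define what counts as an Observation.

```python
import csv
import io

# Hypothetical "wide" layout: one row per subject, one column per time point.
WIDE_CSV = """subject_id,group,score_day1,score_day7
rat01,treated,2,9
rat02,control,1,5
"""

PREFIX = "score_day"


def wide_to_tidy(wide_text):
    """Turn each (subject, time point) pair into its own row."""
    tidy_rows = []
    for row in csv.DictReader(io.StringIO(wide_text)):
        for column, value in row.items():
            if column.startswith(PREFIX):
                tidy_rows.append({
                    "subject_id": row["subject_id"],
                    "group": row["group"],
                    "day": column[len(PREFIX):],  # e.g. "score_day7" -> "7"
                    "score": value,
                })
    return tidy_rows


tidy = wide_to_tidy(WIDE_CSV)
for observation in tidy:
    print(observation)
```

In the result, each subject appears on two rows, one per day, and every column holds a single Variable.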
For specific instructions on how to prepare and format your dataset into a Tidy format for upload to the ODC, click here. The instructions also provide guidelines for combinations of parameters that define an Observation. If you need additional guidance, feel free to contact our help desk and an ODC Data Wrangler can assist you.
A data dictionary is a file containing information about each Variable (i.e. Column) in the dataset. The data dictionary provides critical information for the interpretability and reusability of the dataset. Importantly, the data dictionary helps other users understand what each of your Variables is, along with any important details you include. We encourage you to submit a data dictionary with your dataset, even if you do not plan to publish the data.
For more details and a downloadable Data Dictionary template, please see “How to prepare a data dictionary (Data Dictionary Guidelines)”.
“CSV” (or ".csv") is a widely-used file format for spreadsheet-style datasets and stands for “comma-separated values”. In brief, a CSV is a delimited text file where each value (i.e. cell of the spreadsheet) is separated by a comma.
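One detail worth knowing is what happens when a value itself contains a comma. The short sketch below uses Python's standard `csv` module with made-up column names and values; it shows that such a value is wrapped in double quotes so that it is still read back as a single cell.

```python
import csv
import io

rows = [
    ["subject_id", "injury_level", "notes"],
    ["rat01", "T9", "mild, unilateral"],  # this value contains a comma
]

# Write the rows as CSV text.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Read the text back: the quoted value is still one cell.
parsed = list(csv.reader(io.StringIO(csv_text, newline="")))
```

Spreadsheet programs apply the same quoting when saving as CSV, so you normally do not need to add the quotes yourself.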
We require datasets and data dictionaries to be CSV files when you are uploading to the ODC-SCI. You can easily convert Excel files (e.g. ".xls", ".xlsx") to “.csv” files in spreadsheet programs like Excel by saving as a ".csv".
Importantly, the process will only save the ACTIVE spreadsheet in your Excel file. The process will also exclude any graphs or graphics since the CSV file will only include the values in the spreadsheet cells. For more information about how to format your dataset, see the “How to prepare/format data for uploading” section in the Tutorials.
There are a few common errors that are flagged during the data upload process. If you hit an error on the data preview page after selecting your datafile, check your dataset for the following errors:
For more information about how to prepare your data for upload, see the "How to prepare/format data for uploading" Tutorial.
When uploading a data dictionary file, there are a few possible errors that are flagged during the data dictionary upload process. You will be notified directly on the upload page; as a reference, the possible errors include:
For more information on how to prepare your data dictionary for upload, see the "How to Prepare a Data Dictionary" Tutorial.
If you cannot identify the error in your dataset or data dictionary, please contact the ODC Data Team (data@odc-sci.org) for guidance.
If your dataset is too large (e.g. larger than 100 MB or with more than 3,000,000 cells in total), it can cause an error during the data upload process. The error can also happen when you are trying to replace your dataset using the Upload New Version workflow. In both cases, we recommend splitting the dataset into chunks with fewer rows and using the Append Data workflow to add your dataset piece by piece. For a detailed tutorial of the Append Data workflow, see "How to append new rows to a dataset" in the Tutorials.
Importantly, every chunk of your dataset must have the same column headers in the first row of each csv file. Make sure you (1) split your dataset along the rows and not along the columns and (2) include the column headers in every file.
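If you are comfortable with a little scripting, the splitting can be automated. The following is a minimal sketch using only Python's standard library; the function names, the output file naming (`_part001.csv` and so on), and the chunk size are illustrative choices rather than ODC requirements. It assumes the source file is non-empty and has its column headers in the first row.

```python
import csv
from pathlib import Path


def _write_chunk(out_dir, stem, index, header, rows):
    """Write one chunk file that starts with the column headers."""
    path = out_dir / f"{stem}_part{index:03d}.csv"
    with path.open("w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(rows)
    return path


def split_csv(source, out_dir, rows_per_chunk):
    """Split a CSV along its rows, repeating the header row in every chunk."""
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    with source.open(newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)  # column headers from the first row
        chunk = []
        for row in reader:
            chunk.append(row)
            if len(chunk) == rows_per_chunk:
                index = len(chunk_paths) + 1
                chunk_paths.append(
                    _write_chunk(out_dir, source.stem, index, header, chunk)
                )
                chunk = []
        if chunk:  # remaining rows that did not fill a whole chunk
            index = len(chunk_paths) + 1
            chunk_paths.append(
                _write_chunk(out_dir, source.stem, index, header, chunk)
            )
    return chunk_paths
```

For example, `split_csv(Path("my_dataset.csv"), Path("chunks"), 50000)` would write `chunks/my_dataset_part001.csv`, `chunks/my_dataset_part002.csv`, and so on, each of which can then be uploaded in order with the Append Data workflow.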
If you have any difficulties, please contact the ODC Data Team (data@odc-sci.org) for guidance.
Currently, we do not require methodology documents for you to upload and publish your datasets. However, we do encourage you to include them as they improve the interpretability and reusability of your data.
Some recommendations for compiling a methodology document:
Yes, as long as the dataset has not been published with an assigned DOI. Once a dataset has a DOI and has been published (i.e. moved to the Public Space), the use of the dataset falls under the Creative Commons Attribution License (CC-BY v4.0), which allows anyone with access to use the contents of the dataset but sets the legal obligation of giving appropriate credit to the authors of the data.
Before you submit your dataset for publication/DOI request, make sure your dataset meets the following minimum requirements:
To request a DOI/publication for a dataset, you must be logged in as the PI of the lab that has the dataset. Only ODC-SCI lab PIs have the authority to request DOI/dataset publication.
If you are logged in as the PI:
For images of the process, refer to the "How to Share/Release/Request to Publish a Dataset" tutorial section.
Once you have completed the process, the dataset status should change to "DOI request". The ODC-SCI Editorial Board and Data Team will be automatically notified to initiate the dataset review process. We will work with you through the review process to ensure that the dataset meets the FAIR principles and data standards established by the ODC-SCI.
The process may take a few weeks depending on the revisions required. For more information, see “What is the ODC data review process and how long does it take to get a DOI?”
Dataset publication is the process of making your data accessible to the general public. There are several good reasons to make your research data public. Publishing your data makes the data accessible to more people, which can mitigate issues of reproducibility. Dataset publication also increases the transparency of your research process and increases visibility of your work. Providing access to your data allows for modern forms of meta-analysis at the individual subject level (instead of meta-analysis of summarized data, which is what is typically published in academic journals), which can contribute to new discoveries while minimizing repetitive experiments and unnecessary waste of resources. Finally, there is an ongoing cultural shift with publishers and funding agencies mandating public release of data in order to publish a research article or to get funded. The ODC-SCI provides the foundation and materials to familiarize yourself with the process and offers the infrastructure to take your own data from private storage to the public space.
In the ODC-SCI, data stewards (i.e. PIs) can start the process of publishing data whenever they decide a dataset is ready to be released to the public. At the end of this process, the dataset and related documentation will be accessible to any registered user of the ODC-SCI. A digital object identifier (DOI) will be associated with the dataset, and a citation will be generated so the public dataset can be cited much like a published article. The ODC-SCI uses DataCite to generate the DOI and the citation. Datasets are published with an open license, the Creative Commons Attribution License (CC-BY v4.0), which allows anyone with access to use the contents of the dataset but sets the legal obligation of giving appropriate credit to the authors of the data.
See our "Minimal Dataset Standards for publication" for more information on dataset and metadata quality requirements.
Once uploaded, data can move quickly through the ODC-SCI data spaces at the PI’s discretion between Personal, Lab, and Community data spaces (see: “How does privacy and data protection work on the ODC-SCI?”).
The final publishing step involves a two-step review process by the ODC-SCI Editorial Board and Data Team. The length of the review process depends on various factors including:
Summary of review process
Editorial Review
Your dataset will first be sent to ODC-SCI editors to determine whether the content is appropriate and within the scope of ODC-SCI. You will be contacted by the Editor in Chief within 3-5 business days if the editors have any concerns regarding your dataset.
If the Editorial Board decides that your dataset fits within the scope of the ODC-SCI, the dataset and any comments from the editors will be sent to the ODC-SCI Data Team for the dataset review process.
Dataset Review
The ODC-SCI Data Team will review your dataset, metadata, and data dictionary to ensure the formatting and contents conform to the ODC-SCI data structure and meet the minimum dataset standards for publication. The ODC-SCI Data Team will contact you if any revisions are required.
The full dataset publication process can take a few weeks depending on the revision process. You can track any changes to the status of your publication/DOI request on the Current Lab page as a PI/Lab Manager.
Final PI Approval
Once the dataset is fully approved by both the Editorial Board and Data Team, the DOI will be reserved and the corresponding PI will be notified. The PI has control over the last step of the publication process: final approval for publishing the dataset and the associated landing page. Once the PI approves, the landing page and dataset will be made public and accessible to all ODC-SCI registered users within 24 hours. The dataset DOI will be accessible on the published landing page.
The final approval button can be found on the Current Lab page as a PI/Lab Manager in the Dataset window once the dataset has completed the ODC-SCI review process.
If you have questions about or during the process, please contact the Data Team at data@odc-sci.org.
The following are standards/requirements for the dataset, dataset-associated metadata, and dataset-associated data dictionary that will be assessed by the Editorial Board and Data Team review process during dataset publication/DOI request.
How to prepare/format data for uploading
Common errors for Dataset and Data Dictionary
We recommend downloading the data dictionary template (click here to download) which has the definitions and information for each column. The template will also help you get started in compiling your data dictionary, which will be required for dataset publication/DOI request.
If a required variable/column is not relevant to your dataset or you are missing data for the entire column, you must still include the column. Fill in the values with "not available", "not applicable", or "unknown" as appropriate for your case.
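If you prefer to script this step, the sketch below adds any missing required columns and fills empty cells in them with a placeholder. The column names in `REQUIRED_COLUMNS` are hypothetical stand-ins (use the exact names required by the ODC instead), and the function name is made up for this example. Choose the placeholder ("not available", "not applicable", or "unknown") that fits your case.

```python
import csv
import io

# Hypothetical names for illustration only; replace them with the exact
# required column names given by the ODC.
REQUIRED_COLUMNS = ["variable_name", "unit", "description"]


def fill_required_columns(csv_text, placeholder="not available"):
    """Add missing required columns and fill their empty cells."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    fieldnames = list(reader.fieldnames or [])
    for name in REQUIRED_COLUMNS:
        if name not in fieldnames:
            fieldnames.append(name)  # add the column if it is absent

    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        for name in REQUIRED_COLUMNS:
            if not (row.get(name) or "").strip():
                row[name] = placeholder  # empty or missing cell
        writer.writerow(row)
    return out.getvalue()


example = "variable_name,description\nweight,Body weight\nscore,\n"
print(fill_required_columns(example))
```

Only the required columns are touched; empty cells in other columns are left as they are.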
The columns should use these exact names:
For instructions on the Metadata Editor, see the "Metadata Editor Tutorial" tutorial section.
Metadata is critical so users can understand key details about the data. Metadata also improves how easily searchable the dataset will be. In general: the more metadata, the better. This section provides definitions, checklists, and recommendations applied during the revision of the metadata information. The checklists are designed to help users prepare their data and describe items to be reviewed by the ODC-SCI Editorial Board and Data Team to help improve the reusability of the data.
Dataset Publication Title
The published dataset title will be the first information a user sees when accessing a dataset. A good title of a dataset, like a good title for a research paper, should tell the reader why they should be interested in this particular data, providing information about the content of the dataset. The title should also provide key details that will enable effective retrieval by a search engine. In many search engines, the title is weighted very highly in ranking search results, so it is important to give the title careful thought.
Dos and Don'ts:
The structured abstract provides a description of the dataset. Abstracts for datasets are a short description of the content of the dataset that allows readers to understand the origin of the data, the reason the data was generated, and lets the user know important details about the dataset. This allows users to make an informed decision about whether to use the data or explore them further. Abstracts must have three sections:
Tips: Think about whether your description would pass reviewers if you submitted it as an abstract for a manuscript. Would a reviewer accept a one-line description of what you did (e.g. "testing BBB score in SCI animals treated with X")? The more detail you provide, the more useful the data will be. However, remember that you are describing a dataset, not a scientific publication. After reading the description, would a reader understand what they are looking at when browsing or downloading your data?
Short list of concepts that identifies the dataset.
On occasion, datasets are collected from different sources or are part of already published research. This section allows data authors to specify the related sources of information for a given dataset in order to track the provenance of the data and to point to important sources of information that can help with data reusability. For each element, authors should specify the source (e.g. the citation of a paper) and a short explanation of the relationship of the dataset with the source.
Tips: Think about referencing citations in a paper. In a paper, if you want to direct readers to further information not provided in the text, you make a citation. Here, would it be relevant for the user to know of other information that relates to the data?
Notes give authors space to explain further information that can be relevant for future data users to know. For example, if some caution should be taken when considering parts of the data.
As in a research article, authors should list the funding sources that made the dataset possible, along with any other acknowledgements.
The list of the authors of the dataset is similar to authors in a research article. Authors might be the same authors of a related publication if applicable. Authors must be part of the list of contributors, but not all contributors might be considered authors (see below). For each author, the affiliation and the ORCID (if available) will be needed.
Note about ORCID IDs: An ORCID ID is required for the PI and the primary contact person of the dataset. This ensures that users know the correct individuals to contact regarding the dataset. Additionally, the ORCID ID ensures that contributors get credit for publishing the data. Registering for an ORCID ID is easy at https://orcid.org/. If you have students, postdocs, or developers, we strongly encourage you to have them sign up for an ORCID ID, as they will use it throughout their careers.
Contributors are the personnel that made the dataset possible. This includes the authors of the dataset but might also include other important contributors such as data collectors, data managers, analysts, collaborators, administrative personnel, etc. This gives the opportunity to credit others beyond the main authors. For each contributor, the affiliation and the ORCID (if available) will be needed.
At the time of submission, the data dictionary must meet the following standards:
For full instructions on compiling a data dictionary, see the "How to Prepare a Data Dictionary" tutorial.
In the ODC-SCI, the dataset and the data dictionary undergo quality checks for proper formatting (based on the goodTables framework). These checks ensure that the data is Interoperable and Reusable with other datasets. Some of the quality checks are performed during the uploading of datasets, ensuring a minimal level of quality for all private and public datasets in the ODC-SCI. The check during the upload process is automatic without human oversight since data upload is handled privately within the account of the data owner. When data is released to the Community data space or submitted for publication, further checks will be conducted to ensure that the released or published dataset meets FAIR standards:
PIs can request permission to make changes/updates to published datasets from their lab. In order to maintain proper data provenance, if a published dataset needs to be edited, please contact the ODC SCI Data Team (data@odc-sci.org) with an explanation of the changes you wish to make. The Data Team will assess the proposed changes on a case-by-case basis to inform you of how the changes will be applied and guide you through the process.
For more information, refer to the ODC-SCI Version Control Policy.
Once a DOI and landing page have been published to the public on the ODC-SCI, we will not delete the information. However, PIs can ask for the dataset and associated data dictionary and supplementary files to be retracted. To initiate the process, please contact the ODC SCI Data Team (data@odc-sci.org) with an explanation of why the dataset needs to be retracted. The Data Team will assess the request and help you through the process.
For more information, refer to the ODC-SCI Version Control Policy.
No. No programming experience is required in order to upload your dataset. You can format your dataset in any spreadsheet software before uploading, and all the steps of the process are handled directly on the website.
In case you need assistance with anything during the process, you can contact us via the Help Desk button on the bottom right of every page or by emailing us (info@odc-sci.org).
The dataset that you upload is property of the lab, and the PI of the original lab will maintain full control of the dataset on the ODC-SCI.
If you are using a Mac, some of the scroll bars on the platform (e.g. in windows listing available datasets or lab members) might not show up for you because of your computer settings. For example:
To fix this issue:
If you can’t find a relevant help section for the page you’re on or need to report a bug, you can contact our help desk via the “Contact help desk” button at the bottom of every page. The button will automatically inform us of which page you are on when you contacted the help desk.
Contact Help Desk is at the bottom right of every page.
If you are reporting a bug, please also include the following information:
Please allow 2-3 business days for a response.
If you cannot find a relevant help or FAQ section for your question, you can email us at: info@odc-sci.org.
Please allow 2-3 business days for a response.