Research data
“Research data are facts, observations or experiences on which an argument or theory is based” (cited in ANDS, 2017)

Essentials 4 Data Support is an introductory course for data supporters: those who support, or want to support, researchers in storing, managing, archiving and sharing their research data. But what do we actually mean by research data? In this section you will find different definitions and ways of looking at research data.
Definitions
What a researcher understands by 'research data' depends on the significance of these data in the research process, and that varies from one scientific discipline to another. Research data exist in many formats, which can be read with just as many different types of software. In the slideshow below you can see a number of definitions of research data.
Five ways
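There are roughly five ways of looking at research data (University of Southampton, 2016; CESSDA, 2017):
- how the data are collected
- the types of data
- the form of electronic storage
- size and complexity
- the stage in the data life cycle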
Practice
This exercise is taken from activity sheet 5.2.2 of RDM Rose (2015), which has since been retired. It is an optional exercise that you can do if you want to get a better feel for the concept of research data.
Case studies
On pages 6-22 of a document of the University of Southampton (2016) you will find five case studies in the field of research data:
- medical research
- materials science
- aerodynamics
- chemistry
- archaeology
Look at one case study in detail and then answer the next two questions:
- Do you recognize the five ways of looking at research data? How?
- Identify a number of possible issues that researchers may have in storing, managing, archiving and sharing their research data.
Below you can see the answer worked out by one of the students of Essentials 4 Data Support, who looked at case study 3 (aerodynamics).
5 ways of looking at the data:
- Collection: numerical model simulations
- Types: models, algorithms and scripts; software configuration; post-processing files such as figures
- Electronic storage: textual, software code, software-specific (mesh), multimedia (figures)
- Size and complexity: large output files (hundreds of gigabytes) with corresponding additional files such as the input/configuration files and post-processing results (figures and aggregated results)
- Life cycle: this type of numerical modelling is typically done in the research phase, where various wing shapes are “tested” with the model and their performance is compared. A subset of all the simulations carried out, with typical results that underpin the conclusions drawn, is usually described in the publication and is therefore the minimum that needs to be published.
Possible issues:
- Storage: With data volumes of 300 GB per second of simulated flow, the total data volume easily exceeds the capacity of a regular laptop’s hard drive. It is recommended to use network or cloud storage that also has a good connection to the HPC facility to be used.
- Manage: To keep track of a variety of simulations, sometimes with only minor differences in model input/configuration, it is important to think before starting. A clear directory structure and a sufficient description of the modifications, and the reasons for them, are crucial for handling the results well. For reproducibility it is important to keep track of the software version used (even more so if it varies between simulations). I recommend using a version control system for model input/configuration and pre/post-processing scripts.
- Archive: For archiving, the data volume and the associated costs may again play a role. It might therefore be wise to archive the simulation results only for the simulations that are actively discussed in the publication and used to draw conclusions, and to archive only the input/configuration and software version (all information necessary for reproduction) of the remaining simulations.
- Share: For sharing of model results, it is crucial that others are able to interpret and reproduce the results. This means that the remarks made in the “manage” section are once more important. Basically, proper data management during the research phase makes you ready for sharing at any time.
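As an illustration only, the storage, manage and archive points above can be sketched in a few lines of Python. The figure of 300 GB per simulated second comes from the elaboration; the run names, the software version string, the configuration keys and the assumption that output volume grows linearly with simulated time are all hypothetical and only serve the example.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

# Figure quoted in the elaboration; real volumes depend on the model and mesh.
GB_PER_SIMULATED_SECOND = 300


@dataclass
class SimulationRun:
    """Hypothetical record describing one simulation run."""
    name: str
    simulated_seconds: float
    software_version: str
    config: dict           # model input/configuration
    note: str              # what was modified and why
    in_publication: bool   # True if the run underpins a published conclusion

    def estimated_output_gb(self) -> float:
        # Assumption: output volume scales linearly with simulated time.
        return self.simulated_seconds * GB_PER_SIMULATED_SECOND

    def config_checksum(self) -> str:
        # A checksum makes even minor configuration differences detectable.
        canonical = json.dumps(self.config, sort_keys=True).encode("utf-8")
        return hashlib.sha256(canonical).hexdigest()


def archive_plan(runs):
    """Split runs into those archived with full output and those archived
    as configuration plus software version only."""
    return {
        "full_output": [r.name for r in runs if r.in_publication],
        "config_only": [r.name for r in runs if not r.in_publication],
    }


def manifest(runs) -> str:
    """JSON description of all runs, to be kept next to the data."""
    entries = []
    for r in runs:
        entry = asdict(r)
        entry["estimated_output_gb"] = r.estimated_output_gb()
        entry["config_sha256"] = r.config_checksum()
        entries.append(entry)
    return json.dumps(entries, indent=2)


runs = [
    SimulationRun("wing_a", 2.0, "solver 1.4", {"angle_deg": 4, "mesh": "fine"},
                  "baseline wing shape", True),
    SimulationRun("wing_b", 2.0, "solver 1.4", {"angle_deg": 6, "mesh": "fine"},
                  "steeper angle of attack", False),
]

total_gb = sum(r.estimated_output_gb() for r in runs)
print(f"Estimated total output: {total_gb:.0f} GB")
print(archive_plan(runs))
```

With two runs of two simulated seconds each, the estimate is already 1200 GB, which shows why storage has to be planned before the first simulation starts. A manifest like this is small enough to keep under version control together with the scripts, even when the output itself is not.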

ANDS (2017). ANDS Guides and Resources. What is research data. https://www.ands.org.au/guides/what-is-research-data (PDF https://www.ands.org.au/__data/assets/pdf_file/0006/731823/Whatis-research-data.pdf)
CESSDA (2017). Data Management Expert Guide. Research Data. https://www.cessda.eu/Training/Training-Resources/Library/Data-Management-Expert-Guide/1.-Plan/Research-data
OECD (2007). Principles and Guidelines for Access to Research Data from Public Funding, OECD Publishing, Paris. http://www.oecd.org/sti/inno/38500813.pdf
Queensland University of Technology (2013). Management of Research data. http://www.mopp.qut.edu.au/D/D_02_08.jsp
RDM Rose (2015). RDM Rose Learning Materials. http://rdmrose.group.shef.ac.uk/?page_id=10#session-51-researchers-and-their-data
Universiteit Utrecht (2016). Universitair beleidskader onderzoeksdata Universiteit Utrecht. https://www.uu.nl/sites/default/files/universitair_beleidskader_onderzoeksdata_universiteit_utrecht_versie_januari_2016_0.pdf
University of Southampton (2016). Introducing Research Data. 4th Edition. https://eprints.soton.ac.uk/403440/1/introducing_research_data.pdf
Van Berchum, M., & Grootveld, M.J. (2016). Het beheren van onderzoeksdata. In Handboek Informatiewetenschap. [IV B 475] Vakmedianet. http://hdl.handle.net/20.500.11755/3108beb8-9168-4f6c-9298-c6e898be4838
Van Berchum, M. & Grootveld, M. (2017). Research data management. An overview of recent developments in the Netherlands. https://dans.knaw.nl/en/about/organisation-and-policy/information-material/Whitepaper_ResearchdatamanagementAnoverview_DEF.pdf