Join us in Amsterdam on 7 & 8 December for a series of free workshops and a mini-symposium on exciting developments around Open Science for Eco-Evo research!
Open Science has become mainstream. Substantial intellectual and financial resources have been invested in developing e-infrastructures, services, and tools that enable Open Science, including Open Data, Open Source (Code) and Open Access to publications. Uptake of these elements in Ecology and Evolution is currently slow, yet these fields can benefit greatly from tools and services that already exist.
Workshops will offer training in several aspects of Open Science that can greatly benefit researchers: the use of Open Data in research, conducting reproducible analyses, adding value to your code, funding opportunities for implementing Open Science, and more. The workshops will be preceded by a half-day mini-symposium introducing Open Science from the perspective of early-career and established ecological researchers and frontline advocates of Open Science.
NIOO-KNAW and DANS-KNAW are the main organizers of the workshops, with technical, intellectual, and financial support from IBED. Our plenary speakers for the mini-symposium include ecologists with an interest in Open Science, and Open Science advocates. Our trainers' expertise covers a range of Open Science tools and practices, with a specific focus on ecological/evolutionary research.
Amsterdam: the mini-symposium will take place at the historic Linnaeus Library of the Artis Zoo. The workshops will take place at the Amsterdam Science Park.
For any questions, suggestions, and sponsoring opportunities please contact the organizers Antica Culina (NIOO-KNAW), A.Culina@nioo.knaw.nl, or Cees Hof (DANS-KNAW), cees.hof@dans.knaw.nl
12:00 –13:00 Registration and coffee/tea
13:00 –13:15 Welcome by Dr. Antica Culina, Dr. Peter Doorn, and Prof. Dr. Louise Vet (chair)
13:15 –13:45 Why we need open science, Dr. Dominique Roche
13:45 –14:15 Public data archiving in ecology and evolution: How well are we doing?, Dr. Sandra A. Binning
14:15 –15:00 Case examples of OS application in EcoEvo research:
1. Costs and benefits of mobile DNA in pathogen evolution, Like Fokkens
2. Encounternet: an automated radio-tracking technology to study spatial coordination in songbirds, Davide Baldan
3. Reproducible Data Analysis in Bioinformatics, Dr. Fotis E. Psomopoulos
4. Studying a species at the distribution range scale: spatio-temporal habitat use patterns in roe deer, Johannes De Groeve
15:00 –15:30 Coffee break
15:30 –16:00 Writing software as a researcher: being practical versus being perfect, Neil Chue Hong
16:00 –16:30 GO FAIR, GO Anywhere ……..also in biology, Prof. Dr. Barend Mons
16:30 –17:00 Discussion and Wrap-up
17:00-18:00 Drinks
8:00 Registration
9:00 –11:00 Parallel workshops morning
11:00 –11:30 Coffee break
11:30 –13:00 Parallel workshops morning (continued)
13:00 –14:00 Lunch
14:00 –15:30 Parallel workshops afternoon
15:30 –16:00 Coffee break
16:00 –18:00 Parallel workshops afternoon (continued)
19:00 Dinner (TBA)
Workshop duration: Full day
Transparency, open sharing, and reproducibility are core values of science, but not always part of daily practice. This workshop will provide an overview of the current status of reproducible analysis as a means of making research transparent. The workshop will cover methodological topics (such as the use of the Open Science Framework and reporting guidelines) as well as software tools (such as Git, Docker, R Markdown/knitr and Jupyter).
Going beyond simple listing and presentations, the workshop will focus on hands-on skill building, with exercises and tutorials covering most of the software aspects. Specifically, a draft agenda of the workshop is the following:
Organizer/trainer: Dr. Fotis E. Psomopoulos, Institute of Applied Biosciences (INAB) at the Center for Research and Technology Hellas (CERTH)
Workshop duration: Full day
This session is based on the Data Carpentry Ecology workshop.
1. Introduction to tabular data format.
2. Cleaning Data with OpenRefine. A part of the data workflow is preparing the data for analysis. Some of this involves data cleaning, where errors in the data are identified and corrected or formatting is made consistent. This step must be taken with the same care and attention to reproducibility as the analysis. OpenRefine (formerly Google Refine) is a powerful free and open source tool for working with messy data: cleaning it and transforming it from one format into another. This lesson will teach you to use OpenRefine to effectively clean and format data and automatically track any changes that you make. Many people comment that this tool saves them literally months of work trying to make these edits by hand.
3. Data Carpentry: R for data analysis and visualization of Ecological Data. This is an introduction to R designed for participants with no programming experience. The lesson will start with some basic information about R syntax and the RStudio interface, and move through how to import CSV files, the structure of data frames, how to deal with factors, how to add/remove rows and columns, how to calculate summary statistics from a data frame, and a brief introduction to plotting.
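To give a feel for the "import a CSV file, then calculate summary statistics" step, here is a minimal sketch. It is not part of the workshop material: the lesson itself uses R and RStudio, whereas this stand-in uses only the Python standard library, and the column names (`record_id`, `species_id`, `weight`) and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical survey data; one weight is deliberately missing.
SAMPLE_CSV = """record_id,species_id,weight
1,DM,40
2,DM,48
3,PF,
4,PF,8
5,DO,52
"""


def mean_weight_by_species(csv_text):
    """Return the mean weight per species, skipping rows with no weight."""
    weights = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = row["weight"].strip()
        if value:  # ignore missing measurements instead of treating them as 0
            weights[row["species_id"]].append(float(value))
    return {species: mean(values) for species, values in weights.items()}


summary = mean_weight_by_species(SAMPLE_CSV)
for species, average in sorted(summary.items()):
    print(species, average)
```

Skipping empty cells rather than converting them to zero is the key design choice here: a missing measurement should not pull the average down.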
Requirements:
Follow installation instructions on Data Carpentry website:
Open Refine: http://www.datacarpentry.org/OpenRefine-ecology-lesson/setup/
RStudio and R: http://www.datacarpentry.org/OpenRefine-ecology-lesson/setup/
Organizer/trainer: Mateusz Kuzak, Dutch Techcentre for Life Sciences (DTL)
Workshop duration: Half a day (morning session)
Meta-analysis is a powerful statistical approach to integrate results from individual studies. It can be used to weigh the evidence for a certain effect, and to test hypotheses about the strength and the direction of the effect.
This workshop will introduce transparent meta-analysis, the possibilities it offers to researchers, as well as its limits. We will describe all the main steps needed to conduct a transparent and modern meta-analysis, and also use open (freely available online) data in meta-analysis. Thus, we will also introduce the new open data landscape, and platforms where researchers can easily identify datasets (much as, for example, Web of Science does for papers). Participants will have some time to familiarise themselves with these platforms.
After completing the workshop, participants will be able to:
1. Design an answerable research question and define exact inclusion and exclusion criteria
2. Implement transparent and comprehensive data collection and extraction, including the acquisition of open data
3. Decide on methods to critically assess the risk of bias
4. Decide on the most appropriate meta-analytical model (fixed/random effects, nested data structure)
5. Describe and interpret the results of meta-analyses
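The choice between a fixed-effect and a random-effects model (point 4) can be illustrated with a small sketch. It is not taken from the workshop material: it assumes inverse-variance weighting for the fixed-effect model and the DerSimonian-Laird estimator of between-study variance for the random-effects model, which is only one of several available estimators. The effect sizes and variances are made up, and nested data structures are not handled.

```python
import math


def fixed_effect(effects, variances):
    """Inverse-variance weighted mean effect and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    return pooled, math.sqrt(1.0 / total)


def random_effects_dl(effects, variances):
    """Random-effects pooled effect using the DerSimonian-Laird tau^2."""
    k = len(effects)
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled_fixed, _ = fixed_effect(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(w * (y - pooled_fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    # Between-study variance, truncated at zero; undefined for a single study.
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    re_total = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / re_total
    return pooled, math.sqrt(1.0 / re_total), tau2


# Hypothetical effect sizes and sampling variances from three studies.
effects = [0.2, 0.5, 0.8]
variances = [0.04, 0.04, 0.04]

print("fixed: ", fixed_effect(effects, variances))
print("random:", random_effects_dl(effects, variances))
```

When the studies disagree more than their sampling variances alone would predict, the estimated between-study variance is positive and the random-effects standard error is wider than the fixed-effect one, which is the practical consequence of the model choice.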
Requirements:
Pre-pare
Organiser/trainer: Dr. Antica Culina, Netherlands Institute of Ecology (NIOO-KNAW)
Workshop duration: Half a day (afternoon session)
This session is aimed specifically at participants who are starting to work in an open science environment and want a clearer understanding of where to store their data and which services to use to improve and standardise the overall quality of their taxonomic, ecological and phylogenetic data. The session has two parts, provided by Data Archiving and Networked Services (DANS-KNAW) and Naturalis Biodiversity Center.
The DANS session will give an overview of available repository facilities and their characteristics, such as discipline-specific support, certification, history, underlying connections, interoperability and future developments. It will also provide information on related services, such as metadata standards and services, Persistent Identifier services, and online RDM and Data Management Plan (DMP) services. All of these items will be considered in the light of open science developments such as OpenAIRE and the European Open Science Cloud (EOSC).
After joining this sub-session of the workshop, participants should be able to make a considered decision on where to store their own research data and have a clear idea what services to use to optimise this process. They will also understand the landscape of relevant projects, organisations and services a (little) bit better.
Naturalis will explain, demonstrate and train the use of taxonomy and phylogeny related data services and tools, such as the Netherlands Biodiversity Data Services. The collection of specimens at Naturalis is one of the largest Natural History collections in the world. In an on-going effort to digitise the collection, large amounts of specimen data and associated multimedia content are generated. Furthermore, Naturalis hosts comprehensive taxonomic databases of the species currently described and their taxonomic classifications. Specimen, taxonomic, multimedia, and geographical data are now jointly accessible via the Netherlands Biodiversity Data Services. In this session, it will be demonstrated how automated workflows based on our data are facilitated via the Netherlands Biodiversity API (NBA) and how this will benefit biodiversity research.
In the second part of this session, Naturalis will demonstrate new software tools that facilitate the inference of large dated phylogenies from publicly available molecular sequence data. In this part of the session, the audience will see how workflows from mining, retrieving and processing of sequence and fossil data, to the subsequent phylogenetic inference and molecular/fossil dating can be realised using software developed by Naturalis.
Organiser/trainer: Dr. Cees Hof, DANS-KNAW; and Dr. Hannes Hettling, Naturalis
Workshop duration: half a day (morning session)
As researchers, we all use code in our work. Often we will create new scripts or pieces of software, or modify code from someone else. This workshop will give an introduction to two of the main techniques that will help you to improve your software: working with repositories and making your code more maintainable. It will cover how to choose and use a version control system for your source code and data, choosing a license for your software, why and how you deposit software in a digital repository, structuring your code to make it cleaner, and finding problems in your software. This is a subset of practices covered in the paper “Good Enough Practices in Scientific Computing” by Wilson et al.
Organizer/trainer: Neil Chue Hong, Director of Software Sustainability Institute, UK
Workshop duration: half a day (afternoon session)
This workshop is aimed at enhancing our skills to recognize, describe, measure and enhance the quality of data that is commonly used in ecological research. Examples of such data are species occurrence and abundance, physical or functional traits, behaviour, and environmental variables (e.g. weather, water quality, anthropogenic influence, terrain). It is targeted at both early-career and established researchers.
The workshop is built up of four parts, which coincide with the aforementioned topics:
Recognizing. To tackle data quality issues it is crucial to know the relevant concepts, to recognize these in their practical context and to understand their relevance. In this part we will explore the important quality-related concepts and their relationship interactively.
Describing. There are various systems to encode data quality, which are not conflicting but also not entirely complementary. We will have a look at the most common ways to describe data quality and apply them to a few case studies to explore strengths and weaknesses.
Measuring. The actual measurement or estimation of data quality is not always feasible, and when it is possible it is never trivial. During this part we explore different ways to measure data quality in ecology via a card-game.
Enhancing. Data can be enhanced by working on the data values themselves (e.g. by detecting and removing suspicious values, combining different types of data, or data aggregation), but also by annotation or by adding informative meta-data. We will learn about this issue through a small data-challenge where we are going to improve an ecological data set in different teams.
The workshop is practical in nature. It involves concrete exercises, discussion and a small challenge in teams. None of these activities requires any coding or analysis behind the computer.
Organiser/trainer: Dr. Emiel van Loon, University of Amsterdam
Here, you can find links to the slides of the mini-symposium presentations, as well as workshop slides and materials
1) Why we need open science, by Dr. Dominique Roche
2) Public data archiving in ecology and evolution: How well are we doing? by Dr. Sandra A. Binning
3) Writing software as a researcher: being practical versus being perfect, by Neil Chue Hong
1) Data Carpentry for Ecology, by Mateusz Kuzak
OpenRefine lesson:
http://www.datacarpentry.org/OpenRefine-ecology-lesson/
R lesson
http://www.datacarpentry.org/R-ecology-lesson/index.html
2) Reproducible analysis and Research Transparency, by Dr. Fotis E. Psomopoulos
http://reproducible-analysis-workshop.readthedocs.io/en/latest/
3) Transparent meta-analysis and use of open-data, by Dr. Antica Culina
https://doi.org/10.6084/m9.figshare.5722438.v1
4) Crash Course on Clean Code, by Neil Chue Hong
https://github.com/softwaresaved/clean-code-workshop
Fotis is currently a bioinformatician at the INAB (CERTH) in Thessaloniki, Greece. Since 2005 he has been working on several EU-funded research projects both at CERTH and at the Information Processing Lab (IPL) in the Dpt. of Electrical and Computer Engineering at the Aristotle University of Thessaloniki. His research interests lie across three major pillars: (a) Bioinformatics and the involved computational workflows, (b) data mining methods and techniques in order to analyse the vast amount of data involved, and (c) Grid and Cloud Computing in order to optimize the developed methods for large-scale approaches and to ensure the reproducibility of results. His expertise at the intersection of Bioinformatics and Grid Computing allows him a unique insight into the specific requirements and bottlenecks expected in large-scale computational approaches in Life Sciences. To this end he has been closely involved with EGI (www.egi.eu), and was selected as an EGI Champion for Bioinformatics in 2013. Moreover, he is the co-chair of two Research Data Alliance (RDA) Interest Groups (IGs): the Data Discovery Paradigms IG and the Early Career and Engagement IG. In addition to his research, he devotes significant time to training activities. Beyond the traditional undergraduate and graduate courses (both in Greece and in Canada), he has organised several seminar lectures at the Aristotle University of Thessaloniki on Bioinformatics, as well as two international EGI-affiliated workshops on the use of Grid and Cloud Computing in Life Sciences. Finally, he is a certified Carpentries Instructor and Trainer and is actively involved in organizing CarpentryCon 2018.
Mateusz is a Scientific Community Manager at DTL, where he works at the interface of training, technology and data platforms and the research community in the Netherlands. Before joining DTL, Mateusz was building research software at the Netherlands eScience Center in collaborative projects with domain scientists. At the Center Mateusz also helped to develop the training program around essential computing skills offered to project partners. Before becoming a Research Software Engineer, he worked in the field of Biophysics, spending lots of time with microscopes. Mateusz is also involved in the Software Carpentry and Data Carpentry communities as instructor, trainer, mentor and Steering Committee member.
I am an evolutionary ecologist, and the project I lead at NIOO-KNAW lies on the interface between ecology and Open Science. We look at how various components of Open Science are currently implemented in ecological research, how they help to improve research, and what the best avenues are to increase the application of Open principles in ecology. My scientific interests revolve around animal ecology: I want to help us to better understand (and appreciate) the complexity of animal life, and to use this knowledge to protect animals and teach others to do the same. I completed an undergraduate degree in Ecology, worked on bird conservation projects in Croatia (NGO BIOM) for a couple of years, and then became an evolutionary ecologist, studying pair-bonds and divorce in monogamous birds at the University of Oxford (PhD). Along the way, I also became interested in Open Science. Once I learned about this amazing initiative, I was determined to take part in the wave of actions and research that can help other scientists (especially ecologists) to start to appreciate all the benefits that openness brings to science and to society.
Emiel van Loon is assistant professor in Statistical Ecology at the University of Amsterdam. His research focuses on the development of statistical techniques to enhance the analysis of animal movement and distribution. He coordinates and teaches applied statistics in the earth science and biology curricula at UvA.
Cees Hof has a background in aquatic ecology and ecotoxicology and moved into animal systematics and palaeontology for his PhD and postdoctoral research. He was the coordinator of the European Network for Biodiversity Information (ENBI) and spent more than 10 years developing the Dutch branch of the Global Biodiversity Information Facility (GBIF). Currently Cees holds a position at Data Archiving and Networked Services (DANS-KNAW), where he is responsible for project acquisition and interaction with the life science community. Areas of special interest are metadata, Knowledge Organisation Systems and data sharing infrastructures.
Neil is the founding Director and Principal Investigator of the Software Sustainability Institute, and is based at the University of Edinburgh. He enables research software users and developers to drive the continued improvement and impact of research software. He is the Editor-in-Chief of the Journal of Open Research Software, the current Advisory Council chair of the Software Carpentry Foundation, co-editor of "Software Engineering for Science", co-author of "Best Practices for Scientific Computing" and "An Open Science Peer Review Oath", and chair of the EPSRC Strategic Advisory Team on e-Infrastructure. His current research interests include barriers and incentives in research software ecosystems and the role of software as a research object.
Hannes is a trained Bioinformatician working at Naturalis Biodiversity Center in Leiden, the Netherlands. In his current position as a 'Technical Data Specialist', he is involved in implementing, testing, and managing workflows to deal with the large amounts of specimen collection data produced within Naturalis' large-scale digitisation project. During his previous post-doctoral appointment at Naturalis, Hannes' research focused on quantitative evolutionary biology and phylogenetics, including developing software and methods for data analysis and phylogenetic tree inference, such as the SUPERSMART software pipeline for automated generation of large dated phylogenies using public data resources. Hannes holds a Ph.D. in computational Systems Biology based on his work at the Vrije Universiteit Amsterdam.
We are excited to announce our four speakers, who come from different backgrounds and will share their views on different aspects of open science in Ecology. The Symposium will be moderated by Prof Louise Vet, Director of the Netherlands Institute of Ecology (NIOO-KNAW).
By Dr. Dominique Roche, Université de Neuchâtel, Switzerland
Dominique will discuss the contributions of open science towards curbing the 'reproducibility crisis', with a focus on open data.
Short bio:
I am a postdoc in the Institute of Biology at the University of Neuchâtel. My work on open data has focused on increasing awareness and facilitating dialogue among scientists, journal editors and the public with respect to the importance of archiving primary data while highlighting areas of current practice in need of improvement.
By Dr. Sandra A. Binning, Université de Neuchâtel, Switzerland
Sandra will discuss the current state of publicly archived datasets in the ecological and evolutionary sciences and suggest ways that ecologists and evolutionary biologists can improve their data archiving practices to enhance reusability and reproducibility.
Short bio:
Sandra is currently a postdoc in the Eco-Ethology laboratory at the University of Neuchâtel. Her primary research interests are fish parasites and animal movement, but she is also a vocal advocate of public data archiving, the Open Science movement, and ethical research practices in the ecological sciences.
By Prof. dr. Barend Mons, Leiden University, The Netherlands
Short bio:
Barend Mons is a molecular biologist by training (PhD Leiden University 1986). He spent over 15 years in malaria research in close collaboration with endemic countries. After that he gained experience in computer-assisted knowledge discovery, which is still his research focus. He spent time with the European Commission as a Seconded National Expert with the INCO-DC programme (1993-1996) and with the Netherlands Organisation for Scientific Research (NWO 1996-1999). In 2000 he founded the Biosemantics group in Rotterdam and later also in Leiden. Currently, Barend is Professor in Biosemantics at the Human Genetics department of Leiden University Medical Center. He was also the first Head of Node for ELIXIR-NL at the Dutch Techcentre for Life Sciences (until 2015), is Integrator Life Sciences at the Netherlands eScience Center, and is a board member of the Leiden Centre of Data Science. In 2014, Barend initiated the FAIR data initiative and in 2015, he was appointed Chair of the European Commission's High Level Expert Group for the “European Open Science Cloud”, from which he retired at the end of 2016. Presently, Barend is co-leading the GO FAIR initiative, an initiative to kick-start developments towards the Internet of FAIR data and services, which will also contribute to the implementation of components of the European Open Science Cloud.
By Neil Chue Hong, Director of Software Sustainability Institute, UK
Neil's talk will consider the implications of every researcher being a developer, where reproducibility comes into it, and what researchers can do to their code to improve their research.
Short bio:
Neil is the founding Director and Principal Investigator of the Software Sustainability Institute, and is based at the University of Edinburgh. He enables research software users and developers to drive the continued improvement and impact of research software. He is the Editor-in-Chief of the Journal of Open Research Software, the current Advisory Council chair of the Software Carpentry Foundation, co-editor of "Software Engineering for Science", co-author of "Best Practices for Scientific Computing" and "An Open Science Peer Review Oath", and chair of the EPSRC Strategic Advisory Team on e-Infrastructure. His current research interests include barriers and incentives in research software ecosystems and the role of software as a research object.
This is a great opportunity to contribute to better integration between Open Science and a scientific community that belongs largely to the long tail of science. We are looking for contributions towards subsidizing travel and accommodation for some participants, speakers and trainers, and towards venue costs and meal expenses.
If you would like to offer your sponsorship, please contact Antica Culina: a.culina@nioo.knaw.nl