Behind the Scenes: What it takes to run CODATA-RDA School of Research Data Science

The Event Fund supports more inclusive and accessible events in data science in order to lower barriers to participation in science and facilitate new collaborative relationships. Part of supporting emerging data science leaders to run good data science events includes developing a knowledge base for what it takes to run outstanding and creative events. Through this blog series, we seek to bring attention to the often under-documented and invisibilized processes, technologies, people, and practices that undergird more inclusive and accessible data science event programming.

Alt text: collage of images taken at CODATA 2021

In 2021, the CODATA-RDA Schools of Research Data Science organized a ten-week virtual course with the objective of teaching foundational data science skills to Early Career Researchers (ECRs) in Low- and Middle-Income Countries (LMICs). This article takes a closer look at the creation of the course, examining the networks and resources that came together to make it possible.

Aside from serving as an educational opportunity, the schools organized by CODATA-RDA play an important secondary role. During the process of creating a school, a robust community of alumni, instructors, classroom helpers, and hosting institutions must develop, growing the training capacity within a region. Alumni of the program become classroom helpers, and can then become instructors after they participate in the training themselves. This “train-the-trainer” approach facilitates the coordination of future events and has the important secondary effect of building a strong network of community organizers. The strong community that CODATA-RDA builds and nurtures is an essential component of the organization’s continued success in coordinating schools.


CODATA-RDA held its first school in Trieste in August 2016. It has since coordinated schools across five continents, teaching well over 500 students. The CODATA-RDA Research Data Science Summer School is the seventeenth iteration of the school. CS&S support enabled the 2021 edition of the CODATA-RDA Research Data Science Summer School: a ten-week virtual school.

Event organization started six months before the event, a shorter timeline than usual because the event was based at a venue (the Abdus Salam International Centre for Theoretical Physics (ICTP) in Trieste, Italy) that had hosted previous editions of the school and already had institutional knowledge about what needed to be done. When a school is organized with a new host, organization typically begins a year before the start of the school.

Over the course of these six months, organizers held biweekly meetings to facilitate event planning. The event was announced at the end of June 2021 and was open for a month.

The virtual course itself was held between September 6, 2021 and November 19, 2021. Each week focused on a different topic. More detailed information on the schedule can be found here.

Sociotechnical infrastructure

The term “sociotechnical infrastructure” acknowledges how complex and interconnected systems of human relations, technical objects, tools, and underlying processes are, and how they shape and are shaped by each other. In this section, we highlight the systems that event organizers established in order to thoughtfully organize and run their events.

Breaking Down the Organization

It’s important to examine the structure of CODATA-RDA itself, as that informs the steps organizers chose to take in coordinating the school. CODATA and RDA are two separate organizations, which formed a collaborative working group in 2017. This collaboration has continued in the years since, with organizers taking advantage of the existing CODATA and RDA networks already in place while coordinating their schools.

Reflecting its past as a working group, the organization has a clear hierarchy that effectively delegates responsibilities among individual members. There are eight co-chairs within the CODATA-RDA Schools of Research Data Science, each responsible for different regions of the world. Raphael Cobe, Marcela Alfaro Córdoba, and Robert Quick coordinate schools in the Americas. Louise Bezuidenhout, Sara El Jadid, and Bianca Peterson organize schools in Africa. Hugh Shanahan and Shanmugasundaram Venkataraman manage schools in the remaining regions of the world.

Although work is typically regionally divided, the organization behind this school was a little different, as all of the co-chairs are involved in some way with the Trieste School, where the ten-week course was based. This means that for this event, the work was evenly distributed, with each co-chair working somewhere between fifteen and twenty hours total for the entire event. This work included reviewing applications, selecting students, designing a budget, and ensuring that all activities were running smoothly for the duration of the school. This estimate also takes into account teaching preparation, as many co-chairs also served as instructors within the school.

Besides the co-chairs, a number of teaching assistants were involved in the coordination of the school. These teaching assistants are normally alumni of the schools and worked around three hours a week, for a total of thirty hours over the duration of the school. They supported the live sessions, answered questions in the chat, helped instructors guide discussions in breakout rooms, and helped students with technical issues. During the asynchronous part of the school, assistants also helped instructors answer students' questions and solve their technical issues on the school's messaging platform.

Other individuals and teams who contributed to the organization of the school include Bridget Walker, who provided administrative support, and an RDA interest group, which facilitated outreach and community collaboration. An advisory board was also recently formed to provide oversight during the coordination of future schools.

Building a Network of Instructors

Prior to the course, CODATA-RDA already had a number of connections built through previous iterations of the school. Event planners drew upon these existing connections when organizing this event. This simplified the process of designating instructors, as all of the instructors for the CODATA-RDA Research Data Science Summer School had previously collaborated with the organization. The curriculum itself also drew from previous years, as instructors from each iteration of the school typically offer up their materials for use in future schools.

CODATA-RDA Schools are built around a train-the-trainer model in which previous students of the training return as instructors. Thus, each school that the organization hosts builds a stronger network of instructors within the region.

For the next edition of the school, organizers will continue to take advantage of this extensive network, with previous teachers selecting new instructors from past helpers and students who have demonstrated interest.


Organizers used Zoom to hold their biweekly meetings and email to communicate with one another. To keep in touch with each other, organizers also used Matrix via the Element app. The community developed over the course of this virtual school is still active, with alumni keeping in contact with one another through an email list and an alumni forum page.

Raising awareness

Again, the theme of utilizing existing networks emerges. Organizers helped raise awareness of the course by reaching out through RDA and CODATA social media and mailing lists. Each previous iteration of the school had similarly set up a mailing list, along with social media information for its alumni, which organizers were then able to use to spread the word. Many new students came to the school following recommendations from the school’s alumni.

Increasing accessibility

CODATA-RDA is currently heavily supported by volunteers, and the organization is still in the process of looking for larger grants in order to transition to a long-term, sustainable operation. Due to the pandemic, burnout is high, and many individuals cannot give their time as freely as they did before. Funding from CS&S helped compensate volunteers for their time, making it easier for them to take part in organizing the school.

Organizers also took measures to increase accessibility for participants, including providing closed captions for lectures, translating course materials into Spanish, and offering financial support for participants.

The virtual format of the event also enabled participation from a larger audience than previous editions of the school, with students from new countries such as Iran, Mongolia, and Zimbabwe taking part. A total of 76 students participated in the event.

What’s next?

Since the organization of the Trieste 2021 Summer School, more iterations of the school have been held. These include Trieste 2022, which was a hybrid event held in July 2022. Online versions of the school were also organized for Pretoria and São Paulo. CODATA-RDA plans to continue organizing schools and growing their robust network (with three in-person schools and one online school planned for this year), and to work to increase both educational opportunities and training capacity in communities underrepresented in the field of data science. If you are interested in participating in future sessions or learning more, visit their site.

Featured Photo by Samuel Bryngelsson on Unsplash. Copy-editing by Pearse Anderson.