Methods – DARIAH-IE

Computational approaches to visual and material culture [June 2/3, Oxford, in person – FREE]

19th May 2026 by Joan Murphy

Computational approaches to visual and material culture

Join us for a two-day thematic research event on computational approaches to visual and material culture at Oxford’s Weston Library!

2 June 9:30am-5:00pm to 3 June 9:30am-5:00pm

In-person event: Centre for Digital Scholarship, Weston Library, Oxford

Free event and open to all. Registration required, limited places available. Please follow the link below to register.

Data/Culture, the Centre for Digital Scholarship (Bodleian Libraries), Digital Scholarship @Oxford, and Mapping the Arts and Humanities (SAS, London) are hosting a two-day thematic research event exploring new ways of working with images, objects, and performances. The event focuses on developing research questions and approaches, using existing tools and resources. Participants will work collaboratively in small groups, supported by Research Software Engineers, and have the opportunity to develop a research idea further through a prize of dedicated technical collaboration. No prior coding experience is required.

Who is this for?

This event is designed for:

Arts and Humanities researchers (scholars and postgraduate students)
those working with images, objects, archives, or performance materials
those interested in exploring new research methods
those developing or planning research projects or grant applications

What do you need?

An interest in your research question
(Optional) a dataset or collection you work with
A laptop is desirable but not essential

You do NOT need:

coding experience
prior knowledge of tools
technical expertise

Galaxy – Digital Research Methods Training [Free, online, register by May 16]

7th May 2026 by Joan Murphy

Galaxy – Digital Research Methods Training online

We would like to draw your attention to the Galaxy Training Academy 2026, a free, international training programme focused on open, reproducible digital research methods, with particular relevance for arts, humanities, and cultural heritage research.

This training is relevant if you:

• work with textual, audiovisual, or cultural heritage data;

• are interested in practical approaches to digital humanities, text analysis, or machine learning;

• would like access to shared, non‑commercial computational infrastructure for research and teaching experiments;

• are interested in FAIR research practices for the digital arts and humanities.

About the Galaxy Training Academy 2026

The Academy is organised by the Galaxy Training Network, a long‑running international community that develops and delivers training for the Galaxy open‑source research infrastructure, which is widely used and supported across the global research community.

Dates: 18–22 May 2026 (Registration deadline: 16 May)

Format: Fully asynchronous (no live sessions)

Cost: Free

More details: https://training.galaxyproject.org/training-material/events/2026-05-18-galaxy-academy.html

The Academy is open to researchers at all career stages, including postgraduate students, doctoral researchers and early‑career academics.

Participants work through a structured set of video‑based and text‑based tutorials at their own pace. No prior experience with Galaxy is required, although more experienced users are also welcome.

Topics

Recommended tracks for community members include: Digital Humanities; From Zero to Hero with Python Machine Learning.

Indicative topics in the Digital Humanities track include:

• Introduction to Digital Humanities workflows in Galaxy

• Researching cultural data using OpenRefine

• Text mining Chinese newspaper archives

• Automated transcription of audio and video materials

The FLOW Project: A Modular Workflow for Automatic Text Recognition and Beyond [Feb 18, online @ 15:00 GMT]

12th February 2026 by Joan Murphy

The FLOW Project: A Modular Workflow for Automatic Text Recognition and Beyond

Bodleian Bytes

18 February 15:00 to 16:00

Online event. Registration required.

Registration: https://app.onlinesurveys.jisc.ac.uk/s/oxford/registration-bodleian-bytes-the-flow-project

Historical research often involves working with highly diverse and complex source materials, ranging from handwritten manuscripts to large, heterogeneous document collections. Machine learning methods are increasingly shaping how historians work with digitised sources, particularly through Automatic Text Recognition (ATR). In this talk, Jonas Widmer and Dana Meyer will introduce The FLOW, a modular, microservice-based framework designed to support machine learning–driven data management and processing in the Digital Humanities.

The talk will outline how The FLOW separates complex ATR workflows such as pre-processing, model training, inference, and evaluation into independent, reusable components that can be combined flexibly and accessed without programming experience. Using state-of-the-art transformer-based models, the project aims to make advanced text recognition workflows more transparent, reproducible, and scalable across diverse historical datasets.

Jonas and Dana will outline a typical FLOW workflow, showing how datasets are managed on the Hugging Face platform and then processed step by step. The focus will be on how such workflows can support everyday research practices when working with large and heterogeneous historical corpora.

Speaker Biographies

Jonas Widmer is a Research Software Engineer specialising in Digital Humanities at the University of Bern. In this role, he assists in planning and developing projects focused on Natural Language Processing. His primary interest lies in Handwritten Text Recognition (HTR), where he engages with historical projects and their diverse sources.

Dana Meyer is a Master’s student in Intelligent Interactive Systems at Bielefeld University and works as a research assistant on The FLOW project in the Digital History group at Bielefeld University.


Bodleian Bytes

Bodleian Bytes is a series of online talks hosted by the Centre for Digital Scholarship at the Bodleian Libraries. The series engages with innovative national and international research in digital scholarship. It is a virtual space for discussions surrounding different tools and methodologies whilst also providing inspiration for future digital research.

Event Details and Registration

Registration is required for this free online event. Registration closes at 17:00 on Monday 16 February 2026.

Date and time: Wednesday 18 February, 15:00-16:00 (UK time)

Location: Online via Zoom.

For further information, please email the Centre for Digital Scholarship: cds@bodleian.ox.ac.uk.

Centre for Digital Scholarship

The Centre for Digital Scholarship (CDS) at the Bodleian Libraries is a space and place for engaging, leading and shaping discussions around digital scholarship practice and research within and beyond the University of Oxford.

SynFlow: Continuous Semantics Change Analysis via Dependency Co-occurences [Jan 26, online, @17:00 GMT]

21st January 2026 by Joan Murphy

SynFlow: Continuous Semantics Change Analysis via Dependency Co-occurences

The first talk of the Data in Historical Linguistics Seminar Series 2026 will take place remotely on Monday 26 January 2026 at 5pm GMT. Bách Phan-Tất (KU Leuven, Belgium) will be presenting on SynFlow: Continuous Semantics Change Analysis via Dependency Co-occurences.

Registration for this talk closes at midnight on the Friday before the event. Register here: https://forms.gle/HEnpTKreXdrZqjfA8

Participants will receive a Microsoft Teams link via email on the morning of the talk.

The abstract for this talk can be found at this page.

The programme and registration links for all talks in the series can be found on our website:

2026 Programme

This seminar series is run by Andrea Farina (King’s College London) and Dr Mathilde Bru and is aimed at PhD students and early career researchers. The purpose of this seminar series is to bring together researchers working on historical linguistics with a quantitative approach, and to discuss current avenues of research in this topic. We hope that these seminars will nurture international collaboration and establish academic ties among researchers working on similar topics in this field.

ACDH Lecture: From Punch Cards to Prompt Engineering [Jan 20, online, 16:45 CET]

15th January 2026 by Joan Murphy

ACDH Lecture: “From Punch Cards to Prompt Engineering: The MHDBDB and the Future of Semantic Annotation with LLMs” with Katharina Zeppezauer-Wachauer & Julia Hintersteiner (both Universität Salzburg)

📅 Tuesday, January 20th, 2026
⏰ 16:45 – 18:15

The invited speakers will present the complete technological redesign of the Mittelhochdeutsche Begriffsdatenbank (MHDBDB): after decades of development in relational and RDF-based environments, the project has moved to a TEI-first architecture designed to support LLM-driven research.

The speakers address the key reasons for this shift:
i) the need for structured, AI-readable data;
ii) the practical limits of high-complexity standoff models; and
iii) the excessive resource demands of large-scale RDF infrastructures.

In this context, Large Language Models are reshaping annotation, search, and interpretation. TEI-XML emerges as a sustainable framework for transparent, semantically robust, and interoperable Expert-in-the-Loop workflows, balancing philological rigor with AI scalability.
The talk offers a focused perspective on the evolving technical foundations of text research in the humanities.

More detailed information:

https://www.oeaw.ac.at/acdh/newsevents/event-series/acdh-lecture-121

This lecture is organised jointly by the Austrian Centre for Digital Humanities (ACDH) and the University of Vienna and is part of the University’s Digital Humanities Lecture Circuit (WS 2025).

Technical Writing in the Humanities: a facilitated writing sprint [Dec 15, online, 13:30-15:00 GMT]

12th December 2025 by Joan Murphy

Technical Writing in the Humanities: a facilitated writing sprint

The Digital Skills in Arts and Humanities Network (DISKAH) is organising a webinar, “Technical Writing in the Humanities: a facilitated writing sprint”, in collaboration with the Programming Historian. The webinar will support interested colleagues in developing a publication for that journal and, more widely, in communicating technical workflows in Digital Humanities research to relevant audiences.

Webinar date and time: Monday 15 December, 13:30-15:00 (GMT)

Please register here for the webinar: https://www.eventbrite.co.uk/e/diskah-webinar-technical-writing-in-the-humanities-tickets-1976718137160

Further information about the webinar: https://culturedigitalskills.org/webinar-diskah-programming-historian/

Reference Extraction at the Intersection of AI Research and the Digital Humanities: Validation, Interoperability and Collaboration [Nov 4, hybrid]

3rd November 2025 by Joan Murphy

Reference Extraction at the Intersection of AI Research and the Digital Humanities: Validation, Interoperability and Collaboration

This informal meeting is meant mainly to foster collaboration and knowledge exchange between researchers and practitioners working at the intersection of data extraction, artificial intelligence, and the digital humanities. In the workshop, we continue to address the challenge of extracting heterogeneous references from texts, particularly from historical documents and humanities or legal scholarship. This second workshop focuses on three key themes emerging from the 2023 discussions:

Validation: How can we evaluate and benchmark the performance of different reference extraction tools and approaches, particularly with large language models?
Interoperability: How can we ensure that different tools, datasets, and workflows can work together effectively through shared data models and formats?
Collaboration: How can researchers, developers, and institutions work together to advance the field of reference extraction?

The program is available online at: https://mpilhlt.github.io/reference-extraction/workshop-2025/programme/

The event will take place in-person and online. Register at https://plan.events.mpg.de/e/refextract25

A link for online attendance will be sent to registered participants before the event. If you cannot attend but would like to be informed about updates and materials as they become available, you can indicate this at the registration link.

Programme

Tuesday 04 November 2025

Onboarding

09:00-09:15 Arrival/Registration

09:15-09:45 Christian Boulanger/Andreas Wagner (mpilhlt): Welcome and Upshot from RefExtract2023, State of the Discussion

09:45-10:00 Coffee Break

Research presentations

10:00-12:30

Hiba Arnaout (TU Darmstadt): In-depth Research Impact Summarization through Fine-Grained Temporal Citation Analysis
Yurui Zhu/Matteo Romanello (Odoma): Benchmarking Large Language Models on Reference Extraction and Parsing in the Social Sciences and Humanities
Sofía Aguilar Valdez (Saarland University): How Scientific Ideas Evolve
Open Discussion and Ad-Hoc Presentation of Research

12:30-13:30 Lunch

Datasets, Infrastructure and Interoperability

13:30-15:30

Angelo Di Iorio/Matteo Guenci/Marta Soricetti*/Silvio Peroni/Lorenzo Paolini*/Ivan Heibi (University of Bologna): Citation Extractor and Classifier: Pipeline and Datasets (*presenting)
Tamara Heck/Christoph Schindler/Verena Weimer/Philipp Mayr/Ahsan Shahid (DIPF/GESIS): Open Citation Data for Educational Research
Christian Boulanger, Andreas Wagner (mpilhlt): Datasets in the Legal Theory Knowledge Graph Project
Interoperability Roundtable: Open Discussion on Data Models and Data Formats

15:30-16:00 Coffee Break

Tools, Workflows and Pipelines

16:00-17:30

Raphael Schlattmann/Malte Vogl (mpigea)/Aleksandra Kaye (TU Berlin/mpigea): LLM-Based Knowledge Graph Extraction Pipeline
Luca Foppiano (ScienciaLAB): Training the Grobid Reference Extraction Models
Christian Boulanger/Andreas Wagner (mpilhlt): Annotation Tools for Machine Learning: PDF-TEI Editor (for LLamore & Grobid), Prodigy, TEI-Publisher

17:30-18:30 Takeaways, Way Forward, Closing

DARIAH’s Visual Media & Interactivity Working Group – Workshop on audiovisual corpora annotation [Oct 22nd, in person & online]

6th October 2025 by Joan Murphy

Workshop on audiovisual corpora annotation

October 22nd, 2025 [in person and online]

DARIAH’s Visual Media & Interactivity Working Group is participating in a one-day workshop on audiovisual corpora annotation, which will take place on October 22 during the 5th DARIAH-HR International Conference (University of Osijek, Croatia). The workshop can also be followed remotely. Below is the link to the workshop presentation and to the registration form.

https://dhh.dariah.hr/2025/workshops

The Consortium for Annotation, Analysis and Archiving of Video Applied to Scientific Activities (Canevas) has been accredited since 2022 by Huma-Num, the French research infrastructure for Digital Humanities. Since April 2025, the consortium has been receiving European EOSC-OSCARS funding for a period of 24 months, which has given rise to the OASIS project (Open Audiovisual Science Innovation Scheme). The aim of Canevas and OASIS is to facilitate research in the humanities and social sciences involving audiovisual corpora by supporting actions such as archiving, annotating, commenting on, analysing, and sharing videos. To do this, the members of the Canevas consortium have created two tools, Celluloid (for annotating corpora on media studies and media literacy) and e-spect@tor (for annotating corpora on the performing arts, especially theater), which enable collaborative annotation of videos for research or teaching purposes. These are free and open source tools (https://github.com/celluloid-camp/) that comply with open science and FAIR data standards, while leveraging AI to promote the intelligibility of videos and the interactions that result from them.

As part of the OASIS project, the Canevas Consortium is organising a workshop during the pre-conference day of the 5th DARIAH-HR International Conference, which will take place on Wednesday 22 October 2025. The workshop will be divided into two 3-hour sessions. The first, on Wednesday morning, will introduce participants to the PeerTube technology, developed by the French education-oriented video network Framasoft to offer an alternative to the services provided by GAFAM, particularly the online video hosting platform YouTube, and thus promote digital empowerment. The second session, on Wednesday afternoon, will be devoted to learning how to use the Celluloid and e-spect@tor tools. We invite you to discover collaborative annotation through your own audiovisual corpus, enabling you to develop new skills adapted to changes in media practices and the epistemological issues that come with them. During this second session, we will focus on specific features provided by these tools: some are automated using AI, such as audio transcription and video segmentation into chapters, while others can be done manually and allow users to enrich their viewing experience through the traces they leave behind.

By renewing interactions and collaborations with these digital tools, this workshop aims to introduce participants, from all disciplines and all levels of video expertise, to our research methods while allowing them to acquire new skills to foster a convergence culture around video archiving and annotation. These skills can then be deployed in various educational or research contexts, for instance in group work in the classroom, or in research carried out by researchers in training (Masters, PhD) or more experienced researchers.

The workshop will be led (in English) by:

Michael Bourgatte, Professor at the University of Lorraine (France)
Cécile Chantraine-Braillon, Professor at the University of La Rochelle (France)
Anatole Grimaldi, OASIS project engineer
Laurent Tessier, Professor at the Catholic University of Paris

NB: to get the most out of this workshop, please bring your own computer. It is also possible to follow this workshop with one computer for several people. Moreover, if you want to explore part of one of your corpora, you can send us one of your videos. All video formats are welcome.
