DARIAH-IE

Reference Extraction at the Intersection of AI Research and the Digital Humanities: Validation, Interoperability and Collaboration [Nov 4, hybrid]

3rd November 2025 by Joan Murphy

This informal meeting is intended mainly to foster collaboration and knowledge exchange between researchers and practitioners working at the intersection of data extraction, artificial intelligence, and the digital humanities. The workshop continues to address the challenge of extracting heterogeneous references from texts, particularly historical documents and humanities or legal scholarship. This second workshop focuses on three key themes that emerged from the 2023 discussions:

  1. Validation: How can we evaluate and benchmark the performance of different reference extraction tools and approaches, particularly with large language models?
  2. Interoperability: How can we ensure that different tools, datasets, and workflows can work together effectively through shared data models and formats?
  3. Collaboration: How can researchers, developers, and institutions work together to advance the field of reference extraction?

The programme is available online at: https://mpilhlt.github.io/reference-extraction/workshop-2025/programme/

The event will take place in person and online. Register at https://plan.events.mpg.de/e/refextract25

A link for online attendance will be sent to registered participants before the event. If you cannot attend but would like to be informed about updates and materials as they become available, you can indicate this at the registration link.

Programme

Tuesday 04 November 2025

Onboarding

09:00-09:15 Arrival/Registration

09:15-09:45 Christian Boulanger/Andreas Wagner (mpilhlt): Welcome and Upshot from RefExtract2023, State of the Discussion

09:45-10:00 Coffee Break

Research presentations

10:00-12:30

  1. Hiba Arnaout (TU Darmstadt): In-depth Research Impact Summarization through Fine-Grained Temporal Citation Analysis
  2. Yurui Zhu/Matteo Romanello (Odoma): Benchmarking Large Language Models on Reference Extraction and Parsing in the Social Sciences and Humanities
  3. Sofía Aguilar Valdez (Saarland University): How Scientific Ideas Evolve
  4. Open Discussion and Ad-Hoc Presentation of Research

12:30-13:30 Lunch

Datasets, Infrastructure and Interoperability

13:30-15:30

  1. Angelo Di Iorio/Matteo Guenci/Marta Soricetti*/Silvio Peroni/Lorenzo Paolini*/Ivan Heibi (University of Bologna): Citation Extractor and Classifier: Pipeline and Datasets (*presenting)
  2. Tamara Heck/Christoph Schindler/Verena Weimer/Philipp Mayr/Ahsan Shahid (DIPF/GESIS): Open Citation Data for Educational Research
  3. Christian Boulanger/Andreas Wagner (mpilhlt): Datasets in the Legal Theory Knowledge Graph Project
  4. Interoperability Roundtable: Open Discussion on Data Models and Data Formats

15:30-16:00 Coffee Break

Tools, Workflows and Pipelines

16:00-17:30

  1. Raphael Schlattmann/Malte Vogl (mpigea)/Aleksandra Kaye (TU Berlin/mpigea): LLM-Based Knowledge Graph Extraction Pipeline
  2. Luca Foppiano (ScienciaLAB): Training the Grobid Reference Extraction Models
  3. Christian Boulanger/Andreas Wagner (mpilhlt): Annotation Tools for Machine Learning: PDF-TEI Editor (for LLamore & Grobid), Prodigy, TEI-Publisher

17:30-18:30 Takeaways, Way Forward, Closing

DARIAH-IE is funded by Research Ireland

Unless stated otherwise all contents of this site are licensed under CC-BY-4.0-Licence

Copyright © 2025 DARIAH-IE.