{"id":1395,"date":"2025-11-03T11:00:22","date_gmt":"2025-11-03T11:00:22","guid":{"rendered":"https:\/\/dariah.ie\/wordpress\/?p=1395"},"modified":"2025-11-03T11:00:25","modified_gmt":"2025-11-03T11:00:25","slug":"reference-extraction-at-the-intersection-of-ai-research-and-the-digital-humanities-validation-interoperability-and-collaboration-nov-4-hybrid","status":"publish","type":"post","link":"https:\/\/dariah.ie\/wordpress\/2025\/11\/reference-extraction-at-the-intersection-of-ai-research-and-the-digital-humanities-validation-interoperability-and-collaboration-nov-4-hybrid\/","title":{"rendered":"Reference Extraction at the Intersection of AI Research and the Digital Humanities: Validation, Interoperability and Collaboration [Nov 4, hybrid]"},"content":{"rendered":"\n<p class=\"has-medium-font-size wp-block-paragraph\">Reference Extraction at the Intersection of AI Research and the Digital Humanities: Validation, Interoperability and Collaboration <\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This informal meeting is meant mainly to foster collaboration and knowledge exchange between researchers and practitioners working at the intersection of data extraction, artificial intelligence, and the digital humanities. In the workshop, we continue to address the challenge of extracting heterogeneous references from texts, particularly from historical documents and humanities or legal scholarship. 
This second workshop focuses on three key themes emerging from the 2023 discussions:<\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li><strong>Validation<\/strong>: How can we evaluate and benchmark the performance of different reference extraction tools and approaches, particularly with large language models?<\/li>\n\n\n\n<li><strong>Interoperability<\/strong>: How can we ensure that different tools, datasets, and workflows can work together effectively through shared data models and formats?<\/li>\n\n\n\n<li><strong>Collaboration<\/strong>: How can researchers, developers, and institutions work together to advance the field of reference extraction?<\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">The programme is available online at:\u00a0<a href=\"https:\/\/mpilhlt.github.io\/reference-extraction\/workshop-2025\/programme\/\">https:\/\/mpilhlt.github.io\/reference-extraction\/workshop-2025\/programme\/<\/a><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The event will take place in person and online. Register at\u00a0<a href=\"https:\/\/plan.events.mpg.de\/e\/refextract25\">https:\/\/plan.events.mpg.de\/e\/refextract25<\/a>\u00a0<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">A link for online attendance will be sent to registered participants before the event. If you cannot attend but would like to hear about updates and materials as they become available, 
you can notify us about this at the registration link.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Programme<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"tuesday-04-november-2025\">Tuesday 04 November 2025<\/h3>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"onboarding\">Onboarding<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">09:00-09:15&nbsp;<strong>Arrival\/Registration<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">09:15-09:45&nbsp;<strong>Christian Boulanger\/Andreas Wagner (mpilhlt):&nbsp;<em>Welcome and Upshot from RefExtract2023, State of the Discussion<\/em><\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">09:45-10:00&nbsp;<strong>Coffee Break<\/strong><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"research-presentations\">Research presentations<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">10:00-12:30<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Hiba Arnaout (TU Darmstadt):\u00a0<em>In-depth Research Impact Summarization through Fine-Grained Temporal Citation Analysis<\/em><\/strong><\/li>\n\n\n\n<li><strong>Yurui Zhu\/Matteo Romanello (Odoma):\u00a0<em>Benchmarking Large Language Models on Reference Extraction and Parsing in the Social Sciences and Humanities<\/em><\/strong><\/li>\n\n\n\n<li><strong>Sof\u00eda Aguilar Valdez (Saarland University):\u00a0<em>How Scientific Ideas Evolve<\/em><\/strong><\/li>\n\n\n\n<li><strong>Open Discussion and Ad-Hoc Presentation of Research<\/strong><\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>12:30-13:30 Lunch<\/strong><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"datasets-infrastructure-and-interoperability\">Datasets, Infrastructure and Interoperability<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">13:30-15:30<\/p>\n\n\n\n<ol start=\"5\" class=\"wp-block-list\">\n<li><strong>Angelo Di Iorio\/Matteo Guenci\/Marta Soricetti*\/Silvio Peroni\/Lorenzo Paolini*\/Ivan Heibi (University of Bologna):\u00a0<em>Citation Extractor and Classifier: 
Pipeline and Datasets<\/em><\/strong>\u00a0<em>(*presenting)<\/em><\/li>\n\n\n\n<li><strong>Tamara Heck\/Christoph Schindler\/Verena Weimer\/Philipp Mayr\/Ahsan Shahid (DIPF\/GESIS):\u00a0<em>Open Citation Data for Educational Research<\/em><\/strong><\/li>\n\n\n\n<li><strong>Christian Boulanger\/Andreas Wagner (mpilhlt):\u00a0<em>Datasets in the Legal Theory Knowledge Graph Project<\/em><\/strong><\/li>\n\n\n\n<li><strong>Interoperability Roundtable: Open Discussion on Data Models and Data Formats<\/strong><\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">15:30-16:00&nbsp;<strong>Coffee Break<\/strong><\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"tools-workflows-and-pipelines\">Tools, Workflows and Pipelines<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">16:00-17:30<\/p>\n\n\n\n<ol start=\"9\" class=\"wp-block-list\">\n<li><strong>Raphael Schlattmann\/Malte Vogl (mpigea)\/Aleksandra Kaye (TU Berlin\/mpigea):\u00a0<em>LLM-Based Knowledge Graph Extraction Pipeline<\/em><\/strong><\/li>\n\n\n\n<li><strong>Luca Foppiano (ScienciaLAB):\u00a0<em>Training the Grobid Reference Extraction Models<\/em><\/strong><\/li>\n\n\n\n<li><strong>Christian Boulanger\/Andreas Wagner (mpilhlt):\u00a0<em>Annotation Tools for Machine Learning: PDF-TEI Editor (for LLamore &amp; Grobid), Prodigy, TEI-Publisher<\/em><\/strong><\/li>\n<\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">17:30-18:30&nbsp;<strong>Takeaways, Way Forward, Closing<\/strong><\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Reference Extraction at the Intersection of AI Research and the Digital Humanities: Validation, Interoperability and Collaboration This informal meeting is meant mainly to foster collaboration and knowledge exchange between researchers and practitioners working at the intersection of data extraction, artificial intelligence, and the digital humanities. 
In the workshop, we continue to address the challenge of &#8230; <span class=\"more\"><a class=\"more-link\" href=\"https:\/\/dariah.ie\/wordpress\/2025\/11\/reference-extraction-at-the-intersection-of-ai-research-and-the-digital-humanities-validation-interoperability-and-collaboration-nov-4-hybrid\/\">[Read more&#8230;]<\/a><\/span><\/p>\n","protected":false},"author":8,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_bluesky_dont_syndicate":"1","_bluesky_syndication_accounts":"","_bluesky_syndication_text":"","_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[95,97,103,9,130,114,16,131,121],"tags":[136,105,98,99,133,135,134,132],"class_list":["entry","post","publish","author-joan-y-murphytcd-ie","post-1395","format-standard","category-ai","category-data","category-digital-humanities","category-events","category-methods","category-research-it","category-tei","category-tools","category-workflows","post_tag-ai","post_tag-data-science","post_tag-digital-humanities","post_tag-events","post_tag-methods","post_tag-tei","post_tag-tools","post_tag-workflows"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p9zaVM-mv","_links":{"self":[{"href":"https:\/\/dariah.ie\/wordpress\/wp-json\/wp\/v2\/posts\/1395","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dariah.ie\/wordpress\/wp-json\/wp\/v2\/
posts"}],"about":[{"href":"https:\/\/dariah.ie\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dariah.ie\/wordpress\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/dariah.ie\/wordpress\/wp-json\/wp\/v2\/comments?post=1395"}],"version-history":[{"count":1,"href":"https:\/\/dariah.ie\/wordpress\/wp-json\/wp\/v2\/posts\/1395\/revisions"}],"predecessor-version":[{"id":1396,"href":"https:\/\/dariah.ie\/wordpress\/wp-json\/wp\/v2\/posts\/1395\/revisions\/1396"}],"wp:attachment":[{"href":"https:\/\/dariah.ie\/wordpress\/wp-json\/wp\/v2\/media?parent=1395"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dariah.ie\/wordpress\/wp-json\/wp\/v2\/categories?post=1395"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dariah.ie\/wordpress\/wp-json\/wp\/v2\/tags?post=1395"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}