Data Engineering Meets Large Language Models:
Challenges and Opportunities
DExLLM workshop at ICDE'25

About

This workshop on the synergy between data engineering (DE) and large language models (LLMs) is a pioneering event at a top-tier conference. It brings together experts to share recent challenges and opportunities, aiming to inspire future research. With oral presentations and poster sessions, the workshop provides a platform for researchers from natural language processing, data engineering, and database systems to discuss ideas, review past work, and explore new directions. The focus is on the synergy between LLMs and database management systems (DBMS), viewed from two aspects. The first explores how LLMs empower DBMS tasks such as data retrieval, query optimization, and index tuning. The second examines how DE enhances the capabilities of LLMs through retrieval-augmented generation (RAG) systems, multi-modal data management, and more. By including participants from academia and industry, the workshop aims to connect practical applications with research, encouraging collaboration that will shape the future of intelligent databases.

By inviting experts to deliver keynote speeches, the DExLLM workshop at ICDE'25 aims to share the latest innovations and breakthroughs on this topic and to serve as a beacon for current and future research. The oral and poster sessions give researchers in DE and natural language processing a platform to exchange ideas, summarize existing work, and discuss future directions. The workshop will focus on the under-explored capabilities of both DE and LLMs, including enhancing LLMs with powerful DBMS and empowering data engineering with the advantages of LLMs. By including participants from both academia and industry, the workshop seeks to narrow the gap between applications and methodology studies and, more importantly, to push the boundaries of data engineering in the era of LLMs.


Topics of Interest

Researchers in the fields of data management and artificial intelligence have begun to explore the integration of LLMs and DE to address the evolving challenges in data processing. Initial studies highlight the potential of LLMs to enhance tasks such as query optimization and natural-language-to-SQL translation, providing new avenues for improving DBMS efficiency. Despite these promising developments, the intersection of LLMs and DBMS remains a relatively nascent area that requires further investigation and collaboration.

To facilitate knowledge sharing and encourage contributions from both academic and industry perspectives, the workshop welcomes theory and methodology papers within the scope of the following themes, including but not limited to:

  • LLM-empowered DE
    • LLM-enabled Improvements to Foundational DB Algorithms
    • LLM-based Anomaly Detection in DBMS
    • Learned Database Design, Configuration, and Tuning
    • LLM for Query Optimization, Indexing, and Partitioning
    • Novel Query Interfaces and Interactive Query Refinement
    • Data Cleaning, Integration, and Augmentation with LLMs
    • Reasoning Over Knowledge Bases with LLMs
    • Automated Data Understanding and Visualization
    • Natural Language Queries and SQL Co-pilots
    • Text-to-SQL Design and Solutions
    • Context-aware Data Retrieval with LLMs
    • Personalized Query Optimization Using LLMs
  • Data Engineering-Enhanced LLMs
    • Data Collection and Preparation for LLMs
    • Data and Metadata Management for the LLM Lifecycle
    • DB-inspired Techniques for Modeling, Storage, and Provenance of LLMs
    • Data Management for Multi-modal LLMs
    • Vector Data Management for LLMs
    • RAG System Retrieval and Reasoning for LLMs
    • Novel Data Management Systems for Accelerating LLM Training


Important Dates

  • Submission Deadline: January 7th, 2025
  • Notification of Acceptance: February 10th, 2025
  • Camera-ready Paper Due: March 1st, 2025
  • DExLLM at ICDE'25 Workshop Day: May 19th, 2025, Half-Day (AM)

Submission Details

Workshop papers should not exceed 12 pages in length (at most 8 pages of main paper content, 2 pages of appendices, and 2 pages of references). Papers must be submitted in PDF format according to the IEEE template published in the IEEE guidelines, selecting the generic “conference” sample. The PDF files must have all non-standard fonts embedded. Workshop papers must be self-contained and in English. Submissions will be reviewed double-blind, so author names and affiliations should NOT be listed. Submitted works will be assessed based on their novelty, technical quality, potential impact, and clarity of writing. Please refer to the ICDE'25 website for further details.

Note that at least one author of each accepted workshop paper must register for the workshop (details to come on the main ICDE'25 website). For questions about submission, please contact us at: wenqifan03@gmail.com

To submit your work, please use the following Submission Link.

Workshop Program

09:00 ~ 09:10 Opening Remarks
09:10 ~ 09:55 Invited Talk I:
09:55 ~ 10:40 Invited Talk II:

10:40 ~ 11:00 Coffee Break

11:00 ~ 11:30 Invited Talk III:
11:30 ~ 12:00 Oral Presentation:
12:00 ~ 12:20 Closing Remarks

Keynote Speakers

TBA

Organization


Workshop Co-Chairs

Qing Li

Professor

The Hong Kong Polytechnic University

Wenqi Fan

Assistant Professor

The Hong Kong Polytechnic University

Pangjing Wu

Research Fellow

The Hong Kong Polytechnic University