This workshop aims to accelerate research at the intersection of symbolic knowledge and the statistical knowledge inherent in LLMs. The objective is to establish quantifiable methods and accepted metrics for addressing consistency, reliability, and safety in LLMs. In parallel, we seek unimodal and multimodal NeuroSymbolic solutions that mitigate LLM issues through context-aware explanations and reasoning. The workshop also focuses on critical applications of LLMs in health informatics, biomedical informatics, crisis informatics, cyber-physical systems, and the legal domain. We invite submissions that present novel developments and assessments of informatics methods, including those that showcase the strengths and weaknesses of using LLMs.
Time (GMT+2) | Event | Details |
---|---|---|
8:50 AM - 9:00 AM | Welcome Address and Introduction | Organizing Committee Members and Keynote Speakers |
9:00 AM - 9:40 AM | Keynote Talk #1 | Dr. Amit P. Sheth |
9:45 AM - 10:00 AM | Oral Presentation 1 | Vinicius Monteiro de Lira et al. |
10:00 AM - 10:15 AM | Oral Presentation 2 | Hannah Sansford et al. |
10:15 AM - 10:40 AM | Invited Talk #1 | Andrii Skomorokhov, Haltia.AI |
10:45 AM - 11:20 AM | Keynote Talk #2 | Dr. Alex Jaimes, Dataminr Inc. |
11:20 AM - 11:30 AM | BREAK (10 Mins) | |
11:30 AM - 11:35 AM | Lightning Talk 1 | Bing Hu et al. |
11:35 AM - 11:40 AM | Lightning Talk 2 | Amirhossein Ghaffari et al. |
11:40 AM - 11:45 AM | Lightning Talk 3 | Walid S. Saba |
11:45 AM - 11:50 AM | Lightning Talk 4 | Firuz Juraev et al. |
11:50 AM - 11:55 AM | BREAK (5 Mins) | |
11:55 AM - 12:30 PM | Keynote Talk #3 | Dr. Huzefa Rangwala, Amazon/George Mason University |
12:30 PM - 12:45 PM | Invited Talk #2 | Negar Foroutan Eghlidi, EPFL (Antoine Bosselut) |
12:45 PM - 1:00 PM | Oral Presentation #3 | Ishwar B Balappanawar |
1:00 PM - 1:15 PM | Oral Presentation #4 | Ziyi Shou et al. |
| Closing Remarks for the Workshop | |
Theme: Improving LLMs with Consistency, Reliability, Explainability, and Safety
NeuroSymbolic and Knowledge-infused Learning
Camera Ready Submission: ~~June 30, 2024~~ July 7, 2024
We welcome original research papers in four submission types:
A skilled and multidisciplinary program committee will evaluate all submitted papers, focusing on the originality of the work and its relevance to the workshop's theme. Submissions must follow the KDD 2024 conference template and will undergo a double-blind review process. More details regarding submission can also be found at https://kdd2024.kdd.org/research-track-call-for-papers/. Selected papers will be presented at the workshop and published open access in the workshop proceedings through CEUR, where they will be available as archival content.
University of Maryland Baltimore County, USA
(Primary Contact)
Email: manas@umbc.edu
Samsung Research, Cambridge, UK
Email: efi.tsamoura@samsung.com
Booz Allen Hamilton, USA
Email: Raff_Edward@bah.com
Amazon, USA
Email: veduln@amazon.com
Ohio State University, USA
Email: srini@cse.ohio-state.edu
Research Lead, NLP at Haltia.AI
Email: andrii.skomorokhov@haltia.ai
PhD student at EPFL
Email: negar.foroutan@epfl.ch
Abstract: Pedro Domingos's influential 2012 paper made a crucial point with the phrase "Data alone is not enough." I have long shared this belief, as reflected in our Semantic Search engine, commercialized in 2000 and detailed in a patent. We enhanced machine learning classifiers with a comprehensive WorldModel™, known today as a knowledge graph, to improve named entity recognition, relationship extraction, and semantic search. This early project highlighted the synergy between data-driven statistical learning and knowledge-supported symbolic AI methods, a key idea driving the fast-emerging field of NeuroSymbolic AI.
LLMs, while impressive in their ability to understand and generate human-like text, have limitations in reasoning. They excel at pattern recognition, language processing, and generating coherent text from input. However, their reasoning capabilities are limited by their lack of true understanding or awareness of concepts, contexts, and causal relationships beyond the statistical patterns in the data they were trained on. While they can perform certain types of reasoning tasks (e.g., simple logical deductions or basic arithmetic), they often struggle with more complex forms of reasoning that require deeper understanding, context awareness, or commonsense knowledge. They may produce responses that appear rational on the surface but lack genuine comprehension or logical consistency. Furthermore, their reasoning does not adapt well to the changing environment (where data and knowledge change) in which the AI model operates.
Solution: Neurosymbolic AI combined with Custom and Compact Models: AI models can be augmented with neurosymbolic methods and external knowledge sources, resulting in compact (small size, high performance) and custom (vertical, addressing specific application/use) models. They can support efficient adaptation to changing data and knowledge. By integrating neurosymbolic approaches, these models acquire a structured understanding of data, enhancing interpretability and reliability (e.g., through verifiability audits using reasoning traces). This structured understanding fosters safer and more consistent behavior and facilitates efficient adaptation to evolving information, ensuring agility in handling dynamic environments. Furthermore, incorporating external knowledge sources enriches the model's understanding and adaptability for the chosen domains, bolstering its efficiency in tackling varied specialized tasks. The small size of these models enables rapid deployment and contributes to computational efficiency, better management of constraints, and faster re-training/fine-tuning/inference. Our current work involves applications to health, autonomous vehicles, and smart manufacturing.
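To make the knowledge-infusion idea above concrete, here is a minimal illustrative sketch (not taken from the talk or any described system) of grounding an LLM prompt in a small symbolic knowledge graph and auditing the answer against it. The toy triples, the helper names, and the `call_llm` stub are assumptions introduced purely for illustration.

```python
from typing import List, Tuple

# Toy knowledge graph: (subject, relation, object) triples for one domain.
KG: List[Tuple[str, str, str]] = [
    ("metformin", "treats", "type 2 diabetes"),
    ("metformin", "contraindicated_with", "severe renal impairment"),
]

def retrieve_facts(entity: str) -> List[str]:
    """Symbolic retrieval: verbalize every triple that mentions the entity."""
    return [f"{s} {r.replace('_', ' ')} {o}" for s, r, o in KG if entity in (s, o)]

def build_grounded_prompt(question: str, entity: str) -> str:
    """Prepend retrieved facts so the LLM answers over curated knowledge."""
    facts = "\n".join(f"- {f}" for f in retrieve_facts(entity))
    return f"Use only these facts:\n{facts}\n\nQuestion: {question}"

def audit_answer(answer: str, entity: str) -> bool:
    """Crude verifiability audit: the answer should mention at least one
    object linked to the entity in the knowledge graph."""
    objects = [o for s, _, o in KG if s == entity]
    return any(o.lower() in answer.lower() for o in objects)

if __name__ == "__main__":
    prompt = build_grounded_prompt("What does metformin treat?", "metformin")
    print(prompt)
    # answer = call_llm(prompt)                # hypothetical LLM API call
    # print(audit_answer(answer, "metformin"))
```

The sketch separates the symbolic component (retrieval and audit over explicit triples) from the statistical component (the LLM call), which is the division of labor the abstract attributes to neurosymbolic augmentation; real systems would replace the toy graph and string matching with proper KG stores and reasoning over traces.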
Bio: Professor Amit Sheth is an educator, researcher, and entrepreneur. He founded the university-wide AI Institute at the University of South Carolina (AIISC) in 2019 and grew it to nearly 50 AI researchers in four years. He is a fellow of IEEE, AAAI, AAAS, ACM, and AIAA. His awards include the IEEE CS Wallace McDowell Award and the IEEE TCSVC Research Innovation Award. He has co-founded four companies and ran two of them. These include Taalee/Semagix (founded 1999), which pioneered Semantic Search; ezDI (founded 2014), which supported knowledge-infused clinical NLP/NLU; and Cognovi Labs (founded 2016), an emotion AI company. He is proud of the success of the more than 45 Ph.D. advisees and postdocs he has advised and mentored.
Abstract: Dataminr’s AI Platform discovers the earliest signals of events, risks, and threats from billions of multi-modal inputs from over one million public data sources. It uses predictive AI to detect events, generative AI to describe them, and regenerative AI to generate live briefs that continuously update as events unfold. The events discovered by the platform help first responders quickly respond to emergencies, they help corporate security teams respond to risks (including Cyber risks), and they help news organizations discover breaking events to provide fast and accurate coverage. Building and deploying a large-scale AI platform like Dataminr’s is fraught with research and technical challenges. This includes tackling the hardest problem in AI (determining the real-time value of information), which requires combining a multitude of AI approaches. In this talk, I will focus on the role of knowledge in the platform, particularly the role of Knowledge Graphs and how they can be used in conjunction with LLMs in critical real-time applications. I’ll point to the main research challenges in building and leveraging Knowledge Graphs in critical applications.
Bio: Alex Jaimes leads the AI efforts at Dataminr, focusing on leveraging AI to detect and respond to critical events in real-time. His work has significant impacts on first responders, corporate security teams, and news organizations, providing them with the necessary tools to act quickly and accurately in high-stakes situations.
Bio: At AWS AI/ML, Huzefa Rangwala leads a team of scientists and engineers advancing AWS services through graph machine learning, reinforcement learning, AutoML, low-code/no-code generative AI, and personalized AI solutions. His passion extends to transforming the analytical sciences with the power of generative AI. He is a Professor of Computer Science and the Lawrence Cranberg Faculty Fellow at George Mason University, where he also served as interim Chair from 2019 to 2020. He is the recipient of the National Science Foundation (NSF) CAREER Award, the 2014 university-wide Teaching Award, the Emerging Researcher/Creator/Scholar Award, and the 2018 Undergraduate Research Mentor Award. In 2022, he co-chaired the ACM SIGKDD conference in Washington, DC. His research interests include structured learning, federated learning, and ML fairness, intertwined with applying ML to problems in biology, biomedical engineering, and the learning sciences.