Generating Analytic Specifications for Data Visualization from Natural Language Queries using Large Language Models
This was presented at the IEEE VIS NLVIZ Workshop 2024.
Released in 2024, this version of NL4DV lets developers use a Large Language Model (GPT) to translate a natural language (NL) query about a dataset into a relevant visualization, with additional features such as multi-turn conversational interaction and ambiguity resolution. We present a comprehensive text prompt that, given a tabular dataset and an NL query about it, generates an analytic specification comprising (detected) data attributes, (inferred) analytic tasks, and (recommended) visualizations. This specification captures key aspects of the query translation process, affording both explainability and debuggability. For instance, it provides mappings from the detected entities to the corresponding phrases in the input query, as well as the specific visual design principles that determined the visualization recommendations. Moreover, unlike prior LLM-based approaches, our prompt supports conversational interaction and ambiguity detection. In our paper, we detail the iterative process of curating our prompt, present a preliminary performance evaluation using GPT-4, and discuss the strengths and limitations of LLMs at various stages of query translation. Check it out at https://nl4dv.github.io/nl4dv/
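To make the idea of an analytic specification more concrete, the following is a minimal sketch of how such a structure could be modeled in Python. It is an illustration only: the class names, field names, the example query, and the movie-style column names are all hypothetical and are not taken from the actual prompt output or the NL4DV API. It shows the three parts described above (attributes, tasks, visualizations), the mapping from each detected entity back to a query phrase, and one simple way ambiguity could be surfaced.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DetectedAttribute:
    name: str                      # dataset column chosen for this phrase
    query_phrase: str              # phrase in the query that triggered detection
    alternatives: list[str] = field(default_factory=list)  # other plausible columns


@dataclass
class InferredTask:
    task: str                      # e.g. "correlation", "filter", "distribution"
    query_phrase: str              # phrase in the query that implied the task


@dataclass
class RecommendedVis:
    mark: str                      # e.g. "point", "bar", "line"
    encoding: dict[str, str]       # channel -> column name
    design_rationale: str          # design principle behind the recommendation


@dataclass
class AnalyticSpec:
    query: str
    attributes: list[DetectedAttribute]
    tasks: list[InferredTask]
    visualizations: list[RecommendedVis]

    def explain(self) -> list[str]:
        """Return one human-readable line per detected entity."""
        lines = [f'attribute "{a.name}" <- "{a.query_phrase}"' for a in self.attributes]
        lines += [f'task "{t.task}" <- "{t.query_phrase}"' for t in self.tasks]
        lines += [f'vis "{v.mark}": {v.design_rationale}' for v in self.visualizations]
        return lines

    def ambiguities(self) -> dict[str, list[str]]:
        """Map each ambiguous query phrase to all of its candidate columns."""
        return {
            a.query_phrase: [a.name, *a.alternatives]
            for a in self.attributes
            if a.alternatives
        }


# Hypothetical example: "rating" could refer to more than one column.
spec = AnalyticSpec(
    query="show the relationship between budget and rating",
    attributes=[
        DetectedAttribute("Production Budget", "budget"),
        DetectedAttribute("IMDB Rating", "rating", ["Rotten Tomatoes Rating"]),
    ],
    tasks=[InferredTask("correlation", "relationship")],
    visualizations=[
        RecommendedVis(
            mark="point",
            encoding={"x": "Production Budget", "y": "IMDB Rating"},
            design_rationale="two quantitative attributes suit a scatterplot",
        )
    ],
)

for line in spec.explain():
    print(line)
print(spec.ambiguities())
```

Keeping the query phrase next to every detected attribute and task is what would make a specification like this easy to debug: a wrong chart can be traced back to the phrase that produced it, and an ambiguous phrase can be put to the user in a follow-up turn.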