Department of Informatics Blockchain and Distributed Ledger Technologies

AI-Enhanced Blockchain Analytics: An LLM-Powered Approach

Level: MA
Contact Person: Mostafa Chegeni
Keywords: Natural Language Processing (NLP), Large Language Models (LLMs), Blockchain Data Analysis, Prompt Engineering


This master's thesis introduces a Python-based application that streamlines data extraction and analysis for the Cardano blockchain. Inspired by AI-assisted coding and query tools such as GitHub Copilot [1], CodexDB [2], and GPT-DB [3], the tool interprets data analysis tasks given as natural language prompts and executes them. At its core is a customized GPT model, tailored to the Cardano blockchain data structure as stored in a PostgreSQL database [4], which interprets user prompts and generates Python code and SQL queries.
The process starts when the user enters analysis instructions as a natural language prompt. The customized GPT model, accessed through the OpenAI Assistants API [5], processes the prompt and generates a Python script containing SQL queries. This script extracts and analyzes the requested data according to the user's instructions.
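The prompt-to-script step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: `fake_model` stands in for the call to the Assistants API [5], and `SCHEMA_EXCERPT`, `build_instructions`, `extract_script`, and `prompt_to_script` are hypothetical names. The listed tables and columns are an assumption based on the cardano-db-sync schema document [4] and should be checked against the deployed database.

```python
import re

# Built indirectly so this listing can itself be placed inside a code fence.
FENCE = "`" * 3

# Hypothetical schema excerpt passed to the model as context (assumption:
# names follow the cardano-db-sync schema document [4]).
SCHEMA_EXCERPT = (
    "block(id, epoch_no, slot_no, block_no, time, tx_count)\n"
    "tx(id, block_id, fee, out_sum, size)\n"
)


def build_instructions(schema: str) -> str:
    """Compose the system-level instructions for the customized model."""
    return (
        "You write Python scripts that analyze Cardano data stored in "
        "PostgreSQL.\n"
        "Use only these tables and columns:\n"
        f"{schema}"
        "Reply with exactly one fenced python code block."
    )


def extract_script(reply: str) -> str:
    """Return the body of the first python-fenced block in a model reply."""
    pattern = re.compile(FENCE + r"python\s*\n(.*?)" + FENCE, re.DOTALL)
    match = pattern.search(reply)
    if match is None:
        raise ValueError("model reply contains no python code block")
    return match.group(1).strip()


def fake_model(instructions: str, user_prompt: str) -> str:
    """Stand-in for the real API call; always returns the same canned reply."""
    return (
        "Here is the script.\n"
        + FENCE + "python\n"
        + 'QUERY = "SELECT epoch_no, SUM(tx_count) FROM block GROUP BY epoch_no"\n'
        + "print(QUERY)\n"
        + FENCE + "\n"
    )


def prompt_to_script(user_prompt: str, model=fake_model) -> str:
    """Turn a natural language prompt into the text of a Python script."""
    instructions = build_instructions(SCHEMA_EXCERPT)
    return extract_script(model(instructions, user_prompt))


if __name__ == "__main__":
    print(prompt_to_script("How many transactions were there per epoch?"))
```

Keeping the model call behind a plain function argument means the extraction logic can be tested without network access, and the stub can later be swapped for a real client.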
The application then executes the Python script, which retrieves, analyzes, and visualizes the data. The result is a visual representation of the analyzed data that ties the user's input directly to the AI-generated output.
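One way to picture the execution step is to run the generated script in a separate interpreter process and capture its output. The sketch below assumes this design; the source does not say how the application runs the script, and `run_generated_script` and the file name `generated_analysis.py` are hypothetical. A subprocess with a timeout limits runaway scripts but is not a security sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_script(
    script: str, workdir: Path, timeout_s: float = 60.0
) -> subprocess.CompletedProcess:
    """Save the script in workdir and run it with the current interpreter.

    Files the script writes (for example saved plots) stay in workdir.
    Raises subprocess.TimeoutExpired if the script runs longer than timeout_s.
    """
    workdir.mkdir(parents=True, exist_ok=True)
    path = workdir / "generated_analysis.py"
    path.write_text(script, encoding="utf-8")
    return subprocess.run(
        [sys.executable, str(path)],
        cwd=workdir,
        capture_output=True,  # collect stdout and stderr for display
        text=True,            # decode output as str instead of bytes
        timeout=timeout_s,
    )


if __name__ == "__main__":
    demo_dir = Path(tempfile.mkdtemp())
    result = run_generated_script("print('rows analysed:', 2 + 3)", demo_dir)
    print(result.returncode, result.stdout.strip())
```

Returning the `CompletedProcess` lets the caller decide how to report failures: a non-zero `returncode` together with `stderr` could, for instance, be shown to the user or fed back to the model for a retry.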
A key aspect of the research is prompt engineering [6,7]: examining how different prompt styles and formats affect the model's performance. The aim is to improve the accuracy and usefulness of the model's responses, both of which are crucial for efficient data analysis.
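A comparison of prompt styles could be organized as a small evaluation harness like the one below. Everything in it is an assumption for illustration: the three styles in `PROMPT_STYLES`, the helper names, and the scoring rule are not taken from the thesis, and `fake_generate` replaces the real model. Scoring uses normalized exact match on the SQL text, which is a crude metric; comparing query results would be more robust.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical prompt styles; the source does not list which ones it compares.
PROMPT_STYLES: Dict[str, str] = {
    "bare": "{question}",
    "role": "You are a Cardano data analyst. {question}",
    "schema": "Tables: {schema}\nWrite one SQL query. {question}",
}


def render(style: str, question: str, schema: str) -> str:
    """Fill the template of the given style with the question and schema."""
    return PROMPT_STYLES[style].format(question=question, schema=schema)


def normalize(sql: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing semicolon."""
    return " ".join(sql.lower().split()).rstrip(";")


def accuracy_by_style(
    cases: List[Tuple[str, str]],
    generate: Callable[[str], str],
    schema: str,
) -> Dict[str, float]:
    """Return, per style, the share of cases whose SQL matches the expected SQL."""
    if not cases:
        raise ValueError("at least one test case is required")
    scores: Dict[str, float] = {}
    for style in PROMPT_STYLES:
        hits = sum(
            normalize(generate(render(style, question, schema)))
            == normalize(expected)
            for question, expected in cases
        )
        scores[style] = hits / len(cases)
    return scores


def fake_generate(prompt: str) -> str:
    """Stand-in model: names the right table only when shown the schema."""
    if prompt.startswith("Tables:"):
        return "SELECT COUNT(*) FROM tx;"
    return "SELECT COUNT(*) FROM transactions;"


if __name__ == "__main__":
    cases = [("How many transactions are stored?", "select count(*) from tx")]
    print(accuracy_by_style(cases, fake_generate, "tx(id, block_id, fee)"))
```

With a real model in place of `fake_generate`, the same loop would yield one accuracy figure per prompt style over a shared set of questions, which makes the styles directly comparable.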
In conclusion, the thesis presents an innovative approach to blockchain data analysis using AI and machine learning, contributing both to blockchain analytics and to the broader field of data science.

References:

[1] Friedman, N., 2021. Introducing GitHub Copilot: your AI pair programmer. https://github.blog/2021-06-29-introducing-github-copilot-ai-pairprogrammer/
[2] Trummer, I., 2022. CodexDB: Synthesizing code for query processing from natural language instructions using GPT-3 Codex. Proceedings of the VLDB Endowment, 15(11), pp.2921-2928.
[3] Trummer, I., 2023. Demonstrating GPT-DB: Generating Query-Specific and Customizable Code for SQL Processing with GPT-4. Proceedings of the VLDB Endowment, 16(12), pp.4098-4101.
[4] IOHK. Schema documentation for cardano-db-sync. https://github.com/input-output-hk/cardano-db-sync/blob/master/doc/schema.md
[5] OpenAI. Create assistant. https://platform.openai.com/docs/api-reference/assistants/createAssistant
[6] Ekin, S., 2023. Prompt engineering for ChatGPT: A quick guide to techniques, tips, and best practices. Authorea Preprints.
[7] Marvin, G., Hellen, N., Jjingo, D. and Nakatumba-Nabende, J., 2023, June. Prompt Engineering in Large Language Models. In International Conference on Data Intelligence and Cognitive Informatics (pp. 387-402). Singapore: Springer Nature Singapore.