Issue | Wuhan Univ. J. Nat. Sci., Volume 29, Number 4, August 2024
---|---
Page(s) | 349 - 356
DOI | https://doi.org/10.1051/wujns/2024294349
Published online | 04 September 2024
Computer Science
CLC number: TP311
Ask Me Any Type: Type Inference Plugin for Partial Code on the Web and in the Integrated Development Environment
Ask Me Any Type: A Type Inference Plugin for Code Snippets, Embedded in Web Browsers and Integrated Development Environments (IDEs)
1 College of Computer and Information Engineering, Jiangxi Normal University, Nanchang 330022, Jiangxi, China
2 The High School Attached to Jiangxi Normal University, Nanchang 330013, Jiangxi, China
† Corresponding author. E-mail: 003484@jxnu.edu.cn
Received: 20 March 2023
Inferring the fully qualified names (FQNs) of undeclared receiving objects and non-fully-qualified type names (non-FQNs) in partial code is critical for effectively searching, understanding, and reusing that code. Existing type inference tools, such as COSTER and SNR, rely on a symbolic knowledge base and adopt a dictionary-lookup strategy to map the simple names of undeclared receiving objects and non-FQNs to FQNs. However, building a symbolic knowledge base requires parsing compilable code files, which limits the APIs and code contexts that can be collected and results in out-of-vocabulary (OOV) failures. To overcome the limitations of a symbolic knowledge base for FQN inference, we implemented Ask Me Any Type (AMAT), a type inference plugin embedded in web browsers and integrated development environments (IDEs). Instead of the dictionary-lookup strategy, AMAT uses a cloze-style fill-in-the-blank strategy for type inference. By treating code as text, AMAT leverages a fine-tuned large language model (LLM) as a neural knowledge base, thereby avoiding the need for code compilation. Experimental results show that AMAT outperforms state-of-the-art tools such as COSTER and SNR. In practice, developers can use AMAT to infer the FQNs of unresolved type names in real time and then reuse partial code directly.
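The contrast between the two strategies can be illustrated with a minimal sketch. This is not AMAT's implementation: the knowledge-base contents, the function names `lookup_fqn` and `build_cloze_prompt`, and the `<mask>` token are all assumptions made for illustration, and the prompt format AMAT actually uses is not given here. The sketch only shows why a dictionary lookup fails on an unseen simple name, and how the same snippet can instead be turned into a fill-in-the-blank prompt for a language model to complete.

```python
import re

# Hypothetical symbolic knowledge base: simple name -> known FQNs.
# A real one would be built by parsing compilable code files.
SYMBOLIC_KB = {
    "List": ["java.util.List"],
    "File": ["java.io.File"],
}


def lookup_fqn(simple_name):
    """Dictionary-lookup strategy: returns None for an out-of-vocabulary name."""
    candidates = SYMBOLIC_KB.get(simple_name)
    return candidates[0] if candidates else None


def build_cloze_prompt(code, simple_name, mask_token="<mask>"):
    """Cloze-style strategy: treat the code as text and put a mask in front of
    the first standalone occurrence of the simple name, leaving the package
    prefix as the blank for a language model to fill in."""
    pattern = r"\b" + re.escape(simple_name) + r"\b"
    # A function replacement avoids backslash handling in the replacement text.
    return re.sub(pattern, lambda m: f"{mask_token}.{simple_name}", code, count=1)


# A partial Java snippet with no import statements (illustrative only).
snippet = "Gson gson = new Gson();\nString json = gson.toJson(user);"

print(lookup_fqn("Gson"))                   # name is absent from the toy knowledge base
print(build_cloze_prompt(snippet, "Gson"))  # prompt text that would be sent to a model
```

In this sketch the lookup can only return what was collected in advance, whereas the cloze prompt needs nothing but the snippet text; the quality of the answer then depends entirely on the model that fills in the mask.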
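Abstract (translated from Chinese)
Inferring the fully qualified names (FQNs) of undeclared receiving objects and non-fully-qualified type names (non-FQNs) in code snippets is essential for effectively searching, understanding, and reusing those snippets. Existing type inference tools, such as COSTER and SNR, rely on a symbolic knowledge base and use a dictionary-lookup strategy to map the simple names of undeclared receiving objects and non-FQNs to FQNs. However, building a symbolic knowledge base requires parsing compilable code files, which limits the collection of APIs and code contexts, so the FQN being searched for may fall outside the knowledge base. To overcome the limitations of symbolic knowledge bases in FQN inference, this paper implements Ask-Me-Any-Type (AMAT), a type inference plugin embedded in web browsers and integrated development environments (IDEs). AMAT uses a fill-in-the-blank strategy rather than a dictionary-lookup strategy for type inference: by treating code as text, it uses a fine-tuned large language model (LLM) as a neural knowledge base, which avoids the need for code compilation. Experimental results show that AMAT outperforms tools such as COSTER and SNR. In practice, developers can use AMAT to infer the FQNs of unresolved type names in real time and reuse code snippets directly.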
Key words: type inference / large language model / prompt learning / web and integrated development environment (IDE) plugin
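Key words (translated from Chinese): type inference / large language model / prompt learning / web page and integrated development environment plugin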
Cite this article: CHENG Yu, HUANG Guanming, WU Yishun, et al. Ask Me Any Type: Type Inference Plugin for Partial Code on the Web and in the Integrated Development Environment[J]. Wuhan Univ J of Nat Sci, 2024, 29(4): 349-356.
Biography: CHENG Yu, male, undergraduate, research direction: intelligent software engineering. E-mail: yc@jxnu.edu.cn, 2905926811@qq.com
Foundation item: Supported by the Key Scientific and Technological Research Projects of the Jiangxi Provincial Department of Education (GJJ2200303) and the National Social Science Foundation Major Bidding Project (20&ZD068)
© Wuhan University 2024
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.