Paper reading 22: a group of arXiv papers on code repair

CoSIL: Software Issue Localization via LLM-Driven Code Repository Graph Searching

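This is essentially a hierarchical search. Starting from the issue description, it first searches the Module Call Graph for relevant code files. It then builds a Function Call Graph for those files and searches it for relevant functions.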
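The two-level search can be sketched as below. This is only a minimal illustration, not the paper's implementation: the keyword-based `relevant` check stands in for the LLM's relevance judgment, and `bfs_search`, `localize`, and the toy graphs are all hypothetical names and data.

```python
from collections import deque


def bfs_search(graph, start_nodes, is_relevant, max_depth=2):
    """Breadth-first walk over a call graph, collecting relevant nodes."""
    seen = set(start_nodes)
    queue = deque((node, 0) for node in start_nodes)
    hits = []
    while queue:
        node, depth = queue.popleft()
        if is_relevant(node):
            hits.append(node)
        if depth < max_depth:
            for callee in graph.get(node, []):
                if callee not in seen:
                    seen.add(callee)
                    queue.append((callee, depth + 1))
    return hits


def localize(issue_keywords, module_graph, function_graphs, entry_modules):
    """Stage 1: find relevant files. Stage 2: find relevant functions in them."""
    def relevant(name):
        # Assumption: a keyword match replaces the LLM's relevance decision.
        return any(keyword in name.lower() for keyword in issue_keywords)

    files = bfs_search(module_graph, entry_modules, relevant)
    result = {}
    for file_name in files:
        function_graph = function_graphs.get(file_name, {})
        functions = bfs_search(function_graph, list(function_graph), relevant, max_depth=1)
        if functions:
            result[file_name] = functions
    return result


# Toy data (hypothetical): module-level and function-level call graphs.
module_graph = {
    "app.py": ["parser.py", "utils.py"],
    "parser.py": ["tokenizer.py"],
    "utils.py": [],
    "tokenizer.py": [],
}
function_graphs = {
    "parser.py": {"parse_expr": ["parse_term"], "parse_term": []},
    "tokenizer.py": {"tokenize": ["next_token"], "next_token": []},
}

located = localize(["parse"], module_graph, function_graphs, ["app.py"])
print(located)
```

Only the files that survive the first stage have their function graphs searched, which is what keeps the amount of context per step small.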

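Compared with an abstract syntax tree, a function call graph does capture the contextual relationships between functions more precisely, so fault localization and context collection should be more accurate. Expanding the search level by level also eases the LLM's context-window limit to some extent (especially since, in this study, both call graphs are built by the LLM).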

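A backtracking mechanism also seems worth adding: if the target cannot be found in the function call graph, backtrack to the remaining subgraph of the module call graph and continue searching there.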
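One way this backtracking idea might look is sketched below. It is my own illustration and not part of CoSIL: the function-level search is abstracted as a hypothetical `find_functions` callback, and "the remaining subgraph" is approximated by the unvisited callees of the modules already tried.

```python
def localize_with_backtracking(module_graph, entry, find_functions):
    """Search modules level by level; widen the search only when nothing is found."""
    visited = set()
    frontier = [entry]
    while frontier:
        hits = {}
        for module in frontier:
            visited.add(module)
            functions = find_functions(module)  # function-level search (assumed callback)
            if functions:
                hits[module] = functions
        if hits:
            return hits
        # Backtrack: nothing found here, so move on to the unvisited
        # part of the module call graph.
        next_frontier = []
        for module in frontier:
            for callee in module_graph.get(module, []):
                if callee not in visited and callee not in next_frontier:
                    next_frontier.append(callee)
        frontier = next_frontier
    return {}


# Toy data (hypothetical): only tokenizer.py contains a suspicious function.
module_graph = {
    "app.py": ["parser.py", "utils.py"],
    "parser.py": ["tokenizer.py"],
    "utils.py": ["tokenizer.py"],
    "tokenizer.py": [],
}
suspicious = {"tokenizer.py": ["next_token"]}

found = localize_with_backtracking(
    module_graph, "app.py", lambda module: suspicious.get(module, [])
)
print(found)
```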

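For Defects4J-style (d4j) datasets this could be even simpler: extract the functions called in the test cases with a regular expression, then search for them in the function call graph. This helps most for tests where the function under test does not raise the error itself and therefore never shows up in the exception stack trace.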
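A rough sketch of that regex-based shortcut follows. The filtering rules (skipping keywords, capitalized constructor names, and `assert`/`test`/`fail` prefixes) are heuristics assumed here for illustration, and the Java test and call graph are made-up examples.

```python
import re

# An identifier immediately followed by "(" is treated as a call site.
CALL_RE = re.compile(r"\b([A-Za-z_]\w*)\s*\(")
JAVA_KEYWORDS = {"if", "for", "while", "switch", "catch", "return", "synchronized"}


def extract_called_methods(test_source):
    """Return method names called in a Java test, in order of first appearance."""
    names = []
    for name in CALL_RE.findall(test_source):
        if name in JAVA_KEYWORDS or name[0].isupper():
            continue  # control-flow keywords and constructors are skipped
        if name.startswith(("assert", "test", "fail")):
            continue  # JUnit helpers and the test method's own declaration
        if name not in names:
            names.append(name)
    return names


def reachable(call_graph, seeds):
    """Collect every function reachable from the seed functions."""
    seen = set()
    stack = [seed for seed in seeds if seed in call_graph]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(call_graph.get(node, []))
    return seen


# Hypothetical test case and function call graph.
test_source = """
@Test
public void testParseNegative() {
    Parser p = new Parser();
    int v = p.parseInt("-1");
    assertEquals(-1, normalize(v));
}
"""
call_graph = {"parseInt": ["readDigits"], "normalize": [], "readDigits": []}

seeds = extract_called_methods(test_source)
candidates = reachable(call_graph, seeds)
print(seeds, sorted(candidates))
```

Here `readDigits` becomes a candidate even though the test never names it, which is the case a stack trace alone would miss when no exception is thrown inside it.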

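The study looks for the best generation strategy by balancing the number of patch iterations against the number of patches generated in each iteration. It examines the following strategies, each of which produces ten outputs in total: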

Strategy A (10×1): Generate ten outputs in a single iteration.
Strategy B (8-2): Generate eight outputs in the first iteration, and two outputs in the next iteration.
Strategy C (5×2): Generate five outputs per iteration over two iterations.
Strategy D (6-2-2): Generate six outputs in the first iteration, and two outputs in the next two iterations.
Strategy E (4-3-3): Generate four outputs in the first iteration, and three outputs in the next two iterations.
Strategy F (2×5): Generate two outputs per iteration over five iterations.
Strategy G (1×10): Generate one output per iteration over ten iterations.
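The seven strategies can be written down as per-iteration schedules, which makes it easy to check that they share the same budget of ten outputs. This is only a sketch: `run_schedule` and its `generate` callback are hypothetical, and since these notes do not say how earlier outputs feed into later iterations, the callback simply receives them.

```python
# Number of outputs requested in each iteration, per strategy.
STRATEGIES = {
    "A": [10],
    "B": [8, 2],
    "C": [5, 5],
    "D": [6, 2, 2],
    "E": [4, 3, 3],
    "F": [2] * 5,
    "G": [1] * 10,
}


def run_schedule(schedule, generate):
    """Call generate(n, iteration, previous) once per iteration and pool the outputs."""
    patches = []
    for iteration, count in enumerate(schedule):
        patches.extend(generate(count, iteration, list(patches)))
    return patches


def dummy_generate(count, iteration, previous):
    # Placeholder generator: labels each output with its iteration and index.
    return [f"patch-{iteration}-{index}" for index in range(count)]


for name, schedule in STRATEGIES.items():
    outputs = run_schedule(schedule, dummy_generate)
    print(name, len(schedule), "iterations,", len(outputs), "outputs")
```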