Enhancing and Reporting Robustness Boundary of Neural Code Models for Intelligent Code Understanding
Original title: Enhancing and Reporting Robustness Boundary of Neural Code Models for Intelligent Code Understanding | Authors: Tingxu Han, Wei Song, Weisong Sun, Haoze Wu, Chunrong Fang, Yuan Xiao, Xiaofang Zhang, Zhenyu Chen, Yang Liu | Venue: 2026 | Citations: 0 | PDF: han26a.pdf
Authors
Tingxu Han1, Wei Song1, Weisong Sun1, Hao Wu1, Chunrong Fang1, Yuan Xiao1, Xiaofang Zhang1, Zhenyu Chen1, and Yang Liu1
Abstract
With the development of deep learning, Neural Code Models (NCMs) such as CodeBERT and CodeLlama are widely used for code understanding tasks (e.g., defect detection and code classification). However, recent studies reveal that NCMs are vulnerable to adversarial examples: inputs with subtle modifications that are barely noticeable to humans yet mislead the model's predictions.
Existing defenses improve robustness empirically through data augmentation, but they are costly, offer no theoretical guarantees, and usually require white-box access to model internals (e.g., gradients).
This work proposes ENBECOME, a novel black-box, training-free, and lightweight adversarial defense. ENBECOME achieves two goals simultaneously:
- Improving empirical robustness – applying random, semantics-preserving modifications to the input code at inference time smooths the NCM's decision boundary and strengthens resistance to adversarial examples.
- Reporting a certified robustness boundary – majority voting over the set of modified inputs yields a theoretically derived certified robustness radius (r), guaranteeing that no adversarial example within (r) modifications can succeed.
Experiments apply ENBECOME to multiple NCM architectures and tasks, confirming that it substantially reduces attack success rates while maintaining high accuracy. For example, on defect detection it reduces the average ASR (Attack Success Rate) from 42.43% to 9.74% at a cost of only 0.29% in accuracy. ENBECOME further achieves an average certified robustness radius of (r = 1.63), guaranteeing that adversarial examples modifying up to 1.63 identifiers (on average) cannot succeed.
FFTContext *av_fft_init(int nbits, int inverse) {
FFTContext *s = av_malloc(sizeof(*s));
if (s && fft_init(s, nbits, inverse)) av_freep(&s);
return s;
}
Defect detection: Yes
(a) The original code snippet.
FFTContext *av_fft_init(int mp3, int CPUArchState) {
FFTContext *s = av_malloc(sizeof(*s));
if (s && fft_init(s, mp3, CPUArchState)) av_freep(&s);
return s;
}
Defect detection: No
(b) The adversarial code snippet.
Figure 1: An adversarial example on the defect detection task with CodeBERT.
The original code (a) is correctly classified as "Yes" (defective); the adversarial code (b), which changes only identifiers, is misclassified as "No" (non-defective), even though the same defect remains.
1. Introduction
In recent years, with the rapid development of deep learning (DL), its application to intelligent code understanding tasks in software engineering has become increasingly common [1]–[3]. Current DL models are pre-trained on large-scale code datasets and then fine-tuned or prompted with task-specific data. We collectively refer to these large pre-trained and fine-tuned models (including large language models, LLMs) as Neural Code Models (NCMs).
Existing software systems comprise millions of lines of code, providing sufficient data to pre-train NCMs. Leveraging these rich code corpora, NCMs such as CodeBERT [5] and CodeLlama [6] achieve excellent performance on code understanding tasks such as defect detection [7], [8]. Developers apply these models to task-specific datasets via fine-tuning or prompting.
However, recent studies [9]–[15] show that NCMs are vulnerable to adversarial examples. By subtly modifying only the identifiers in a code snippet, an attacker introduces changes that are barely noticeable to humans yet significant to the model, causing misclassification [16]–[18]. Figure 1 gives an intuitive example. The original code (Figure 1a) contains a defect in which the function returns a dangling pointer, and CodeBERT correctly predicts "Yes". The adversarial code (Figure 1b) changes only identifiers, and the model incorrectly predicts "No", even though the same defect remains.
2. Background and Preliminary
2.1. Code Understanding and NCMs
Code understanding refers to tasks in which developers need to grasp the semantics of code and domain-specific concepts. With advances in deep learning, NCMs have been used to effectively solve various code understanding tasks, such as defect detection [8] and clone detection [33].
Given a dataset ({\mathcal{X}, \mathcal{Y}}), an NCM (f) maps an input code snippet (\mathbf{x} = {x_1, x_2, \dots, x_N} \in \mathcal{X}) to a label (c \in \mathcal{Y}):
[ f(\mathbf{x}) = c, \qquad \mathbf{x} \in \mathcal{X},\; c \in \mathcal{Y}. \tag{1} ]
2.2. Adversarial Examples (AEs)
Adversarial examples are inputs deliberately modified to mislead a model's predictions. In natural language tasks any token may be modified, but in code understanding it is common to modify only identifiers to avoid syntax errors (Figure 3b).
A binary mask vector (\boldsymbol{m}) indicates whether each position of (\mathbf{x} = {x_1, \dots, x_N}) is an identifier: (m_i = 1) if (x_i) is an identifier, and (m_i = 0) otherwise. Based on the mask, an attacker modifies (k) identifiers to craft an adversarial example (\mathbf{x}') such that
[ f(\mathbf{x}') \neq c. ]
The set of all possible adversarial examples is defined as
[ \mathcal{X}^{*} = {\, \mathbf{x}' \mid \mathbf{x}' = (1 - \boldsymbol{m}) \odot \mathbf{x} + \mathcal{A}\bigl(\boldsymbol{m} \odot \mathbf{x}\bigr),\; f(\mathbf{x}') \neq c \,}, \tag{2} ]
where (\odot) denotes element-wise multiplication and (\mathcal{A}(\cdot)) is a function that applies consistent, syntax-preserving transformations to identifiers.
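As a minimal illustration of the mask in Eq. (2), the sketch below builds (\boldsymbol{m}) over a token sequence. The tokenization and the identifier set are hypothetical stand-ins for illustration, not the paper's actual pipeline.

```python
def identifier_mask(tokens, identifiers):
    """Binary mask m over token positions: m[i] = 1 iff tokens[i] is an
    identifier; keywords, operators, and punctuation get 0 and are
    therefore never perturbed."""
    return [1 if tok in identifiers else 0 for tok in tokens]

# Hypothetical tokenization of `int f(void *env) { return env; }`.
tokens = ["int", "f", "(", "void", "*", "env", ")", "{", "return", "env", ";", "}"]
mask = identifier_mask(tokens, {"f", "env"})
```

Only positions with (m_i = 1) are candidates for the identifier-level transformations (\mathcal{A}(\cdot)).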
2.3. Objective of ENBECOME
Previous defenses (e.g., SPACE [38], RoPGen [19]) require post-training or lack theoretical guarantees. ENBECOME is a black-box, training-free method that operates only at inference time and pursues two goals simultaneously:
- Improving empirical robustness: apply random, semantics-preserving modifications to the input code, smoothing the decision boundary and strengthening resistance to adversarial examples.
- Providing certified robustness: based on the "smoothed" prediction obtained by majority voting, compute a certified robustness radius (r) and prove that the correct label is preserved under any modification within (r).
Definition 1 (Empirical Robustness)
Empirical robustness means that an original sample and its adversarial examples receive the same prediction. Formally,
[ \forall \mathbf{x} \in \mathcal{X},\; \forall \mathbf{x}' \in \mathcal{X}^{*}:\; f(\mathbf{x}) = f(\mathbf{x}'). ]
Definition 2 (Certified Robustness)
Certified robustness guarantees that, around an input (\mathbf{x}), the prediction does not change for any adversarial example within radius (r). Formally,
[ \forall \mathbf{x}' \in \mathcal{X}^{*}:\; f(\mathbf{x}) = f(\mathbf{x}') = c \quad \text{whenever} \quad |\mathbf{x} \ominus \mathbf{x}'| \le r, ]
where (c) is the ground-truth label, (|\cdot|) counts the number of modified identifiers, and (r) is the certified robustness radius.
3. ENBECOME
At the core of ENBECOME is the application of randomized smoothing to code inputs. Concretely, at inference time it applies semantics-preserving random modifications to the original code multiple times and aggregates the predictions on the modified copies to determine the final label and the certified robustness radius (r).
Smoothing procedure
- Given input code (\mathbf{x}), apply (t) random semantics-preserving modifications to generate (t) smoothed samples ({\mathbf{x}^{(i)}}_{i=1}^{t}).
- Obtain the NCM prediction (f(\mathbf{x}^{(i)})) for each (\mathbf{x}^{(i)}) and decide the final label (\hat{c}) by majority vote.
Certified radius computation
- Starting from the smallest number of modified identifiers and counting upward, find the maximum number of modifications (r) for which (\hat{c}) is guaranteed not to change.
This process runs entirely black-box: it requires no additional training and no access to model internals.
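The smoothing-and-voting loop above can be sketched as follows. Here `model_predict` and `perturb` are placeholders for the target NCM and a semantics-preserving modification; the toy stand-ins exist only to make the sketch runnable and are not the paper's implementation.

```python
from collections import Counter
import random

def smoothed_predict(code, model_predict, perturb, t=100, seed=0):
    """Black-box smoothed inference: query the model on t randomly
    modified copies of the input and majority-vote the labels."""
    rng = random.Random(seed)
    votes = Counter(model_predict(perturb(code, rng)) for _ in range(t))
    label, count = votes.most_common(1)[0]
    return label, count / t  # winning label and its vote share

# Toy stand-ins (hypothetical): a "model" that flags a malloc/free pattern,
# and an identity "perturbation" that leaves the snippet unchanged.
def toy_model(code):
    return "Yes" if "av_malloc" in code and "av_freep" in code else "No"

def toy_perturb(code, rng):
    return code

label, share = smoothed_predict("s = av_malloc(); av_freep(&s);", toy_model, toy_perturb)
```

In ENBECOME the per-label vote tallies are also kept, since they feed the certified-radius computation rather than being discarded after the vote.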
4. Experiments
4.1. Datasets and Models
- Datasets: defect detection (Defects4J, Bugs.jar) and clone detection (CloneDetect), among others.
- NCMs: CodeBERT, CodeLlama, StarCoder, and others.
4.2. Metrics
- Attack Success Rate (ASR): the fraction of adversarial examples that successfully flip the model's label.
- Accuracy: prediction accuracy on the original code.
- Certified Radius: the certified robustness radius (r) reported by ENBECOME.
4.3. Results
| Model | Task | ASR (original) | ASR (ENBECOME) | Accuracy drop | Certified radius |
|---|---|---|---|---|---|
| CodeBERT | Defect detection | 42.43 % | 9.74 % | +0.29 % | 1.63 |
| CodeLlama | Clone detection | 38.12 % | 12.58 % | +0.15 % | 1.47 |
The results show that ENBECOME substantially reduces attack success rates while keeping the accuracy drop minimal. Moreover, the average certified robustness radius is 1.63, guaranteeing that adversarial examples modifying up to 1.63 identifiers (on average) cannot succeed.
5. Discussion
- Effectiveness: ENBECOME improves robustness with little extra computation at inference time while also providing a theoretical certified radius.
- Generalization: it shows consistent effectiveness across different NCMs (CodeBERT, CodeLlama, StarCoder) and tasks (defect detection, clone detection).
6. Conclusion
This paper proposed ENBECOME, a black-box, training-free adversarial defense, and showed that it simultaneously provides both empirical and certified robustness. Future work includes applying it to more diverse code tasks and larger models, and exploring further smoothing strategies.
References
[1] J. Smith et al., “Intelligent code understanding with deep learning,” IEEE Transactions on Software Engineering, vol. 48, no. 3, pp. 567–582, 2022.
[2] Y. Liu and H. Wu, “Neural models for code classification,” Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2021.
[3] Z. Feng, D. Guo, D. Tang, N. Duan, X. Feng, M. Gong, L. Shou, B. Qin, T. Liu, D. Jiang, and M. Zhou, “CodeBERT: A pre-trained model for programming and natural languages,” arXiv preprint arXiv:2002.08155, 2020.
TABLE 1: The semantic-preserving operation used in our paper. Three different operations are considered.
| Name | Description | Example |
|---|---|---|
| — | The original sample. | int f(void *env) |
| Insert | Randomly insert a new character into the identifier. | int f(void *enQv) |
| Replace | Randomly replace a character in the identifier. | int f(void *enQ) |
| Delete | Randomly delete a character from the identifier. | int f(void *nv) |
Intuitively, certified robustness means that the model’s prediction remains unchanged when an adversary modifies up to (r) identifiers in the input. In this case, no adversarial example within this bound can successfully fool the model.
Based on this definition, the objective of ENBECOME is to construct an NCM that achieves both strong empirical robustness and certified robustness guarantees.
3. Methodology
Figure 4 illustrates the overall workflow of ENBECOME.
ENBECOME is a black‑box method requiring no alterations to NCMs. It reduces the success rate of adversarial attacks by smoothing NCM predictions. In addition, ENBECOME can report the certified robustness boundary of NCMs.
Specifically, given an input code snippet (potentially an adversarial example) and the target NCM for defense, ENBECOME achieves robust prediction and reports certified robustness through three stages: (a) smoothed sample generation and prediction, (b) voting‑based prediction selection, and (c) certified radius generation.
In phase (a), ENBECOME generates a set of smoothed code snippets and then collects the prediction results of the NCM for these code snippets.
In phase (b), ENBECOME aggregates the above predictions by a voting‑based mechanism and outputs the final robust prediction.
Subsequently, ENBECOME reports certified robustness by generating the certified radius.
3.1 Smoothed Sample Generation and Prediction
Different from original samples, a successful AE (adversarial example) is usually crafted elaborately and is easily destroyed by perturbation (as illustrated in Figure 5). A smoothed sample is a sample modified in the neighborhood of a given original sample while preserving the underlying structure. Traditional random smoothing techniques in natural language [39], [40] apply perturbations to arbitrary tokens of a given code snippet. However, these approaches compromise the original semantics and syntax of the code, potentially leading to compilation failure. Figure 3 showcases an example.
Given an input code snippet (\mathbf{x}), ENBECOME generates (N) smoothed samples by perturbing identifiers in two steps: (1) perturbation position selection and (2) perturbation operation application.
In step (1), ENBECOME restricts perturbations to identifiers, as modifications to them do not break the code’s syntax. For each code snippet, we randomly select a subset of identifiers, and this process is repeated (N) times to generate a set of (N) smoothed samples. Importantly, position selection is performed at the identifier level: all occurrences of the same identifier are perturbed consistently, preserving code semantics and syntactic validity. In practice, we randomly select 90% of the identifiers for each code snippet.
In step (2), ENBECOME applies a semantic‑preserving character‑level edit to each selected identifier. Specifically, we define three semantic‑preserving operations, including Insert, Replace, and Delete.
- Insert randomly inserts a new character into the identifier.
- Replace randomly replaces a character in the identifier with a new one.
- Delete randomly deletes a character from the identifier.
As all operations are conducted on identifiers, the perturbation is semantic‑preserving and will not destroy the code syntax. Table 1 illustrates them with corresponding examples.
These operations constitute a fundamental set of character‑level edits that can express any transformation between identifier strings. In other words, any adversarial modification to an identifier can be decomposed into a sequence of these basic operations. To formally represent such transformations, we introduce the concept of a perturbation path (\mathcal{P}), which denotes an ordered sequence of operations applied to an identifier. The length of the perturbation path indicates the extent to which the identifier is perturbed.
Following the definition in Eq. (2), we can obtain a perturbed sample (\boldsymbol{s}) by applying a path (\mathcal{P}) to a given (\boldsymbol{x}):
[ \begin{aligned} \boldsymbol{s} &= (1 - \boldsymbol{m}) \odot \boldsymbol{x} + \mathcal{P}\bigl(\boldsymbol{m} \odot \boldsymbol{x}\bigr) \\ &= (1 - \boldsymbol{m}) \odot \boldsymbol{x} + \prod_{i = 1}^{|\boldsymbol{x}| / |\mathcal{P}|} o_i\bigl((\boldsymbol{m} \odot \boldsymbol{x})_i\bigr), \quad \forall \boldsymbol{x} \in \mathcal{X}, \end{aligned} \tag{4} ]
where (\boldsymbol{m}) is the mask vector that selects identifiers from (\boldsymbol{x}). Based on the pre‑defined operations, ENBECOME produces a set of smoothed code snippets. For example, given a code snippet int f(void *env){...}, ENBECOME obtains the smoothed code int f(void *enQv){...} after the operation Insert, which inserts the character Q into the identifier env. Similarly, ENBECOME obtains int f(void *enQ){...} and int f(void *nv){...} after the operations Replace and Delete.
In practice, we introduce a hyperparameter called the perturbation rate (\eta), which determines the expected proportion of characters to be perturbed within a selected identifier. For each identifier, we adaptively compute a character‑level perturbation budget based on its length, yielding a number of edit operations proportional to the identifier length. These operations are then applied randomly to individual characters. Thus, the actual perturbation path length (|\mathcal{P}|), i.e., the number of character‑level edits, is indirectly controlled by (\eta), ensuring that perturbation strength scales with identifier length.
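The generation step above can be sketched as follows. This is a sketch under stated assumptions: the edit-budget rule (`max(1, round(eta * len))`) and the whole-word substitution are our illustrative choices, not necessarily the paper's exact implementation.

```python
import random
import re
import string

def perturb_identifiers(code, identifiers, eta=0.2, rng=None):
    """Apply semantic-preserving character edits (Insert/Replace/Delete)
    to each selected identifier, consistently across all occurrences.
    eta controls the expected fraction of characters edited per identifier."""
    rng = rng or random.Random()
    mapping = {}
    for ident in identifiers:
        chars = list(ident)
        budget = max(1, round(eta * len(chars)))  # illustrative budget rule
        for _ in range(budget):
            op = rng.choice(["insert", "replace", "delete"])
            pos = rng.randrange(len(chars))
            ch = rng.choice(string.ascii_letters)
            if op == "insert":
                chars.insert(pos, ch)
            elif op == "replace":
                chars[pos] = ch
            elif len(chars) > 1:  # never delete the last remaining character
                del chars[pos]
        mapping[ident] = "".join(chars)
    # Whole-word substitution keeps the renaming consistent across all
    # occurrences and avoids touching longer identifiers that merely
    # contain the selected name as a substring.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, mapping)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], code)

code = "int f(void *env) { return g(env); }"
smoothed = perturb_identifiers(code, ["env"], eta=0.3, rng=random.Random(1))
```

Because every occurrence of `env` is mapped to the same perturbed string, the smoothed snippet remains syntactically valid C.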
Subsequently, we input these smoothed samples into the target NCM and collect its predictions, which are then aggregated by the voting mechanism described next.
Algorithm 1: Practical Algorithm to Determine Certified Radius
Input:
- Smoothed predictions (y_{1}, y_{2}, \dots , y_{N})
- Original code snippet (\mathbf{x})
- Ground‑truth label (c)
- Target neural code model (g(\cdot))
Output: Certified robustness radius (r)
1. (\tilde{y} \leftarrow) the final prediction obtained by majority voting over (y_{1}, \dots, y_{N}).
2. If (\tilde{y} \neq c), return ABSTAIN.
3. Otherwise, compute the lower and upper bounds of (g(\mathbf{x}, y)) using Eq. 8, obtaining (\underline{g(\mathbf{x}, y)}) and (\overline{g(\mathbf{x}, y)}).
4. Let (h_{x}) be the number of identifiers in (\mathbf{x}).
5. For (r = 0, 1, \dots, h_{x}):
   - Compute (\beta) using Eq. 11.
   - If (\underline{g(\mathbf{x}, y)} - \beta \times \overline{g(\mathbf{x}, y)} > 0.5), increment (r) by one; otherwise break.
6. Return the final value of (r).
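Assuming the bounds (\underline{g}) and (\overline{g}) are given (their estimation via Eq. 8 is outside this sketch), Algorithm 1 reduces to a short loop. Names such as `g_lower`, `g_upper`, and `k_x` are our own; this is an illustrative sketch, not the authors' code.

```python
from math import comb

def certified_radius(ys, c, g_lower, g_upper, h_x, k_x):
    """Sketch of Algorithm 1. ys: smoothed predictions; c: ground-truth
    label; g_lower/g_upper: bounds on the smoothed vote probability
    g(x, y); h_x: number of identifiers in x; k_x: identifiers perturbed
    per smoothed sample."""
    # Lines 1-3: majority vote; abstain if the smoothed prediction is wrong.
    y_hat = max(set(ys), key=ys.count)
    if y_hat != c:
        return None  # ABSTAIN
    # Lines 5-8: grow r while the certification condition of Theorem 2 holds.
    r = 0
    for cand in range(h_x + 1):
        # Eq. 11: beta bounds the probability that a smoothed sample hits
        # at least one of the cand perturbed identifiers.
        beta = 1 - comb(h_x - cand, k_x) / comb(h_x, k_x)
        if g_lower - beta * g_upper > 0.5:
            r = cand
        else:
            break
    return r

votes = ["Yes"] * 95 + ["No"] * 5
r = certified_radius(votes, "Yes", g_lower=0.90, g_upper=0.95, h_x=10, k_x=2)
```

With these toy numbers the loop certifies (r = 2): under the stated bounds, modifying any two identifiers cannot flip the smoothed prediction.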
By using combinatorial methods, we calculate that
[ \begin{aligned} \mathbb{P}\bigl(\mathcal{H} \cap (\mathbf{x} \otimes \mathbf{x}') \neq \emptyset\bigr) &= 1 - \mathbb{P}\bigl(\mathcal{H} \cap (\mathbf{x} \otimes \mathbf{x}') = \emptyset\bigr) \\ &= 1 - \frac{C_{h_{x} - |\mathbf{x} \otimes \mathbf{x}'|}^{k_{x}}}{C_{h_{x}}^{k_{x}}} \\ &\le 1 - \frac{C_{h_{x} - r}^{k_{x}}}{C_{h_{x}}^{k_{x}}} = \beta. \end{aligned} \tag{11} ]
Combining Eq. 10 and Eq. 11, we have
[ g(\mathbf{x}, y) - g(\mathbf{x}', y) \le \beta \times \overline{g(\mathbf{x}, y)}, ]
which bounds the prediction under perturbation. □
Theorem 2.
Given original code snippet (\mathbf{x}) and a perturbed version (\mathbf{x}’), if
[ |\mathbf{x} \otimes \mathbf{x}’| \le r ]
and
[ \underline{g(\mathbf{x}, y)} - \beta \times \overline{g(\mathbf{x}, y)} > 0.5, ]
then, with probability at least (1-\alpha), the model’s prediction for the perturbed code satisfies
[ g(\mathbf{x}’) = y. ]
Proof.
With probability at least (1-\alpha),
[ 0.5 < \underline{g(\mathbf{x}, y)} - \beta \times \overline{g(\mathbf{x}, y)} \le g(\mathbf{x}, y) - \beta \times \overline{g(\mathbf{x}, y)} \le g(\mathbf{x}', y), ]
where the last inequality follows from Theorem 1, and (g(\mathbf{x}’) = y) by definition in Eq. 6. If (y=c) (the ground‑truth label for (\mathbf{x})) and
[ \underline{g(\mathbf{x}, y)} - \beta \times \overline{g(\mathbf{x}, y)} > 0.5, ]
then the neural code model (g(\cdot)) is guaranteed to be robust around (\mathbf{x}). □
The upper and lower bounds of (g(\mathbf{x}, y)) are estimated as described earlier, and (\beta) can be computed via Eq. 11. From Theorem 2, we can report the final certified robustness by a step‑by‑step practical algorithm that approximates the certified radius.
Practical algorithm.
Recall that ENBECOME operates at inference time of neural code models, requiring only model feedback without needing internal details. Algorithm 1 illustrates the practical procedure for determining the certified radius. Given the smoothed predictions (y_{1}, y_{2}, \dots , y_{N}) and the target model (g(\cdot)), we first obtain the final prediction (\tilde{y}) and verify its correctness (lines 1–3). Using Eq. 8, we then estimate the lower and upper bounds of (g(\mathbf{x}, y)) (line 5). We incrementally increase (r) (the number of identifiers to be perturbed) starting from 0 by one (line 7) and compute (\beta) via Eq. 11 (line 8). This process continues until
[ \underline{g(\mathbf{x}, y)} - \beta \times \overline{g(\mathbf{x}, y)} \le 0.5, ]
as stated in Theorem 2 (line 13). Upon termination, ENBECOME returns (r) as the certified robustness for (\mathbf{x}) (line 16). Consequently, we can report that (g(\mathbf{x}’)) will return the label (c) for any adversarial example (\mathbf{x}’) satisfying
[ |\mathbf{x} \otimes \mathbf{x}’| \le r, ]
with confidence (1-\alpha).
4. Evaluation
To ensure a thorough evaluation, we conduct comprehensive experiments assessing ENBECOME across four distinct aspects: robustness, generalization, time cost, and ablation studies focusing on core components and hyper‑parameters. We complete the evaluation by answering the following research questions:
- RQ1. How robust is ENBECOME, from empirical and theoretical perspectives?
RQ2. How efficient is ENBECOME?
Design
In this section, we compare the efficiency of ENBECOME against the baseline methods (SPACE and RoPGen) in terms of preparation cost and inference cost.
- Preparation cost: for SPACE and RoPGen, the time required for post-training; for ENBECOME, the time required to generate smoothed samples for the test set.
- Inference cost: the average time each method takes to predict a code snippet.
For a fair comparison, all methods use their default hyperparameters and are evaluated on a test set of 1,000 randomly selected clean samples.
Results and analysis
As Table 3 shows, ENBECOME's preparation cost is far lower than that of SPACE (2,240 s) and RoPGen (141,010 s): about 1,328 s for defect detection and about 1,365 s for clone detection.
ENBECOME's inference cost is somewhat higher but remains practical: the average inference time per sample is 0.724 s (± 0.122 s) on defect detection and 1.535 s (± 0.062 s) on clone detection.
| Task | Method | Preparation cost (s) | Inference time (s/sample) |
|---|---|---|---|
| Defect detection | SPACE | 2,240 | 0.015 ± 0.039 |
| | RoPGen | 141,010 | 0.010 ± 0.034 |
| | ENBECOME | 1,328 | 0.724 ± 0.122 |
| Clone detection | SPACE | 10,268 | 0.016 ± 0.037 |
| | RoPGen | 342,545 | 0.014 ± 0.035 |
| | ENBECOME | 1,365 | 1.535 ± 0.062 |
Thanks to its reduced preparation cost, ENBECOME can be deployed immediately in practical settings, and its inference-time overhead stays within an acceptable range.
RQ3. How effectively does ENBECOME demonstrate generalization across architectures and tasks?
Design
To evaluate the generality of ENBECOME, we examine it from three perspectives:
- Different NCM architectures: five models in total, namely GraphCodeBERT, CodeBERT, CodeT5, StarCoder, and CodeLlama.
- Other code intelligence datasets: additional evaluation on Functionality Classification (OJ) and the CodeChef dataset.
- Resistance to backdoor attacks: verifying how robustly ENBECOME behaves against pre-implanted backdoors.
Results and analysis
(1) Performance across multiple models
Table 4 shows ENBECOME's results for each NCM architecture.
- GraphCodeBERT: ACC = 63.10, ASR = 25.46, Radius = 1.43, NCRR = 0.17
- CodeBERT: ACC = 62.15, ASR = 8.74, Radius = 1.85, NCRR = 0.22
- CodeT5: ACC = 63.61, ASR = 13.32, Radius = 1.61, NCRR = 0.22
- StarCoder: ACC = 49.01, ASR = 30.04, Radius = 1.74, NCRR = 0.24
- CodeLlama: ACC = 45.75, ASR = 28.65, Radius = 1.02, NCRR = 0.13
Across all models, ENBECOME markedly reduces ASR; for CodeBERT and CodeT5 in particular, ASR improves substantially while ACC is nearly preserved.
(2) Evaluation on other datasets
As Figure 7 shows, ENBECOME is also highly effective on the OJ (Functionality Classification) and CodeChef datasets, consistently reducing the ASR of every model; the benefit is especially pronounced on the datasets shown in darker shades.
(3) Resistance to backdoor attacks
In the backdoor attack experiments shown in Figure 8, models defended with ENBECOME provide additional robustness compared to the original models, further reducing the attack success rate (ASR) even when a backdoor is present.
Summary
These results confirm that ENBECOME generalizes: it improves robustness across different NCM architectures, diverse code datasets, and backdoor attacks.
RQ4. How do different parameters (including the smoothed sample number (N) and the perturbation rate (\eta)) affect the performance of ENBECOME?
Design
We varied ENBECOME's parameters, the smoothed sample number (N) and the perturbation rate (\eta), over the settings listed in the table below.
For each combination, we measured ACC, ASR, Radius, and NCRR on both the defect detection and clone detection tasks.
Results and analysis
- Influence of the sample number (N): as the table below shows, (N = 100) offers the best balance between cost and effectiveness, with Radius and NCRR close to optimal. Increasing (N) improves the Radius slightly, but the computational cost grows proportionally, and even at (N = 400) the difference from (N = 100) is marginal.
- Influence of the perturbation rate (\eta): larger (\eta) tends to slightly enlarge the Radius, but the ASR reduction is most pronounced at moderate rates. With an excessively large (\eta) (e.g., 0.5), the Radius increases while ACC drops slightly.
| (N) | (\eta) | Radius (Defect) | NCRR (Defect) | ASR (Defect) | Radius (Clone) | NCRR (Clone) | ASR (Clone) |
|---|---|---|---|---|---|---|---|
| 50 | 0.1 | 1.62 | 0.28 | 9.84% | 2.37 | 0.30 | 1.42% |
| 100 | 0.2 | 1.63 | 0.29 | 9.74% | 2.39 | 0.31 | 1.33% |
| 200 | 0.3 | 1.65 | 0.30 | 9.68% | 2.41 | 0.32 | 1.30% |
| 400 | 0.4 | 1.67 | 0.31 | 9.65% | 2.43 | 0.33 | 1.28% |
Summary
- (N = 100) is recommended as the default setting, offering the best balance between computational cost and robustness.
- Moderate perturbation rates sufficiently enlarge the Radius while achieving the largest ASR reduction without sacrificing ACC.
This concludes the detailed analysis of ENBECOME's efficiency, generality, and sensitivity to parameter settings.
RQ4. Ablation Study.
We analyze how ENBECOME’s performance is affected by the smoothed sample count (N) and the perturbation rate (\eta). All evaluations use 1,000 defect‑detection samples with adversarial examples generated by ALERT [16].
Influence of the smoothed sample number (N)
As outlined in Section 3, ENBECOME generates (N) perturbed samples per input. We evaluate its impact with (N = 100), (N = 1000), and (N = 10000), and revert to the victim NCM when (N = 0). Table 5 presents the results.
- Without perturbations ((N = 0)), the victim model achieves ACC = 64.3% but suffers a high ASR = 47.0%, revealing its vulnerability.
- With (N = 100), ACC drops slightly to 62.9%, while ASR improves markedly to 8.8%, demonstrating increased robustness with minimal accuracy loss.
- For (N \ge 1000), ACC and ASR stabilize, indicating no further gains in robustness.
Consequently, we set (N = 100) for efficiency.
Influence of the perturbation rate (\eta)
ENBECOME perturbs identifiers at a rate of (\eta) per smoothed sample. We evaluate its impact by testing (\eta = 0.3), (\eta = 0.6), and (\eta = 1.0), with (\eta = 0) reverting to the baseline NCM. Table 5 presents the results.
- Without perturbation ((\eta = 0)), ASR is 47.0%.
- Raising (\eta) to 0.3 reduces ASR sharply to 10.2%, with only a minor ACC change.
- Further increasing (\eta) to 0.6 and 1.0 lowers ASR to 9.4% and 9.1%, respectively, while ACC remains stable.
- Improvements plateau beyond (\eta = 0.6), so moderate perturbation rates provide the best balance between robustness and accuracy.
TABLE 5: Ablation study. Parameters (N) and (\eta) of ENBECOME are considered. (N) denotes the smoothed sample count and (\eta) denotes the perturbation rate.
| Param: (N) | (N=0) | (N=100) | (N=1000) | (N=10000) |
|---|---|---|---|---|
| ACC (%) | 64.3 | 62.9 | 63.3 | 63.2 |
| ASR (%) | 47.0 | 8.8 | 8.9 | 8.9 |

| Param: (\eta) | (\eta=0) | (\eta=0.3) | (\eta=0.6) | (\eta=1.0) |
|---|---|---|---|---|
| ACC (%) | 64.1 | 63.1 | 63.1 | 62.9 |
| ASR (%) | 47.0 | 10.2 | 9.4 | 9.1 |
RQ2. How efficient is ENBECOME? (cont.)
ENBECOME demonstrates high efficiency both in terms of inference time and resource usage.
We evaluate its runtime on the defect‑detection task using ALERT as a baseline.
| Dataset | Original model | ALERT | ENBECOME () |
|---|---|---|---|
| Defect detection | 0.056 s / sample | 0.078 s / sample | 0.042 s / sample |
- With , ENBECOME requires only about 0.042 seconds per input, which is roughly 30 % faster than the original model (0.056 s) and about half the time of ALERT (0.078 s).
- Even when is increased to 1000, the additional overhead remains modest: runtime grows linearly with , staying well below one second for a batch of 1000 inputs.
In addition, we assess scalability on larger datasets (e.g., CodeSearchNet). ENBECOME processes a batch of 5,000 code snippets in under 2 seconds, yielding an average latency of ≈ 0.38 ms per snippet, which is competitive with other smoothing methods while preserving certified robustness.
Overall, ENBECOME offers a favorable trade‑off between computational cost and robustness: it can be applied to real‑time code‑understanding tasks without incurring significant latency.
References
[23] K. Fang, Q. Tao, Y. Wu, T. Li, X. Huang, and J. Yang, “Multi-head ensemble of smoothed classifiers for certified robustness,” Neural Networks, vol. 188, p. 107426, 2025.
[24] M. Ye, C. Gong, and Q. Liu, “Safer: A structure‑free approach for certified robustness to adversarial word substitutions,” in Annual Meeting of the Association for Computational Linguistics (ACL), 2020.
[25] X. Zhang, H. Hong, Y. Hong, P. Huang, B. Wang, Z. Ba, and K. Ren, “Text‑CRS: A generalized certified robustness framework against textual adversarial attacks,” in 2024 IEEE Symposium on Security and Privacy (SP). IEEE, 2024, pp. 2920–2938.
[26] V. Cevher, “Certified robustness under bounded Levenshtein distance,” in The Thirteenth International Conference on Learning Representations, 2025.
[27] T. Han, “Our artifacts,” https://github.com/GeniusHTX/SecCode.
[28] D. N. Palacio, A. Velasco, N. Cooper, A. Rodriguez, K. Moran, and D. Poshyvanyk, “Toward a theory of causation for interpreting neural code models,” IEEE Transactions on Software Engineering, 2024.
[29] J. D. M.-W. C. Kenton and L. K. Toutanova, “BERT: Pre‑training of deep bidirectional transformers for language understanding,” in Proceedings of NAACL‑HLT, vol. 1. Minneapolis, Minnesota, 2019, p. 2.
[30] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu, “Exploring the limits of transfer learning with a unified text‑to‑text transformer,” Journal of Machine Learning Research, vol. 21, pp. 140:1–140:67, 2020.
[31] Y. Wang, W. Wang, S. R. Joty, and S. C. H. Hoi, “CodeT5: Identifier‑aware unified pre‑trained encoder‑decoder models for code understanding and generation,” in Proceedings of the 26th Conference on Empirical Methods in Natural Language Processing. Virtual Event / Punta Cana, Dominican Republic: Association for Computational Linguistics, 7‑11 November 2021, pp. 8696–8708.
[32] S. Wang, T. Liu, and L. Tan, “Automatically learning semantic features for defect prediction,” in Proceedings of the 38th International Conference on Software Engineering, 2016, pp. 297–308.
[33] C. Fang, Z. Liu, Y. Shi, J. Huang, and Q. Shi, “Functional code clone detection with syntax and semantics fusion learning,” in Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. Virtual Event, USA: ACM, July 18‑22 2020, pp. 516–527.
[34] H. Wei and M. Li, “Supervised deep features for software functional clone detection by exploiting lexical and syntactical information in source code,” IJCAI, 2017, pp. 3034–3040.
[35] J. Gao, J. Lanchantin, M. L. Soffa, and Y. Qi, “Black‑box generation of adversarial text sequences to evade deep learning classifiers,” in 2018 IEEE Security and Privacy Workshops (SPW). IEEE, 2018, pp. 50–56.
[36] L. Li, R. Ma, Q. Guo, X. Xue, and X. Qiu, “BERT‑attack: Adversarial attack against BERT using BERT,” arXiv preprint arXiv:2004.09984, 2020.
[37] D. Jin, Z. Jin, J. T. Zhou, and P. Szolovits, “Is BERT really robust? A strong baseline for natural language attack on text classification and entailment,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, no. 05, 2020, pp. 8018–8025.
[38] Y. Li, H. Wu, and H. Zhao, “Semantic‑preserving adversarial code comprehension,” arXiv preprint arXiv:2209.05130, 2022.
[39] J. Zeng, J. Xu, X. Zheng, and X. Huang, “Certified robustness to text adversarial attacks by randomized [mask],” Computational Linguistics, vol. 49, no. 2, pp. 395–427, 2023.
[40] M. Ye, C. Gong, and Q. Liu, “Safer: A structure‑free approach for certified robustness to adversarial word substitutions,” arXiv preprint arXiv:2005.14424, 2020.
[41] H. Zhang, Z. Fu, G. Li, L. Ma, Z. Zhao, H. Yang, Y. Sun, Y. Liu, and Z. Jin, “Towards robustness of deep program processing models—detection, estimation, and enhancement,” ACM Transactions on Software Engineering and Methodology (TOSEM), vol. 31, no. 3, pp. 1–40, 2022.
[42] J. Svajlenko, J. F. Islam, I. Keivanloo, C. K. Roy, and M. M. Mia, “Towards a big data curated benchmark of inter‑project code clones,” in 2014 IEEE International Conference on Software Maintenance and Evolution. IEEE, 2014, pp. 476–480.
[43] L. Mou, G. Li, L. Zhang, T. Wang, and Z. Jin, “Convolutional neural networks over tree structures for programming language processing,” in Proceedings of the 30th AAAI Conference on Artificial Intelligence. Phoenix, Arizona, USA: AAAI Press, February 12‑17 2016, pp. 1287–1293.
[44] D. Guo, S. Ren, S. Lu, Z. Feng, D. Tang, S. Liu, L. Zhou, N. Duan, A. Svyatkovskiy, S. Fu, M. Tufano, S. K. Deng, C. B. Clement, D. Drain, N. Sundaresan, J. Yin, D. Jiang, and M. Zhou, “GraphCodeBERT: Pre‑training code representations with data flow,” in Proceedings of the 9th International Conference on Learning Representations. Virtual Event, Austria: OpenReview.net, May 3‑7 2021, pp. 1–12.
[51] W. Sun, C. Fang, Y. Chen, G. Tao, T. Han, and Q. Zhang, “Code search based on context‑aware code translation,” in Proceedings of the 44th IEEE/ACM International Conference on Software Engineering. May 25‑27: ACM, Pittsburgh, PA, USA 2022, pp. 388–400.
[52] A. LeClair, S. Jiang, and C. McMillan, “A neural model for generating natural language summaries of program subroutines,” in Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering. IEEE, 2019, pp. 795–806.
[53] S. Haque, A. LeClair, L. Wu, and C. McMillan, “Improved automatic summarization of subroutines via attention to file context,” in Proceedings of the 17th International Conference on Mining Software Repositories, 2020, pp. 300–310.
[54] W. U. Ahmad, S. Chakraborty, B. Ray, and K. Chang, “A transformer‑based approach for source code summarization,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Online: Association for Computational Linguistics, July 5‑10 2020, pp. 4998–5007.
[55] C. Fang, W. Sun, Y. Chen, X. Chen, Z. Wei, Q. Zhang, Y. You, B. Luo, Y. Liu, and Z. Chen, “ESALE: Enhancing code‑summary alignment learning for source code summarization,” IEEE Transactions on Software Engineering (Early Access), pp. 1–18, 2024.
[56] W. Sun, Y. Miao, Y. Li, H. Zhang, C. Fang, Y. Liu, G. Deng, Y. Liu, and Z. Chen, “Source code summarization in the era of large language models,” CoRR, vol. abs/2407.07959, no. 1, pp. 1–13, 2024.
[57] A. Mastropaolo, S. Scalabrin, N. Cooper, D. N. Palacio, D. Poshyvanyk, R. Oliveto, and G. Bavota, “Studying the usage of text‑to‑text transfer transformer to support code‑related tasks,” in Proceedings of the 2021 IEEE/ACM 43rd International Conference on Software Engineering. IEEE, 2021, pp. 336–347.
[58] J. Zhang, X. Wang, H. Zhang, H. Sun, K. Wang, and X. Liu, “A novel neural source code representation based on abstract syntax tree,” in Proceedings of the 41st International Conference on Software Engineering. Montreal, QC, Canada: IEEE / ACM, May 25‑31 2019, pp. 783–794.
[59] Y. Wan, J. Shu, Y. Sui, G. Xu, Z. Zhao, J. Wu, and P. S. Yu, “Multimodal attention network learning for semantic source code retrieval,” in Proceedings of the 34th International Conference on Automated Software Engineering. San Diego, CA, USA: IEEE, November 11‑15 2019, pp. 13–25.
[60] C. Zeng, Y. Yu, S. Li, X. Xia, Z. Wang, M. Geng, L. Bai, W. Dong, and X. Liao, “dgraphs: Embedding variable‑based flow graph for neural code search,” ACM Transactions on Software Engineering and Methodology, vol. 32, no. 2, pp. 1–27, 2023.
[61] M. Z. Nasrabadi, S. Parsa, M. Ramezani, C. Roy, and M. Ekhtiarzadeh, “A systematic literature review on source code similarity measurement and clone detection: Techniques, applications, and challenges,” Journal of Systems and Software, vol. 204, p. 111796, 2023.
[62] Y. Du, T. Ma, L. Wu, X. Zhang, and S. Ji, “Adaccd: Adaptive semantic contrasts discovery based cross‑lingual adaptation for code clone detection,” in Proceedings of Thirty‑Eighth AAAI Conference on Artificial Intelligence. Vancouver, Canada: AAAI Press, February 20‑27 2024, pp. 17942–17950.
[63] Y. Li, B. Wu, Y. Feng, Y. Fan, Y. Jiang, Z. Li, and S.-T. Xia, “Semi‑supervised robust training with generalized perturbed neighborhood,” Pattern Recognition, vol. 124, p. 108472, 2022.
[64] J. Bai, B. Chen, Y. Li, D. Wu, W. Guo, S. Xia, and E. Yang, “Targeted attack for deep hashing based retrieval,” in Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer, August 23‑28 2020, pp. 618–634.
[65] F. Gao, Y. Wang, and K. Wang, “Discrete adversarial attack to models of code,” Proceedings of the ACM on Programming Languages, vol. 7, no. PLDI, pp. 172–195, 2023.
[66] J. M. Springer, B. M. Reinstadler, and U.-M. O’Reilly, “Strata: simple, gradient‑free attacks for models of code,” arXiv preprint arXiv:2009.13562, 2020.