VB6 → C# Code Translation LLM

A complete training pipeline that fine-tunes a code LLM to translate VB6 to C#, using SFT followed by GRPO.

Architecture

Qwen2.5-Coder-7B-Instruct
      ↓
   SFT (360 examples, 3 epochs)
      ↓
  simooo21/vb6-to-cs-qwen2.5-coder-7b-sft
      ↓
   GRPO (reward: syntax + format + length, 2 epochs)
      ↓
  simooo21/vb6-to-cs-qwen2.5-coder-7b-grpo

Dataset

  • Source: simooo21/vb6-to-csharp-translation
  • Size: 360 train, 40 validation
  • Format: Conversational messages ([{"role": "user"}, {"role": "assistant"}])
  • Coverage: 41 VB6 → C# translation patterns across 40 categories:
| Category | Examples | Key Mappings |
|---|---|---|
| variables | Dim → type declaration | Dim x As Integer → int x; |
| control_flow | If/Select Case → if/switch | Select Case → switch |
| loops | For/Do/For Each → for/while/foreach | For Each → foreach |
| functions | Sub/Function → methods | ByRef → ref, Optional → default params |
| arrays | 1-based → 0-based arrays | Dim arr(1 To 5) → int[] arr = new int[5] |
| strings | VB6 string funcs → C# methods | UCase/Trim/Len → ToUpper/Trim/Length |
| file_io | Open/Close → StreamReader/Writer | FreeFile → using statement |
| error_handling | On Error → try/catch | On Error GoTo → try { } catch |
| gui_events | Event subs → event handlers | cmd_Click() → cmd_Click(object, EventArgs) |
| gui_controls | VB6 controls → WinForms | ComboBox.AddItem → Items.Add |
| form_designer | .frm code → C# InitializeComponent | VB6 form layout → C# programmatic UI |
| database | ADO → SqlClient | ADODB.Connection → SqlConnection |
| api_calls | Declare → DllImport | Declare Function → [DllImport] |
| classes | VB6 class → C# class | Property Get/Let → C# properties |
| structs | Type → struct | Public Type → public struct |
| collections | Collection → Dictionary | Scripting.Dictionary → Dictionary<K,V> |
| regex | Like → Regex | Like "[A-Z]###" → Regex.IsMatch |
| enums | VB6 Enum → C# enum | Direct mapping with values |
| events | RaiseEvent → event invocation | RaiseEvent → ?.Invoke |
| interfaces | Implements → : interface | Implements INotifyPropertyChanged → : INotifyPropertyChanged |
| modules | Module-level → static class | Public Const → public const |
| dialogs | MsgBox/InputBox → MessageBox | vbModal → ShowDialog() |
| async | DoEvents → async/await | DoEvents → Application.DoEvents() / await |
| datetime | Date functions → DateTime | Now/DateAdd/DateDiff → DateTime.Now/AddDays |
| math | Rnd → Random | Rnd → Random.Next |
| formatting | Format → ToString | FormatCurrency → .ToString("C2") |
| conversions | CStr/CInt → Convert | CStr/CInt/CDate → Convert.ToString/Int32/DateTime |
| type_checks | IsNumeric → TryParse | IsNumeric → int.TryParse |
| operators | IIf → ternary | IIf(condition, a, b) → condition ? a : b |
| networking | Winsock → TcpClient | Winsock → TcpClient/NetworkStream |
| com_interop | CreateObject → Activator | CreateObject("Word.Application") → Activator.CreateInstance |
| printing | Printer → PrintDocument | Printer.Print → e.Graphics.DrawString |
| graphics | LoadPicture → Image.FromFile | Direct mapping |
| system | SendKeys → SendKeys.SendWait | Direct mapping |
| application | App.Path → Assembly | App.Path → Application.ExecutablePath |
| entry_point | Sub Main → static void Main | With [STAThread] attribute |
| null_checks | IsNull/Nothing → null/DBNull | Is Nothing → == null |
| access_modifiers | Friend → internal | Friend → internal |
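
To make the conversational format concrete, here is a minimal sketch of what a single training record might look like, using the arrays mapping from the table. The top-level "messages" key, the prompt wording, and the helper name make_record are assumptions for illustration; the actual records in simooo21/vb6-to-csharp-translation may be laid out differently.

```python
import json

FENCE = "`" * 3  # built at runtime so no literal fence appears inside this example


def make_record(vb6_code: str, csharp_code: str) -> dict:
    """Build one conversational record (hypothetical helper, not from the repo)."""
    prompt = (
        "Translate the following VB6 code to C#:\n\n"
        f"{FENCE}vb\n{vb6_code}\n{FENCE}"
    )
    answer = f"{FENCE}csharp\n{csharp_code}\n{FENCE}"
    return {
        "messages": [  # assumed key name for the message list
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": answer},
        ]
    }


# arrays category: VB6 1-based bounds become a 0-based C# array.
record = make_record(
    vb6_code="Dim arr(1 To 5) As Integer\narr(1) = 10",
    csharp_code="int[] arr = new int[5];\narr[0] = 10;",
)
print(json.dumps(record, indent=2))
```

Wrapping the assistant answer in a csharp code block matches what format_reward later checks for, so the SFT targets and the GRPO reward point the model at the same output shape.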

Training Recipes

Phase 1: SFT (Supervised Fine-Tuning)

Base model: Qwen/Qwen2.5-Coder-7B-Instruct

python train_sft.py

Hyperparameters (from OpenCodeInstruct):

| Parameter | Value | Rationale |
|---|---|---|
| lr | 5e-6 | Low LR for stable convergence on small dataset |
| epochs | 3 | Full coverage of 360 examples |
| batch | 1 × 8 accum | Effective batch 8 per GPU |
| seq_len | 1024 | All examples fit (max 570 tokens) |
| warmup | 50 steps | Smooth start |
| scheduler | cosine (default) | Proven for code tasks |
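
The arithmetic behind these settings can be sketched as follows. The field names mirror common Hugging Face TrainingArguments names, but whether train_sft.py uses them is an assumption, and the linear-warmup-then-cosine shape of lr_at is the usual form of a cosine schedule rather than something the source spells out.

```python
import math
from dataclasses import dataclass


@dataclass
class SFTHyperparams:
    """Values from the table above; field names are assumptions."""
    learning_rate: float = 5e-6
    num_train_epochs: int = 3
    per_device_train_batch_size: int = 1
    gradient_accumulation_steps: int = 8
    max_seq_length: int = 1024
    warmup_steps: int = 50
    lr_scheduler_type: str = "cosine"


def optimizer_steps(num_examples: int, hp: SFTHyperparams, num_gpus: int = 1) -> int:
    """Total optimizer updates over all epochs."""
    effective_batch = (
        hp.per_device_train_batch_size * hp.gradient_accumulation_steps * num_gpus
    )
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return steps_per_epoch * hp.num_train_epochs


def lr_at(step: int, total_steps: int, hp: SFTHyperparams) -> float:
    """Linear warmup to the peak LR, then cosine decay to zero."""
    if step < hp.warmup_steps:
        return hp.learning_rate * step / hp.warmup_steps
    progress = (step - hp.warmup_steps) / max(1, total_steps - hp.warmup_steps)
    return hp.learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


hp = SFTHyperparams()
total = optimizer_steps(360, hp)  # 360 / 8 = 45 steps per epoch, 3 epochs
print(total)  # 135
print(lr_at(25, total, hp), lr_at(50, total, hp), lr_at(total, total, hp))
```

On a single GPU the run is only 135 optimizer steps, so the 50 warmup steps cover roughly 37% of training. That is worth keeping in mind when adjusting the dataset size or the accumulation factor.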

Phase 2: GRPO (Reinforcement Learning)

Base model: simooo21/vb6-to-cs-qwen2.5-coder-7b-sft

python train_grpo.py

Hyperparameters (from DRIVE / DeepSeekMath):

| Parameter | Value | Rationale |
|---|---|---|
| lr | 1e-6 | Very low for RL stability |
| epochs | 2 | Don't overfit on small data |
| num_generations | 8 | Group size per prompt |
| max_completion | 2048 | Long enough for code |
| rewards | syntax + format + length | Multi-objective optimization |
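
To show what the group size means in practice, here is a minimal sketch of the group-relative advantage at the core of GRPO as formulated in DeepSeekMath: each of the 8 completions for a prompt is scored, and its advantage is its reward normalized by the group's mean and standard deviation. The reward values, the eps constant, and the use of the population standard deviation are assumptions for illustration; training libraries differ in these details.

```python
from statistics import mean, pstdev


def group_advantages(rewards, eps=1e-4):
    """Normalize rewards within one prompt's group of sampled completions."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# One prompt, num_generations = 8 completions, each scored by the summed
# syntax + format + length rewards (made-up values in the 0 to 3 range).
rewards = [2.7, 1.9, 3.0, 0.4, 2.7, 1.1, 3.0, 2.2]
advantages = group_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.1f}  advantage={a:+.2f}")
```

Completions that score above their group's mean get a positive advantage and are reinforced; those below are pushed down. If all 8 completions receive the same reward, every advantage is zero and that prompt contributes no learning signal, which is one reason to combine several reward terms.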

Reward Functions

  1. syntax_reward (0-1): Checks C# syntax patterns

    • Access modifiers (+0.2)
    • Balanced braces (+0.2)
    • Semicolons (+0.1)
    • Type declarations (+0.2)
    • Using/namespace (+0.1)
    • Class/struct (+0.1)
    • Balanced brackets (+0.1)
  2. format_reward (0-1): Checks for ```csharp code blocks

  3. length_reward (0-1): Rewards reasonable code length (3-100 lines)
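
The three rewards can be sketched as below. Only the weights, the csharp code block check, and the 3-100 line range come from the list above; the regular expressions, the binary scoring of format_reward and length_reward, and the plain sum in total_reward are assumptions, and the real train_grpo.py may differ.

```python
import re

FENCE = "`" * 3  # built at runtime so no literal fence appears inside this example

# Detection patterns are assumptions; only the weights come from the list above.
ACCESS_RE = re.compile(r"\b(public|private|protected|internal)\b")
TYPE_RE = re.compile(r"\b(int|string|bool|double|float|long|decimal|void|var)\b")
USING_RE = re.compile(r"^\s*(using|namespace)\b", re.MULTILINE)
CLASS_RE = re.compile(r"\b(class|struct)\b")
BLOCK_RE = re.compile(FENCE + r"csharp\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(completion: str) -> str:
    """Return the body of the first csharp block, or the raw text if none."""
    match = BLOCK_RE.search(completion)
    return match.group(1) if match else completion


def syntax_reward(completion: str) -> float:
    """Weighted pattern checks; counts delimiters but does not check nesting."""
    code = extract_code(completion)
    checks = [
        (0.2, bool(ACCESS_RE.search(code))),                        # access modifiers
        (0.2, "{" in code and code.count("{") == code.count("}")),  # balanced braces
        (0.1, ";" in code),                                         # semicolons
        (0.2, bool(TYPE_RE.search(code))),                          # type declarations
        (0.1, bool(USING_RE.search(code))),                         # using/namespace
        (0.1, bool(CLASS_RE.search(code))),                         # class/struct
        (0.1, code.count("(") == code.count(")")
              and code.count("[") == code.count("]")),              # balanced brackets
    ]
    return round(sum(weight for weight, ok in checks if ok), 2)


def format_reward(completion: str) -> float:
    """1.0 if the completion contains a csharp code block, else 0.0."""
    return 1.0 if BLOCK_RE.search(completion) else 0.0


def length_reward(completion: str) -> float:
    """1.0 if the code has between 3 and 100 non-blank lines, else 0.0."""
    lines = [ln for ln in extract_code(completion).splitlines() if ln.strip()]
    return 1.0 if 3 <= len(lines) <= 100 else 0.0


def total_reward(completion: str) -> float:
    return syntax_reward(completion) + format_reward(completion) + length_reward(completion)


good = (
    FENCE + "csharp\nusing System;\n\npublic class Demo\n{\n"
    "    public int Add(int a, int b)\n    {\n        return a + b;\n    }\n}\n" + FENCE
)
bad = "Sorry, I cannot translate this."
print(total_reward(good), total_reward(bad))
```

Pattern checks like these are cheap to run on every sampled completion, but they only approximate validity: text that merely contains the right tokens can score well, so a compiler-based check would be a stricter alternative.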

Hardware Requirements

  • SFT: 1× A10G (24GB) or better. ~2-3 hours.
  • GRPO: 1× A10G or A100. ~3-4 hours.
  • For multi-GPU: use accelerate launch

Evaluation

python evaluate_model.py --model simooo21/vb6-to-cs-qwen2.5-coder-7b-grpo

Reports:

  • Exact match rate
  • Code block extraction rate
  • Mean syntax score (0-1)
  • Mean Levenshtein distance
  • Per-category breakdown
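
A minimal sketch of how most of these metrics could be computed is shown below. The function names, the character-level Levenshtein distance, and the fallback to raw text when no code block is found are assumptions; evaluate_model.py may define the metrics differently, and the syntax score is omitted here.

```python
import re
from collections import defaultdict

FENCE = "`" * 3  # built at runtime so no literal fence appears inside this example
BLOCK_RE = re.compile(FENCE + r"csharp\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(text: str):
    """Return the stripped body of the first csharp block, or None."""
    match = BLOCK_RE.search(text)
    return match.group(1).strip() if match else None


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance using a rolling row of the DP table."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def evaluate(predictions, references, categories):
    exact, found, distances = [], [], []
    by_category = defaultdict(list)
    for pred, ref, cat in zip(predictions, references, categories):
        code = extract_code(pred)
        found.append(code is not None)
        candidate = code if code is not None else pred.strip()
        target = extract_code(ref) or ref.strip()
        is_exact = candidate == target
        exact.append(is_exact)
        distances.append(levenshtein(candidate, target))
        by_category[cat].append(is_exact)
    n = len(exact)
    return {
        "exact_match_rate": sum(exact) / n,
        "extraction_rate": sum(found) / n,
        "mean_levenshtein": sum(distances) / n,
        "per_category_exact_match": {c: sum(v) / len(v) for c, v in by_category.items()},
    }


preds = [FENCE + "csharp\nint x;\n" + FENCE, "int y"]
refs = ["int x;", "int y;"]
report = evaluate(preds, refs, ["variables", "loops"])
print(report)
```

Exact match is strict for code, since whitespace or naming differences count as misses, so the Levenshtein distance and the per-category breakdown give a more graded picture of where translations diverge.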

Why SFT + GRPO?

Per the literature review:

  1. SFT alone goes a long way with high-quality data (OpenCodeInstruct showed that SFT on 5M samples beats small-model RL)
  2. GRPO with verifiable rewards gives state-of-the-art results for code generation (DRIVE reports +58.3% on Codeforces)
  3. The two-stage pipeline is well established: SFT first for imitation learning, then RL for exploration and reward optimization
  4. Multi-objective rewards (syntax + format + length) make reward hacking harder and encourage balanced outputs

References

  1. OpenCodeInstruct (2025) - lr=5e-6, 3 epochs, batch=2048 for SFT.
  2. DRIVE (2025) - Two-stage GRPO with verifiable rewards.
  3. DeepSeekMath (2024) - GRPO algorithm foundation.
  4. StepCoder (2024) - Compiler feedback for code RL.
  5. Lost in Translation (2023) - Cross-language code translation challenges.