VB6 → C# Code Translation LLM

A two-stage training pipeline that fine-tunes a code LLM to translate VB6 to C#: supervised fine-tuning (SFT) followed by GRPO reinforcement learning.

Architecture

Qwen2.5-Coder-7B-Instruct
      ↓
   SFT (360 examples, 3 epochs)
      ↓
  simooo21/vb6-to-cs-qwen2.5-coder-7b-sft
      ↓
   GRPO (reward: syntax + format + length, 2 epochs)
      ↓
  simooo21/vb6-to-cs-qwen2.5-coder-7b-grpo

Dataset

Source: simooo21/vb6-to-csharp-translation
Size: 360 train, 40 validation
Format: Conversational messages ([{"role": "user"}, {"role": "assistant"}])
Coverage: 41 VB6 → C# translation patterns across 40 categories:

Category	VB6 → C# Construct	Key Mappings
variables	Dim → type declaration	`Dim x As Integer` → `int x;`
control_flow	If/Select Case → if/switch	`Select Case` → `switch`
loops	For/Do/For Each → for/while/foreach	`For Each` → `foreach`
functions	Sub/Function → methods	`ByRef` → `ref`, `Optional` → default params
arrays	1-based → 0-based arrays	`Dim arr(1 To 5)` → `int[] arr = new int[5]`
strings	VB6 string funcs → C# methods	`UCase/Trim/Len` → `ToUpper/Trim/Length`
file_io	Open/Close → StreamReader/Writer	`FreeFile` → `using` statement
error_handling	On Error → try/catch	`On Error GoTo` → `try { } catch`
gui_events	Event subs → event handlers	`cmd_Click()` → `cmd_Click(object, EventArgs)`
gui_controls	VB6 controls → WinForms	`ComboBox.AddItem` → `Items.Add`
form_designer	.frm code → C# InitializeComponent	VB6 form layout → C# programmatic UI
database	ADO → SqlClient	`ADODB.Connection` → `SqlConnection`
api_calls	Declare → DllImport	`Declare Function` → `[DllImport]`
classes	VB6 class → C# class	`Property Get/Let` → C# properties
structs	Type → struct	`Public Type` → `public struct`
collections	Collection → Dictionary	`Scripting.Dictionary` → `Dictionary<K,V>`
regex	Like → Regex	`Like "[A-Z]###"` → `Regex.IsMatch`
enums	VB6 Enum → C# enum	Direct mapping with values
events	RaiseEvent → event invocation	`RaiseEvent` → `?.Invoke`
interfaces	Implements → : interface	`Implements INotifyPropertyChanged` → `: INotifyPropertyChanged`
modules	Module-level → static class	`Public Const` → `public const`
dialogs	MsgBox/InputBox → MessageBox	`vbModal` → `ShowDialog()`
async	DoEvents → async/await	`DoEvents` → `Application.DoEvents()` / `await`
datetime	Date functions → DateTime	`Now/DateAdd/DateDiff` → `DateTime.Now/AddDays`
math	Rnd → Random	`Rnd` → `Random.Next`
formatting	Format → ToString	`FormatCurrency` → `.ToString("C2")`
conversions	CStr/CInt → Convert	`CStr/CInt/CDate` → `Convert.ToString/Int32/DateTime`
type_checks	IsNumeric → TryParse	`IsNumeric` → `int.TryParse`
operators	IIf → ternary	`IIf(condition, a, b)` → `condition ? a : b`
networking	Winsock → TcpClient	`Winsock` → `TcpClient/NetworkStream`
com_interop	CreateObject → Activator	`CreateObject("Word.Application")` → `Activator.CreateInstance`
printing	Printer → PrintDocument	`Printer.Print` → `e.Graphics.DrawString`
graphics	LoadPicture → Image.FromFile	Direct mapping
system	SendKeys → SendKeys.SendWait	Direct mapping
application	App.Path → Assembly	`App.Path` → `Application.StartupPath`
entry_point	Sub Main → static void Main	With `[STAThread]` attribute
null_checks	IsNull/Nothing → null/DBNull	`Is Nothing` → `== null`
access_modifiers	Friend → internal	`Friend` → `internal`
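
To make the record format concrete, here is a minimal sketch of one training example in the conversational layout described above, using the `variables` mapping from the table. The prompt wording and the helper name `make_record` are assumptions for illustration; the prompts in the actual dataset may differ.

```python
import json

FENCE = "`" * 3  # markdown code fence, built at runtime so it can be embedded safely


def make_record(vb6_code: str, csharp_code: str) -> dict:
    """Build one chat-style example (hypothetical helper; prompt text is assumed)."""
    user = f"Translate this VB6 code to C#:\n\n{FENCE}vb\n{vb6_code}\n{FENCE}"
    assistant = f"{FENCE}csharp\n{csharp_code}\n{FENCE}"
    return {
        "messages": [
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]
    }


record = make_record("Dim x As Integer", "int x;")
print(json.dumps(record, indent=2))
```

Wrapping the target in a csharp fence matches what format_reward checks for later in the pipeline.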

Training Recipes

Phase 1: SFT (Supervised Fine-Tuning)

Base model: Qwen/Qwen2.5-Coder-7B-Instruct

python train_sft.py

Hyperparameters (learning rate and epoch count follow OpenCodeInstruct; batch size is scaled down for a single GPU):

Parameter	Value	Rationale
lr	5e-6	Low LR for stable convergence on small dataset
epochs	3	Full coverage of 360 examples
batch	1 × 8 accum	Effective batch 8 per GPU
seq_len	1024	All examples fit (max 570 tokens)
warmup	50 steps	Smooth start
scheduler	cosine (default)	Proven for code tasks
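
The numbers in this table imply a fairly short run, which is worth checking before training. The sketch below works out the optimizer step count and the learning rate at a few points, assuming a single GPU and a linear-warmup plus cosine-decay schedule; how `train_sft.py` actually builds its schedule is not shown here, so treat `lr_at` as an illustrative approximation.

```python
import math

# Values taken from the table above; a single GPU is assumed.
NUM_EXAMPLES = 360
EPOCHS = 3
PER_DEVICE_BATCH = 1
GRAD_ACCUM = 8
WARMUP_STEPS = 50
PEAK_LR = 5e-6

effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS


def lr_at(step: int) -> float:
    """Linear warmup followed by cosine decay to zero (assumed schedule shape)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print(f"effective batch:  {effective_batch}")
print(f"steps per epoch:  {steps_per_epoch}")
print(f"total steps:      {total_steps}")
print(f"warmup fraction:  {WARMUP_STEPS / total_steps:.0%}")
for step in (0, 25, 50, 92, 135):
    print(f"lr at step {step:>3}: {lr_at(step):.2e}")
```

Under these assumptions the run has 45 steps per epoch and 135 steps in total, so the 50 warmup steps cover about 37% of training and the peak learning rate is reached early in the second epoch.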

Phase 2: GRPO (Reinforcement Learning)

Base model: simooo21/vb6-to-cs-qwen2.5-coder-7b-sft

python train_grpo.py

Hyperparameters (from DRIVE / DeepSeekMath):

Parameter	Value	Rationale
lr	1e-6	Very low for RL stability
epochs	2	Don't overfit on small data
num_generations	8	Group size per prompt
max_completion	2048	Long enough for code
rewards	syntax + format + length	Multi-objective optimization
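
GRPO, as introduced in DeepSeekMath, scores each completion relative to the other completions sampled for the same prompt, which is why `num_generations` matters. The sketch below shows that group-relative normalization for one group of eight. Summing the three rewards with equal weights, the use of population standard deviation, and the `eps` value are assumptions; the sample scores are made up for illustration.

```python
from statistics import mean, pstdev


def combined_reward(syntax: float, fmt: float, length: float) -> float:
    """Sum the three reward terms (assumption: equal weights)."""
    return syntax + fmt + length


def group_advantages(rewards: list[float], eps: float = 1e-4) -> list[float]:
    """Normalize each reward against its own group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Eight hypothetical completions for one prompt (num_generations = 8),
# each scored as (syntax, format, length).
scores = [
    (1.0, 1.0, 1.0), (0.8, 1.0, 1.0), (0.6, 1.0, 1.0), (0.9, 0.0, 1.0),
    (0.4, 1.0, 0.5), (1.0, 1.0, 1.0), (0.2, 0.0, 0.0), (0.7, 1.0, 1.0),
]
rewards = [combined_reward(*s) for s in scores]
advantages = group_advantages(rewards)
for r, a in zip(rewards, advantages):
    print(f"reward={r:.2f}  advantage={a:+.2f}")
```

If every completion in a group receives the same reward, all advantages are zero and that prompt contributes no learning signal, so the reward functions need to separate good completions from bad ones within a group.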

Reward Functions

syntax_reward (0-1): Checks C# syntax patterns
- Access modifiers (+0.2)
- Balanced braces (+0.2)
- Semicolons (+0.1)
- Type declarations (+0.2)
- Using/namespace (+0.1)
- Class/struct (+0.1)
- Balanced brackets (+0.1)
format_reward (0-1): Checks for ```csharp code blocks
length_reward (0-1): Rewards reasonable code length (3-100 lines)
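
The three rewards above can be sketched as plain Python functions. The weights follow the list; the regular expressions, the requirement that at least one brace be present, and the all-or-nothing length score are assumptions, and the real `train_grpo.py` may implement these checks differently.

```python
import re

FENCE = "`" * 3
CODE_BLOCK = re.compile(FENCE + r"csharp\s*\n(.*?)" + FENCE, re.DOTALL)

# (pattern, weight) pairs; weights follow the list above, patterns are assumed.
SYNTAX_PATTERNS = [
    (r"\b(public|private|protected|internal)\b", 0.2),        # access modifiers
    (r";", 0.1),                                              # semicolons
    (r"\b(int|string|bool|double|void|var|object)\b", 0.2),   # type declarations
    (r"\b(using|namespace)\b", 0.1),                          # using/namespace
    (r"\b(class|struct)\b", 0.1),                             # class/struct
]


def extract_code(completion: str) -> str:
    """Return the first csharp block, or the raw text if none is fenced."""
    match = CODE_BLOCK.search(completion)
    return match.group(1) if match else completion


def is_balanced(code: str, open_ch: str, close_ch: str) -> bool:
    """Check that a delimiter pair nests correctly (ignores strings and comments)."""
    depth = 0
    for ch in code:
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def syntax_reward(completion: str) -> float:
    code = extract_code(completion)
    score = sum(w for pattern, w in SYNTAX_PATTERNS if re.search(pattern, code))
    if "{" in code and is_balanced(code, "{", "}"):
        score += 0.2  # balanced braces (assumed to require at least one brace)
    if is_balanced(code, "(", ")") and is_balanced(code, "[", "]"):
        score += 0.1  # balanced brackets
    return round(min(score, 1.0), 2)


def format_reward(completion: str) -> float:
    return 1.0 if CODE_BLOCK.search(completion) else 0.0


def length_reward(completion: str) -> float:
    lines = [ln for ln in extract_code(completion).splitlines() if ln.strip()]
    return 1.0 if 3 <= len(lines) <= 100 else 0.0


good = (FENCE + "csharp\nusing System;\n\npublic class Counter\n{\n"
        "    private int count;\n}\n" + FENCE)
bad = "Sure, here you go"
for sample in (good, bad):
    print(syntax_reward(sample), format_reward(sample), length_reward(sample))
```

These are pattern checks, not a parser: a completion can score well without compiling, which is the gap that compiler-feedback approaches such as StepCoder target.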

Hardware Requirements

SFT: 1× A10G (24GB) or better. ~2-3 hours.
GRPO: 1× A10G or A100. ~3-4 hours.
For multi-GPU training, start the scripts with `accelerate launch` instead of `python` (e.g. `accelerate launch train_sft.py`).

Evaluation

python evaluate_model.py --model simooo21/vb6-to-cs-qwen2.5-coder-7b-grpo

Reports:

Exact match rate
Code block extraction rate
Mean syntax score (0-1)
Mean Levenshtein distance
Per-category breakdown
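
As a rough illustration of how three of the reported metrics can be computed, the sketch below implements code block extraction, exact match, and Levenshtein distance in plain Python. The function names and the fallback to raw text when no fence is found are assumptions; `evaluate_model.py` may differ, and the syntax score and per-category breakdown are left out.

```python
import re

FENCE = "`" * 3
CODE_BLOCK = re.compile(FENCE + r"csharp\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(text: str):
    """Return the stripped contents of the first csharp block, or None."""
    match = CODE_BLOCK.search(text)
    return match.group(1).strip() if match else None


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def evaluate(predictions, references):
    """Aggregate exact match, extraction rate, and mean edit distance."""
    extracted = [extract_code(p) for p in predictions]
    # Fall back to the raw prediction when no fenced block was found (assumed).
    pairs = [(e if e is not None else p.strip(), r.strip())
             for e, p, r in zip(extracted, predictions, references)]
    n = len(pairs)
    return {
        "exact_match": sum(p == r for p, r in pairs) / n,
        "extraction_rate": sum(e is not None for e in extracted) / n,
        "mean_levenshtein": sum(levenshtein(p, r) for p, r in pairs) / n,
    }


preds = [FENCE + "csharp\nint x;\n" + FENCE, "int y"]
refs = ["int x;", "int y;"]
report = evaluate(preds, refs)
print(report)
```

Exact match is strict for code, since whitespace or naming differences count as misses, so the edit distance gives a softer view of how close a translation is.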

Why SFT + GRPO?

Based on the papers listed under References:

SFT alone goes a long way with high-quality data (OpenCodeInstruct showed that 5M SFT samples beat small-model RL)
GRPO with verifiable rewards gives strong results for code generation (DRIVE reports +58.3% on Codeforces)
The two-stage pipeline is well established: SFT first for imitation learning, then RL for exploration and reward optimization
Multi-objective rewards (syntax + format + length) make reward hacking harder than a single heuristic would and encourage balanced outputs

References

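OpenCodeInstruct: A Large-scale Instruction Tuning Dataset for Code LLMs (2025), arXiv:2504.04030. SFT with lr=5e-6, 3 epochs, batch=2048.
DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation (2025), arXiv:2511.06307. Two-stage GRPO with verifiable rewards.
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (2024), arXiv:2402.03300. GRPO algorithm foundation.
StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback (2024), arXiv:2402.01391. Compiler feedback for code RL.
Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code (2023), arXiv:2308.03109. Cross-language code translation challenges.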

