@christinakopi (Collaborator) commented on Jun 26, 2025

Roadmap:

  • ensure bijection between onnx and tfjs weights
  • add progress indicator to hellaswag benchmark
  • move filesystem logic to discojs-node and cli
  • automate downloading onnx model from hub
  • convert python script to nodejs and web-compatible logic
  • experiment with webgpu for hellaswag
  • experiment with converting tfjs back to onnx

To run load_weights.spec.ts, you first need to convert the pretrained weights from .onnx to .jsonl with the following Python script (the !pip lines assume a Jupyter/Colab environment):

Before running the script, make sure you have downloaded decoder_model.onnx from https://huggingface.co/Xenova/gpt2/tree/main/onnx and set ONNX_MODEL_PATH to its location.

!pip install onnx
!pip install onnxruntime  # not imported by this script itself

import json
import sys

import onnx
from onnx import numpy_helper

ONNX_MODEL_PATH = "<path/to/decoder_model.onnx>"
OUTPUT_FILENAME = "<path/to/gpt2_weights.jsonl>"  # where the output file will be saved

try:
    model = onnx.load(ONNX_MODEL_PATH)
except FileNotFoundError:
    print(f"ERROR: The model file was not found at '{ONNX_MODEL_PATH}'")
    sys.exit(1)

print(f"Extracting weights from {ONNX_MODEL_PATH}...")

# Write one JSON object per line: {"key": <tensor name>, "value": <nested list>}
with open(OUTPUT_FILENAME, "w") as f:
    for initializer in model.graph.initializer:
        # Each initializer is a TensorProto holding one pretrained weight tensor
        name = initializer.name
        array = numpy_helper.to_array(initializer)

        weight_object = {
            "key": name,
            "value": array.tolist()
        }

        f.write(json.dumps(weight_object) + "\n")

print(f"Successfully saved weights to {OUTPUT_FILENAME}")

After generating this file, place it under the following path: disco/discojs/gpt2_weights.jsonl
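For reference, a minimal sketch (not the actual test code) of how such a .jsonl file can be streamed in Node.js; the WeightEntry shape mirrors the objects written by the Python script above, and the readWeights helper is hypothetical:

import * as fs from "node:fs";
import * as readline from "node:readline";

// Shape of each line written by the Python script: one tensor per line.
interface WeightEntry {
  key: string;    // ONNX initializer name
  value: unknown; // nested number arrays matching the tensor shape
}

// Stream the file line by line instead of loading it whole, since the
// GPT-2 weights serialized as JSON text can be very large.
async function readWeights(path: string): Promise<Map<string, unknown>> {
  const weights = new Map<string, unknown>();
  const lines = readline.createInterface({
    input: fs.createReadStream(path, "utf8"),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  for await (const line of lines) {
    if (line.trim() === "") continue; // skip blank lines
    const entry = JSON.parse(line) as WeightEntry;
    weights.set(entry.key, entry.value);
  }
  return weights;
}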

The table below shows the accuracy and evaluation time on the full HellaSwag benchmark for each model tested:

| Model                       | Accuracy | Eval Time (s) |
|-----------------------------|----------|---------------|
| TFJS GPT (gpt-nano)         | 24.67%   | 1390.25       |
| Xenova GPT-2 (ONNX)         | 29.03%   | 22767.03     |
| Loaded TFJS GPT (from ONNX) | 28.41%   | 12523.59     |

Note: loaded_hellaswag.spec.ts reproduces this result for the loaded model; for the other two models, the results can be reproduced with the hellaswag_gpt.ts script in cli/.

@martinjaggi (Member) left a comment

Amazing work, thanks a lot!
Later I'd be very curious to check whether the conversion in the reverse direction (back to ONNX) also works fine.

@JulienVig force-pushed the NAN-init_from_ONNX-christinakopi branch from ae33116 to 99e2db4 on November 26, 2025, 01:27
@JulienVig changed the title from "First try of loading the weights of the pretrained ONNX GPT2 model into our GPT2-tfjs implementation" to "ONNX to Tensorflow.js conversion of GPT-2" on November 26, 2025
@JulienVig force-pushed the NAN-init_from_ONNX-christinakopi branch 3 times, most recently from 6565027 to d08b99d on November 26, 2025, 03:04
@JulienVig force-pushed the NAN-init_from_ONNX-christinakopi branch from 0f7831d to 772832d on November 26, 2025, 03:49
@JulienVig force-pushed the NAN-init_from_ONNX-christinakopi branch from 772832d to 464ff8b on November 26, 2025, 04:43
@JulienVig marked this pull request as ready for review on November 26, 2025, 05:01
@JulienVig requested a review from tharvik on November 26, 2025, 05:01
@tharvik (Collaborator) left a comment

haa, that's great work, well done in hacking around the onnx protobuf!
nothing blocking (except the tsc --build thing), only a few nitpicks and questions here and there

@@ -1,50 +1,102 @@
// import fs from 'fs';
@tharvik: commented

-import { Tokenizer, models } from '@epfml/discojs';
+import { models, serialization, Tokenizer } from '@epfml/discojs';
import { loadHellaSwag } from '@epfml/discojs-node';
// import { AutoTokenizer } from '@xenova/transformers';
@tharvik: commented

Comment on lines +17 to 21
const logLines: string[] = [];
function log(message: string) {
console.log(message);
logLines.push(message);
}
@tharvik: we don't need a log system for the CLI, we can simply output to the console, no?

Comment on lines +86 to +90
case 'gpt-tfjs-random':
log("Using GPT-TFJS with random initialization")
model = new models.GPT({ seed: 42 });
break;
case 'gpt-tfjs-pretrained':
@tharvik: that was confusing

Suggested change:

case 'gpt-tfjs-random':
log("Using GPT-TFJS with random initialization")
model = new models.GPT({ seed: 42 });
break;
case 'gpt-tfjs-pretrained':

const defaultPretrainedModelPath = path.join(__dirname, "..", "..", "onnx-converter", "assets", "model.json")
const args = parse<HellaSwagArgs>({
model: {
type: (raw: string) => raw as ModelType,
@tharvik: casting isn't nice, especially for user interfaces; better to implement some value checking with a switch
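A sketch of that suggestion (the parseModelType helper is hypothetical, and only the variants visible in the snippet above are listed): validate the raw CLI string with a switch and fail fast on unknown values instead of casting.

type ModelType = "gpt-tfjs-random" | "gpt-tfjs-pretrained";

// The case labels narrow `raw` to ModelType, so no cast is needed;
// anything else becomes an explicit error for the user.
function parseModelType(raw: string): ModelType {
  switch (raw) {
    case "gpt-tfjs-random":
    case "gpt-tfjs-pretrained":
      return raw;
    default:
      throw new Error(`Unknown model type: ${raw}`);
  }
}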

Comment on lines +108 to +109
console.log("WARNING: protobuf raw data is empty, falling back on specific data fields.")
if (tensor.floatData && tensor.floatData.length > 0) return new Float32Array(tensor.floatData);
@tharvik: that looks to me like a more relevant field than rawData; why not use it first (and drop the warning)?
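For context, a sketch of the decoding order the reviewer suggests, with a minimal stand-in for the generated protobuf tensor type (the real TensorProto has many more fields; a float32 tensor is assumed):

interface TensorProtoLike {
  floatData?: number[]; // typed repeated-float field used by some exporters
  rawData?: Uint8Array; // packed little-endian bytes
}

function toFloat32(tensor: TensorProtoLike): Float32Array {
  // Prefer the typed field when populated...
  if (tensor.floatData && tensor.floatData.length > 0)
    return new Float32Array(tensor.floatData);
  // ...and only fall back to reinterpreting the raw bytes.
  if (tensor.rawData && tensor.rawData.length > 0)
    return new Float32Array(
      tensor.rawData.buffer,
      tensor.rawData.byteOffset,
      tensor.rawData.byteLength / Float32Array.BYTES_PER_ELEMENT,
    );
  throw new Error("tensor has neither floatData nor rawData");
}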

if (gptLayersModel.weights.length !== onnxTfjsMapping.size)
throw new Error(`Mismatch between TFJS and ONNX weight mapping weights.`);

const finalWeights = gptLayersModel.weights.map((weight, _i) => {
@tharvik suggested change:

-const finalWeights = gptLayersModel.weights.map((weight, _i) => {
+const finalWeights = gptLayersModel.weights.map((weight) => {

Comment on lines +12 to +13
const ASSET_FOLDER = path.join(__dirname, "..", "assets");
const OUTPUT_FILENAME = path.join(ASSET_FOLDER, "model.json");
@tharvik: can we more simply store it in the current directory, letting the user choose where to put it?
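One way to implement that (a sketch, not the merged code): resolve the output against the invoking directory and let an optional CLI argument override it.

import * as path from "node:path";

// Default to the user's working directory; a first CLI argument,
// if provided, overrides the output location.
const OUTPUT_FILENAME = path.resolve(process.cwd(), process.argv[2] ?? "model.json");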

"extends": "../tsconfig.base.json",
"compilerOptions": { "outDir": "dist" },
"include": ["src"],
"exclude": ["**/*.spec.ts"]
@tharvik: we don't have tests anyway

Suggested change:

-"exclude": ["**/*.spec.ts"]

pretrainedModelPath: {
type: String,
description: 'If specifying gpt-tfjs-pretrained, provide the relative path to the TF.js pretrained model',
defaultValue: defaultPretrainedModelPath
@tharvik: single use, we can inline it IMO
