15 Jul 16:05

rstz

0edcd30

Python API 0.13.0

0.13.0 - 2025-07-15

API Changes

For Random Forest models, .out_of_bag_evaluations() now returns a
TrainingLogs object. The content is identical to the object previously
returned, but the number_of_trees property has been renamed to
iteration for consistency with Gradient Boosted Trees Training Logs.
mode="tf" is now the default on model.to_tensorflow_saved_model(). The
previous default is still available by setting mode="keras".
model.label() returns None for models trained without a label.
Remove deprecated evaluation_task argument for model.evaluate(). Use
task instead.

Feature

Add standalone C++ export with model.to_standalone_cc(). Standalone models
are flexible, fast, and memory-efficient. They depend only on the C++
standard library.
Add model.training_logs() method to return the training logs of the model.
Expose Mean Average Precision for Ranking tasks.
Add hyperparameters
numerical_vector_sequence_enable_closer_than_conditions and
numerical_vector_sequence_enable_projected_more_than_conditions.
Clear error messages when attempting to evaluate models without label.
Faster training with sparse oblique splits for datasets with many numerical
features.
Many documentation improvements.
Increase default number of threads to 256 or number of CPU cores.
Enable cross-validation for hyperparameter tuning.
Add thresholds to classification plots.
Explicitly disable custom losses for hyperparameter tuning.
Disable parallel evaluation for cross-validation custom losses.
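
For readers unfamiliar with the Mean Average Precision metric mentioned in the list above, the following is a minimal plain-Python sketch of the textbook definition with binary relevance. It is an illustration only: the helper names average_precision and mean_average_precision are hypothetical, not YDF functions, and these notes do not state whether YDF applies a truncation or a different relevance convention.

```python
from typing import Sequence


def average_precision(relevances: Sequence[int]) -> float:
    """Average precision of one query group.

    `relevances` holds 0/1 relevance flags, ordered by predicted rank
    (best prediction first).
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevances, start=1):
        if rel:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(groups: Sequence[Sequence[int]]) -> float:
    """Mean of the per-group average precisions."""
    return sum(average_precision(g) for g in groups) / len(groups)


# Two query groups, each listed in predicted order.
groups = [[1, 0, 1], [0, 1]]
map_score = mean_average_precision(groups)
print(round(map_score, 4))
```

The first group contributes (1/1 + 2/3) / 2 and the second contributes 1/2, so the mean is 2/3.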

Fix

Distributed Training: Add "recvmsg: Connection reset" to isTransientError.
Enable SHAP values when training with BEST_FIRST_GLOBAL.
Predictions with cross-entropy LambdaMART no longer need the slow engine.
Disable the generic engine for oblique splits without global imputation.
This may fix a very rare bug in the way predictions are computed.

Release music

Sinfonie Nr. 4 in A-Dur, op. 90. Felix Mendelssohn

20 May 14:11

rstz

pydf_0.12.0

a6d852c

Python API 0.12.0

0.12.0 - 2025-05-20

Feature

Enable support for Python 3.13.
Add custom fields to model metadata.
Add SHAP value variable importances with model.analyze().
Add SHAP values for a dataset with model.predict_shap().
Speed-up (up to 20x) training of models with CATEGORICAL_SET features.
Add hyper-parameter to limit the mask size for CATEGORICAL_SET features.
Add hyper-parameter total_max_num_nodes to limit the total number of nodes in a model.
Add support for na_replacements in the Python tree editor API.
Add support for include_all_columns in FeatureSelector.
Add the ydf.utils.LogBook to manage and track experiments.
Speed-up training of NDCG ranking model when a single example per group
is non-zero.
Speed-up training on datasets with few columns on a computer with a
large number of cores.
Speed-up loss computation multi-threading code.
Improve distributed training error messages.
Remove need for label columns for deep learning models.

Fix

Log message if early stopping is not used.
Fix force_numerical_discretization errors and documentation.
Fix handling of empty list columns in the dataset.

Release music

Te Deum in D major, H.146. Marc-Antoine Charpentier

12 Mar 13:45

rstz

v1.11.0

9d28074

v1.11.0

1.11.0 - 2025-03-12

Features

Speed-up training of GBT models by ~10%.
Support for categorical and boolean features in Isolation Forests.
Rename LAMBDA_MART_NDCG5 to LAMBDA_MART_NDCG. The old name is deprecated but
can still be used.
Allow configuring the truncation of NDCG losses.
Add support for distributed training for ranking gradient boosted tree
models.
Add support for NUMERICAL_VECTOR_SEQUENCE features.
Add support for AVRO data file using the "avro:" prefix.
Additional hyperparameters restricting weights of sparse oblique splits
to integers or powers of 2.
Facilitate training on VertexAI.
Deprecated SparseObliqueSplit.binary_weights hyperparameter in favor of
SparseObliqueSplit.weights.
Add Gzip-compressed BLOB_SEQUENCE serialization.
Enable Poisson loss for model analysis and fast inference.
Add config for compatibility with protobuf lite.
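
To illustrate what the truncation of an NDCG loss controls (see the LAMBDA_MART_NDCG items above), here is a minimal plain-Python sketch of NDCG at a given truncation. It assumes the common 2^rel - 1 gain and log2 rank discount; these notes do not confirm that this is YDF's exact formula. The function names dcg and ndcg are illustrative and not part of the YDF API.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float], truncation: int) -> float:
    """Discounted cumulative gain over the top `truncation` positions."""
    return sum(
        (2**rel - 1) / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:truncation], start=1)
    )


def ndcg(relevances: Sequence[float], truncation: int) -> float:
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), truncation)
    return dcg(relevances, truncation) / ideal if ideal > 0 else 0.0


# Relevance of three items of one group, in predicted order.
# The only relevant item was ranked last.
ranked = [0, 0, 3]
print(ndcg(ranked, truncation=2))  # the relevant item falls outside the top 2
print(ndcg(ranked, truncation=3))  # the relevant item is now counted
```

With a truncation of 2 the misplaced item is invisible to the metric; with a truncation of 3 it contributes a discounted gain. This is why the truncation value changes what a ranking loss rewards.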

Fix

Fix structural variable importances for oblique splits.
Deflake tests.
Remove CHECK/FATAL from training code.
Fix crash in YDF distributed training.

Misc

Loss options are now defined in
model/gradient_boosted_trees/gradient_boosted_trees.proto (previously
learner/gradient_boosted_trees/gradient_boosted_trees.proto).
Remove C++14 support.
Various documentation improvements.

12 Mar 13:47

rstz

pydf_0.11.0

9d28074

Python API 0.11.0

0.11.0 - 2025-03-12

Feature

Expose losses for distributed training.
Add class_weights parameter to the learners.
Support for Google Cloud paths for datasets and model IO.
Add utility to facilitate distributed training on VertexAI.
Improved support for non-unicode data in categorical features.
Add support for saving and analyzing deep models.

Fix

Fix incorrectly transposed confusion table in HTML.
Various documentation fixes.
Better requirements management.

Documentation

Add tutorial for Categorical Set features.
Add tutorial for training on VertexAI.

Release music

3. Sinfonie in d-Moll. Gustav Mahler

11 Feb 13:32

rstz

pydf_0.10.0

12a83b8

Python API 0.10.0

0.10.0 - 2025-02-11

Feature

Expose model.save(..., pure_serving=True) for saving a model without debug
information.
Allow users to provide a training proto configuration to the learner.
Add vector sequence feature support.
Add Variable importances for Isolation Forest Models.
Add ydf.help.loading_data() to print information about the type of
supported dataset formats.
Add experimental Tabular Transformer implementation.
Add gzipped blob sequence as new model format (still optional).
Enabled Poisson Loss for model analysis and fast inference.

Fix

Fix recognition of multidimensional features for Numpy arrays of type
object.
Fix subsample count for small number of training examples for Isolation
Forests.
Fix NUM_NODES variable importance for oblique splits.

Other

Updated OSS dependencies of protobuf, grpc and abseil.

Release music

Sinfonie in Es-Dur "Sinfonia Eroica", op. 55. Ludwig van Beethoven

02 Dec 16:02

rstz

pydf_0.9.0

cfd4275

Python API 0.9.0

0.9.0 - 2024-12-02

Breaking

Classification Label classes are now consistently ordered lexicographically
(for string labels) or increasingly (for integer labels).
Fix typo: rename partial_depepence_plot to partial_dependence_plot on
model.analyze().
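
The label ordering rule above has a consequence that is easy to miss: labels stored as strings are ordered as text, not as numbers. The plain-Python sketch below shows the difference using the built-in sorted function. It only illustrates the two orderings described in the note and does not call YDF; that YDF's lexicographic order matches Python's string comparison for these ASCII digits is an assumption.

```python
# Class labels stored as strings: lexicographic (text) order.
string_labels = ["10", "2", "1"]
print(sorted(string_labels))  # "10" sorts before "2"

# The same class labels stored as integers: increasing numeric order.
integer_labels = [10, 2, 1]
print(sorted(integer_labels))
```

Code that maps a column of predicted probabilities back to a class by position should therefore read the class list from the model instead of assuming an order.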

Feature

Add support for Avro file for path / distributed training with the "avro:"
prefix.
Add support for discretized numerical features for in-memory datasets.
Expose MRR for ranking models.
Add model.predict_class to generate the most likely predicted class of
classification models.
Add support for automatic feature selection with the feature_selector
learner constructor argument. See the feature selection tutorial for
more details.
Add standalone prediction evaluation ydf.evaluate_predictions().
Add new hyperparameter sparse_oblique_max_num_projections.
Add options "POWER_OF_TWO" and "INTEGER" for sparse oblique weights.
Emit proper errors when using lists for multi-dimensional features.
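
As a reminder of what the MRR (Mean Reciprocal Rank) metric in the list above measures, here is a minimal plain-Python sketch of the standard definition. The helper names are hypothetical and not YDF functions. Treating any relevance greater than zero as "relevant" is an assumption of this sketch.

```python
from typing import Sequence


def reciprocal_rank(relevances: Sequence[float]) -> float:
    """1 / rank of the first relevant item, or 0.0 if there is none.

    `relevances` is ordered by predicted rank (best prediction first).
    """
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(groups: Sequence[Sequence[float]]) -> float:
    """Mean of the per-group reciprocal ranks."""
    return sum(reciprocal_rank(g) for g in groups) / len(groups)


# Three query groups: first relevant item at rank 2, at rank 1, and absent.
groups = [[0, 1, 0], [1, 0], [0, 0, 0]]
mrr = mean_reciprocal_rank(groups)
print(mrr)
```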

Fix

Regression and Ranking CEPs scaling corrected.

Release music

The John B. Sails. Traditional

23 Sep 16:49

rstz

pydf_0.8.0

a89064f

Python API 0.8.0

0.8.0 - 2024-09-23

Breaking

Disallow positional parameters for the learners, except for label and task.
Remove the unsupported / invalid hyperparameters from the Isolation Forest
learner.
Remove parameters for distributed training and resuming training from
learners that do not support these capabilities.
By default, model.analyze runs for a maximum of 20 seconds (i.e.
maximum_duration=20 by default).
Convert boolean values in categorical sets to lowercase, matching the
treatment of categorical features.

Feature

Warn if training on a VerticalDataset and fail if attempting to modify the
columns in a VerticalDataset during training.
User can override the model's task, label or group during evaluation.
Add num_examples_per_tree() method to Isolation Forest models.
Expose the slow engine for debugging predictions and evaluations with
use_slow_engine=True.
Speed-up training of GBT models by ~10%.
Support for categorical and boolean features in Isolation Forests.
Add ydf.util.read_tf_record and ydf.util.write_tf_record to facilitate
TF Record datasets usage.
Rename LAMBDA_MART_NDCG5 to LAMBDA_MART_NDCG. The old name is deprecated but
can still be used.
Allow configuring the truncation of NDCG losses.
Enable multi-threading when using model.predict and model.evaluate.
Default number of threads of model.analyze is equal to the number of
cores.
Add multi-threaded results in model.benchmark.
Add argument to control the maximum duration of model.analyze.
Add support for Unicode strings, normalize categorical set values in the
same way as categorical values, and validate their types.
Add support for distributed training for ranking gradient boosted tree
models.

Fix

Fix labels of regression evaluation plots.
Improved errors if Isolation Forest training fails.

Release music

Perpetuum Mobile "Ein musikalischer Scherz", Op. 257. Johann Strauss (Sohn)

21 Aug 19:51

rstz

v1.10.0

0d4e307

v1.10.0

1.10.0 - 2024-08-21

Features

Add support for Isolation Forest models.
The default value of num_candidate_attributes in the CART learner is
changed from 0 (Random Forest style sampling) to -1 (no sampling). This is
the generally accepted logic of CART.
Added support for GCS for file I/O.
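
The num_candidate_attributes change above can be pictured with a small plain-Python sketch of how the two special values might resolve to a number of features tested per split. This is a sketch under stated assumptions, not YDF code: the function name is hypothetical, the sqrt rule for classification and the one-third rule for regression are the classical Random Forest defaults, and the rounding with ceil is an assumption.

```python
import math


def resolve_num_candidate_attributes(
    value: int, num_features: int, classification: bool = True
) -> int:
    """Hypothetical helper: number of features tested at each split."""
    if value == -1:
        # No sampling: every feature is a candidate (the new CART default).
        return num_features
    if value == 0:
        # Random Forest style sampling (the previous CART default).
        # Assumption: sqrt(n) for classification, n / 3 for regression,
        # rounded up.
        if classification:
            return math.ceil(math.sqrt(num_features))
        return math.ceil(num_features / 3)
    # Explicit positive value, capped at the number of features.
    return min(value, num_features)


print(resolve_num_candidate_attributes(0, 16))   # old default on 16 features
print(resolve_num_candidate_attributes(-1, 16))  # new default on 16 features
```

Under these assumptions, a CART model on 16 features would have tested 4 features per split with the old default and tests all 16 with the new one.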

21 Aug 19:47

rstz

pydf_0.7.0

0d4e307

Python API 0.7.0

Python API 0.7.0 - 2024-08-21

Feature

Expose validate_hyperparameters() on the learner.
Clarify which parameters in the learner are optional.
Add support in JAX FeatureEncoder for non-string categorical feature values.
Improve performance of Isolation Forests.
Models can be serialized/deserialized to/from bytes with model.serialize()
and ydf.deserialize_model.
Models can be pickled safely.
Native support for Xarray as a dataset format for all operations (e.g.,
training, evaluation, predictions).
The output of model.to_jax_function can be converted to a TensorFlow Lite
model.
Change the default number of examples to scan when training on files to
determine the semantic and dictionaries of columns from 10k to 100k.
Various improvements of error messages.
Evaluation for Anomaly Detection models.
Oblique splits for Anomaly Detection models.

Fix

Fix parsing of multidimensional ragged inputs.
Fix isolation forest hyperparameter defaults.
Fix bug causing distributed training to fail on a sharded dataset containing
an empty shard.
Handle unordered categorical sets in training.
Fix dataspec ignoring definitions of unrolled columns, such as
multidimensional categorical integers.
Fix error when defining categorical sets for non-ragged multidimensional
inputs.
macOS: Fix compatibility with other protobuf-using libraries such as
TensorFlow.

Release music

Rondo Alla ingharese quasi un capriccio "Die Wut über den verlorenen Groschen",
Op. 129. Ludwig van Beethoven

26 Jul 13:57

rstz

pydf_v0.6.0

ff76e60

Python API 0.6.0

Feature

model.to_jax_function now always outputs a FeatureEncoder to help feed
data to the JAX model.
The default value of num_candidate_attributes in the CART learner is
changed from 0 (Random Forest style sampling) to -1 (no sampling). This is
the generally accepted logic of CART.
model.to_tensorflow_saved_model supports preprocessing functions that have
a different signature than the YDF model.
Improve error messages when feeding NumPy arrays of the wrong size.
Add option for weighted evaluation in model.evaluate.

Fix

Fix display of confusion matrix with floating point weights.

Known issues

MacOS build is broken.

Releases: google/yggdrasil-decision-forests
