runs

OpenMLRun

Bases: OpenMLBase

OpenML Run: result of running a model on an OpenML dataset.

Parameters:

task_id : int (required)
    The ID of the OpenML task associated with the run.
flow_id : int | None (required)
    The ID of the OpenML flow associated with the run.
dataset_id : int | None (required)
    The ID of the OpenML dataset used for the run.
setup_string : str | None (default: None)
    The setup string of the run.
output_files : dict[str, int] | None (default: None)
    Specifies where each related file can be found.
setup_id : int | None (default: None)
    An integer representing the ID of the setup used for the run.
tags : list[str] | None (default: None)
    The tags associated with the run.
uploader : int | None (default: None)
    User ID of the uploader.
uploader_name : str | None (default: None)
    The name of the person who uploaded the run.
evaluations : dict | None (default: None)
    The evaluations of the run.
fold_evaluations : dict | None (default: None)
    The evaluations of the run for each fold.
sample_evaluations : dict | None (default: None)
    The evaluations of the run for each sample.
data_content : list[list] | None (default: None)
    The predictions generated from executing this run.
trace : OpenMLRunTrace | None (default: None)
    The trace containing information on internal model evaluations of this run.
model : object | None (default: None)
    The untrained model that was evaluated in the run.
task_type : str | None (default: None)
    The type of the OpenML task associated with the run.
task_evaluation_measure : str | None (default: None)
    The evaluation measure used for the task.
flow_name : str | None (default: None)
    The name of the OpenML flow associated with the run.
parameter_settings : list[dict[str, Any]] | None (default: None)
    The parameter settings used for the run.
predictions_url : str | None (default: None)
    The URL of the predictions file.
task : OpenMLTask | None (default: None)
    An instance of the OpenMLTask class, representing the OpenML task associated with the run.
flow : OpenMLFlow | None (default: None)
    An instance of the OpenMLFlow class, representing the OpenML flow associated with the run.
run_id : int | None (default: None)
    The ID of the run.
description_text : str | None (default: None)
    Description text to add to the predictions file. If None, it is set to the time the ARFF file is generated.
run_details : str | None (default: None)
    Description of the run stored in the run meta-data.
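
A typical way to obtain an OpenMLRun instance is to download an existing run from the server rather than constructing one by hand. A minimal sketch, assuming a server connection; the run ID below is an arbitrary placeholder:

import openml

# Fetch a run that already exists on the OpenML server.
run = openml.runs.get_run(10)          # placeholder run ID

print(run.task_id, run.flow_name)      # task and flow the run belongs to
print(run.evaluations)                 # server-side evaluation measures, if any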
Source code in openml/runs/run.py
class OpenMLRun(OpenMLBase):
    """OpenML Run: result of running a model on an OpenML dataset.

    Parameters
    ----------
    task_id: int
        The ID of the OpenML task associated with the run.
    flow_id: int
        The ID of the OpenML flow associated with the run.
    dataset_id: int
        The ID of the OpenML dataset used for the run.
    setup_string: str
        The setup string of the run.
    output_files: Dict[str, int]
        Specifies where each related file can be found.
    setup_id: int
        An integer representing the ID of the setup used for the run.
    tags: List[str]
        Representing the tags associated with the run.
    uploader: int
        User ID of the uploader.
    uploader_name: str
        The name of the person who uploaded the run.
    evaluations: Dict
        Representing the evaluations of the run.
    fold_evaluations: Dict
        The evaluations of the run for each fold.
    sample_evaluations: Dict
        The evaluations of the run for each sample.
    data_content: List[List]
        The predictions generated from executing this run.
    trace: OpenMLRunTrace
        The trace containing information on internal model evaluations of this run.
    model: object
        The untrained model that was evaluated in the run.
    task_type: str
        The type of the OpenML task associated with the run.
    task_evaluation_measure: str
        The evaluation measure used for the task.
    flow_name: str
        The name of the OpenML flow associated with the run.
    parameter_settings: list[OrderedDict]
        Representing the parameter settings used for the run.
    predictions_url: str
        The URL of the predictions file.
    task: OpenMLTask
        An instance of the OpenMLTask class, representing the OpenML task associated
        with the run.
    flow: OpenMLFlow
        An instance of the OpenMLFlow class, representing the OpenML flow associated
        with the run.
    run_id: int
        The ID of the run.
    description_text: str, optional
        Description text to add to the predictions file. If left None, is set to the
        time the arff file is generated.
    run_details: str, optional (default=None)
        Description of the run stored in the run meta-data.
    """

    def __init__(  # noqa: PLR0913
        self,
        task_id: int,
        flow_id: int | None,
        dataset_id: int | None,
        setup_string: str | None = None,
        output_files: dict[str, int] | None = None,
        setup_id: int | None = None,
        tags: list[str] | None = None,
        uploader: int | None = None,
        uploader_name: str | None = None,
        evaluations: dict | None = None,
        fold_evaluations: dict | None = None,
        sample_evaluations: dict | None = None,
        data_content: list[list] | None = None,
        trace: OpenMLRunTrace | None = None,
        model: object | None = None,
        task_type: str | None = None,
        task_evaluation_measure: str | None = None,
        flow_name: str | None = None,
        parameter_settings: list[dict[str, Any]] | None = None,
        predictions_url: str | None = None,
        task: OpenMLTask | None = None,
        flow: OpenMLFlow | None = None,
        run_id: int | None = None,
        description_text: str | None = None,
        run_details: str | None = None,
    ):
        self.uploader = uploader
        self.uploader_name = uploader_name
        self.task_id = task_id
        self.task_type = task_type
        self.task_evaluation_measure = task_evaluation_measure
        self.flow_id = flow_id
        self.flow_name = flow_name
        self.setup_id = setup_id
        self.setup_string = setup_string
        self.parameter_settings = parameter_settings
        self.dataset_id = dataset_id
        self.evaluations = evaluations
        self.fold_evaluations = fold_evaluations
        self.sample_evaluations = sample_evaluations
        self.data_content = data_content
        self.output_files = output_files
        self.trace = trace
        self.error_message = None
        self.task = task
        self.flow = flow
        self.run_id = run_id
        self.model = model
        self.tags = tags
        self.predictions_url = predictions_url
        self.description_text = description_text
        self.run_details = run_details
        self._predictions = None

    @property
    def predictions(self) -> pd.DataFrame:
        """Return a DataFrame with predictions for this run"""
        if self._predictions is None:
            if self.data_content:
                arff_dict = self._generate_arff_dict()
            elif self.predictions_url:
                arff_text = openml._api_calls._download_text_file(self.predictions_url)
                arff_dict = arff.loads(arff_text)
            else:
                raise RuntimeError("Run has no predictions.")
            self._predictions = pd.DataFrame(
                arff_dict["data"],
                columns=[name for name, _ in arff_dict["attributes"]],
            )
        return self._predictions

    @property
    def id(self) -> int | None:
        """The ID of the run, None if not uploaded to the server yet."""
        return self.run_id

    def _evaluation_summary(self, metric: str) -> str:
        """Summarizes the evaluation of a metric over all folds.

        The fold scores for the metric must exist already. During run creation,
        by default, the MAE for OpenMLRegressionTask and the accuracy for
        OpenMLClassificationTask/OpenMLLearningCurveTask tasks are computed.

        If repetitions exist, we take the mean over all repetitions.

        Parameters
        ----------
        metric: str
            Name of an evaluation metric that was used to compute fold scores.

        Returns
        -------
        metric_summary: str
            A formatted string that displays the metric's evaluation summary.
            The summary consists of the mean and std.
        """
        if self.fold_evaluations is None:
            raise ValueError("No fold evaluations available.")
        fold_score_lists = self.fold_evaluations[metric].values()

        # Get the mean and std over all repetitions
        rep_means = [np.mean(list(x.values())) for x in fold_score_lists]
        rep_stds = [np.std(list(x.values())) for x in fold_score_lists]

        return f"{np.mean(rep_means):.4f} +- {np.mean(rep_stds):.4f}"

    def _get_repr_body_fields(self) -> Sequence[tuple[str, str | int | list[str]]]:
        """Collect all information to display in the __repr__ body."""
        # Set up fields
        fields = {
            "Uploader Name": self.uploader_name,
            "Metric": self.task_evaluation_measure,
            "Run ID": self.run_id,
            "Task ID": self.task_id,
            "Task Type": self.task_type,
            "Task URL": openml.tasks.OpenMLTask.url_for_id(self.task_id),
            "Flow ID": self.flow_id,
            "Flow Name": self.flow_name,
            "Flow URL": (
                openml.flows.OpenMLFlow.url_for_id(self.flow_id)
                if self.flow_id is not None
                else None
            ),
            "Setup ID": self.setup_id,
            "Setup String": self.setup_string,
            "Dataset ID": self.dataset_id,
            "Dataset URL": (
                openml.datasets.OpenMLDataset.url_for_id(self.dataset_id)
                if self.dataset_id is not None
                else None
            ),
        }

        # determines the order of the initial fields in which the information will be printed
        order = ["Uploader Name", "Uploader Profile", "Metric", "Result"]

        if self.uploader is not None:
            fields["Uploader Profile"] = f"{openml.config.get_server_base_url()}/u/{self.uploader}"
        if self.run_id is not None:
            fields["Run URL"] = self.openml_url
        if self.evaluations is not None and self.task_evaluation_measure in self.evaluations:
            fields["Result"] = self.evaluations[self.task_evaluation_measure]
        elif self.fold_evaluations is not None:
            # -- Add locally computed summary values if possible
            if "predictive_accuracy" in self.fold_evaluations:
                # OpenMLClassificationTask; OpenMLLearningCurveTask
                result_field = "Local Result - Accuracy (+- STD)"
                fields[result_field] = self._evaluation_summary("predictive_accuracy")
                order.append(result_field)
            elif "mean_absolute_error" in self.fold_evaluations:
                # OpenMLRegressionTask
                result_field = "Local Result - MAE (+- STD)"
                fields[result_field] = self._evaluation_summary("mean_absolute_error")
                order.append(result_field)

            if "usercpu_time_millis" in self.fold_evaluations:
                # Runtime should be available for most tasks types
                rt_field = "Local Runtime - ms (+- STD)"
                fields[rt_field] = self._evaluation_summary("usercpu_time_millis")
                order.append(rt_field)

        # determines the remaining order
        order += [
            "Run ID",
            "Run URL",
            "Task ID",
            "Task Type",
            "Task URL",
            "Flow ID",
            "Flow Name",
            "Flow URL",
            "Setup ID",
            "Setup String",
            "Dataset ID",
            "Dataset URL",
        ]
        return [
            (key, "None" if fields[key] is None else fields[key])  # type: ignore
            for key in order
            if key in fields
        ]

    @classmethod
    def from_filesystem(cls, directory: str | Path, expect_model: bool = True) -> OpenMLRun:  # noqa: FBT001, FBT002
        """
        The inverse of the to_filesystem method. Instantiates an OpenMLRun
        object based on files stored on the file system.

        Parameters
        ----------
        directory : str
            a path leading to the folder where the results
            are stored

        expect_model : bool
            if True, it requires the model pickle to be present, and an error
            will be thrown if not. Otherwise, the model might or might not
            be present.

        Returns
        -------
        run : OpenMLRun
            the re-instantiated run object
        """
        # Avoiding cyclic imports
        import openml.runs.functions

        directory = Path(directory)
        if not directory.is_dir():
            raise ValueError("Could not find folder")

        description_path = directory / "description.xml"
        predictions_path = directory / "predictions.arff"
        trace_path = directory / "trace.arff"
        model_path = directory / "model.pkl"

        if not description_path.is_file():
            raise ValueError("Could not find description.xml")
        if not predictions_path.is_file():
            raise ValueError("Could not find predictions.arff")
        if (not model_path.is_file()) and expect_model:
            raise ValueError("Could not find model.pkl")

        with description_path.open() as fht:
            xml_string = fht.read()
        run = openml.runs.functions._create_run_from_xml(xml_string, from_server=False)

        if run.flow_id is None:
            flow = openml.flows.OpenMLFlow.from_filesystem(directory)
            run.flow = flow
            run.flow_name = flow.name

        with predictions_path.open() as fht:
            predictions = arff.load(fht)
            run.data_content = predictions["data"]

        if model_path.is_file():
            # note that it will load the model if the file exists, even if
            # expect_model is False
            with model_path.open("rb") as fhb:
                run.model = pickle.load(fhb)  # noqa: S301

        if trace_path.is_file():
            run.trace = openml.runs.OpenMLRunTrace._from_filesystem(trace_path)

        return run

    def to_filesystem(
        self,
        directory: str | Path,
        store_model: bool = True,  # noqa: FBT001, FBT002
    ) -> None:
        """
        The inverse of the from_filesystem method. Serializes a run
        on the filesystem, to be uploaded later.

        Parameters
        ----------
        directory : str
            a path leading to the folder where the results
            will be stored. Should be empty

        store_model : bool, optional (default=True)
            if True, a model will be pickled as well. As this is the most
            storage expensive part, it is often desirable to not store the
            model.
        """
        if self.data_content is None or self.model is None:
            raise ValueError("Run should have been executed (and contain " "model / predictions)")
        directory = Path(directory)
        directory.mkdir(exist_ok=True, parents=True)

        if any(directory.iterdir()):
            raise ValueError(f"Output directory {directory.expanduser().resolve()} should be empty")

        run_xml = self._to_xml()
        predictions_arff = arff.dumps(self._generate_arff_dict())

        # It seems like typing does not allow to define the same variable multiple times
        with (directory / "description.xml").open("w") as fh:
            fh.write(run_xml)
        with (directory / "predictions.arff").open("w") as fh:
            fh.write(predictions_arff)
        if store_model:
            with (directory / "model.pkl").open("wb") as fh_b:
                pickle.dump(self.model, fh_b)

        if self.flow_id is None and self.flow is not None:
            self.flow.to_filesystem(directory)

        if self.trace is not None:
            self.trace._to_filesystem(directory)

    def _generate_arff_dict(self) -> OrderedDict[str, Any]:
        """Generates the arff dictionary for uploading predictions to the
        server.

        Assumes that the run has been executed.

        The order of the attributes follows the order defined by the Client API for R.

        Returns
        -------
        arff_dict : dict
            Dictionary representation of the ARFF file that will be uploaded.
            Contains predictions and information about the run environment.
        """
        if self.data_content is None:
            raise ValueError("Run has not been executed.")
        if self.flow is None:
            assert self.flow_id is not None, "Run has no associated flow id!"
            self.flow = get_flow(self.flow_id)

        if self.description_text is None:
            self.description_text = time.strftime("%c")
        task = get_task(self.task_id)

        arff_dict = OrderedDict()  # type: 'OrderedDict[str, Any]'
        arff_dict["data"] = self.data_content
        arff_dict["description"] = self.description_text
        arff_dict["relation"] = f"openml_task_{task.task_id}_predictions"

        if isinstance(task, OpenMLLearningCurveTask):
            class_labels = task.class_labels
            instance_specifications = [
                ("repeat", "NUMERIC"),
                ("fold", "NUMERIC"),
                ("sample", "NUMERIC"),
                ("row_id", "NUMERIC"),
            ]

            arff_dict["attributes"] = instance_specifications
            if class_labels is not None:
                arff_dict["attributes"] = (
                    arff_dict["attributes"]
                    + [("prediction", class_labels), ("correct", class_labels)]
                    + [
                        ("confidence." + class_labels[i], "NUMERIC")
                        for i in range(len(class_labels))
                    ]
                )
            else:
                raise ValueError("The task has no class labels")

        elif isinstance(task, OpenMLClassificationTask):
            class_labels = task.class_labels
            instance_specifications = [
                ("repeat", "NUMERIC"),
                ("fold", "NUMERIC"),
                ("sample", "NUMERIC"),  # Legacy
                ("row_id", "NUMERIC"),
            ]

            arff_dict["attributes"] = instance_specifications
            if class_labels is not None:
                prediction_confidences = [
                    ("confidence." + class_labels[i], "NUMERIC") for i in range(len(class_labels))
                ]
                prediction_and_true = [("prediction", class_labels), ("correct", class_labels)]
                arff_dict["attributes"] = (
                    arff_dict["attributes"] + prediction_and_true + prediction_confidences
                )
            else:
                raise ValueError("The task has no class labels")

        elif isinstance(task, OpenMLRegressionTask):
            arff_dict["attributes"] = [
                ("repeat", "NUMERIC"),
                ("fold", "NUMERIC"),
                ("row_id", "NUMERIC"),
                ("prediction", "NUMERIC"),
                ("truth", "NUMERIC"),
            ]

        elif isinstance(task, OpenMLClusteringTask):
            arff_dict["attributes"] = [
                ("repeat", "NUMERIC"),
                ("fold", "NUMERIC"),
                ("row_id", "NUMERIC"),
                ("cluster", "NUMERIC"),
            ]

        else:
            raise NotImplementedError("Task type %s is not yet supported." % str(task.task_type))

        return arff_dict

    def get_metric_fn(self, sklearn_fn: Callable, kwargs: dict | None = None) -> np.ndarray:  # noqa: PLR0915, PLR0912, C901
        """Calculates metric scores based on predicted values. Assumes the
        run has been executed locally (and contains run_data). Furthermore,
        it assumes that the 'correct' or 'truth' attribute is specified in
        the arff (which is an optional field, but always the case for
        openml-python runs)

        Parameters
        ----------
        sklearn_fn : function
            a function pointer to a sklearn function that
            accepts ``y_true``, ``y_pred`` and ``**kwargs``
        kwargs : dict
            kwargs for the function

        Returns
        -------
        scores : ndarray of scores of length num_folds * num_repeats
            metric results
        """
        kwargs = kwargs if kwargs else {}
        if self.data_content is not None and self.task_id is not None:
            predictions_arff = self._generate_arff_dict()
        elif (self.output_files is not None) and ("predictions" in self.output_files):
            predictions_file_url = openml._api_calls._file_id_to_url(
                self.output_files["predictions"],
                "predictions.arff",
            )
            response = openml._api_calls._download_text_file(predictions_file_url)
            predictions_arff = arff.loads(response)
            # TODO: make this a stream reader
        else:
            raise ValueError(
                "Run should have been locally executed or " "contain outputfile reference.",
            )

        # Need to know more about the task to compute scores correctly
        task = get_task(self.task_id)

        attribute_names = [att[0] for att in predictions_arff["attributes"]]
        if (
            task.task_type_id in [TaskType.SUPERVISED_CLASSIFICATION, TaskType.LEARNING_CURVE]
            and "correct" not in attribute_names
        ):
            raise ValueError('Attribute "correct" should be set for ' "classification task runs")
        if task.task_type_id == TaskType.SUPERVISED_REGRESSION and "truth" not in attribute_names:
            raise ValueError('Attribute "truth" should be set for ' "regression task runs")
        if task.task_type_id != TaskType.CLUSTERING and "prediction" not in attribute_names:
            raise ValueError('Attribute "predict" should be set for ' "supervised task runs")

        def _attribute_list_to_dict(attribute_list):  # type: ignore
            # convenience function: Creates a mapping to map from the name of
            # attributes present in the arff prediction file to their index.
            # This is necessary because the number of classes can be different
            # for different tasks.
            res = OrderedDict()
            for idx in range(len(attribute_list)):
                res[attribute_list[idx][0]] = idx
            return res

        attribute_dict = _attribute_list_to_dict(predictions_arff["attributes"])

        repeat_idx = attribute_dict["repeat"]
        fold_idx = attribute_dict["fold"]
        predicted_idx = attribute_dict["prediction"]  # Assume supervised task

        if task.task_type_id in (TaskType.SUPERVISED_CLASSIFICATION, TaskType.LEARNING_CURVE):
            correct_idx = attribute_dict["correct"]
        elif task.task_type_id == TaskType.SUPERVISED_REGRESSION:
            correct_idx = attribute_dict["truth"]
        has_samples = False
        if "sample" in attribute_dict:
            sample_idx = attribute_dict["sample"]
            has_samples = True

        if (
            predictions_arff["attributes"][predicted_idx][1]
            != predictions_arff["attributes"][correct_idx][1]
        ):
            pred = predictions_arff["attributes"][predicted_idx][1]
            corr = predictions_arff["attributes"][correct_idx][1]
            raise ValueError(
                "Predicted and Correct do not have equal values:" f" {pred!s} Vs. {corr!s}",
            )

        # TODO: these could be cached
        values_predict: dict[int, dict[int, dict[int, list[float]]]] = {}
        values_correct: dict[int, dict[int, dict[int, list[float]]]] = {}
        for _line_idx, line in enumerate(predictions_arff["data"]):
            rep = line[repeat_idx]
            fold = line[fold_idx]
            samp = line[sample_idx] if has_samples else 0

            if task.task_type_id in [
                TaskType.SUPERVISED_CLASSIFICATION,
                TaskType.LEARNING_CURVE,
            ]:
                prediction = predictions_arff["attributes"][predicted_idx][1].index(
                    line[predicted_idx],
                )
                correct = predictions_arff["attributes"][predicted_idx][1].index(line[correct_idx])
            elif task.task_type_id == TaskType.SUPERVISED_REGRESSION:
                prediction = line[predicted_idx]
                correct = line[correct_idx]
            if rep not in values_predict:
                values_predict[rep] = OrderedDict()
                values_correct[rep] = OrderedDict()
            if fold not in values_predict[rep]:
                values_predict[rep][fold] = OrderedDict()
                values_correct[rep][fold] = OrderedDict()
            if samp not in values_predict[rep][fold]:
                values_predict[rep][fold][samp] = []
                values_correct[rep][fold][samp] = []

            values_predict[rep][fold][samp].append(prediction)
            values_correct[rep][fold][samp].append(correct)

        scores = []
        for rep in values_predict:
            for fold in values_predict[rep]:
                last_sample = len(values_predict[rep][fold]) - 1
                y_pred = values_predict[rep][fold][last_sample]
                y_true = values_correct[rep][fold][last_sample]
                scores.append(sklearn_fn(y_true, y_pred, **kwargs))
        return np.array(scores)

    def _parse_publish_response(self, xml_response: dict) -> None:
        """Parse the id from the xml_response and assign it to self."""
        self.run_id = int(xml_response["oml:upload_run"]["oml:run_id"])

    def _get_file_elements(self) -> dict:
        """Get file_elements to upload to the server.

        Derived child classes should overwrite this method as necessary.
        The description field will be populated automatically if not provided.
        """
        if self.parameter_settings is None and self.model is None:
            raise PyOpenMLError(
                "OpenMLRun must contain a model or be initialized with parameter_settings.",
            )
        if self.flow_id is None:
            if self.flow is None:
                raise PyOpenMLError(
                    "OpenMLRun object does not contain a flow id or reference to OpenMLFlow "
                    "(these should have been added while executing the task). ",
                )

            # publish the linked Flow before publishing the run.
            self.flow.publish()
            self.flow_id = self.flow.flow_id

        if self.parameter_settings is None:
            if self.flow is None:
                assert self.flow_id is not None  # for mypy
                self.flow = openml.flows.get_flow(self.flow_id)
            self.parameter_settings = self.flow.extension.obtain_parameter_values(
                self.flow,
                self.model,
            )

        file_elements = {"description": ("description.xml", self._to_xml())}

        if self.error_message is None:
            predictions = arff.dumps(self._generate_arff_dict())
            file_elements["predictions"] = ("predictions.arff", predictions)

        if self.trace is not None:
            trace_arff = arff.dumps(self.trace.trace_to_arff())
            file_elements["trace"] = ("trace.arff", trace_arff)
        return file_elements

    def _to_dict(self) -> dict[str, dict]:  # noqa: PLR0912, C901
        """Creates a dictionary representation of self."""
        description = OrderedDict()  # type: 'OrderedDict'
        description["oml:run"] = OrderedDict()
        description["oml:run"]["@xmlns:oml"] = "http://openml.org/openml"
        description["oml:run"]["oml:task_id"] = self.task_id
        description["oml:run"]["oml:flow_id"] = self.flow_id
        if self.setup_string is not None:
            description["oml:run"]["oml:setup_string"] = self.setup_string
        if self.error_message is not None:
            description["oml:run"]["oml:error_message"] = self.error_message
        if self.run_details is not None:
            description["oml:run"]["oml:run_details"] = self.run_details
        description["oml:run"]["oml:parameter_setting"] = self.parameter_settings
        if self.tags is not None:
            description["oml:run"]["oml:tag"] = self.tags
        if (self.fold_evaluations is not None and len(self.fold_evaluations) > 0) or (
            self.sample_evaluations is not None and len(self.sample_evaluations) > 0
        ):
            description["oml:run"]["oml:output_data"] = OrderedDict()
            description["oml:run"]["oml:output_data"]["oml:evaluation"] = []
        if self.fold_evaluations is not None:
            for measure in self.fold_evaluations:
                for repeat in self.fold_evaluations[measure]:
                    for fold, value in self.fold_evaluations[measure][repeat].items():
                        current = OrderedDict(
                            [
                                ("@repeat", str(repeat)),
                                ("@fold", str(fold)),
                                ("oml:name", measure),
                                ("oml:value", str(value)),
                            ],
                        )
                        description["oml:run"]["oml:output_data"]["oml:evaluation"].append(current)
        if self.sample_evaluations is not None:
            for measure in self.sample_evaluations:
                for repeat in self.sample_evaluations[measure]:
                    for fold in self.sample_evaluations[measure][repeat]:
                        for sample, value in self.sample_evaluations[measure][repeat][fold].items():
                            current = OrderedDict(
                                [
                                    ("@repeat", str(repeat)),
                                    ("@fold", str(fold)),
                                    ("@sample", str(sample)),
                                    ("oml:name", measure),
                                    ("oml:value", str(value)),
                                ],
                            )
                            description["oml:run"]["oml:output_data"]["oml:evaluation"].append(
                                current,
                            )
        return description

id: int | None property

The ID of the run, None if not uploaded to the server yet.

predictions: pd.DataFrame property

Return a DataFrame with predictions for this run
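
For runs fetched from the server, the predictions ARFF file is downloaded lazily the first time this property is accessed. A minimal sketch, assuming the run has predictions; the run ID is a placeholder:

import openml

run = openml.runs.get_run(10)     # placeholder run ID
df = run.predictions              # downloads and parses predictions.arff on first access
print(df.columns.tolist())        # e.g. repeat, fold, row_id, prediction, ...
print(df.head())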

from_filesystem(directory, expect_model=True) classmethod

The inverse of the to_filesystem method. Instantiates an OpenMLRun object based on files stored on the file system.

Parameters:

directory : str (required)
    A path to the folder where the results are stored.
expect_model : bool (default: True)
    If True, the model pickle must be present and an error is raised if it is missing. Otherwise, the model may or may not be present.

Returns:

run : OpenMLRun
    The re-instantiated run object.
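
A minimal round-trip sketch, assuming the run was previously serialized with to_filesystem; "my_run_dir" is a hypothetical directory name:

from openml.runs import OpenMLRun

# Re-load a run that was written earlier with run.to_filesystem("my_run_dir").
run = OpenMLRun.from_filesystem("my_run_dir", expect_model=False)
print(run.flow_name, len(run.data_content))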

Source code in openml/runs/run.py
@classmethod
def from_filesystem(cls, directory: str | Path, expect_model: bool = True) -> OpenMLRun:  # noqa: FBT001, FBT002
    """
    The inverse of the to_filesystem method. Instantiates an OpenMLRun
    object based on files stored on the file system.

    Parameters
    ----------
    directory : str
        a path leading to the folder where the results
        are stored

    expect_model : bool
        if True, it requires the model pickle to be present, and an error
        will be thrown if not. Otherwise, the model might or might not
        be present.

    Returns
    -------
    run : OpenMLRun
        the re-instantiated run object
    """
    # Avoiding cyclic imports
    import openml.runs.functions

    directory = Path(directory)
    if not directory.is_dir():
        raise ValueError("Could not find folder")

    description_path = directory / "description.xml"
    predictions_path = directory / "predictions.arff"
    trace_path = directory / "trace.arff"
    model_path = directory / "model.pkl"

    if not description_path.is_file():
        raise ValueError("Could not find description.xml")
    if not predictions_path.is_file():
        raise ValueError("Could not find predictions.arff")
    if (not model_path.is_file()) and expect_model:
        raise ValueError("Could not find model.pkl")

    with description_path.open() as fht:
        xml_string = fht.read()
    run = openml.runs.functions._create_run_from_xml(xml_string, from_server=False)

    if run.flow_id is None:
        flow = openml.flows.OpenMLFlow.from_filesystem(directory)
        run.flow = flow
        run.flow_name = flow.name

    with predictions_path.open() as fht:
        predictions = arff.load(fht)
        run.data_content = predictions["data"]

    if model_path.is_file():
        # note that it will load the model if the file exists, even if
        # expect_model is False
        with model_path.open("rb") as fhb:
            run.model = pickle.load(fhb)  # noqa: S301

    if trace_path.is_file():
        run.trace = openml.runs.OpenMLRunTrace._from_filesystem(trace_path)

    return run

get_metric_fn(sklearn_fn, kwargs=None)

Calculates metric scores based on predicted values. Assumes the run has been executed locally (and contains run_data). Furthermore, it assumes that the 'correct' or 'truth' attribute is specified in the arff (which is an optional field, but always the case for openml-python runs)

Parameters:

sklearn_fn : function (required)
    A function pointer to a sklearn function that accepts y_true, y_pred and **kwargs.
kwargs : dict (default: None)
    kwargs for the function.

Returns:

scores : ndarray of length num_folds * num_repeats
    Metric results.
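
A sketch of scoring a run with a scikit-learn metric, assuming the run contains local predictions or references a predictions file on the server; the run ID is a placeholder:

import openml
from sklearn.metrics import accuracy_score

run = openml.runs.get_run(10)                     # placeholder run ID
fold_scores = run.get_metric_fn(accuracy_score)   # one score per repeat/fold combination
print(fold_scores.mean(), fold_scores.std())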

Source code in openml/runs/run.py
def get_metric_fn(self, sklearn_fn: Callable, kwargs: dict | None = None) -> np.ndarray:  # noqa: PLR0915, PLR0912, C901
    """Calculates metric scores based on predicted values. Assumes the
    run has been executed locally (and contains run_data). Furthermore,
    it assumes that the 'correct' or 'truth' attribute is specified in
    the arff (which is an optional field, but always the case for
    openml-python runs)

    Parameters
    ----------
    sklearn_fn : function
        a function pointer to a sklearn function that
        accepts ``y_true``, ``y_pred`` and ``**kwargs``
    kwargs : dict
        kwargs for the function

    Returns
    -------
    scores : ndarray of scores of length num_folds * num_repeats
        metric results
    """
    kwargs = kwargs if kwargs else {}
    if self.data_content is not None and self.task_id is not None:
        predictions_arff = self._generate_arff_dict()
    elif (self.output_files is not None) and ("predictions" in self.output_files):
        predictions_file_url = openml._api_calls._file_id_to_url(
            self.output_files["predictions"],
            "predictions.arff",
        )
        response = openml._api_calls._download_text_file(predictions_file_url)
        predictions_arff = arff.loads(response)
        # TODO: make this a stream reader
    else:
        raise ValueError(
            "Run should have been locally executed or " "contain outputfile reference.",
        )

    # Need to know more about the task to compute scores correctly
    task = get_task(self.task_id)

    attribute_names = [att[0] for att in predictions_arff["attributes"]]
    if (
        task.task_type_id in [TaskType.SUPERVISED_CLASSIFICATION, TaskType.LEARNING_CURVE]
        and "correct" not in attribute_names
    ):
        raise ValueError('Attribute "correct" should be set for ' "classification task runs")
    if task.task_type_id == TaskType.SUPERVISED_REGRESSION and "truth" not in attribute_names:
        raise ValueError('Attribute "truth" should be set for ' "regression task runs")
    if task.task_type_id != TaskType.CLUSTERING and "prediction" not in attribute_names:
        raise ValueError('Attribute "predict" should be set for ' "supervised task runs")

    def _attribute_list_to_dict(attribute_list):  # type: ignore
        # convenience function: Creates a mapping to map from the name of
        # attributes present in the arff prediction file to their index.
        # This is necessary because the number of classes can be different
        # for different tasks.
        res = OrderedDict()
        for idx in range(len(attribute_list)):
            res[attribute_list[idx][0]] = idx
        return res

    attribute_dict = _attribute_list_to_dict(predictions_arff["attributes"])

    repeat_idx = attribute_dict["repeat"]
    fold_idx = attribute_dict["fold"]
    predicted_idx = attribute_dict["prediction"]  # Assume supervised task

    if task.task_type_id in (TaskType.SUPERVISED_CLASSIFICATION, TaskType.LEARNING_CURVE):
        correct_idx = attribute_dict["correct"]
    elif task.task_type_id == TaskType.SUPERVISED_REGRESSION:
        correct_idx = attribute_dict["truth"]
    has_samples = False
    if "sample" in attribute_dict:
        sample_idx = attribute_dict["sample"]
        has_samples = True

    if (
        predictions_arff["attributes"][predicted_idx][1]
        != predictions_arff["attributes"][correct_idx][1]
    ):
        pred = predictions_arff["attributes"][predicted_idx][1]
        corr = predictions_arff["attributes"][correct_idx][1]
        raise ValueError(
            "Predicted and Correct do not have equal values:" f" {pred!s} Vs. {corr!s}",
        )

    # TODO: these could be cached
    values_predict: dict[int, dict[int, dict[int, list[float]]]] = {}
    values_correct: dict[int, dict[int, dict[int, list[float]]]] = {}
    for _line_idx, line in enumerate(predictions_arff["data"]):
        rep = line[repeat_idx]
        fold = line[fold_idx]
        samp = line[sample_idx] if has_samples else 0

        if task.task_type_id in [
            TaskType.SUPERVISED_CLASSIFICATION,
            TaskType.LEARNING_CURVE,
        ]:
            prediction = predictions_arff["attributes"][predicted_idx][1].index(
                line[predicted_idx],
            )
            correct = predictions_arff["attributes"][predicted_idx][1].index(line[correct_idx])
        elif task.task_type_id == TaskType.SUPERVISED_REGRESSION:
            prediction = line[predicted_idx]
            correct = line[correct_idx]
        if rep not in values_predict:
            values_predict[rep] = OrderedDict()
            values_correct[rep] = OrderedDict()
        if fold not in values_predict[rep]:
            values_predict[rep][fold] = OrderedDict()
            values_correct[rep][fold] = OrderedDict()
        if samp not in values_predict[rep][fold]:
            values_predict[rep][fold][samp] = []
            values_correct[rep][fold][samp] = []

        values_predict[rep][fold][samp].append(prediction)
        values_correct[rep][fold][samp].append(correct)

    scores = []
    for rep in values_predict:
        for fold in values_predict[rep]:
            last_sample = len(values_predict[rep][fold]) - 1
            y_pred = values_predict[rep][fold][last_sample]
            y_true = values_correct[rep][fold][last_sample]
            scores.append(sklearn_fn(y_true, y_pred, **kwargs))
    return np.array(scores)

to_filesystem(directory, store_model=True)

The inverse of the from_filesystem method. Serializes a run on the filesystem, to be uploaded later.

Parameters:

directory : str (required)
    A path to the folder where the results will be stored. Should be empty.
store_model : bool (default: True)
    If True, the model is pickled as well. As this is the most storage-expensive part, it is often desirable not to store the model.
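
A sketch of serializing a locally executed run for later upload, assuming the scikit-learn extension is available; the task ID and output directory below are placeholders:

import openml
from sklearn.tree import DecisionTreeClassifier

# Execute a model locally so the run contains predictions and a model.
run = openml.runs.run_model_on_task(DecisionTreeClassifier(), task=59)  # placeholder task ID
run.to_filesystem("my_run_dir", store_model=False)  # hypothetical empty output directory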
Source code in openml/runs/run.py
def to_filesystem(
    self,
    directory: str | Path,
    store_model: bool = True,  # noqa: FBT001, FBT002
) -> None:
    """
    The inverse of the from_filesystem method. Serializes a run
    on the filesystem, to be uploaded later.

    Parameters
    ----------
    directory : str
        a path leading to the folder where the results
        will be stored. Should be empty

    store_model : bool, optional (default=True)
        if True, a model will be pickled as well. As this is the most
        storage expensive part, it is often desirable to not store the
        model.
    """
    if self.data_content is None or self.model is None:
        raise ValueError("Run should have been executed (and contain " "model / predictions)")
    directory = Path(directory)
    directory.mkdir(exist_ok=True, parents=True)

    if any(directory.iterdir()):
        raise ValueError(f"Output directory {directory.expanduser().resolve()} should be empty")

    run_xml = self._to_xml()
    predictions_arff = arff.dumps(self._generate_arff_dict())

    # It seems like typing does not allow to define the same variable multiple times
    with (directory / "description.xml").open("w") as fh:
        fh.write(run_xml)
    with (directory / "predictions.arff").open("w") as fh:
        fh.write(predictions_arff)
    if store_model:
        with (directory / "model.pkl").open("wb") as fh_b:
            pickle.dump(self.model, fh_b)

    if self.flow_id is None and self.flow is not None:
        self.flow.to_filesystem(directory)

    if self.trace is not None:
        self.trace._to_filesystem(directory)

OpenMLRunTrace

OpenML Run Trace: parsed output from Run Trace call

Parameters:

run_id : int (required)
    OpenML run id.
trace_iterations : dict (required)
    Mapping from key (repeat, fold, iteration) to an object of OpenMLTraceIteration.
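
Traces are usually attached to runs of hyperparameter-search flows and downloaded from the server rather than built by hand. A minimal sketch of inspecting one, assuming the run in question actually has a trace; the run ID is a placeholder:

import openml

trace = openml.runs.get_run_trace(10)             # placeholder run ID
best = trace.get_selected_iteration(fold=0, repeat=0)
print(trace.trace_iterations[(0, 0, best)].parameters)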
Source code in openml/runs/trace.py
class OpenMLRunTrace:
    """OpenML Run Trace: parsed output from Run Trace call

    Parameters
    ----------
    run_id : int
        OpenML run id.

    trace_iterations : dict
        Mapping from key ``(repeat, fold, iteration)`` to an object of
        OpenMLTraceIteration.

    """

    def __init__(
        self,
        run_id: int | None,
        trace_iterations: dict[tuple[int, int, int], OpenMLTraceIteration],
    ):
        """Object to hold the trace content of a run.

        Parameters
        ----------
        run_id : int
            Id for which the trace content is to be stored.
        trace_iterations : List[List]
            The trace content obtained by running a flow on a task.
        """
        self.run_id = run_id
        self.trace_iterations = trace_iterations

    def get_selected_iteration(self, fold: int, repeat: int) -> int:
        """
        Returns the trace iteration that was marked as selected. In
        case multiple are marked as selected (should not happen) the
        first of these is returned

        Parameters
        ----------
        fold: int

        repeat: int

        Returns
        -------
        int
            The trace iteration from the given fold and repeat that was
            selected as the best iteration by the search procedure
        """
        for r, f, i in self.trace_iterations:
            if r == repeat and f == fold and self.trace_iterations[(r, f, i)].selected is True:
                return i
        raise ValueError(
            "Could not find the selected iteration for rep/fold %d/%d" % (repeat, fold),
        )

    @classmethod
    def generate(
        cls,
        attributes: list[tuple[str, str]],
        content: list[list[int | float | str]],
    ) -> OpenMLRunTrace:
        """Generates an OpenMLRunTrace.

        Generates the trace object from the attributes and content extracted
        while running the underlying flow.

        Parameters
        ----------
        attributes : list
            List of tuples describing the arff attributes.

        content : list
            List of lists containing information about the individual tuning
            runs.

        Returns
        -------
        OpenMLRunTrace
        """
        if content is None:
            raise ValueError("Trace content not available.")
        if attributes is None:
            raise ValueError("Trace attributes not available.")
        if len(content) == 0:
            raise ValueError("Trace content is empty.")
        if len(attributes) != len(content[0]):
            raise ValueError(
                "Trace_attributes and trace_content not compatible:"
                f" {attributes} vs {content[0]}",
            )

        return cls._trace_from_arff_struct(
            attributes=attributes,
            content=content,
            error_message="setup_string not allowed when constructing a "
            "trace object from run results.",
        )

    @classmethod
    def _from_filesystem(cls, file_path: str | Path) -> OpenMLRunTrace:
        """
        Logic to deserialize the trace from the filesystem.

        Parameters
        ----------
        file_path: str | Path
            File path where the trace arff is stored.

        Returns
        -------
        OpenMLRunTrace
        """
        file_path = Path(file_path)

        if not file_path.exists():
            raise ValueError("Trace file doesn't exist")

        with file_path.open("r") as fp:
            trace_arff = arff.load(fp)

        for trace_idx in range(len(trace_arff["data"])):
            # iterate over first three entrees of a trace row
            # (fold, repeat, trace_iteration) these should be int
            for line_idx in range(3):
                trace_arff["data"][trace_idx][line_idx] = int(
                    trace_arff["data"][trace_idx][line_idx],
                )

        return cls.trace_from_arff(trace_arff)

    def _to_filesystem(self, file_path: str | Path) -> None:
        """Serialize the trace object to the filesystem.

        Serialize the trace object as an arff.

        Parameters
        ----------
        file_path: str | Path
            File path where the trace arff will be stored.
        """
        trace_path = Path(file_path) / "trace.arff"

        trace_arff = arff.dumps(self.trace_to_arff())
        with trace_path.open("w") as f:
            f.write(trace_arff)

    def trace_to_arff(self) -> dict[str, Any]:
        """Generate the arff dictionary for uploading predictions to the server.

        Uses the trace object to generate an arff dictionary representation.

        Returns
        -------
        arff_dict : dict
            Dictionary representation of the ARFF file that will be uploaded.
            Contains information about the optimization trace.
        """
        if self.trace_iterations is None:
            raise ValueError("trace_iterations missing from the trace object")

        # attributes that will be in trace arff
        trace_attributes = [
            ("repeat", "NUMERIC"),
            ("fold", "NUMERIC"),
            ("iteration", "NUMERIC"),
            ("evaluation", "NUMERIC"),
            ("selected", ["true", "false"]),
        ]
        trace_attributes.extend(
            [
                (PREFIX + parameter, "STRING")
                for parameter in next(iter(self.trace_iterations.values())).get_parameters()
            ],
        )

        arff_dict: dict[str, Any] = {}
        data = []
        for trace_iteration in self.trace_iterations.values():
            tmp_list = []
            for _attr, _ in trace_attributes:
                if _attr.startswith(PREFIX):
                    attr = _attr[len(PREFIX) :]
                    value = trace_iteration.get_parameters()[attr]
                else:
                    attr = _attr
                    value = getattr(trace_iteration, attr)

                if attr == "selected":
                    tmp_list.append("true" if value else "false")
                else:
                    tmp_list.append(value)
            data.append(tmp_list)

        arff_dict["attributes"] = trace_attributes
        arff_dict["data"] = data
        # TODO allow to pass a trace description when running a flow
        arff_dict["relation"] = "Trace"
        return arff_dict

    @classmethod
    def trace_from_arff(cls, arff_obj: dict[str, Any]) -> OpenMLRunTrace:
        """Generate trace from arff trace.

        Creates a trace file from arff object (for example, generated by a
        local run).

        Parameters
        ----------
        arff_obj : dict
            LIAC arff obj, dict containing attributes, relation, data.

        Returns
        -------
        OpenMLRunTrace
        """
        attributes = arff_obj["attributes"]
        content = arff_obj["data"]
        return cls._trace_from_arff_struct(
            attributes=attributes,
            content=content,
            error_message="setup_string not supported for arff serialization",
        )

    @classmethod
    def _trace_from_arff_struct(
        cls,
        attributes: list[tuple[str, str]],
        content: list[list[int | float | str]],
        error_message: str,
    ) -> Self:
        """Generate a trace dictionary from ARFF structure.

        Parameters
        ----------
        cls : type
            The trace object to be created.
        attributes : list[tuple[str, str]]
            Attribute descriptions.
        content : list[list[int | float | str]]]
            List of instances.
        error_message : str
            Error message to raise if `setup_string` is in `attributes`.

        Returns
        -------
        OrderedDict
            A dictionary representing the trace.
        """
        trace = OrderedDict()
        attribute_idx = {att[0]: idx for idx, att in enumerate(attributes)}

        for required_attribute in REQUIRED_ATTRIBUTES:
            if required_attribute not in attribute_idx:
                raise ValueError("arff misses required attribute: %s" % required_attribute)
        if "setup_string" in attribute_idx:
            raise ValueError(error_message)

        # note that the required attributes can not be duplicated because
        # they are not parameters
        parameter_attributes = []
        for attribute in attribute_idx:
            if attribute in REQUIRED_ATTRIBUTES or attribute == "setup_string":
                continue

            if not attribute.startswith(PREFIX):
                raise ValueError(
                    f"Encountered unknown attribute {attribute} that does not start "
                    f"with prefix {PREFIX}",
                )

            parameter_attributes.append(attribute)

        for itt in content:
            repeat = int(itt[attribute_idx["repeat"]])
            fold = int(itt[attribute_idx["fold"]])
            iteration = int(itt[attribute_idx["iteration"]])
            evaluation = float(itt[attribute_idx["evaluation"]])
            selected_value = itt[attribute_idx["selected"]]
            if selected_value == "true":
                selected = True
            elif selected_value == "false":
                selected = False
            else:
                raise ValueError(
                    'expected {"true", "false"} value for selected field, '
                    "received: %s" % selected_value,
                )

            parameters = {
                attribute: itt[attribute_idx[attribute]] for attribute in parameter_attributes
            }

            current = OpenMLTraceIteration(
                repeat=repeat,
                fold=fold,
                iteration=iteration,
                setup_string=None,
                evaluation=evaluation,
                selected=selected,
                parameters=parameters,
            )
            trace[(repeat, fold, iteration)] = current

        return cls(None, trace)

    @classmethod
    def trace_from_xml(cls, xml: str | Path | IO) -> OpenMLRunTrace:
        """Generate trace from xml.

        Creates a trace file from the xml description.

        Parameters
        ----------
        xml : string | file-like object
            An xml description that can be either a `string` or a file-like
            object.

        Returns
        -------
        run : OpenMLRunTrace
            Object containing the run id and a dict containing the trace
            iterations.
        """
        if isinstance(xml, Path):
            xml = str(xml.absolute())

        result_dict = xmltodict.parse(xml, force_list=("oml:trace_iteration",))["oml:trace"]

        run_id = result_dict["oml:run_id"]
        trace = OrderedDict()

        if "oml:trace_iteration" not in result_dict:
            raise ValueError("Run does not contain valid trace. ")
        if not isinstance(result_dict["oml:trace_iteration"], list):
            raise TypeError(type(result_dict["oml:trace_iteration"]))

        for itt in result_dict["oml:trace_iteration"]:
            repeat = int(itt["oml:repeat"])
            fold = int(itt["oml:fold"])
            iteration = int(itt["oml:iteration"])
            setup_string = json.loads(itt["oml:setup_string"])
            evaluation = float(itt["oml:evaluation"])
            selected_value = itt["oml:selected"]
            if selected_value == "true":
                selected = True
            elif selected_value == "false":
                selected = False
            else:
                raise ValueError(
                    'expected {"true", "false"} value for '
                    "selected field, received: %s" % selected_value,
                )

            current = OpenMLTraceIteration(
                repeat=repeat,
                fold=fold,
                iteration=iteration,
                setup_string=setup_string,
                evaluation=evaluation,
                selected=selected,
            )
            trace[(repeat, fold, iteration)] = current

        return cls(run_id, trace)

    @classmethod
    def merge_traces(cls, traces: list[OpenMLRunTrace]) -> OpenMLRunTrace:
        """Merge multiple traces into a single trace.

        Parameters
        ----------
        cls : type
            Type of the trace object to be created.
        traces : List[OpenMLRunTrace]
            List of traces to merge.

        Returns
        -------
        OpenMLRunTrace
            A trace object representing the merged traces.

        Raises
        ------
        ValueError
            If the parameters in the iterations of the traces being merged are not equal.
            If a key (repeat, fold, iteration) is encountered twice while merging the traces.
        """
        merged_trace: dict[tuple[int, int, int], OpenMLTraceIteration] = {}

        previous_iteration = None
        for trace in traces:
            for iteration in trace:
                key = (iteration.repeat, iteration.fold, iteration.iteration)

                assert iteration.parameters is not None
                param_keys = iteration.parameters.keys()

                if previous_iteration is not None:
                    trace_itr = merged_trace[previous_iteration]

                    assert trace_itr.parameters is not None
                    trace_itr_keys = trace_itr.parameters.keys()

                    if list(param_keys) != list(trace_itr_keys):
                        raise ValueError(
                            "Cannot merge traces because the parameters are not equal: "
                            "{} vs {}".format(
                                list(trace_itr.parameters.keys()),
                                list(iteration.parameters.keys()),
                            ),
                        )

                if key in merged_trace:
                    raise ValueError(
                        f"Cannot merge traces because key '{key}' was encountered twice",
                    )

                merged_trace[key] = iteration
                previous_iteration = key

        return cls(None, merged_trace)

    def __repr__(self) -> str:
        return "[Run id: {}, {} trace iterations]".format(
            -1 if self.run_id is None else self.run_id,
            len(self.trace_iterations),
        )

    def __iter__(self) -> Iterator[OpenMLTraceIteration]:
        yield from self.trace_iterations.values()

__init__(run_id, trace_iterations)

Object to hold the trace content of a run.

Parameters:

Name Type Description Default
run_id int

Id for which the trace content is to be stored.

required
trace_iterations List[List]

The trace content obtained by running a flow on a task.

required
Source code in openml/runs/trace.py
def __init__(
    self,
    run_id: int | None,
    trace_iterations: dict[tuple[int, int, int], OpenMLTraceIteration],
):
    """Object to hold the trace content of a run.

    Parameters
    ----------
    run_id : int
        Id for which the trace content is to be stored.
    trace_iterations : List[List]
        The trace content obtained by running a flow on a task.
    """
    self.run_id = run_id
    self.trace_iterations = trace_iterations

generate(attributes, content) classmethod

Generates an OpenMLRunTrace.

Generates the trace object from the attributes and content extracted while running the underlying flow.

Parameters:

Name Type Description Default
attributes list

List of tuples describing the arff attributes.

required
content list

List of lists containing information about the individual tuning runs.

required

Returns:

Type Description
OpenMLRunTrace
Source code in openml/runs/trace.py
@classmethod
def generate(
    cls,
    attributes: list[tuple[str, str]],
    content: list[list[int | float | str]],
) -> OpenMLRunTrace:
    """Generates an OpenMLRunTrace.

    Generates the trace object from the attributes and content extracted
    while running the underlying flow.

    Parameters
    ----------
    attributes : list
        List of tuples describing the arff attributes.

    content : list
        List of lists containing information about the individual tuning
        runs.

    Returns
    -------
    OpenMLRunTrace
    """
    if content is None:
        raise ValueError("Trace content not available.")
    if attributes is None:
        raise ValueError("Trace attributes not available.")
    if len(content) == 0:
        raise ValueError("Trace content is empty.")
    if len(attributes) != len(content[0]):
        raise ValueError(
            "Trace_attributes and trace_content not compatible:"
            f" {attributes} vs {content[0]}",
        )

    return cls._trace_from_arff_struct(
        attributes=attributes,
        content=content,
        error_message="setup_string not allowed when constructing a "
        "trace object from run results.",
    )

get_selected_iteration(fold, repeat)

Returns the trace iteration that was marked as selected. In case multiple are marked as selected (which should not happen), the first of these is returned.

Parameters:

Name Type Description Default
fold int
required
repeat int
required

Returns:

Type Description
int

The trace iteration from the given fold and repeat that was selected as the best iteration by the search procedure

Source code in openml/runs/trace.py
def get_selected_iteration(self, fold: int, repeat: int) -> int:
    """
    Returns the trace iteration that was marked as selected. In
    case multiple are marked as selected (should not happen) the
    first of these is returned

    Parameters
    ----------
    fold: int

    repeat: int

    Returns
    -------
    int
        The trace iteration from the given fold and repeat that was
        selected as the best iteration by the search procedure
    """
    for r, f, i in self.trace_iterations:
        if r == repeat and f == fold and self.trace_iterations[(r, f, i)].selected is True:
            return i
    raise ValueError(
        "Could not find the selected iteration for rep/fold %d/%d" % (repeat, fold),
    )

merge_traces(traces) classmethod

Merge multiple traces into a single trace.

Parameters:

Name Type Description Default
cls type

Type of the trace object to be created.

required
traces List[OpenMLRunTrace]

List of traces to merge.

required

Returns:

Type Description
OpenMLRunTrace

A trace object representing the merged traces.

Raises:

Type Description
ValueError

If the parameters in the iterations of the traces being merged are not equal. If a key (repeat, fold, iteration) is encountered twice while merging the traces.

Source code in openml/runs/trace.py
@classmethod
def merge_traces(cls, traces: list[OpenMLRunTrace]) -> OpenMLRunTrace:
    """Merge multiple traces into a single trace.

    Parameters
    ----------
    cls : type
        Type of the trace object to be created.
    traces : List[OpenMLRunTrace]
        List of traces to merge.

    Returns
    -------
    OpenMLRunTrace
        A trace object representing the merged traces.

    Raises
    ------
    ValueError
        If the parameters in the iterations of the traces being merged are not equal.
        If a key (repeat, fold, iteration) is encountered twice while merging the traces.
    """
    merged_trace: dict[tuple[int, int, int], OpenMLTraceIteration] = {}

    previous_iteration = None
    for trace in traces:
        for iteration in trace:
            key = (iteration.repeat, iteration.fold, iteration.iteration)

            assert iteration.parameters is not None
            param_keys = iteration.parameters.keys()

            if previous_iteration is not None:
                trace_itr = merged_trace[previous_iteration]

                assert trace_itr.parameters is not None
                trace_itr_keys = trace_itr.parameters.keys()

                if list(param_keys) != list(trace_itr_keys):
                    raise ValueError(
                        "Cannot merge traces because the parameters are not equal: "
                        "{} vs {}".format(
                            list(trace_itr.parameters.keys()),
                            list(iteration.parameters.keys()),
                        ),
                    )

            if key in merged_trace:
                raise ValueError(
                    f"Cannot merge traces because key '{key}' was encountered twice",
                )

            merged_trace[key] = iteration
            previous_iteration = key

    return cls(None, merged_trace)
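
As a rough, hand-built sketch (not taken from the library documentation; the parameter values and evaluations are invented), two single-iteration traces can be merged like this. Parameter names carry the "parameter_" prefix used in trace ARFF files:

from collections import OrderedDict

from openml.runs import OpenMLRunTrace, OpenMLTraceIteration

# Two hand-built traces covering different iterations of the same repeat/fold.
iteration_0 = OpenMLTraceIteration(
    repeat=0, fold=0, iteration=0,
    evaluation=0.81, selected=False,
    parameters=OrderedDict(parameter_max_depth=3),
)
iteration_1 = OpenMLTraceIteration(
    repeat=0, fold=0, iteration=1,
    evaluation=0.88, selected=True,
    parameters=OrderedDict(parameter_max_depth=7),
)
trace_a = OpenMLRunTrace(run_id=None, trace_iterations={(0, 0, 0): iteration_0})
trace_b = OpenMLRunTrace(run_id=None, trace_iterations={(0, 0, 1): iteration_1})

merged = OpenMLRunTrace.merge_traces([trace_a, trace_b])
print(merged)  # [Run id: -1, 2 trace iterations]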

trace_from_arff(arff_obj) classmethod

Generate trace from arff trace.

Creates a trace file from arff object (for example, generated by a local run).

Parameters:

Name Type Description Default
arff_obj dict

LIAC arff obj, dict containing attributes, relation, data.

required

Returns:

Type Description
OpenMLRunTrace
Source code in openml/runs/trace.py
@classmethod
def trace_from_arff(cls, arff_obj: dict[str, Any]) -> OpenMLRunTrace:
    """Generate trace from arff trace.

    Creates a trace file from arff object (for example, generated by a
    local run).

    Parameters
    ----------
    arff_obj : dict
        LIAC arff obj, dict containing attributes, relation, data.

    Returns
    -------
    OpenMLRunTrace
    """
    attributes = arff_obj["attributes"]
    content = arff_obj["data"]
    return cls._trace_from_arff_struct(
        attributes=attributes,
        content=content,
        error_message="setup_string not supported for arff serialization",
    )

trace_from_xml(xml) classmethod

Generate trace from xml.

Creates a trace file from the xml description.

Parameters:

Name Type Description Default
xml string | file-like object

An xml description that can be either a string or a file-like object.

required

Returns:

Name Type Description
run OpenMLRunTrace

Object containing the run id and a dict containing the trace iterations.

Source code in openml/runs/trace.py
@classmethod
def trace_from_xml(cls, xml: str | Path | IO) -> OpenMLRunTrace:
    """Generate trace from xml.

    Creates a trace file from the xml description.

    Parameters
    ----------
    xml : string | file-like object
        An xml description that can be either a `string` or a file-like
        object.

    Returns
    -------
    run : OpenMLRunTrace
        Object containing the run id and a dict containing the trace
        iterations.
    """
    if isinstance(xml, Path):
        xml = str(xml.absolute())

    result_dict = xmltodict.parse(xml, force_list=("oml:trace_iteration",))["oml:trace"]

    run_id = result_dict["oml:run_id"]
    trace = OrderedDict()

    if "oml:trace_iteration" not in result_dict:
        raise ValueError("Run does not contain valid trace. ")
    if not isinstance(result_dict["oml:trace_iteration"], list):
        raise TypeError(type(result_dict["oml:trace_iteration"]))

    for itt in result_dict["oml:trace_iteration"]:
        repeat = int(itt["oml:repeat"])
        fold = int(itt["oml:fold"])
        iteration = int(itt["oml:iteration"])
        setup_string = json.loads(itt["oml:setup_string"])
        evaluation = float(itt["oml:evaluation"])
        selected_value = itt["oml:selected"]
        if selected_value == "true":
            selected = True
        elif selected_value == "false":
            selected = False
        else:
            raise ValueError(
                'expected {"true", "false"} value for '
                "selected field, received: %s" % selected_value,
            )

        current = OpenMLTraceIteration(
            repeat=repeat,
            fold=fold,
            iteration=iteration,
            setup_string=setup_string,
            evaluation=evaluation,
            selected=selected,
        )
        trace[(repeat, fold, iteration)] = current

    return cls(run_id, trace)

trace_to_arff()

Generate the arff dictionary for uploading predictions to the server.

Uses the trace object to generate an arff dictionary representation.

Returns:

Name Type Description
arff_dict dict

Dictionary representation of the ARFF file that will be uploaded. Contains information about the optimization trace.

Source code in openml/runs/trace.py
def trace_to_arff(self) -> dict[str, Any]:
    """Generate the arff dictionary for uploading predictions to the server.

    Uses the trace object to generate an arff dictionary representation.

    Returns
    -------
    arff_dict : dict
        Dictionary representation of the ARFF file that will be uploaded.
        Contains information about the optimization trace.
    """
    if self.trace_iterations is None:
        raise ValueError("trace_iterations missing from the trace object")

    # attributes that will be in trace arff
    trace_attributes = [
        ("repeat", "NUMERIC"),
        ("fold", "NUMERIC"),
        ("iteration", "NUMERIC"),
        ("evaluation", "NUMERIC"),
        ("selected", ["true", "false"]),
    ]
    trace_attributes.extend(
        [
            (PREFIX + parameter, "STRING")
            for parameter in next(iter(self.trace_iterations.values())).get_parameters()
        ],
    )

    arff_dict: dict[str, Any] = {}
    data = []
    for trace_iteration in self.trace_iterations.values():
        tmp_list = []
        for _attr, _ in trace_attributes:
            if _attr.startswith(PREFIX):
                attr = _attr[len(PREFIX) :]
                value = trace_iteration.get_parameters()[attr]
            else:
                attr = _attr
                value = getattr(trace_iteration, attr)

            if attr == "selected":
                tmp_list.append("true" if value else "false")
            else:
                tmp_list.append(value)
        data.append(tmp_list)

    arff_dict["attributes"] = trace_attributes
    arff_dict["data"] = data
    # TODO allow to pass a trace description when running a flow
    arff_dict["relation"] = "Trace"
    return arff_dict
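
A minimal sketch of the generated dictionary, using a hand-built single-iteration trace (the parameter name and evaluation value are invented):

from collections import OrderedDict

from openml.runs import OpenMLRunTrace, OpenMLTraceIteration

trace = OpenMLRunTrace(
    run_id=None,
    trace_iterations={
        (0, 0, 0): OpenMLTraceIteration(
            repeat=0, fold=0, iteration=0,
            evaluation=0.88, selected=True,
            parameters=OrderedDict(parameter_C=1.0),
        ),
    },
)

arff_dict = trace.trace_to_arff()
# attributes: repeat, fold, iteration, evaluation, selected, parameter_C
print(arff_dict["attributes"])
print(arff_dict["data"])  # [[0, 0, 0, 0.88, 'true', 1.0]]
print(trace.get_selected_iteration(fold=0, repeat=0))  # 0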

OpenMLTraceIteration dataclass

OpenML Trace Iteration: parsed output from the Run Trace call. Exactly one of setup_string or parameters must be provided.

Parameters:

Name Type Description Default
repeat int

repeat number (in case of no repeats: 0)

required
fold int

fold number (in case of no folds: 0)

required
iteration int

iteration number of optimization procedure

required
setup_string str

json string representing the parameters If not provided, parameters should be set.

None
evaluation double

The evaluation that was awarded to this trace iteration. Measure is defined by the task

required
selected bool

Whether this was the best of all iterations, and hence selected for making predictions. Per fold/repeat there should be only one iteration selected

required
parameters OrderedDict

Dictionary specifying parameter names and their values. If not provided, setup_string should be set.

None
Source code in openml/runs/trace.py
@dataclass
class OpenMLTraceIteration:
    """
    OpenML Trace Iteration: parsed output from Run Trace call
    Exactly one of `setup_string` or `parameters` must be provided.

    Parameters
    ----------
    repeat : int
        repeat number (in case of no repeats: 0)

    fold : int
        fold number (in case of no folds: 0)

    iteration : int
        iteration number of optimization procedure

    setup_string : str, optional
        json string representing the parameters
        If not provided, ``parameters`` should be set.

    evaluation : double
        The evaluation that was awarded to this trace iteration.
        Measure is defined by the task

    selected : bool
        Whether this was the best of all iterations, and hence
        selected for making predictions. Per fold/repeat there
        should be only one iteration selected

    parameters : OrderedDict, optional
        Dictionary specifying parameter names and their values.
        If not provided, ``setup_string`` should be set.
    """

    repeat: int
    fold: int
    iteration: int

    evaluation: float
    selected: bool

    setup_string: dict[str, str] | None = None
    parameters: dict[str, str | int | float] | None = None

    def __post_init__(self) -> None:
        # TODO: refactor into one argument of type <str | OrderedDict>
        if self.setup_string and self.parameters:
            raise ValueError(
                "Can only be instantiated with either `setup_string` or `parameters` argument.",
            )

        if not (self.setup_string or self.parameters):
            raise ValueError(
                "Either `setup_string` or `parameters` needs to be passed as argument.",
            )

        if self.parameters is not None and not isinstance(self.parameters, dict):
            raise TypeError(
                "argument parameters is not an instance of OrderedDict, but %s"
                % str(type(self.parameters)),
            )

    def get_parameters(self) -> dict[str, Any]:
        """Get the parameters of this trace iteration."""
        # parameters have prefix 'parameter_'
        if self.setup_string:
            return {
                param[len(PREFIX) :]: json.loads(value)
                for param, value in self.setup_string.items()
            }

        assert self.parameters is not None
        return {param[len(PREFIX) :]: value for param, value in self.parameters.items()}

get_parameters()

Get the parameters of this trace iteration.

Source code in openml/runs/trace.py
def get_parameters(self) -> dict[str, Any]:
    """Get the parameters of this trace iteration."""
    # parameters have prefix 'parameter_'
    if self.setup_string:
        return {
            param[len(PREFIX) :]: json.loads(value)
            for param, value in self.setup_string.items()
        }

    assert self.parameters is not None
    return {param[len(PREFIX) :]: value for param, value in self.parameters.items()}

delete_run(run_id)

Delete run with id run_id from the OpenML server.

You can only delete runs which you uploaded.

Parameters:

Name Type Description Default
run_id int

OpenML id of the run

required

Returns:

Type Description
bool

True if the deletion was successful. False otherwise.

Source code in openml/runs/functions.py
def delete_run(run_id: int) -> bool:
    """Delete run with id `run_id` from the OpenML server.

    You can only delete runs which you uploaded.

    Parameters
    ----------
    run_id : int
        OpenML id of the run

    Returns
    -------
    bool
        True if the deletion was successful. False otherwise.
    """
    return openml.utils._delete_entity("run", run_id)
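
A short usage sketch (the run id is a placeholder; the call only succeeds for runs uploaded with the API key currently configured):

import openml

# Deletion requires openml.config.apikey to be set to the uploader's key.
success = openml.runs.delete_run(run_id=12345)  # placeholder id
print(success)  # True if the server accepted the deletion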

get_run(run_id, ignore_cache=False)

Gets run corresponding to run_id.

Parameters:

Name Type Description Default
run_id int
required
ignore_cache bool

Whether to ignore the cache. If true this will download and overwrite the run xml even if the requested run is already cached.

False

Returns:

Name Type Description
run OpenMLRun

Run corresponding to ID, fetched from the server.

Source code in openml/runs/functions.py
@openml.utils.thread_safe_if_oslo_installed
def get_run(run_id: int, ignore_cache: bool = False) -> OpenMLRun:  # noqa: FBT002, FBT001
    """Gets run corresponding to run_id.

    Parameters
    ----------
    run_id : int

    ignore_cache : bool
        Whether to ignore the cache. If ``true`` this will download and overwrite the run xml
        even if the requested run is already cached.

    Returns
    -------
    run : OpenMLRun
        Run corresponding to ID, fetched from the server.
    """
    run_dir = Path(openml.utils._create_cache_directory_for_id(RUNS_CACHE_DIR_NAME, run_id))
    run_file = run_dir / "description.xml"

    run_dir.mkdir(parents=True, exist_ok=True)

    try:
        if not ignore_cache:
            return _get_cached_run(run_id)

        raise OpenMLCacheException(message="dummy")

    except OpenMLCacheException:
        run_xml = openml._api_calls._perform_api_call("run/%d" % run_id, "get")
        with run_file.open("w", encoding="utf8") as fh:
            fh.write(run_xml)

    return _create_run_from_xml(run_xml)
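
A usage sketch (the run id below is an arbitrary example; substitute any existing run):

import openml

run = openml.runs.get_run(2080)  # example id of a public run
print(run.task_id, run.flow_id, run.setup_id)

# bypass the local cache and re-download the run description
run = openml.runs.get_run(2080, ignore_cache=True)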

get_run_trace(run_id)

Get the optimization trace object for a given run id.

Parameters:

Name Type Description Default
run_id int
required

Returns:

Type Description
OpenMLRunTrace
Source code in openml/runs/functions.py
def get_run_trace(run_id: int) -> OpenMLRunTrace:
    """
    Get the optimization trace object for a given run id.

    Parameters
    ----------
    run_id : int

    Returns
    -------
    openml.runs.OpenMLRunTrace
    """
    trace_xml = openml._api_calls._perform_api_call("run/trace/%d" % run_id, "get")
    return OpenMLRunTrace.trace_from_xml(trace_xml)
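
A usage sketch (the run id is a placeholder; only runs produced by a hyperparameter-search flow have a trace, otherwise the server returns an error):

import openml

trace = openml.runs.get_run_trace(1234567)  # placeholder id of a run with a trace
for (repeat, fold, iteration), it in trace.trace_iterations.items():
    print(repeat, fold, iteration, it.evaluation, it.selected)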

get_runs(run_ids)

Gets all runs in run_ids list.

Parameters:

Name Type Description Default
run_ids list of ints
required

Returns:

Name Type Description
runs list of OpenMLRun

List of runs corresponding to IDs, fetched from the server.

Source code in openml/runs/functions.py
def get_runs(run_ids: list[int]) -> list[OpenMLRun]:
    """Gets all runs in run_ids list.

    Parameters
    ----------
    run_ids : list of ints

    Returns
    -------
    runs : list of OpenMLRun
        List of runs corresponding to IDs, fetched from the server.
    """
    runs = []
    for run_id in run_ids:
        runs.append(get_run(run_id))
    return runs

initialize_model_from_run(run_id)

Initialize a model based on a run_id (i.e., using the exact same parameter settings).

Parameters:

Name Type Description Default
run_id int

The Openml run_id

required

Returns:

Type Description
model
Source code in openml/runs/functions.py
def initialize_model_from_run(run_id: int) -> Any:
    """
    Initialize a model based on a run_id (i.e., using the exact
    same parameter settings)

    Parameters
    ----------
    run_id : int
        The Openml run_id

    Returns
    -------
    model
    """
    run = get_run(run_id)
    # TODO(eddiebergman): I imagine this is None if it's not published,
    # might need to raise an explicit error for that
    assert run.setup_id is not None
    return initialize_model(run.setup_id)
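
A usage sketch (placeholder run id; the returned model is unfitted but configured with the run's hyperparameter settings):

import openml

model = openml.runs.initialize_model_from_run(run_id=2080)  # placeholder id
print(model)  # e.g. an unfitted scikit-learn estimator with the run's parameters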

initialize_model_from_trace(run_id, repeat, fold, iteration=None)

Initialize a model based on the parameters that were set by an optimization procedure (i.e., using the exact same parameter settings)

Parameters:

Name Type Description Default
run_id int

The OpenML run_id. Should contain a trace file, otherwise an OpenMLServerException is raised

required
repeat int

The repeat nr (column in trace file)

required
fold int

The fold nr (column in trace file)

required
iteration int

The iteration nr (column in trace file). If None, the best (selected) iteration will be searched (slow), according to the selection criteria implemented in OpenMLRunTrace.get_selected_iteration

None

Returns:

Type Description
model
Source code in openml/runs/functions.py
def initialize_model_from_trace(
    run_id: int,
    repeat: int,
    fold: int,
    iteration: int | None = None,
) -> Any:
    """
    Initialize a model based on the parameters that were set
    by an optimization procedure (i.e., using the exact same
    parameter settings)

    Parameters
    ----------
    run_id : int
        The OpenML run_id. Should contain a trace file,
        otherwise an OpenMLServerException is raised

    repeat : int
        The repeat nr (column in trace file)

    fold : int
        The fold nr (column in trace file)

    iteration : int
        The iteration nr (column in trace file). If None, the
        best (selected) iteration will be searched (slow),
        according to the selection criteria implemented in
        OpenMLRunTrace.get_selected_iteration

    Returns
    -------
    model
    """
    run = get_run(run_id)
    # TODO(eddiebergman): I imagine this is None if it's not published,
    # might need to raise an explicit error for that
    assert run.flow_id is not None

    flow = get_flow(run.flow_id)
    run_trace = get_run_trace(run_id)

    if iteration is None:
        iteration = run_trace.get_selected_iteration(repeat, fold)

    request = (repeat, fold, iteration)
    if request not in run_trace.trace_iterations:
        raise ValueError("Combination repeat, fold, iteration not available")
    current = run_trace.trace_iterations[(repeat, fold, iteration)]

    search_model = initialize_model_from_run(run_id)
    return flow.extension.instantiate_model_from_hpo_class(search_model, current)
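
A usage sketch (placeholder run id; the run must have an optimization trace on the server):

import openml

# With iteration=None the selected (best) iteration of repeat 0 / fold 0 is used.
model = openml.runs.initialize_model_from_trace(run_id=1234567, repeat=0, fold=0)
print(model)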

list_runs(offset=None, size=None, id=None, task=None, setup=None, flow=None, uploader=None, tag=None, study=None, display_errors=False, output_format='dict', **kwargs)

List all runs matching all of the given filters. (Supports a large number of results.)

Parameters:

Name Type Description Default
offset int

the number of runs to skip, starting from the first

None
size int

the maximum number of runs to show

None
id list
None
task list
None
setup list | None
None
flow list
None
uploader list
None
tag str
None
study int
None
display_errors (bool, optional(default=False))

Whether to list runs which have an error (for example a missing prediction file).

False
output_format Literal['dict', 'dataframe']

The parameter decides the format of the output. - If 'dict' the output is a dict of dict - If 'dataframe' the output is a pandas DataFrame

'dict'
kwargs dict

Legal filter operators: task_type.

{}

Returns:

Type Description
dict of dicts, or dataframe
Source code in openml/runs/functions.py
def list_runs(  # noqa: PLR0913
    offset: int | None = None,
    size: int | None = None,
    id: list | None = None,  # noqa: A002
    task: list[int] | None = None,
    setup: list | None = None,
    flow: list | None = None,
    uploader: list | None = None,
    tag: str | None = None,
    study: int | None = None,
    display_errors: bool = False,  # noqa: FBT001, FBT002
    output_format: Literal["dict", "dataframe"] = "dict",
    **kwargs: Any,
) -> dict | pd.DataFrame:
    """
    List all runs matching all of the given filters.
    (Supports a large number of results)

    Parameters
    ----------
    offset : int, optional
        the number of runs to skip, starting from the first
    size : int, optional
        the maximum number of runs to show

    id : list, optional

    task : list, optional

    setup: list, optional

    flow : list, optional

    uploader : list, optional

    tag : str, optional

    study : int, optional

    display_errors : bool, optional (default=False)
        Whether to list runs which have an error (for example a missing
        prediction file).

    output_format: str, optional (default='dict')
        The parameter decides the format of the output.
        - If 'dict' the output is a dict of dict
        - If 'dataframe' the output is a pandas DataFrame

    kwargs : dict, optional
        Legal filter operators: task_type.

    Returns
    -------
    dict of dicts, or dataframe
    """
    if output_format not in ["dataframe", "dict"]:
        raise ValueError("Invalid output format selected. Only 'dict' or 'dataframe' applicable.")

    # TODO: [0.15]
    if output_format == "dict":
        msg = (
            "Support for `output_format` of 'dict' will be removed in 0.15 "
            "and pandas dataframes will be returned instead. To ensure your code "
            "will continue to work, use `output_format`='dataframe'."
        )
        warnings.warn(msg, category=FutureWarning, stacklevel=2)

    # TODO(eddiebergman): Do we really need this runtime type validation?
    if id is not None and (not isinstance(id, list)):
        raise TypeError("id must be of type list.")
    if task is not None and (not isinstance(task, list)):
        raise TypeError("task must be of type list.")
    if setup is not None and (not isinstance(setup, list)):
        raise TypeError("setup must be of type list.")
    if flow is not None and (not isinstance(flow, list)):
        raise TypeError("flow must be of type list.")
    if uploader is not None and (not isinstance(uploader, list)):
        raise TypeError("uploader must be of type list.")

    return openml.utils._list_all(  # type: ignore
        list_output_format=output_format,  # type: ignore
        listing_call=_list_runs,
        offset=offset,
        size=size,
        id=id,
        task=task,
        setup=setup,
        flow=flow,
        uploader=uploader,
        tag=tag,
        study=study,
        display_errors=display_errors,
        **kwargs,
    )
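
A usage sketch (task id 31 is used only as an example filter; requesting a dataframe avoids the FutureWarning about the 'dict' output format):

import openml

runs_df = openml.runs.list_runs(task=[31], size=100, output_format="dataframe")
print(len(runs_df))
print(runs_df.head())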

run_exists(task_id, setup_id)

Checks whether a task/setup combination is already present on the server.

Parameters:

Name Type Description Default
task_id int
required
setup_id int
required

Returns:

Type Description
Set of run ids for runs where flow setup_id was run on task_id. Empty set if it wasn't run yet.

Source code in openml/runs/functions.py
def run_exists(task_id: int, setup_id: int) -> set[int]:
    """Checks whether a task/setup combination is already present on the
    server.

    Parameters
    ----------
    task_id : int

    setup_id : int

    Returns
    -------
        Set of run ids for runs where flow setup_id was run on task_id. Empty
        set if it wasn't run yet.
    """
    if setup_id <= 0:
        # openml setups are in range 1-inf
        return set()

    try:
        result = list_runs(task=[task_id], setup=[setup_id], output_format="dataframe")
        assert isinstance(result, pd.DataFrame)  # TODO(eddiebergman): Remove once #1299
        return set() if result.empty else set(result["run_id"])
    except OpenMLServerException as exception:
        # error code implies no results. The run does not exist yet
        if exception.code != ERROR_CODE:
            raise exception
        return set()
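
A usage sketch (the task and setup ids are placeholders):

import openml

existing = openml.runs.run_exists(task_id=31, setup_id=12)  # placeholder ids
if existing:
    print("This setup was already run on the task:", sorted(existing))
else:
    print("No runs found for this task/setup combination.")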

run_flow_on_task(flow, task, avoid_duplicate_runs=True, flow_tags=None, seed=None, add_local_measures=True, upload_flow=False, dataset_format='dataframe', n_jobs=None)

Run the model provided by the flow on the dataset defined by task.

Takes the flow and repeat information into account. The Flow may optionally be published.

Parameters:

Name Type Description Default
flow OpenMLFlow

A flow wraps a machine learning model together with relevant information. The model has functions fit(X, Y) and predict(X); all supervised estimators of scikit-learn follow this definition of a model (https://scikit-learn.org/stable/tutorial/statistical_inference/supervised_learning.html)

required
task OpenMLTask

Task to perform. This may be an OpenMLFlow instead if the first argument is an OpenMLTask.

required
avoid_duplicate_runs (bool, optional(default=True))

If True, the run will throw an error if the setup/task combination is already present on the server. This feature requires an internet connection.

True
flow_tags (List[str], optional(default=None))

A list of tags that the flow should have at creation.

None
seed int | None

Models that are not seeded will get this seed.

None
add_local_measures (bool, optional(default=True))

Determines whether to calculate a set of evaluation measures locally, to later verify server behaviour.

True
upload_flow bool(default=False)

If True, upload the flow to OpenML if it does not exist yet. If False, do not upload the flow to OpenML.

False
dataset_format str(default='dataframe')

If 'array', the dataset is passed to the model as a numpy array. If 'dataframe', the dataset is passed to the model as a pandas dataframe.

'dataframe'
n_jobs int(default=None)

The number of processes/threads to distribute the evaluation asynchronously. If None or 1, then the evaluation is treated as synchronous and processed sequentially. If -1, then the job uses as many cores as are available.

None

Returns:

Name Type Description
run OpenMLRun

Result of the run.

Source code in openml/runs/functions.py
def run_flow_on_task(  # noqa: C901, PLR0912, PLR0915, PLR0913
    flow: OpenMLFlow,
    task: OpenMLTask,
    avoid_duplicate_runs: bool = True,  # noqa: FBT002, FBT001
    flow_tags: list[str] | None = None,
    seed: int | None = None,
    add_local_measures: bool = True,  # noqa: FBT001, FBT002
    upload_flow: bool = False,  # noqa: FBT001, FBT002
    dataset_format: Literal["array", "dataframe"] = "dataframe",
    n_jobs: int | None = None,
) -> OpenMLRun:
    """Run the model provided by the flow on the dataset defined by task.

    Takes the flow and repeat information into account.
    The Flow may optionally be published.

    Parameters
    ----------
    flow : OpenMLFlow
        A flow wraps a machine learning model together with relevant information.
        The model has functions fit(X, Y) and predict(X);
        all supervised estimators of scikit-learn follow this definition of a model
        (https://scikit-learn.org/stable/tutorial/statistical_inference/supervised_learning.html)
    task : OpenMLTask
        Task to perform. This may be an OpenMLFlow instead if the first argument is an OpenMLTask.
    avoid_duplicate_runs : bool, optional (default=True)
        If True, the run will throw an error if the setup/task combination is already present on
        the server. This feature requires an internet connection.
    flow_tags : List[str], optional (default=None)
        A list of tags that the flow should have at creation.
    seed: int, optional (default=None)
        Models that are not seeded will get this seed.
    add_local_measures : bool, optional (default=True)
        Determines whether to calculate a set of evaluation measures locally,
        to later verify server behaviour.
    upload_flow : bool (default=False)
        If True, upload the flow to OpenML if it does not exist yet.
        If False, do not upload the flow to OpenML.
    dataset_format : str (default='dataframe')
        If 'array', the dataset is passed to the model as a numpy array.
        If 'dataframe', the dataset is passed to the model as a pandas dataframe.
    n_jobs : int (default=None)
        The number of processes/threads to distribute the evaluation asynchronously.
        If `None` or `1`, then the evaluation is treated as synchronous and processed sequentially.
        If `-1`, then the job uses as many cores as are available.

    Returns
    -------
    run : OpenMLRun
        Result of the run.
    """
    if flow_tags is not None and not isinstance(flow_tags, list):
        raise ValueError("flow_tags should be a list")

    # TODO: At some point in the future do not allow for arguments in old order (changed 6-2018).
    # Flexibility currently still allowed due to code-snippet in OpenML100 paper (3-2019).
    if isinstance(flow, OpenMLTask) and isinstance(task, OpenMLFlow):
        # We want to allow either order of argument (to avoid confusion).
        warnings.warn(
            "The old argument order (Flow, model) is deprecated and "
            "will not be supported in the future. Please use the "
            "order (model, Flow).",
            DeprecationWarning,
            stacklevel=2,
        )
        task, flow = flow, task

    if task.task_id is None:
        raise ValueError("The task should be published at OpenML")

    if flow.model is None:
        flow.model = flow.extension.flow_to_model(flow)

    flow.model = flow.extension.seed_model(flow.model, seed=seed)

    # We only need to sync with the server right now if we want to upload the flow,
    # or ensure no duplicate runs exist. Otherwise it can be synced at upload time.
    flow_id = None
    if upload_flow or avoid_duplicate_runs:
        flow_id = flow_exists(flow.name, flow.external_version)
        if isinstance(flow.flow_id, int) and flow_id != flow.flow_id:
            if flow_id is not False:
                raise PyOpenMLError(
                    "Local flow_id does not match server flow_id: "
                    f"'{flow.flow_id}' vs '{flow_id}'",
                )
            raise PyOpenMLError(
                "Flow does not exist on the server, but 'flow.flow_id' is not None."
            )
        if upload_flow and flow_id is False:
            flow.publish()
            flow_id = flow.flow_id
        elif flow_id:
            flow_from_server = get_flow(flow_id)
            _copy_server_fields(flow_from_server, flow)
            if avoid_duplicate_runs:
                flow_from_server.model = flow.model
                setup_id = setup_exists(flow_from_server)
                ids = run_exists(task.task_id, setup_id)
                if ids:
                    error_message = (
                        "One or more runs of this setup were already performed on the task."
                    )
                    raise OpenMLRunsExistError(ids, error_message)
        else:
            # Flow does not exist on server and we do not want to upload it.
            # No sync with the server happens.
            flow_id = None

    dataset = task.get_dataset()

    run_environment = flow.extension.get_version_information()
    tags = ["openml-python", run_environment[1]]

    if flow.extension.check_if_model_fitted(flow.model):
        warnings.warn(
            "The model is already fitted!"
            " This might cause inconsistency in comparison of results.",
            RuntimeWarning,
            stacklevel=2,
        )

    # execute the run
    res = _run_task_get_arffcontent(
        model=flow.model,
        task=task,
        extension=flow.extension,
        add_local_measures=add_local_measures,
        dataset_format=dataset_format,
        n_jobs=n_jobs,
    )

    data_content, trace, fold_evaluations, sample_evaluations = res
    fields = [*run_environment, time.strftime("%c"), "Created by run_flow_on_task"]
    generated_description = "\n".join(fields)
    run = OpenMLRun(
        task_id=task.task_id,
        flow_id=flow_id,
        dataset_id=dataset.dataset_id,
        model=flow.model,
        flow_name=flow.name,
        tags=tags,
        trace=trace,
        data_content=data_content,
        flow=flow,
        setup_string=flow.extension.create_setup_string(flow.model),
        description_text=generated_description,
    )

    if (upload_flow or avoid_duplicate_runs) and flow.flow_id is not None:
        # We only extract the parameter settings if a sync happened with the server.
        # I.e. when the flow was uploaded or we found it in the avoid_duplicate check.
        # Otherwise, we will do this at upload time.
        run.parameter_settings = flow.extension.obtain_parameter_values(flow)

    # now we need to attach the detailed evaluations
    if task.task_type_id == TaskType.LEARNING_CURVE:
        run.sample_evaluations = sample_evaluations
    else:
        run.fold_evaluations = fold_evaluations

    if flow_id:
        message = f"Executed Task {task.task_id} with Flow id:{run.flow_id}"
    else:
        message = f"Executed Task {task.task_id} on local Flow with name {flow.name}."
    config.logger.info(message)

    return run
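
A usage sketch assuming the bundled scikit-learn extension is available; task id 31 and the model are only examples, and nothing is uploaded unless run.publish() is called afterwards:

import openml
from sklearn.tree import DecisionTreeClassifier

from openml.extensions.sklearn import SklearnExtension

flow = SklearnExtension().model_to_flow(DecisionTreeClassifier(max_depth=3))
task = openml.tasks.get_task(31)  # example classification task

run = openml.runs.run_flow_on_task(
    flow,
    task,
    avoid_duplicate_runs=False,  # no server check, so no API key needed
    upload_flow=False,
)
print(run)
# run.publish()  # would upload predictions; requires openml.config.apikey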

run_model_on_task(model, task, avoid_duplicate_runs=True, flow_tags=None, seed=None, add_local_measures=True, upload_flow=False, return_flow=False, dataset_format='dataframe', n_jobs=None)

Run the model on the dataset defined by the task.

Parameters:

Name Type Description Default
model sklearn model

A model which has functions fit(X, Y) and predict(X); all supervised estimators of scikit-learn follow this definition of a model (https://scikit-learn.org/stable/tutorial/statistical_inference/supervised_learning.html)

required
task OpenMLTask or int or str

Task to perform or Task id. This may be a model instead if the first argument is an OpenMLTask.

required
avoid_duplicate_runs (bool, optional(default=True))

If True, the run will throw an error if the setup/task combination is already present on the server. This feature requires an internet connection.

True
flow_tags (List[str], optional(default=None))

A list of tags that the flow should have at creation.

None
seed int | None

Models that are not seeded will get this seed.

None
add_local_measures (bool, optional(default=True))

Determines whether to calculate a set of evaluation measures locally, to later verify server behaviour.

True
upload_flow bool(default=False)

If True, upload the flow to OpenML if it does not exist yet. If False, do not upload the flow to OpenML.

False
return_flow bool(default=False)

If True, returns the OpenMLFlow generated from the model in addition to the OpenMLRun.

False
dataset_format str(default='dataframe')

If 'array', the dataset is passed to the model as a numpy array. If 'dataframe', the dataset is passed to the model as a pandas dataframe.

'dataframe'
n_jobs int(default=None)

The number of processes/threads to distribute the evaluation asynchronously. If None or 1, then the evaluation is treated as synchronous and processed sequentially. If -1, then the job uses as many cores as are available.

None

Returns:

Name Type Description
run OpenMLRun

Result of the run.

flow OpenMLFlow (optional, only if `return_flow` is True).

Flow generated from the model.

Source code in openml/runs/functions.py
def run_model_on_task(  # noqa: PLR0913
    model: Any,
    task: int | str | OpenMLTask,
    avoid_duplicate_runs: bool = True,  # noqa: FBT001, FBT002
    flow_tags: list[str] | None = None,
    seed: int | None = None,
    add_local_measures: bool = True,  # noqa: FBT001, FBT002
    upload_flow: bool = False,  # noqa: FBT001, FBT002
    return_flow: bool = False,  # noqa: FBT001, FBT002
    dataset_format: Literal["array", "dataframe"] = "dataframe",
    n_jobs: int | None = None,
) -> OpenMLRun | tuple[OpenMLRun, OpenMLFlow]:
    """Run the model on the dataset defined by the task.

    Parameters
    ----------
    model : sklearn model
        A model which has functions fit(X, Y) and predict(X);
        all supervised estimators of scikit-learn follow this definition of a model
        (https://scikit-learn.org/stable/tutorial/statistical_inference/supervised_learning.html)
    task : OpenMLTask or int or str
        Task to perform or Task id.
        This may be a model instead if the first argument is an OpenMLTask.
    avoid_duplicate_runs : bool, optional (default=True)
        If True, the run will throw an error if the setup/task combination is already present on
        the server. This feature requires an internet connection.
    flow_tags : List[str], optional (default=None)
        A list of tags that the flow should have at creation.
    seed: int, optional (default=None)
        Models that are not seeded will get this seed.
    add_local_measures : bool, optional (default=True)
        Determines whether to calculate a set of evaluation measures locally,
        to later verify server behaviour.
    upload_flow : bool (default=False)
        If True, upload the flow to OpenML if it does not exist yet.
        If False, do not upload the flow to OpenML.
    return_flow : bool (default=False)
        If True, returns the OpenMLFlow generated from the model in addition to the OpenMLRun.
    dataset_format : str (default='dataframe')
        If 'array', the dataset is passed to the model as a numpy array.
        If 'dataframe', the dataset is passed to the model as a pandas dataframe.
    n_jobs : int (default=None)
        The number of processes/threads to distribute the evaluation asynchronously.
        If `None` or `1`, then the evaluation is treated as synchronous and processed sequentially.
        If `-1`, then the job uses as many cores as are available.

    Returns
    -------
    run : OpenMLRun
        Result of the run.
    flow : OpenMLFlow (optional, only if `return_flow` is True).
        Flow generated from the model.
    """
    if avoid_duplicate_runs and not config.apikey:
        warnings.warn(
            "avoid_duplicate_runs is set to True, but no API key is set. "
            "Please set your API key in the OpenML configuration file, see"
            "https://openml.github.io/openml-python/main/examples/20_basic/introduction_tutorial"
            ".html#authentication for more information on authentication.",
            RuntimeWarning,
            stacklevel=2,
        )

    # TODO: At some point in the future do not allow for arguments in old order (6-2018).
    # Flexibility currently still allowed due to code-snippet in OpenML100 paper (3-2019).
    # When removing this please also remove the method `is_estimator` from the extension
    # interface as it is only used here (MF, 3-2019)
    if isinstance(model, (int, str, OpenMLTask)):
        warnings.warn(
            "The old argument order (task, model) is deprecated and "
            "will not be supported in the future. Please use the "
            "order (model, task).",
            DeprecationWarning,
            stacklevel=2,
        )
        task, model = model, task

    extension = get_extension_by_model(model, raise_if_no_extension=True)
    if extension is None:
        # This should never happen and is only here to please mypy will be gone soon once the
        # whole function is removed
        raise TypeError(extension)

    flow = extension.model_to_flow(model)

    def get_task_and_type_conversion(_task: int | str | OpenMLTask) -> OpenMLTask:
        """Retrieve an OpenMLTask object from either an integer or string ID,
        or directly from an OpenMLTask object.

        Parameters
        ----------
        _task : Union[int, str, OpenMLTask]
            The task ID or the OpenMLTask object.

        Returns
        -------
        OpenMLTask
            The OpenMLTask object.
        """
        if isinstance(_task, (int, str)):
            return get_task(int(_task))  # type: ignore

        return _task

    task = get_task_and_type_conversion(task)

    run = run_flow_on_task(
        task=task,
        flow=flow,
        avoid_duplicate_runs=avoid_duplicate_runs,
        flow_tags=flow_tags,
        seed=seed,
        add_local_measures=add_local_measures,
        upload_flow=upload_flow,
        dataset_format=dataset_format,
        n_jobs=n_jobs,
    )
    if return_flow:
        return run, flow
    return run
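
A usage sketch (task id 31 and the model are only examples; the run stays local unless run.publish() is called):

import openml
from sklearn.tree import DecisionTreeClassifier

task = openml.tasks.get_task(31)  # example classification task
clf = DecisionTreeClassifier(max_depth=3)

run = openml.runs.run_model_on_task(clf, task, avoid_duplicate_runs=False)
print(run)
# run.publish()  # requires openml.config.apikey to be set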