Getting Started
This page guides you through getting started with OpenML. It is a good starting point; for more detailed information, refer to the integrations section and the rest of the documentation.
Authentication
- If you use the OpenML API to download datasets, upload results, or create tasks, you need to authenticate. To do so, create an account on the OpenML website and use your API key.
- You can find detailed instructions on how to authenticate in the authentication section.
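The key itself lives in OpenML's config file (`~/.openml/config` by default, as the caching section below also mentions). As a minimal sketch using only the standard library, and with `YOURKEY` standing in for the key copied from your OpenML account page, you could persist it like this:

```python
from pathlib import Path

# Placeholder -- paste the API key from your OpenML account page here.
# Never share or commit a real key.
api_key = "YOURKEY"

# OpenML reads its configuration from ~/.openml/config by default.
config_path = Path.home() / ".openml" / "config"
config_path.parent.mkdir(parents=True, exist_ok=True)

# Only create a fresh config if none exists, so an existing
# configuration is never overwritten.
if not config_path.exists():
    config_path.write_text(f"apikey = {api_key}\n")
```

Alternatively, the key can be set for the current session only via `openml.config.apikey`, as shown later on this page.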
!pip install -q openml
EEG Eye State example
Download the OpenML task for the eeg-eye-state dataset.
# License: BSD 3-Clause
import openml
from sklearn import neighbors
openml.config.start_using_configuration_for_example()
UserWarning: Switching to the test server https://test.openml.org/api/v1/xml to not upload results to the live server. Using the test server may result in reduced performance of the API!
When using the main server instead, make sure your API key is configured. You can do this with the following line of code (uncomment it first). Never share your API key with others.
# openml.config.apikey = 'YOURKEY'
Caching
Downloaded datasets, tasks, runs, and flows are cached locally, so they can be retrieved later without calling the server. As with the API key, the cache directory can be specified either through the config file or through the API:
- Add the line cachedir = 'MYDIR' to the config file, replacing 'MYDIR' with the path to the cache directory. By default, OpenML will use ~/.openml/cache as the cache directory.
- Run the code below, replacing 'YOURDIR' with the path to the cache directory.
# Uncomment and set your OpenML cache directory
# import os
# openml.config.cache_directory = os.path.expanduser('YOURDIR')
task = openml.tasks.get_task(403)
data = openml.datasets.get_dataset(task.dataset_id)
clf = neighbors.KNeighborsClassifier(n_neighbors=5)
run = openml.runs.run_model_on_task(clf, task, avoid_duplicate_runs=False)
# Publish the experiment on OpenML (optional, requires an API key).
# For this tutorial, our configuration publishes to the test server
# so as not to crowd the main server with runs created by examples.
myrun = run.publish()
print(f"kNN on {data.name}: {myrun.openml_url}")
kNN on eeg-eye-state: https://test.openml.org/r/32906
openml.config.stop_using_configuration_for_example()