Athena is a learned optimizer enhancer, proposed to enhance the query optimizer of DBMSs. It is consisted of three key designs: (i) Order-centric plan explorer, (ii) Tree-Mamba plan comparator, and (iii) Time-weighted loss function.
-
Order-centric plan explorer is implemented in Athena_PG.
-
Tree-Mamba plan comparator is implemented in athena_model.py.
-
Time-weighted loss function is implemented in train.py as function
tailr_loss_with_logits.
To ensure usability and reusability, we provide the following two implementation approaches: (1) Docker and (2) manual step-by-step deployment.
-
Install Docker and Docker-compose.
For example, for ubuntu, please refer to the following link:
Install Docker using the apt repository
Notice that if you encountered a network problem, supposing that you have a http proxy at
http://<ip>:<port>, such ashttp://127.0.0.1:7890, you can try the following command. Remember to replace the<ip>and<port>in the following texts to your actual value.export HTTP_PROXY=http://<ip>:<port> export HTTPS_PROXY=$HTTP_PROXY cat <<EOF | sudo tee /etc/systemd/system/docker.service.d/http-proxy.conf > /dev/null [Service] Environment="HTTP_PROXY=$HTTP_PROXY" Environment="HTTPS_PROXY=$HTTP_PROXY" Environment="NO_PROXY=localhost,127.0.0.1" EOF
and add two lines after the line
- POSTGRES_PASSWORD=postgresin the./docker-compose.yml- HTTP_PROXY=http://<ip>:<port> - HTTPS_PROXY=http://<ip>:<port> -
Download datasets
- IMDB
Download the imdb.tgz to
./postgres_service -
Generate configuration for PostgreSQL.
Go to PGTune, generate the configuration of your computer, and save it to
./postgres_service/postgresql.conf. -
Train our model in Docker
docker compose up --build
-
Run other commands
Refer to other commands in section Run. Replace the command property of the athena service with the command you want in
docker-compose.yml. And then rundocker compose up --buildagain.
uv pip install -r requirements.txt
// install mamba in https://github.com/a858438680/TreeMamba.git
git clone https://github.com/a858438680/tree_mamba.git
uv pip install --no-build-isolation ./pythonNo need to install these extensions here, there is guidelines when installing Athena_PG.
Athena support 4 benchmarks: JOB, STATS-CEB, TPC-DS, and DSB.
We have included their workloads into the dir workloads/, where the training and test queries of each benchmark are placed into dir xxx and xxx-sample respectively.
Install Athena_PG before the following steps.
# generate plans for JOB queries as example
python -u src/Athena/generate_dataset.py --database imdb --workload JOBThis code employs Athena's plan explorer to enumerate candidate plans for input queries, executes them, and stores the results into dir datasets as the training samples. The dir datasets is orgranizeda as datasets/{database}/{workload}/{plan-generation-method}/.
We also support to generate training samples of Bao, Lero, Kepler. For example, for Bao, check the file.
# set parameter for torch determistic algorithm
export CUBLAS_WORKSPACE_CONFIG=:4096:8
# train a Tree-Mamba plan comparator for JOB queries as an example
python scripts/run_experiments.py --models athena --workloads JOB --methods athenaThis code will train a Tree-Mamba plan comparator based on the collected training samples in the last step and store the model in the dir models/{model}/{workload}/{plan-generation-methods}/{random-seed}/models/. For this example, the dir to store the model is models/athena/JOB/athena/3407/models/.
This step tests the model based JOB as an example.
- 3.1 Geneate plans for test queries
python -u src/Athena/generate_dataset.py --database imdb --workload JOB-sample --generate_candidate_plansThe code with the option generate_candidate_plans only generates the plans for queries and does not execute them. The generated plans will be also stored in dir datasets/.
- 3.2 Test model
python src/Athena/test.py --dataset imdb/JOB-sample/Athena --model models/athena/JOB/athena/3407/models/199.ptThis code will employ the model to select a plan from the candidate plans of each query and store the indexes of selected plans in the dir results/{database}/{workload}/{plan-generation-method}/{model}.json.
file results/imdb/JOB-sample/Athena/Athena.json.
- 3.3 Execute the selected plans in the last step
python -u scripts/test_model.py --database=imdb --workload=JOB-sample --plan_generation_method=Athena --model=Athena --result_path=AthenaThis code loads the indexes of the selected plan in the last step, executes them, and stores the execution results into dir records/. This code will directly load the records instead of executing the queries next time.
We have stored model selection and execution records for Athena on JOB-sample, so you can just run 3.3 without 3.1 and 3.2.