By Theofilos Ioannidis (tioannid [at] di [dot] uoa [dot] gr)
Containers are not ideal for the kind of benchmarking that the GeoRDFBench Framework performs, because they do not allow clearing the system caches. The reason for this is that:
Therefore, in the following example, although the user can verify that the experiments run properly and results are correctly calculated and reported, the COLD cache response times will not be accurate. However, for experiments that do not require COLD cache response time measurements, e.g., macro benchmark scenarios, response times should be accurate enough for drawing basic conclusions.
This example features:
geordfbench_nuc_rdf4j.zip is an archive that contains the files needed to build a Docker image. When run, the image creates a container that executes the Scalability-10K workload with RDF4J on the NUC8i7BEH host (for simplicity, this example does not allocate memory, CPUs, an IP address, or a hostname for the container). We assume that the archive was downloaded as ~/Downloads/geordfbench_nuc_rdf4j.zip and has been copied to /data, where we uncompress it, build the image, and run a container:
/data$ unzip geordfbench_nuc_rdf4j.zip
/data$ cd geordfbench_nuc_rdf4j
/data/geordfbench_nuc_rdf4j$ docker build -t geordfbench_nuc_rdf4j .
/data/geordfbench_nuc_rdf4j$ docker run -it --name test geordfbench_nuc_rdf4j /bin/bash
The default terminal will act as a log window and after some time the experiment will end with:
...
144431 [main] INFO GenericExprerimentResultsCollector - Cache COLD
144431 [main] INFO GenericExprerimentResultsCollector - Query 0
144431 [main] INFO GenericExprerimentResultsCollector - Rep 0 28734018 + 693243213 = 721977231 nsecs, 554 results, 0 scan errors
144431 [main] INFO GenericExprerimentResultsCollector - Rep 1 4034276 + 309748342 = 313782618 nsecs, 554 results, 0 scan errors
144431 [main] INFO GenericExprerimentResultsCollector - Rep 2 3005777 + 243356481 = 246362258 nsecs, 554 results, 0 scan errors
144431 [main] INFO GenericExprerimentResultsCollector - Query 1
144432 [main] INFO GenericExprerimentResultsCollector - Rep 0 17586203 + 239027786 = 256613989 nsecs, 2 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Rep 1 2819656 + 154320599 = 157140255 nsecs, 2 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Rep 2 5752366 + 166884988 = 172637354 nsecs, 2 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Query 2
144432 [main] INFO GenericExprerimentResultsCollector - Rep 0 2609175 + 155153038 = 157762213 nsecs, 2 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Rep 1 2548615 + 154570164 = 157118779 nsecs, 2 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Rep 2 3796632 + 149082904 = 152879536 nsecs, 2 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Cache WARM
144432 [main] INFO GenericExprerimentResultsCollector - Query 0
144432 [main] INFO GenericExprerimentResultsCollector - Rep 0 520773 + 220539046 = 221059819 nsecs, 554 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Rep 1 423052 + 173751455 = 174174507 nsecs, 554 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Rep 2 498243 + 185245195 = 185743438 nsecs, 554 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Query 1
144432 [main] INFO GenericExprerimentResultsCollector - Rep 0 1227311 + 121353249 = 122580560 nsecs, 2 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Rep 1 1387298 + 116242473 = 117629771 nsecs, 2 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Rep 2 1145577 + 120575113 = 121720690 nsecs, 2 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Query 2
144432 [main] INFO GenericExprerimentResultsCollector - Rep 0 1015724 + 137564542 = 138580266 nsecs, 2 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Rep 1 1135437 + 145388424 = 146523861 nsecs, 2 results, 0 scan errors
144432 [main] INFO GenericExprerimentResultsCollector - Rep 2 963440 + 144746807 = 145710247 nsecs, 2 results, 0 scan errors
144432 [main] INFO RunRDF4JExperimentWorkload - End ScalabilityFunc
Start time = Fri Jun 23 20:45:21 UTC 2023
End time = Fri Jun 23 20:47:46 UTC 2023
From another terminal we can connect to the test container and check the results in the geographica3 database in PostgreSQL:
/data$ docker exec -it test /bin/bash
root@9185155c6a9c:/data# su postgres
postgres@9185155c6a9c:/data$ psql
psql (14.8 (Ubuntu 14.8-0ubuntu0.22.04.1))
Type "help" for help.
postgres=# \l
List of databases
Name | Owner | Encoding | Collate | Ctype | Access privileges
--------------+--------------+----------+---------+---------+-----------------------
geographica3 | geographica3 | UTF8 | C.UTF-8 | C.UTF-8 |
postgres | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
template0 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
(4 rows)
postgres=# \c geographica3
You are now connected to database "geographica3" as user "postgres".
geographica3=# \d+
List of relations
Schema | Name | Type | Owner | Persistence | Access method | Size | Description
--------+----------------------------------+----------+--------------+-------------+---------------+------------+-------------
public | EXPERIMENT | table | geographica3 | permanent | heap | 16 kB |
public | EXPERIMENT_id_seq | sequence | geographica3 | permanent | | 8192 bytes |
public | QUERYEXECUTION | table | geographica3 | permanent | heap | 8192 bytes |
public | QUERYEXECUTION_experiment_id_seq | sequence | geographica3 | permanent | | 8192 bytes |
public | QUERYEXECUTION_id_seq | sequence | geographica3 | permanent | | 8192 bytes |
public | vquery_ordered_aggrs | view | postgres | permanent | | 0 bytes |
public | vquery_ordered_aggrs2 | view | postgres | permanent | | 0 bytes |
public | vqueryexecution | view | postgres | permanent | | 0 bytes |
public | vqueryexecution2 | view | postgres | permanent | | 0 bytes |
(9 rows)
geographica3=# select * from "EXPERIMENT";
id | instime | exectime | description | host | os |
sut | queryset | dataset | executionspec | reportspec | type
----+----------------------------+----------------------------+-----------------------------------+-----------------------------------------------------------------------------+--------------------------------+-
---------+-----------------+-----------------+---------------------------------------------------------------------------------------------------------------+------------------+-----------------
1 | 2023-06-23 20:45:31.172+00 | 2023-06-23 20:45:31.154+00 | 2023-08-05_RDF4JSUT_RunWL_Scal10K | SimpleHost{ NUC8i7BEH, 192.168.1.44, 32GB, GenericLinuxOS{ Ubuntu-jammy } } | GenericLinuxOS{ Ubuntu-jammy } |
RDF4JSUT | scalabilityFunc | scalability_10K | SimpleES{ COLD=3, WARM=3, action=RUN, maxduration=604800 secs, repmaxduration=86400 secs, func=QUERY_MEDIAN } | SimpleReportSpec | ScalabilityFunc
(1 row)
geographica3=# select * from vquery_ordered_aggrs;
experiment_id | query_no | cache_type | no_iterations | mean | median
---------------+----------+------------+---------------+-------+--------
1 | 0 | COLD | 3 | 0.427 | 0.314
1 | 0 | WARM | 3 | 0.194 | 0.186
1 | 1 | COLD | 3 | 0.196 | 0.173
1 | 1 | WARM | 3 | 0.121 | 0.122
1 | 2 | COLD | 3 | 0.156 | 0.157
1 | 2 | WARM | 3 | 0.144 | 0.146
(6 rows)
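As a cross-check, the aggregates reported by vquery_ordered_aggrs can be recomputed from the "Rep" lines of the log shown earlier. The following Python sketch is not part of GeoRDFBench: the function name aggregate and the regular expressions are illustrative assumptions, inferred only from the log format visible in this example. It takes the total time of each repetition (the value after "="), converts nanoseconds to seconds, and rounds to three decimals.

```python
import re
from statistics import mean, median

# Patterns inferred from the log excerpt in this example (an assumption,
# not an official GeoRDFBench log specification).
CACHE_RE = re.compile(r"-\s+Cache\s+(COLD|WARM)")
QUERY_RE = re.compile(r"-\s+Query\s+(\d+)")
REP_RE = re.compile(r"Rep\s+\d+\s+(\d+)\s+\+\s+(\d+)\s+=\s+(\d+)\s+nsecs")


def aggregate(log_text):
    """Return {(query_no, cache_type): (mean_secs, median_secs)}."""
    totals = {}
    cache = None
    query = None
    for line in log_text.splitlines():
        m = CACHE_RE.search(line)
        if m:
            cache = m.group(1)
            continue
        m = QUERY_RE.search(line)
        if m:
            query = int(m.group(1))
            continue
        m = REP_RE.search(line)
        if m:
            # Only the total (third number) is aggregated here.
            total_nsecs = int(m.group(3))
            totals.setdefault((query, cache), []).append(total_nsecs)
    return {
        key: (round(mean(vals) / 1e9, 3), round(median(vals) / 1e9, 3))
        for key, vals in totals.items()
    }


# Lines copied from the log excerpt above (query 0, COLD caches).
SAMPLE_LOG = """\
144431 [main] INFO GenericExprerimentResultsCollector - Cache COLD
144431 [main] INFO GenericExprerimentResultsCollector - Query 0
144431 [main] INFO GenericExprerimentResultsCollector - Rep 0 28734018 + 693243213 = 721977231 nsecs, 554 results, 0 scan errors
144431 [main] INFO GenericExprerimentResultsCollector - Rep 1 4034276 + 309748342 = 313782618 nsecs, 554 results, 0 scan errors
144431 [main] INFO GenericExprerimentResultsCollector - Rep 2 3005777 + 243356481 = 246362258 nsecs, 554 results, 0 scan errors
"""

if __name__ == "__main__":
    print(aggregate(SAMPLE_LOG))
```

For query 0 with COLD caches this gives a mean of 0.427 and a median of 0.314, the same values as in the view. Means recomputed this way can differ from the view in the last digit, presumably depending on where the view rounds.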
Afterwards we can verify the result files generated in the filesystem:
geographica3=# \q
postgres@f9a1d01d4750:/data$ exit
exit
root@f9a1d01d4750:/data# tree /data/Results_Store
/data/Results_Store
`-- RDF4JSUT
    `-- 2023-08-05_RDF4JSUT_RunWL_Scal10K
        `-- Scalability
            `-- 10K
                `-- RDF4JSUT-ExperimentWorkload
                    |-- 00-SC1_Geometries_Intersects_GivenPolygon-cold
                    |-- 00-SC1_Geometries_Intersects_GivenPolygon-cold-long
                    |-- 00-SC1_Geometries_Intersects_GivenPolygon-warm
                    |-- 00-SC1_Geometries_Intersects_GivenPolygon-warm-long
                    |-- 01-SC2_Intensive_Geometries_Intersect_Geometries-cold
                    |-- 01-SC2_Intensive_Geometries_Intersect_Geometries-cold-long
                    |-- 01-SC2_Intensive_Geometries_Intersect_Geometries-warm
                    |-- 01-SC2_Intensive_Geometries_Intersect_Geometries-warm-long
                    |-- 02-SC3_Relaxed_Geometries_Intersect_Geometries-cold
                    |-- 02-SC3_Relaxed_Geometries_Intersect_Geometries-cold-long
                    |-- 02-SC3_Relaxed_Geometries_Intersect_Geometries-warm
                    `-- 02-SC3_Relaxed_Geometries_Intersect_Geometries-warm-long
5 directories, 12 files
root@f9a1d01d4750:/data# more /data/Results_Store/RDF4JSUT/2023-08-05_RDF4JSUT_RunWL_Scal10K/Scalability/10K/RDF4JSUT-ExperimentWorkload/00-SC1_Geometries_Intersects_GivenPolygon-cold
554 341338164
root@f9a1d01d4750:/data# more /data/Results_Store/RDF4JSUT/2023-08-05_RDF4JSUT_RunWL_Scal10K/Scalability/10K/RDF4JSUT-ExperimentWorkload/00-SC1_Geometries_Intersects_GivenPolygon-cold-long
554 37442204 739450536 776892740
554 1730537 339607627 341338164
554 8631253 258019229 266650482
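The two files shown above appear to be related as follows: each row of the "-long" file holds a result count, two partial times, and their sum for one repetition, while the short file holds the result count and the median of those sums. This reading is an assumption based on the numbers in this example and on func=QUERY_MEDIAN in the execution specification; it is not taken from the GeoRDFBench sources. A minimal Python sketch (the name summarize_long is hypothetical) that checks it:

```python
from statistics import median


def summarize_long(text):
    """Return (result_count, median_total_nsecs) for the content of a '-long' file.

    Assumed row layout: <results> <time part 1> <time part 2> <total>, in nsecs.
    """
    rows = [
        [int(field) for field in line.split()]
        for line in text.splitlines()
        if line.strip()
    ]
    for results, part1, part2, total in rows:
        # Sanity check: the last column should be the sum of the two parts.
        if part1 + part2 != total:
            raise ValueError(f"inconsistent row: {results} {part1} {part2} {total}")
    counts = {row[0] for row in rows}
    if len(counts) != 1:
        raise ValueError("repetitions returned different result counts")
    return rows[0][0], median([row[3] for row in rows])


# Content of 00-SC1_Geometries_Intersects_GivenPolygon-cold-long as shown above.
SAMPLE_LONG = """\
554 37442204 739450536 776892740
554 1730537 339607627 341338164
554 8631253 258019229 266650482
"""

if __name__ == "__main__":
    print(summarize_long(SAMPLE_LONG))
```

For the sample rows the function returns 554 results and a median of 341338164 nsecs, which agrees with the single line of the corresponding "-cold" file.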
When you are done with the Docker container, you can stop and remove it with:
/data$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f9a1d01d4750 geordfbench_nuc_rdf4j "/bin/sh -c '/data/s…" 12 minutes ago Up 12 minutes 5432/tcp test
/data$ docker rm -f test
test
Interested users can look at the simple Bash script /data/startUpScript.sh, which is the entry point defined in the Dockerfile. The actions it takes are: