This is a tutorial on how to query a Brick schema (`.ttl` file) with SPARQL and retrieve the UUID of the related timeseries from a TimescaleDB instance deployed locally in a Docker container. It is based on the code contained in the repository brick-data-retrieval-demo and explained in this tutorial video. The following figure shows the general infrastructure setup.

A timeseries database is deployed in a Docker container, and data from static CSV files are loaded into the database. The data exploration process is performed through the Python script `main.py`: a basic SPARQL query is run against a Brick schema, and the matching data are retrieved from TimescaleDB.
- Create a virtual environment in the root folder and activate it:

  ```shell
  python3 -m venv venv
  source venv/bin/activate
  ```
- Install the necessary packages in the virtual environment from the `requirements.txt` file:

  ```shell
  pip install -r requirements.txt
  ```
- Create a `.env` file in the root directory and set up the environment variables. All the scripts read their secrets from this file, which should follow the structure of the `example.env` file:

  ```shell
  touch .env
  printf "TIMESCALEDB_HOST=localhost\nTIMESCALEDB_PORT=5432\nTIMESCALEDB_USER=postgres\nTIMESCALEDB_PSW=mypassword\nPOSTGRES_PASSWORD=mypassword\nVOLUME_NAME=4f31cadb[...]0eab\n" > .env
  ```
- Unzip the data files contained in `data.zip` with the following command:

  ```shell
  unzip data.zip
  ```
- Start the Docker container with the timeseries database. This requires Docker to be installed on your machine. Create a network and a container running TimescaleDB with the following command:

  ```shell
  ./scripts/start_docker_containers.sh
  ```
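  For orientation, a script like this usually boils down to a `docker network create` plus a `docker run`. This is a hypothetical sketch, not the repository's script: the network name, container name, and image tag are assumptions — check `scripts/start_docker_containers.sh` for the real values.

  ```shell
  #!/usr/bin/env bash
  # Hypothetical sketch of a TimescaleDB startup script (names are assumptions).
  set -euo pipefail
  source .env

  docker network create brick-net 2>/dev/null || true  # reuse network if it exists

  docker run -d --name timescaledb \
    --network brick-net \
    -p "${TIMESCALEDB_PORT}:5432" \
    -e POSTGRES_PASSWORD="${POSTGRES_PASSWORD}" \
    -v "${VOLUME_NAME}:/var/lib/postgresql/data" \
    timescale/timescaledb:latest-pg14
  ```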
- Check whether the Docker containers are running:

  ```shell
  docker ps
  ```
- The first time, load the timeseries data contained in the `data` folder into TimescaleDB by running the following command:

  ```shell
  ./scripts/setup_docker_timescaledb.sh
  ```

  This command creates the database schema and tables based on the script `schema.sql`.
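  In TimescaleDB such a schema script typically creates a regular table and converts it into a hypertable partitioned on the time column. The fragment below is a hypothetical sketch, not the repository's `schema.sql` — the table and column names are assumptions:

  ```sql
  -- Hypothetical sketch of a TimescaleDB schema (names are assumptions).
  CREATE TABLE IF NOT EXISTS timeseries (
      uuid  UUID             NOT NULL,  -- id referenced from the Brick model
      ts    TIMESTAMPTZ      NOT NULL,
      value DOUBLE PRECISION
  );
  SELECT create_hypertable('timeseries', 'ts', if_not_exists => TRUE);
  ```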
Some useful tips for deploying the services and developing the code:
- Persistent storage: to avoid reloading the CSV files each time you create the container, I suggest creating a volume and mounting it every time a new container is created. Once the container is running, get the current volume name or mount path with this command:

  ```shell
  docker inspect timescaledb --format='{{range .Mounts }}{{.Name}}{{end}}'
  # should be something like -> 069ba64815f0c26783b81a5f0ca813227fde8491f429cf77ed9a5ae3536c0b2c
  ```

  Copy the volume name into the `VOLUME_NAME` environment variable in the `.env` file. Now that you know the name of the volume, you can mount it on the next run using the following script:

  ```shell
  ./scripts/start_docker_containers.sh
  ```
- Requirements: export the requirements using `pipreqs`, which scans the `.py` files in the project and generates the `requirements.txt` file. I prefer this method over `pip freeze` because it avoids listing unnecessary or conflicting requirements:

  ```shell
  pipreqs . --force
  ```
- Shutdown: the `scripts/cleanup_docker_containers.sh` script will delete the Docker containers and network once you are finished:

  ```shell
  ./scripts/cleanup_docker_containers.sh
  ```