The Denormalization Migration Project migrates the map's data fetching from the old tables to the new, denormalized tables/schema.
Previously, when rendering the map, we fetched data from the legacy version of the tables (which is still online). The downside of this approach is that it can be quite time-consuming in some cases; as you can see, the SQL is rather complicated:
INNER JOIN (
    SELECT trees.id AS org_tree_id FROM trees
    INNER JOIN (
        SELECT id FROM planter
        JOIN (
            SELECT entity_id FROM getEntityRelationshipChildren(
                (SELECT id FROM entity WHERE map_name = '${this.mapName}')
            )
        ) org ON planter.organization_id = org.entity_id
    ) planter_ids ON trees.planter_id = planter_ids.id
) tree_ids ON tree_region.tree_id = tree_ids.org_tree_id
Please note that getEntityRelationshipChildren is a recursive function, so SQL like this is generally quite slow and hard to optimize. Now, with the denormalized tables in place, we are going to use a simplified query to speed up the web map performance, sketched below.
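For example, the per-map lookup could collapse into a single filter on the denormalized row. This is a minimal sketch using node-postgres; it assumes the map_name JSONB column holds an array of map names and that the jsonb ? (element-exists) operator matches it, neither of which is confirmed by the final schema:

import { Pool } from "pg";

const pool = new Pool(); // connection settings come from the usual PG* env vars

// Fetch capture coordinates for one map straight from the denormalized table.
// No recursive entity traversal is needed any more: the map name is matched
// directly on the row. The `?` operator tests whether the string exists as a
// top-level array element (an assumption about the column's shape).
async function fetchCapturesForMap(mapName: string) {
  const sql = `
    SELECT id, lat, lon
    FROM map_features.raw_capture_feature
    WHERE map_name ? $1
  `;
  const { rows } = await pool.query(sql, [mapName]);
  return rows;
}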
The GitHub Project
We created a project on GitHub to collect the tickets relevant to this task:
The Database
In the PostgreSQL DB, we put the denormalized tables into the schema map_features.
Currently, the map app displays all the captures, including unapproved ones, so for now we just fetch data from raw_capture_feature and raw_capture_cluster (capture_feature holds the approved ones).
For the raw_capture_feature:
treetracker=> \d map_features.raw_capture_feature;
Table "map_features.raw_capture_feature"
Column | Type | Collation | Nullable | Default
---------------------+--------------------------+-----------+----------+---------
id | uuid | | not null |
lat | numeric | | not null |
lon | numeric | | not null |
location | geometry(Point,4326) | | not null |
field_user_id | bigint | | not null |
field_username | character varying | | not null |
device_identifier | character varying | | |
attributes | jsonb | | |
tracking_session_id | uuid | | |
map_name | jsonb | | |
created_at | timestamp with time zone | | not null |
updated_at | timestamp with time zone | | not null |
capture_taken_at | timestamp with time zone | | |
We can get the coordinates of a capture directly from the row; to filter the data by some condition, say planter or organization, we need to parse the JSONB attributes field, as in the sketch below.
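Here is what such a filter might look like, again using node-postgres. The key name organization_id inside attributes is hypothetical and only illustrates the ->> JSONB access pattern; check the real shape of attributes before relying on it:

import { Pool } from "pg";

const pool = new Pool();

// Filter captures by a value stored inside the JSONB `attributes` column.
// `attributes ->> 'organization_id'` extracts the value as text; the key
// name is hypothetical, since the actual attribute keys may differ.
async function fetchCapturesForOrganization(orgId: string) {
  const sql = `
    SELECT id, lat, lon
    FROM map_features.raw_capture_feature
    WHERE attributes ->> 'organization_id' = $1
  `;
  const { rows } = await pool.query(sql, [orgId]);
  return rows;
}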
The API spec
Because we are going to keep the original (online) version of the map app running for a while, the API service needs to offer both the old and the new spec.
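One way to do that is to mount the old and new routes side by side in the same service. This is only a sketch; the route paths and handler names are hypothetical, not the service's actual endpoints:

import express, { Request, Response } from "express";

const app = express();

// Old spec: keeps answering with data from the legacy tables.
function legacyTreesHandler(req: Request, res: Response) {
  res.json({ source: "legacy tables" }); // placeholder response body
}

// New spec: reads from the denormalized map_features schema.
function capturesHandler(req: Request, res: Response) {
  res.json({ source: "map_features.raw_capture_feature" }); // placeholder
}

app.get("/trees", legacyTreesHandler);    // old spec
app.get("/v2/captures", capturesHandler); // new spec

app.listen(3000);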
Currently, there are two services related to this task, the tile server and the query API: