133 lines
3.7 KiB
Markdown
133 lines
3.7 KiB
Markdown
---
|
|
id: xowl
|
|
tags:
|
|
- AWS
|
|
- databases
|
|
- dynamodb
|
|
created: Sunday, June 09, 2024
|
|
---
|
|
|
|
# DynamoDB
|
|
|
|
## Data structure
|
|
|
|
### Non-relational tables
|
|
|
|
DynamoDB is "NoSQL" because it does not support #SQL queries and is
|
|
non-relational meaning there cannot be JOIN operations via
|
|
[foreign_keys](Foreign_keys_in_SQL.md)
|
|
|
|

|
|
|
|
### Primary key
|
|
|
|
Although the data is stored as a table, one of the attributes is a primary key
|
|
and the rest of the attributes are effectively the "value" associated with it.
|
|
|
|
Because DynamoDB is schemaless, other than the primary key, neither the
|
|
attributes or their data types need to be defined beforehand and each item can
|
|
have its own distinct attributes.
|
|
|
|
Each item in the table is uniquely identifiable by its primary key.
|
|
|
|
There are two types of primary key available:
|
|
|
|
- partition key: a simple primary key composed of one attribute only. Because
|
|
the primary key is hash-mapped items can be retrieved very rapidly using the
|
|
primary key. This would be the `personId` alone.
|
|
|
|
- composite key: this comprises a partition key and a _sort key_ both of which
|
|
are attributes. In a table that has a partition key and a sort key, it's
|
|
possible for multiple items to have the same partition key value. However,
|
|
those items must have different sort key values. You could then query by
|
|
either key or both. For instance using the `PersonId` along with `LastName`
|
|
|
|
### Secondary index
|
|
|
|
As well as the index provided by the primary key, you can set one or more
|
|
**secondary indices**. A secondary index lets you query the data in the table
|
|
using an alternate key.
|
|
|
|
A **global secondary index** is useful for querying data that needs to be
|
|
accessed using non-primary key attributes. For example, if you have a Users
|
|
table with `UserID` as the primary key but often need to fetch users by their
|
|
`Email`, a GSI on `Email` would be appropriate.
|
|
|
|
There are also **local secondary indices** but I don't understand the
|
|
difference.
|
|
|
|
## Real example
|
|
|
|
Below is a specification of the DynamoDB table I am using for my time-entries
|
|
project:
|
|
|
|
```json
|
|
{
|
|
"TableName": "TimeEntries",
|
|
"KeyAttributes": {
|
|
"PartitionKey": {
|
|
"AttributeName": "activity_start_end",
|
|
"AttributeType": "S"
|
|
}
|
|
},
|
|
"NonKeyAttributes": [
|
|
{
|
|
"AttributeName": "activity_type",
|
|
"AttributeType": "S"
|
|
},
|
|
{
|
|
"AttributeName": "start",
|
|
"AttributeType": "S"
|
|
},
|
|
{
|
|
"AttributeName": "end",
|
|
"AttributeType": "S"
|
|
},
|
|
{
|
|
"AttributeName": "duration",
|
|
"AttributeType": "N"
|
|
},
|
|
{
|
|
"AttributeName": "description",
|
|
"AttributeType": "S"
|
|
},
|
|
{
|
|
"AttributeName": "year",
|
|
"AttributeType": "S"
|
|
}
|
|
],
|
|
"GlobalSecondaryIndexes": [
|
|
{
|
|
"IndexName": "YearIndex",
|
|
"KeyAttributes": {
|
|
"PartitionKey": {
|
|
"AttributeName": "year",
|
|
"AttributeType": "S"
|
|
},
|
|
"SortKey": {
|
|
"AttributeName": "start",
|
|
"AttributeType": "S"
|
|
}
|
|
},
|
|
"Projection": {
|
|
"ProjectionType": "ALL"
|
|
}
|
|
}
|
|
]
|
|
}
|
|
```
|
|
|
|
This defines the attribute `activity_start_end` as the primary key. This string
|
|
(`S`) value is a concatenation of three attributes, which is a way of ensuring
|
|
each entry for the attribute will be unique.
|
|
|
|
The `NonKeyAttributes` are all the other attributes in addition to the primary
|
|
key. As mentioned these do not actually need to be defined when setting up the
|
|
table but they are listed here for clarity.
|
|
|
|
I have also defined a GSI. This is derived from the `Year` attribute. This will
|
|
group all the items by their `Year`, allowing me to query directly by year but
|
|
also helping to chunk the entries which will make look-ups quicker and less
|
|
expensive.
|
|
|
|
## Related notes
|