Data Types
The Schema builder allows you to build up your own complex types by composing other data types, either by using types provided by the library or constructing your own.
Scalars
Scalar fields are the most basic of the data types used in StatelyDB. These data types get mapped into the closest equivalent in the language you’re using. Some of these data types have built-in data validation. Each of these can be imported from @stately-cloud/schema and used as the type of a field.
Type | Description |
---|---|
bool | A boolean indicating true or false |
string | A UTF-8 encoded string |
uint | An unsigned (i.e. never negative) integer up to 64 bits |
int | A signed integer up to 64 bits |
double | A 64-bit floating-point number |
bytes | An arbitrarily sized binary blob |
uuid | 16 bytes representing a UUID |
durationMilliseconds | A signed integer that indicates a duration of time in milliseconds |
durationSeconds | A signed integer that indicates a duration of time in seconds |
timestampMicroseconds | A signed integer that indicates a timestamp since unix epoch in microseconds |
timestampMilliseconds | A signed integer that indicates a timestamp since unix epoch in milliseconds |
timestampSeconds | A signed integer that indicates a timestamp since unix epoch in seconds |
url | A string that is validated to be a well-formed URL |
We’ll add more standard types to the schema builder library as time goes on.
Custom Scalars
StatelyDB schema supports creating your own custom type aliases for cases where you want to use a consistent reference in multiple places. For example, you might want to create a type alias for an identifier that is referenced by multiple Item types. Another example would be re-using a common type with a validation rule, like the format of an email address.
The CourseID and Email types above define their own validation rules using a CEL expression and can be used in place of a raw string that might otherwise be needed. The StudentID is just a nice named alias for a uint to make your schema self-documenting.
Enums
StatelyDB supports defining Enum types that provide a simple mapping of names to numerical values.
It’s recommended to start your enums at 1, not 0. The schema builder will automatically add a 0 value named “UNSET” if one has not already been specified. This is important because 0 is the “zero value” for an enum—if you had a real value at 0, you couldn’t tell the difference between a field of that type being unset or set to the zero value. See Zero Values for more details.
Note: When a field with an Enum type is referenced in a key path template, the key path will use the Enum’s number value, not its string value.
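To make the numbering advice concrete, here is a minimal sketch in plain TypeScript. It does not use the schema builder's own enum API; the Rank values and the key path segment format are hypothetical and only illustrate why 0 is reserved and why the number, not the name, ends up in a key path.

```typescript
// Plain TypeScript stand-in for an enum; NOT the schema builder's API.
// Real values start at 1 so that 0 can only ever mean "nothing was set".
const Rank = {
  UNSET: 0,
  Beginner: 1,
  Intermediate: 2,
  Expert: 3,
} as const;
type Rank = (typeof Rank)[keyof typeof Rank];

// True only when a real value was chosen.
function isRankSet(rank: Rank): boolean {
  return rank !== Rank.UNSET;
}

// Hypothetical key path segment: the numeric value is used, not the name.
function rankKeySegment(rank: Rank): string {
  return `rank-${rank}`;
}

console.log(isRankSet(Rank.UNSET)); // false
console.log(rankKeySegment(Rank.Expert)); // rank-3
```

If Beginner had been assigned 0 instead, isRankSet could not tell a deliberately chosen Beginner apart from a field that was never written.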
Arrays
The arrayOf function can take any other type and turn it into an array (ordered list) of that type. Other container types like mapOf, setOf, etc. are on the roadmap.
Objects
Object types allow you to create more complex composite types that can be reused across Item types. An Object type definition in schema looks similar to an Item type, but without a Key Path and without support for attributes like TTLs. You also use the objectType builder function instead of itemType. The rule of thumb is that Object types can be used as a field type in other Object types and Item types, while Item types cannot be used as a field type in another type.
The following example shows adding an Object type of ContactInfo that contains four fields. The new ContactInfo Object type is then referenced by Student and Instructor. Object types provide a powerful way to define reusable types that can be composed together.
Items
Of course, Item types are types. But they’re a bit special since they can’t be used as the field type for another item. In other words, Item types can’t be embedded in other Items. In the future, we’ll handle this via relations and pointers, but for now, it’s forbidden.
Zero Values
You may have guessed based on the supported data types and the field numbers that Stately’s schema is based on protobuf, and you’d be right! We didn’t want to reinvent the building blocks of an already-great schema system and binary encoding. However this means we’ve consciously inherited some behavior from protobuf that might not be entirely intuitive.
An important thing to understand is that every data type has a zero value, and there is no distinction between an unset value and a zero value. For example, if you have a uint field and don’t set it at all, that field’s value is 0. If you set it to 0, it’s still 0. One of the great properties this gives us is that zero values take up no storage space. But it also can be weird because if a field is required (and almost all fields default to required!), it means you cannot have a zero value in that field.
This might be easy to remember for numeric types, but some of the other zero values are less intuitive:
- The zero value of an array is an empty array. So by default, all your array fields require there to be at least one item in them! This includes bytes fields.
- The zero value of a string is the empty string.
- The zero value of a bool is false, but we won’t even let you set a bool to required - what would that even mean?
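The rules above can be sketched as a small check in plain TypeScript. This is only an illustration of the semantics described here, not StatelyDB's actual validation code; FieldValue, isZeroValue, and missingRequiredFields are hypothetical names.

```typescript
// Hypothetical model of a field value; not part of any StatelyDB SDK.
type FieldValue = number | string | boolean | Uint8Array | unknown[];

// An unset field and a field holding its zero value look identical.
function isZeroValue(value: FieldValue | undefined): boolean {
  if (value === undefined) return true; // unset reads back as the zero value
  if (typeof value === "number") return value === 0;
  if (typeof value === "string") return value === "";
  if (typeof value === "boolean") return value === false;
  return value.length === 0; // arrays and bytes
}

// A required field is violated whenever it holds its zero value.
function missingRequiredFields(
  item: Record<string, FieldValue | undefined>,
  required: string[],
): string[] {
  return required.filter((name) => isZeroValue(item[name]));
}

const student = { name: "Ada", credits: 0, tags: [] as string[] };
console.log(
  missingRequiredFields(student, ["name", "credits", "tags", "nickname"]),
); // [ 'credits', 'tags', 'nickname' ]
```

Note that credits fails the check even though it was explicitly set to 0, which is exactly the surprising case described above.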
UUIDs
StatelyDB’s UUIDs require a bit more explanation. You may be familiar with the standard string form for a v4 UUID, like 9edae9a5-fa39-4e45-bfd6-21707067f613. This is a 36-character representation of what is really a 128-bit (16-byte) value. That’s 20 wasted bytes per UUID, or a 125% overhead! At Stately Cloud we care deeply about storage efficiency and we know that these kinds of things add up. That’s why we always store UUIDs as 16-byte arrays, and even when we convert them to strings (for example, in key paths), we base64-encode the original byte array into a 22-character string instead of using the standard string format. That’s only about 38% overhead, and it applies only while the key path is on the wire; we still store it in binary.
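The size arithmetic can be checked with a short TypeScript sketch. It assumes an unpadded, URL-safe base64 alphabet, which is one way to arrive at 22 characters; the text only says "base64", so the exact variant StatelyDB uses is an assumption here, and base64UrlNoPad and overheadPercent are hypothetical helper names.

```typescript
// Unpadded, URL-safe base64 alphabet (an assumption, see the note above).
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

// Encode bytes as base64 without trailing "=" padding.
function base64UrlNoPad(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    const chunk = (b0 << 16) | (b1 << 8) | b2;
    out += ALPHABET[(chunk >> 18) & 63];
    out += ALPHABET[(chunk >> 12) & 63];
    if (i + 1 < bytes.length) out += ALPHABET[(chunk >> 6) & 63];
    if (i + 2 < bytes.length) out += ALPHABET[chunk & 63];
  }
  return out;
}

// Percentage by which an encoded form exceeds the raw byte length.
function overheadPercent(encodedLength: number, rawLength: number): number {
  return ((encodedLength - rawLength) / rawLength) * 100;
}

// Turn the example UUID from the text into its 16 raw bytes.
const uuidText = "9edae9a5-fa39-4e45-bfd6-21707067f613";
const uuidHex = uuidText.replace(/-/g, "");
const uuidBytes = new Uint8Array(16);
for (let i = 0; i < 16; i++) {
  uuidBytes[i] = parseInt(uuidHex.slice(i * 2, i * 2 + 2), 16);
}

const encoded = base64UrlNoPad(uuidBytes);
console.log(uuidText.length, overheadPercent(uuidText.length, 16)); // 36 125
console.log(encoded.length, overheadPercent(encoded.length, 16)); // 22 37.5
```

Standard padded base64 would produce 24 characters for the same 16 bytes, because the final partial group is filled out with two "=" characters.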
The downside of this obsession with efficiency is that in your client code, you might end up with a Uint8Array (JavaScript) or []byte (Go) which is more annoying to work with than a string. We don’t like this, and our roadmap includes fixing this up, but for now it’s good to be aware of it. There are libraries you can use to translate between bytes and UUIDs in the meantime.
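If you would rather not pull in a library, the translation is small enough to write by hand. The following is a minimal sketch in TypeScript; uuidBytesToString and uuidStringToBytes are hypothetical helper names, not part of any StatelyDB SDK, and the parser is deliberately lenient about where hyphens appear.

```typescript
// Format 16 raw bytes as the standard 8-4-4-4-12 hex string.
function uuidBytesToString(bytes: Uint8Array): string {
  if (bytes.length !== 16) {
    throw new Error(`expected 16 bytes, got ${bytes.length}`);
  }
  const hex = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}

// Parse a standard UUID string back into 16 raw bytes.
function uuidStringToBytes(text: string): Uint8Array {
  const hex = text.replace(/-/g, "");
  if (!/^[0-9a-fA-F]{32}$/.test(hex)) {
    throw new Error(`not a UUID: ${text}`);
  }
  const bytes = new Uint8Array(16);
  for (let i = 0; i < 16; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

const original = "9edae9a5-fa39-4e45-bfd6-21707067f613";
const roundTripped = uuidBytesToString(uuidStringToBytes(original));
console.log(roundTripped === original); // true
```

Converting only at the edges of your application (logging, URLs, display) keeps the compact byte form everywhere else.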