Types
Type = Wildcard | BasicTypes | ListType | DictType | EnumType | TableTypes ;
Wildcard Type
The wildcard type ? is used to represent either heterogeneous data or dynamically-typed data of any type (including compound types). The compiler must resolve all statically-determined wildcards prior to generating code and ensure type correctness.
Wildcard = '?' ;
Declarations may either specify the exact type or use the wildcard, in which case the type is resolved at compile time.
Example
// Declare a new variable 't' with compile-time type resolution
t:? = @sum(a);
// Equivalent to the following, assuming @sum returns i32
t:i32 = @sum(a);
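The resolution step can be modelled in a few lines of Python. This is only a sketch under assumptions, not HorseIR's compiler logic: the helper `resolve_declared_type` and the encoding of types as strings are hypothetical, and a mismatch is treated as an error because conversions in HorseIR require explicit casts.

```python
def resolve_declared_type(declared: str, inferred: str) -> str:
    """Return the static type of a declaration.

    `declared` is the type written in the source ("?" for the wildcard).
    `inferred` is the type derived for the right-hand side.
    Both parameters are hypothetical names used for illustration.
    """
    if declared == "?":
        # Wildcard: adopt the type inferred at compile time.
        return inferred
    if declared != inferred:
        # Exact type given: it must agree with the inferred type.
        raise TypeError(f"declared {declared}, but expression has type {inferred}")
    return declared
```

With this model, `resolve_declared_type("?", "i32")` yields `"i32"`, matching the two equivalent declarations of `t` above.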
Basic Types
The basic unit of data in HorseIR is a vector (i.e. an array), consisting of data of the same type.
Name | Alias | Bytes | Description |
---|---|---|---|
boolean | bool | 1* | 0 (false) and 1 (true) |
small | i8 | 1 | Half short integer or char |
short | i16 | 2 | Short integer |
int | i32 | 4 | Integer |
long | i64 | 8 | Long integer (default, x64) |
float | f32 | 4 | Single precision |
double | f64 | 8 | Double precision |
complex | clex | 8 | Complex number (real+imaginary single precision floats) |
char | char | 1 | Half short integer or char |
symbol | sym | 8 | Symbol, but stored in integer |
string | str | 8 | String |
month | m | 4 | Month (YYYY-MM) |
date | d | 4 | Date (YYYY-MM-DD) |
date time | z | 8 | Date time |
minute | w | 4 | Minute (hh:mm) |
second | v | 4 | Second (hh:mm:ss) |
time | t | 4 | Time (hh:mm:ss.ll) |
function | func | 8 | Function literal |
Syntactically, the type alias is used to refer to the type. The short name is used internally.
BasicTypes = "bool" | "i8" | "i16" | "i32" | "i64" |
"f32" | "f64" | "complex" | "char" | "str" | "sym" |
"dt" | "date" | "month" | "minute" | "second" | "time" ;
Note
Vectors of function literals are currently not supported.
Compound Types
Compound types store more complex structures allowing heterogeneity and mappings.
Name | Alias | Short | Description |
---|---|---|---|
list | list | G | Collection of items |
dictionary | dict | N | Key-value mapping |
enumeration | enum | Y | Mapping (i.e. foreign key) |
table | table | A | Collection of columns |
keyed table | ktable | K | Two normal tables |
List Type
A list defines a variable length container consisting of cells, each cell containing data of any type. Nested lists are permitted.
ListType = "list" '<' Type { ',' Type } '>' ;
For a list type, either a single type may be specified for the entire list, or a type for each cell. In the case of a single type, the list has an unbounded number of cells, all with the same type. If more than one type is specified, the list corresponds to a tuple.
Example
// Defines a list with 1+ cells of i32 type
list<i32>
// Defines a list with exactly 3 cells of explicit types
list<i32, i64, i32>
Wildcard Cell Type
List cell types may also be specified using the wildcard type according to the following rules:
- If there is only 1 cell type given, the wildcard may resolve to a type list with any number of elements of any type
- If there is more than 1 cell type given, the wildcard must resolve to a single type
Example
// Equivalent to: list<i32>, list<i32, i64, i8>, etc.
t:list<?> = ...;
// Equivalent to: list<i32, i32>, list<i32, list<i32>>, etc.
// Not equivalent to (error): list<i32, i32, i32>
t:list<i32, ?> = ...;
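The two wildcard rules can be checked mechanically. The Python sketch below is an illustration under assumptions: the `ListT` class, the use of strings for basic types, and the `matches` helper are all hypothetical and do not come from HorseIR itself. It compares type expressions structurally, so it only answers whether a concrete type is a valid resolution of a pattern.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class ListT:
    """A list type; `cells` holds the cell types as written in list<...>."""
    cells: Tuple["Type", ...]


Type = Union[str, ListT]  # basic types are plain strings such as "i32"


def matches(pattern: Type, concrete: Type) -> bool:
    """Return True if `concrete` is a valid resolution of `pattern`."""
    if pattern == "?":
        # A wildcard in a single position resolves to any single type.
        return True
    if isinstance(pattern, str) or isinstance(concrete, str):
        return pattern == concrete
    if pattern.cells == ("?",):
        # list<?>: any number of cells of any type.
        return True
    if len(pattern.cells) != len(concrete.cells):
        # More than one cell type given: the cell count is fixed.
        return False
    return all(matches(p, c) for p, c in zip(pattern.cells, concrete.cells))
```

For example, `matches(ListT(("i32", "?")), ListT(("i32", "i32", "i32")))` is `False`, which corresponds to the error case in the example above.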
Dictionary Type
A dictionary stores key-value pairs, mapping keys to values.
- If key/value is a basic type, each element in the vector should be considered as a single key/value
- If key/value is a list type, each cell in the list should be considered as a single key/value
- If key/value is a compound type other than list, its entirety is considered as a single key/value
DictType = "dict" '<' Type ',' Type '>' ;
Dictionaries are formed from collections of keys and values using the built-in function @dict.
Example
a:str = ("a", "b", "c"):str;
b:str = ("Montreal", "Toronto", "Vancouver"):str;
c:dict<str, str> = @dict(a, b);
The above creates mappings: a → Montreal, b → Toronto, and c → Vancouver.
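For basic-type vectors, the pairing is positional: the i-th key maps to the i-th value. A minimal Python sketch of that rule follows. The helper `dict_pairs` is hypothetical, and rejecting vectors of different lengths is an assumption, since the text does not say how @dict handles that case.

```python
def dict_pairs(keys, values):
    """Pair the i-th key with the i-th value (basic-type vectors only)."""
    if len(keys) != len(values):
        # Assumption: a length mismatch is treated as an error here.
        raise ValueError("keys and values must have the same length")
    return list(zip(keys, values))


pairs = dict_pairs(["a", "b", "c"], ["Montreal", "Toronto", "Vancouver"])
```

Here `pairs` holds the three key-value pairs from the example in order.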
Enumeration Type
An enumeration represents the indexing relationship between two vectors commonly found in relational database systems: enum keys and enum values. For each value, the enumeration stores the index of the same element in the keys vector. If no corresponding element is found, the length of the enum keys is stored.
EnumType = "enum" '<' Type '>' ;
Enumerations are formed from vectors of keys and values using the built-in function @enum.
Example
a:i32 = (1, 2, 3):i32;
b:i32 = (3, 3, 1, 2):i32;
// Forms an enumeration with internal index vector (2,2,0,1):i32
c:enum<i32> = @enum(a, b);
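The index vector can be reproduced with a short Python sketch. The function `enum_indices` is a hypothetical name, and using the first occurrence when a key appears more than once is an assumption that the text does not address.

```python
def enum_indices(keys, values):
    """For each value, return its index in `keys`, or len(keys) if absent."""
    first_index = {}
    for i, key in enumerate(keys):
        # Assumption: with duplicate keys, the first occurrence wins.
        first_index.setdefault(key, i)
    missing = len(keys)  # stored when no corresponding key is found
    return [first_index.get(v, missing) for v in values]


indices = enum_indices([1, 2, 3], [3, 3, 1, 2])  # [2, 2, 0, 1]
```

A value with no matching key, such as `4` against keys `(1, 2, 3)`, would be stored as `3`, the length of the keys vector.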
Table Types
A table consists of a non-empty list of columns. Each column has a name and a vector of homogeneous values. There are two kinds of tables: table and ktable (keyed table).
TableTypes = "table" | "ktable" ;
- Table: A normal table of columns
- Keyed table: A table with key columns
Tables are formed from a vector of column symbols and a list of column values. The number of symbols and list cells must be equal.
Example
student_id:i32 = (1, 2, 3):i32;
student_age:i8 = (10, 11, 9):i8;
student_grade:i8 = (9, 9, 9):i8;
tab_cols:sym = (`id, `age, `grade):sym;
tab_vals:list<?> = @list(student_id, student_age, student_grade);
tab:table = @table(tab_cols, tab_vals);
The resulting table tab is formed as follows:
id | age | grade |
---|---|---|
1 | 10 | 9 |
2 | 11 | 9 |
3 | 9 | 9 |
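The columnar construction can be mirrored in Python as a rough model. The helpers `make_table` and `rows` are hypothetical, and representing a table as a dictionary from column name to column vector is an assumption made for illustration.

```python
def make_table(column_names, column_values):
    """Build a column store from names and a list of column vectors."""
    if not column_names:
        raise ValueError("a table needs at least one column")
    if len(column_names) != len(column_values):
        raise ValueError("number of symbols and list cells must be equal")
    return dict(zip(column_names, column_values))


def rows(table):
    """Read the column store row by row."""
    return list(zip(*table.values()))


tab = make_table(["id", "age", "grade"], [[1, 2, 3], [10, 11, 9], [9, 9, 9]])
```

Calling `rows(tab)` yields the three rows `(1, 10, 9)`, `(2, 11, 9)` and `(3, 9, 9)` shown in the table above.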
A keyed table is similar to a normal table and consists of a set of key columns and a set of non-key columns. A key column, as in relational databases, must have unique non-null values. The function @ktable creates a keyed table from two tables with the same number of rows. The columns from the first table become the key columns, and columns from the second table become the non-key columns.
Example
a:table = /* columns: id */;
b:table = /* columns: age, grade */;
c:ktable = @ktable(a, b);
The resulting keyed table c is formed as follows:
id (key) | age | grade |
---|---|---|
1 | 10 | 9 |
2 | 11 | 9 |
3 | 9 | 9 |
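The constraints on a keyed table can be sketched in Python. This is a model under assumptions: `make_ktable` is a hypothetical helper, tables are represented as dictionaries from column name to column vector, uniqueness is checked on the whole (possibly compound) key, and null values are not modelled.

```python
def make_ktable(key_table, value_table):
    """Combine two column stores into a keyed table."""
    if not key_table:
        raise ValueError("a keyed table must have at least one key column")
    key_rows = list(zip(*key_table.values()))
    value_rows = list(zip(*value_table.values()))
    if len(key_rows) != len(value_rows):
        raise ValueError("both tables must have the same number of rows")
    if len(set(key_rows)) != len(key_rows):
        # Assumption: uniqueness applies to the full key of each row.
        raise ValueError("key values must be unique")
    return {"keys": dict(key_table), "values": dict(value_table)}


c = make_ktable({"id": [1, 2, 3]}, {"age": [10, 11, 9], "grade": [9, 9, 9]})
```

In this model `c["keys"]` holds the id column and `c["values"]` holds the age and grade columns, mirroring the two-table layout of a keyed table.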
Note
- An empty table has no rows, but must have at least one column.
- A keyed table must have at least one key column.
- A keyed table with multiple key columns has compound keys.
- The conversion between tables and keyed tables uses two built-in functions:
  - @add_key: Designates columns as keys.
  - @remove_key: Removes columns from keys. If all keys are removed, a normal table is returned.
Value Ranges
Basic types have value ranges based on standard C conventions. Numeric types are always signed.
- bool: 0 or 1
- Numeric types depend on the number of bits
  - i8: -2^7 to 2^7-1
  - i16: -2^15 to 2^15-1
  - i32: -2^31 to 2^31-1
  - i64: -2^63 to 2^63-1
  - f32: 1.2E-38 to 3.4E+38 (precision: 6 decimal places)
  - f64: 2.3E-308 to 1.7E+308 (precision: 15 decimal places)
- A complex number is the combination of two floating point numbers (f32)
- Each date type has its own date-specific format (YYYY-MM-DD T hh:mm:ss.ll) and value range
  - YYYY (year): 1000 to 9999
  - MM (month): 01 to 12 (two digits required)
  - DD (day): 01 to 28/29/30/31
    - January - 31 days
    - February - 28 days (common year) or 29 days (leap year)
    - March - 31 days
    - April - 30 days
    - May - 31 days
    - June - 30 days
    - July - 31 days
    - August - 31 days
    - September - 30 days
    - October - 31 days
    - November - 30 days
    - December - 31 days
  - hh (hour): 00 to 23
  - mm (minute): 00 to 59
  - ss (second): 00 to 59
  - ll (millisecond): 000 to 999
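The date and time ranges above can be checked with a small Python sketch. The helpers `is_valid_date` and `is_valid_time` are hypothetical, and the leap-year rule is the Gregorian one provided by Python's standard `calendar` module, which is an assumption about what the text means by a leap year.

```python
import calendar


def is_valid_date(year: int, month: int, day: int) -> bool:
    """Check YYYY, MM and DD against the ranges listed above."""
    if not 1000 <= year <= 9999:
        return False
    if not 1 <= month <= 12:
        return False
    # monthrange returns (weekday of the first day, number of days in month).
    days_in_month = calendar.monthrange(year, month)[1]
    return 1 <= day <= days_in_month


def is_valid_time(hour: int, minute: int, second: int, millisecond: int) -> bool:
    """Check hh, mm, ss and ll against the ranges listed above."""
    return (0 <= hour <= 23 and 0 <= minute <= 59
            and 0 <= second <= 59 and 0 <= millisecond <= 999)
```

For instance, 29 February is accepted for 2020 but rejected for 2021.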
Note that an error occurs when a value exceeds the range of its type.
Example
2:bool // Error
999:i8 // Error
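The range check for boolean and integer literals can be expressed directly from the bit widths. The Python sketch below is illustrative only: `INT_BITS` and `in_range` are hypothetical names, and floating-point types are left out.

```python
INT_BITS = {"i8": 8, "i16": 16, "i32": 32, "i64": 64}


def in_range(value: int, type_name: str) -> bool:
    """Return True if `value` fits in the given boolean or integer type."""
    if type_name == "bool":
        return value in (0, 1)
    bits = INT_BITS[type_name]
    # Signed two's complement range: -2^(bits-1) to 2^(bits-1)-1.
    return -(2 ** (bits - 1)) <= value <= 2 ** (bits - 1) - 1
```

Both literals from the example fail this check: `in_range(2, "bool")` and `in_range(999, "i8")` are `False`, while `in_range(127, "i8")` is `True`.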
Type Conversions
Type conversions are performed with explicit casting. Only the following type conversions are permitted (any conversion not listed is disallowed).
- Integer
  - An integer of narrower width may be cast to an integer of wider width (e.g. i32 to i64).
  - An integer of wider width may not be cast to an integer of narrower width.
- Integer and Float
  - An integer may be cast to a float. Loss of precision may occur (e.g. i64 to f32).
  - A float value may be cast to an integer by truncating its decimal part.
    - Only f32 to i32/i64, and f64 to i64 are allowed.
- Integer and Char
  - Integer and character values may not be cast to each other.
- Float
  - A float with lower precision may be cast to a float of higher precision.
  - A float with higher precision may not be cast to a float of lower precision.
- Boolean
  - A boolean value may be cast to an integer and vice-versa. Zero is false, non-zero is true.
- String and Symbol
  - Strings may be cast to symbols and vice-versa.
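The permitted conversions can be summarised as a single predicate. The Python sketch below is one possible encoding, with hypothetical names (`INT_WIDTH`, `FLOAT_WIDTH`, `FLOAT_TO_INT`, `can_cast`). It encodes only the rules listed above, so a same-type cast, which the list does not mention, is reported as not permitted.

```python
INT_WIDTH = {"i8": 8, "i16": 16, "i32": 32, "i64": 64}
FLOAT_WIDTH = {"f32": 32, "f64": 64}
FLOAT_TO_INT = {("f32", "i32"), ("f32", "i64"), ("f64", "i64")}


def can_cast(src: str, dst: str) -> bool:
    """Return True if the rules above permit casting `src` to `dst`."""
    if src in INT_WIDTH and dst in INT_WIDTH:
        return INT_WIDTH[src] < INT_WIDTH[dst]      # widening only
    if src in INT_WIDTH and dst in FLOAT_WIDTH:
        return True                                 # precision may be lost
    if src in FLOAT_WIDTH and dst in INT_WIDTH:
        return (src, dst) in FLOAT_TO_INT           # truncating casts
    if src in FLOAT_WIDTH and dst in FLOAT_WIDTH:
        return FLOAT_WIDTH[src] < FLOAT_WIDTH[dst]  # widening only
    if {src, dst} == {"str", "sym"}:
        return True
    if (src == "bool" and dst in INT_WIDTH) or (src in INT_WIDTH and dst == "bool"):
        return True
    return False  # anything not listed, e.g. integer <-> char
```

For example, `can_cast("i32", "i64")` is `True`, while `can_cast("f64", "i32")` and `can_cast("i8", "char")` are `False`.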
Illustrations
Basic types
contiguous
/
+-------+-------+-------+
| 35 | 79 | ... | (integers, for example)
+-------+-------+-------+
List
+------+
| list |
+------+
/ | ...
+----+ +----+
| c0 | | c1 | ... (cells)
+----+ +----+
Dictionary
+-----+ map +-------------+
| "a" | --> | "Montreal" |
+-----+ +-------------+
| "b" | --> | "Toronto" |
+-----+ +-------------+
| "c" | --> | "Vancouver" |
+-----+ +-------------+
| ... | --> | ........... | (key --> value)
+-----+ +-------------+
Enumeration
+---+---+---+
key | 7 | 3 | 6 | key
+---+---+---+ \
enum
/
+---+---+---+---+---+ +---+---+---+---+---+
value | 3 | 3 | 6 | 6 | 7 | ----> | 1 | 1 | 2 | 2 | 0 | (indices)
+---+---+---+---+---+ +---+---+---+---+---+
Table
+----+-----+-------+
|"Id"|"Age"|"Grade"|
+----+-----+-------+
| | |
+---+ +----+ +---+
| 1 | | 10 | | 9 |
+---+ +----+ +---+
| 2 | | 11 | | 9 |
+---+ +----+ +---+
| 3 | | 9 | | 9 | (columnar store)
+---+ +----+ +---+
Keyed table
+-----------ktable-----------+
| +----+ +-----+-------+ |
| |"Id"| |"Age"|"Grade"| |
| +----+ +-----+-------+ |
| | | | |
| +---+ +----+ +---+ |
| | 1 | | 10 | | 9 | |
| +---+ +----+ +---+ |
| | 2 | | 11 | | 9 | |
| +---+ +----+ +---+ |
| | 3 | | 9 | | 9 | |
| +---+ +----+ +---+ |
| /* key */ /* non-keys */ |
+----------------------------+