Dataset ORM Tutorial
The dataset module provides an ORM-like interface for SQLite databases with automatic schema management. This tutorial covers everything from basic CRUD operations to advanced querying patterns.
Table of Contents
- Introduction
- Getting Started
- Insert Operations
- Querying Data
- Query Operators
- Update Operations
- Delete Operations
- Schema Evolution
- Working with JSON Fields
- Multiple Tables
- Building a Complete Application
- Best Practices
Introduction
Traditional database programming requires you to define schemas upfront, write SQL queries manually, and handle type conversions. The dataset module eliminates this boilerplate:
- Auto-schema - Tables and columns are created automatically when you insert data
- No SQL - Query using maps with intuitive operators
- Type conversion - Maps and Lists are automatically serialized to JSON
- Soft delete - Records are never permanently lost by default
- UUID primary keys - Every record gets a unique identifier
Getting Started
Opening a Database
Create a file-based database for persistent storage:
import "dataset" for Dataset
var db = Dataset.open("myapp.db")
var users = db["users"]
db.close()
Or use an in-memory database for testing:
import "dataset" for Dataset
var db = Dataset.memory()
var users = db["users"]
db.close()
Basic Structure
Every table automatically has three fields:
| Field | Type | Description |
|---|---|---|
uid |
TEXT (UUID) | Primary key, auto-generated |
created_at |
TEXT (ISO datetime) | Timestamp when record was created |
deleted_at |
TEXT (ISO datetime) | Timestamp when soft-deleted (null if active) |
Insert Operations
Basic Insert
import "dataset" for Dataset
var db = Dataset.memory()
var users = db["users"]
var user = users.insert({
"name": "Alice",
"email": "alice@example.com",
"age": 28
})
System.print(user["uid"]) // e.g., "550e8400-e29b-41d4-a716-446655440000"
System.print(user["created_at"]) // e.g., "2024-01-15T10:30:00"
System.print(user["name"]) // "Alice"
db.close()
Custom UID
Provide your own UID if needed:
var user = users.insert({
"uid": "custom-id-12345",
"name": "Bob"
})
Type Inference
Columns are created with appropriate SQLite types:
users.insert({
"name": "Charlie", // TEXT
"age": 30, // INTEGER
"score": 95.5, // REAL
"active": true, // INTEGER (1 or 0)
"tags": ["a", "b"], // TEXT (JSON array)
"settings": { // TEXT (JSON object)
"theme": "dark"
}
})
Querying Data
Find All Records
var allUsers = users.all()
for (user in allUsers) {
System.print("%(user["name"]) - %(user["email"])")
}
Find by Conditions
var admins = users.find({"role": "admin"})
var youngAdmins = users.find({
"role": "admin",
"age": 25
})
Find One Record
var user = users.findOne({"email": "alice@example.com"})
if (user != null) {
System.print("Found: %(user["name"])")
} else {
System.print("User not found")
}
Count Records
var totalUsers = users.count()
System.print("Total users: %(totalUsers)")
Graceful Handling of Missing Columns
Querying a column that does not exist returns an empty list instead of crashing:
var result = users.find({"nonexistent_column": "value"})
System.print(result.count) // 0
Query Operators
Use double-underscore suffixes for comparison operators:
Comparison Operators
var adults = users.find({"age__gte": 18})
var teenagers = users.find({
"age__gte": 13,
"age__lt": 20
})
var notAdmin = users.find({"role__ne": "admin"})
Complete Operator Reference
| Suffix | SQL | Example |
|---|---|---|
__gt |
> |
{"age__gt": 18} - age greater than 18 |
__lt |
< |
{"price__lt": 100} - price less than 100 |
__gte |
>= |
{"score__gte": 90} - score 90 or higher |
__lte |
<= |
{"quantity__lte": 10} - quantity 10 or less |
__ne |
!= |
{"status__ne": "deleted"} - status not deleted |
__like |
LIKE |
{"name__like": "A\%"} - name starts with A |
__in |
IN |
{"status__in": ["active", "pending"]} |
__null |
IS NULL |
{"deleted_at__null": true} - is null |
LIKE Patterns
var startsWithA = users.find({"name__like": "A\%"})
var containsSmith = users.find({"name__like": "\%Smith\%"})
var threeLetterNames = users.find({"name__like": "___"})
In Wren strings, use \% for the SQL % wildcard and _ for single character wildcard.
IN Operator
var priorityUsers = users.find({
"role__in": ["admin", "moderator", "editor"]
})
var specificIds = users.find({
"uid__in": [
"550e8400-e29b-41d4-a716-446655440000",
"6ba7b810-9dad-11d1-80b4-00c04fd430c8"
]
})
NULL Checking
var usersWithEmail = users.find({"email__null": false})
var usersWithoutPhone = users.find({"phone__null": true})
Combining Conditions
Multiple conditions are combined with AND:
var results = products.find({
"category": "electronics",
"price__gte": 100,
"price__lte": 500,
"stock__gt": 0
})
Update Operations
Basic Update
var user = users.findOne({"email": "alice@example.com"})
var changes = users.update({
"uid": user["uid"],
"name": "Alice Smith",
"age": 29
})
System.print("Rows affected: %(changes)")
Adding New Fields
Updates can add new columns dynamically:
users.update({
"uid": user["uid"],
"phone": "+1-555-0123",
"verified": true
})
Delete Operations
Soft Delete (Default)
Soft delete sets deleted_at but keeps the record:
var deleted = users.delete(user["uid"])
System.print(deleted) // true if found and deleted
var all = users.all()
// Soft-deleted records are excluded
Hard Delete
Permanently remove a record:
users.hardDelete(user["uid"])
Hard delete is permanent. The record cannot be recovered.
Schema Evolution
The dataset module handles schema changes automatically:
import "dataset" for Dataset
var db = Dataset.memory()
var users = db["users"]
users.insert({"name": "Alice"})
System.print(users.columns) // {"uid": "TEXT", "created_at": "TEXT", "deleted_at": "TEXT", "name": "TEXT"}
users.insert({"name": "Bob", "email": "bob@example.com", "age": 30})
System.print(users.columns) // Now includes "email" and "age"
var alice = users.findOne({"name": "Alice"})
System.print(alice["email"]) // null (column exists but Alice has no value)
db.close()
Working with JSON Fields
Storing Complex Data
import "dataset" for Dataset
var db = Dataset.memory()
var posts = db["posts"]
posts.insert({
"title": "Getting Started with Wren",
"tags": ["wren", "programming", "tutorial"],
"metadata": {
"author": "Alice",
"views": 1250,
"featured": true,
"related": [101, 102, 103]
}
})
var post = posts.findOne({"title": "Getting Started with Wren"})
System.print(post["tags"]) // ["wren", "programming", "tutorial"]
System.print(post["tags"][0]) // "wren"
System.print(post["metadata"]["author"]) // "Alice"
System.print(post["metadata"]["views"]) // 1250
System.print(post["metadata"]["related"][0]) // 101
db.close()
Nested Updates
To update nested JSON, retrieve the full object, modify it, and save:
var post = posts.findOne({"title": "Getting Started with Wren"})
var metadata = post["metadata"]
metadata["views"] = metadata["views"] + 1
posts.update({
"uid": post["uid"],
"metadata": metadata
})
Multiple Tables
import "dataset" for Dataset
var db = Dataset.open("blog.db")
var authors = db["authors"]
var posts = db["posts"]
var comments = db["comments"]
var author = authors.insert({
"name": "Alice",
"bio": "Tech writer and developer"
})
var post = posts.insert({
"author_uid": author["uid"],
"title": "Introduction to Dataset",
"content": "The dataset module provides..."
})
comments.insert({
"post_uid": post["uid"],
"author_name": "Bob",
"content": "Great article!"
})
System.print(db.tables) // ["authors", "posts", "comments"]
var postComments = comments.find({"post_uid": post["uid"]})
System.print("Comments on post: %(postComments.count)")
db.close()
Building a Complete Application
Let's build a task management application:
import "dataset" for Dataset
import "io" for Stdin
class TaskManager {
construct new(dbPath) {
_db = Dataset.open(dbPath)
_tasks = _db["tasks"]
_categories = _db["categories"]
initDefaults()
}
initDefaults() {
if (_categories.count() == 0) {
_categories.insert({"name": "Work", "color": "blue"})
_categories.insert({"name": "Personal", "color": "green"})
_categories.insert({"name": "Shopping", "color": "orange"})
}
}
addTask(title, categoryName, priority) {
var category = _categories.findOne({"name": categoryName})
return _tasks.insert({
"title": title,
"category_uid": category != null ? category["uid"] : null,
"priority": priority,
"completed": false
})
}
listTasks() {
var tasks = _tasks.find({"completed": false})
for (task in tasks) {
var cat = task["category_uid"] != null ?
_categories.findOne({"uid": task["category_uid"]}) : null
var catName = cat != null ? cat["name"] : "None"
System.print("[%(task["priority"])] %(task["title"]) (%(catName))")
}
}
completeTask(title) {
var task = _tasks.findOne({"title": title})
if (task != null) {
_tasks.update({"uid": task["uid"], "completed": true})
return true
}
return false
}
listCategories() {
var categories = _categories.all()
for (cat in categories) {
var count = _tasks.find({
"category_uid": cat["uid"],
"completed": false
}).count
System.print("%(cat["name"]): %(count) tasks")
}
}
stats() {
var total = _tasks.count()
var completed = _tasks.find({"completed": true}).count
var pending = _tasks.find({"completed": false}).count
var highPriority = _tasks.find({
"completed": false,
"priority__gte": 3
}).count
System.print("Total: %(total)")
System.print("Completed: %(completed)")
System.print("Pending: %(pending)")
System.print("High priority: %(highPriority)")
}
close() {
_db.close()
}
}
var app = TaskManager.new("tasks.db")
app.addTask("Write documentation", "Work", 3)
app.addTask("Buy groceries", "Shopping", 2)
app.addTask("Learn Wren", "Personal", 3)
System.print("--- Tasks ---")
app.listTasks()
System.print("\n--- Categories ---")
app.listCategories()
System.print("\n--- Statistics ---")
app.stats()
app.completeTask("Write documentation")
System.print("\n--- After completion ---")
app.listTasks()
app.close()
Output:
--- Tasks ---
[3] Write documentation (Work)
[2] Buy groceries (Shopping)
[3] Learn Wren (Personal)
--- Categories ---
Work: 1 tasks
Personal: 1 tasks
Shopping: 1 tasks
--- Statistics ---
Total: 3
Completed: 0
Pending: 3
High priority: 2
--- After completion ---
[2] Buy groceries (Shopping)
[3] Learn Wren (Personal)
Best Practices
1. Use Meaningful Table Names
// Good
var users = db["users"]
var orderItems = db["order_items"]
// Avoid
var t1 = db["t1"]
var data = db["data"]
2. Close the Database
var db = Dataset.open("app.db")
// ... operations ...
db.close() // Always close when done
3. Use Soft Delete for Recoverable Data
users.delete(uid) // Preferred: data can be recovered
users.hardDelete(uid) // Only when permanent deletion is required
4. Check for null Results
var user = users.findOne({"email": email})
if (user == null) {
System.print("User not found")
return
}
// Use user safely
5. Use In-Memory Databases for Tests
var db = Dataset.memory() // Fast, isolated, no cleanup needed
6. Use uid Instead of id
// The primary key is "uid", not "id"
var user = users.insert({"name": "Alice"})
System.print(user["uid"]) // Correct
// user["id"] does not exist
7. Store Related Data with UIDs
var author = authors.insert({"name": "Alice"})
var post = posts.insert({
"author_uid": author["uid"], // Reference by uid
"title": "My Post"
})
For complex queries that cannot be expressed with the query operators (JOINs, GROUP BY, subqueries), access the underlying SQLite database via db.db and write raw SQL.
var results = db.db.query("
SELECT p.title, a.name as author_name
FROM posts p
JOIN authors a ON p.author_uid = a.uid
WHERE p.deleted_at IS NULL
")
Next Steps
- See the Dataset API Reference for complete method documentation
- Learn about raw SQLite access for complex queries
- Build a web application with dataset storage
- Use Jinja templates to render database content