JSON Storage

Introduction

The JSON Storage system is comprised of two types of files: Schema file(optional) and Data file. The Schema file is written in Standard SQL and is responsible for storing the structure of the table. The Data file contains the actual data and supports two file formats: *.json and *.jsonl. This document provides detailed examples of how to create schema and read/write data using the Json Storage system. While it supports all DML features, it is particularly specialized for SELECT and APPEND INSERT. For further information, please refer to the Limitations section.

Structure

JSON Storage is based on two types of files: Schema and Data. The Schema file contains the definition of the structure of the data, while the Data file contains the actual data.

gluesql> CREATE TABLE User (
    id INT,
    name TEXT
)

$ ls -l

User.sql # Schema file
User.jsonl # Data file

Schema File

The schema definition is saved in a file named {TABLE_NAME}.sql using standard SQL. For example, if the table name is User, then the schema file will be named User.sql.

--! User.sql
CREATE TABLE User (
  id INT,
  name TEXT,
);

Schemaless Table

A schemaless table is optional, and if there is no corresponding {TABLE_NAME}.sql file, the table is schemaless. A schemaless table can save any data regardless of column name and data type.

gluesql> CREATE Table User;

$ ls -l

User.jsonl

Data File

JSON Storage saves data in two types of data files: *.jsonl (default) and *.json.

`*.jsonl` File Format

The *.jsonl file format is a file containing one JSON object per line. For example:

{"id": 1, "name": "Glue"}
{"id": 2, "name": "SQL"}

`*.json` File Format

The *.json file format supports two different formats:

Array of JSON

[
  {
    "id": 1,
    "name": "Glue"
  },
  {
    "id": 2,
    "name": "SQL"
  }
]

Single JSON

{
  "name": "GlueSQL"
  "keywords": ["Database, Rust"]
  "stars": 999999
}

Examples

Read Existing JSON/JSONL Schemaless Files

Locate your JSON/JSONL schemaless files in the data path. Here, we use ./data.

$ ls -rlt ./data

User.json
LoginHistory.jsonl

Keep in mind that if there are no *.sql files, the data is considered schemaless, meaning that the number of columns in each row may vary.

//! User.json
[
  {
    "id": 1,
    "name": "Alice",
    "location": "New York"
  },
  {
    "id": 2,
    "name": "Bob",
    "language": "Rust"
  },
  {
    "id": 3,
    "name": "Eve"
  }
]

//! LoginHistory.jsonl
{"timestamp": "2023-05-01T14:36:22.000Z", "userId": 1, "action": "login"}
{"timestamp": "2023-05-01T14:38:17.000Z", "userId": 2, "action": "logout"}
{"timestamp": "2023-05-02T08:12:05.000Z", "userId": 2, "action": "logout"}
{"timestamp": "2023-05-02T09:45:13.000Z", "userId": 3, "action": "login"}
{"timestamp": "2023-05-03T16:21:44.000Z", "userId": 1, "action": "logout"}

Read with GlueSQL JSON Storage

let path = "./data/";
let json_storage = JsonStorage::new(path).unwrap();
let mut glue = Glue::new(json_storage);

glue.execute("
SELECT *
FROM User U
JOIN LoginHistory L ON U.id = L.userId;
");

action	id	language	location	name	timestamp	userId
login	1		New York	Alice	2023-05-01T14:36:22.000Z	1
logout	1		New York	Alice	2023-05-03T16:21:44.000Z	1
logout	2	Rust		Bob	2023-05-01T14:38:17.000Z	2
logout	2	Rust		Bob	2023-05-02T08:12:05.000Z	2
login	3			Eve	2023-05-02T09:45:13.000Z	3

Create Schema Table

Create Table

let path = "./data/";
let json_storage = JsonStorage::new(path).unwrap();
let mut glue = Glue::new(json_storage);

glue.execute("
CREATE TABLE Account (
  accountId INT NOT NULL,
  accountOwner TEXT NOT NULL,
  accountType TEXT NOT NULL,
  balance INT NOT NULL,
  isActive BOOLEAN NOT NULL
);
");

$ ls -l

Account.sql
Account.jsonl

Verity Schema file

--! Account.sql
CREATE TABLE Account (
  accountId INT NOT NULL,
  accountOwner TEXT NOT NULL,
  accountType TEXT NOT NULL,
  balance INT NOT NULL,
  isActive BOOLEAN NOT NULL
);

Insert data

glue.execute("
INSERT INTO Account VALUES
  (10001, 'John Smith', 'Checking', 5000, true),
  (10002, 'Jane Doe', 'Savings', 10000, true),
  (10003, 'Robert Johnson', 'Checking', 2500, false),
  (10004, 'Alice Kim', 'Savings', 7500, true),
  (10005, 'Michael Chen', 'Checking', 10000, true);
");

Select data

glue.execute("SELECT * FROM Account;");

accountId	accountOwner	accountType	balance	isActive
10001	John Smith	Checking	5000	TRUE
10002	Jane Doe	Savings	10000	TRUE
10003	Robert Johnson	Checking	2500	FALSE
10004	Alice Kim	Savings	7500	TRUE
10005	Michael Chen	Checking	10000	TRUE

Verify Data file

//! Account.jsonl
{"accountId":10001,"accountOwner":"John Smith","accountType":"Checking","balance":5000,"isActive":true}
{"accountId":10002,"accountOwner":"Jane Doe","accountType":"Savings","balance":10000,"isActive":true}
{"accountId":10003,"accountOwner":"Robert Johnson","accountType":"Checking","balance":2500,"isActive":false}
{"accountId":10004,"accountOwner":"Alice Kim","accountType":"Savings","balance":7500,"isActive":true}
{"accountId":10005,"accountOwner":"Michael Chen","accountType":"Checking","balance":10000,"isActive":true}

Limitation

JSON Storage is capable of supporting a variety of operations, including SELECT, INSERT, DELETE, and UPDATE.
However, its design primarily emphasizes SELECT and APPEND INSERT functionality.
It's important to note that if you perform DELETE, UPDATE, or INSERT in the middle of the rows, it can cause the internal rewriting of all the rows, which can lead to a decrease in performance.

JSON Storage

Introduction​

Structure​

Schema File​

Schemaless Table​

Data File​

*.jsonl File Format​

*.json File Format​

Examples​

Read Existing JSON/JSONL Schemaless Files​

Create Schema Table​

Limitation​