Tauq Binary Format (TBF)

High-performance binary serialization: up to 84% smaller than JSON, with optional schema-aware encoding.

At a glance:

  • 84% smaller than JSON (with schema)
  • 4µs parse time per record (faster than Protobuf)
  • 3 encoding approaches (pick what you need)

When to Use TBF

Scenario               Use TBF?   Use TQN (Text)?
LLM input/output       ❌ No      ✅ Yes (54% fewer tokens)
Database storage       ✅ Yes     ❌ No
Network protocols      ✅ Yes     ❌ No
Config files           ❌ No      ✅ Yes (human-readable)
Data interchange       ✅ Yes     ⚠️ Depends
Apache Iceberg tables  ✅ Yes     ❌ No

Key Features

Compact Binary Encoding

  • ✅ Up to 84% smaller than JSON (with schema)
  • ✅ 44-56% reduction (CLI default)
  • ✅ Adaptive integer encoding
  • ✅ Dictionary compression
  • ✅ Columnar encoding for tables
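
Adaptive integer encoding generally means spending only as many bytes as a value actually needs, as in a LEB128-style varint. The sketch below illustrates that general technique; the function names are ours, not TBF's API, and TBF's actual wire format may differ.

```rust
// Illustrative LEB128-style varint: small integers take fewer bytes.
// This is a sketch of the general technique, not TBF's encoder.

fn encode_varint(mut v: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (v & 0x7F) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte); // high bit clear: this is the last byte
            break;
        }
        out.push(byte | 0x80); // high bit set: more bytes follow
    }
    out
}

fn decode_varint(bytes: &[u8]) -> u64 {
    let mut v = 0u64;
    for (i, b) in bytes.iter().enumerate() {
        v |= ((b & 0x7F) as u64) << (7 * i);
        if b & 0x80 == 0 {
            break;
        }
    }
    v
}

fn main() {
    assert_eq!(encode_varint(42).len(), 1);  // fits in one byte
    assert_eq!(encode_varint(300).len(), 2); // needs two bytes
    assert_eq!(decode_varint(&encode_varint(1_000_000)), 1_000_000);
    println!("varint round-trip ok");
}
```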

Schema-Based Optimization

  • ✅ Type-aware encoding
  • ✅ Offset-based encoding for ranges
  • ✅ Zero-copy deserialization
  • ✅ Optional schema hints
  • ✅ No code generation needed
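
Offset-based encoding shrinks fields whose values cluster in a narrow range: the schema records a base offset, and each value is stored as the small difference from it. A minimal sketch of the idea, assuming an `offset` declared in the schema; these helpers are illustrative, not TBF's actual encoder:

```rust
// Illustrative offset-based encoding: if ages fall in [18, 18 + 255],
// storing (age - offset) fits in one byte instead of four.
// This sketches the idea; it is not TBF's actual API.

fn encode_age(age: u32, offset: u32) -> Option<u8> {
    // None if the value falls outside the encodable range.
    age.checked_sub(offset)?.try_into().ok()
}

fn decode_age(byte: u8, offset: u32) -> u32 {
    offset + byte as u32
}

fn main() {
    let offset = 18;
    let encoded = encode_age(30, offset).unwrap();
    assert_eq!(encoded, 12);                  // one byte on the wire
    assert_eq!(decode_age(encoded, offset), 30);
    assert_eq!(encode_age(17, offset), None); // below the offset: not encodable
}
```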

Iceberg Integration

  • ✅ Apache Iceberg tables
  • ✅ Arrow RecordBatch
  • ✅ Columnar file format
  • ✅ Distributed processing
  • ✅ Time-series friendly

Three Encoding Approaches

1. CLI - Generic Serde

Perfect for quick conversions. No setup required, works with any data.

# Convert TQN to TBF
$ tauq build data.tqn --format tbf -o data.tbf

# Compression achieved: 94 KB → 41 KB (56% reduction)
Compression: 44-56% reduction · Setup: none · Use case: quick conversions

2. Rust API - Schema-Aware

Best compression with type hints. Zero-copy deserialization, type safety.

use tauq::tbf::{self, TableSchemaBuilder, FieldEncoding, TableEncode};
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, TableEncode)]
struct Employee {
    #[tauq(encoding = "u16")]
    id: u32,
    name: String,
    #[tauq(encoding = "u8", offset = 18)]
    age: u32,
    salary: f64,
}

let employees = vec![
    Employee { id: 1, name: "Alice".into(), age: 30, salary: 75000.0 },
    Employee { id: 2, name: "Bob".into(), age: 25, salary: 65000.0 },
];

let bytes = tbf::to_bytes(&employees)?;
// Result: ~40 bytes vs ~120 bytes JSON
Compression: ~84% reduction · Setup: add derive macro · Use case: maximum compression

3. Iceberg - Data Lakes

Write directly to Iceberg tables with full columnar optimization.

use tauq::tbf_iceberg::TbfFileWriter;

let mut writer = TbfFileWriter::new(schema);
writer.write_records(records)?;
Compression: ~84% + columnar · Setup: Arrow required · Use case: data lakes

Size Comparison

1000 employee records (5 fields each):

Format        Size    vs JSON
JSON          87 KB   baseline (100%)
TQN (text)    43 KB   -51% (49% of JSON)
TBF (binary)  14 KB   -84% (16% of JSON)
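
As a sanity check, the reduction percentages follow directly from the sizes in the comparison:

```rust
// Recompute the reduction figures from the sizes above (in KB).
fn reduction_pct(baseline: f64, size: f64) -> i64 {
    (100.0 * (1.0 - size / baseline)).round() as i64
}

fn main() {
    assert_eq!(reduction_pct(87.0, 43.0), 51); // TQN: -51%
    assert_eq!(reduction_pct(87.0, 14.0), 84); // TBF: -84%
}
```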