Tauq Binary Format (TBF)

High-performance binary serialization: up to 84% smaller than JSON, with optional schema-aware encoding.

At a glance:

  • 84% smaller than JSON (with schema)
  • 4µs parse time per record (faster than Protobuf)
  • 3 encoding approaches (pick what you need)

When to Use TBF

Scenario               Use TBF?   Use TQN (Text)?
LLM input/output       ❌ No      ✅ Yes (54% fewer tokens)
Database storage       ✅ Yes     ❌ No
Network protocols      ✅ Yes     ❌ No
Config files           ❌ No      ✅ Yes (human-readable)
Data interchange       ✅ Yes     ⚠️ Depends
Apache Iceberg tables  ✅ Yes     ❌ No

Key Features

Compact Binary Encoding

  • ✅ Up to 84% smaller than JSON (with schema)
  • ✅ 44-56% reduction (CLI default)
  • ✅ Adaptive integer encoding
  • ✅ Dictionary compression
  • ✅ Columnar encoding for tables
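
Adaptive integer encoding generally means spending only as many bytes as a value actually needs, as in a LEB128-style varint. The sketch below illustrates that general technique; the function names are ours, not TBF's API, and TBF's actual wire format may differ.

```rust
// Illustrative LEB128-style varint: small integers take fewer bytes.
// This is a sketch of the general technique, not TBF's encoder.

fn encode_varint(mut v: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (v & 0x7F) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte); // high bit clear: this is the last byte
            break;
        }
        out.push(byte | 0x80); // high bit set: more bytes follow
    }
    out
}

fn decode_varint(bytes: &[u8]) -> u64 {
    let mut v = 0u64;
    for (i, b) in bytes.iter().enumerate() {
        v |= ((b & 0x7F) as u64) << (7 * i);
        if b & 0x80 == 0 {
            break;
        }
    }
    v
}

fn main() {
    assert_eq!(encode_varint(42).len(), 1);  // fits in one byte
    assert_eq!(encode_varint(300).len(), 2); // needs two bytes
    assert_eq!(decode_varint(&encode_varint(1_000_000)), 1_000_000);
    println!("varint round-trip ok");
}
```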

Schema-Based Optimization

  • ✅ Type-aware encoding
  • ✅ Offset-based encoding for ranges
  • ✅ Zero-copy deserialization
  • ✅ Optional schema hints
  • ✅ No code generation needed
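
Offset-based encoding shrinks fields whose values cluster in a narrow range: the schema records a base offset, and each value is stored as the small difference from it. A minimal sketch of the idea, assuming an `offset` declared in the schema; these helpers are illustrative, not TBF's actual encoder:

```rust
// Illustrative offset-based encoding: if ages fall in [18, 18 + 255],
// storing (age - offset) fits in one byte instead of four.
// This sketches the idea; it is not TBF's actual API.

fn encode_age(age: u32, offset: u32) -> Option<u8> {
    // None if the value falls outside the encodable range.
    age.checked_sub(offset)?.try_into().ok()
}

fn decode_age(byte: u8, offset: u32) -> u32 {
    offset + byte as u32
}

fn main() {
    let offset = 18;
    let encoded = encode_age(30, offset).unwrap();
    assert_eq!(encoded, 12);                  // one byte on the wire
    assert_eq!(decode_age(encoded, offset), 30);
    assert_eq!(encode_age(17, offset), None); // below the offset: not encodable
}
```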

Iceberg Integration

  • ✅ Apache Iceberg tables
  • ✅ Arrow RecordBatch
  • ✅ Columnar file format
  • ✅ Distributed processing
  • ✅ Time-series friendly

Three Encoding Approaches

1. CLI - Generic Serde

Perfect for quick conversions. No setup required, works with any data.

# Convert TQN to TBF
$ tauq build data.tqn --format tbf -o data.tbf

# Compression achieved: 94 KB → 41 KB (56% reduction)
Compression: 44-56% reduction · Setup: none · Use case: quick conversions

2. Rust API - Schema-Aware

Best compression with type hints. Zero-copy deserialization, type safety.

use tauq::tbf::{self, TableSchemaBuilder, FieldEncoding, TableEncode};
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, TableEncode)]
struct Employee {
    #[tauq(encoding = "u16")]
    id: u32,
    name: String,
    #[tauq(encoding = "u8", offset = 18)]
    age: u32,
    salary: f64,
}

let employees = vec![
    Employee { id: 1, name: "Alice".into(), age: 30, salary: 75000.0 },
    Employee { id: 2, name: "Bob".into(), age: 25, salary: 65000.0 },
];

let bytes = tbf::to_bytes(&employees)?;
// Result: ~40 bytes vs ~120 bytes JSON
Compression: ~84% reduction · Setup: add derive macro · Use case: maximum compression

3. Iceberg - Data Lakes

Write directly to Iceberg tables with full columnar optimization.

use tauq::tbf_iceberg::TbfFileWriter;

let mut writer = TbfFileWriter::new(schema);
writer.write_records(records)?;
Compression: ~84% + columnar · Setup: Arrow required · Use case: data lakes

Size Comparison

1000 employee records (5 fields each):

Format        Size    vs JSON
JSON          87 KB   baseline (100%)
TQN (text)    43 KB   -51% (49% of JSON)
TBF (binary)  14 KB   -84% (16% of JSON)
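
As a sanity check, the reduction percentages follow directly from the sizes in the comparison:

```rust
// Recompute the reduction figures from the sizes above (in KB).
fn reduction_pct(baseline: f64, size: f64) -> i64 {
    (100.0 * (1.0 - size / baseline)).round() as i64
}

fn main() {
    assert_eq!(reduction_pct(87.0, 43.0), 51); // TQN: -51%
    assert_eq!(reduction_pct(87.0, 14.0), 84); // TBF: -84%
}
```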