Tauq Binary Format (TBF)
High-performance binary serialization: up to 84% smaller than JSON, with optional schema-aware encoding.
- 84% smaller than JSON (with schema)
- 4 µs parse time per record (faster than Protobuf)
- 3 encoding approaches (pick what you need)
When to Use TBF
| Scenario | Use TBF? | Use TQN (Text)? |
|---|---|---|
| LLM input/output | ❌ No | ✅ Yes (54% fewer tokens) |
| Database storage | ✅ Yes | ❌ No |
| Network protocols | ✅ Yes | ❌ No |
| Config files | ❌ No | ✅ Yes (human-readable) |
| Data interchange | ✅ Yes | ⚠️ Depends |
| Apache Iceberg tables | ✅ Yes | ❌ No |
Key Features
Compact Binary Encoding
- ✅ Up to 84% smaller than JSON (with schema)
- ✅ 44-56% reduction (CLI default)
- ✅ Adaptive integer encoding
- ✅ Dictionary compression
- ✅ Columnar encoding for tables
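To make "adaptive integer encoding" concrete, here is a minimal sketch of a variable-length integer encoder in the LEB128 style, where small values take one byte and larger values grow as needed. This illustrates the general technique only; it is not TBF's actual wire format.

```rust
// Variable-length integer encoding: 7 payload bits per byte, with the
// high bit marking "more bytes follow". Small values stay small.
fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte); // final byte: high bit clear
            break;
        }
        out.push(byte | 0x80); // high bit set: continuation
    }
}

fn main() {
    let mut buf = Vec::new();
    encode_varint(1, &mut buf);   // one byte
    encode_varint(300, &mut buf); // two bytes
    assert_eq!(buf, vec![0x01, 0xac, 0x02]);
    println!("{} bytes instead of 16 for two u64s", buf.len());
}
```

The payoff is that typical small IDs and counts cost one byte instead of a fixed four or eight.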
Schema-Based Optimization
- ✅ Type-aware encoding
- ✅ Offset-based encoding for ranges
- ✅ Zero-copy deserialization
- ✅ Optional schema hints
- ✅ No code generation needed
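"Offset-based encoding for ranges" means a field with a known lower bound can be stored relative to that bound in a narrower integer. A minimal sketch of the idea, using an illustrative `age` field with an assumed range (the function names here are hypothetical, not tauq's API):

```rust
// If `age` is known to lie in 18..=273, storing `age - 18` fits in a
// u8 (1 byte) instead of a u32 (4 bytes).
fn encode_age(age: u32) -> u8 {
    (age - 18) as u8 // caller guarantees 18 <= age <= 273
}

fn decode_age(raw: u8) -> u32 {
    raw as u32 + 18
}

fn main() {
    let raw = encode_age(30);         // 1 byte on the wire
    assert_eq!(decode_age(raw), 30);  // round-trips losslessly
}
```

This is the same idea expressed by the `#[tauq(encoding = "u8", offset = 18)]` attribute in the Rust API example below.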
Iceberg Integration
- ✅ Apache Iceberg tables
- ✅ Arrow RecordBatch
- ✅ Columnar file format
- ✅ Distributed processing
- ✅ Time-series friendly
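The columnar encoding behind the Arrow/Iceberg integration stores each field contiguously (struct-of-arrays) rather than record by record, which is what makes dictionary and run-length compression effective on tables. A minimal dependency-free sketch of the layout transform, with illustrative types rather than the actual tauq or arrow-rs API:

```rust
// Row-oriented input...
struct Employee { id: u32, age: u32 }

// ...transposed into per-field columns.
struct EmployeeColumns { ids: Vec<u32>, ages: Vec<u32> }

fn to_columns(rows: &[Employee]) -> EmployeeColumns {
    EmployeeColumns {
        ids: rows.iter().map(|r| r.id).collect(),
        ages: rows.iter().map(|r| r.age).collect(),
    }
}

fn main() {
    let rows = vec![
        Employee { id: 1, age: 30 },
        Employee { id: 2, age: 25 },
    ];
    let cols = to_columns(&rows);
    assert_eq!(cols.ids, vec![1, 2]);
    assert_eq!(cols.ages, vec![30, 25]); // contiguous per-field storage
}
```

Similar values sitting next to each other in a column compress far better than the same values interleaved across records.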
Three Encoding Approaches
⚡ 1. CLI - Generic Serde
Perfect for quick conversions. No setup required; works with any data.
# Convert TQN to TBF
$ tauq build data.tqn --format tbf -o data.tbf
# Compression achieved: 94 KB → 41 KB (44% reduction)
Compression: 44-56% reduction · Setup: none · Use case: quick conversions
🎯 2. Rust API - Schema-Aware
Best compression with type hints: zero-copy deserialization and type safety.
use tauq::tbf::{self, TableEncode};
use serde::{Serialize, Deserialize};

#[derive(Serialize, Deserialize, TableEncode)]
struct Employee {
    #[tauq(encoding = "u16")]
    id: u32,
    name: String,
    #[tauq(encoding = "u8", offset = 18)]
    age: u32,
    salary: f64,
}

let employees = vec![
    Employee { id: 1, name: "Alice".into(), age: 30, salary: 75000.0 },
    Employee { id: 2, name: "Bob".into(), age: 25, salary: 65000.0 },
];

let bytes = tbf::to_bytes(&employees)?;
// Result: ~40 bytes vs ~120 bytes of JSON
Compression: ~84% reduction · Setup: add derive macro · Use case: maximum compression
📊 3. Iceberg - Data Lakes
Write directly to Iceberg tables with full columnar optimization.
use tauq::tbf_iceberg::TbfFileWriter;

let writer = TbfFileWriter::new(schema);
writer.write_records(records)?;
Compression: ~84% + columnar · Setup: Arrow required · Use case: data lakes
Size Comparison
1000 employee records (5 fields each):

| Format | Size | Relative to JSON |
|---|---|---|
| JSON | 87 KB (baseline) | 100% |
| TQN (text) | 43 KB (-51%) | 49% |
| TBF (binary) | 14 KB (-84%) | 16% |
Overview
Detailed format description, encoding strategies, and when to use each approach.
Learn more →
Schema-Based Encoding
Type hints, offset encoding, and optimization techniques for maximum compression.
Learn more →
Iceberg Integration
Using TBF with Apache Iceberg for data lakes and distributed processing.
Learn more →