December 20, 2024 | 5 min read
JSONL (JSON Lines): When to Use It and How
JSONL
Data Processing
JSONL (JSON Lines) is a convenient format for storing structured data that can be processed one record at a time. Learn when and how to use it effectively.
What is JSONL?
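JSONL is a text format in which each line is a complete, valid JSON value, typically an object. Records are separated by newline characters, so a single record never spans more than one line. The same layout is also known as newline-delimited JSON (NDJSON).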
{"name": "Alice", "age": 30, "city": "New York"}
{"name": "Bob", "age": 25, "city": "Los Angeles"}
{"name": "Charlie", "age": 35, "city": "Chicago"}
JSONL vs JSON Array
Standard JSON Array
[
  {"name": "Alice"},
  {"name": "Bob"},
  {"name": "Charlie"}
]
JSONL Format
{"name": "Alice"}
{"name": "Bob"}
{"name": "Charlie"}
Advantages of JSONL
- Stream Processing - Process one line at a time without loading the entire file
- Memory Efficient - Handle files larger than available RAM, because only the current record has to be held in memory
- Append-Friendly - Add new records at the end without rewriting the file
- Error Isolation - A malformed line can be skipped or reported without invalidating the other records
- Parallel Processing - Split files at line boundaries and hand the pieces to separate workers
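The error-isolation point above can be illustrated with a parser that reports bad lines instead of aborting on the first one. This is a minimal sketch: the function name `parseJsonlSafe` and the shape of its return value are assumptions made for illustration, not part of any standard API.

```javascript
// Hypothetical helper: parse JSONL text, collecting malformed lines
// instead of throwing on the first one.
function parseJsonlSafe(text) {
  const records = [];
  const errors = [];
  text.split('\n').forEach((line, index) => {
    if (!line.trim()) return; // ignore blank lines
    try {
      records.push(JSON.parse(line));
    } catch (err) {
      // Line numbers are 1-based to match what editors display
      errors.push({ line: index + 1, message: err.message });
    }
  });
  return { records, errors };
}

// The second line is missing its closing brace
const input = '{"name": "Alice"}\n{"name": "Bob"\n{"name": "Charlie"}\n';
const { records, errors } = parseJsonlSafe(input);
console.log(records.length); // 2
console.log(errors[0].line); // 2
```

With a single JSON array, the same missing brace would make the whole document unparseable; here only the damaged record is lost.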
Common Use Cases
- Log files and event streams
- Data export/import operations
- Machine learning datasets (e.g., OpenAI fine-tuning)
- ETL pipelines and data processing
- Database backups and migrations
Reading JSONL in JavaScript
// Parse a JSONL string that is already in memory
function parseJsonl(jsonlString) {
  return jsonlString
    .split('\n')
    .filter(line => line.trim()) // skip blank lines
    .map(line => JSON.parse(line));
}
// Stream reading (Node.js)
const readline = require('readline');
const fs = require('fs');
async function* readJsonl(filePath) {
  const stream = fs.createReadStream(filePath);
  // crlfDelay: Infinity treats \r\n as a single line break
  const rl = readline.createInterface({ input: stream, crlfDelay: Infinity });
  for await (const line of rl) {
    if (line.trim()) {
      yield JSON.parse(line);
    }
  }
}
// Usage (wrapped in an async function because CommonJS modules
// do not allow top-level await)
async function main() {
  for await (const record of readJsonl('data.jsonl')) {
    console.log(record);
  }
}
main().catch(console.error);
Writing JSONL in JavaScript
// Convert an array to a JSONL string
function toJsonl(array) {
  // Note: the result has no trailing newline; add one before
  // writing it to a file that later records will be appended to
  return array
    .map(item => JSON.stringify(item))
    .join('\n');
}
// Append a single record to a file (Node.js)
const fs = require('fs');
function appendToJsonl(filePath, record) {
  const line = JSON.stringify(record) + '\n';
  fs.appendFileSync(filePath, line);
}
// Usage
const data = [
  { id: 1, name: 'Alice' },
  { id: 2, name: 'Bob' }
];
console.log(toJsonl(data));
Working with Large Files
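When the output is large as well, calling `fs.appendFileSync` once per record opens and closes the file on every call. A single write stream is one alternative. The sketch below assumes a hypothetical helper named `writeJsonl` and a hypothetical output file `events.jsonl`; it waits for the stream's `drain` event whenever `write()` reports a full buffer.

```javascript
const fs = require('fs');
const { once } = require('events');

// Hypothetical helper: append records to a JSONL file through one write stream.
// `records` may be any sync or async iterable.
async function writeJsonl(filePath, records) {
  const out = fs.createWriteStream(filePath, { flags: 'a' }); // 'a' = append
  for await (const record of records) {
    // write() returns false when the stream's internal buffer is full;
    // wait for 'drain' before writing more.
    if (!out.write(JSON.stringify(record) + '\n')) {
      await once(out, 'drain');
    }
  }
  out.end();
  await once(out, 'finish'); // resolves once all data has been flushed
}

// Usage
writeJsonl('events.jsonl', [{ id: 1 }, { id: 2 }])
  .then(() => console.log('done'))
  .catch(console.error);
```

Because the helper accepts async iterables, it could in principle be fed directly from an async generator that reads another JSONL file, which keeps both ends of a pipeline streaming.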
// Process a large file one record at a time
// (uses readJsonl and appendToJsonl from the sections above)
async function processLargeJsonl(filePath, processor) {
  let count = 0;
  for await (const record of readJsonl(filePath)) {
    await processor(record);
    count++;
    if (count % 10000 === 0) {
      console.log(`Processed ${count} records`);
    }
  }
  return count;
}
// Example: filter and transform
// (wrapped in an async function because CommonJS modules
// do not allow top-level await)
async function main() {
  await processLargeJsonl('users.jsonl', async (user) => {
    if (user.active) {
      appendToJsonl('active-users.jsonl', {
        id: user.id,
        email: user.email
      });
    }
  });
}
main().catch(console.error);
Best Practices
- Use UTF-8 encoding consistently
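- Ensure each line is complete, valid JSON: no trailing commas and no raw line breaks inside a record (JSON.stringify without an indent argument produces single-line output)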
- Handle empty lines gracefully
- Consider compression (.jsonl.gz) for storage
- Include a schema or type field for heterogeneous data
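The compression and type-field suggestions above can be combined in one short sketch. It assumes a hypothetical helper named `readJsonlGz`, a sample file written to the system temp directory, and a `type` field chosen purely for illustration. Decompression happens on the fly through Node's built-in `zlib` module, so the file is never fully expanded in memory or on disk.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');
const readline = require('readline');

// Hypothetical helper: stream records out of a gzip-compressed JSONL file.
async function* readJsonlGz(filePath) {
  const source = fs.createReadStream(filePath).pipe(zlib.createGunzip());
  const rl = readline.createInterface({ input: source, crlfDelay: Infinity });
  for await (const line of rl) {
    if (line.trim()) { // tolerate empty lines
      yield JSON.parse(line);
    }
  }
}

// Write a small compressed sample so the example is self-contained
const sample = [{ type: 'click', id: 1 }, { type: 'error', id: 2 }];
const gzPath = path.join(os.tmpdir(), 'sample.jsonl.gz');
const text = sample.map(r => JSON.stringify(r)).join('\n') + '\n';
fs.writeFileSync(gzPath, zlib.gzipSync(text));

// Usage: dispatch on the "type" field of each record
async function main() {
  for await (const record of readJsonlGz(gzPath)) {
    if (record.type === 'error') {
      console.log('error event:', record);
    }
  }
}
main().catch(console.error);
```

Note that `pipe()` does not forward errors from the file stream to the gunzip stream, so production code would likely want explicit error handling on both.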
Conclusion
JSONL is an excellent choice for streaming data, large datasets, and scenarios where you need to append records efficiently. Its simplicity and compatibility with standard JSON tools make it a practical format for many data processing workflows.