Demo / User Manual

Byte2Bit™ Atlas

Storage optimization, scientific data pipelines, and Python workflows.

Lossless compression | Cloud-optimized workflows | Zarr · GRIB · NetCDF · HDF5 · GeoTIFF · NumPy
Executive snapshot

What Atlas is built to do

Atlas acts as a storage optimization layer for numeric multidimensional data by compressing large archives while preserving fast, query-friendly access patterns.
Up to 60%: cloud storage cost reduction shown on the product page
Decompression speed: faster analytics pipelines claimed for selective retrieval
100%: lossless verification workflow
Product overview

A simple mental model

Source data: GRIB, Zarr, NetCDF, HDF5, GeoTIFF, NumPy
Byte2Bit Atlas: compress, chunk, index, verify
Byte2Bit Zarr: .b2b stores
Analytics: read only what you need

Goal: make large scientific and sensor archives smaller without forcing teams to abandon existing Python and Zarr-style workflows.

Coverage

Supported data and formats

Data types

INTEGER · REAL · COMPLEX · DECIMAL · BYTEARRAY · RAWBYTES

Atlas supports common integer and real numeric types, interleaved complex values, fixed-point decimals, byte arrays, and raw byte payloads.

Input workflows

NumPy · Zarr v3 · GRIB / GRIB2 · NetCDF · HDF5 · GeoTIFF / COG · QGIS via GDAL

The unified b2b.transform(...) API auto-detects common scientific formats and writes Byte2Bit-compressed Zarr output.

Configuration

Compression modes and defaults

Recommended default

Use compression_level=4 and quantize_scale=0 for lossless workflows unless a project explicitly chooses another mode.

Available modes

Lossless levels described in the product demo: 0, 1, 2, and 4. Error-bounded compression is available for real-valued data with level 1.
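
"Error-bounded" means each decoded value stays within a known tolerance of the original. This manual does not describe how Atlas achieves that, so the sketch below only illustrates the general idea with plain uniform quantization. The function names and the `scale` parameter are hypothetical and are not claimed to match `quantize_scale`.

```python
def quantize(values, scale):
    """Map floats to integer codes on a uniform grid with spacing 1/scale."""
    return [round(v * scale) for v in values]


def dequantize(codes, scale):
    """Map integer codes back to floats on the same grid."""
    return [c / scale for c in codes]


values = [0.1234, -2.5, 3.14159, 100.0]
scale = 100  # hypothetical: grid spacing 0.01, so the error bound is 0.005

codes = quantize(values, scale)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(values, restored))

print("codes:", codes)
print("max error:", max_error, "bound:", 0.5 / scale)
```

The integer codes are typically easier to compress than the original floats, at the price of a reconstruction error no larger than half the grid spacing.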

Manual guidance: start with the default lossless path, run verification, then benchmark storage reduction and decode speed on representative data.
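
A minimal benchmarking sketch for the guidance above, assuming any codec can be wrapped as an `encode(bytes) -> bytes` / `decode(bytes) -> bytes` pair. The `benchmark` helper is hypothetical, and `zlib` is only a stand-in codec so the example runs without the Atlas wheel.

```python
import time
import zlib
from array import array


def benchmark(encode, decode, payload, repeats=5):
    """Return (compression ratio, best decode time in seconds).

    Raises ValueError if the decoded bytes differ from the input.
    """
    encoded = encode(payload)
    best = float("inf")
    decoded = b""
    for _ in range(repeats):
        start = time.perf_counter()
        decoded = decode(encoded)
        best = min(best, time.perf_counter() - start)
    if decoded != payload:
        raise ValueError("round trip is not lossless")
    return len(payload) / len(encoded), best


# Stand-in data: a ramp of 32-bit integers, serialized to raw bytes.
payload = array("i", range(100_000)).tobytes()
ratio, seconds = benchmark(zlib.compress, zlib.decompress, payload)
print(f"ratio={ratio:.2f} decode={seconds * 1e3:.3f} ms")
```

Running the same harness on representative project data, rather than on synthetic ramps, gives numbers that are more likely to carry over to production.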
Step 1

Install and sanity check

Use a clean Python environment, install the wheel plus scientific format dependencies, then confirm that byte2bitZarr imports correctly.

python3 -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
python -m pip install --force-reinstall /dist/byte2bitzarr-...whl
export BYTE2BIT_LICENSE_PATH=/dist/COMPANYNAME-0001.license.json
python -m pip install xarray cfgrib eccodes h5py netcdf4 h5netcdf rasterio

python -c "import byte2bitZarr as b2b; print('OK', b2b.__name__)"
NumPy workflow

Compress a NumPy array

For in-memory workflows, encode to bytes and decode into a pre-allocated NumPy output buffer.

import numpy as np
import byte2bitZarr as b2b

x = (np.random.default_rng(0).standard_normal(100000).astype(np.float32) * 5.0)
codec = b2b.Byte2Bit(dtype=np.float32)  # defaults: level 4, lossless

encoded = codec.encode(x)
out = np.empty_like(x)
codec.decode(encoded, out=out)

print('encoded bytes:', len(encoded))
print('equal:', np.array_equal(x, out))
File workflow

Save NumPy payloads as files

Use single-array .b2b files for transfer or persistence, and archive helpers for multiple named arrays.

import numpy as np
import byte2bitZarr as b2b

x = np.random.default_rng(0).standard_normal(100000).astype(np.float32)

b2b.save_byte2bit("Data/numpy/example_array.b2b", x,
                   compression_level=4, quantize_scale=0)

out = np.empty_like(x)
b2b.load_byte2bit_into("Data/numpy/example_array.b2b", out)
print("equal:", np.array_equal(x, out))

info = b2b.inspect_byte2bit("Data/numpy/example_array.b2b")
print(info)
Zarr v3 integration

Persist compressed data in Zarr v3

Write through Zarr with the Byte2Bit serializer to create chunked, queryable compressed stores.

import numpy as np
import zarr
import byte2bitZarr as b2b

x = np.random.default_rng(0).standard_normal(100000).astype(np.float32)
store = zarr.storage.LocalStore("example.byte2bit.b2b")
root = zarr.group(store=store, overwrite=True, zarr_format=3)

arr = root.create_array(
    "x", shape=x.shape, chunks=(10000,), dtype=x.dtype,
    serializer=b2b.Byte2BitZarrV3Codec(compression_level=4, quantize_scale=0),
    compressors=[],
)
arr[:] = x
print("written")
Core API

Transform existing scientific datasets

One function handles the day-to-day conversion path for multiple input formats.
import byte2bitZarr as b2b

b2b.transform("Data/a.zarr", "Data/a.byte2bit.b2b")
b2b.transform("Data/nwp/*.grib", "Data/nwp.byte2bit.b2b")
b2b.transform("Data/dataNetcdf/*.nc", "Data/netcdf.byte2bit.b2b")
b2b.transform("Data/hdf5Data/092535.hdf5", "Data/092535.byte2bit.b2b")
b2b.transform("Data/raster/sample.tif", "Data/raster/sample.byte2bit.b2b",
              cogLayout=True)

Output suffixes are normalized to .b2b. For GeoTIFF/COG, cogLayout=True uses the COG internal block layout when available.

Pipeline demos

GRIB and Zarr pipelines

GRIB → Byte2Bit Zarr

Use for weather and forecast files. Atlas provides cloud-native output, smaller-than-original GRIB storage, and faster decompression.

b2b.transform("*.grib", "out.byte2bit.b2b")

Zarr → Byte2Bit Zarr

Use for existing chunked datasets. The Atlas workflow targets lower storage and egress volume along with faster decompression.

b2b.transform("in.zarr", "out.byte2bit.b2b")
Recommendation: transform a few GRIB files and a few Zarr stores, then compare size, decode time, and verification output.
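
For the size comparison, a small helper can total the bytes of either a single file (such as a GRIB file) or a directory-based store. This is a sketch using only the standard library. `path_size` and `reduction_percent` are hypothetical helper names, and the temporary directory merely stands in for a real store.

```python
import os
import tempfile


def path_size(path):
    """Total bytes of one file, or of every file under a directory tree."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def reduction_percent(original_bytes, compressed_bytes):
    """Storage saved relative to the original, as a percentage."""
    return 100.0 * (1.0 - compressed_bytes / original_bytes)


# Demo on a throwaway directory that mimics a chunked store layout.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "x", "c"))
    with open(os.path.join(tmp, "zarr.json"), "wb") as handle:
        handle.write(b"m" * 50)
    with open(os.path.join(tmp, "x", "c", "0"), "wb") as handle:
        handle.write(b"d" * 100)
    demo_total = path_size(tmp)

print(demo_total, reduction_percent(1000, 250))
```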
Validation

Verify

Verify that the Atlas-compressed output matches the original data.

import byte2bitZarr as b2b

b2b.verify("Data/us_20260217T0100Z.zarr", "Data/us_20260217T0100Z.byte2bit.b2b")
b2b.verify("Data/nwp/*.grib", "Data/nwp.byte2bit.b2b")
b2b.verify("Data/dataNetcdf/*.nc", "Data/dataNetcdf.byte2bit.b2b")
b2b.verify("Data/hdf5Data/092535.hdf5", "Data/092535.byte2bit.b2b")

b2b.verify_equal("Data/us_20260217T0100Z.zarr", "Data/us_20260217T0100Z.byte2bit.b2b",
                 arrays=["x", "y"], progress_every=0)
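
How `b2b.verify` compares data internally is not documented here. As an independent cross-check, a bit-exact comparison can be sketched by hashing raw buffers, which also catches cases that element-wise equality handles differently (NaN never equals itself, and -0.0 equals 0.0). `buffer_digest` is a hypothetical helper, and the `array` objects stand in for original and decoded arrays.

```python
import hashlib
from array import array


def buffer_digest(buf):
    """SHA-256 over the raw bytes of a contiguous buffer-protocol object."""
    return hashlib.sha256(memoryview(buf).cast("B")).hexdigest()


original = array("f", [1.0, float("nan"), -0.0])
restored = array("f", original)                  # bit-identical copy
altered = array("f", [1.0, float("nan"), 0.0])   # sign bit of zero differs

same = buffer_digest(original) == buffer_digest(restored)
changed = buffer_digest(original) != buffer_digest(altered)
print("bit-exact copy matches:", same)
print("sign-of-zero change detected:", changed)
```

The same helper should accept C-contiguous NumPy arrays, since they expose the buffer protocol as well.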
Forecast query

Weather-data query workflow

For forecast archives, Atlas can combine compressed Zarr stores with SQLite metadata for fast selector-first queries.
1. Ingest raw GFS / GRIB: download bounded sample bundles
2. Transform + index: write .b2b stores and refresh SQLite metadata
3. Query latest forecast: threshold-based maturity selection and optional point values
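
The selector-first idea in the steps above can be sketched with the standard `sqlite3` module: query a small metadata table first, then open only the stores and messages it points to. The table name, columns, and store paths below are hypothetical, since the actual Atlas index schema is not shown in this manual.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per stored GRIB message.
conn.execute(
    "CREATE TABLE messages ("
    "store_path TEXT, msg_key TEXT, short_name TEXT, "
    "type_of_level TEXT, run_time TEXT, valid_time TEXT)"
)
rows = [
    ("gfs_00.b2b", "msg_000001", "2t", "heightAboveGround",
     "2026-03-08T00:00:00Z", "2026-03-08T06:00:00Z"),
    ("gfs_00.b2b", "msg_000002", "10u", "heightAboveGround",
     "2026-03-08T00:00:00Z", "2026-03-08T06:00:00Z"),
    ("gfs_06.b2b", "msg_000001", "2t", "heightAboveGround",
     "2026-03-08T06:00:00Z", "2026-03-08T12:00:00Z"),
]
conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?, ?, ?)", rows)

# ISO 8601 UTC strings sort chronologically, so BETWEEN works on text.
hits = conn.execute(
    "SELECT store_path, msg_key FROM messages "
    "WHERE short_name = ? AND valid_time BETWEEN ? AND ? "
    "ORDER BY valid_time",
    ("2t", "2026-03-08T00:00:00Z", "2026-03-09T00:00:00Z"),
).fetchall()
print(hits)
conn.close()
```

Only the rows returned by the query need to be decoded, which is what keeps the compressed stores untouched until a selection has been made.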
Operational example

Forecast query example

This query selects the latest forecast satisfying the threshold rule, and can optionally attach point values by latitude/longitude.

python python/examples/query_latest_forecast.py \
  --db Data/gfs_metadata/forecast_index.sqlite \
  --short-name 2t \
  --type-of-level heightAboveGround \
  --start 2026-03-08T00:00:00Z \
  --end 2026-03-09T00:00:00Z \
  --threshold 6h \
  --lat 52.52 \
  --lon 13.40 \
  --with-values

For each maturity M, select the latest forecast F where F ≤ M - threshold.
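
The rule can be sketched in a few lines, assuming F is a forecast run (issue) time and M is the time being forecast. That reading is an assumption, and `select_forecasts` is a hypothetical helper rather than part of the example script.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone


def select_forecasts(run_times, maturities, threshold):
    """For each maturity M, pick the latest run F with F <= M - threshold.

    Maps each maturity to the chosen run, or to None if no run qualifies.
    """
    runs = sorted(run_times)
    selected = {}
    for maturity in maturities:
        idx = bisect_right(runs, maturity - threshold)
        selected[maturity] = runs[idx - 1] if idx else None
    return selected


utc = timezone.utc
runs = [datetime(2026, 3, 8, hour, tzinfo=utc) for hour in (0, 6, 12, 18)]
early = datetime(2026, 3, 8, 3, tzinfo=utc)
mid = datetime(2026, 3, 8, 14, tzinfo=utc)
late = datetime(2026, 3, 8, 18, tzinfo=utc)

result = select_forecasts(runs, [early, mid, late], timedelta(hours=6))
for maturity, run in result.items():
    print(maturity.isoformat(), "->", run.isoformat() if run else None)
```

With a 6-hour threshold, the 14:00 maturity resolves to the 06:00 run, the 18:00 maturity to the 12:00 run (the bound is inclusive), and the 03:00 maturity has no qualifying run.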

Weather data details

GRIB variable lookup

Packed GRIB output stores fields under messages/msg_XXXXXX. Use array attributes to find meteorological variables before decoding values.

Key attributes

shortName, name, paramId, typeOfLevel, level, dataDate, dataTime, stepRange

Decode target

After selecting a message key, use b2b.decompress_gridsimple_message(store, msg_key) to decode a field to NumPy.
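
The lookup step can be sketched as a filter over per-message attributes. The attribute dictionaries below are made-up examples that use the key names listed above. Collecting them from a real store (for example from each Zarr array's `.attrs`) is left out, and `find_messages` is a hypothetical helper.

```python
def find_messages(attrs_by_key, **wanted):
    """Return sorted message keys whose attributes match every requested value."""
    return sorted(
        key
        for key, attrs in attrs_by_key.items()
        if all(attrs.get(name) == value for name, value in wanted.items())
    )


# Hypothetical attribute snapshots, keyed in the msg_XXXXXX style.
attrs_by_key = {
    "msg_000001": {"shortName": "2t", "typeOfLevel": "heightAboveGround", "level": 2},
    "msg_000002": {"shortName": "10u", "typeOfLevel": "heightAboveGround", "level": 10},
    "msg_000003": {"shortName": "2t", "typeOfLevel": "surface", "level": 0},
}

keys = find_messages(attrs_by_key, shortName="2t", typeOfLevel="heightAboveGround")
print(keys)
```

Each returned key can then be passed as `msg_key` to the decode call described above.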

ML workflow

ML checkpoints and model artifacts

Atlas can store model tensors and metadata in a .b2b.ckpt container for compact transfer and restore.

import numpy as np
import byte2bitZarr as b2b

tensors = {
    "model_weight": np.arange(12, dtype=np.float32).reshape(3, 4),
    "model_bias": np.linspace(-0.2, 0.2, 3, dtype=np.float32),
}
metadata = {"global_step": 120, "epoch": 3, "learning_rate": 1e-3}

b2b.save_ml_checkpoint("Data/checkpoints/model.b2b.ckpt", tensors,
                       metadata=metadata, compression_level=4,
                       quantize_scale=0, atomic=True, checksum=False)
loaded = b2b.load_ml_checkpoint("Data/checkpoints/model.b2b.ckpt")
print(loaded["metadata"]["global_step"])
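
This manual does not say what `atomic=True` does internally. A common pattern behind such a flag is to write to a temporary file and rename it over the target, so readers never see a half-written checkpoint. The sketch below shows that general pattern only. `atomic_write_bytes` is a hypothetical helper and not the Atlas implementation.

```python
import os
import tempfile


def atomic_write_bytes(path, data):
    """Write data to a temp file beside path, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())  # push bytes to disk before renaming
        os.replace(tmp_path, path)     # atomic when both are on one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "model.ckpt")
    atomic_write_bytes(target, b"first")
    atomic_write_bytes(target, b"second")  # replaces the earlier file in one step
    with open(target, "rb") as handle:
        final_bytes = handle.read()
    leftovers = [name for name in os.listdir(tmp) if name.endswith(".tmp")]

print(final_bytes, leftovers)
```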
Geospatial

GeoTIFF / COG and QGIS path

For geospatial raster workflows, transform GeoTIFF/COG inputs into Byte2Bit Zarr and use the GDAL plugin path for desktop GIS demos.

COG ingest

Use cogLayout=True to keep the COG internal block layout where available; use cogLayout=False for a plain TIFF path with no chunk layout.

QGIS demo

The GDAL Byte2Bit plugin is integrated with QGIS and has been tested with Sentinel imagery.

Next steps

Ready to compress data at scale?

Byte2Bit™ Atlas gives teams a practical path from large scientific archives to smaller, verified, query-friendly stores.

Transform · Verify · Query · Integrate