Cyto File Format Specification


Overview

Cyto files (.cyto extension) are HDF5-based data files that store well plate scan results from CytoTronics instruments. These files contain both raw measurement data and metadata about the scan, experiment, and hardware configuration.

File Structure

Base Format

Cyto files are standard HDF5 files that can be opened with any HDF5-compatible library. The files use:

  • HDF5 version: v1.8 or later
  • Compression: Zstandard (Zstd) compression for datasets
  • File extension: .cyto (though .h5 or .hdf5 extensions work equally well)
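
Because a Cyto file is an ordinary HDF5 container, one quick sanity check before passing a path to an HDF5 library is to look for the 8-byte HDF5 format signature. The sketch below uses only the Python standard library. The function name looks_like_hdf5 and the max_offset limit are illustrative choices, not part of the Cyto format; h5py.is_hdf5 offers a comparable check when h5py is installed.

```python
from pathlib import Path

# The 8-byte signature that begins every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path, max_offset=1 << 20):
    """Return True if `path` carries the HDF5 signature.

    The signature sits at byte 0, or at 512, 1024, 2048, ... when the file
    begins with a user block. `max_offset` is a hypothetical search limit.
    """
    size = Path(path).stat().st_size
    with open(path, "rb") as fh:
        offset = 0
        while offset <= max_offset and offset + len(HDF5_SIGNATURE) <= size:
            fh.seek(offset)
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # Candidate offsets are 0, 512, 1024, 2048, ... (doubling).
            offset = 512 if offset == 0 else offset * 2
    return False
```

This only confirms the container type; it says nothing about whether the expected Cyto attributes and datasets are present.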

Root-Level Metadata

The file's root group contains a special attribute _metadata_ that stores the file header as a JSON string. This header contains essential information about the scan.

Header Fields

  • scanType (integer): Identifies the type of scan performed (see Scan Types section below)
  • experimentName (string): User-defined name for the experiment
  • plateNumber (string): Identifier for the well plate being scanned
  • startTime (ISO 8601 timestamp): When the scan started
  • endTime (ISO 8601 timestamp): When the scan completed
  • resultVersion (integer): Version number of the data format (current version is 4)
  • softwareVersion (object): Information about the software that created the file
  • scanLoopTime (ISO 8601 timestamp, optional): For repeated scans, the time of the scan loop
  • scanStep (integer, optional): For repeated scans, which step in the sequence this represents
  • hostname (string): Name of the computer/device that performed the scan
  • hardware (object, optional): Hardware serial numbers and identification
    • fpgaSerialNumber: FPGA board serial number
    • powerBoardSerialNumber: Power board serial number
    • adcBoardSerialNumber: ADC board serial number
    • wellPlateSerialNumber: Well plate serial number
  • scanMetadata (object): Scan-specific configuration and parameters (see below)
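
As a rough illustration of how the header fields above might be checked after decoding, the following sketch parses the JSON string and reports which non-optional fields are absent. Treating every field not marked optional as required is an assumption, and parse_header and REQUIRED_HEADER_FIELDS are hypothetical names introduced here.

```python
import json

# Field names copied from the header field list; fields marked optional
# (scanLoopTime, scanStep, hardware) are left out. Treating the rest as
# required is an assumption, not something the format guarantees.
REQUIRED_HEADER_FIELDS = (
    "scanType", "experimentName", "plateNumber", "startTime", "endTime",
    "resultVersion", "softwareVersion", "hostname", "scanMetadata",
)


def parse_header(header_json):
    """Decode the _metadata_ JSON string and list missing required fields."""
    if isinstance(header_json, bytes):  # some readers return bytes
        header_json = header_json.decode("utf-8")
    header = json.loads(header_json)
    missing = [name for name in REQUIRED_HEADER_FIELDS if name not in header]
    return header, missing


# Usage with a made-up, deliberately incomplete header string.
example_header, example_missing = parse_header(
    '{"scanType": 35, "experimentName": "demo", "resultVersion": 4}'
)
print("Missing fields:", example_missing)
```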

Scan Metadata Structure

The scanMetadata field contains scan-type-specific configuration parameters. All scan metadata objects include:

  • scanType (integer): Same as the top-level scanType field

For Impedance Scans, additional fields include:

  • acSignals (array): List of AC signal configurations, each containing:
    • frequencyHz (number): Frequency in Hz
    • codeAmplitude (number): Signal amplitude in hardware-specific units
    • Other signal-specific parameters
  • saveRaw (boolean): Whether raw ADC data was saved
  • adcFrequencyHz (number): ADC sampling frequency
  • vRef1SettingV, vRef2SettingV (number): Reference voltage settings
  • Additional scan-specific timing and configuration parameters

For Electrophysiology Scans, additional fields include:

  • fsKhz (number): Sampling frequency in kHz (e.g., 50 for 50 kHz)
  • durationS (number): Total scan duration in seconds
  • scanColumnPairs (array): Column pairs for scanning pattern
  • numFrames (integer): Number of frames captured
  • Additional timing and trigger configuration
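
The fsKhz and durationS fields can be combined to estimate how many samples a recording should contain. The sketch below assumes the samples axis covers the full durationS at a constant rate, which the format description does not state explicitly; expected_sample_count is a hypothetical helper name.

```python
def expected_sample_count(scan_meta):
    """Estimate the number of samples from fsKhz and durationS.

    Assumption: the recording runs for the whole durationS at a constant
    sampling rate of fsKhz kHz.
    """
    sample_rate_hz = scan_meta["fsKhz"] * 1000  # kHz -> Hz
    return round(sample_rate_hz * scan_meta["durationS"])


# Made-up electrophysiology metadata for illustration only.
example_meta = {"scanType": 26, "fsKhz": 50, "durationS": 2.5}
print(expected_sample_count(example_meta))
```

Comparing this estimate with the first dimension of the stored data is one way to spot a truncated recording, provided the assumption above holds for the scan in question.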

Scan Types

Each file contains data for a specific scan type, identified by an integer value in the header. The scan type determines the structure and interpretation of the data arrays in the file.

Current Scan Types

The following scan types are currently supported on Pixel™ hardware (which uses the CytoTronics CT100 imaging chip):

Impedance-Based Scans

Scan Type                                       ID   Description
VERTICAL_FIELD_HIGH_FREQ_HBW_CT100              33   Vertical Field High Frequency (4 frequencies), HBW
VERTICAL_FIELD_HIGH_FREQ_HBW_CT100_CALIBRATION  34   Calibration for Vertical Field High Frequency, HBW
VERTICAL_FIELD_LOW_FREQ_HBW_CT100               35   Vertical Field Low Frequency (2 frequencies), HBW - Barrier measurement
VERTICAL_FIELD_LOW_FREQ_HBW_CT100_CALIBRATION   36   Calibration for Vertical Field Low Frequency, HBW
LATERAL_FIELD_HBW_CT100                         37   Lateral Field, HBW
LATERAL_FIELD_D2_HBW_CT100                      39   Lateral Field with Distance = 2, HBW
ELECTRODE_IMPEDANCE_HBW_CT100                   38   Electrode Impedance (Radial Field), HBW
ELECTRODE_IMPEDANCE_D2_HBW_CT100                40   Electrode Impedance with Distance = 2, HBW

Notes:

  • Vertical Field scans measure trans-epithelial electrical resistance (TEER) across cells
  • Lateral Field scans measure lateral impedance between adjacent electrodes
  • Electrode Impedance (also called Radial Field or RF) measures impedance from electrode to ground
  • HBW = High Bandwidth
  • D2 variants use electrodes with distance = 2 for increased sensitivity

Electrophysiology Scans

Scan Type                   ID   Description
CARDIAC_PACED_CT100         25   Cardiac scan with external pacing
CARDIAC_SPONTANEOUS_CT100   26   Cardiac scan without external pacing
NEURAL_SPONTANEOUS_CT100    41   Neural scan - spontaneous activity recording
NEURAL_ACTIVITY_SCAN_CT100  42   Neural activity scan - structured recording

Notes:

  • Electrophysiology scans record high-speed time-series data
  • Typical sampling rates range from 10 to 50 kHz
  • Data is organized by well, with each well containing multiple electrode channels
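
The scan type IDs listed in the two tables above can be captured in an enumeration so that code does not depend on bare integers. This is a minimal sketch: the ScanType class and the scan_family helper are hypothetical names, and only the IDs documented here are included.

```python
from enum import IntEnum


class ScanType(IntEnum):
    """Scan type IDs copied from the impedance and electrophysiology tables."""
    CARDIAC_PACED_CT100 = 25
    CARDIAC_SPONTANEOUS_CT100 = 26
    VERTICAL_FIELD_HIGH_FREQ_HBW_CT100 = 33
    VERTICAL_FIELD_HIGH_FREQ_HBW_CT100_CALIBRATION = 34
    VERTICAL_FIELD_LOW_FREQ_HBW_CT100 = 35
    VERTICAL_FIELD_LOW_FREQ_HBW_CT100_CALIBRATION = 36
    LATERAL_FIELD_HBW_CT100 = 37
    ELECTRODE_IMPEDANCE_HBW_CT100 = 38
    LATERAL_FIELD_D2_HBW_CT100 = 39
    ELECTRODE_IMPEDANCE_D2_HBW_CT100 = 40
    NEURAL_SPONTANEOUS_CT100 = 41
    NEURAL_ACTIVITY_SCAN_CT100 = 42


EPHYS_TYPES = frozenset({25, 26, 41, 42})
IMPEDANCE_TYPES = frozenset(range(33, 41))  # IDs 33 through 40


def scan_family(scan_type):
    """Classify a scanType integer; undocumented IDs map to 'unknown'."""
    if scan_type in IMPEDANCE_TYPES:
        return "impedance"
    if scan_type in EPHYS_TYPES:
        return "electrophysiology"
    return "unknown"


print(ScanType(35).name, scan_family(35))
```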

Legacy Scan Types

Earlier scan types from beta hardware (IDs in the ranges 1-14 and 18-27, excluding the current types listed above) are considered obsolete and are not documented here.

Data Organization

Impedance Scan Data Structure

Impedance scans typically contain the following datasets:

Main Data Arrays

  1. /imgMagnitudes (float32)

    • Shape: [rows, columns, frequencies, adc_channels, pixels]
    • Contains the magnitude of impedance measurements
    • Organized by well plate position, frequency, ADC channel, and pixel
  2. /imgDCComponent (float32)

    • Shape: Same as imgMagnitudes
    • DC component of the signal for each measurement
  3. /imgVRefs (float32)

    • Shape: [rows, columns, frequencies, reference_channels]
    • Voltage reference measurements
  4. /temperatureData (uint16)

    • Temperature sensor readings during scan
  5. /rawADCData (uint16, optional)

    • Raw ADC samples (only saved if explicitly requested)
    • Large dataset, typically omitted

Electrophysiology Scan Data Structure

Electrophysiology scans contain time-series data with the following structure:

Main Datasets

  1. /ephysData

    • Shape: [samples, rows, columns, frames, adc_channels, pixel_channels]
    • Data type: uint16
    • High-speed time-series data from all electrodes
  2. /pixelSource and /pixelDest (uint16)

    • Mapping arrays for pixel reorganization
    • Shape: [n_pixels, 2]
  3. /vrefSource and /vrefDest (uint16)

    • Mapping arrays for voltage reference channels
    • Shape: [n_vrefs, 2]
  4. /scanStartTime and /scanEndTime (uint64)

    • Timestamps for scan timing
    • High-resolution timing information
  5. /temperatureData (uint16)

    • Temperature sensor readings

Python Access Examples

Basic File Access and Metadata

import h5py
import hdf5plugin  # Required for Zstd compression support
import json
import numpy as np

# Open a Cyto file
with h5py.File('scan_data.cyto', 'r') as f:
    # Read the header metadata
    header_json = f.attrs['_metadata_']
    header = json.loads(header_json)

    # Basic header information
    print(f"Scan Type: {header['scanType']}")
    print(f"Experiment: {header['experimentName']}")
    print(f"Plate: {header['plateNumber']}")
    print(f"Start Time: {header['startTime']}")
    print(f"Result Version: {header['resultVersion']}")

    # Access scan-specific metadata
    scan_meta = header['scanMetadata']
    print(f"\nScan Metadata Version: {scan_meta.get('__VERSION__', 'N/A')}")

    # For impedance scans - get frequency information
    if 'acSignals' in scan_meta:
        print("\nFrequencies:")
        for signal in scan_meta['acSignals']:
            freq_khz = signal['frequencyHz'] / 1000
            print(f"  {freq_khz:.1f} kHz")

    # For ephys scans - get sampling parameters
    if 'fsKhz' in scan_meta:
        print(f"\nSample Rate: {scan_meta['fsKhz']} kHz")
        print(f"Duration: {scan_meta['durationS']} seconds")
        if 'numFrames' in scan_meta:
            print(f"Frames: {scan_meta['numFrames']}")

    # List all datasets
    print("\nAvailable datasets:")
    for key in f.keys():
        item = f[key]
        if isinstance(item, h5py.Dataset):  # Groups have no shape or dtype
            print(f"  {key}: shape={item.shape}, dtype={item.dtype}")

Reading Impedance Data

import h5py
import hdf5plugin  # Required for Zstd compression support
import numpy as np
import matplotlib.pyplot as plt

def load_impedance_scan(filename):
    """Load impedance scan data from a Cyto file."""
    with h5py.File(filename, 'r') as f:
        # Load magnitude data
        magnitudes = f['/imgMagnitudes'][:]
        dc_component = f['/imgDCComponent'][:]
        vrefs = f['/imgVRefs'][:]

        # Get dimensions
        n_rows, n_cols, n_freqs, n_adcs, n_pixels = magnitudes.shape

        print(f"Plate dimensions: {n_rows} x {n_cols} wells")
        print(f"Frequencies: {n_freqs}")
        print(f"ADC channels: {n_adcs}")
        print(f"Pixels per well: {n_pixels}")

        return {
            'magnitudes': magnitudes,
            'dc_component': dc_component,
            'vrefs': vrefs
        }

# Example: Visualize a single well at a specific frequency
data = load_impedance_scan('barrier_scan.cyto')

# Extract data for well at row=2, col=3, frequency=0
well_row, well_col, freq_idx = 2, 3, 0
well_data = data['magnitudes'][well_row, well_col, freq_idx, :, :]

# well_data is now [n_adcs, n_pixels]
# You can reshape or process as needed for your analysis
plt.figure(figsize=(10, 6))
plt.imshow(well_data, aspect='auto', cmap='viridis')
plt.colorbar(label='Impedance Magnitude')
plt.title(f'Well ({well_row}, {well_col}) - Frequency Index {freq_idx}')
plt.xlabel('Pixel')
plt.ylabel('ADC Channel')
plt.show()

Reading Electrophysiology Data

import h5py
import hdf5plugin  # Required for Zstd compression support
import json
import numpy as np
import matplotlib.pyplot as plt

def load_ephys_well(filename, row, col, start_time=0, end_time=None):
    """
    Load electrophysiology data for a specific well.

    Parameters:
    -----------
    filename : str
        Path to the .cyto file
    row : int
        Well row index
    col : int
        Well column index
    start_time : float
        Start time in seconds (default: 0)
    end_time : float or None
        End time in seconds (default: None = entire duration)

    Returns:
    --------
    dict with 'data', 'sample_rate', 'shape', and timing info
    """
    with h5py.File(filename, 'r') as f:
        # Get header for sample rate
        header = json.loads(f.attrs['_metadata_'])
        scan_metadata = header['scanMetadata']
        sample_rate = scan_metadata['fsKhz'] * 1000  # Convert kHz to Hz

        # Get dataset shape
        ephys_data = f['/ephysData']
        total_samples = ephys_data.shape[0]

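        # Calculate sample indices, clamping the end to the dataset length.
        # Compare against None so that end_time=0 is not treated as "unset".
        start_sample = int(start_time * sample_rate)
        if end_time is None:
            end_sample = total_samples
        else:
            end_sample = min(int(end_time * sample_rate), total_samples)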

        # Read data for specific well
        # Note: rows are stored in reverse order
        n_rows = ephys_data.shape[1]
        row_corrected = n_rows - 1 - row

        # Load the data slice (Version 4 format)
        # Shape: [samples, rows, columns, frames, adc, pixels]
        data = ephys_data[start_sample:end_sample, row_corrected, col, :, :, :]

        # Load mapping arrays
        pixel_source = f['/pixelSource'][:]
        pixel_dest = f['/pixelDest'][:]

        return {
            'data': np.array(data),
            'sample_rate': sample_rate,
            'shape': data.shape,
            'pixel_source': pixel_source,
            'pixel_dest': pixel_dest,
            'start_sample': start_sample,
            'end_sample': end_sample
        }

# Example usage
result = load_ephys_well('cardiac_spontaneous.cyto', row=4, col=5,
                         start_time=1.0, end_time=3.0)

print(f"Loaded data shape: {result['shape']}")
print(f"Sample rate: {result['sample_rate']} Hz")
print(f"Duration: {result['data'].shape[0] / result['sample_rate']:.2f} seconds")

# Plot time series for a specific channel
# Data shape: [samples, frames, adc, pixels]
channel_data = result['data'][:, 0, 0, 0]  # First frame, first ADC, first pixel

time_axis = np.arange(len(channel_data)) / result['sample_rate']

plt.figure(figsize=(12, 4))
plt.plot(time_axis, channel_data)
plt.xlabel('Time (s)')
plt.ylabel('ADC Value (uint16)')
plt.title('Electrophysiology Signal')
plt.grid(True)
plt.show()

Extracting Metadata

import h5py
import hdf5plugin  # Required for Zstd compression support
import json
from datetime import datetime

def extract_scan_metadata(filename):
    """Extract and parse all metadata from a Cyto file."""
    with h5py.File(filename, 'r') as f:
        header = json.loads(f.attrs['_metadata_'])

        # Parse timestamps, tolerating a missing or empty endTime
        start_time = datetime.fromisoformat(header['startTime'])
        end_raw = header.get('endTime')
        end_time = datetime.fromisoformat(end_raw) if end_raw else None

        # Extract scan-specific metadata
        scan_metadata = header['scanMetadata']

        metadata = {
            'experiment': {
                'name': header['experimentName'],
                'plate_number': header['plateNumber'],
                'start_time': start_time,
                'end_time': end_time,
                'duration': (end_time - start_time).total_seconds() if end_time else None
            },
            'scan': {
                'type': header['scanType'],
                'type_label': scan_metadata.get('scanLabel', 'Unknown'),
                'version': header['resultVersion']
            },
            'hardware': header.get('hardware', {}),
            'software': header['softwareVersion'],
            'scan_config': scan_metadata
        }

        # For impedance scans, extract frequency information
        if 'acSignals' in scan_metadata:
            frequencies = [signal['frequencyHz'] for signal in scan_metadata['acSignals']]
            metadata['scan']['frequencies_hz'] = frequencies
            # Use a name other than `f`, which is the open file handle here
            metadata['scan']['frequencies_khz'] = [hz / 1000 for hz in frequencies]

        # For ephys scans, extract sampling rate
        if 'fsKhz' in scan_metadata:
            metadata['scan']['sample_rate_hz'] = scan_metadata['fsKhz'] * 1000

        return metadata

# Example usage
metadata = extract_scan_metadata('scan_data.cyto')

print("Experiment Information:")
print(f"  Name: {metadata['experiment']['name']}")
print(f"  Plate: {metadata['experiment']['plate_number']}")
print(f"  Start: {metadata['experiment']['start_time']}")
duration = metadata['experiment']['duration']
if duration is not None:  # duration is None when endTime is unavailable
    print(f"  Duration: {duration:.1f} seconds")

print(f"\nScan Type: {metadata['scan']['type_label']}")
print(f"Version: {metadata['scan']['version']}")

if 'frequencies_khz' in metadata['scan']:
    print(f"Frequencies: {metadata['scan']['frequencies_khz']} kHz")

if 'sample_rate_hz' in metadata['scan']:
    print(f"Sample Rate: {metadata['scan']['sample_rate_hz']} Hz")

Batch Processing Multiple Wells

import h5py
import hdf5plugin  # Required for Zstd compression support
import numpy as np
from tqdm import tqdm

def process_all_wells_impedance(filename, frequency_index=0,
                                  process_func=np.mean):
    """
    Process impedance data for all wells in a plate.

    Parameters:
    -----------
    filename : str
        Path to the .cyto file
    frequency_index : int
        Which frequency to analyze
    process_func : callable
        Function to apply to each well's data (e.g., np.mean, np.std)

    Returns:
    --------
    2D array with processed values for each well
    """
    with h5py.File(filename, 'r') as f:
        magnitudes = f['/imgMagnitudes']
        n_rows, n_cols = magnitudes.shape[:2]

        results = np.zeros((n_rows, n_cols))

        for row in tqdm(range(n_rows), desc="Processing rows"):
            for col in range(n_cols):
                # Load data for this well
                well_data = magnitudes[row, col, frequency_index, :, :]

                # Apply processing function
                results[row, col] = process_func(well_data)

        return results

# Example: Calculate mean impedance for each well
mean_impedance = process_all_wells_impedance('barrier_scan.cyto',
                                              frequency_index=1,
                                              process_func=np.mean)

# Visualize as heatmap
import matplotlib.pyplot as plt

plt.figure(figsize=(10, 8))
plt.imshow(mean_impedance, cmap='viridis', aspect='auto')
plt.colorbar(label='Mean Impedance Magnitude')
plt.title('Well Plate Heatmap')
plt.xlabel('Column')
plt.ylabel('Row')
plt.show()

Data Versioning

Cyto files include a resultVersion field in the header to track the data format version for each scan type. The current version is Version 4 for most scan types.

Current Data Format (Version 4)

Impedance Scans:

  • Multi-dimensional arrays organized by: [rows, columns, frequencies, adc_channels, pixels]
  • Includes temperature data
  • Uses Zstandard compression
  • Reference voltage measurements included

Electrophysiology Scans:

  • Shape: [samples, rows, columns, frames, adc_channels, pixel_channels]
  • Multi-frame support for advanced acquisition modes
  • High-resolution timing information (scanStartTime, scanEndTime)
  • Includes pixel remapping arrays (pixelSource, pixelDest)
  • Temperature monitoring during acquisition

Important Notes

Coordinate Systems

  1. Well Plate Coordinates:

    • Rows and columns follow standard microplate notation
    • Row 0, Column 0 is typically the top-left well (A1)
  2. Electrophysiology Data:

    • Rows are stored in reverse order in the HDF5 file
    • When accessing row N, use: n_rows - 1 - N
  3. Pixel Mapping:

    • Electrophysiology data requires remapping using pixelSource and pixelDest arrays
    • This corrects for hardware scanning patterns
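
The coordinate rules above can be sketched as two small helpers: one converts a well name such as "A1" into zero-based indices, and one applies the row reversal used by electrophysiology data. The sketch assumes that A1 maps to row 0, column 0 (described above as typical, not guaranteed) and that rows use a single letter. Both function names are hypothetical.

```python
def well_name_to_index(name):
    """Convert a well name like 'A1' or 'c4' to zero-based (row, col).

    Assumes single-letter rows and that A1 is (0, 0).
    """
    name = name.strip().upper()
    if len(name) < 2 or not ("A" <= name[0] <= "Z") or not name[1:].isdigit():
        raise ValueError(f"not a well name: {name!r}")
    row = ord(name[0]) - ord("A")
    col = int(name[1:]) - 1
    if col < 0:
        raise ValueError(f"column numbers start at 1: {name!r}")
    return row, col


def stored_ephys_row(row, n_rows):
    """Map logical row N to the reversed row index: n_rows - 1 - N."""
    if not 0 <= row < n_rows:
        raise IndexError(f"row {row} is outside 0..{n_rows - 1}")
    return n_rows - 1 - row


row, col = well_name_to_index("C4")
print(row, col, stored_ephys_row(row, n_rows=8))
```

Here n_rows=8 is only an example value; in practice it would come from the second dimension of the stored electrophysiology array.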

Compression

All datasets use Zstandard compression, which provides:

  • Excellent compression ratios (typically 3-5x for impedance data)
  • Fast decompression speeds
  • The hdf5plugin library handles this automatically

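To install the required dependencies:

pip install h5py hdf5plugin numpy

The plotting and batch-processing examples above additionally use matplotlib and tqdm:

pip install matplotlib tqdm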

Memory Considerations

  • Impedance files: Typically 10-100 MB
  • Electrophysiology files: Can be 500 MB to several GB
  • Use slicing to load only needed portions:
import h5py
import hdf5plugin  # Required for Zstd compression support

# Good: Load only what you need
with h5py.File('large_file.cyto', 'r') as f:
    subset = f['/ephysData'][0:10000, 5, 6, :, :, :]  # Load specific slice

    # Avoid: Loading the entire dataset at once
    # data = f['/ephysData'][:]  # May cause memory issues for large files
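
To decide how large a slice can safely be, it can help to estimate its in-memory size before reading it. The sketch below multiplies the slice dimensions by the element size (2 bytes for uint16 data). The helper names and the example dimensions are illustrative assumptions, not values taken from a real file.

```python
import math


def slice_nbytes(shape, itemsize=2):
    """Uncompressed size in bytes of an array slice (itemsize=2 for uint16)."""
    return math.prod(shape) * itemsize


def max_samples_for_budget(budget_bytes, per_sample_shape, itemsize=2):
    """How many samples fit in a memory budget for one well.

    `per_sample_shape` holds the non-time dimensions of the slice,
    for example (frames, adc_channels, pixel_channels).
    """
    return budget_bytes // slice_nbytes(per_sample_shape, itemsize)


# Illustrative dimensions only: 10,000 samples x 2 frames x 4 ADCs x 8 pixels.
print(slice_nbytes((10000, 2, 4, 8)), "bytes")
print(max_samples_for_budget(1_000_000, (2, 4, 8)), "samples per MB-sized chunk")
```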

File Format History

The Cyto file format is currently at Version 4 for data arrays and Version 6 for the file header. This represents the mature, stable format used by all current Pixel™ instruments.

Legacy versions (1-3) may be encountered in archived data but are not generated by current systems.
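
When working with archived data, a reader may want to branch on resultVersion before interpreting the arrays. The following sketch classifies a decoded header using only the version numbers stated in this document (4 as current, 1-3 as legacy). The function name and the returned labels are hypothetical.

```python
CURRENT_RESULT_VERSION = 4  # current data format version per this document


def describe_result_version(header):
    """Label a header's resultVersion as current, legacy, unknown, or missing."""
    version = header.get("resultVersion")
    if version is None:
        return "missing"
    if version == CURRENT_RESULT_VERSION:
        return "current"
    if 1 <= version < CURRENT_RESULT_VERSION:
        return "legacy"
    return "unknown"  # e.g. a newer version than this document describes


print(describe_result_version({"resultVersion": 4}))
```

The array layouts described in this document apply to the current version only, so code that encounters a legacy or unknown label should not assume those shapes.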

Additional Resources

For questions or issues with Cyto file formats, please contact CytoTronics support.