API Reference

This section contains the complete API reference for GridGulp, automatically generated from the source code.

Core Classes

GridGulp

The main entry point for table detection and extraction.

Config

Configuration options for customizing GridGulp behavior.

Models

DetectionResult

The result of table detection, containing all detected tables and metadata.

TableInfo

Information about a detected table including range, confidence, and headers.

FileInfo

File metadata and type information.

SheetData

Raw sheet data with cell access methods.

Detectors

SimpleCaseDetector

Fast detector for single tables starting near A1.

IslandDetector

Detects multiple disconnected data regions.
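
One common way to find disconnected data regions is a flood fill over the non-empty cells. The sketch below illustrates that general idea and is not taken from GridGulp's source; the `find_islands` name, the list-of-lists grid, and the 8-neighbour connectivity rule are all assumptions made for the example.

```python
from collections import deque


def find_islands(grid):
    """Return (top, left, bottom, right) bounding boxes of connected
    non-empty regions. Hypothetical helper, not a GridGulp function."""
    rows = len(grid)
    cols = max((len(row) for row in grid), default=0)

    def filled(r, c):
        # Ragged rows are treated as padded with empty cells.
        return c < len(grid[r]) and grid[r][c] not in (None, "")

    seen = set()
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or not filled(r, c):
                continue
            # Breadth-first flood fill from this unvisited non-empty cell.
            queue = deque([(r, c)])
            seen.add((r, c))
            top, left, bottom, right = r, c, r, c
            while queue:
                cr, cc = queue.popleft()
                top, bottom = min(top, cr), max(bottom, cr)
                left, right = min(left, cc), max(right, cc)
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = cr + dr, cc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and (nr, nc) not in seen and filled(nr, nc)):
                            seen.add((nr, nc))
                            queue.append((nr, nc))
            boxes.append((top, left, bottom, right))
    return boxes


grid = [
    ["id", "name", None, None, None],
    [1, "a", None, None, None],
    [None, None, None, None, None],
    [None, None, None, "k", "v"],
    [None, None, None, "x", 2],
]
print(find_islands(grid))
```

A real detector would likely also tolerate small gaps inside a table and score each region; this sketch only separates regions that share no adjacent cells.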

ExcelMetadataExtractor

Extracts native Excel table definitions.

MultiHeaderDetector

Detects complex multi-row headers with merged cells.
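
To make the multi-row header problem concrete: when a merged cell is read, typically only its first cell carries a value and the rest come back empty. The sketch below shows one plausible way to flatten such headers into single column names, by forward-filling each header row and then joining the rows per column. `flatten_headers` and its separator are hypothetical, and this is not necessarily how MultiHeaderDetector works.

```python
def flatten_headers(header_rows, sep=" / "):
    """Combine stacked header rows into one name per column.
    Hypothetical helper for illustration only."""
    filled_rows = []
    for row in header_rows:
        last = None
        filled = []
        for value in row:
            # A merged cell leaves blanks to the right of its value,
            # so carry the last seen value forward.
            if value not in (None, ""):
                last = value
            filled.append(last)
        filled_rows.append(filled)

    names = []
    for column in zip(*filled_rows):
        parts = []
        for part in column:
            # Skip blanks and avoid repeating the same label twice in a row.
            if part is not None and (not parts or parts[-1] != str(part)):
                parts.append(str(part))
        names.append(sep.join(parts))
    return names


rows = [
    ["Region", "Sales", None, "Costs", None],
    [None, "Q1", "Q2", "Q1", "Q2"],
]
print(flatten_headers(rows))
```

Forward-filling cannot tell a merged cell from a genuinely blank one, so a sketch like this over-propagates labels when a header row has real gaps.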

Readers

ExcelReader

Reader for Excel files using openpyxl.

CSVReader

Reader for CSV and delimited text files.
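
Reading "delimited text" usually means guessing the delimiter first. The sketch below does this with the standard library's `csv.Sniffer`; it is an illustration of the technique, not CSVReader's implementation, and `read_delimited` and its candidate delimiter set are assumptions.

```python
import csv
import io


def read_delimited(text, candidates=",;\t|"):
    """Parse delimited text after sniffing its dialect.
    Hypothetical helper, not part of GridGulp."""
    try:
        # Sniff a bounded sample so very large inputs stay cheap.
        dialect = csv.Sniffer().sniff(text[:4096], delimiters=candidates)
    except csv.Error:
        # Sniffing failed; fall back to plain comma-separated values.
        dialect = csv.excel
    return list(csv.reader(io.StringIO(text), dialect))


sample = "a;b;c\n1;2;3\n4;5;6\n"
for row in read_delimited(sample):
    print(row)
```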

TextReader

Reader for generic text files with table detection.

Utilities

FileFormatDetector

Advanced file type detection using multiple methods.
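
One of the "multiple methods" a format detector can use is checking a file's leading bytes rather than trusting its extension. The sketch below is a minimal example of that single technique, assuming nothing about FileFormatDetector's actual checks: `sniff_format` and its return labels are made up for illustration. The two signatures themselves are standard: ZIP containers (which include .xlsx) start with `PK\x03\x04`, and OLE2 compound files (which include legacy .xls) start with `D0 CF 11 E0 A1 B1 1A E1`.

```python
import codecs

ZIP_MAGIC = b"PK\x03\x04"
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def sniff_format(header: bytes) -> str:
    """Classify a file from its first bytes. Hypothetical helper."""
    if header.startswith(ZIP_MAGIC):
        return "zip"    # e.g. .xlsx, but also any other ZIP archive
    if header.startswith(OLE2_MAGIC):
        return "ole2"   # e.g. legacy .xls
    if b"\x00" in header:
        return "binary"
    try:
        # Incremental decoding tolerates a multi-byte character that was
        # cut off at the end of the header sample.
        codecs.getincrementaldecoder("utf-8")().decode(header, final=False)
    except UnicodeDecodeError:
        return "binary"
    return "text"


print(sniff_format(b"PK\x03\x04..."))
print(sniff_format(b"name,value\n1,2\n"))
```

A signature match only narrows the container type; telling an .xlsx from another ZIP file would need a look inside the archive.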

DataFrameExtractor

Converts detected tables to pandas DataFrames.

Exceptions

All GridGulp exceptions inherit from GridGulpError:

  • FileNotFoundError - File does not exist
  • ReaderError - Error reading file
  • UnsupportedFormatError - File format not supported
  • DetectionError - Error during table detection
  • ExtractionError - Error extracting table data
  • ConfigError - Invalid configuration
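
Because every exception shares one base class, callers can handle specific failures first and fall back to GridGulpError for the rest. The sketch below demonstrates that pattern with local stand-in classes that mirror part of the hierarchy listed above; they are not imported from the library, and `load` and `describe_failure` are hypothetical functions invented for the example.

```python
# Stand-ins mirroring the documented hierarchy (not the real classes).
class GridGulpError(Exception):
    pass


class ReaderError(GridGulpError):
    pass


class UnsupportedFormatError(GridGulpError):
    pass


def load(path):
    """Hypothetical loader that always fails, to exercise the handlers."""
    if not path.endswith((".xlsx", ".csv")):
        raise UnsupportedFormatError(path)
    raise ReaderError(path)


def describe_failure(path):
    try:
        load(path)
    except UnsupportedFormatError as exc:
        # Most specific handler first.
        return f"unsupported format: {exc}"
    except GridGulpError as exc:
        # Catch-all for any other library error.
        return f"processing failed: {exc}"
    return "ok"


print(describe_failure("notes.pdf"))
print(describe_failure("data.xlsx"))
```

Note that FileNotFoundError is also the name of a Python built-in, so importing the library's class by bare name into a module would shadow the built-in there.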

Type Hints

GridGulp is type-hinted throughout. Key type aliases include:

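from pathlib import Path
from typing import Union

# A file location, given either as a string or as a Path object
FilePath = Union[str, Path]
# The value held by a single cell (None for an empty cell)
CellValue = Union[str, int, float, bool, None]
# Table bounds: start_row, start_col, end_row, end_col
TableRange = tuple[int, int, int, int]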
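
As an illustration of working with the TableRange alias, the sketch below converts a range tuple to spreadsheet A1 notation. It assumes the indices are zero-based and inclusive, which the reference above does not state; `column_letter` and `to_a1` are hypothetical helpers, not GridGulp functions.

```python
TableRange = tuple[int, int, int, int]  # start_row, start_col, end_row, end_col


def column_letter(index: int) -> str:
    """Convert a zero-based column index to letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1  # switch to one-based for the base-26 conversion
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def to_a1(table_range: TableRange) -> str:
    """Render a range as A1 notation, assuming zero-based inclusive bounds."""
    start_row, start_col, end_row, end_col = table_range
    return (f"{column_letter(start_col)}{start_row + 1}:"
            f"{column_letter(end_col)}{end_row + 1}")


print(to_a1((0, 0, 9, 3)))
```

If the library's ranges turn out to be one-based, the `+ 1` adjustments would need to be dropped.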