GLiNER2 for Swift: Unified Schema-Based Information Extraction on Apple Silicon

Tech Note • Machine Learning

We present GLiNER2Swift, a native Swift/MLX implementation of GLiNER2, a unified schema-based information extraction framework originally introduced by Zaratiana et al. in the GLiNER line of work. GLiNER2 extends the original GLiNER paradigm of generalist and lightweight named entity recognition toward a broader schema-driven information extraction framework. The original Python implementation by Fastino AI is available at https://github.com/fastino-ai/gliner2.

Our work brings this unified, schema-based extraction approach to the Apple ecosystem with first-class Apple Silicon support. Designed as a CPU-first, production-ready library for macOS, GLiNER2Swift enables developers and researchers to deploy advanced information extraction pipelines directly in Swift applications without relying on Python runtimes or GPU acceleration.

1. Background

The GLiNER framework was introduced as a generalist, lightweight alternative to traditional NER systems, allowing entity extraction without task-specific retraining. The original GLiNER paper:

Zaratiana, U., Tomeh, N., Holat, P., & Charnois, T. (2023). GLiNER: Generalist and Lightweight Model for Named Entity Recognition. arXiv preprint arXiv:2311.08526. https://arxiv.org/abs/2311.08526

proposes a model that performs named entity recognition using label descriptions rather than fixed classification heads. As stated in the paper:

"GLiNER is a generalist model for Named Entity Recognition that can generalize to arbitrary entity types defined at inference time without task-specific fine-tuning." -- Zaratiana et al., 2023

GLiNER2 extends this idea into a broader schema-based information extraction framework, supporting not only NER but also classification and structured extraction under a unified architecture. The reference Python implementation is available at: https://github.com/fastino-ai/gliner2

2. Motivation

Despite rapid progress in NLP research, most modern frameworks remain Python-centric and GPU-dependent. This creates friction for macOS-native development, especially in contexts where:

  • On-device inference is required for privacy or latency
  • Deployment targets Apple Silicon hardware
  • Swift is the primary language
  • Tight integration with native macOS applications is needed

GLiNER2Swift addresses this gap by providing a numerically faithful port of GLiNER2 implemented fully in Swift and optimized for macOS 14+ on Apple Silicon (M1/M2/M3).

3. Unified Schema-Based Extraction in Swift

GLiNER2Swift preserves the schema-driven paradigm introduced by GLiNER. Instead of training separate models for NER, classification, and structured extraction, developers define extraction tasks dynamically via label schemas.

The library currently supports:

  • Named Entity Recognition (NER)
  • Text Classification
  • Structured Data Extraction
  • (Relation extraction -- in progress)

Example usage:

let model = try await GLiNER2.fromPretrained("macpaw-research/gliner2-base-v1")

let entities = try model.extractEntities(
    from: "Tim Cook is CEO of Apple in Cupertino.",
    labels: ["person", "company", "location"]
)

This design enables rapid adaptation to new domains without retraining, aligning with the original GLiNER philosophy of inference-time label flexibility.
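The unified-schema idea can be sketched as a small Swift value type that bundles several task definitions into one request. The types below (TaskKind, ExtractionSchema) are illustrative names for this note, not GLiNER2Swift's actual API:

```swift
// Illustrative sketch only: these types are NOT the GLiNER2Swift API.
// They show how a single schema value can describe several task kinds
// that a unified model could serve in one forward pass.

/// The kinds of extraction a unified schema can request.
enum TaskKind {
    case entities(labels: [String])          // zero-shot NER
    case classification(labels: [String])    // text classification
    case structure(fields: [String: String]) // field name -> description
}

/// One schema bundles any number of tasks for a single input text.
struct ExtractionSchema {
    var tasks: [TaskKind] = []

    mutating func addEntities(_ labels: [String]) {
        tasks.append(.entities(labels: labels))
    }
    mutating func addClassification(_ labels: [String]) {
        tasks.append(.classification(labels: labels))
    }
}

var schema = ExtractionSchema()
schema.addEntities(["person", "company", "location"])
schema.addClassification(["news", "opinion", "press release"])
print(schema.tasks.count) // 2
```

The point of this shape is that labels live in data, not in the model: adding a new entity type or class is a one-line schema change, with no retraining.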

4. Architecture and Implementation

GLiNER2Swift is a direct architectural port of the Python GLiNER2 implementation and aims to achieve numerical parity with it. The model architecture includes:

  • Encoder: DeBERTa v3 with disentangled attention
  • Span Marker: MLP-based span representation
  • Count LSTM: Entity count prediction mechanism
  • Downscaled Transformer: Schema embedding module
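At the heart of the GLiNER family, entity typing reduces to a similarity between span representations and label representations. The following self-contained sketch illustrates that matching step with plain arrays and toy vectors; the real model uses DeBERTa v3 encodings and MLX tensors, so treat this only as a conceptual illustration:

```swift
// Simplified illustration of span-label matching: each candidate span
// and each label description is embedded, and the predicted label is
// the one whose embedding scores highest against the span embedding.
// Real GLiNER2 embeddings come from DeBERTa v3; these vectors are toys.

func dot(_ a: [Double], _ b: [Double]) -> Double {
    zip(a, b).reduce(0) { $0 + $1.0 * $1.1 }
}

/// Returns the index of the best-scoring label, or nil if no score
/// clears the threshold (such spans are treated as non-entities).
func bestLabel(span: [Double], labels: [[Double]], threshold: Double) -> Int? {
    let scores = labels.map { dot(span, $0) }
    guard let (idx, best) = scores.enumerated().max(by: { $0.element < $1.element }),
          best >= threshold else { return nil }
    return idx
}

let spanEmbedding = [0.9, 0.1, 0.0]          // e.g. the span "Tim Cook"
let labelEmbeddings = [
    [1.0, 0.0, 0.0],  // "person"
    [0.0, 1.0, 0.0],  // "company"
    [0.0, 0.0, 1.0],  // "location"
]
let match = bestLabel(span: spanEmbedding, labels: labelEmbeddings, threshold: 0.5)
print(match ?? -1) // 0 -> "person"
```

Because labels are embedded at inference time, the same scoring machinery handles arbitrary label sets, which is what enables the zero-shot behavior described above.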

The implementation is written in pure Swift and leverages Apple's MLX framework for array operations on Apple Silicon.

Performance on Apple Silicon:

  • Model loading: ~2 seconds
  • Inference: ~50ms per sentence (length-dependent)

The system is CPU-first and does not require GPU acceleration.
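Per-sentence latency figures like the ones above can be reproduced with a simple wall-clock measurement; `ContinuousClock` (available on macOS 13+) is a convenient way to do this in Swift. The timed closure here is a placeholder workload standing in for a real inference call:

```swift
// Minimal latency measurement helper using ContinuousClock (macOS 13+).
// `work` stands in for a real inference call such as extractEntities.
func measureMilliseconds(_ work: () -> Void) -> Double {
    let clock = ContinuousClock()
    let elapsed = clock.measure { work() }
    // Duration exposes (seconds, attoseconds); convert to milliseconds.
    let (seconds, attoseconds) = elapsed.components
    return Double(seconds) * 1_000 + Double(attoseconds) / 1e15
}

// Example: time a placeholder workload in place of model inference.
let ms = measureMilliseconds {
    _ = (0..<100_000).reduce(0, +)
}
print(ms >= 0) // true
```

Averaging over a warm-up run plus many sentences, rather than a single call, gives more stable numbers, since the first call typically pays one-time setup costs.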

5. Design Principles

GLiNER2Swift follows several core principles:

  1. Native-first -- Pure Swift implementation without Python bridges
  2. On-device by default -- Privacy-preserving local inference
  3. Reproducibility -- Architectural fidelity to the reference implementation
  4. Developer ergonomics -- Swift Package Manager integration
  5. Extensibility -- Foundation for future fine-tuning and adapter support

Currently supported model:

  • fastino/gliner2-base-v1 (205M parameters)
  • macpaw-research/gliner2-base-v1_mlx (FP16 weights instead of FP32 to reduce model size)

Planned features include training loops, relation extraction, LoRA adapters, and additional model variants.

6. Applications and Impact

GLiNER2Swift enables:

  • Intelligent document parsing
  • Flexible, on-device zero-shot NER
  • Structured extraction from various document types
  • Real-time zero-shot classification pipelines

By bringing GLiNER2 to Swift, we bridge modern NLP research and native Apple platform development. This work demonstrates that transformer-based schema-driven extraction can run efficiently on-device, without GPU dependency or server infrastructure.

GLiNER2Swift represents a step toward privacy-preserving, local-first AI tooling for macOS -- bringing unified information extraction directly into Swift applications while staying faithful to the original GLiNER research vision.
