LeXtract - a LangExtract alternative for Elixir

Lately I’ve been working a lot with NLP, and one of the most useful libraries I’ve found is Google’s LangExtract, which provides a pretty neat approach to extracting structured data from text. Of course, it’s in Python, so I decided to port it, staying as close to LangExtract as possible. It still needs some work, but thanks to ReqLLM (see the thread “ReqLLM - Composable LLM client built on Req”, post #31 by mikehostetler), most of the work was already done haha. So, here it is:

LLM-powered text extraction library for Elixir, based on Google’s LangExtract.

LeXtract enables you to extract structured information from unstructured text using Large Language Models (LLMs). It provides a simple, streaming API with support for multiple LLM providers.

Features

  • Multi-Provider LLM Support - Works with OpenAI, Gemini, Anthropic, and other providers through ReqLLM
  • Streaming API - Memory-efficient batch processing with lazy streams
  • Automatic Text Chunking - Handles long documents with configurable chunk sizes and overlap
  • Character-Level Alignment - Precise alignment of extractions to source text positions
  • Schema Generation - Automatic schema inference from examples
  • Template-Based Configuration - Reusable extraction templates in JSON or YAML
  • Structured Output Mode - Enhanced reliability with schema validation
  • Multi-Pass Extraction - Improved recall through multiple extraction passes
  • Flexible Output Formats - Support for JSON and YAML output formats
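
To give a feel for what chunking with overlap and character-level alignment involve, here is a minimal sketch in plain Elixir. It is not LeXtract's API: the `ChunkSketch` module and its `chunk/3` function are hypothetical names made up for illustration, and the real library may chunk differently (for example, on sentence or token boundaries). The idea is just that each chunk carries its offset into the source, so a match found inside a chunk can be mapped back to a position in the original text.

```elixir
defmodule ChunkSketch do
  @moduledoc """
  Hypothetical helper (not part of LeXtract): splits text into overlapping
  chunks and records where each chunk starts in the source text.
  """

  # Returns a list of {offset, chunk} tuples. Offsets are counted in
  # graphemes from the start of `text`. Consecutive chunks share
  # `overlap` graphemes.
  def chunk(text, size, overlap)
      when is_binary(text) and size > 0 and overlap >= 0 and overlap < size do
    do_chunk(String.graphemes(text), 0, size, size - overlap, [])
  end

  # Nothing left to chunk (only reached for empty input).
  defp do_chunk([], _offset, _size, _step, acc), do: Enum.reverse(acc)

  defp do_chunk(graphemes, offset, size, step, acc) do
    {window, rest} = Enum.split(graphemes, size)
    acc = [{offset, Enum.join(window)} | acc]

    if rest == [] do
      # The window reached the end of the text, so this is the last chunk.
      Enum.reverse(acc)
    else
      # Slide the window forward by `step` graphemes.
      do_chunk(Enum.drop(graphemes, step), offset + step, size, step, acc)
    end
  end
end
```

With this sketch, `ChunkSketch.chunk("abcdefgh", 4, 2)` returns `[{0, "abcd"}, {2, "cdef"}, {4, "efgh"}]`. Because the offsets are grapheme-based, a match found at position `p` inside a chunk sits at `offset + p` in the source, and `String.slice(text, offset, String.length(chunk))` gives back the chunk itself.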