Lately I’ve been working a lot with NLP, and one of the most useful libraries I’ve found is Google’s LangExtract, which provides a pretty neat approach to extracting structured data from text. Of course, it’s in Python. So I decided to port it to Elixir, staying as close to LangExtract as possible. It still needs some work, but thanks to ReqLLM (the composable LLM client built on Req, by mikehostetler), most of the work was already done haha. So, here it is:
LLM-powered text extraction library for Elixir, based on Google’s LangExtract.
LeXtract enables you to extract structured information from unstructured text using Large Language Models (LLMs). It provides a simple, streaming API with support for multiple LLM providers.
Features
- Multi-Provider LLM Support - Works with OpenAI, Gemini, Anthropic, and other providers through ReqLLM
- Streaming API - Memory-efficient batch processing with lazy streams
- Automatic Text Chunking - Handles long documents with configurable chunk sizes and overlap
- Character-Level Alignment - Precise alignment of extractions to source text positions
- Schema Generation - Automatic schema inference from examples
- Template-Based Configuration - Reusable extraction templates in JSON or YAML
- Structured Output Mode - Enhanced reliability with schema validation
- Multi-Pass Extraction - Improved recall through multiple extraction passes
- Flexible Output Formats - Support for JSON and YAML output formats
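
To give a rough idea of what overlapping chunks and character-level alignment mean in practice, here is a minimal sketch in plain Elixir. It is not LeXtract's actual API: the `ChunkSketch` module and its `chunks/3` and `align/2` functions are hypothetical names made up for this illustration, offsets are counted in graphemes, and alignment is a simple exact substring match (a real implementation would likely need something more forgiving).

```elixir
defmodule ChunkSketch do
  @moduledoc "Hypothetical illustration only; not part of LeXtract."

  # Lazily split `text` into chunks of at most `size` graphemes.
  # Consecutive chunks share `overlap` graphemes, and every chunk
  # remembers the offset of its first grapheme in the original text.
  def chunks(text, size, overlap) when overlap >= 0 and size > overlap do
    total = String.length(text)
    step = size - overlap

    Stream.unfold(0, fn
      :done ->
        nil

      offset when offset >= total ->
        nil

      offset ->
        chunk = %{offset: offset, text: String.slice(text, offset, size)}
        # Stop once this chunk has reached the end of the text.
        next = if offset + size >= total, do: :done, else: offset + step
        {chunk, next}
    end)
  end

  # Map an extraction found inside a chunk back to positions in the
  # original text. Returns {:ok, start, stop} with `stop` exclusive,
  # or :error when the extraction does not occur verbatim in the chunk.
  def align(%{offset: offset, text: chunk_text}, extraction) when extraction != "" do
    case String.split(chunk_text, extraction, parts: 2) do
      [before, _rest] ->
        start = offset + String.length(before)
        {:ok, start, start + String.length(extraction)}

      [_no_match] ->
        :error
    end
  end
end

text = "Ada Lovelace wrote the first program."

# Nothing is sliced until the stream is consumed.
chunks = text |> ChunkSketch.chunks(20, 5) |> Enum.to_list()

# The second chunk starts at offset 15, so "first" maps back to 23..28.
ChunkSketch.align(Enum.at(chunks, 1), "first")
# → {:ok, 23, 28}
```

Because `chunks/3` returns a lazy stream, a long document can be processed chunk by chunk without building every chunk up front, and the stored offset is what lets a match inside a chunk be reported as a position in the full source text.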




















