AI Lab Tracker
Dataverse
library
2024-03-28
Upstage
Open-source ETL (extract, transform, load) pipeline for LLM data processing, built around a block-based interface. It supports ingestion from multiple sources, distributed processing on Spark, and privacy-aware filtering. Accepted to the NAACL 2025 demo track.
Paper
Venue: NAACL 2025
Links: arXiv, HTML
Library
Links: GitHub repository
Tags: data, open-source, research