Build

Design, configure, and operate data pipelines in DataForge.

Pipeline building blocks

  • Sources — define inputs and parsing rules
  • Connections — Salesforce, JDBC, Kafka, Unity Catalog, and more
  • Processing — ingestion queue, job runs, processing queue, workflow queue
  • Outputs — write to Snowflake, Databricks, and other destinations

Configuration

  • Reusable patterns

Scheduling and lineage

  • Schedules — when ingestion runs
  • Lineage — visualize how data flows
  • Projects — organize related sources and outputs

Advanced

  • SDK — custom processing in Python, Scala, or notebooks
  • Talos AI — AI-powered data assistant
  • Agents — remote agents for on-prem ingestion
  • Users and Access — user and role administration