Sources
A Source represents a single schema of data — typically backed by a database table, file, or stream. Sources are the foundation of every pipeline.
Start here
- Sources Overview — the Source model and the Sources screen
Configuration
- Source Settings — configure a single source
- Raw Schema — automatic schema tracking
- Relations — joins to other sources or master data
- Rules — validation and enrichment rules
Operations
- Inputs — what an Input is and how it relates to a Source
- Process — what happens when a Source processes
- Dependencies — how Sources depend on each other
- Viewing Source Data — query the hub table
Advanced
- Complex Data Types — JSON, arrays, nested fields
- Sub-Sources — multiple input streams under one Source
- Unmanaged External Source — reference data that DataForge does not manage
- Custom Refresh — override DataForge's default refresh behavior