
Viewing Source Data

Every source is backed by a hub table in Databricks that contains all of the input data brought into the source.


Querying Source Data in Hub Tables

All source data is visible via hub tables stored in the hive_metastore catalog in Databricks.

To query a source, open a Databricks SQL query or notebook and select from dataforge.hub_<source_id>, or from the source view name shown in Source settings. See the Databricks documentation on the SQL Editor or Notebooks.

The Source ID appears in the site URL when a Source is open, and in the list on the main Sources page. When querying by view name, qualify the view with the project schema listed in project settings.

Examples:

select * from dataforge.hub_1
select * from project_schema.source_view_name
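
For use in a notebook, the hub table naming pattern above can be sketched as a small Python helper. This is only an illustration: the function names hub_table_name and hub_query, the integer check on the Source ID, and the default row limit are assumptions made for this example and are not part of DataForge.

```python
def hub_table_name(source_id: int, schema: str = "dataforge") -> str:
    """Build the hub table name for a Source ID, e.g. dataforge.hub_1.

    Hypothetical helper: it only mirrors the naming pattern shown in the
    examples above.
    """
    # Assumption: Source IDs are plain integers, as in the hub_1 example.
    # Rejecting anything else keeps arbitrary text out of the query string.
    if isinstance(source_id, bool) or not isinstance(source_id, int):
        raise ValueError("source_id must be an integer")
    return f"{schema}.hub_{source_id}"


def hub_query(source_id: int, limit: int = 100) -> str:
    """Return a SELECT statement against the hub table, capped at `limit` rows."""
    return f"SELECT * FROM {hub_table_name(source_id)} LIMIT {int(limit)}"


# In a Databricks notebook the resulting string could be passed to
# spark.sql(...); here it is only printed.
print(hub_query(1, limit=10))
```

The row limit is there only to keep exploratory queries small; drop it to return the full hub table.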


Data View Tab

The Data View tab is only available for Private Enterprise customers.

The Data View tab shows a Source's data, which users can filter and sort to check the results of enrichment and validation rules. Standard customers who click the tab are redirected to the hub table in Databricks.

The Data Viewer Screen

Column headers are color coded to indicate Raw vs. Enriched vs. System data.

  • Blue Header: Raw data from the Source
  • Green Header: Enriched data generated by Enrichment Rules
  • Gray Header: System data tracking data lineage and results of Validation Rule execution

Rows are color coded to indicate a record's Validation flag.

  • White Row: Records that pass Validation
  • Yellow Row: Records that raise a warning during Validation
  • Red Row: Records that fail Validation

Filtering Results

Filters and column selections are retained per browser and source until cleared.

Typing DataForge QL

Type a DataForge QL expression in the top bar to filter the data, referencing fields as [This].field, the same way as in enrichment rules. The ` (backtick) key gives access to common Spark functions.

Selecting Columns

Click Select Columns to choose visible columns and see data types. Use the text box to filter the column list. If Retain Selection is off, only columns matching the current filter are shown.

Sorting Data

Clicking a column's name sorts the data in ascending order. Clicking again switches to descending order.


Downloading Data

The Download button exports the data, with the current filters and sort order applied, as a .csv file.