Viewing Source Data¶

Every source has a hub table behind the scenes in Databricks that represents all of the input data brought into the source.

Querying Source Data in Hub Tables¶

All source data is visible via hub tables stored in the hive_metastore catalog in Databricks.

To query a source, open a Databricks SQL query or notebook and use: dataforge.hub_ or the source view name from Source settings. See Databricks docs: SQL Editor or Notebooks.

The Source ID appears in the site URL when a Source is open, or in the list on the main Sources page. For view names, use the project schema from project settings.

Examples:

select * from dataforge.hub_1

select * from project_schema.source_view_name

Data View Tab¶

The Data View tab is only available for Private Enterprise customers.

The Data View tab shows a Source's data. Standard customers clicking the tab are redirected to the hub table in Databricks. Users can filter and sort data to validate enrichment and validation rule execution.

The Data Viewer Screen

Column headers are color coded to indicate Raw vs. Enriched vs. System data.

Blue Header: Raw data from the Source
Green Header: Enriched data generated by Enrichment Rules
Gray Header: System data tracking data lineage and results of Validation Rule execution

Rows are color coded to indicate a record's Validation flag.

White Row: Records that pass Validation
Yellow Row: Records warned during Validation
Red Row: Records failed during Validation

Filtering Results¶

Filters and column selections are retained per browser and source until cleared.

Typing IQL¶

Type DataForge QL in the top bar to filter data, using [This].field as in enrichment rules. The ` key accesses common Spark functions.

Selecting Columns¶

Click Select Columns to choose visible columns and see data types. Use the text box to filter the column list. If Retain Selection is off, only columns matching the current filter are shown.

Sorting Data¶

Clicking a column's name sorts the data in ascending order. Clicking again switches to descending order.

Column Name

Downloading Data¶

The Download button exports the filtered/sorted data as a .csv.