Scalable mono repo data engineering workloads #44

dantaylrr · 2024-10-25T15:18:19Z

This is an in-depth example of a scalable, modularised repo structure for all Data Engineering workloads using Databricks Asset Bundles (DABs). It supports the following:

All types of artefacts (python scripts, notebooks, DLT pipelines & libraries).
Jobs (both serverless & classic).
Environment / dependency overrides for serverless jobs.
Pipelines.

The aim of this example is not to define a de facto approach for DABs projects, but to give companies / individuals an example of how they might structure all of their Databricks application code in a single mono-repo.

All additional fixtures such as Makefiles, workflows, pre-commit hooks, etc. are entirely optional & can be removed if needed.

All in-line comments & README instructions can be altered, so please let me know if anything has been missed, is too specific or doesn't make sense.

chander-pal · 2024-11-12T16:16:08Z

knowledge_base/scalable_mono_repo_de/databricks.yml

@@ -0,0 +1,79 @@
+# This is a Databricks asset bundle definition for my_project.
+# See https://docs.databricks.com/dev-tools/bundles/index.html for documentation.
+bundle:


Consider adding databricks_cli_version to the bundle mapping. New features are added to the CLI frequently, so declaring the version the bundle expects can help mitigate CLI-version-related issues.
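
A minimal sketch of what that could look like, assuming the bundle is named my_project as in the header comment above. The version constraint is a placeholder chosen for illustration, not a value taken from this PR:

```yaml
# Sketch only: the name and the version constraint are illustrative assumptions.
bundle:
  name: my_project
  # The CLI refuses to run the bundle if the installed version
  # does not satisfy this constraint.
  databricks_cli_version: ">= 0.218.0"
```

An exact version can be given instead of a range if the team wants every contributor and CI runner on the same CLI release.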

dantaylrr added 2 commits October 25, 2024 15:52

Adding DE mono-repo example

bb78f9a

Refactoring README

59b7c09

chander-pal reviewed Nov 12, 2024
