From Database to Domain: Elevating Software Development with DDD

Introduction

In the complex landscape of software development, aligning design methodologies with business needs is crucial. Domain-Driven Design (DDD) emerges as a key approach in addressing this alignment, especially in projects characterized by intricate business rules and processes. This methodology stands in contrast to traditional practices, such as embedding business logic within databases, offering a more adaptable and business-focused perspective.

Section 1: Understanding Domain-Driven Design

Definition and Focus

DDD centers on developing software that closely reflects the business model it serves. It emphasizes a deep understanding of the business domain, ensuring that the development process is driven by this knowledge and fostering a common language between developers and business stakeholders.

History and Evolution

Pioneered by Eric Evans in his 2003 book Domain-Driven Design: Tackling Complexity in the Heart of Software, DDD has grown from a set of principles into a comprehensive approach, widely recognized for its ability to tackle complex business challenges through software.

Aligning Design with Business Needs

The essence of DDD lies in its focus on business-relevant software development, a principle that aligns closely with the need for software to be adaptable and directly linked to business objectives.

Section 2: Core Concepts of Domain-Driven Design

In DDD, concepts like Entities, Value Objects, Aggregates, Domain Events, Repositories, and Bounded Contexts form the foundation of a robust domain model.

Entities and Value Objects: Entities are defined by their identity, playing a crucial role in maintaining business continuity, while Value Objects add depth and integrity to the domain model.
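
To make the distinction concrete, here is a minimal Python sketch; the Money and Customer classes are hypothetical illustrations, not taken from any particular codebase. A value object is immutable and interchangeable with any equal-valued instance, while an entity is tracked by its identity:

from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Money:
    """A value object: immutable and compared by its values, with no identity."""
    amount: Decimal
    currency: str

    def add(self, other: "Money") -> "Money":
        if self.currency != other.currency:
            raise ValueError("Cannot add different currencies")
        return Money(self.amount + other.amount, self.currency)


class Customer:
    """An entity: defined by its identity, not its attribute values."""

    def __init__(self, customer_id: int, name: str):
        self.customer_id = customer_id
        self.name = name  # attributes may change; the identity never does

    def __eq__(self, other):
        return isinstance(other, Customer) and self.customer_id == other.customer_id

    def __hash__(self):
        return hash(self.customer_id)

Two equal Money values are fully interchangeable, while two Customer objects are "the same customer" only if their IDs match, even after a name change.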

Domain Model vs Database-Level Logic

The decision to embed business logic in the domain model rather than in the database is pivotal. Traditional database-centric approaches can lead to scalability challenges and obscure the business logic from the development team. A domain-centric approach, as proposed by DDD, enhances clarity, flexibility, and testability, a significant shift from database-heavy methodologies.
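
To illustrate the testability point, here is a hedged sketch of a hypothetical discount rule (the Order class and the ten-percent figure are invented for illustration). Expressed in the domain model as plain code, the rule can be unit-tested with no database at all; the same rule buried in a stored procedure or trigger would need a live database to exercise:

from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Order:
    subtotal: Decimal
    is_loyalty_member: bool

    def total(self) -> Decimal:
        """The business rule lives here, not in a trigger or stored procedure."""
        discount = Decimal("0.10") if self.is_loyalty_member else Decimal("0")
        return self.subtotal * (1 - discount)


def test_loyalty_member_gets_ten_percent_off():
    # Pure in-memory test: no database, no fixtures, sub-millisecond
    order = Order(subtotal=Decimal("100"), is_loyalty_member=True)
    assert order.total() == Decimal("90")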

Impact of Database Logic vs. Domain Model Flexibility

A critical aspect to consider in software architecture is how changes in business logic affect different layers of the application, particularly the database and the user interface (UI). Traditional approaches that embed business logic in the database often lead to a rigid structure where changes in the database logic can cascade up, impacting the UI layer significantly. This rigidity can result in a cumbersome and time-consuming process for implementing changes, especially when business requirements evolve frequently.

In contrast, Domain-Driven Design offers a more flexible approach. By encapsulating business logic within the domain model rather than the database, DDD allows for independent management of how data is formatted and handled at different levels of the application. This separation of concerns means that:

  • Changes at the Database Level: Alterations in the database schema or logic can be absorbed by the domain model without necessarily impacting the UI. The domain model acts as a buffer, allowing for adaptations in the data representation without requiring changes in the user interface.

  • UI Flexibility: The UI can evolve independently of the database structure. The domain model can format and present data in ways that are most suitable for user interaction, irrespective of how that data is stored or processed in the backend.

  • Bi-directional Adaptability: The domain model offers flexibility in both directions – it can adapt to changes in the database while also accommodating different requirements or formats needed by the UI. This adaptability is key in modern applications where user experience is paramount and business requirements are ever-changing; the sketch after this list illustrates the buffering role.
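
As a rough sketch of that buffering role (all names hypothetical, with a plain dict standing in for the database), the repository absorbs schema details on one side while the domain object owns UI-facing formatting on the other, so either side can change without touching the other:

from dataclasses import dataclass
from datetime import date


@dataclass
class Invoice:
    """Domain object: knows business meaning, not storage layout or pixels."""
    number: str
    issued_on: date
    total_cents: int

    def display_total(self) -> str:
        # UI-facing formatting is decided here, however the data is stored
        return f"${self.total_cents / 100:,.2f}"


class InvoiceRepository:
    """Maps storage rows to domain objects; schema changes stop here."""

    def __init__(self, rows_by_id: dict):
        self._rows = rows_by_id  # stand-in for a real database

    def get(self, invoice_id: int) -> Invoice:
        row = self._rows[invoice_id]
        # If the schema renamed "amt" or split the date column, only this
        # mapping would change - the UI and the domain object stay untouched.
        return Invoice(
            number=row["num"],
            issued_on=date.fromisoformat(row["issued"]),
            total_cents=row["amt"],
        )

Given a row such as {"num": "INV-1", "issued": "2023-11-24", "amt": 15000}, display_total() renders "$150.00"; renaming the amt column or changing the UI's currency format would each be a one-place change.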

By adopting a domain-centric approach as advocated by DDD, applications become more resilient to change and more aligned with agile development practices. This flexibility is a significant advantage in today’s fast-paced and user-centric software development environment.

As we have seen, the core concepts of DDD – Entities, Value Objects, Aggregates, Domain Events, Repositories, and Bounded Contexts – are essential in crafting a domain model that is both robust and flexible. This domain-centric approach, emphasizing clarity and adaptability, marks a significant shift from traditional database-heavy methodologies. The impact of this shift is profound, not only in the architecture of the software but also in the way it aligns with and supports business objectives. Now, let’s explore how these theoretical concepts translate into real-world benefits, further underlining the value of DDD in modern software development.

Section 3: Benefits of Implementing DDD

The implementation of Domain-Driven Design goes beyond just shaping a technical framework; it brings several key advantages that enhance both the development process and the final software product. These benefits, stemming directly from the principles and concepts discussed earlier, include enhanced communication across teams, improved software quality, and greater scalability and maintainability. In this section, we will delve into each of these benefits in more detail, showcasing how the foundational principles of DDD contribute to effective and efficient software development.

  • Enhanced Communication: The adoption of a ubiquitous language and a shared understanding of the domain model bridges the gap between technical teams and business stakeholders. This common ground improves collaboration and ensures that business requirements are accurately translated into technical solutions. In practice, this leads to fewer misunderstandings, more efficient development cycles, and solutions that better meet business needs.

  • Improved Software Quality: By focusing deeply on the domain, developers create solutions that are more aligned with the actual business problems they are meant to solve. This alignment results in higher-quality software that is not only functional but also robust and resilient to changing business requirements. Furthermore, the modular nature of DDD allows for more targeted testing and quality assurance, leading to more reliable and maintainable code.

  • Scalability and Maintainability: DDD's emphasis on a well-structured domain model facilitates the creation of software architectures that are easier to scale and maintain over time. This is especially beneficial in complex systems where changes are frequent and scalability is a concern. The clear separation of concerns and bounded contexts within DDD makes it easier to isolate and address specific areas of the system without impacting the whole, thereby enhancing maintainability.

Section 4: Real-World Applications of DDD

DDD has been successfully applied across various industries, from finance to e-commerce, demonstrating its versatility in aligning software solutions with complex business needs.

Rebuilding guardian.co.uk with DDD

The Guardian's rebuild of guardian.co.uk is a frequently cited DDD case study. In another example, the Custom House development team applied DDD and made extensive use of value objects in its domain model.

Conclusion

Domain-Driven Design represents a sophisticated methodology for aligning software development with business complexities. It offers a stark contrast to traditional database-centric approaches, advocating for a more agile and business-focused development process. DDD not only addresses technical requirements but also closely aligns with business objectives, making it an essential approach in modern software development.

Are you intrigued by the possibilities of Domain-Driven Design? If you're looking to delve deeper into DDD or considering a shift from traditional database-centric models to a more domain-focused approach, we invite you to join the conversation. Embrace the complexities of modern software development by exploring DDD, a methodology that can unlock unprecedented levels of adaptability and alignment with business goals in your projects. Dive into the world of DDD to discover how it can transform your approach to software development and help you navigate the intricate landscape of business needs and technical challenges. Stay informed, stay ahead, and let DDD guide your journey in the evolving world of software innovation.

Additional Resources

What is User Acceptance Testing?

Once upon a time, a clinical laboratory realised that if, instead of having to find the paper documents related to each test and write down the results by hand, they could have a smart monitor, they would save a lot of time.

The practitioner would only tap on the test they were working on and quickly update its status with a second tap. They found an engineering firm, explained what they needed, and the engineering work began.

A dedicated team of engineers started working on the product. There were so many parts: the software, the touch screen, the computer-like device running the software, the height-adjustable stand, etc.

After several months of hard work, the product was ready. Not only had they tested every single part on its own, but they had also tested the larger assemblies after the parts came together. They gave the lab a demonstration, showing off the software and how the device could be placed in any room thanks to its simple design of a single height-adjustable monitor.

The order was delivered and everyone was satisfied - until they started using it. The lab came back to the engineering team complaining that the monitor was useless because it only detected the touch of a bare hand.

It was a disaster. The engineering team had no idea that practitioners are required to wear medical gloves at all times in the lab. Every part of the monitor had been tested; the only issue was that it had been tested by the wrong people and in the wrong environment.

The idea is clear. In the process of testing our products, it is important to give the actual user a chance to test them too because, well, they are the user of the product. If they are not satisfied with it, there is no point in producing it anyway.

It might seem like this does not apply when the product in question is software, but the reality is that it applies just as much, if not more. This part of the testing process is called User Acceptance Testing, or UAT for short.

Most of the other types of testing you might be familiar with happen in the process of development or deployment. We write unit tests to make sure each individual function of the code does what we expect it to do, similar to testing if the power button on the monitor in that clinical lab turns it on. The very last test that happens right before the product is pushed out is UAT.
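
For instance, a unit test in that "power button" spirit might look like this tiny sketch (the function and its state dict are made up for illustration):

def power_on(device_state: dict) -> dict:
    """The unit under test: turn the device on."""
    return {**device_state, "on": True}


def test_power_button_turns_monitor_on():
    # Checks one function in isolation, like pressing the power
    # button on the lab monitor to see that it turns on
    assert power_on({"on": False})["on"] is True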

Let me interest you in another anecdote, this time about manufacturing. In the early 1800s, wristwatches were becoming popular and new manufacturers were emerging everywhere. However, a Swiss company that had managed to be one of the first to export watches to foreign countries was having a frustrating issue.

They would export the watches all around the world, and a lot of them would be returned with reports that the watch was not working. They would test the returned watches to see what the issue was, only to find that there was no issue and the watch worked perfectly fine.

They would send the watches back again, as all tests showed the watches were fine, but the same thing would happen: most customers found that the watches quickly fell out of sync, so they returned them.

After thorough investigation, they finally found the issue. The mechanical pieces inside the watch were designed to move at a speed that would represent a second… in Switzerland! Countries at significantly different altitudes have different air pressure, which meant the same mechanism would not behave the same way; the hands would move either faster or slower, causing the time to drift out of sync.

There are many lessons to take from that story, but the one we are interested in is this: not only should UAT be performed by the end user, it should also happen in the exact environment the product is going to be used in.

Let us put it all together in the context of software development. The product is developed and all unit tests, integration tests, etc. have passed. The final step of testing is UAT. We said UAT should be performed by the end user, but how is that possible in software?

It really depends on who your user is. Some companies have a group of test users who are willing to test new products. Sometimes your customers are other businesses using your APIs and tools, so if their automated tests against your new version pass, that can act as UAT for you. And in some cases, you will have a Product Owner who represents your end user. If your team cannot engage directly with the customer, then the product owner is your customer: they define the requirements, and they also test those requirements at the last stage.

The second thing we talked about was testing in the same environment in which the product will be used. This can be of special importance if your “production” environment is different from what the developers have on their local computers.

A lot of companies that run an online service have something called a UAT environment, which is exactly this: an environment with the same data and databases, the same networking and infrastructure, basically the same everything. The only thing that differs is the version of the product or service under test. That way, you can ensure the product is tested in the same environment it is going to be used in, and any performance issues or bugs will be discovered. The only difference between a UAT environment and a production one is who uses them.
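
One hypothetical way to express "same everything, different audience" in a Django project is a settings overlay; the module names below are invented for illustration, not a prescription:

# settings/uat.py - hypothetical overlay module
from .production import *  # same databases, caches, networking: everything

# The only intended difference is who can reach the environment
ALLOWED_HOSTS = ["uat.internal.example.com"]  # testers only, not the public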

Fun fact: this article was also UATed. I sent it to a couple of other engineers (the type of end user I am expecting here) so they could read it on their computers (the environment I am expecting you to be in) and make sure they understood it! They did, so I hope you too now have an idea of what User Acceptance Testing is and why it is incorporated into the delivery process of engineering teams.

Of course, just like any other type of testing, or any process really, there is a lot more to know about good practices and frameworks, dos and don’ts, etc., and you can find heaps of good articles about them on the internet!

Did you find this article interesting? Do you think others can enjoy it too? Feel free to share it with your friends!

How We Took A Table Offline While Maintaining Uptime

Recently a bug found its way into production which caused excessive updates to one of our critical Postgres tables. While we identified and squashed the bug quickly, the damage was already done: dead tuples amounted to about 160% of the live tuples. Past a critical threshold of dead tuples, Postgres' query planner will start to ignore indices because it thinks there are far more rows than there actually are. In this degraded state, queries that would normally complete in sub-second time were now taking multiple seconds, even minutes, to complete.
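
For context, the dead-tuple ratio (and the vacuum progress mentioned below) come straight from Postgres's statistics views. A quick check from a Django shell might look like this sketch; pg_stat_user_tables and pg_stat_progress_vacuum are standard Postgres views, while the table name is a placeholder:

from django.db import connection

with connection.cursor() as cursor:
    # Live vs dead tuples for the affected table
    cursor.execute(
        "SELECT n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE relname = %s",
        ["affected_table"],
    )
    live, dead = cursor.fetchone()
    print(f"dead tuples are {dead / live:.0%} of live tuples")

    # Progress of any currently running (non-FULL) vacuum
    cursor.execute(
        "SELECT relid::regclass, heap_blks_scanned, heap_blks_total "
        "FROM pg_stat_progress_vacuum"
    )
    for table, scanned, total in cursor.fetchall():
        print(f"{table}: {scanned}/{total} heap blocks scanned")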

Usually dead tuples aren’t much of an issue, as they’re vacuumed away regularly by the autovacuum process. Autovacuum works well because it only acquires a SHARE UPDATE EXCLUSIVE lock, meaning selects, updates, and inserts can still occur. But we calculated from the currently running vacuum’s progress that autovacuum was going to take more than 30 days to complete, and with Black Friday sales around the corner we needed another solution. A full vacuum was calculated to take roughly an hour and would reclaim a lot more space; however, it would acquire an ACCESS EXCLUSIVE lock, blocking all access to the table for the entire duration: not possible without taking the entire site down!

Or is it? If we could get a full vacuum to work, the database would recover in less than an hour!

We analysed the usage of this table and determined that, so long as there were no writes, removing all relations (and replacing them with null / no rows) would be acceptable in the short term, as the values were already denormalised into our Elasticsearch database. Could we do this with minimal code changes?

Our Django application uses select_related/prefetch_related heavily, so we needed to create an empty table with the same signature so that these queries would keep working without exceptions. Here’s what we came up with:

BEGIN;
-- we want to be able to rename fast later, so lock the table now
LOCK TABLE affected_table IN ACCESS EXCLUSIVE MODE;

-- create a table with the same columns as affected_table.
-- Foreign keys and indices are not required as
-- there’ll (theoretically) be no data inserted
CREATE TABLE "affected_table_empty"
(
    "id"              serial                   NOT NULL PRIMARY KEY,
    "date"            timestamp with time zone NOT NULL,
    "enabled"         boolean                  NOT NULL,
    "data"            varchar(128)             NOT NULL,
    "product_id"      integer                  NOT NULL,
    "mode_id"         integer                  NOT NULL
);
-- swap the tables around. Table renaming is a fast operation
ALTER TABLE "affected_table" RENAME TO "affected_table_broken";
ALTER TABLE "affected_table_empty" RENAME TO "affected_table";
COMMIT;

We had some foreign keys linking to this table, so each of those constraints had to be dropped:

-- Another fast operation
ALTER TABLE "table_1" DROP CONSTRAINT "tbl1_fk_affected_table_id";

At this point we’ll have:

  • No references to the affected table (now renamed to broken) in the database
  • An empty table for our code to look at

Testing this out revealed a problem. If a row's foreign key column had a value set, Django would crash with DoesNotExist when the relation was accessed, because it assumes the referenced row exists.

>>> obj = Table1.objects.get(pk=1)
>>> obj.affected_table_id
<<< 156
>>> obj.affected_table
<<<  DoesNotExist: AffectedTable matching query does not exist

As mentioned earlier, we’re heavy users of select_related so we couldn’t just mock the columns to be None. Here’s what we came up with to work around it:

import waffle
from django.db import models
from django.db.models.fields.related_descriptors import ForwardOneToOneDescriptor


class DisablingForwardOneToOneDescriptor(ForwardOneToOneDescriptor):
    def __get__(self, instance, cls=None):
        # While the switch is active, pretend the related object doesn't exist
        if waffle.switch_is_active(f"disable_relation_{self.field}"):
            return None
        return super().__get__(instance, cls=cls)


class DisablingOneToOneField(models.OneToOneField):
    # Swap in our descriptor so attribute access goes through the switch check
    forward_related_accessor_class = DisablingForwardOneToOneDescriptor

We use django-waffle to manage feature flags, which makes it easy to turn relations off through this new class.

The above class could be used in place of a OneToOne relation (but could easily be extended to a normal ForeignKey) as follows:

class Table(models.Model):
    # on_delete is required in modern Django; SET_NULL pairs with null=True
    affected_table = DisablingOneToOneField("AffectedTable", null=True, blank=True, on_delete=models.SET_NULL)

Now when we try again it will just return None:

>>> obj = Table1.objects.get(pk=1)
>>> obj.affected_table_id
<<< 156
>>> obj.affected_table
<<< None

Calling obj.save() is also safe; it will preserve the existing ID.

After some rigorous testing we were confident and ready to roll in production.

The steps that needed to occur were:

  1. Scale down Celery workers to prevent any writes to the empty table
  2. Disable the relation via the feature switch
  3. Create an empty version of the table and swap it with the affected table
  4. Drop all foreign key constraints to the affected table
  5. Vacuum the affected table
  6. Swap the affected table back with the empty table
  7. Insert any data that managed to find its way into the empty table into the affected table
  8. Add all the foreign key constraints back in
  9. Re-enable the relation via the feature switch
  10. Scale workers back up

We executed the plan and observed that the system showed no signs of distress, so we were ready to perform the vacuum.

VACUUM FULL affected_table_broken;

The vacuum completed successfully with no queries blocked!

Now to reverse the procedure to get the data back:

BEGIN;
LOCK TABLE affected_table_broken, affected_table IN ACCESS EXCLUSIVE MODE;
ALTER TABLE "affected_table"
    RENAME TO "affected_table_empty";
ALTER TABLE "affected_table_broken"
    RENAME TO "affected_table";

-- insert any rows that managed to sneak through
INSERT INTO affected_table (date, enabled, data, product_id,
                                         mode_id)
    (SELECT date,
            enabled,
            data,
            product_id,
            mode_id
     FROM "affected_table_empty");
COMMIT;

We’ll also need to add the foreign key constraints back in, but in a non-blocking way. The trick is to create each constraint as NOT VALID, meaning existing rows won’t be checked when it’s added. The constraint can later be validated to check the existing rows, and validation takes a less intrusive lock (SHARE UPDATE EXCLUSIVE), so regular activity can continue.

-- put the constraint back on
-- note that it is initially deferred and not valid (old rows won't be checked, new inserts and updates will be). This means it will run fast!
ALTER TABLE "table_1"
    ADD CONSTRAINT "tbl1_fk_affected_table_id" FOREIGN KEY (affected_table_id) REFERENCES affected_table (id) DEFERRABLE INITIALLY DEFERRED NOT VALID;
ALTER TABLE "table_1" VALIDATE CONSTRAINT "tbl1_fk_affected_table_id";

After disabling the feature switch and scaling our Celery workers back up our site was back to normal.

We were fortunate enough to be able to substitute null values for the affected table and rely on other databases. If this was not possible we’d have no choice but to bring the entire site down while the vacuum was running. Given we’re in the Christmas period and with Black Friday coming up, everyone was relieved that we could keep our uptime high!

How to Transform your Team's Communication with Stakeholders

As tech people, we quickly flock to digital tools to manage our workflow and processes. Although an Agile digital task tracker is a great start, at Kogan.com our physical wall has had a huge impact in communicating work in progress and priority to our stakeholders. Find out why, and how you can transform your own stakeholder communication.

eClaire - printing our Trello cards for our physical wall

There is nothing that creates visibility, collaboration and flexibility of process in a team like a physical wall.

It allows you to show what is currently being worked on and who owns it, where the bottlenecks are in your process, how much and what is in the backlog, and the list goes on.

It provides an awesome mechanism for triggering and facilitating conversations, and a visual cue that often speaks a thousand words.

But as much as I advocate for physical walls to manage a team, they are not without their limitations.

They lack the detail and the reporting aspect that you really need an online tool to fulfill - and that's why we use both, with eClaire printing our Trello cards for the physical wall.

Continuously Improving our Process - Retrospectives

Like many agile teams, we regularly run retrospectives to gauge how we are going as a team and to think about what to improve and how.

We have one every two weeks; they are time-boxed to one hour and held standing up (as we do for nearly all of our team meetings).

We have the typical ‘happy’ column, a ‘sad’ column and a ‘puzzling’ column. Everyone brainstorms on post-its, we group them together, and we vote on what we would like to discuss.

The team is prompted by standing next to our wall and running through some of the achievements and big events of the past two weeks.

I ask probing questions such as:

  • What has slowed down your progress?
  • What has enabled you?
  • From when you pick up a story to when you deploy it to production, what obstacles do you have?
  • What areas have been inefficient and where is there unnecessary work or rework?

There is one key part that distinguishes a productive retrospective from a time waster - the actions and improvements that are an output of the retro.

One thing that is tempting is to discuss the particular details of an issue that has just occurred - which can often turn into a bit of a whinge and sometimes even a blame game (especially if the people involved are not present at the retro). Then a tactical action is thought of to address the latest symptom and the team moves on to the next highest voted item.

It is far more valuable for the team to discuss the patterns and root cause of the issue.

To do this, the team should discuss what process was followed and give other examples of that process in play. This removes the emotion and raises the discussion to a higher level, about repeatable, systemic issues.

Once we understand the root cause we explore what is within our control to improve. That way, when we begin to think of actions to improve the situation we are thinking of changes/tweaks to our process rather than a quick fix or bandaid.

Then, once we have an action to improve our process, we run through a few hypothetical situations to ensure we have a shared understanding of what our new improvement looks like. We’ll often look at a few tasks in our backlog to give us some examples and we run through how things will play out with our new improvement.

At the next retrospective, the first thing we discuss is our actions from the last retro:

  • Has the action been implemented? If not, why not?
  • Is the issue still occurring? If yes, why?
  • Did the action improve the issue? If not, why not?

By beginning our retrospective with the previously agreed actions, we reinforce to the team the purpose of our retrospectives: to share examples of our team's anchors and engines, to discuss their root causes, and to action improvements to our process.