Code Smell 231 - Redundant Data
11 November, 2023
Where are your sources of truth?
TL;DR: Say it only once
Problems
- Don't Repeat Yourself principle violation
- Consistency problems
- Maintainability
- Testing and Debugging
Solutions
- Keep each responsibility in the relevant object and delegate to a single source of truth
Context
The principle of "Don't Repeat Yourself" (DRY) encourages you to avoid redundancy and duplication of behavior.
Redundant data can lead to inconsistencies because updates or changes need to be made in multiple places.
If you update one instance of the data and forget to update another, your system can become inconsistent, which can lead to errors and unexpected behavior.
Maintaining redundant data makes every change harder: it increases the workload and the likelihood of introducing errors during maintenance.
With a single source of truth, you only need to make changes in one place, simplifying the maintenance process.
When data is repeated in multiple places, it becomes difficult to identify the authoritative source of that data, leading to confusion for developers.
Sample Code
Wrong
class Transfer:
    def __init__(self, amount, income, expense):
        self.amount = amount
        self.income = income
        self.expense = expense

class Income:
    def __init__(self, amount):
        self.amount = amount

# amount is the same for party and counterparty
class Expense:
    def __init__(self, amount):
        self.amount = amount

transfer_amount = 1000
# simplification: should be a money object with the currency
income = Income(transfer_amount)
expense = Expense(transfer_amount)
transfer = Transfer(transfer_amount, income, expense)

print("Transfer amount:", transfer.amount)
print("Income amount:", transfer.income.amount)
print("Expense amount:", transfer.expense.amount)
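The risk in this design can be made concrete with a minimal sketch. The classes are repeated so the snippet runs on its own; the correction to 1200 and the `is_consistent` check are illustrative assumptions, not part of the original example.

```python
class Transfer:
    def __init__(self, amount, income, expense):
        self.amount = amount
        self.income = income
        self.expense = expense

class Income:
    def __init__(self, amount):
        self.amount = amount

class Expense:
    def __init__(self, amount):
        self.amount = amount

transfer = Transfer(1000, Income(1000), Expense(1000))

# Hypothetical correction: only one of the three copies is updated
transfer.amount = 1200

# The three copies no longer agree on the amount
copies = {transfer.amount, transfer.income.amount, transfer.expense.amount}
is_consistent = len(copies) == 1

print("Transfer amount:", transfer.amount)               # 1200
print("Income amount:", transfer.income.amount)          # 1000
print("Expense amount:", transfer.expense.amount)        # 1000
print("Consistent:", is_consistent)                      # False
```

Nothing in the code stops this drift, and nothing says which of the three values is the authoritative one.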
Right
class Transfer:
    def __init__(self, amount):
        self.amount = amount
        self.income = Income(self)
        self.expense = Expense(self)

class Income:
    def __init__(self, transfer):
        self.transfer = transfer

    def get_amount(self):
        return self.transfer.amount

class Expense:
    def __init__(self, transfer):
        self.transfer = transfer

    def get_amount(self):
        return self.transfer.amount

transfer_amount = 1000
transfer = Transfer(transfer_amount)

print("Transfer amount:", transfer.amount)
print("Income amount:", transfer.income.get_amount())
print("Expense amount:", transfer.expense.get_amount())
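As a quick check of this version, the sketch below changes the amount in one place and reads it back through the delegating objects. The classes are repeated so the snippet runs on its own, and the new value of 1200 is an illustrative assumption.

```python
class Transfer:
    def __init__(self, amount):
        self.amount = amount
        self.income = Income(self)
        self.expense = Expense(self)

class Income:
    def __init__(self, transfer):
        self.transfer = transfer

    def get_amount(self):
        # Delegates to the transfer instead of storing its own copy
        return self.transfer.amount

class Expense:
    def __init__(self, transfer):
        self.transfer = transfer

    def get_amount(self):
        # Delegates to the transfer instead of storing its own copy
        return self.transfer.amount

transfer = Transfer(1000)

# Hypothetical correction, made once on the single source of truth
transfer.amount = 1200

print("Transfer amount:", transfer.amount)                 # 1200
print("Income amount:", transfer.income.get_amount())      # 1200
print("Expense amount:", transfer.expense.get_amount())    # 1200
```

Because `Income` and `Expense` hold no copy of the amount, there is nothing left to synchronize.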
Detection
[X] Manual
This is a semantic smell
Exceptions
- For performance reasons, you can add caches and redundancy, but you need extra effort to keep the data synchronized
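One possible shape for such a cache is sketched below. The `Account` class, its `add_amount` and `total` methods, and the sample amounts are hypothetical names and values chosen for illustration. The list of amounts stays the single source of truth, and the cached total is discarded on every write.

```python
class Account:
    """Hypothetical account that caches the sum of its amounts."""

    def __init__(self):
        self._amounts = []          # single source of truth
        self._cached_total = None   # redundant copy, kept only for speed

    def add_amount(self, amount):
        self._amounts.append(amount)
        # Extra effort required by the cache: invalidate on every write
        self._cached_total = None

    def total(self):
        # Recompute only when the cache has been invalidated
        if self._cached_total is None:
            self._cached_total = sum(self._amounts)
        return self._cached_total

account = Account()
account.add_amount(1000)
account.add_amount(250)
first_total = account.total()    # computed from the source, then cached

account.add_amount(50)           # invalidates the cached total
second_total = account.total()   # recomputed from the source

print(first_total, second_total)  # 1250 1300
```

If the invalidation line in `add_amount` were forgotten, `total()` would keep returning 1250, which is the same inconsistency the smell describes.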
Tags
- Data
Conclusion
In larger and more complex systems, redundancy becomes a significant problem.
As your system grows, the challenges associated with maintaining and synchronizing redundant data also increase.
Redundant data also increases the surface area for testing and debugging.
You need to ensure that all copies of the data behave consistently, which can be a challenging task.
Disclaimer
Code Smells are my opinion.
Credits
Photo by Jørgen Håland on Unsplash
Everything will ultimately fail. Hardware is fallible, so we add redundancy. This allows us to survive individual hardware failures, but increases the likelihood of having at least one failure at any given time.
Michael Nygard
Software Engineering Great Quotes
This article is part of the CodeSmell Series.