Draft: Putting the Kibosh on Getters and Setters


Perspective

I recently picked up a new book on object-oriented programming called “Object-Oriented Design Heuristics” by Arthur J. Riel. In the book, Riel explains programming’s natural evolution from procedures and data structures to object-oriented design. He talks about how programmers adapted their development style to reduce the effects of changes to data structures.

In procedural programming, programmers develop software by decomposing a system into procedures (or functions) that act on a set of shared data structures. One of the problems associated with this type of programming is what Riel calls the “unidirectional relationship between data and behavior”. This unidirectional relationship means you can tell which data a procedure depends on by looking at the procedure (its parameters, local variables, and global variable access), but you can’t tell which procedures use a data structure by looking at the data structure. Take the following pseudocode for example:

struct Account
{
	int accountNumber;
	string name;
	double balance;
}

Without looking ahead, can you tell what behaviors are available for the Account struct? Do you know what they look like and what parameters they use? How do they affect the Account struct?

Imagine in this pseudocode that deposit and withdraw functions live in one file and the Account struct lives in another. Someone looking at the functions knows they manipulate an Account. Someone looking at the Account struct by itself doesn’t know that the deposit and withdraw functions use it, or that possibly hundreds of other functions in many other source code files use it. It’s also difficult to follow and control the structure’s state changes in any significantly complex system.
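To make the scenario concrete, here is a minimal Java sketch of that procedural arrangement. The class name AccountFunctions and the function bodies are assumptions made for illustration. The point is only that nothing in the data type points back to the functions that use it.

```java
// Hypothetical sketch: a bare data holder, standing in for the Account struct.
final class Account {
    int accountNumber;
    String name;
    double balance;
}

// The behavior lives somewhere else entirely (imagine a different file).
// Reading Account alone gives no hint that these functions exist.
final class AccountFunctions {
    static void deposit(Account account, double amount) {
        account.balance += amount;
    }

    static void withdraw(Account account, double amount) {
        account.balance -= amount;
    }
}
```

Both functions depend on balance being a double, which is exactly the kind of hidden dependency that makes changing the data structure risky.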

This unidirectional relationship causes problems when multiple programmers try to work with the same parts of the system. For example, assume one programmer wrote the deposit and withdraw functions. Another programmer starts working on a transfer function. The programmer writing transfer decides to redefine account balance as type Money instead of double. So this programmer makes the following change:

struct Account
{
	int accountNumber;
	string name;
	Money balance;
}

function transfer(Account fromAccount, Account toAccount, Money amount) {
	fromAccount.balance -= amount;
	toAccount.balance += amount;
}

The changes just broke the deposit and withdraw functions and any other functions dependent on the fact that Account stores its balance as a double. Someone has to update all functions that use doubles to instead use the Money type by either changing the parameters or converting double types to Money types in the functions. This is not a big deal with only a few functions, but it doesn’t scale well in a system of several thousand lines of code. Think of all the procedures, unrelated to deposit, withdraw, and transfer, that might be touching this property.

Consider another situation in which one developer adds a property to the struct to accommodate a new function. Every place that needs to handle all of the properties must then be updated, for example:

struct Account
{
	int accountNumber;
	string name;
	Money balance;
	Date dateOpened; // Added dateOpened property
}

function copy(Account from, Account to) {
	to.accountNumber = from.accountNumber;
	to.name = from.name;
	to.balance = from.balance;
	// dateOpened is not copied.
}

Since the functions and data are physically separate, a programmer may forget to update this function because they don’t know it exists.

Programmers managed the complexity introduced by unidirectional relationships between data and behavior by creating bidirectional relationships between data and behavior, which ultimately evolved into OOP. Here is what Riel says:

When first exposed to the [procedural] solution to the preceding problem [regarding changes to shared data structures], many people state that they do not build [procedural] software in this way. They place each data structure of their system in a separate file in which they include any functions in the system relying on that data structure. It is true that this makes the system more maintainable, but what is the file they have created? It is the bidirectional relationship between data and behavior, namely, a class, in object-oriented terminology. What the object-oriented paradigm does is to replace the convention of encapsulating data and behavior using the file system with a design-language-level mechanism. In short, it takes programming by convention carried out by the best programmers and replaces it with a lower-level mechanism.

OOP provides behavioral and declarative interfaces to objects based on this evolution in design. It’s not that data is unimportant in OOP. What’s important is that the data and state of objects are hidden and have bidirectional relationships with the functions that manipulate them. The behaviors (methods) become the gateway to the data, and force objects to interact by invoking them.
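As a rough illustration of that bidirectional relationship, here is a minimal Java sketch. The class layout, the use of BigDecimal in place of a Money type, and the insufficient-funds rule are all assumptions made for the example, not something taken from Riel’s book.

```java
import java.math.BigDecimal;

// Hypothetical sketch: the data and the behavior that changes it live in one class,
// so reading the class tells you everything that can happen to the balance.
final class Account {
    private final int accountNumber;
    private BigDecimal balance; // hidden; only the methods below touch it

    Account(int accountNumber, BigDecimal openingBalance) {
        this.accountNumber = accountNumber;
        this.balance = openingBalance;
    }

    void deposit(BigDecimal amount) {
        balance = balance.add(amount);
    }

    void withdraw(BigDecimal amount) {
        // Assumed rule: the account itself refuses to go negative.
        if (balance.compareTo(amount) < 0) {
            throw new IllegalArgumentException("insufficient funds");
        }
        balance = balance.subtract(amount);
    }

    // Renders a summary without handing the raw balance to callers.
    @Override
    public String toString() {
        return "Account " + accountNumber + ": " + balance.toPlainString();
    }
}
```

If the type of balance changes later, the only code that has to change is inside this class.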

Hate is a strong word, but I really, really, really don’t like getters and setters

I would like to say that we should never use getters and setters, but I’m dangerously unqualified to make absolute statements like that. Instead I will say I have strong evidence indicating getters and setters are a first-class mistake.

There is a typical argument against the idea that getters and setters are bad, which is, “…but getters and setters are methods, and methods provide encapsulation because you don’t know if the data you’re getting and setting is stored in a variable, or calculated, or got-from/set-to another object.”

The method’s implementation doesn’t matter. What matters is the metaphor. The metaphor of getters and setters is that of a data structure. Other classes using the getters and setters implement the behavior separately. This separates data and behavior and reverts to the unidirectional relationship that OOP is intended to remove. Basically, getters and setters break OOP.

Here’s what Riel says:

…accessor methods are not dangerous because they give away implementation details – they are dangerous because they indicate poor encapsulation of related data and behavior. Why is someone getting [the data]? What are they doing with [the data], and why isn’t the … class doing the work for them?

Riel emphasizes that there is nothing wrong with procedural programming. It has its uses and benefits, just as OOP has its uses and benefits in managing complexity in large applications.

It becomes a problem when we intend to do object-oriented programming, but are creating unidirectional relationships between data and behavior. Doing this removes the benefits of OOP. If you’re not interested in the benefits of OOP, then feel free to continue with getters and setters as long as you’re aware that the system is not object-oriented and you should not expect to receive its benefits.

Even in the future nothing works

In modern OOP, instead of the struct from above we have, for example, in Java:

public class Account
{
    private int accountNumber;
    private String name;
    private Money balance;

    public int getAccountNumber() {...}
    public void setAccountNumber(int accountNumber) {...}

    public String getName() {...}
    public void setName(String name) {...}

    public Money getBalance() {...}
    public void setBalance(Money balance) {...}
}

or in C#:

public class Account
{
	public int AccountNumber { get; set; }

	public string Name { get; set; }

	public Money Balance { get; set; }
}

These are metaphorically identical to the struct from the first example. Look at these classes and try to guess what they do. They don’t do anything. They get and set values for the programmer to use. But the programmer is not part of an object-oriented system. Methods are meant to implement messages sent by other objects. They are not attributes for the programmer to use for scripting.

Furthermore, we can’t trust this object. Client code needs defensive checks to make sure the values returned from the getters are not null. Every other object that uses this class needs to defensively check return values.

Some applications will have static utility functions to use with a class like this to correctly get values. In a large enough application only a handful of programmers will know these utility functions exist or remember to use them. The rest will repeat null checks everywhere.

public void deposit(Account account, Money amount) {
	Money balance = account.getBalance();

	if(balance != null) {
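		// Java has no operator overloading, so this assumes Money offers a plus method that returns the sum.
		account.setBalance(balance.plus(amount));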
	}
}

It gets worse when building all objects like data structures, because we need to dig into them…

// Remember getBalance returns a Money type?
// I guess getCurrency returns a Currency type.
account.getBalance().getCurrency().getAbbreviation();

That looks like a fast track to a NullPointerException… Needs more null checks:

if(account.getBalance() != null && account.getBalance().getCurrency() != null
	&& account.getBalance().getCurrency().getAbbreviation() != null) {
	account.getBalance().getCurrency().getAbbreviation();
	// Whatever we were going to do...
}

If someone handed you this Account class, would you know what to do with it? Would you know where to find all the functions needed to implement a feature? Would you know which chain of getters to follow to get the object containing the data you need to do some work? How many times will that chain of getters be written? How many unrelated parts of the system change when one of the objects in the chain changes?

Additionally, when we have things like Service classes manipulating objects using getters and setters, any relationship to the actions the service is performing is lost outside of the service. If a developer needs to reuse part of some functionality performed in a service, the developer needs to know that service exists. In a system with lots of services, developers may not know a service exists. They end up repeating logic that manipulates the state of an object that they could otherwise reuse if the class had methods that are bidirectionally related to the code that uses the class.

Lastly, getters and setters increase the surface area of a class, requiring client objects to know more about the class and increasing the number of functions that have to change when the class’s data changes.

Readability

Harold Abelson said in the book, “Structure and Interpretation of Computer Programs”, “Programs must be written for people to read, and only incidentally for machines to execute.”

One of the benefits of bidirectionality is readability. Instead of arbitrary return values with no meaningful context, we see objects having meaningful interactions with collaborators.

Imagine the last account example was part of a procedure to transfer money between accounts and was checking the currency to see if it matches or requires exchange. We can rewrite it as:

public class Account {
	private Money balance;

	public void transferAmountToOtherAccount(Money amount, Account otherAccount) {
		balance.subtract(amount);
		otherAccount.transferAmountFromAccount(amount, this);
	}
}

Ignoring how terrifyingly bad this banking software is and just examining the interactions, one thing you’ll notice is there is no publicly visible state. All state is encapsulated and only accessible by the objects/classes themselves. For example, we don’t know the state of the Account object when we call transferAmountToOtherAccount, nor does the Account know the state of the other objects when it communicates with them. The objects should use their state to respond appropriately. Effectively what this does is remove state as a metaphor allowing us to program declaratively.

What happened to the currency exchange? The Money object will take care of it. Currency checks are unnecessary details that getters and setters force us to deal with.
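Here is one way a Money class might absorb the currency exchange, sketched in Java. Everything inside it is an assumption made for illustration: the hard-coded rate table (with a made-up EUR rate chosen for easy arithmetic), the add method, and the private amountIn helper. A real system would get its rates from somewhere else.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Map;

// Hypothetical sketch: Money converts between currencies itself,
// so no caller ever has to ask it for a currency.
final class Money {
    // Assumed demo rates: USD per one unit of each currency. Not real rates.
    private static final Map<String, BigDecimal> USD_PER_UNIT = Map.of(
            "USD", new BigDecimal("1.00"),
            "EUR", new BigDecimal("2.00"));

    private BigDecimal amount;
    private final String currency;

    Money(String amount, String currency) {
        this.amount = new BigDecimal(amount);
        this.currency = currency;
    }

    // Mutating style, matching the balance.subtract(amount) call in the Account example.
    void subtract(Money other) {
        amount = amount.subtract(other.amountIn(currency));
    }

    void add(Money other) {
        amount = amount.add(other.amountIn(currency));
    }

    // Expresses this amount in the target currency, going through USD when they differ.
    private BigDecimal amountIn(String targetCurrency) {
        if (currency.equals(targetCurrency)) {
            return amount;
        }
        BigDecimal inUsd = amount.multiply(USD_PER_UNIT.get(currency));
        return inUsd.divide(USD_PER_UNIT.get(targetCurrency), 2, RoundingMode.HALF_EVEN);
    }

    @Override
    public String toString() {
        return amount.toPlainString() + " " + currency;
    }
}
```

With something like this in place, the Account code stays a single balance.subtract(amount) call whether or not the two accounts share a currency.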

We also don’t have to worry so much about null checks because we’re not querying chains of getters that might return a null reference. It’s a good idea to remove nulls from code, so we can code assuming every object is a valid reference. However, should this code require null checks they only have to be written once within the method or class, so we’re not exposing clients of the class to null objects.

Unidirectional Data Flow

Continuing with the previous example, you’ll also notice the methods have no return values. This is unidirectional data flow: information flows one way. This is reminiscent of Continuation-Passing Style (CPS). Unidirectional flow is a factor of James Ladd’s East concept, which I’ve written about previously. An alternative to void methods is to always return this to allow method chaining.

If you saw the interactions on a sequence diagram, you’d see lots of arrows going to the right as they pass objects along. If an object needs to talk to an object earlier in the sequence (to the left), then that object must pass itself (this) along.
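The “return this” alternative could look something like the following Java sketch. The method names and the use of a plain long for cents are assumptions made to keep the example short.

```java
// Hypothetical sketch of returning this instead of void.
final class Account {
    private long balanceInCents;

    Account deposit(long cents) {
        balanceInCents += cents;
        return this; // no data leaves the object, only the object itself
    }

    Account withdraw(long cents) {
        balanceInCents -= cents;
        return this;
    }

    // Pushes a summary into a collaborator instead of returning the balance.
    Account printStatementTo(StringBuilder out) {
        out.append("balance: ").append(balanceInCents).append(" cents");
        return this;
    }
}
```

A caller can then write new Account().deposit(10000).withdraw(2500).printStatementTo(out) and the information still only flows one way: into the objects being told what to do.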

If we can’t return something like an error or success code, how might we handle things like errors? Maybe something like this:

public class Account {

	public void transferAmountToOtherAccountWithErrorHandler(Money amount, Account otherAccount, TransferErrorHandler errorHandler) {
		if(<some kind of error>) {
			errorHandler.errorInSourceAccount(new Error(...), this);
			return;
		}

		balance.subtract(amount);
		otherAccount.transferAmountFrom(amount, this, errorHandler);
	}
}

There is no need to return an arbitrary error value and have the error logic in disparate parts of the code. The abstract logic for transferring money and handling transfer errors is in one place. Classes implementing the TransferErrorHandler interface can take care of the details, making it easy to enumerate the various ways errors are handled.
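For a version that actually runs, here is a minimal Java sketch of the same idea. The shape of TransferErrorHandler, the insufficient-funds condition, the CollectingErrorHandler class, and the use of a long for cents instead of Money are all assumptions filled in for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: the signature is an assumption made for this sketch.
interface TransferErrorHandler {
    void errorInSourceAccount(String reason, Account source);
}

final class Account {
    private long balanceInCents;

    Account(long openingBalanceInCents) {
        this.balanceInCents = openingBalanceInCents;
    }

    void transferAmountToOtherAccount(long cents, Account other, TransferErrorHandler errorHandler) {
        if (cents > balanceInCents) { // assumed error condition
            errorHandler.errorInSourceAccount("insufficient funds", this);
            return;
        }
        balanceInCents -= cents;
        other.receive(cents);
    }

    private void receive(long cents) {
        balanceInCents += cents;
    }

    @Override
    public String toString() {
        return "Account[" + balanceInCents + " cents]";
    }
}

// One of many possible handlers: collect messages to show to someone later.
final class CollectingErrorHandler implements TransferErrorHandler {
    final List<String> messages = new ArrayList<>();

    @Override
    public void errorInSourceAccount(String reason, Account source) {
        messages.add(source + ": " + reason);
    }
}
```

Other handlers could log, retry, or notify a user. Account never needs to know which one it was given.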

A key point here is an object owns its data. Objects can’t pull data out of other objects, but they can ask other objects to release their data to an appropriate interface. This gives the object an opportunity to refuse the message. Normally, refusal is indicated by returning null, causing obvious problems. Instead we can show acceptance or refusal by sending or not sending a message, respectively.

Instead of:

Money bal = account.GetBalance();

if(bal != null)
	DoSomethingThatMayInvolveBalance(bal);

We have:

account.DoSomethingThatMayInvolveBalance(aThingThatIsPartOfThisInteraction);

public class Account {
	public void DoSomethingThatMayInvolveBalance(AThingThatIsPartOfThisInteraction interactingObj) {
		if(this.balance != null)
			interactingObj.DoThingWith(this.balance);
	}
}