Generative Design is Doomed to Fail

Daniel Davis – 20 February 2020

Update, May 21, 2023: I wrote this article back in 2020. At the time, Autodesk was a leader in generative design and was pitching it as the future of design software. After hearing this spiel one too many times, I penned this article critiquing their vision (as you can gather from the title, I wasn’t a fan). Fast forward three years and the generative design landscape is totally different. New tools like ChatGPT, Dall-E, and MidJourney have exploded in popularity, overtaking Autodesk as the leaders in this space, and leaving this article feeling a little dated (already!). In many ways, this article now serves as a strange time capsule – an old collection of issues with Autodesk’s earlier version of generative design.

To bring this article up to date, I’ve added a short epilogue exploring what happened to generative design, why Autodesk’s strategy didn’t work, and reflecting on what I got right and wrong in the original article (spoiler alert: generative design didn’t die!).

A concerned Autodesk representative pulled me aside at an event recently. “I read your article,” she began.

I tried to recall whether I’d said anything controversial. But my most recent article was relatively tame, just 1,300 words in Architect Magazine about algorithms generating building layouts. If anything, it was complimentary of Autodesk.

Around us, people at the conference were discussing the industry’s most pressing issues – robots, automation, climate change. The representative leaned in to reveal hers: “I noticed you didn’t mention generative design in your article.”

The representative worked for Autodesk’s communications team. As you’ll be aware, Autodesk has been ramping up its efforts to brand and promote generative design, putting out a series of videos, articles, and presentations that tout the benefits of the generative process. Others in the industry have followed suit, announcing their own generative design tools in superlative-laced press releases. None of this is new. People have been peddling generative design as far back as the 1980s. But it never had the clout of someone like Autodesk. After years of never really going anywhere, suddenly everyone is talking about generative design. Suddenly it feels inevitable.

The Autodesk representative seemed taken aback when I told her that I didn’t believe the hype. That I avoided using the term ‘generative design’ in the article because I didn’t think it was worth promoting. That it was a distraction. A white whale. That it wasn’t the future of design, or anything. That it was a dead-end. That something more significant was happening. That I needed more time to explain.


The three steps of generative design: specifying goals, generating solutions, and selecting the best option (source).

On the surface, generative design is an enticing vision. Rather than employing a designer to laboriously create a design concept, you can instead use an algorithm to quickly generate thousands of options and ask the designer to pick the best one. Effectively, the designer becomes an editor. They specify the goals of the project, an algorithm churns out an array of options, then the designer returns to select the strongest idea, and – voilà – you’ve got a building. Since the algorithm can produce countless design concepts, the designer can, in theory, consider more possibilities than they would on a typical project, improving the chances of finding an optimal design or discovering a novel solution. A better design with less effort, what’s not to like?

Comparing and selecting design options at the end of the generative design process using Autodesk’s Project Refinery (source).

Some of you are going to disagree with how I’ve characterized generative design. In the current vernacular, the term ‘generative design’ has a reasonably loose meaning. This isn’t unusual — other technical terms like ‘parametric’ or ‘machine learning’ have grown more vague as they have become more popular. In the case of generative design, the word ‘generative’ is often confused as a catch-all for ‘generated’. Throughout this article, I’m going to refer to generative design as a three-stage process where (1) designers define the project’s goals, (2) algorithms produce a range of solutions, and (3) then designers pick the best result. Although you might quibble with this definition, this is how Autodesk defines generative design today, and it’s what many people are currently pushing as the future of design. For the purposes of this article, I’m only focused on this prevailing definition (if you think this process shouldn’t be called ‘generative design,’ if you’d rather call it optioneering or something else, you can change the name used in this article).
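To make the three-stage definition above concrete, here is a minimal sketch in Python. It is a toy, not Autodesk's implementation: the parametric model (a rectangular floor plate), the single goal (hit a target area), and the names `generate_options`, `score`, and `shortlist` are all assumptions invented for illustration.

```python
import random

def generate_options(n, seed=0):
    """Stage 2: churn out n candidate floor plates from a toy parametric model."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [
        {"width": rng.uniform(5.0, 20.0), "depth": rng.uniform(5.0, 20.0)}
        for _ in range(n)
    ]

def score(option, target_area):
    """Stage 1: the designer's goal, reduced to one number (higher is better)."""
    return -abs(option["width"] * option["depth"] - target_area)

def shortlist(options, target_area, k=5):
    """Stage 3: rank the options so a designer can pick from the top k."""
    return sorted(options, key=lambda o: score(o, target_area), reverse=True)[:k]

options = generate_options(100)
candidates = shortlist(options, target_area=150.0)
for c in candidates:
    print(f"{c['width']:.1f} m x {c['depth']:.1f} m = {c['width'] * c['depth']:.1f} m2")
```

Note how much the sketch leaves out: the generator can only ever produce rectangles, and the score only knows about area. Both limits come from the person who wrote the script, not from the design problem.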

Whatever you want to call it, I’m deeply concerned that many in the industry are advocating that generative design is the future of architecture. As I’ll explain in this article, once you get beyond the marketing hype, there are real technical and human reasons why generative design’s three-step process is doomed to fail.

The Worst Way to Write an Email

To understand the absurdity of generative design, I think it helps to imagine generative design in a different context, to see it as a naked idea without the baggage of the architecture industry.

Consider email. According to McKinsey, the average worker spends about 11 hours a week reading and answering emails. Each email is hand-crafted, letter by letter, word by word. Clack, clack, clack. It’s easy to see why talented people hate doing this menial work.

So why not reinvent email? Rather than typing out each email, why not have an algorithm generate the first draft? Or a hundred first drafts? Why not make a generative email program? You pick the subject, an algorithm writes some options, you read them, choose the best, and hit send. Not only would you save time, but you’d also probably end up sending better emails because you can explore more possibilities and spend longer considering what you’re saying. What’s not to like?

All of this is technically possible. In fact, I’ve mocked up a quick prototype below.

GenerativeMail

Created with gpt2, gpt2 Cloud Run, and Paracord

Why Generative Design Doesn’t Work

I imagined that it’d be instructive to see generative design used in the most ridiculous way possible. But as I ran the generative email program for the first time, I was no longer sure if it was such an absurd concept. I mean, sure, most of the emails were incoherent, but every now and again one of them would be unexpectedly erudite. In those moments, you could see the potential. You could imagine that with better algorithms, an improved interface, and faster computers that this crazy idea might actually work.

Writing about generative design using generative design.

Generative design often appears close to working. It’s been that way for decades. Time and time again we’re strung along by seductive demonstrations and fooled into thinking we’re on the cusp of a breakthrough. These demos are easy enough to create. Take an algorithm that spits out hundreds of random designs, develop an interface to display them, combine, and you’ve got a passable mockup of generative design. Want to truly impress? Apply your prototype to a simplified design problem and explain that it’ll work just the same in a complicated, real-life situation. If there are any concerns about the quality of results, draw upon your inner techno-optimist to explain that the algorithm will improve with time. It’s that simple. It’s the sort of thing you can create in a weekend, on a lark, to illustrate a blog post.

While it’s trivial to show that generative design is possible, it’s much harder to take the next step and show that generative design is useful. In fact, it rarely happens. This is the real challenge of generative design: going from the plausible to the practical. Up until now, we’ve just been doing the easy bit, we’ve been showing that it’s possible. This feels like progress, yet the hard part is still to come. I don’t want to be a downer, but I don’t think we’ll get there. By my count, there are 6 major reasons why generative design is unlikely to progress.

1. You’re on the hook for generating the options

The way generative design is sold, it often appears that a designer is only involved in defining the project’s goal and picking the best options. In reality, the designer is also responsible for creating the algorithm that generates the plans. Which is no small feat.

In the generative email program, the text is written by an algorithm called GPT-2. This program, developed by OpenAI, builds upon decades of research on neural networks and natural language processing. The result is an algorithm that can write everything from New Yorker articles to Harry Potter screenplays (GPT-2 was so convincing that OpenAI initially held it back, calling the software ‘too dangerous to release’ because of its ability to automate the production of fake content).

These chairs look similar because the underlying generative algorithm is limited in what it can produce. If you wanted to create a chair that looked different, you’d need to spend time rewriting the generative algorithm (source).

There is no GPT-2 for buildings. That is to say, if you’re using generative design, there is no pre-built mechanism for generating all the design options. Instead, you have to create your own system. From scratch. This is a bit like creating a factory that manufactures design schemes. If the factory is repetitively making a reasonably uniform product, then it’s relatively straightforward to set up. But if you want to produce a lot of variation, then it can get really complicated. In many cases, it will take more time and skill to set up the factory compared to doing the work manually. To avoid this complexity, people tend to limit what the factory can produce, which is why demonstrations of generative design often churn out hundreds of similar-looking design options. Rather than exploring the full range of design outcomes, you end up exploring what the algorithm can create. Often this produces less exciting outcomes and takes longer than you’ve been led to believe.

2. Quantity doesn’t substitute for quality

If your boss asks you to sketch out a proposal, what is the right number of plans to produce? Perhaps you’d return with 3 to 5 ideas. If you’re feeling confident, you might advance just a single proposal. But you’d never, in any situation, come to your boss with a presentation containing 100 different schemes. It’d be absurd.

Yet, with generative design, we routinely generate hundreds of different options. And we celebrate this like it’s a virtue. The thinking is simple: the algorithms can’t tell good ideas from bad, but they can create designs incredibly quickly, so if we rapidly produce hundreds of options, we increase the chances of inadvertently generating a good design. Effectively, we buy more lottery tickets.
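The lottery-ticket logic can be put in numbers. The sketch below assumes, purely for illustration, that every generated option independently has the same small chance `p_good` of being a good design; real generators don't behave this neatly, and the 1% figure is made up.

```python
def chance_of_a_hit(p_good, n_options):
    """Probability that at least one of n independent options is good."""
    return 1.0 - (1.0 - p_good) ** n_options

# Assumed hit rate of 1% per option, chosen only for illustration.
for n in (1, 10, 100, 1000):
    print(n, round(chance_of_a_hit(0.01, n), 3))
```

Under that assumption, 100 options give roughly a 63% chance that at least one good design is somewhere in the pile. The calculation says nothing about which one it is, or how long it takes a person to find it.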

The deluge of options obscures the fact that most of the outcomes aren’t viable. In the case of the generative email program, if the algorithm was any good, it’d be able to select the 3 to 5 most compelling drafts. If it was really confident, it’d select just one. But instead, we have algorithms that thoughtlessly create hundreds of options. This isn’t a virtue, it’s not the future, it’s a byproduct of lousy software. The fact of the matter is: one hundred shitty designs aren’t anywhere near equivalent to one considered design. If your software was any good, it’d produce fewer designs, not more.

3. Comparing options is harder than it looks

Design options from Parafin (source).

Once you’ve generated all of these options, a person needs to select the best proposal. This is one of the main appeals of generative design – the algorithm handles the laborious work of creating the options, and all you have to do is sit back and pick your favorite.

It sounds leisurely, but it is actually difficult work. Ask any professor: would you rather grade 100 student essays or write one article of your own? Truth is, it takes effort to consider a design option seriously. And the challenge of evaluating different options only increases as the output becomes more involved and more complex. For example, 100 emails can be skimmed relatively quickly, but 100 books would require a lot of reading. Now imagine comparing 100 different buildings, I mean really comparing them, not just skimming through the images – oy vey!

Further complicating things, humans hate having too many choices. More choices give us more opportunities to make the wrong decision (which is something we fear), and the choices make it cognitively challenging to recall options and draw comparisons. This is sometimes called ‘overchoice’ or ‘the paradox of choice.’ Research shows that we particularly dislike being given a lot of similar alternatives, as is often the case for generative design, since there is no clear winner, leaving us to make a seemingly impossible choice between nearly identical options.

To put it simply, presenting designers with a lot of options is generally a terrible idea. Designers will find it stressful, they’ll struggle to make meaningful evaluations and comparisons, and it might not save as much time as you’d expect because it’s such an involved process to do well.

4. What you can measure isn’t what matters

Proponents of generative design argue that having too many options isn’t a problem because you can always hide the bad ones. You just need to measure the performance of each option and remove anything that doesn’t match the designer’s performance criteria.

In the generative email program, you can filter the emails by length. Admittedly, the length isn’t the best performance metric, but it’s easy to calculate. Perhaps in the future, we’ll be able to measure more critical factors, like the text’s persuasiveness or wittiness. But it’s not guaranteed that we’ll get there. Just because you can count the number of words in an email, doesn’t mean that one day you’ll be able to measure these more visceral concepts.
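A length filter of the kind described above takes only a few lines, which is exactly why it gets built. This is a hypothetical sketch, not the code behind the prototype: the function names and the sample drafts are invented for illustration.

```python
def word_count(text):
    """Count whitespace-separated words: trivial to compute, weakly related to quality."""
    return len(text.split())

def filter_by_length(drafts, min_words, max_words):
    """Keep only the drafts whose word count falls inside the given range."""
    return [d for d in drafts if min_words <= word_count(d) <= max_words]

# Invented sample drafts, standing in for generated emails.
drafts = [
    "Thanks, see you Tuesday.",
    "Hi Sam, the drawings are attached. Let me know if the revised core layout works for you.",
    "Ok.",
]
kept = filter_by_length(drafts, min_words=3, max_words=10)
print(kept)
```

The filter does its job, but it has no idea which draft is persuasive or witty. There is no equally short function to write for those.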

In the field of architecture, there’s no consensus on what constitutes good architecture and no established ways of measuring it. So we measure something else. In the 1960s and 70s, a lot of research focused on evaluating buildings in terms of walking times between rooms, which was easily calculated but not particularly important. Today, we might look at solar gain or view analysis, which is a component of architectural performance but not the full story. Perhaps in the future, we’ll be able to quantify other more visceral aspects of architectural performance, but I’m not holding my breath.

For people using generative design, this puts them in a bind. They can either use these arbitrary metrics and end up optimizing for the wrong thing, the thing that can be easily measured. Or they can ignore the metrics and wade through a lot of unfiltered options. I don’t see this situation improving any time soon – architectural performance is so complicated that we may never get to a place where we can quantify it and use it as a filter.

Apartment layouts are compared using solar potential, revenue, and program, which are easily calculated measures of performance but not the full picture. It is easy to inadvertently optimize for the calculable rather than the important (source).

5. Designers don’t work like this

Generative design simplifies the design process into three steps: briefing, ideation, and deciding. This is a gross simplification of what architects actually do. It feels like a caricature cooked up to tease designers, ‘oh, you know those melodramatic architects, all they have to do is take the brief, create a bunch of options, and pick their favorite, what’s so hard about that?’ Honestly, it’s insulting.

Study after study has shown that designers don’t follow a linear process, that design is necessarily messy and iterative. You experience this writing emails. You’ll write something down, re-read it, realize it sounds wrong, revise, re-read, edit, and iteratively work towards a final draft. At WeWork, Andrew Heumann recorded architects as they worked, and observed a similar pattern as he watched designers rapidly cycle between macro and micro changes, between the broader project objectives and the specific implementation.

In demonstrations of generative design, the iterative nature of the design process isn’t apparent because there are no real stakes. You don’t have the pressure of design reviews, the insanity of client revisions, budget cuts, and public submissions. You’re playing a designer on easy mode.

On a real project, you’ll never get it right the first time – the generative design algorithms aren’t good enough, and the circumstances of the project will change once you’ve created your first draft. So you have to make revisions. And generative design doesn’t accommodate revisions since it assumes the design process only moves forward. To make a revision, you either need to throw everything out and start the generative design process again, or you can abandon using generative design and make the change manually. Either way, generative design makes it hard for designers to work iteratively.

6. No one else works like this

The most damning indictment of generative design is that you don’t see it used in other creative fields. Adobe isn’t holding press conferences saying that generative design is the future of graphic design (InDesign will just be an interface where you upload your text, the software creates 100 different page layouts, and the designer picks their favorite). Apple isn’t pushing generative design for Final Cut (upload your raw footage, the software edits 100 different films, you watch them all, and pick your favorite). Microsoft isn’t adding generative design to Word. Autodesk isn’t even hawking generative design to their other key markets, such as media and entertainment.

To be fair, there are some things on the market today that resemble generative design. Spotify, for instance, will automatically generate several playlists and let you pick your favorite to listen to. But Spotify has an advantage: they can build this once and sell it to millions of customers, they’re not creating a one-off algorithm to redesign their office. Additionally, the interface is quite different to what we’ve been calling generative design – you don’t give Spotify a brief, and the software only produces a handful of carefully curated options (it’s not randomly putting songs into hundreds of different playlists).

Flight booking websites essentially follow a generative process, you enter the brief (dates and destination), it creates dozens of itineraries, and you select the best combination. But is it the future of design?

The only place that I’ve really seen generative design thrive is on flight booking websites. These websites essentially take you through a generative process: 1) you specify the dates and destination, 2) the software generates dozens of different itineraries, 3) and you filter the routes by time, cost, and stopovers, and then select the best one. It works well. But anyone that’s used Google Flights and thought that it’d make a good design interface is out of their fucking mind.

If not generative design?

Generative design is our industry’s white whale. We’ve spent years hunting it with money, PowerPoint slides, and armies of interns. You get the sense that we’re within striking distance, and yet we’ve never landed it. It feels like we’ve made progress, and yet there are seemingly insurmountable challenges ahead. It feels possible, and yet never quite practical.

My concern is that many companies have jumped on the generative design bandwagon, swept up in the mania, never pausing to consider why this hasn’t worked previously or why other design industries aren’t onboard.

A lot of this would be avoidable if we had a better understanding of how design actually gets done inside architecture firms. Truth is, we know shockingly little about how design happens – especially in a digital world. As a result, people end up prophesying about the future of design, based not on an understanding of the design process, but on an understanding of the technology. Often this comes with fairly naive and condescending assumptions about the work that designers do, which makes concepts like generative design seem reasonable, perhaps even desirable.

Until we get to a point where algorithms replace designers (which may never happen), algorithms will only be practical if they work with humans. The real challenge isn’t the technology, it’s the interface, it’s how the algorithms fit the designer and their process. Generative design asks designers to change this process, to follow a stilted three-stage procedure.

To me, a more fruitful path seems to be taking the existing process and finding ways to enhance it with algorithmic smarts. Consider email. The process is similar to typing a letter on a typewriter, except you’re surrounded by spell-checkers, predictive keyboards, smart compose functions, bots, spam filters, and email prioritizers that all work alongside you, assisting, guiding, and bettering your writing. Creators of other design tools, such as Adobe, have gone in a similar direction, developing algorithms that work within existing design processes. In Photoshop, for instance, Adobe has developed targeted tools that automate specific procedures (such as content-aware fill and smart object selection). The designer works in a familiar manner, but computation is helping accelerate tedious tasks and guiding the user through challenging decisions.

In the end, I get the appeal of generative design. It’s alluring, captivating, and perhaps even inspiring. But generative design’s problems with choice overload, imprecise metrics, and a lack of design integration are so core to how it operates that they’re probably insurmountable. Or at least not easily solved by the usual trio of proposed solutions: better algorithms, an improved interface, and faster computers. Ultimately, I worry that generative design has become a distraction. I’m left wondering what might have happened if we were guided by the process instead of the technology.


I’d like to thank Andrew Heumann and Nathan Miller for their thoughts and comments on an earlier draft of this article. I’d also like to apologize to all my friends working on generative design applications – I love you all.


SEC Disclosure: I own a small amount of Autodesk stock because ultimately I have more faith in Autodesk’s marketing team than any of the arguments in this post. Nothing in this article should be taken as investment advice.


Epilogue
May 21, 2023

This article is an interesting snapshot in time. I wrote it in February 2020, a few weeks before the pandemic shut down everything. And while that was only three years ago, a lot has changed since then, especially when it comes to generative design.

Back in 2020, generative design was poorly defined and not widely used. Autodesk was leading the conversation. They had latched onto the term and were using it to describe their vision for the future of design. In this world, designers would function more like curators, picking from hundreds of options that computers created.

In retrospect, Autodesk’s vision of generative design seems pretty prescient. But despite their early advantage, Autodesk isn’t a major player in generative design today – companies like Microsoft, Google, and Adobe are far ahead. So what happened?

A critical clue lies in the differences between today’s generative design tools and Autodesk’s vision three years ago. Autodesk imagined generative design being this three-step process. First, a designer would create a parametric model, then a computer would churn through different permutations of the model, and finally, the designer would sort through all the options and pick their favorite.

This process is actually quite different to how tools like ChatGPT function. For one thing, you don’t have to create the underlying model. OpenAI has already developed a large, general-purpose model (the GPT of ChatGPT) capable of doing everything from writing songs to composing college essays. But as I point out in the article, there isn’t a general-purpose algorithm for architecture, so “you’re on the hook for generating the options.” This is one reason why ChatGPT feels easy to use – all the hard work of creating the model is already done. But in Autodesk’s version of generative design, this is a task a designer has to undertake on every project.

Another key difference is that most modern generative design tools only produce a couple of answers. Midjourney generates four different options in response to a prompt. ChatGPT produces one. But Autodesk was fixated on giving designers lots of options. For them, the big challenge was generating and sorting these options efficiently. Most of their resources went into solving this problem. But it turns out, this was the wrong problem to solve because you don’t always need to generate lots of options. As I point out in the article, “If your software was any good, it’d produce fewer designs, not more.”

So while Autodesk was messing around with ways to generate and sort hundreds of design options, the other companies were inventing these powerful, general-purpose models (like GPT). And ultimately these versatile algorithms that could be applied to any project proved more productive than developing an interface to create a bespoke algorithm for every project.

At the moment, generative AI is at a really interesting stage. While the underlying technology continues to improve, much of the innovation from companies like Google and Microsoft centers around the interface. It’s a critical detail (“the real challenge isn’t the technology, it’s the interface, it’s how the algorithms fit the designer and their process”). Even for something like ChatGPT, the underlying technology has been around for a while. When I wrote the article, I used GPT-2 for the email tool, which is a slightly earlier version of the language model that ChatGPT uses. So the technology was out there. People were building things with it. But it wasn’t until OpenAI hit upon the chat interface that it really took off.

I’m still skeptical about whether we can create a GPT for architecture. The main problem is the data. Tools like Dall-E or GPT require huge data sets to train. But in architecture, there isn’t a large corpus of models to train an algorithm on. Autodesk certainly has the potential to gather this data (and in some alternate reality, they spent time getting this data rather than going down the rabbit hole of creating interfaces to sort design options). But even if a GPT for architecture is never created, these other tools will still have an enormous influence on the architecture industry. Just look at the amount of writing most architecture firms do – whether it’s sending emails to clients or responding to RFPs. Much of this work will be automated by these new tools.

So whatever the case, these generative algorithms are coming for the architecture profession. I think we can pretty confidently say that generative design isn’t doomed to fail. But it has shifted and changed over the past couple of years. Generative design today is different to how Autodesk imagined it. Looking back on this article, it feels like a review of a Nokia phone in 2006 – it’s kinda moot when the iPhone gets released a year later.