🌄Generative AI images vs 3D CAD renderings

Preview | SOLIDWORKS USER FORUM


Oboe WU, 15/05/2025

Several friends and I had a couple of chats about generative AI images before. Recently, a LinkedIn post started a similar discussion.

The comparison in that post shows ChatGPT generating a photorealistic image from a CAD screenshot of a water bottle.

With proper prompts, ChatGPT also cranks out rich variations quickly, such as a bottle filled with icy water😉.

So a question may emerge: Are generative AI images replacing 3D CAD renderings?

My short answer is No. They actually complement each other.

Here is my thinking.

First, let's touch a bit on the inner workings of AI images today, without getting too nerdy.

Fundamentally, since 2020, the impressive and growing performance of AI images comes from the Diffusion and Transformer algorithms, hence some popular product names, such as Stable Diffusion and ChatGPT (T stands for Transformer here).

Please feel free to dig into the papers in the links, but a key distinction here is that these two algorithms are probabilistic, meaning their outputs are based on probabilities, or guesses. For example, the most likely color of this pixel on this spot in this expected image is "green", or the most likely word to complete this given sentence is "park", according to a massive amount of training data.
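To make the "most likely word" idea a bit more concrete, here is a minimal Python sketch. It is only an illustration, not how ChatGPT or Stable Diffusion is actually implemented: the example sentence, the candidate words and their probabilities are all made up, and the helper names `most_likely_word` and `sample_word` are hypothetical.

```python
import random

# Made-up probabilities for the word that completes
# "We walked the dog in the ..." (NOT taken from a real model).
NEXT_WORD_PROBS = {"park": 0.70, "yard": 0.20, "rain": 0.09, "fridge": 0.01}


def most_likely_word(probs):
    """Return the single most probable word."""
    return max(probs, key=probs.get)


def sample_word(probs, rng):
    """Draw one word at random, weighted by its probability."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random()  # unseeded, so every run can differ
    print("Most likely:", most_likely_word(NEXT_WORD_PROBS))
    print("Ten samples:", [sample_word(NEXT_WORD_PROBS, rng) for _ in range(10)])
```

Running it several times usually gives mostly "park", with an occasional "yard" or "rain". Each run is a fresh draw from the probabilities, so two runs need not agree with each other.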

Pixel by pixel, word by word, sound bite by sound bite, these models are being used to generate exquisite images, texts, translations, music, voices, podcasts and so on, such as the ChatGPT water bottles above.

In contrast, 3D CAD renderings are deterministic: geometries, colors, materials (for example via Physically Based Rendering, or PBR) and other specifications are computed precisely from math and physics. The product designs are, and have to be, 100% respected in photorealistic renderings.

Confused? Here is another simple example that can help further distinguish the "probabilistic" and "deterministic" approaches.

Math tells us 2x3=6. This is deterministic, leaving no ambiguity or guesswork.

Another way to figure out the answer could be to survey an ocean of books, documents and chats, and guess that 6 is the most probable one, but this approach has no idea of mathematical multiplication.
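The two ways of answering can be sketched in a few lines of Python. This is a toy illustration only: the tiny `CORPUS` below is invented, `guess_from_corpus` is a hypothetical helper, and a real language model is far more sophisticated than a frequency count.

```python
from collections import Counter


def multiply(a, b):
    """Deterministic: computed by arithmetic, same result every time."""
    return a * b


def guess_from_corpus(question, corpus):
    """Frequency-based guess: return the answer seen most often for this
    exact question. It knows nothing about multiplication and returns
    None if the question never appears in the corpus."""
    answers = Counter(ans for q, ans in corpus if q == question)
    if not answers:
        return None
    return answers.most_common(1)[0][0]


# Invented "survey" of what books, documents and chats say.
CORPUS = [
    ("2x3", "6"), ("2x3", "6"), ("2x3", "5"), ("2x3", "6"),
    ("7x8", "54"), ("7x8", "54"), ("7x8", "56"),
]

if __name__ == "__main__":
    print(multiply(2, 3))                      # arithmetic
    print(guess_from_corpus("2x3", CORPUS))    # most common answer seen
    print(guess_from_corpus("7x8", CORPUS))    # follows the majority, right or wrong
    print(guess_from_corpus("12x13", CORPUS))  # never seen, nothing to go on
```

With this invented corpus, the guess for "2x3" matches the math, while the guess for "7x8" simply follows whatever answer appears most often. `multiply` does not depend on the corpus at all.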

Furthermore, sometimes the guesses get it wrong, which leads to hallucinations. You may have seen a six-fingered hand on an otherwise highly compelling human portrait by ChatGPT, or wrong answers to "How many R's are in the word strawberry".

By the way, these are easy-to-detect hallucinations, but I'm afraid we are consuming many more subtle, but vital hallucinations in daily Large Language Model (LLM) usage, so please be careful and verify the answers. This is another big topic that we can discuss separately.

Now let's examine the LinkedIn AI images from a CAD screenshot.

At a quick glance, they may look ok, but after careful review, you may spot various discrepancies, such as the handle, segment heights, surface patterns and so on. To a manufacturer, every detail here matters significantly to the design, engineering, production, sales and marketing. These discrepancies may NOT be acceptable.

What about the ice water variation?

As shown below, it doesn't stay consistent with the prior AI image or the CAD design because it's a new probabilistic guess. For instance, now the middle glass segment is completely gone.

Then, what use cases can AI images help with?

As you may have read in the LinkedIn post comments, a good use case is to quickly conceptualize CAD models with imaginative lighting, materials, colors, environments and variations, as long as the audience doesn't care too much about specifics yet.

As mentioned above, these quick and rough concept visuals complement 3D CAD renderings, without sticking to the details 100%. Also, the pixel remixing could inspire some 3D CAD rendering setups in background, lighting, color schemes and so on, like getting ideas from dreams😄

Another synergy between AI and 3D CAD has been implemented by 3DEXCITE as featured in this keynote at the NVIDIA conference.

Here the probabilistic AI creates beautiful 3D, 360, or 2D environments per the panel prompts on the right, for the deterministic vehicle at the center. The environments can be inspirational, and less accurate, whereas the car (or any product) stays authentic.

Actually when I was asking whether it's ok to put some dirt on the tires to make it look more realistic, our team shared that OEMs are super strict on the car appearances, so the answer is NO. You get the point😉

What's next then?

Encouragingly, AI image outputs are getting more accurate, consistent and efficient over time. For example, 6-finger hands are seen less and less these days.

On the other hand (pun intended), hallucination is a feature, not a bug, as articulated by Andrej Karpathy, a leading AI researcher with first-hand experience at OpenAI and Tesla. Probabilistic algorithms inherently make things up by remixing pixels, words, sound bites (or, technically, tokens behind the scenes). They don't understand the fundamental principles, such as math and physics.

I'm glad to see the use cases where the two approaches complement each other, and hope AI imagery can enjoy new algorithmic breakthroughs in the future🎇