Generative AI · 3D CAD Design

Text2CAD

Generating Sequential CAD Designs from
Beginner-to-Expert Level Text Prompts

Mohammad Sadil Khan1*†
Sankalp Sinha1*
Talha Uddin Sheikh1
Didier Stricker1
Muhammad Zeshan Afzal1
NeurIPS 2024 ✦ Spotlight
01

Contribution

Text2CAD is the first AI framework for generating parametric CAD designs from multi-level textual descriptions, supporting prompts from abstract shape descriptions to detailed parametric instructions.

Key Contributions

  • Novel Data Annotation Pipeline: Leverages open-source LLMs and VLMs to annotate the DeepCAD dataset with multi-level text prompts of varying complexity and parametric detail.

  • Text2CAD Transformer: An end-to-end Transformer-based autoregressive architecture for generating the complete CAD design history from natural language prompts.

02

Data Annotation

Our pipeline generates multi-level text prompts that describe the CAD construction workflow at varying levels of complexity, using a two-stage approach:

  1. Stage 1: Shape description generation using a VLM (LLaVA-NeXT).
  2. Stage 2: Multi-level textual annotation generation using an LLM (Mixtral-50B).
Data Annotation Pipeline
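
The two stages above can be sketched as a small function that chains two model calls. This is a minimal illustration, not the authors' implementation: `describe_shape` and `write_prompt` are hypothetical callables standing in for the VLM and the LLM. The inputs (rendered views for Stage 1, the shape description plus a serialized CAD sequence for Stage 2) are assumptions that this page does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Prompt levels named on this page, from least to most detailed.
LEVELS = ["abstract", "beginner", "intermediate", "expert"]


@dataclass
class AnnotatedModel:
    shape_description: str
    prompts: Dict[str, str]  # level name -> text prompt


def annotate(
    rendered_views: List[str],
    cad_sequence: str,
    describe_shape: Callable[[List[str]], str],
    write_prompt: Callable[[str, str, str], str],
) -> AnnotatedModel:
    """Run the two annotation stages for a single CAD model."""
    # Stage 1 (assumed input: rendered views): the VLM describes the shape.
    description = describe_shape(rendered_views)
    # Stage 2: the LLM writes one prompt per complexity level.
    prompts = {
        level: write_prompt(description, cad_sequence, level)
        for level in LEVELS
    }
    return AnnotatedModel(description, prompts)


# Stand-in models so the sketch runs without any model weights.
def fake_vlm(views: List[str]) -> str:
    return f"a shape seen from {len(views)} views"


def fake_llm(description: str, cad_sequence: str, level: str) -> str:
    return f"[{level}] {description}; steps: {cad_sequence}"


result = annotate(["front.png", "top.png"], "sketch; extrude", fake_vlm, fake_llm)
```

Passing the models in as callables keeps the pipeline logic independent of any particular VLM or LLM client library.
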
03

Text2CAD Transformer

The Text2CAD Transformer converts natural language into parametric 3D CAD models by predicting the intermediate design steps autoregressively. Given a text prompt \(T\) and a CAD subsequence \(\mathbf{C}_{1:t-1}\), a pretrained BERT encoder with a trainable Adaptive layer extracts \(T_{adapt}\), which is passed through \(\mathbf{L}\) decoder blocks together with the CAD sequence embedding \(F^0_{t-1}\).

Text2CAD Architecture
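
The autoregressive generation described above can be illustrated with a short decoding loop. This is a hedged sketch only: `next_token` is a hypothetical stand-in for the decoder stack, the `START` and `END` token ids are invented for the example, and greedy one-token-at-a-time decoding is an assumption rather than something this page states.

```python
from typing import Callable, List, Sequence

START, END = 0, 1  # hypothetical special token ids


def generate_cad_sequence(
    text_features: Sequence[float],
    next_token: Callable[[Sequence[float], List[int]], int],
    max_len: int = 64,
) -> List[int]:
    """Decode CAD tokens one at a time.

    Each step conditions on the text features (the role of T_adapt) and on
    every CAD token produced so far (the role of C_{1:t-1}).
    """
    tokens = [START]
    while len(tokens) < max_len:
        tok = next_token(text_features, tokens)
        tokens.append(tok)
        if tok == END:
            break
    return tokens


def toy_decoder(text_features: Sequence[float], prefix: List[int]) -> int:
    # Emits the tokens 2, 3, 4 and then END; a real model would predict these.
    return END if len(prefix) >= 4 else len(prefix) + 1


sequence = generate_cad_sequence([0.1, 0.2], toy_decoder)
```

The `max_len` cap guards against a decoder that never emits the end token.
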
04

Visual Results

05

Quantitative Results

Two evaluation strategies assess Text2CAD performance:

  1. CAD Sequence Evaluation
    • F1 Scores for Line, Arc, Circle and Extrusion elements.
    • Chamfer Distance (CD) for geometric alignment.
    • Invalidity Ratio (IR) measuring reconstruction invalidity.
  2. Visual Inspection — GPT-4 and Human preference evaluation.
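
For readers unfamiliar with the sequence-level metrics listed above, the following sketch shows one common way to compute each of them. The exact definitions used in the paper may differ. The Chamfer distance here is the symmetric mean of squared nearest-neighbour distances between two point sets, and the Invalidity Ratio is assumed to be the fraction of generated sequences that fail to reconstruct. All function names are illustrative.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance between two non-empty point sets."""

    def one_way(src: Sequence[Point], dst: Sequence[Point]) -> float:
        # Mean squared distance from each source point to its nearest target.
        total = 0.0
        for s in src:
            total += min(sum((p - q) ** 2 for p, q in zip(s, d)) for d in dst)
        return total / len(src)

    return one_way(a, b) + one_way(b, a)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true-positive, false-positive and false-negative counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def invalidity_ratio(valid_flags: Sequence[bool]) -> float:
    """Fraction of generated sequences that did not yield a valid model."""
    return sum(1 for ok in valid_flags if not ok) / len(valid_flags)
```

In practice the point sets would be sampled from the predicted and ground-truth shapes, and the F1 counts would be collected per primitive type (line, arc, circle, extrusion).
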


Win-rate breakdown by prompt complexity (GPT-4 evaluation): Abstract, Beginner, Intermediate, Expert

Win-rate breakdown by prompt complexity (Human evaluation): Abstract, Beginner, Intermediate, Expert

06

Acknowledgement

This work was supported in part by the EU Horizon Europe Framework under grant agreement 101135724 (LUMINOUS).

07

Citation

If you find our work useful, please cite:

@inproceedings{khan2024text2cad,
  title     = {Text2CAD: Generating Sequential {CAD} Designs from Beginner-to-Expert Level Text Prompts},
  author    = {Mohammad Sadil Khan and Sankalp Sinha and Sheikh Talha Uddin and Didier Stricker and Sk Aziz Ali and Muhammad Zeshan Afzal},
  booktitle = {Advances in Neural Information Processing Systems},
  pages     = {7552--7579},
  publisher = {Curran Associates, Inc.},
  year      = {2024},
  volume    = {37},
  url       = {https://proceedings.neurips.cc/paper_files/paper/2024/file/0e5b96f97c1813bb75f6c28532c2ecc7-Paper-Conference.pdf}
}