Text2CAD: Designers can efficiently generate parametric CAD models from text prompts. The prompts can vary from abstract shape descriptions to detailed parametric instructions.

Contribution

Text2CAD is the first AI framework for generating parametric CAD designs from multi-level textual descriptions. Our main contributions are:

A Novel Data Annotation Pipeline that leverages open-source LLMs and VLMs to annotate the DeepCAD dataset with text prompts of varying complexity and parametric detail.
Text2CAD Transformer: An end-to-end, Transformer-based autoregressive architecture for generating a CAD design history from input text prompts.

Data Annotation

Our data annotation pipeline generates multi-level text prompts, of varying complexity, that describe the construction workflow of a CAD model. We use a two-stage method:

Stage 1: Shape description generation using a VLM (LLaVA-NeXT).
Stage 2: Multi-level textual annotation generation using an LLM (Mixtral-50B).
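
The two stages can be sketched as a small function that chains two model calls. This is only an illustration under stated assumptions: the level names, the input arguments (rendered views and a serialized CAD sequence), and the callables `describe_shape` and `write_prompt` are hypothetical placeholders, not the pipeline's actual interface or prompts.

```python
from typing import Callable, Dict, Sequence

# Hypothetical level names (assumption); the page only says prompts range
# from abstract shape descriptions to detailed parametric instructions.
LEVELS: Sequence[str] = ("abstract", "beginner", "intermediate", "expert")


def annotate_model(
    rendered_views: Sequence[str],
    cad_sequence: str,
    describe_shape: Callable[[Sequence[str]], str],
    write_prompt: Callable[[str, str, str], str],
    levels: Sequence[str] = LEVELS,
) -> Dict[str, str]:
    """Run both stages for one CAD model and return one prompt per level."""
    # Stage 1: a VLM produces a shape description (input format is assumed).
    shape_description = describe_shape(rendered_views)
    # Stage 2: an LLM writes one annotation per level from the description
    # and the construction sequence.
    return {
        level: write_prompt(shape_description, cad_sequence, level)
        for level in levels
    }


# Stand-in callables so the sketch runs without loading any model.
def fake_vlm(views: Sequence[str]) -> str:
    return f"a ring-like object seen in {len(views)} views"


def fake_llm(description: str, sequence: str, level: str) -> str:
    return f"[{level}] {description}; steps: {sequence}"


prompts = annotate_model(
    ["front.png", "top.png"], "sketch circle, extrude 0.1", fake_vlm, fake_llm
)
```

Passing the models in as callables keeps the two stages independent, so either one could be swapped for a different VLM or LLM without touching the other.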

Text2CAD Transformer

Text2CAD Transformer converts natural language descriptions into parametric 3D CAD models by deducing all of their intermediate design steps autoregressively. The model takes as input a text prompt \(T\) and a CAD subsequence \(\mathbf{C}_{1:t-1}\) of length \({t-1}\). The text embedding \(T_{adapt}\) is extracted from \(T\) using a pretrained BERT encoder followed by a trainable adaptive layer. The resulting embedding \(T_{adapt}\) and the CAD sequence embedding \(F^0_{t-1}\) are passed through \(\mathbf{L}\) decoder blocks to generate the full CAD sequence autoregressively.
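
The autoregressive loop described above can be illustrated with a minimal greedy-decoding sketch. It assumes the CAD sequence is a list of integer tokens and treats the whole network (encoder, adaptive layer, and decoder blocks) as a single hypothetical callable `next_token`; the token ids, the stopping rule, and the greedy choice are assumptions for illustration, not details taken from the model.

```python
from typing import Callable, List, Sequence

END_TOKEN = 0  # hypothetical end-of-sequence token id (assumption)


def generate_cad_sequence(
    text_embedding: Sequence[float],
    next_token: Callable[[Sequence[float], List[int]], int],
    start_token: int = 1,
    max_len: int = 64,
) -> List[int]:
    """Predict token t from the text embedding and tokens 1..t-1, repeatedly."""
    sequence = [start_token]
    while len(sequence) < max_len:
        # The decoder sees the text embedding and the prefix built so far.
        token = next_token(text_embedding, sequence)
        sequence.append(token)
        if token == END_TOKEN:
            break
    return sequence


# Toy stand-in for the decoder: counts upward, then emits the end token.
def toy_decoder(text_embedding: Sequence[float], prefix: List[int]) -> int:
    return prefix[-1] + 1 if len(prefix) < 4 else END_TOKEN


tokens = generate_cad_sequence([0.1, 0.2], toy_decoder)
```

The `max_len` cap guarantees termination even if the stand-in model never emits the end token.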

Visual Results

Visual examples of 3D CAD model generation using varied prompts. (1) Three different prompts yielding the same ring-like model, some without explicitly mentioning 'ring'. (2) Three diverse prompts resulting in the same star-shaped model, each emphasizing different star characteristics.

Qualitative results of the CAD models reconstructed by DeepCAD and Text2CAD on the DeepCAD dataset. From top to bottom: input texts, CAD models reconstructed by DeepCAD and Text2CAD, respectively, and GPT-4V evaluation.


Quantitative Results

We evaluated the performance of Text2CAD using two strategies.

CAD Sequence Evaluation: We assess the parametric correspondence between the generated CAD sequences and the input texts, using the following metrics:
- F1 scores for Line, Arc, Circle and Extrusion, computed with the method proposed in CAD-SIGNet.
- Chamfer Distance (CD), which measures the geometric alignment between the ground-truth CAD models and those reconstructed by Text2CAD and DeepCAD.
- Invalidity Ratio (IR), which measures the invalidity of the reconstructed CAD models.
Visual Inspection: We compare the performance of Text2CAD and DeepCAD using GPT-4V and human evaluation.
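
For readers unfamiliar with these metrics, the sketch below shows one common way to compute each of them. It makes several assumptions: the Chamfer distance uses the mean squared nearest-neighbour formulation on sampled points (the exact normalization used in the evaluation is not stated here), the F1 score starts from already-matched counts (the primitive matching step from CAD-SIGNet is omitted), and the invalidity ratio is taken as the fraction of generated sequences that fail to produce a valid model.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def _sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance (assumed form: mean squared NN distance)."""
    def one_way(src: Sequence[Point], dst: Sequence[Point]) -> float:
        return sum(min(_sq_dist(p, q) for q in dst) for p in src) / len(src)
    return one_way(a, b) + one_way(b, a)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 from true-positive, false-positive and false-negative counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def invalidity_ratio(is_valid: Sequence[bool]) -> float:
    """Fraction of reconstructions flagged invalid (assumed definition)."""
    return sum(1 for ok in is_valid if not ok) / len(is_valid)


ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reconstructed = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
cd = chamfer_distance(ground_truth, reconstructed)
ir = invalidity_ratio([True, True, False, True])
f1 = f1_score(tp=8, fp=2, fn=2)
```

The brute-force nearest-neighbour search is quadratic in the number of points, which is acceptable for a small illustration; a spatial index would be the usual choice for dense point clouds.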


Citation

If you use our dataset, please cite our work.


@inproceedings{khan2024textcad,
  title     = {Text2CAD: Generating Sequential {CAD} Designs from Beginner-to-Expert Level Text Prompts},
  author    = {Mohammad Sadil Khan and Sankalp Sinha and Sheikh Talha Uddin and Didier Stricker and Sk Aziz Ali and Muhammad Zeshan Afzal},
  booktitle = {Advances in Neural Information Processing Systems},
  volume    = {37},
  pages     = {7552--7579},
  publisher = {Curran Associates, Inc.},
  year      = {2024},
  url       = {https://proceedings.neurips.cc/paper_files/paper/2024/file/0e5b96f97c1813bb75f6c28532c2ecc7-Paper-Conference.pdf}
}


@inproceedings{Khan_2024_CVPR,
  author    = {Khan, Mohammad Sadil and Dupont, Elona and Ali, Sk Aziz and Cherenkova, Kseniya and Kacem, Anis and Aouada, Djamila},
  title     = {CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2024},
  pages     = {4713--4722}
}

NeurIPS 2024 (Spotlight 🤩)