Model compilation

Model compilation is an essential step in the deployment of AI models. It is a process applied in some deep learning frameworks to prepare a trained model for inference. Frameworks designed to make training machine learning models easy often leave a lot of inference-time performance on the table, because they must stay flexible enough for practitioners to experiment rapidly. Compilation takes the output of a training process and squeezes out this flexibility, leaving only the information required to run the model at inference time. In short, it tailors models to the target hardware, which improves efficiency, reduces latency, and enables their use across a wider range of devices and applications.
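
To make the idea of "squeezing out flexibility" concrete, here is a minimal toy sketch in plain Python. It is not how any particular framework implements compilation. The layer format and the names `run_interpreted` and `compile_model` are hypothetical, invented only for this illustration. The sketch assumes a model that is a chain of scalar operations, drops the training-only dropout layer, and folds consecutive scale and shift layers into a single precomputed `a * x + b` step.

```python
from typing import Callable, List, Tuple

# Hypothetical layer format for this sketch: (operation name, parameter).
# The parameter is ignored for "relu"; for "dropout" it is the drop rate.
Layer = Tuple[str, float]


def run_interpreted(layers: List[Layer], x: float) -> float:
    """Flexible path: inspect and dispatch on every layer at every call."""
    for op, value in layers:
        if op == "scale":
            x = x * value
        elif op == "shift":
            x = x + value
        elif op == "dropout":
            pass  # dropout is the identity at inference time
        elif op == "relu":
            x = max(x, 0.0)
        else:
            raise ValueError(f"unknown op: {op}")
    return x


def compile_model(layers: List[Layer]) -> Tuple[Callable[[float], float], list]:
    """Toy 'compiler': remove inference no-ops and fold affine runs."""
    steps = []            # the simplified program: ("affine", a, b) or ("relu",)
    a, b = 1.0, 0.0       # running affine transform a * x + b
    pending = False       # True while an affine run has not been emitted yet
    for op, value in layers:
        if op == "dropout":
            continue      # training-only behavior, nothing to do at inference
        if op == "scale":
            a, b, pending = a * value, b * value, True
        elif op == "shift":
            b, pending = b + value, True
        elif op == "relu":
            if pending:   # flush the folded affine run before the nonlinearity
                steps.append(("affine", a, b))
                a, b, pending = 1.0, 0.0, False
            steps.append(("relu",))
        else:
            raise ValueError(f"unknown op: {op}")
    if pending:
        steps.append(("affine", a, b))

    def compiled(x: float) -> float:
        for step in steps:
            x = step[1] * x + step[2] if step[0] == "affine" else max(x, 0.0)
        return x

    return compiled, steps


layers = [("scale", 2.0), ("dropout", 0.5), ("shift", -3.0),
          ("relu", 0.0), ("scale", 0.5), ("shift", 1.0)]
compiled, steps = compile_model(layers)
print(len(layers), "layers ->", len(steps), "compiled steps")
print(run_interpreted(layers, 4.0), compiled(4.0))
```

Both paths compute the same result, but the compiled version does less work per call because the decisions about which layers matter were made once, ahead of time. Real compilers apply the same principle at a much larger scale, for example by fusing tensor operations and generating code specialized for the target hardware.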
