Model parallelism

Model parallelism is a form of parallelism in which a model is split across multiple GPUs, with each device holding a different part of the model. It is most useful when a model is too large to fit in a single GPU's memory, and it can also speed up training and inference.
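As a minimal sketch of the idea, the snippet below splits a two-layer network across two "devices". The devices are simulated here with plain NumPy arrays so the example runs anywhere; on real hardware each stage's weights would live on a separate GPU, and the activation handoff between stages would be a device-to-device transfer. All names (`w1`, `w2`, `forward`) are illustrative, not from any particular library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 weights conceptually live on "device 0",
# stage 2 weights on "device 1". Neither device ever
# holds the full model.
w1 = rng.standard_normal((16, 32))  # resident on device 0
w2 = rng.standard_normal((32, 4))   # resident on device 1

def forward(x):
    # Device 0 computes the first stage (linear + ReLU)...
    h = np.maximum(x @ w1, 0.0)
    # ...then the activation h is transferred to device 1,
    # which computes the second stage.
    return h @ w2

out = forward(rng.standard_normal((8, 16)))
print(out.shape)  # (8, 4)
```

Note that in this naive layer-by-layer split, device 1 sits idle while device 0 computes (and vice versa); pipeline scheduling of micro-batches is the usual way to keep both devices busy.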

Model parallelism should not be confused with data parallelism (see data parallelism definition), where each GPU holds a full copy of the model and processes a different slice of the data. The two can be applied in tandem to make the most of available resources, reduce training and deployment times, and streamline the end-to-end machine learning process.
