





INFERENCE PRODUCT LINE
Comino GRANDO AI INFERENCE servers are engineered for high-performance, low-latency inference and fine-tuning of today's most advanced pre-trained machine learning and deep learning models. Perfectly suited for Generative AI, these systems effortlessly handle workloads based on popular models and platforms such as Stable Diffusion, Midjourney, Hugging Face Transformers, Character.AI, QuillBot, DALL·E 2, DALL·E 3, OpenAI GPT-3.5/4, Mistral, LLaMA, Claude, DeepSeek, Mixtral, Whisper, Falcon, BLOOM, and more.

With flexible, cost-optimized multi-GPU configurations, GRANDO servers are designed to scale seamlessly, whether deployed on-premises or in a high-performance data center. Ideal for inference, fine-tuning, or real-time AI applications, Comino GRANDO delivers the versatility, power, and reliability needed to support your most demanding AI workflows.
GRANDO AI INFERENCE BASE
Multi-GPU Server
NVIDIA OPTION: 8x L40S GPUs
AMD OPTION: 8x Radeon RX 7900 XTX
1x 32-core AMD EPYC 9004/9005 CPU
Comino Liquid Cooling
GRANDO AI INFERENCE PRO
Multi-GPU Server
NVIDIA OPTION: 8x RTX PRO 6000 (96GB) GPUs
AMD OPTION: 8x Radeon PRO W7900
1x 64-core AMD EPYC 9004/9005 CPU
Comino Liquid Cooling
GRANDO AI INFERENCE MAX
Multi-GPU Server
NVIDIA OPTION: 8x H200 GPUs
1x 128-core AMD EPYC 9004/9005 CPU
Comino Liquid Cooling
The GRANDO AI INFERENCE BASE and PRO servers are built for high-efficiency AI workloads, supporting up to eight liquid-cooled NVIDIA RTX PRO 6000 / L40S or AMD W7900 / 7900 XTX GPUs, each with up to 96GB of VRAM — delivering a total of up to 768GB VRAM per server. Optimized for low- and mid-precision inference, these systems are ideal for running large language models (LLMs) such as LLaMA, DeepSeek, GPT, Mixtral, and other transformer-based architectures with ease and consistency.
At the top of the performance spectrum, the GRANDO AI INFERENCE MAX redefines power and scalability. Featuring up to 8x NVIDIA H100 or H200 GPUs, this flagship system offers an incredible 1.12TB of unified VRAM, making it the most powerful inference and fine-tuning solution in the Comino lineup. Designed for enterprise AI deployments and cutting-edge research, INFERENCE MAX delivers the ultimate performance for large-scale generative models and advanced neural networks.
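Both headline memory figures follow directly from per-card capacity: eight RTX PRO 6000 GPUs at 96GB each give 768GB, while eight H200 GPUs at 141GB of HBM3e each total 1,128GB, which is where the ~1.12TB figure comes from (the H100 option, at 80GB per card, provides 640GB).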
Comino's cutting-edge liquid-cooling technology eliminates thermal throttling entirely, unlocking up to 50% higher sustained performance than air-cooled alternatives, even under full load. Built for long-term reliability, GRANDO systems offer up to 3 years of maintenance-free operation while remaining as simple to maintain as air-cooled systems.

With seamless API integration, the Comino Monitoring System (CMS) provides robust remote diagnostics, performance monitoring, and fleet control, ready to plug into your infrastructure from day one.
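As a rough illustration of what API-driven fleet monitoring can look like in practice, here is a minimal sketch of polling a REST-style monitoring service from Python. The host, routes, and field names are hypothetical placeholders, not the documented CMS API; consult Comino's CMS documentation for the real endpoints and schemas.

```python
# Hypothetical sketch: polling a REST-style fleet-monitoring service.
# The host, routes, and field names below are illustrative assumptions,
# NOT the documented CMS API; check the real CMS docs for actual schemas.
import requests

CMS_HOST = "https://cms.example.local"   # placeholder address
API_TOKEN = "YOUR_API_TOKEN"             # placeholder credential

def fetch_node_metrics(node_id: str) -> dict:
    """Fetch telemetry for one node (route shape is an assumption)."""
    resp = requests.get(
        f"{CMS_HOST}/api/v1/nodes/{node_id}/metrics",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

for node in ("grando-01", "grando-02"):  # example node IDs
    metrics = fetch_node_metrics(node)
    print(node, metrics.get("gpu_temperature_c"), metrics.get("gpu_utilization_pct"))
```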
GRANDO AI INFERENCE servers are fully optimized and pre-tested for a comprehensive range of industry-standard AI software stacks, including:
- Toolkits & Runtimes: NVIDIA CUDA Toolkit, cuDNN, AMD ROCm, ONNX Runtime, OpenVINO, TensorRT, DeepSpeed, Hugging Face Transformers, Intel oneAPI
- Frameworks: PyTorch, TensorFlow, JAX, Keras, MXNet, PaddlePaddle, FastAI, PyTorch Lightning, Flax, Chainer
- Dev Environments & Libraries: Python, NumPy, SciPy, Dask, Ray, RAPIDS, scikit-learn, Apache TVM, Triton Inference Server
Each server is equipped with up to 8x liquid-cooled NVIDIA (H200 / H100 / RTX PRO 6000 / L40S) or AMD (W7900 / 7900 XTX) GPUs, paired with ultra-fast, high-core-count Threadripper PRO or cost-effective EPYC CPUs. Whether you’re running massive-scale inference, multi-modal AI, LLMs, vision models, or custom pipelines, GRANDO delivers maximum throughput, compatibility, and future-proof scalability for the most demanding AI workloads.
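As a concrete example of how little glue code such a stack requires, here is a minimal multi-GPU text-generation sketch using Hugging Face Transformers. The model name is an illustrative placeholder (substitute any LLM you have access to), and the `accelerate` package is assumed so that `device_map="auto"` can shard the model across all available GPUs.

```python
# Minimal multi-GPU inference sketch with Hugging Face Transformers.
# Assumes `transformers`, `accelerate`, and a CUDA-capable multi-GPU system;
# the model name is an example placeholder, not a Comino default.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mistral-7B-Instruct-v0.2"  # substitute your own model

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # low-precision inference, as targeted by this line
    device_map="auto",          # shard layers across all visible GPUs
)

prompt = "Explain liquid cooling in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```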
"INFINITE Inference Power for AI"
Unlock the power of performance with Sentdex!
"A lot of inference power comes from this powerhouse machine from Comino, which has not one, not two, not three - it has six GPUs inside!"
Harrison Kinsley, the coding maestro aka Sentdex, dives into the ultimate tech thrill with the Comino GRANDO workstation featuring a mind-blowing 6x NVIDIA GPUs!
Talk To Engineer
Let's talk

GRANDO AI Inference Product Specifications
| | INFERENCE BASE | INFERENCE PRO | INFERENCE MAX |
| --- | --- | --- | --- |
| Remote Management | Dedicated IPMI | Dedicated IPMI | Dedicated IPMI |
| Max Power Draw | up to 38A @ 120V / 21A @ 220V | up to 54A @ 120V / 30A @ 220V | up to 54A @ 120V / 30A @ 220V |
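For capacity planning, those current ratings convert to wattage via P = V x I: 38A at 120V is roughly 4.6kW, while 54A at 120V is about 6.5kW (and 30A at 220V = 6.6kW), so a fully loaded INFERENCE MAX should be budgeted at around 6.5kW of rack power.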
DEEP LEARNING PRODUCT LINE
Comino GRANDO AI Deep Learning Workstations are purpose-built for on-premise training and fine-tuning of complex neural networks on large datasets, with a strong emphasis on Generative AI, yet they are fully capable across a wide range of AI domains.

These systems offer best-in-class, customizable multi-GPU configurations designed to accelerate the development of compute-intensive models, including Diffusion models, Multimodal systems, Computer Vision pipelines, Large Language Models (LLMs), and other advanced architectures. Whether you're building foundational models or fine-tuning for specialized tasks, GRANDO workstations deliver the raw power, flexibility, and efficiency AI professionals need.
GRANDO AI DL BASE
Multi-GPU Workstation
NVIDIA OPTION: 2x RTX 5090 GPUs
AMD OPTION: 4x Radeon RX 7900 XTX
1x AMD Threadripper PRO 7975WX CPU
Comino Liquid Cooling

GRANDO AI DL PRO
Multi-GPU Workstation
NVIDIA OPTION: 4x L40S GPUs
AMD OPTION: 4x Radeon PRO W7900
1x AMD Threadripper PRO 7985WX CPU
Comino Liquid Cooling

GRANDO AI DL MAX
Multi-GPU Workstation
NVIDIA OPTION: 4x H200 GPUs
1x AMD Threadripper PRO 7995WX CPU
Comino Liquid Cooling
GRANDO AI DL MAX Workstation: Ultimate On-Premise AI Training Powerhouse

The GRANDO AI DL MAX workstation is engineered for maximum performance, featuring up to four liquid-cooled NVIDIA H100 or H200 GPUs with up to 564GB of combined HBM memory (4 x 141GB of HBM3e on the H200), paired with a 96-core AMD Threadripper PRO CPU boosting up to 5.1GHz. This advanced solution delivers up to 50% higher sustained performance than traditional air-cooled systems, making it ideal for demanding AI training and fine-tuning workloads.
Beyond raw power, Comino systems are built for long-term reliability and ease of use, offering up to 3 years of maintenance-free operation, with servicing as straightforward as with air-cooled setups. The integrated Comino Monitoring System (CMS) supports remote management and full API integration, allowing seamless deployment into your existing software infrastructure.
GRANDO DL workstations are pre-tested and fully optimized for a wide range of industry-leading AI toolkits and frameworks (a brief fine-tuning sketch follows the list below), including:
- Toolkits & Runtimes: NVIDIA CUDA Toolkit, cuDNN, AMD ROCm, ONNX Runtime, OpenVINO, TensorRT, Intel oneAPI, DeepSpeed, Triton Inference Server, Hugging Face Transformers
- Frameworks & Libraries: PyTorch, TensorFlow, JAX, Keras, MXNet, PaddlePaddle, FastAI, PyTorch Lightning, Flax, Chainer, Scikit-learn, XGBoost, LightGBM, Ray, RAPIDS, Dask, NumPy, SciPy, Apache TVM
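To make the list concrete, here is a minimal fine-tuning sketch using the Hugging Face Trainer. The model and dataset names are illustrative placeholders; it assumes the `transformers` and `datasets` packages plus at least one CUDA GPU, and the Trainer automatically replicates training across all visible GPUs.

```python
# Minimal fine-tuning sketch with the Hugging Face Trainer API.
# Model and dataset are illustrative placeholders; assumes `transformers`,
# `datasets`, and a CUDA-capable GPU (fp16 requires NVIDIA hardware).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, Trainer, TrainingArguments)

model_name = "distilbert-base-uncased"                   # small example model
train_data = load_dataset("imdb", split="train[:2000]")  # tiny slice for brevity

tokenizer = AutoTokenizer.from_pretrained(model_name)
train_data = train_data.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
)

model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

args = TrainingArguments(
    output_dir="finetune-out",
    per_device_train_batch_size=16,  # replicated across every visible GPU
    num_train_epochs=1,
    fp16=True,                       # mixed precision for faster training
)

Trainer(
    model=model,
    args=args,
    train_dataset=train_data,
    data_collator=DataCollatorWithPadding(tokenizer),  # dynamic padding per batch
).train()
```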
Each system is equipped with up to four high-performance GPUs (NVIDIA H200, H100, RTX PRO 6000, L40S, and RTX 5090, or AMD W7900 / 7900 XTX), paired with the latest high-frequency, multi-core Threadripper PRO CPUs for unparalleled training and inference performance.
With silent operation and exceptional thermal efficiency, GRANDO DL workstations are purpose-built for complex and compute-intensive AI workloads including:
- Text & Language Models: GPT-3, GPT-4, Claude, LLaMA, Mistral, Mixtral, Falcon, DeepSeek, BLOOM, T5, BERT, RoBERTa
- Vision & Multimodal Models: Stable Diffusion, Midjourney, DALL·E 2, DALL·E 3, ControlNet, CLIP, Segment Anything (SAM), YOLOv8, Detectron2
- Conversational & Personal AI: Character.AI, QuillBot, Replika, Jasper, Cohere, Open Assistant
- Audio & Speech Models: Whisper, Bark, Tortoise TTS, VALL-E
Whether you're working on fine-tuning, inference, model development, or real-time deployment, GRANDO DL delivers exceptional performance, reliability, and flexibility for AI professionals, researchers, and creators alike.
Talk To Engineer
Let's talk

GRANDO AI DL Product Specifications
| | DL BASE | DL PRO | DL MAX |
| --- | --- | --- | --- |
| Remote Management | Dedicated IPMI | Dedicated IPMI | Dedicated IPMI |
| Max Power Draw | up to 20A @ 120V / 11A @ 220V | up to 20A @ 120V / 11A @ 220V | up to 30A @ 120V / 16A @ 220V |
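The same P = V x I conversion applies here: the DL BASE and PRO configurations peak near 2.4kW (20A x 120V), while the DL MAX tops out around 3.6kW (30A x 120V, or 16A x 220V = 3.5kW).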
Certified under our partners' programs









Have a media inquiry? Looking for more info about Comino? Contact one of our team members at pr@comino.com
Technology Partners
At Comino, flexibility is a core commitment: we work with a wide array of components so we are never constrained by any single vendor. Every solution is custom-tailored to the specific needs of the client, and our carefully selected component offerings ensure precise, individualized results. This multi-vendor strategy lets us deliver exceptional, bespoke systems that fulfill the unique requirements of our valued clients.











