Every team that starts experimenting with generative AI (gen AI) eventually runs into the same wall: scaling it. Running one or two models is simple enough. Running dozens, supporting hundreds of users, and keeping GPU costs under control is something else entirely.

Teams often find themselves juggling hardware requests, managing multiple versions of the same model, and trying to deliver performance that actually holds up in production. These are the same kinds of infrastructure and operations challenges we have seen in other workloads, but now applied to AI systems that demand far more resources.
NOTE: This blog has been updated to announce support for additional third-party model providers for the Red Hat Ansible Lightspeed intelligent assistant. Testing and validation of new model providers is ongoing. For the most recent list of supported model providers, please refer to Red Hat's official documentation.

Earlier this year, we released the Red Hat Ansible Lightspeed intelligent assistant, a generative AI service that delivers an intuitive chat assistant embedded within Ansible Automation Platform. The Ansible Lightspeed intelligent assistant is like having an Ansible expert on hand.
For years, my career in cybersecurity was defined by a sense of urgency and criticality. As a leader of incident response teams, I lived on the front lines, constantly reacting to the latest software vulnerabilities, cyberattacks, and anomalies. My days were a blur of alerts, patch deployments, and the relentless pressure to mitigate risk and restore operations. It was a challenging, high-stakes environment where every vulnerability felt like a direct threat.

Now, I've traded the immediate firefight for a more proactive battlefield as a manager within Red Hat Product Security. This has given me a new perspective.