Conversation

@rgolangh rgolangh commented Dec 25, 2025

Motivation
This enables AI Lab to leverage Intel IPEX containers for hardware
acceleration on Intel Arc GPUs, providing better performance for
inference workloads on Intel hardware.

Modifications

  • Add Intel IPEX image to llamacpp image definitions
  • Update getLlamaCppInferenceImage() to detect and use Intel GPUs
  • Add Intel GPU device passthrough (/dev/dri) for container creation
  • Add Intel-specific environment variables (ZES_ENABLE_SYSMAN, OLLAMA_NUM_GPU); a sketch of these changes follows below
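
To make the list above concrete, here is a minimal, simplified sketch of what the change amounts to. The names GPUVendor.INTEL, llamacpp.intel, getLlamaCppInferenceImage, /dev/dri, ZES_ENABLE_SYSMAN and OLLAMA_NUM_GPU come from the PR itself; the surrounding types, the NVIDIA branch shown for contrast, and the environment variable values are assumptions made for illustration, not the extension's actual code.

// Simplified sketch; not the extension's real types or API surface.
enum GPUVendor { NVIDIA = 'nvidia', INTEL = 'intel' }

interface GPUInfo { vendor: GPUVendor }

// Image catalog; the real entries are pinned by digest (see the diff excerpt
// later in this thread), abbreviated here.
const llamacpp = {
  default: 'quay.io/ramalama/ramalama-llama-server@sha256:…',
  cuda: 'quay.io/ramalama/cuda-llama-server@sha256:…',
  intel: 'docker.io/intelanalytics/ipex-llm-inference-cpp-xpu:latest',
};

// Choose the llama.cpp server image for the detected GPU vendor.
function getLlamaCppInferenceImage(gpu?: GPUInfo): string {
  if (gpu?.vendor === GPUVendor.INTEL) return llamacpp.intel;
  if (gpu?.vendor === GPUVendor.NVIDIA) return llamacpp.cuda;
  return llamacpp.default;
}

// Container settings added when an Intel GPU is selected: pass the render
// nodes through and set the Intel-specific environment variables. The values
// shown are common IPEX-LLM defaults, not necessarily what the PR sets.
function intelContainerOptions(): { devices: string[]; env: string[] } {
  return {
    devices: ['/dev/dri'],
    env: ['ZES_ENABLE_SYSMAN=1', 'OLLAMA_NUM_GPU=999'],
  };
}

In the PR these settings are applied as part of container creation rather than through a standalone helper like the one sketched here.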

How was this tested

  • Downloaded ibm-granite/granite-4.0-micro
    Note: 'hybrid' models with '-h-' in their name, such as ibm-granite/granite-4.0-h-micro, do not work
  • Created a new service
  • Ran intel_gpu_top to examine GPU utilization
  • Executed curl to invoke the chat endpoint (an equivalent request is sketched below)
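
For reference, the curl step is a standard OpenAI-style chat request against the running service; llama.cpp's server exposes a /v1/chat/completions endpoint. The host, port and payload below are placeholders chosen for illustration, not values taken from the PR.

// Hypothetical smoke test, equivalent to the curl invocation above.
// localhost:35000 stands in for whatever host port the service is published on.
const response = await fetch('http://localhost:35000/v1/chat/completions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Say hello in one short sentence.' }],
  }),
});
console.log(JSON.stringify(await response.json(), null, 2));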

Signed-off-by: Roy Golan rgolan@redhat.com

@rgolangh rgolangh requested review from a team, benoitf and jeffmaury as code owners December 25, 2025 17:43
@rgolangh rgolangh requested review from cdrage and gastoner December 25, 2025 17:43
@rgolangh rgolangh force-pushed the feat/intel-arc-gpu-support branch from 43a37a8 to 6ad37bc on December 25, 2025 17:55
@bmahabirbu bmahabirbu force-pushed the feat/intel-arc-gpu-support branch from 6885791 to 004430e on January 6, 2026 04:10
Contributor

@bmahabirbu bmahabirbu left a comment

I wish I could test this, but LGTM. This is awesome, thanks!!

"default": "quay.io/ramalama/ramalama-llama-server@sha256:9560fdb4f0bf4f44fddc4b1d8066b3e65d233c1673607e0029b78ebc812f3e5a",
"cuda": "quay.io/ramalama/cuda-llama-server@sha256:1a6d4fe31b527ad34b3d049eea11f142ad660485700cb9ac8c1d41d8887390cf"
"cuda": "quay.io/ramalama/cuda-llama-server@sha256:1a6d4fe31b527ad34b3d049eea11f142ad660485700cb9ac8c1d41d8887390cf",
"intel": "docker.io/intelanalytics/ipex-llm-inference-cpp-xpu:latest"
Author

I guess it would be better to pin the image to its digest. I'll update it.
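
(For reference, a digest-pinned entry references the image by content digest rather than a mutable tag, e.g. docker.io/intelanalytics/ipex-llm-inference-cpp-xpu@sha256:<digest>, where <digest> is a placeholder since the actual digest is not given in this thread.)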

@rgolangh rgolangh force-pushed the feat/intel-arc-gpu-support branch from 004430e to 5a4eed2 on January 6, 2026 05:39

rgolangh commented Jan 12, 2026 via email

return llamacpp.default;
case VMType.LIBKRUN:
case VMType.LIBKRUN_LABEL:
if (gpu?.vendor === GPUVendor.INTEL) return llamacpp.intel;
Collaborator

issue: libkrun machines are Apple-only; I don't think this makes sense

Author

Collaborator

Maybe, but in the Podman landscape it is restricted to macOS.

Author

Ack, removing

@rgolangh rgolangh force-pushed the feat/intel-arc-gpu-support branch from 5a4eed2 to 77fb333 on January 17, 2026 18:45

rgolangh commented Jan 19, 2026

Can you review again please? Thanks

@rgolangh rgolangh force-pushed the feat/intel-arc-gpu-support branch 2 times, most recently from 2813f52 to 0947e71, on January 21, 2026 14:30
@bmahabirbu
Contributor

@rgolangh just a few comments but LGTM again, just wondering about the digest

@rgolangh
Author

> @rgolangh just a few comments but LGTM again, just wondering about the digest

The ramalama image is set with a digest:
https://github.com/containers/podman-desktop-extension-ai-lab/pull/4006/changes#diff-282db7021967fc3e5e8d19443a8ed52e09c94aa13d8e7e4e2475d6d11ad5ac98R8

@bmahabirbu
Contributor

@rgolangh ah, sorry for not being clear. I thought the cuda digest was changed, but it looks to be the same; a whitespace-only edit made it seem different. LGTM then! I really appreciate the contribution and the effort.

);
});

test('LIBKRUN vmtype with Intel GPU should use llamacpp.intel image and no custom entrypoint', async () => {
Collaborator

suggestion: this test does not make sense to me, as libkrun is macOS-only

Author

Ack, removing

return llamacpp.default;
case VMType.LIBKRUN:
case VMType.LIBKRUN_LABEL:
if (gpu?.vendor === GPUVendor.INTEL) return llamacpp.intel;
Collaborator

Maybe, but in the Podman landscape it is restricted to macOS.

@rgolangh rgolangh force-pushed the feat/intel-arc-gpu-support branch from 0947e71 to 306294e on January 22, 2026 19:42
- Add Intel IPEX image to llamacpp image definitions
- Update getLlamaCppInferenceImage() to detect and use Intel GPUs
- Add Intel GPU device passthrough (/dev/dri) for container creation
- Add Intel-specific environment variables (ZES_ENABLE_SYSMAN, OLLAMA_NUM_GPU)
- Set user=0 for Intel GPU on Linux and disable DeviceRequests

This enables AI Lab to leverage Intel IPEX containers for hardware
acceleration on Intel Arc GPUs, providing better performance for
inference workloads on Intel hardware.

Signed-off-by: Roy Golan <rgolan@redhat.com>
@rgolangh rgolangh force-pushed the feat/intel-arc-gpu-support branch from 306294e to 0928d89 on January 22, 2026 19:43
@rgolangh
Author

@jeffmaury I removed LIBKRUN changes. Please take a look.

});

user = '0';
} else if (gpu.vendor === GPUVendor.INTEL) {
Collaborator

suggestion: this part should be removed as well
