
Overview

Introduction

VTune POC to collect hotspots with hardware (HW) sampling:

  • Integrates the public release of VTune
  • Tested with Llama2 (OOB), Stream, and HPCG
  • Tested with TERRAFORM_OPTIONS=--docker
  • The workload must be run with DOCKER_OPTIONS="--privileged"
  • Triggered with TERRAFORM_OPTIONS=--vtune
  • The trace log is archived in vtune.tar.gz and can be opened in the VTune GUI
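As a rough sketch of how the options from the introduction fit together, the snippet below collects them in shell variables and prints them as `-D` style definitions. Combining `--docker` and `--vtune` in one `TERRAFORM_OPTIONS` string, the `-D` form, and the helper name `vtune_option_args` are all assumptions for illustration; adjust them to however your checkout passes these options.

```shell
#!/bin/sh
# Option values taken from the introduction; combining --docker and --vtune
# in a single string is an assumption based on the tested configuration.
TERRAFORM_OPTIONS="--docker --vtune"
DOCKER_OPTIONS="--privileged"

# Hypothetical helper: print the settings as -D style definitions
# (assumed convention; this does not invoke any build tool).
vtune_option_args() {
    printf '%s\n' "-DTERRAFORM_OPTIONS=${TERRAFORM_OPTIONS}" \
                  "-DDOCKER_OPTIONS=${DOCKER_OPTIONS}"
}

vtune_option_args
```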

POC Logs

Stream: The run is too short to collect enough detail.

HPCG: Run with gated. The log file is still large.

Parameters

Please refer to defaults/main.yaml; each parameter is defined in the comments there.

System Requirements

  • Only bare-metal systems are supported

Trace Log

  • vtune.tar.gz is stored in a sub-folder of the WSF logs. For example: worker-0-1-vtune\vtune.tar.gz
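To inspect a trace, the archive has to be unpacked first. The following is a minimal sketch, assuming a POSIX shell with `find` and `tar` available; the function name `extract_vtune_traces` and the `extracted` folder name are hypothetical, and the layout inside the archive is not described here.

```shell
#!/bin/sh
# Extract every vtune.tar.gz found under a logs directory, each into an
# "extracted" folder next to its archive (folder name is an assumption).
extract_vtune_traces() {
    logs_dir="$1"
    find "$logs_dir" -type f -name 'vtune.tar.gz' | while IFS= read -r archive; do
        dest="$(dirname "$archive")/extracted"
        mkdir -p "$dest"
        tar -xzf "$archive" -C "$dest"
        echo "extracted: $dest"
    done
}

# Example (hypothetical path): extract_vtune_traces ./logs
```

The extracted result directory can then be opened in the VTune GUI.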

Contact

  • Stage1 Contact: Alex H Zhang

Validation Notes

  • TODOs:
      • Pre-PRQ systems are not supported; internal VTune needs to be enabled for them
      • Attaching to a specific process in a Docker image is still experimental

  • Known Issues:
      • Unknown modules might appear in the report while symbols are being resolved

Troubleshooting

  • If any issue occurs during VTune collection, first check the VTune log under worker---vtune
  • Try setting the config vtune_force_install to yes to see whether it resolves the issue
  • If there is an issue with HW sampling, check whether the drivers are installed, and also force the install
    lsmod | grep sep
    lsmod | grep vtsspp
    
  • If ctest hangs for a long time waiting for the trace to stop, kill the vtune / amplxe related processes manually, then retest
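The driver check above can be wrapped in a small helper. This is only a sketch: the function name `check_vtune_drivers` is hypothetical, and it reads `lsmod`-style text on stdin and matches the same `sep` and `vtsspp` substrings used in the commands above.

```shell
#!/bin/sh
# Report whether the sampling driver modules appear in lsmod-style output.
# Reads the listing on stdin so it can be tried without the drivers loaded.
check_vtune_drivers() {
    listing="$(cat)"
    status=0
    for name in sep vtsspp; do
        # Match the substring against the module-name column only.
        if printf '%s\n' "$listing" | awk '{ print $1 }' | grep -q "$name"; then
            echo "$name: loaded"
        else
            echo "$name: not found"
            status=1
        fi
    done
    return "$status"
}

# Usage on a live system: lsmod | check_vtune_drivers
```

A non-zero exit status means at least one driver was not found. For the hanging ctest case, leftover processes can be listed with `pgrep -af 'vtune|amplxe'` before they are killed.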

See Also