New HLSL compiler, with SM6.0 support, has been released as open source

There have been so many exciting things happening lately. First the release of PIX for PC, and now an open LLVM-based HLSL compiler! This is great news for the graphics development community :)

SM6 being based on LLVM is not really news; it was revealed a year or two ago already. But seeing it finally here is kind of the last mile to the release :) HLSL is now an antediluvian language compared with all the great improvements C++1x has brought over the past few years, and I have a lot of hope on that front. I want my auto and template functions (for functors) to factor out code :)

As for the back end, I am looking forward to per-vendor custom intrinsics, since DXIL supports them directly. First on my list: come on AMD, give us access to the raw values from the interpolators (there is no interpolation hardware anymore, it is just a few intrinsics to read the three vertex values and the barycentrics). The obvious applications are SLERP for a quaternion-based tangent basis :) and a one-pass debug wireframe overlay without a GS.
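
To make the quaternion case concrete, here is a minimal Python sketch of what raw per-vertex access would allow. It is only a model, not shader code: `blend_quaternions` and its arguments are hypothetical names, and it uses a hemisphere flip followed by a normalized linear blend rather than a true SLERP. The point is that with the three vertex values and the barycentrics in hand, the shader can flip `q` to `-q` (the same rotation) before blending, which a fixed linear interpolator cannot do.

```python
import math


def blend_quaternions(q0, q1, q2, bary):
    """Blend three per-vertex quaternions (x, y, z, w) with barycentric weights.

    Hypothetical helper for illustration only: hemisphere-aligned,
    normalized linear blend (nlerp), not a true SLERP.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # q and -q encode the same rotation, so move q1 and q2 into q0's hemisphere.
    aligned = [q0]
    for q in (q1, q2):
        aligned.append(q if dot(q0, q) >= 0.0 else tuple(-c for c in q))

    b0, b1, b2 = bary
    blended = [b0 * a + b1 * b + b2 * c for a, b, c in zip(*aligned)]

    # Renormalize; assumes the blend is not degenerate (non-zero length).
    length = math.sqrt(dot(blended, blended))
    return tuple(c / length for c in blended)
```

With `q1 = -q0` and weights `(0.5, 0.5, 0.0)`, a plain linear blend collapses to the zero quaternion, while the aligned version returns `q0`.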

 

On the downside, damn, I hate CMake. I also have a pending WARP driver pixel shader crash open with Microsoft (but maybe switching to DXIL will solve that). I will have an answer in a day or two, when I have the time to set up a VirtualBox to test the runtime :)

Edited by galop1n

"SM6 being based on LLVM is not really news; it was revealed a year or two ago already."

Don't worry, I was completely aware that they were working on this, just not aware of when it would actually make it out to the public ;)

The fact that the future development of HLSL is out in the open will help slim down a lot of shader codebases, as I'm sure a robust compiler from HLSL to SPIR-V will follow soon (aside from the work that's already being done in glslang). Also, as you mentioned, the language now has the freedom to evolve towards more modern concepts.

Just because I'm too lazy to download and try it yet :) Does the new compiler still support SM2/3/4/5 and the old bytecode formats? Or will my toolchain need to keep using FXC for those?

Yes, SM6.0 has been documented for a little while. The major change is the set of intrinsics to control and communicate across a wave: https://msdn.microsoft.com/en-us/library/windows/desktop/mt733232(v=vs.85).aspx
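
As a rough idea of what "communicating across a wave" buys you, here is a small Python model of one common pattern: a whole wave reserving buffer slots with a single atomic add instead of one atomic per thread. This is an emulation under stated assumptions, not HLSL. `wave_append` is a hypothetical name, the list `wants_slot` stands in for the lanes of one wave, and the comments name the SM6.0 intrinsics (`WaveActiveCountBits`, `WavePrefixCountBits`, `WaveReadLaneFirst`) that each step is meant to mimic.

```python
def wave_append(wants_slot, counter):
    """Emulate one wave reserving output slots with a single atomic add.

    wants_slot: list of bools, one entry per lane of the (emulated) wave.
    counter:    one-element list standing in for a UAV counter.
    Returns the slot index for each lane, or None for lanes that skip.
    """
    # Models WaveActiveCountBits(wants_slot): how many lanes want a slot.
    total = sum(1 for w in wants_slot if w)

    # One lane does the atomic add for the whole wave; the returned base
    # would be broadcast to the other lanes with WaveReadLaneFirst.
    base = counter[0]
    counter[0] += total

    slots = []
    lanes_before = 0  # models WavePrefixCountBits(wants_slot) for this lane
    for w in wants_slot:
        slots.append(base + lanes_before if w else None)
        if w:
            lanes_before += 1
    return slots
```

For example, with the counter at 5 and lanes `[True, False, True, True]`, the lanes get slots 5, 6 and 7 and the counter ends at 8, for one atomic instead of three.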

 

As for DXIL and the new compiler, everything you want to know is in the Git repository. They did a good job documenting the extra parts, DXIL obviously among them, but the documentation is a little spread out across the file hierarchy.

Also, it is going to take a while to get used to the new intermediate representation. Here is a small compute shader:

;
; Input signature:
;
; Name                 Index   Mask Register SysValue  Format   Used
; -------------------- ----- ------ -------- -------- ------- ------
; no parameters
;
; Output signature:
;
; Name                 Index   Mask Register SysValue  Format   Used
; -------------------- ----- ------ -------- -------- ------- ------
; no parameters
;
; Pipeline Runtime Information: 
;
;
;
; Buffer Definitions:
;
; cbuffer cb_
; {
;
;   struct dx.alignment.legacy.cb_
;   {
;
;       struct dx.alignment.legacy.struct.Cb
;       {
;
;           uint instanceCount;                       ; Offset:    0
;           uint indexPerInstance;                    ; Offset:    4
;           uint useOcclusion;                        ; Offset:    8
;           uint pad;                                 ; Offset:   12
;           column_major float4x4 viewProj;           ; Offset:   16
;       
;       } cb_                                         ; Offset:    0
;
;   
;   } cb_                                             ; Offset:    0 Size:    80
;
; }
;
; Resource bind info for instances_
; {
;
;   struct struct.Instance
;   {
;
;       float3 min;                                   ; Offset:    0
;       float3 max;                                   ; Offset:   16
;   
;   } $Element;                                       ; Offset:    0 Size:    24
;
; }
;
; Resource bind info for result_
; {
;
;   struct struct.Uint
;   {
;
;       uint val;                                     ; Offset:    0
;   
;   } $Element;                                       ; Offset:    0 Size:     4
;
; }
;
; Resource bind info for occluded_
; {
;
;   struct struct.Uint
;   {
;
;       uint val;                                     ; Offset:    0
;   
;   } $Element;                                       ; Offset:    0 Size:     4
;
; }
;
;
; Resource Bindings:
;
; Name                                 Type  Format         Dim      ID      HLSL Bind  Count
; ------------------------------ ---------- ------- ----------- ------- -------------- ------
; cb_                               cbuffer      NA          NA     CB0            cb0     1
; instances_                        texture  struct         r/o      T0             t0     1
; depths_                           texture     f32          2d      T1             t1     1
; result_                               UAV  struct         r/w      U0             u0     1
; occluded_                             UAV  struct         r/w      U1             u1     1
;
target datalayout = "e-m:e-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "dxil-ms-dx"

%struct.Cb = type { i32, i32, i32, i32, %class.matrix.float.4.4 }
%class.matrix.float.4.4 = type { [4 x <4 x float>] }
%class.StructuredBuffer = type { %struct.Instance }
%struct.Instance = type { <3 x float>, <3 x float> }
%class.Texture2D = type { float, %"class.Texture2D<float>::mips_type" }
%"class.Texture2D<float>::mips_type" = type { i32 }
%class.RWStructuredBuffer = type { %struct.Uint }
%struct.Uint = type { i32 }
%cb_ = type { %struct.Cb }
%dx.alignment.legacy.struct.Cb = type { i32, i32, i32, i32, [4 x <4 x float>] }
%dx.alignment.legacy.cb_ = type { %dx.alignment.legacy.struct.Cb }
%dx.types.Handle = type { i8* }
%dx.types.Dimensions = type { i32, i32, i32, i32 }
%dx.types.CBufRet.i32 = type { i32, i32, i32, i32 }
%dx.types.ResRet.f32 = type { float, float, float, float, i32 }
%dx.types.CBufRet.f32 = type { float, float, float, float }

@dx.typevar.0 = external addrspace(1) constant %struct.Cb
@dx.typevar.1 = external addrspace(1) constant %class.StructuredBuffer
@dx.typevar.2 = external addrspace(1) constant %struct.Instance
@dx.typevar.3 = external addrspace(1) constant %class.Texture2D
@dx.typevar.4 = external addrspace(1) constant %"class.Texture2D<float>::mips_type"
@dx.typevar.5 = external addrspace(1) constant %class.RWStructuredBuffer
@dx.typevar.6 = external addrspace(1) constant %struct.Uint
@dx.typevar.7 = external addrspace(1) constant %cb_
@dx.typevar.8 = external addrspace(1) constant %dx.alignment.legacy.struct.Cb
@dx.typevar.9 = external addrspace(1) constant %dx.alignment.legacy.cb_
@llvm.used = appending global [10 x i8*] [i8* addrspacecast (i8 addrspace(1)* bitcast (%struct.Cb addrspace(1)* @dx.typevar.0 to i8 addrspace(1)*) to i8*), i8* addrspacecast (i8 addrspace(1)* bitcast (%class.StructuredBuffer addrspace(1)* @dx.typevar.1 to i8 addrspace(1)*) to i8*), i8* addrspacecast (i8 addrspace(1)* bitcast (%struct.Instance addrspace(1)* @dx.typevar.2 to i8 addrspace(1)*) to i8*), i8* addrspacecast (i8 addrspace(1)* bitcast (%class.Texture2D addrspace(1)* @dx.typevar.3 to i8 addrspace(1)*) to i8*), i8* addrspacecast (i8 addrspace(1)* bitcast (%"class.Texture2D<float>::mips_type" addrspace(1)* @dx.typevar.4 to i8 addrspace(1)*) to i8*), i8* addrspacecast (i8 addrspace(1)* bitcast (%class.RWStructuredBuffer addrspace(1)* @dx.typevar.5 to i8 addrspace(1)*) to i8*), i8* addrspacecast (i8 addrspace(1)* bitcast (%struct.Uint addrspace(1)* @dx.typevar.6 to i8 addrspace(1)*) to i8*), i8* addrspacecast (i8 addrspace(1)* bitcast (%cb_ addrspace(1)* @dx.typevar.7 to i8 addrspace(1)*) to i8*), i8* addrspacecast (i8 addrspace(1)* bitcast (%dx.alignment.legacy.struct.Cb addrspace(1)* @dx.typevar.8 to i8 addrspace(1)*) to i8*), i8* addrspacecast (i8 addrspace(1)* bitcast (%dx.alignment.legacy.cb_ addrspace(1)* @dx.typevar.9 to i8 addrspace(1)*) to i8*)], section "llvm.metadata"

define void @main() {
entry:
  %occluded__UAV_structbuf = call %dx.types.Handle @dx.op.createHandle(i32 58, i8 1, i32 1, i32 1, i1 false)  ; CreateHandle(resourceClass,rangeId,index,nonUniformIndex)
  %result__UAV_structbuf = call %dx.types.Handle @dx.op.createHandle(i32 58, i8 1, i32 0, i32 0, i1 false)  ; CreateHandle(resourceClass,rangeId,index,nonUniformIndex)
  %depths__texture_2d = call %dx.types.Handle @dx.op.createHandle(i32 58, i8 0, i32 1, i32 1, i1 false)  ; CreateHandle(resourceClass,rangeId,index,nonUniformIndex)
  %instances__texture_structbuf = call %dx.types.Handle @dx.op.createHandle(i32 58, i8 0, i32 0, i32 0, i1 false)  ; CreateHandle(resourceClass,rangeId,index,nonUniformIndex)
  %0 = call i32 @dx.op.threadId.i32(i32 93, i32 0)  ; ThreadId(component)
  %cb__buffer = call %dx.types.Handle @dx.op.createHandle(i32 58, i8 2, i32 0, i32 0, i1 false)  ; CreateHandle(resourceClass,rangeId,index,nonUniformIndex)
  %1 = call %dx.types.Dimensions @dx.op.getDimensions(i32 73, %dx.types.Handle %depths__texture_2d, i32 0)  ; GetDimensions(handle,mipLevel)
  %2 = extractvalue %dx.types.Dimensions %1, 0
  %3 = extractvalue %dx.types.Dimensions %1, 1
  %4 = call %dx.types.CBufRet.i32 @dx.op.cbufferLoadLegacy.i32(i32 60, %dx.types.Handle %cb__buffer, i32 0)  ; CBufferLoadLegacy(handle,regIndex)
  %5 = extractvalue %dx.types.CBufRet.i32 %4, 0
  %cmp = icmp ult i32 %0, %5
  br i1 %cmp, label %if.then, label %if.end.118

if.then:                                          ; preds = %entry
  %BufferLoad = call %dx.types.ResRet.f32 @dx.op.bufferLoad.f32(i32 69, %dx.types.Handle %instances__texture_structbuf, i32 %0, i32 0)  ; BufferLoad(srv,index,wot)
  %6 = extractvalue %dx.types.ResRet.f32 %BufferLoad, 0
  %7 = extractvalue %dx.types.ResRet.f32 %BufferLoad, 1
  %8 = extractvalue %dx.types.ResRet.f32 %BufferLoad, 2
  %BufferLoad149 = call %dx.types.ResRet.f32 @dx.op.bufferLoad.f32(i32 69, %dx.types.Handle %instances__texture_structbuf, i32 %0, i32 12)  ; BufferLoad(srv,index,wot)
  %9 = extractvalue %dx.types.ResRet.f32 %BufferLoad149, 0
  %10 = extractvalue %dx.types.ResRet.f32 %BufferLoad149, 1
  %11 = extractvalue %dx.types.ResRet.f32 %BufferLoad149, 2
  %12 = call %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(i32 60, %dx.types.Handle %cb__buffer, i32 1)  ; CBufferLoadLegacy(handle,regIndex)
  %13 = extractvalue %dx.types.CBufRet.f32 %12, 2
  %14 = extractvalue %dx.types.CBufRet.f32 %12, 3
  %15 = call %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(i32 60, %dx.types.Handle %cb__buffer, i32 2)  ; CBufferLoadLegacy(handle,regIndex)
  %16 = extractvalue %dx.types.CBufRet.f32 %15, 2
  %17 = extractvalue %dx.types.CBufRet.f32 %15, 3
  %18 = call %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(i32 60, %dx.types.Handle %cb__buffer, i32 3)  ; CBufferLoadLegacy(handle,regIndex)
  %19 = extractvalue %dx.types.CBufRet.f32 %18, 2
  %20 = extractvalue %dx.types.CBufRet.f32 %18, 3
  %21 = call %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(i32 60, %dx.types.Handle %cb__buffer, i32 4)  ; CBufferLoadLegacy(handle,regIndex)
  %22 = extractvalue %dx.types.CBufRet.f32 %21, 2
  %23 = extractvalue %dx.types.CBufRet.f32 %21, 3
  %24 = extractvalue %dx.types.CBufRet.f32 %12, 1
  %25 = extractvalue %dx.types.CBufRet.f32 %15, 1
  %26 = extractvalue %dx.types.CBufRet.f32 %18, 1
  %27 = extractvalue %dx.types.CBufRet.f32 %21, 1
  %28 = extractvalue %dx.types.CBufRet.f32 %12, 0
  %29 = extractvalue %dx.types.CBufRet.f32 %15, 0
  %30 = extractvalue %dx.types.CBufRet.f32 %18, 0
  %31 = extractvalue %dx.types.CBufRet.f32 %21, 0
  br label %for.body

for.body:                                         ; preds = %for.inc, %if.then
  %pMin.0.i1204 = phi float [ 2.000000e+00, %if.then ], [ %pMin.1.i1, %for.inc ]
  %pMin.0.i0203 = phi float [ 2.000000e+00, %if.then ], [ %pMin.1.i0, %for.inc ]
  %pMax.0.i1202 = phi float [ -2.000000e+00, %if.then ], [ %pMax.1.i1, %for.inc ]
  %pMax.0.i0201 = phi float [ -2.000000e+00, %if.then ], [ %pMax.1.i0, %for.inc ]
  %vCount.0200 = phi i32 [ 0, %if.then ], [ %vCount.1, %for.inc ]
  %maxDepth.0199 = phi float [ 0.000000e+00, %if.then ], [ %maxDepth.1, %for.inc ]
  %c.0198 = phi i32 [ 0, %if.then ], [ %inc31, %for.inc ]
  %and = and i32 %c.0198, 1
  %tobool5 = icmp ne i32 %and, 0
  %.sink.i0 = select i1 %tobool5, float %6, float %9
  %and6 = and i32 %c.0198, 2
  %tobool7 = icmp ne i32 %and6, 0
  %.sink147.i1 = select i1 %tobool7, float %7, float %10
  %and14 = and i32 %c.0198, 4
  %tobool15 = icmp ne i32 %and14, 0
  %.sink148.i2 = select i1 %tobool15, float %8, float %11
  %32 = fmul fast float %13, %.sink.i0
  %33 = fmul fast float %16, %.sink147.i1
  %34 = fadd fast float %33, %32
  %35 = fmul fast float %19, %.sink148.i2
  %36 = fadd fast float %34, %35
  %37 = fadd fast float %36, %22
  %38 = fmul fast float %14, %.sink.i0
  %39 = fmul fast float %17, %.sink147.i1
  %40 = fadd fast float %39, %38
  %41 = fmul fast float %20, %.sink148.i2
  %42 = fadd fast float %40, %41
  %43 = fadd fast float %42, %23
  %cmp23 = fcmp fast olt float %37, %43
  br i1 %cmp23, label %if.then.26, label %for.inc

if.then.26:                                       ; preds = %for.body
  %44 = fmul fast float %24, %.sink.i0
  %45 = fmul fast float %25, %.sink147.i1
  %46 = fadd fast float %45, %44
  %47 = fmul fast float %26, %.sink148.i2
  %48 = fadd fast float %46, %47
  %49 = fadd fast float %48, %27
  %50 = fmul fast float %28, %.sink.i0
  %51 = fmul fast float %29, %.sink147.i1
  %52 = fadd fast float %51, %50
  %53 = fmul fast float %30, %.sink148.i2
  %54 = fadd fast float %52, %53
  %55 = fadd fast float %54, %31
  %div.i0 = fdiv fast float %55, %43
  %div.i1 = fdiv fast float %49, %43
  %inc = add i32 %vCount.0200, 1
  %FMin = call float @dx.op.binary.f32(i32 35, float %pMin.0.i0203, float %div.i0)  ; FMin(a,b)
  %FMin151 = call float @dx.op.binary.f32(i32 35, float %pMin.0.i1204, float %div.i1)  ; FMin(a,b)
  %FMax = call float @dx.op.binary.f32(i32 34, float %pMax.0.i0201, float %div.i0)  ; FMax(a,b)
  %FMax150 = call float @dx.op.binary.f32(i32 34, float %pMax.0.i1202, float %div.i1)  ; FMax(a,b)
  %div29 = fdiv fast float %37, %43
  %FMax160 = call float @dx.op.binary.f32(i32 34, float %maxDepth.0199, float %div29)  ; FMax(a,b)
  br label %for.inc

for.inc:                                          ; preds = %if.then.26, %for.body
  %maxDepth.1 = phi float [ %FMax160, %if.then.26 ], [ %maxDepth.0199, %for.body ]
  %vCount.1 = phi i32 [ %inc, %if.then.26 ], [ %vCount.0200, %for.body ]
  %pMax.1.i0 = phi float [ %FMax, %if.then.26 ], [ %pMax.0.i0201, %for.body ]
  %pMax.1.i1 = phi float [ %FMax150, %if.then.26 ], [ %pMax.0.i1202, %for.body ]
  %pMin.1.i0 = phi float [ %FMin, %if.then.26 ], [ %pMin.0.i0203, %for.body ]
  %pMin.1.i1 = phi float [ %FMin151, %if.then.26 ], [ %pMin.0.i1204, %for.body ]
  %inc31 = add nuw nsw i32 %c.0198, 1
  %cmp2 = icmp eq i32 %inc31, 8
  br i1 %cmp2, label %for.end, label %for.body

for.end:                                          ; preds = %for.inc
  %pMin.1.i1.lcssa = phi float [ %pMin.1.i1, %for.inc ]
  %pMin.1.i0.lcssa = phi float [ %pMin.1.i0, %for.inc ]
  %pMax.1.i1.lcssa = phi float [ %pMax.1.i1, %for.inc ]
  %pMax.1.i0.lcssa = phi float [ %pMax.1.i0, %for.inc ]
  %vCount.1.lcssa = phi i32 [ %vCount.1, %for.inc ]
  %maxDepth.1.lcssa = phi float [ %maxDepth.1, %for.inc ]
  %cmp32 = icmp ne i32 %vCount.1.lcssa, 8
  %.maxDepth.0 = select i1 %cmp32, float 1.000000e+00, float %maxDepth.1.lcssa
  %cmp37 = icmp ne i32 %vCount.1.lcssa, 0
  %frombool = zext i1 %cmp37 to i8
  %cmp40.i0 = fcmp fast oge float %pMin.1.i0.lcssa, 1.000000e+00
  %cmp40.i1 = fcmp fast oge float %pMin.1.i1.lcssa, 1.000000e+00
  %56 = or i1 %cmp40.i0, %cmp40.i1
  %57 = and i1 %cmp37, %56
  %visible.0 = select i1 %57, i8 0, i8 %frombool
  %tobool45 = icmp ne i8 %visible.0, 0
  %cmp46.i0 = fcmp fast ole float %pMax.1.i0.lcssa, -1.000000e+00
  %cmp46.i1 = fcmp fast ole float %pMax.1.i1.lcssa, -1.000000e+00
  %58 = or i1 %cmp46.i0, %cmp46.i1
  %59 = and i1 %58, %tobool45
  %.visible.0 = select i1 %59, i8 0, i8 %visible.0
  %tobool51 = icmp ne i8 %.visible.0, 0
  %60 = extractvalue %dx.types.CBufRet.i32 %4, 2
  %tobool52 = icmp ne i32 %60, 0
  %61 = and i1 %tobool51, %tobool52
  br i1 %61, label %if.then.54, label %if.end.107

if.then.54:                                       ; preds = %for.end
  %FMax152 = call float @dx.op.binary.f32(i32 34, float %pMin.1.i0.lcssa, float -1.000000e+00)  ; FMax(a,b)
  %FMax153 = call float @dx.op.binary.f32(i32 34, float %pMin.1.i1.lcssa, float -1.000000e+00)  ; FMax(a,b)
  %FMin154 = call float @dx.op.binary.f32(i32 35, float %FMax152, float 1.000000e+00)  ; FMin(a,b)
  %FMin155 = call float @dx.op.binary.f32(i32 35, float %FMax153, float 1.000000e+00)  ; FMin(a,b)
  %FMax156 = call float @dx.op.binary.f32(i32 34, float %pMax.1.i0.lcssa, float -1.000000e+00)  ; FMax(a,b)
  %FMax157 = call float @dx.op.binary.f32(i32 34, float %pMax.1.i1.lcssa, float -1.000000e+00)  ; FMax(a,b)
  %FMin158 = call float @dx.op.binary.f32(i32 35, float %FMax156, float 1.000000e+00)  ; FMin(a,b)
  %FMin159 = call float @dx.op.binary.f32(i32 35, float %FMax157, float 1.000000e+00)  ; FMin(a,b)
  %mul58.i0 = fmul fast float %FMin154, 5.000000e-01
  %mul58.i1 = fmul fast float %FMin155, 5.000000e-01
  %add.i0 = fadd fast float %mul58.i0, 5.000000e-01
  %add.i1176 = fsub fast float 5.000000e-01, %mul58.i1
  %sub.i0 = add i32 %2, -1
  %sub.i1 = add i32 %3, -1
  %conv.i0 = uitofp i32 %sub.i0 to float
  %conv.i1 = uitofp i32 %sub.i1 to float
  %mul59.i0 = fmul fast float %add.i0, %conv.i0
  %mul59.i1 = fmul fast float %add.i1176, %conv.i1
  %Round_z = call float @dx.op.unary.f32(i32 28, float %mul59.i0)  ; Round_z(value)
  %Round_z161 = call float @dx.op.unary.f32(i32 28, float %mul59.i1)  ; Round_z(value)
  %conv61.i0 = fptosi float %Round_z to i32
  %conv61.i1 = fptosi float %Round_z161 to i32
  %mul62.i0 = fmul fast float %FMin158, 5.000000e-01
  %mul62.i1 = fmul fast float %FMin159, 5.000000e-01
  %add63.i0 = fadd fast float %mul62.i0, 5.000000e-01
  %add63.i1177 = fsub fast float 5.000000e-01, %mul62.i1
  %mul66.i0 = fmul fast float %add63.i0, %conv.i0
  %mul66.i1 = fmul fast float %add63.i1177, %conv.i1
  %Round_z162 = call float @dx.op.unary.f32(i32 28, float %mul66.i0)  ; Round_z(value)
  %Round_z163 = call float @dx.op.unary.f32(i32 28, float %mul66.i1)  ; Round_z(value)
  %conv68.i0 = fptosi float %Round_z162 to i32
  %conv68.i1 = fptosi float %Round_z163 to i32
  %sub70.i0.186 = sub i32 %conv68.i0, %conv61.i0
  %sub70.i1.187 = sub i32 %conv61.i1, %conv68.i1
  %62 = or i32 %sub70.i1.187, %sub70.i0.186
  %63 = icmp ugt i32 %62, 1
  br i1 %63, label %while.body.preheader, label %for.cond.76.preheader

while.body.preheader:                             ; preds = %if.then.54
  br label %while.body

for.cond.76.preheader.loopexit:                   ; preds = %while.body
  %shr74.i1.lcssa = phi i32 [ %shr74.i1, %while.body ]
  %shr74.i0.lcssa = phi i32 [ %shr74.i0, %while.body ]
  %shr.i1.lcssa = phi i32 [ %shr.i1, %while.body ]
  %shr.i0.lcssa = phi i32 [ %shr.i0, %while.body ]
  %inc73.lcssa = phi i32 [ %inc73, %while.body ]
  br label %for.cond.76.preheader

for.cond.76.preheader:                            ; preds = %for.cond.76.preheader.loopexit, %if.then.54
  %mip.0.lcssa = phi i32 [ 0, %if.then.54 ], [ %inc73.lcssa, %for.cond.76.preheader.loopexit ]
  %umax.0.i1.lcssa = phi i32 [ %conv61.i1, %if.then.54 ], [ %shr.i1.lcssa, %for.cond.76.preheader.loopexit ]
  %umax.0.i0.lcssa = phi i32 [ %conv68.i0, %if.then.54 ], [ %shr.i0.lcssa, %for.cond.76.preheader.loopexit ]
  %umin.0.i1.lcssa = phi i32 [ %conv68.i1, %if.then.54 ], [ %shr74.i1.lcssa, %for.cond.76.preheader.loopexit ]
  %umin.0.i0.lcssa = phi i32 [ %conv61.i0, %if.then.54 ], [ %shr74.i0.lcssa, %for.cond.76.preheader.loopexit ]
  %cmp77.181 = icmp sgt i32 %umin.0.i1.lcssa, %umax.0.i1.lcssa
  %cmp83.178 = icmp sgt i32 %umin.0.i0.lcssa, %umax.0.i0.lcssa
  %or.cond = or i1 %cmp77.181, %cmp83.178
  br i1 %or.cond, label %for.end.98, label %for.body.86.lr.ph.preheader

for.body.86.lr.ph.preheader:                      ; preds = %for.cond.76.preheader
  br label %for.body.86.lr.ph

while.body:                                       ; preds = %while.body, %while.body.preheader
  %mip.0192 = phi i32 [ %inc73, %while.body ], [ 0, %while.body.preheader ]
  %umax.0.i1191 = phi i32 [ %shr.i1, %while.body ], [ %conv61.i1, %while.body.preheader ]
  %umax.0.i0190 = phi i32 [ %shr.i0, %while.body ], [ %conv68.i0, %while.body.preheader ]
  %umin.0.i1189 = phi i32 [ %shr74.i1, %while.body ], [ %conv68.i1, %while.body.preheader ]
  %umin.0.i0188 = phi i32 [ %shr74.i0, %while.body ], [ %conv61.i0, %while.body.preheader ]
  %inc73 = add i32 %mip.0192, 1
  %shr.i0 = ashr i32 %umax.0.i0190, 1
  %shr.i1 = ashr i32 %umax.0.i1191, 1
  %shr74.i0 = ashr i32 %umin.0.i0188, 1
  %shr74.i1 = ashr i32 %umin.0.i1189, 1
  %sub70.i0 = sub nsw i32 %shr.i0, %shr74.i0
  %sub70.i1 = sub nsw i32 %shr.i1, %shr74.i1
  %64 = or i32 %sub70.i1, %sub70.i0
  %65 = icmp ugt i32 %64, 1
  br i1 %65, label %while.body, label %for.cond.76.preheader.loopexit

for.body.86.lr.ph:                                ; preds = %for.inc.96, %for.body.86.lr.ph.preheader
  %v.0183 = phi i32 [ %inc97, %for.inc.96 ], [ %umin.0.i1.lcssa, %for.body.86.lr.ph.preheader ]
  %visible.2182 = phi i8 [ %frombool92.lcssa, %for.inc.96 ], [ 0, %for.body.86.lr.ph.preheader ]
  br label %for.body.86

for.body.86:                                      ; preds = %for.body.86, %for.body.86.lr.ph
  %u.0180 = phi i32 [ %umin.0.i0.lcssa, %for.body.86.lr.ph ], [ %inc94, %for.body.86 ]
  %visible.3179 = phi i8 [ %visible.2182, %for.body.86.lr.ph ], [ %frombool92, %for.body.86 ]
  %66 = and i8 %visible.3179, 1
  %tobool87 = icmp ne i8 %66, 0
  %TextureLoad = call %dx.types.ResRet.f32 @dx.op.textureLoad.f32(i32 67, %dx.types.Handle %depths__texture_2d, i32 %mip.0.lcssa, i32 %u.0180, i32 %v.0183, i32 undef, i32 undef, i32 undef, i32 undef)  ; TextureLoad(srv,mipLevelOrSampleCount,coord0,coord1,coord2,offset0,offset1,offset2)
  %67 = extractvalue %dx.types.ResRet.f32 %TextureLoad, 0
  %cmp90 = fcmp fast oge float %.maxDepth.0, %67
  %68 = or i1 %tobool87, %cmp90
  %frombool92 = zext i1 %68 to i8
  %inc94 = add nsw i32 %u.0180, 1
  %cmp83 = icmp slt i32 %u.0180, %umax.0.i0.lcssa
  br i1 %cmp83, label %for.body.86, label %for.inc.96

for.inc.96:                                       ; preds = %for.body.86
  %frombool92.lcssa = phi i8 [ %frombool92, %for.body.86 ]
  %inc97 = add nsw i32 %v.0183, 1
  %cmp77 = icmp slt i32 %v.0183, %umax.0.i1.lcssa
  br i1 %cmp77, label %for.body.86.lr.ph, label %for.end.98.loopexit

for.end.98.loopexit:                              ; preds = %for.inc.96
  %frombool92.lcssa.lcssa = phi i8 [ %frombool92.lcssa, %for.inc.96 ]
  br label %for.end.98

for.end.98:                                       ; preds = %for.end.98.loopexit, %for.cond.76.preheader
  %visible.2.lcssa = phi i8 [ 0, %for.cond.76.preheader ], [ %frombool92.lcssa.lcssa, %for.end.98.loopexit ]
  %tobool99 = icmp eq i8 %visible.2.lcssa, 0
  br i1 %tobool99, label %if.then.100, label %if.end.107

if.then.100:                                      ; preds = %for.end.98
  %AtomicAdd = call i32 @dx.op.atomicBinOp.i32(i32 81, %dx.types.Handle %occluded__UAV_structbuf, i32 0, i32 1, i32 0, i32 undef, i32 1)  ; AtomicBinOp(handle,atomicOp,offset0,offset1,offset2,newValue)
  %add103 = add i32 %AtomicAdd, 5
  call void @dx.op.bufferStore.i32(i32 70, %dx.types.Handle %occluded__UAV_structbuf, i32 %add103, i32 0, i32 %0, i32 undef, i32 undef, i32 undef, i8 1)  ; BufferStore(uav,coord0,coord1,value0,value1,value2,value3,mask)
  br label %if.end.107

if.end.107:                                       ; preds = %if.then.100, %for.end.98, %for.end
  %visible.4 = phi i8 [ %visible.2.lcssa, %for.end.98 ], [ %visible.2.lcssa, %if.then.100 ], [ %.visible.0, %for.end ]
  %69 = and i8 %visible.4, 1
  %tobool108 = icmp eq i8 %69, 0
  br i1 %tobool108, label %if.end.118, label %if.then.109

if.then.109:                                      ; preds = %if.end.107
  %AtomicAdd164 = call i32 @dx.op.atomicBinOp.i32(i32 81, %dx.types.Handle %result__UAV_structbuf, i32 0, i32 1, i32 0, i32 undef, i32 1)  ; AtomicBinOp(handle,atomicOp,offset0,offset1,offset2,newValue)
  %add114 = add i32 %AtomicAdd164, 5
  call void @dx.op.bufferStore.i32(i32 70, %dx.types.Handle %result__UAV_structbuf, i32 %add114, i32 0, i32 %0, i32 undef, i32 undef, i32 undef, i8 1)  ; BufferStore(uav,coord0,coord1,value0,value1,value2,value3,mask)
  br label %if.end.118

if.end.118:                                       ; preds = %if.then.109, %if.end.107, %entry
  %cmp119 = icmp eq i32 %0, 0
  br i1 %cmp119, label %if.then.122, label %if.end.146

if.then.122:                                      ; preds = %if.end.118
  %70 = call %dx.types.CBufRet.i32 @dx.op.cbufferLoadLegacy.i32(i32 60, %dx.types.Handle %cb__buffer, i32 0)  ; CBufferLoadLegacy(handle,regIndex)
  %71 = extractvalue %dx.types.CBufRet.i32 %70, 1
  call void @dx.op.bufferStore.i32(i32 70, %dx.types.Handle %result__UAV_structbuf, i32 0, i32 0, i32 %71, i32 undef, i32 undef, i32 undef, i8 1)  ; BufferStore(uav,coord0,coord1,value0,value1,value2,value3,mask)
  call void @dx.op.bufferStore.i32(i32 70, %dx.types.Handle %result__UAV_structbuf, i32 2, i32 0, i32 0, i32 undef, i32 undef, i32 undef, i8 1)  ; BufferStore(uav,coord0,coord1,value0,value1,value2,value3,mask)
  call void @dx.op.bufferStore.i32(i32 70, %dx.types.Handle %result__UAV_structbuf, i32 3, i32 0, i32 0, i32 undef, i32 undef, i32 undef, i8 1)  ; BufferStore(uav,coord0,coord1,value0,value1,value2,value3,mask)
  call void @dx.op.bufferStore.i32(i32 70, %dx.types.Handle %result__UAV_structbuf, i32 4, i32 0, i32 0, i32 undef, i32 undef, i32 undef, i8 1)  ; BufferStore(uav,coord0,coord1,value0,value1,value2,value3,mask)
  %72 = call %dx.types.CBufRet.i32 @dx.op.cbufferLoadLegacy.i32(i32 60, %dx.types.Handle %cb__buffer, i32 0)  ; CBufferLoadLegacy(handle,regIndex)
  %73 = extractvalue %dx.types.CBufRet.i32 %72, 1
  call void @dx.op.bufferStore.i32(i32 70, %dx.types.Handle %occluded__UAV_structbuf, i32 0, i32 0, i32 %73, i32 undef, i32 undef, i32 undef, i8 1)  ; BufferStore(uav,coord0,coord1,value0,value1,value2,value3,mask)
  call void @dx.op.bufferStore.i32(i32 70, %dx.types.Handle %occluded__UAV_structbuf, i32 2, i32 0, i32 0, i32 undef, i32 undef, i32 undef, i8 1)  ; BufferStore(uav,coord0,coord1,value0,value1,value2,value3,mask)
  call void @dx.op.bufferStore.i32(i32 70, %dx.types.Handle %occluded__UAV_structbuf, i32 3, i32 0, i32 0, i32 undef, i32 undef, i32 undef, i8 1)  ; BufferStore(uav,coord0,coord1,value0,value1,value2,value3,mask)
  call void @dx.op.bufferStore.i32(i32 70, %dx.types.Handle %occluded__UAV_structbuf, i32 4, i32 0, i32 0, i32 undef, i32 undef, i32 undef, i8 1)  ; BufferStore(uav,coord0,coord1,value0,value1,value2,value3,mask)
  br label %if.end.146

if.end.146:                                       ; preds = %if.then.122, %if.end.118
  ret void
}

; Function Attrs: nounwind readnone
declare i32 @dx.op.threadId.i32(i32, i32) #0

; Function Attrs: nounwind readonly
declare %dx.types.Handle @dx.op.createHandle(i32, i8, i32, i32, i1) #1

; Function Attrs: nounwind readonly
declare %dx.types.CBufRet.i32 @dx.op.cbufferLoadLegacy.i32(i32, %dx.types.Handle, i32) #1

; Function Attrs: nounwind readonly
declare %dx.types.CBufRet.f32 @dx.op.cbufferLoadLegacy.f32(i32, %dx.types.Handle, i32) #1

; Function Attrs: nounwind readonly
declare %dx.types.Dimensions @dx.op.getDimensions(i32, %dx.types.Handle, i32) #1

; Function Attrs: nounwind readonly
declare %dx.types.ResRet.f32 @dx.op.bufferLoad.f32(i32, %dx.types.Handle, i32, i32) #1

; Function Attrs: nounwind readnone
declare float @dx.op.binary.f32(i32, float, float) #0

; Function Attrs: nounwind readnone
declare float @dx.op.unary.f32(i32, float) #0

; Function Attrs: nounwind readonly
declare %dx.types.ResRet.f32 @dx.op.textureLoad.f32(i32, %dx.types.Handle, i32, i32, i32, i32, i32, i32, i32) #1

; Function Attrs: nounwind
declare i32 @dx.op.atomicBinOp.i32(i32, %dx.types.Handle, i32, i32, i32, i32, i32) #2

; Function Attrs: nounwind
declare void @dx.op.bufferStore.i32(i32, %dx.types.Handle, i32, i32, i32, i32, i32, i32, i8) #2

attributes #0 = { nounwind readnone }
attributes #1 = { nounwind readonly }
attributes #2 = { nounwind }

!llvm.ident = !{!0}
!dx.valver = !{!1}
!dx.version = !{!2}
!dx.shaderModel = !{!3}
!dx.resources = !{!4}
!dx.typeAnnotations = !{!16, !39}
!dx.entryPoints = !{!43}

!0 = !{!"clang version 3.7 (tags/RELEASE_370/final)"}
!1 = !{i32 1, i32 0}
!2 = !{i32 0, i32 7}
!3 = !{!"cs", i32 6, i32 0}
!4 = !{!5, !10, !14, null}
!5 = !{!6, !8}
!6 = !{i32 0, %class.StructuredBuffer* undef, !"instances_", i32 0, i32 0, i32 1, i32 12, i32 0, !7}
!7 = !{i32 1, i32 24}
!8 = !{i32 1, %class.Texture2D* undef, !"depths_", i32 0, i32 1, i32 1, i32 2, i32 0, !9}
!9 = !{i32 0, i32 9}
!10 = !{!11, !13}
!11 = !{i32 0, %class.RWStructuredBuffer* undef, !"result_", i32 0, i32 0, i32 1, i32 12, i1 false, i1 false, i1 false, !12}
!12 = !{i32 1, i32 4}
!13 = !{i32 1, %class.RWStructuredBuffer* undef, !"occluded_", i32 0, i32 1, i32 1, i32 12, i1 false, i1 false, i1 false, !12}
!14 = !{!15}
!15 = !{i32 0, %dx.alignment.legacy.cb_* undef, !"cb_", i32 0, i32 0, i32 1, i32 80, null}
!16 = !{i32 0, %struct.Cb addrspace(1)* @dx.typevar.0, !17, %class.StructuredBuffer addrspace(1)* @dx.typevar.1, !24, %struct.Instance addrspace(1)* @dx.typevar.2, !26, %class.Texture2D addrspace(1)* @dx.typevar.3, !29, %"class.Texture2D<float>::mips_type" addrspace(1)* @dx.typevar.4, !32, %class.RWStructuredBuffer addrspace(1)* @dx.typevar.5, !34, %struct.Uint addrspace(1)* @dx.typevar.6, !35, %cb_ addrspace(1)* @dx.typevar.7, !37, %dx.alignment.legacy.struct.Cb addrspace(1)* @dx.typevar.8, !17, %dx.alignment.legacy.cb_ addrspace(1)* @dx.typevar.9, !37}
!17 = !{i32 80, !18, !19, !20, !21, !22}
!18 = !{i32 6, !"instanceCount", i32 3, i32 0, i32 7, i32 5}
!19 = !{i32 6, !"indexPerInstance", i32 3, i32 4, i32 7, i32 5}
!20 = !{i32 6, !"useOcclusion", i32 3, i32 8, i32 7, i32 5}
!21 = !{i32 6, !"pad", i32 3, i32 12, i32 7, i32 5}
!22 = !{i32 6, !"viewProj", i32 2, !23, i32 3, i32 16, i32 7, i32 9}
!23 = !{i32 4, i32 4, i32 2}
!24 = !{i32 28, !25}
!25 = !{i32 6, !"h", i32 3, i32 0}
!26 = !{i32 28, !27, !28}
!27 = !{i32 6, !"min", i32 3, i32 0, i32 7, i32 9}
!28 = !{i32 6, !"max", i32 3, i32 16, i32 7, i32 9}
!29 = !{i32 8, !30, !31}
!30 = !{i32 6, !"h", i32 3, i32 0, i32 7, i32 9}
!31 = !{i32 6, !"mips", i32 3, i32 4}
!32 = !{i32 4, !33}
!33 = !{i32 6, !"handle", i32 3, i32 0, i32 7, i32 5}
!34 = !{i32 4, !25}
!35 = !{i32 4, !36}
!36 = !{i32 6, !"val", i32 3, i32 0, i32 7, i32 5}
!37 = !{i32 80, !38}
!38 = !{i32 6, !"cb_", i32 3, i32 0}
!39 = !{i32 1, void ()* @main, !40}
!40 = !{!41}
!41 = !{i32 0, !42, !42}
!42 = !{}
!43 = !{void ()* @main, !"main", null, !4, !44}
!44 = !{i32 0, i64 16, i32 4, !45, i32 5, [164 x i8] c"\02\00\00\00\02\00\00\00\18\00\00\00\00\00\00\00\A4\00\00\00\01\00\00\00\02\00\00\00\00\00\00\000\00\00\00\00\00\00\00\00\00\00\00<\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\04\00\00\00D\00\00\00\00\00\00\00\01\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\FF\FF\FF\FF\01\00\00\00\01\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\FF\FF\FF\FF\01\00\00\00\01\00\00\00\01\00\00\00\00\00\00\00\00\00\00\00\FF\FF\FF\FF\00\00\00\00\01\00\00\00\01\00\00\00\00\00\00\00\00\00\00\00\FF\FF\FF\FF"}
!45 = !{i32 64, i32 1, i32 1}
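
One way to read the `for.body` and `if.then.26` blocks above, sketched in Python rather than HLSL. This is an interpretation of the IR, not the original shader source: `project_corners` and its parameter names are made up for illustration, and `regs` stands for the four float4 values loaded from cbuffer registers 1 to 4 (`viewProj`).

```python
def project_corners(box_min, box_max, regs):
    """Model of the 8-iteration corner loop in the dump (an interpretation).

    box_min, box_max: the two float3 values loaded per instance.
    regs: four 4-component rows, standing in for cbuffer registers 1..4.
    """
    p_min = [2.0, 2.0]    # initial phi values of pMin in for.body
    p_max = [-2.0, -2.0]  # initial phi values of pMax
    v_count = 0
    max_depth = 0.0

    for c in range(8):
        # Bits 0, 1, 2 of c pick each coordinate: set bit -> first load.
        corner = [box_min[i] if c & (1 << i) else box_max[i] for i in range(3)]

        # clip[k] = r1[k]*x + r2[k]*y + r3[k]*z + r4[k]
        x, y, z, w = (
            regs[0][k] * corner[0]
            + regs[1][k] * corner[1]
            + regs[2][k] * corner[2]
            + regs[3][k]
            for k in range(4)
        )

        if z < w:  # the fcmp olt that guards if.then.26
            # Python raises on w == 0; the shader's fdiv would not trap.
            v_count += 1
            p_min = [min(p_min[0], x / w), min(p_min[1], y / w)]
            p_max = [max(p_max[0], x / w), max(p_max[1], y / w)]
            max_depth = max(max_depth, z / w)

    # for.end: unless all 8 corners passed, depth falls back to 1.0.
    if v_count != 8:
        max_depth = 1.0
    return p_min, p_max, v_count, max_depth
```

With an identity matrix for `regs`, a box from (-0.5, -0.5, 0.25) to (0.5, 0.5, 0.75) keeps all eight corners and yields a screen rectangle of [-0.5, 0.5] on both axes with a maximum depth of 0.75.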

Edited by galop1n

"Just because I'm too lazy to download and try it yet :) Does the new compiler still support SM2/3/4/5 and the old bytecode formats? Or will my toolchain need to keep using FXC for those?"

From what I've read, it only produces DXIL, and you need Windows 10 with drivers that support DXIL. So take a wild guess...

Though hopefully it won't take long until someone writes a DXIL -> old bytecode converter.
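
If that reading is right, a build script would have to route each shader to one of two compilers. A minimal sketch, assuming `fxc` and `dxc` are both on the PATH and that the split falls at shader model 6 (an assumption drawn from this thread, not from the compiler's documentation); `compiler_command` is a hypothetical helper. The `-T`, `-E` and `-Fo` options (target profile, entry point, output file) exist in both tools, with `fxc` conventionally written using `/`.

```python
def compiler_command(profile, entry, source, output):
    """Build an argument list for fxc or dxc from a target profile.

    profile: a target profile string such as "cs_6_0" or "ps_5_0".
    Hypothetical helper: routing on the major version is an assumption.
    """
    # Profiles look like <stage>_<major>_<minor>[...], e.g. "vs_4_0_level_9_3".
    major = int(profile.split("_")[1])

    if major >= 6:
        # New LLVM-based compiler, DXIL output.
        return ["dxc", "-T", profile, "-E", entry, "-Fo", output, source]

    # Older shader models stay on the legacy compiler and bytecode.
    return ["fxc", "/T", profile, "/E", entry, "/Fo", output, source]
```

The returned list can be handed to `subprocess.run` as is; a real toolchain would also need include paths, defines and optimization flags, which are left out here.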

"Yes, SM6.0 has been documented for a little while. The major change is the set of intrinsics to control and communicate across a wave: https://msdn.microsoft.com/en-us/library/windows/desktop/mt733232(v=vs.85).aspx"

Thanks... actually I've already seen that... but I was looking for documentation on barycentrics and programmable blending.

Edited by Infinisearch
