Your code is basically fetching the 8 corners of a cube and calculating trilinear interpolation weights between them. Where I think it might be going wrong is that you calculate the interpolation weights in the HLSL, but leave the choice of texel to the point-sampling texture fetch instruction. I suspect there are subtle floating-point rounding differences between the two, which mean that occasionally you're using interpolation weights for one cube, but texture samples from another.
I think the solution would be to give the texture fetch instruction unambiguous coordinates, so that a rounding error can't change which texel it picks. Some pseudocode which could replace the first 3 lines of your tex3D_trilinear function:
    // t = frac(t); // This line might help if the magnitude of t is very large.
    float3 topCorner = (t - halfTexelSize.xxx) * textureSize;
    float3 topCornerFloor = floor(topCorner);
    float3 f = topCorner - topCornerFloor; // interpolation weights, taken from the same floor() as the sample position
    float3 topCornerUV = topCornerFloor / textureSize;
    // This half-texel offset means you're attempting to sample the centre of the
    // texel instead of its top-left corner. I think it helps, but if it doesn't,
    // then try without it, or try a 1/4 or 1/3 offset.
    topCornerUV += halfTexelSize.xxx;
    float4 x = float4(topCornerUV, 0);
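For context, here is a rough sketch of how those coordinates might feed the rest of a manual trilinear filter. I haven't seen the rest of your function, so this is a guess at its shape, not a drop-in replacement: `volumeSampler`, the `uv0`/`uv1` and `c000`...`c111` names, and the global declarations are all hypothetical, and I'm assuming a cubic texture, a point-filtered sampler, and `halfTexelSize` equal to `0.5 / textureSize`.

```hlsl
// Assumed (hypothetical) declarations; use whatever your shader already has.
sampler3D volumeSampler;   // assumed to be set to point filtering
float textureSize;         // texels per axis, assuming a cubic volume
float halfTexelSize;       // assumed to be 0.5 / textureSize

float4 tex3D_trilinear(float3 t)
{
    // Derive the weights and the sample position from the same floor().
    float3 topCorner = (t - halfTexelSize.xxx) * textureSize;
    float3 topCornerFloor = floor(topCorner);
    float3 f = topCorner - topCornerFloor;

    // Centres of the two texels that bracket t on each axis.
    float3 uv0 = topCornerFloor / textureSize + halfTexelSize.xxx;
    float3 uv1 = uv0 + 2.0 * halfTexelSize.xxx; // one whole texel further along

    // Eight point-sampled fetches at mip level 0 (w = 0 selects the LOD).
    float4 c000 = tex3Dlod(volumeSampler, float4(uv0.x, uv0.y, uv0.z, 0));
    float4 c100 = tex3Dlod(volumeSampler, float4(uv1.x, uv0.y, uv0.z, 0));
    float4 c010 = tex3Dlod(volumeSampler, float4(uv0.x, uv1.y, uv0.z, 0));
    float4 c110 = tex3Dlod(volumeSampler, float4(uv1.x, uv1.y, uv0.z, 0));
    float4 c001 = tex3Dlod(volumeSampler, float4(uv0.x, uv0.y, uv1.z, 0));
    float4 c101 = tex3Dlod(volumeSampler, float4(uv1.x, uv0.y, uv1.z, 0));
    float4 c011 = tex3Dlod(volumeSampler, float4(uv0.x, uv1.y, uv1.z, 0));
    float4 c111 = tex3Dlod(volumeSampler, float4(uv1.x, uv1.y, uv1.z, 0));

    // Blend along x, then y, then z, using the weights computed above.
    float4 c00 = lerp(c000, c100, f.x);
    float4 c10 = lerp(c010, c110, f.x);
    float4 c01 = lerp(c001, c101, f.x);
    float4 c11 = lerp(c011, c111, f.x);
    float4 c0 = lerp(c00, c10, f.y);
    float4 c1 = lerp(c01, c11, f.y);
    return lerp(c0, c1, f.z);
}
```

Because every fetch lands on a texel centre, a small rounding error in the coordinate shouldn't be able to push the sample into a neighbouring texel. What happens at the edges of the volume (where `uv1` goes past 1) depends on the sampler's address mode, which I haven't tried to handle here.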