About max343

  1. This contradicts what some NVIDIA engineers have been telling me. The procedure I described is not one I invented on my own; it was given to me by NVIDIA, with all the remarks.
  2. For the sake of completeness, point 2 is a bit more complicated than that when A writes to a UAV and B reads from it (which is the more widespread way to use async compute).

     First things first, fences do not ensure UAV writes by themselves. In essence, a fence just waits for a counter to reach a certain value; that counter is set on the execution queue, while UAV writes happen elsewhere. So if A (on the compute queue) writes to a UAV, it cannot be synchronized by a fence alone. The second thing to get out of the way is that while the dependency A->B is quite clear, the dependency B->A is much less obvious, but it's there (a WAR hazard). So you need to synchronize two things: A->B and B->A.

     Starting with the easy one, B->A: if all B does is read (SRV), then a fence is enough to synchronize, since all reads happen during or before execution. So when B finishes, the counter increments, and that means you're free to write to the buffer again. A->B is a bit more complicated, since you need to ensure the UAV writes. This can be done explicitly by putting a resource barrier of UAV type on the same queue A is executing on, but before incrementing (signaling) the fence, while the actual resource transition happens on the queue B is executed on (due to the limitations mentioned in point 1).

     Remember that buffers are created with D3D12_RESOURCE_FLAG_ALLOW_SIMULTANEOUS_ACCESS implicitly set, so the hardware will not prevent access to the same buffer scheduled from two different queues. You need to ensure correctness on your own.
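     The handshake above can be sketched with a thread-and-event analogy (illustrative only, not real D3D12 code): each threading.Event stands in for a fence, and the comment marks where the UAV barrier would go on A's queue, before the signal.

```python
import threading

# Hypothetical two-queue analogy: A writes a shared "UAV" buffer, B reads it.
# Both directions must be synchronized: A -> B (writes visible) and B -> A (WAR).

N_FRAMES = 5
buffer = [0]                 # the shared "UAV" buffer
a_done = threading.Event()   # fence signaled by queue A (A -> B)
b_done = threading.Event()   # fence signaled by queue B (B -> A)
b_done.set()                 # B has "consumed" frame -1, so A may write frame 0
results = []

def queue_a():
    for frame in range(N_FRAMES):
        b_done.wait()        # B -> A: wait until B finished reading (WAR hazard)
        b_done.clear()
        buffer[0] = frame    # write the UAV
        # (in D3D12: UAV barrier here, on A's queue, *before* the Signal)
        a_done.set()         # signal the A -> B fence

def queue_b():
    for frame in range(N_FRAMES):
        a_done.wait()        # A -> B: wait until A's writes are done
        a_done.clear()
        results.append(buffer[0])   # read (SRV-style)
        b_done.set()         # signal the B -> A fence

ta, tb = threading.Thread(target=queue_a), threading.Thread(target=queue_b)
ta.start(); tb.start(); ta.join(); tb.join()
print(results)   # → [0, 1, 2, 3, 4]: every write is observed exactly once, in order
```

     Dropping either event produces exactly the hazards described: without `a_done`, B can read stale data; without `b_done`, A can overwrite a frame B has not read yet.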
  3. Point inside convex polyhedron defined by planes

    I don't have inequalities in my solution. What I suggested disregards direction because it already assumes that the feasible region of the half spaces defines a convex polyhedron. If we'd flip one of the inequalities, the feasible region would be empty.
  4. Point inside convex polyhedron defined by planes

    But the simplex algorithm doesn't search through interior points, just through the boundary. Phase I just finds a boundary vertex of the feasible region or returns that the problem is infeasible. OP wanted something close to the centroid, this is distinctly an interior point.
  5. Point inside convex polyhedron defined by planes

    I don't think that Simplex algorithm is the best choice here, since it searches for maximum on the boundary, while OP wanted some interior point.
  6. Point inside convex polyhedron defined by planes

    BTW, sometimes it makes more sense to find the center of the circumscribed sphere rather than the centroid (which is biased toward clustered vertices). In this case you can do this:
    1. Find x0 as I previously described.
    2. Solve the weighted minimization problem with the weights: Wi = |Ni*x0 - Di|
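    A sketch of this two-step scheme (the `plane_point` helper and the box test case are mine, not from the post):

```python
import numpy as np

# Step 1: unweighted least-squares point x0.
# Step 2: re-solve with weights w_i = |N_i . x0 - D_i|, as suggested above.

def plane_point(N, D, w=None):
    """Least-squares x minimizing sum w_i * (N_i . x - D_i)^2."""
    if w is None:
        w = np.ones(len(D))
    A = (N * w[:, None]).T @ N          # sum of w_i * N_i^T N_i
    b = (N * (w * D)[:, None]).sum(0)   # sum of w_i * D_i * N_i^T
    return np.linalg.solve(A, b)

# axis-aligned box [0,2] x [0,4] x [0,6], outward unit normals, planes N_i . x = D_i
N = np.array([[-1,0,0],[1,0,0],[0,-1,0],[0,1,0],[0,0,-1],[0,0,1]], float)
D = np.array([0, 2, 0, 4, 0, 6], float)

x0 = plane_point(N, D)       # step 1: unweighted solution
w = np.abs(N @ x0 - D)       # distances from x0 to each plane
x1 = plane_point(N, D, w)    # step 2: weighted solution
print(x0, x1)                # both (1, 2, 3) for this symmetric box
```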
  7. Point inside convex polyhedron defined by planes

    I haven't tested it, but maybe you can do this. Your plane equations are Ni*x = Di, with Ni normalized row vectors, Di scalars, and x a column vector. Now minimize the sum of squared distances with respect to x:
    S = sum[(Ni*x - Di)^2]
    and obtain:
    x = (sum[Ni^T*Ni])^+ * sum[Di*Ni^T]
    I'm almost sure that the pseudo-inverse is not required (and you can take the regular inverse), because this reminds me a lot of SVD. However, this is just a hunch. EDIT: Yup, the matrix is invertible for any closed polyhedron. No need for a pseudo-inverse.
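    As a quick numerical sanity check of that closed form (the unit-cube test case is mine): for a box's six planes it should return the box's center.

```python
import numpy as np

# Planes N_i . x = D_i of the box [0,1]^3, outward unit normals.
N = np.array([[-1,0,0],[1,0,0],[0,-1,0],[0,1,0],[0,0,-1],[0,0,1]], float)
D = np.array([0, 1, 0, 1, 0, 1], float)

A = N.T @ N                  # sum N_i^T N_i (invertible for a closed polyhedron)
b = N.T @ D                  # sum D_i * N_i^T
x = np.linalg.solve(A, b)    # regular inverse suffices, no pseudo-inverse
print(x)                     # → [0.5 0.5 0.5]
```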
  8. Soft body physics

    I'm doubtful that introductory calculus/linear algebra is sufficient if you want rigor. All theory of deformable bodies is based on tensor calculus.
  9. Exactly. For instance, pure scaling and translation each have only three meaningful elements. You can use this to reduce the number of floating-point operations drastically. The only thing to note is to use SIMD operations rather than scalar ones.
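     As an illustration of exploiting that structure (the `compose` helper and the uniform-scale assumption are mine): composing two scale-plus-translation transforms takes a handful of multiply-adds instead of a full 4x4 matrix product.

```python
import numpy as np

def compose(s1, t1, s2, t2):
    """Compose x -> s1*(s2*x + t2) + t1, i.e. apply (s2, t2) first.
    One scalar multiply plus one scale-and-add of a 3-vector."""
    return s1 * s2, s1 * t2 + t1

def to_matrix(s, t):
    """Equivalent 4x4 homogeneous matrix, for checking against general matmul."""
    M = np.eye(4)
    M[:3, :3] *= s
    M[:3, 3] = t
    return M

s, t = compose(2.0, np.array([1.0, 0.0, 0.0]), 3.0, np.array([0.0, 1.0, 0.0]))
# same result as the general 4x4 product, at a fraction of the flops
assert np.allclose(to_matrix(s, t),
                   to_matrix(2.0, [1, 0, 0]) @ to_matrix(3.0, [0, 1, 0]))
print(s, t)   # → 6.0 [1. 2. 0.]
```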
  10. You can optimize this bit of code. All those multiplications are of specific matrices, and can be implemented in a faster way than general matrix multiplication.
  11. Magnet Physic's Problem

    Magnets behave differently than harmonic oscillators. In fact, the Lorentz force is much more complicated than just any restoring force. You can qualitatively understand the interaction process between two magnets by modeling them as currents in closed loops. Modeling this effect with a damped harmonic oscillator is just a nasty shortcut, nothing more.
  12. Magnet Physic's Problem

    What kind of repulsive force are you interested in? In other words, what is the nature of the force depicted by the yellow arrow? If you don't really care about its nature, you can use a damped harmonic oscillator to achieve the wanted effect. Otherwise you need to be more specific.
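    A minimal sketch of that damped-oscillator shortcut, with made-up parameters, integrated with semi-implicit Euler:

```python
# m*x'' = -k*x - c*x': spring-like restoring force plus velocity damping.
m, k, c = 1.0, 40.0, 1.2      # mass, stiffness, damping (illustrative values)
x, v, dt = 1.0, 0.0, 0.001    # start displaced by 1 unit, at rest

peaks = []
prev_x = x
for step in range(20000):     # simulate 20 seconds
    a = (-k * x - c * v) / m
    v += a * dt               # semi-implicit Euler: update velocity first
    new_x = x + v * dt
    if prev_x < x > new_x:    # local maximum -> one oscillation peak
        peaks.append(x)
    prev_x, x = x, new_x

print(peaks[:3])   # successive peaks shrink: the oscillation decays toward rest
```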
  13. Best way to downscale a heightmap during runtime

    Completely agree. Even if the OP isn't going to use wavelets in this case, knowing even the simplest transform (Haar) gives a whole new perspective on the frequency domain.
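     One level of the Haar transform is just pairwise averages and differences; a tiny sketch (the sample signal is mine):

```python
signal = [2.0, 4.0, 6.0, 6.0, 8.0, 4.0, 2.0, 2.0]

# Averages keep the coarse (half-resolution) signal; differences keep the detail.
avg  = [(a + b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
diff = [(a - b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
print(avg)    # → [3.0, 6.0, 6.0, 2.0]   (what a downscale keeps)
print(diff)   # → [-1.0, 0.0, 2.0, 0.0]  (what a downscale throws away)

# Perfect reconstruction: a = avg + diff, b = avg - diff.
restored = [x for s, d in zip(avg, diff) for x in (s + d, s - d)]
assert restored == signal
```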
  14. Best way to downscale a heightmap during runtime

    A wavelet transform might work for you, but you should know that it doesn't differ much from taking the mean. Linear filters are notorious for eliminating a lot of important data (mostly transitions), while all you want is to eliminate small bumps with as little effect on the rest of the data as possible. You should really consider non-linear filters, like the bilateral filter or the more general NLM (non-local means) filter. Efficient implementation of these two is a bit tricky but doable, as long as you don't go wild with the filter radius. Also, don't even try to implement them naively; they'll be terribly slow.
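    A naive 1-D bilateral filter, purely to show the idea (parameters and data are mine; a real heightmap version would be 2-D and needs the efficiency tricks mentioned above):

```python
import math

def bilateral_1d(h, radius=2, sigma_s=1.5, sigma_r=0.5):
    """Each sample becomes a weighted mean of its neighbours, where the weight
    falls off with both spatial distance and *height* difference, so small
    bumps get smoothed while sharp transitions (cliffs) survive."""
    out = []
    for i in range(len(h)):
        wsum = vsum = 0.0
        for j in range(max(0, i - radius), min(len(h), i + radius + 1)):
            w = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)          # spatial term
                         - ((h[i] - h[j]) ** 2) / (2 * sigma_r ** 2))  # range term
            wsum += w
            vsum += w * h[j]
        out.append(vsum / wsum)
    return out

# Small bumps around 0 get flattened; the step from 0 to 5 stays sharp.
height = [0.0, 0.1, -0.1, 0.1, 0.0, 5.0, 5.1, 4.9, 5.0]
out = bilateral_1d(height)
print([round(v, 2) for v in out])
```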
  15. Fast sin and cos functions

    BTW, a follow-up to my previous post. I totally omitted it, but you can obtain even more precision by employing the same divide-and-conquer technique on the seventh-degree Taylor approximation of sin(x) (since sin is odd, it's effectively an eighth-degree approximation) restricted to [-pi/2, pi/2]. Just factor the approximation into:
    (531.7555982144694 + x^2 * (x^2 - 32.521961561056756)) * (9.47803843894324 - x^2) * x / 5040
    Now perform all independent operations first; this includes computing x^2 very early and reusing it. Sadly, after substituting y = x^2 the quadratic term cannot be factored into real linear terms, but it's still quite efficient.
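    The factored form in code (with the sign arranged so it matches sin(x) on [-pi/2, pi/2]); the harness comparing it against math.sin is mine:

```python
import math

def fast_sin(x):
    """Factored 7th-degree Taylor approximation of sin(x) on [-pi/2, pi/2].
    Only a handful of multiplies once y = x*x is computed and reused."""
    y = x * x
    return (531.7555982144694 + y * (y - 32.521961561056756)) \
           * (9.47803843894324 - y) * x / 5040.0

# Worst-case error over [-pi/2, pi/2], sampled on a fine grid.
worst = max(abs(fast_sin(t) - math.sin(t))
            for t in (i / 1000 * math.pi / 2 for i in range(-1000, 1001)))
print(worst)   # on the order of 1e-4, attained near the interval ends
```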