/* Why you crying? */


Posted 20 April 2014 · 3,675 views
accidental noise library, perlin and 2 more...
My schedule lately can be summed up in two simple words: crushing exhaustion. A change of career (unfortunate, but necessary) has brought long shifts and a higher level of physical labor than I have been accustomed to lately, and the attendant emotional and physical exhaustion while my body re-adapts to a less sedentary lifestyle has sapped me of any will to work on anything even remotely complicated. Thus, GC has suffered, and suffered greatly.

I have been talking a bit with @evolutional lately, though, as he's been porting ANL to C#. I haven't really dipped into the ANL much lately. It's just a thing that I use, and I've learned to work around any issues that occur. But talking with Oli has kind of gotten me back to thinking about it, and especially about the aspects of it that bug the shit out of me.

ANL was initially deeply inspired by libnoise. I followed the model of chained objects that libnoise uses to construct complex noise expressions from simple building blocks. The idea is still sound, I think, but the execution is a little bit flawed. I'm not even remotely any kind of optimization master, not by a damned sight, but even I understand that arbitrarily-allocated objects linked together by pointers and invoked through virtual functions is probably not the most cache-friendly and performant scheme you could think of. Complex noise expressions can also often be optimized by smartly caching intermediate values, but implementing a cache as a module inserted into the chain (as ANL currently does), or building cache functionality directly into the modules themselves (as ANL has also done, and might still be doing; I don't remember), is not the best way to do it if you want to allow for multi-threaded execution. To support multi-threading, the caching needs to be performed external to the modules and local to the thread, so that threads don't step on each other's toes.

Some time back, I had started thinking about an alternative scheme. Instead of building a noise description as a pointer-linked network of objects, I instead visualized it as a basic vector of instructions. The initial idea was to condense the noise expression down to a linear vector of in-order operations that could be fed to a VM to generate the final value, but such a scheme would result in lots of redundant calculations, since it wouldn't take advantage of the early-out provided by, e.g., the Select module, which selects between one source or another depending on the value of a control source.

So lately, given that I can't really make myself concentrate on anything too technically challenging, I started an experimental branch of the ANL that encapsulates a noise expression as a vector of opcodes, or instructions, as I had visualized. Rather than evaluating the vector sequentially, however, I use vector-index cross-referencing (similar to how the objects in the existing model cross-reference each other via pointers) to evaluate a module's source branches before evaluating the module itself, taking advantage of Select's early-out branching where available.

A noise expression is a simple vector of opcodes, with optional data in the form of constant data (float or rgba) and source indexes referencing other opcodes in the vector as sources. The opcode vector is passed to an instance of a sort of virtual machine for evaluation. The VM instance maintains an internal cache vector for locally caching the values of opcodes as they are evaluated.

The switch has necessitated a change to how noise expressions are constructed. This has actually simplified the interface quite significantly. Rather than a whole slew of separate classes for each function type, I now have a single factory/builder class that provides an interface with methods for constructing opcodes or opcode chains--somewhat similar to the existing TreeBuilder interface. By calling methods on this factory class, opcodes are appended to an internal vector. Once the expression is built, then the final vector of opcodes can be passed to a VM for evaluation.

Additionally, I have eliminated the get(x,y), get(x,y,z), get(x,y,z,w) and get(x,y,z,w,u,v) variants that differed by dimensionality. In lieu of method overloads, a noise expression is now evaluated using a special Coordinate class that specifies the dimensionality of the coordinate. This simplifies the interface, and avoids having to provide different VMs for each coordinate dimensionality.

I have not finished fully implementing all of the functions in the base library yet, but I have done enough to sort of test the concept out, and it feels right to me. Here is a simple example (C++):

int main()
{
    CTestFactory factory;
    anl::TArray2D<ANLFloatType> img(256,256);

    unsigned int zero=factory.constant(0);
    unsigned int one=factory.constant(1);
    unsigned int negone=factory.constant(-1);
    unsigned int freq=factory.constant(2);

    unsigned int basis=factory.cellularBasis(negone,one,zero,zero,zero,zero,zero,zero,0,123845);
    unsigned int cell=factory.scaleX(factory.scaleY(basis,freq),freq);
    unsigned int fbm1=factory.simplefBm(anl::OP_GradientBasis, 3, 6, 2, 127345);
    unsigned int fbm2=factory.simplefBm(anl::OP_ValueBasis, 3, 6, 3, 546321);

    unsigned int p=factory.translateX(cell, fbm1);
    factory.translateZ(p, fbm2);

    anl::CTestNoiseVM vm(factory.getKernel());

    // Map the kernel function to an image and save as PNG
    for(int x=0; x<img.width(); ++x)
    {
        for(int y=0; y<img.height(); ++y)
        {
            // Evaluate the kernel at a normalized coordinate
            anl::CCoordinate coord(((ANLFloatType)x)/((ANLFloatType)img.width()), ((ANLFloatType)y)/((ANLFloatType)img.height()), 0);
            anl::SVMOutput out=vm.evaluate(coord);
            img.set(x, y, out.outfloat_);  // store the sample in the image
        }
    }

    anl::saveDoubleArray("out.png", &img);

    return 0;
}
To start with, a factory is created. The factory encapsulates a std::vector<SInstruction> noise expression kernel, and provides methods for adding opcodes to the kernel. An image is created for visualizing the output. Then various factory methods are called to build the expression vector: in this case, setting up some constants, creating a cellular noise module and a couple of simple fBm fractals, then using the fractals to perturb the cellular function in X and Z. After the kernel is built, a VM is instantiated from the instruction vector, then the image is iterated and each pixel colored based on the output of the VM at the specified coordinate.

Doing it this way seems to me like it will better facilitate cache coherency. All of the expression is encapsulated in a sequential vector of instructions that should easily fit entirely in the cache, rather than as widely-spread disparate object instances scattered throughout the heap and linked by pointers. All caching is done locally to the VM, so you could instance any number of VMs in separate threads from the same expression vector. (Ostensibly, anyway; I am not really very skilled or knowledgeable about multi-threading, so it's entirely possible I'm still making mistakes here.)

In the process of this experiment, I am also re-evaluating how certain tasks are accomplished. One area where I have simplified things is the Gradient function. A gradient function lets you specify a line segment as two N-dimensional points, and it maps any input coordinate to somewhere on the gradient described by the line passing through both points. Although it's a simple linear equation, the Gradient function has been the single largest source of PM, Facebook and email questions since the ANL became publicly available, especially since my articles on Minecraft- and Terraria-type world generation, which use the gradient function as the basis for ground/air delineation.

In the new branch, rather than providing a Gradient function and forcing the user to specify endpoints, I instead provide the much simpler methods x(), y(), etc. in the factory class, mapping to the opcodes OP_X, OP_Y, etc. These opcodes simply return the corresponding component of the input coordinate; i.e., if the VM is called to evaluate the point (0.5, 0.23, 0.76), then the opcode OP_X returns 0.5, the x coordinate. This provides an easy way to obtain a gradient aligned with one of the six axes. If the gradient needs to lie along a vector other than an axis, it can be rotated using a rotation opcode (only 2D and 3D rotations are planned, as with vanilla ANL, since 4D and 6D rotation is a bitch, and I honestly see no use cases for them). This hopefully simplifies the gradient functionality enough that I won't be constantly answering questions about it.

In a similar fashion, I have broken out the ScaleDomain and TranslateDomain functions into their component pieces (while still keeping the basic forms as well for now, though they might be going away to remove API clutter). That is, rather than having to specify all of your domain transformations similar to scaleDomain(src, xscale, yscale, zscale, wscale, uscale, vscale), which is horribly redundant if you only want to use, say, xscale and uscale, now you can specify them component-wise: scaleX(src, scale); scaleU(src, uscale);.

There is still a lot to do and test on this, and I'm not sure how long it's going to take given my work schedule. For now, though, I'm going to take these kids over to grandma's for Easter dinner. Happy Easter, everyone.