
Member Since 02 Feb 2013
Offline Last Active Aug 31 2014 11:28 AM

Posts I've Made

In Topic: Error loading S3TC textures (DXT3 and DXT5)

08 June 2014 - 10:08 PM

When I was messing with WebGL (to a fairly limited extent), I ran into a similar issue.


I addressed it mostly by disabling alpha on the canvas (via the alpha: false context-creation attribute): apparently the browser treats the canvas's alpha channel as alpha-blending with the page, so the page background shows through any "transparent" parts of the canvas.


dunno if this helps.

In Topic: what is most difficult in programming to you?

05 June 2014 - 09:15 AM

Starting on something new ... it's much easier to keep refining the previous project.


yes, this is a big one.


with the old project, generally nearly all the infrastructure is already in place, but with a new project, one is often pretty much back to the bare language, or is using a different language with less-familiar APIs (ex: Java, Dart, or Java via GWT, ..., vs my generally much preferring to use C).



likewise, with new design:

I have had much more success with designs which build on prior designs, than with things that are designed entirely new.


like, designs which already exist seem to often have a few advantages:

they are implementable, and generally free of "didn't take something into account, so it is unimplementable as-described, or otherwise ends up internally overly complex or inefficient", with fewer cases of "this feature seemed nice on paper but is nearly useless in practice";

often a lot of the details will already have been worked out (people will have fiddled with it, making sure it addresses various special cases, is relatively efficient, ...).


this doesn't necessarily say much for "elegance" or "perceived simplicity": often, the more elegant and simple-looking a thing is, the more dragons lurk in the implementation (hidden complexities and inefficiencies). oftentimes "things which work well" end up very ugly in a design-aesthetics sense (and sometimes with features which seem blatantly counter-intuitive, ...).


this doesn't mean one can't write new code or create new designs, but it does favor going the "embrace and extend" route with the design aspects, and relying more on iteration than on clean up-front design.

In Topic: what is most difficult in programming to you?

03 June 2014 - 11:27 PM

I seem to have a fair bit of difficulty with a few major things:

making UIs/artwork/... which don't suck;

doing stuff which lacks an obvious way forward (I'm pretty good at throwing together implementations of various specs, or cloning stuff, but things get a lot slower and more painful if I have to find the way forward for myself);

trying to do anything "from a clean slate" (vs hacking on or extending things);



some general things I seem to have difficulty with:

thinking "high-level": pretty much my entire world is "whatever I have on hand at the moment" (it gets frustrating when people expect me to think "higher level"; for me, these sorts of "high-level" thoughts simply don't really exist in the first place);

thinking of "the future" as something which actually exists (it always seems so distant, whereas whatever is going on at the moment, I can see as it takes place);

being expected to plan/research/... things in isolation, vs just going and "doing whatever" (and working out the specifics as they come up);



my way forward is usually just to pile up various things, options, and experiences, and see which is more usable/promising/works-better/..., with each thing seemingly opening the way to the next (if I do something myself, I get a better feel for what is involved and how everything works). if all else fails, there is always lots of fiddling with stuff.

In Topic: Memory allocator. Maintaining free list.

03 June 2014 - 11:40 AM

Something else that may help is if you trade space for speed, and allocate pre-sized blocks of ints, say 32 or 64, and keep all blocks the same size.  If objects request a number of blocks of memory and they handle them internally it becomes a simple problem.  It would take more size of course because there'd be unused numbers, but its a thought.


(if I understand correctly) this is basically what I had meant by a slab allocator.

a slab allocator can be fast, and can also reduce memory use by allowing multiple objects to share the same object-header (all of them implicitly have the same logical type and share other details).


for 4 byte items, you would allocate space for 256 items (1KiB), and then add them to a free-list of 4 byte items.

for 8 or 16 byte items, it would be similar, just allocating 2KiB or 4KiB instead.
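As a rough sketch (function name and layout are mine, just for illustration), carving a slab into fixed-size cells and threading them onto a free-list might look like:

```c
#include <stdlib.h>

/* hypothetical slab carve: allocate one slab of 'count' cells of
 * 'item_size' bytes each, and link every cell onto a free-list
 * (the first pointer-sized word of a free cell holds the link).
 * item_size must be at least sizeof(void *). */
static void *slab_carve(size_t item_size, size_t count)
{
    unsigned char *slab = malloc(item_size * count);
    void *head = NULL;
    size_t i;
    if (!slab)
        return NULL;
    for (i = count; i > 0; i--)
    {
        void *cell = slab + (i - 1) * item_size;
        *(void **)cell = head;   /* link this cell to the list so far */
        head = cell;
    }
    return head;   /* head of the new free-list */
}
```

freeing an item is then just pushing it back onto the same list.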


then, for a range of sizes, you would use a number of free-lists of fixed-size members, with a small function mapping the requested size to a specific free-list.
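A minimal sketch of such a mapping (assuming 16-byte cells and one free-list per cell count, which matches the item sizes above; the name is hypothetical):

```c
#include <stddef.h>

/* hypothetical mapping from a requested size to a free-list index,
 * assuming 16-byte granularity: sizes 1..16 share list 1,
 * 17..32 share list 2, and so on */
static int mm_sizelist(size_t size)
{
    return (int)((size + 15) >> 4);   /* round up to whole 16-byte cells */
}
```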


while this isn't perfect, allocations/frees can generally be made pretty fast (in the case where the free-lists aren't empty).


allocation in this case basically looks like:

    nb = (size + 15) >> 4;            /* map size to free-list index */
    if ((ptr = mm_freelist[nb]))
    {
        /* pop the head item off the free-list */
        mm_freelist[nb] = *(void **)ptr;
        return ptr;
    }
    ... fall back to general-case allocation ...


more speed is possible via a few tricks, namely having functions hard-coded for a specific allocation/free size, and using separate per-thread contexts (to avoid the need for locking during free-list allocations). for a general-case allocator, it is usually necessary to use a lock to keep threads from stomping on each other (and for operations which work with or update the heap).
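A sketch of the per-thread-context trick (all names hypothetical; a real version would also refill an empty list from a shared heap under a lock):

```c
#include <stddef.h>

#define MM_NUMLISTS 64

/* hypothetical per-thread context: each thread gets its own set of
 * free-lists, so this fast path needs no locking at all */
typedef struct { void *freelist[MM_NUMLISTS]; } MM_ThreadCtx;
static _Thread_local MM_ThreadCtx mm_tctx;

static void *mm_tls_alloc(size_t size)
{
    int nb = (int)((size + 15) >> 4);
    void *ptr = mm_tctx.freelist[nb];
    if (ptr)
        mm_tctx.freelist[nb] = *(void **)ptr;   /* lock-free pop */
    return ptr;   /* NULL: caller falls back to the locked slow path */
}
```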


another trick here is to use pre-made "allocation handles", where one first creates a handle-object describing the type of allocation (size, type, flags, ...), and then does allocations/frees using this handle. internally, the handle may contain function-pointers for the specific allocation/free logic, and some data members that may be used for special free-lists or other things.


these may or may not use slabs, but will typically use a pool-allocation strategy (where dedicated free-lists exist for specific types of object, ...).
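A sketch of what such a handle might look like (the layout and names are made up for illustration):

```c
#include <stddef.h>

/* hypothetical pre-made allocation handle: created once for a given
 * (size, type, flags) combination, then reused for every alloc/free */
typedef struct MM_AllocHandle_s MM_AllocHandle;
struct MM_AllocHandle_s
{
    size_t size;        /* the fixed allocation size for this handle */
    void *freelist;     /* dedicated free-list (pool) for this type */
    void *(*alloc)(MM_AllocHandle *h);           /* specialized logic */
    void (*free)(MM_AllocHandle *h, void *ptr);
};

static void *mm_pool_alloc(MM_AllocHandle *h)
{
    void *p = h->freelist;
    if (p)
        h->freelist = *(void **)p;   /* pop from the dedicated pool */
    return p;   /* NULL: caller refills the pool from the main heap */
}

static void mm_pool_free(MM_AllocHandle *h, void *p)
{
    *(void **)p = h->freelist;       /* push back onto the pool */
    h->freelist = p;
}
```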



this may not necessarily apply to "general" code, but in my case code sometimes pops up which requires high-speed allocation/freeing of typically-small objects (often with specific type-information and special semantics/behaviors).


likewise, these large numbers of small objects comprise a big part of the heap (there are often millions of them in dumped memory-stats, 16-64 byte objects comprising the vast majority of these), vs the relatively small number of significantly-larger allocations (thousands of kB range objects, and tens of MB-range objects).



keeping the object-header small is generally also a goal.


currently, my MM/GC uses a 16-byte header, which basically holds:

    the object type-ID (objects in my main MM each have a qualified type, externally represented as a string giving a type-name);

    the object allocation size;

    data about where the allocation call was made (useful for debugging);

    reference counts (for ref-counted data);

    some flags and other things;

    a check-value (used to help detect whether the header was corrupted, say by an overrun, *).


a lot of this is basically bit-packed, with a few special header sub-variants (such as for objects currently in the free-list, ...).
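For illustration, a bit-packed 16-byte header along these lines might be laid out something like this (the field widths are my guesses, not the actual layout):

```c
#include <stdint.h>

/* hypothetical bit-packed 16-byte object header; field widths are
 * illustrative guesses, not any real MM's actual layout */
typedef struct
{
    uint64_t typeid : 24;   /* object type-ID (index into a type table) */
    uint64_t size   : 36;   /* allocation size (allows large objects) */
    uint64_t flags0 : 4;    /* header sub-variant, ... */
    uint64_t srcid  : 24;   /* where the allocation call was made */
    uint64_t refcnt : 16;   /* reference count (for ref-counted data) */
    uint64_t flags1 : 12;   /* misc flags */
    uint64_t chk    : 12;   /* check value (corruption detection) */
} MM_ObjHeader;
```

each 64-bit word's fields sum to exactly 64 bits, so common compilers pack this into 16 bytes.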


it was originally an 8-byte header, but was expanded to 16 bytes, partly because:

I wanted to support larger objects (the same header is used by large-objects as well, and would otherwise limit allocations in 64-bit apps to 256MB);

needed space for additional metadata;

many objects require 16-byte alignment anyways (for SSE/etc);



*: detecting corrupt headers is useful for things like heap forensics: the MM may detect that a header is corrupt, dump diagnostic information, try to figure out which memory object overflowed or underflowed, and make a guess as to where in the codebase to start looking for the offending code, in addition to raising an exception. also, when an object is freed, this helps the MM detect whether the object overran its bounds, without requiring additional memory for extra check-values.


though, with, say, 12 bits for a check-value, there is a 1/4096 chance for a trashed header to escape notice. this can be reduced somewhat by additional "sanity checks" (for example: does the size in the header fall within what could fit in the given memory-cells, and is it appropriate for this part of the heap? are the object type-ID and source-ID valid?), but these need to be balanced against performance (leaving them more as something for heap-analysis).
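One possible shape for such a 12-bit check value (the mixing function here is arbitrary, purely for illustration):

```c
#include <stdint.h>

/* hypothetical 12-bit check value: fold the other header bits down
 * to 12 bits; a randomly-trashed header then passes the check with
 * only ~1/4096 probability */
static uint32_t mm_hdr_check(uint64_t w0, uint64_t w1)
{
    uint64_t v = w0 ^ (w1 * 0x9E3779B97F4A7C15ULL);   /* mix both words */
    v ^= v >> 32;
    v ^= v >> 16;
    v ^= v >> 12;
    return (uint32_t)(v & 0xFFF);   /* keep the low 12 bits */
}
```

the value is recomputed on free (or during heap-analysis) and compared against the stored field.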

In Topic: Memory allocator. Maintaining free list.

02 June 2014 - 07:24 AM


for allocating small objects (generally under 1-4kB), I prefer to use bitmaps over fixed-size cells instead of linked-lists.


pros (vs a linked-list allocator):

they tend to have a smaller per-object overhead (no internal pointers needed, but a constant per-space overhead for the bitmaps);

they will implicitly merge adjacent free memory once memory is freed;

object lookup from arbitrary pointers can be made very fast (good for things like dynamic type-checking and similar, which can be made O(1));

basic logic is fairly simple;

easier to add logic to detect/analyze overruns (they will corrupt object data, but will not necessarily cause critical damage to heap metadata, which may be stored elsewhere in memory).



cons:

direct allocation/freeing becomes linearly slower with increasing object size (need to scan/update bitmaps);

high overheads for larger objects (due to using lots of bitmap entries / cells);

more limited "effective range" of object sizes (vs being similarly effective over a wide range of sizes, as is the case for lists).
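A minimal sketch of the basic bitmap mechanics (one bit per 16-byte cell; a real version would also track a rover position, per-size hints, etc. — all names here are made up):

```c
#include <stddef.h>
#include <stdint.h>

#define NCELLS 1024

static uint8_t mm_bitmap[NCELLS / 8];       /* 1 bit per cell */
static unsigned char mm_cells[NCELLS * 16]; /* the managed memory */

static int  bit_get(int i) { return (mm_bitmap[i >> 3] >> (i & 7)) & 1; }
static void bit_set(int i) { mm_bitmap[i >> 3] |= (uint8_t)(1 << (i & 7)); }
static void bit_clr(int i) { mm_bitmap[i >> 3] &= (uint8_t)~(1 << (i & 7)); }

/* scan for a run of clear bits long enough for the request */
static void *bm_alloc(size_t size)
{
    int n = (int)((size + 15) >> 4);        /* cells needed */
    for (int i = 0; i + n <= NCELLS; i++)
    {
        int j;
        for (j = 0; j < n && !bit_get(i + j); j++)
            ;
        if (j == n)                          /* found a free run */
        {
            for (j = 0; j < n; j++)
                bit_set(i + j);
            return mm_cells + i * 16;
        }
        i += j;   /* skip to the used cell; loop ++ moves past it */
    }
    return NULL;
}

/* clearing the bits is all that freeing takes: adjacent free
 * memory "merges" implicitly */
static void bm_free(void *p, size_t size)
{
    int i = (int)(((unsigned char *)p - mm_cells) >> 4);
    int n = (int)((size + 15) >> 4);
    for (int j = 0; j < n; j++)
        bit_clr(i + j);
}
```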


partial solutions:

use a pool allocator to speed up allocation/freeing (the pool allocator then serves as the primary allocator, and uses the bitmap allocator for backing, where the pool-allocator has free-lists of fixed-size items);

use several different strategies, picking the one most appropriate for the given object size.


for example, in my current main allocator, it works sort of like this (16 byte cells):

* under 16 bytes: use a slab-based allocator (say, allocate memory for items in groups of 256 and pad them to a power-of-2 size);

* 16-6143 bytes: use bitmap allocator (with a pool allocator to make it faster);

* 6kB+: use a "large object" heap, which manages memory objects mostly via sorted arrays of pointers and object headers (mostly using binary lookup and binary insert).
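The tiering itself is then just a size dispatch; as a sketch (returning which tier would handle the request, rather than actually allocating — the function name is hypothetical):

```c
#include <stddef.h>

/* hypothetical top-level dispatch over the size tiers described
 * above: 0 = slab allocator, 1 = bitmap/pool allocator,
 * 2 = "large object" heap */
static int mm_alloc_tier(size_t size)
{
    if (size < 16)
        return 0;      /* slab-based allocator, power-of-2 pad */
    if (size < 6144)
        return 1;      /* bitmap allocator (with pool in front) */
    return 2;          /* sorted-array large-object heap */
}
```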


I had at one point considered adding an additional layer of 256-byte cells to be used for 4kB to 256kB objects, but never did so.



though, yes, a linked list works. a few ideas:

pad object sizes up to multiples of a certain value (ex: 8 or 16 bytes), which will help reduce fragmentation;

use free-lists of fixed allocation sizes (avoids the need for searching), or size-ranges (reduces searching);

only merge with the next item on free, and/or have a special case to walk the heap and merge adjacent free items as-needed.
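The "merge with next on free" idea might look roughly like this (a toy sketch; real heaps would typically use address-ordered links or boundary tags, and the names here are made up):

```c
#include <stddef.h>

/* hypothetical list-allocator block header; 'next' points at the
 * block immediately following in address order */
typedef struct FreeBlk_s FreeBlk;
struct FreeBlk_s
{
    size_t size;      /* size of this block */
    int is_free;
    FreeBlk *next;    /* next block in address order */
};

/* on free: if the block immediately after is also free, absorb it
 * (merge-with-next), avoiding a full heap walk on every free */
static void ll_free(FreeBlk *blk)
{
    blk->is_free = 1;
    if (blk->next && blk->next->is_free)
    {
        blk->size += blk->next->size;   /* absorb the next block */
        blk->next = blk->next->next;
    }
}
```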