StakFallT

Question about dlmalloc


Quick question: is anyone familiar with Doug Lea's dlmalloc able to tell me what the variable 'nb' stands for? I know it is declared as size_t, but what does the name mean? I'm trying to understand the code, but I can't follow its logic while reading through it because I don't know what nb represents. So far my theories are:

new bin
node bin
new bytes
node bytes
normal bin
nBytes (I don't think this is it though, as I've seen that declared elsewhere.)

As you can see the possibilities are endless, and I'm just going around and around trying to figure it out.

Edit (made before submission): I tried to keep this short (it would have ended after that last sentence), but, trying to anticipate, I think that if I keep details out, people are going to read this and be like "Why would he think the name would be anything other than its data type?"; so unfortunately I think elaboration is needed... *sigh* Maybe one of these times I'll have a short posting that I don't think will lead to questions about my question :(

[b][u]The implicit assumption being made:[/u][/b]
Sometimes it's pointless to name a variable after its type. Every variable has to be declared with a type, and that type can be found either by looking up the declaration or, easier yet, by hovering over the variable in many modern IDEs, which makes the name useless if it is just some form of its datatype. Given this, one can maximize the information readily available by naming the variable for the [i]slightly[/i] bigger picture. By taking advantage of this fact ([i]that each variable has to be typed and IDEs can tell you the type easily[/i]) and naming the variable something [i]along the lines[/i] of what object is going to be using it or referring to it, you can squeeze a lot of discernible information into a relatively small locality. It only really leads to ambiguity in the logic when most of the variables in nearby code use this method of "meta-naming" (I don't know of a better term for the idea of naming by use rather than naming by datatype).

[u][b]A concrete example of why the implicit assumption [i]can[/i] make sense:[/b][/u]
I understand that, on the surface, new bin (or even node bin) might not make sense, as nb is declared as a size -- i.e. why call it anything but its datatype in this sense... I'll give a more concrete example of what I'm trying to discern. Let's assume (for the moment) that nb stands for new bin (despite nb being declared as a size datatype). That tells me a great deal actually -- OK, so nb is of a size type, but knowing it's a new bin, I can look at it and tell not only that its immediate use is to hold a size value but that what I'm looking at is probably code that is doing some preliminary work prior to actually creating the new bin (maybe to check what the size of the new bin will be; sort of like a throw-away variable, but one that may actually get used, so it's not completely a waste).

I understand that this may be considered a bad naming scheme as it could lead to ambiguity, but in some ways I think it actually makes more sense (properly used, that is): a data type name doesn't tell you what the variable will be used with, only what kind of data it holds. Also, consider the fact that the code is over 5k lines and this two-letter variable is used about (literally) 100 times; when someone is reading the code, a two-letter variable name doesn't cut it.

I can look at the small blocks of code and follow their immediate purpose, obviously, but without an understanding of the middle-sized picture, looking at the small blocks of code doesn't help. I'm trying to connect the pieces in my head. I have found some PDFs describing the overall picture of the library itself, which I think I'm beyond, since the most useful information I got out of them is a hierarchy chart visually depicting what terminology goes where (chunks, bins, mspaces, etc.). I'm at that middle-sized step to understanding the code: essentially, seeing how each of the functions fits together. Right now, I can't figure out what the functions are doing until I know more about nb, more specifically, what it stands for ([i]as silly as that may seem[/i]). As for any questions that may come after knowing the answer, I'll cross that bridge when I get there.

Really sorry about another long post. I start out short and then foresee potential questions that some may have, so I start to fill in the details in an attempt to wrap up my threads quickly (as back-and-forths cause bumps over other threads). Thanks in advance though!

According to a quick Google search:

[quote]
When a user requests req bytes of dynamic memory (via malloc(3) or realloc(3) for example), dlmalloc first calls request2size() in order to convert req to a usable size [b]nb[/b] (the effective size of the allocated chunk of memory, including overhead). The request2size() macro could just add 8 bytes (the size of the prev_size and size fields stored in front of the allocated chunk) to req and therefore look like this:

[i]#define request2size( req, nb ) \[/i]
[i] ( nb = (req) + SIZE_SZ + SIZE_SZ )[/i]
[/quote]
Source: [url="http://www.phrack.org/archives/57/p57_0x08_Vudo%20malloc%20tricks_by_MaXX.txt"]http://www.phrack.or...cks_by_MaXX.txt[/url]

Looks like "number of [actual] bytes" or some variant thereof.
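
To make the quoted description concrete, here is a minimal C sketch of that conversion. It is not dlmalloc's actual macro: the sketch_ names, the 4-byte SIZE_SZ (chosen to match the 8 bytes of overhead in the quote) and the round-up to a multiple of 8 are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants, not taken from any real dlmalloc build. */
#define SKETCH_SIZE_SZ    ((size_t)4)  /* size of one bookkeeping field */
#define SKETCH_ALIGN_MASK ((size_t)7)  /* round sizes up to a multiple of 8 */

/* Convert a user request (req) into nb, the chunk size including overhead. */
size_t sketch_request2size(size_t req)
{
    size_t overhead = SKETCH_SIZE_SZ + SKETCH_SIZE_SZ;

    /* Guard against wrap-around for absurdly large requests. */
    assert(req <= SIZE_MAX - overhead - SKETCH_ALIGN_MASK);

    size_t nb = req + overhead;  /* the naive form from the quote */

    /* Assumed extra step: round nb up to the alignment boundary. */
    return (nb + SKETCH_ALIGN_MASK) & ~SKETCH_ALIGN_MASK;
}
```

Under these assumptions a 1-byte request gives nb = 16 and a 9-byte request gives nb = 24, so nb is always at least as large as what the caller asked for.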

Yeah, that was one of the documents I found. I think the problem was that I was doing a search on "nb" inside each document I found, and was coming back with over 100 locations. So going through each one, any time I hit a paragraph I would look at the words just before and after "nb" and see if it seemed like it was getting to the point -- if it wasn't, I chalked it up to not being a definition. If I had read the first part of that paragraph I -definitely- would have moved on: "When a user requests req bytes" is all I would have read before hitting the find next button. I think for this particular paragraph I saw "nb (the effective size of the allocated chunk", judged it close to what I was looking for but not enough, and moved on, completely missing the part about request2size being used to convert a requested amount to nb.

There is still some ambiguity though. By the sounds of it, nb is computed before any chunks are referenced and is then used to create the chunk; i.e. it is NOT a variable that holds data that is pulled [i]out of the chunk after[/i] it has been created, rather it is used to create the chunk in the first place. Put another way, it's only used in one direction: it's not used for BOTH creation and later reading, only for creation. So it exists before the chunk does. Is that how you read it?

[u]What I've pieced together so far, from what you referred me to and what I have found:[/u]
I think it's starting to come together a little for me. By the sounds of it (combining this with page 2 of the PDF at [url="http://pubs.doc.ic.ac.uk/GCspy/GCspy.pdf"]http://pubs.doc.ic.a...GCspy/GCspy.pdf[/url]), chunks are the base unit and are sorted into small bins of varying sizes (under the small bin container) and treebins (under the treebins container -- treebins hold chunks larger than 256 bytes, keyed by their size in binary form). The chunks are created by dlmalloc, which, when called, first has to call request2size() to get the "actualized" size. This size ([i]called nb[/i]) is the actual size passed to the allocation function, which is some relay function for either mmap or mmap emulation (emulation on Win32, which boils down to the use of VirtualAlloc). MSpaces are individualized memory allocations that have their own allocators.
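
The small bin versus treebin split described above can be sketched in a few lines of C. This is only an illustration: the sketch_ names are hypothetical, and the 8-byte spacing and the count of 32 small bins are assumptions chosen so that the boundary lands at the 256 bytes mentioned above; check your own copy of the source for the real values.

```c
#include <stddef.h>

/* Assumed values for illustration: 32 small bins spaced 8 bytes apart,
   so sizes below 32 * 8 = 256 bytes count as "small". */
#define SKETCH_SMALLBIN_SHIFT 3U
#define SKETCH_NSMALLBINS     32U

/* Returns 1 if a chunk of size nb would go in a small bin, 0 if it
   would go in a treebin instead. */
int sketch_is_small(size_t nb)
{
    return (nb >> SKETCH_SMALLBIN_SHIFT) < SKETCH_NSMALLBINS;
}

/* Index of the small bin for nb; only meaningful when
   sketch_is_small(nb) is true. */
size_t sketch_small_index(size_t nb)
{
    return nb >> SKETCH_SMALLBIN_SHIFT;
}
```

With these assumed values, an nb of 16 maps to small bin 2, an nb of 248 maps to small bin 31, and an nb of 256 is no longer small and would be handled by the treebins.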

[u]What's still left for me:[/u]
Now two things remain:
1) What does nb stand for?
2) How does the idea of one large chunk that is broken up into smaller pieces square with all of this, when some functions deal with contiguous memory and some deal with non-contiguous memory? That choice does not appear to be up to the user; it seems to depend on which specific functions are used, which in turn depends on the extensive preprocessor macros that are set up. (This is the one I'm truly stumped on.) I found some example code showing how to use dlmalloc; apparently it's just a call to dlmalloc and dlfree, so I guess I'll start my work there and trace through the function calls.
