Sorting multi dimensional structures

Started by
13 comments, last by glacai 20 years, 9 months ago
Hi Charles,

quote:BTW ... something else. Don''t believe all what your politicians or papers said about France. They are all liars, in France too. Chirac=Blair=Bush=liars. I still enjoy visiting England when I can and don''t care much about considerations for these intellectual crooks.

Personally I like France too and where you come from certainly isn''t a consideration to me, you took time and effort to help which I appreciate very much.

Anyway I''ve got it going and it seems to be working well, in a similar way to what you suggested (I think). I don''t need to order by id at any point, I was thinking of using id as an index into an array but the id could be upto 32 bit number, I know hashing and it''s been used extensively in other parts of the program so was trying to avoid anymore.

As the max is only 5x500 at the moment I decided to order the struct as read from the file, inserting into a doubly linked list into position, if id already exists update a hit for each id match and combine the weight, reshuffle the list if updated weighs higher than previous. It all seems to be working in an instant so that will do me for now I''ve still got to implement the positioning though.

btw, MAXSTRUCT was an example not what I use, thought it would make it easier to explain its not a set size.

quote:Don''t worry I was really messy in the past too

I hope its not too messy, its a small section out of 10000 odd lines!!

Thanks for your time, regards.
Advertisement
K I think your implementation is the normal way of doing it and theorically normally efficient. But when it comes to optimize I try to remove such things as double linked lists that give poor cache, data bursting and adressing anticipation (AGIs). That's why I gave this tedious solution as your question used the word quickest. Maybe you meant quicker to write not fastest code. Still you are right if the data size remains as you say (500*10 structs), on modern PCs your solution will be surely 10000 times good enough.

One of my deafults as a coder is I am so trained to optimize code in terms of CPU (3D stuff) that I always tend to propose more complex but more optimized solutions even for the first version. Rare exceptions is when it does not pay the candel at all and adds too much work for nothing (rare in most 3D stuff). Sometimes it's a bit obscure for ppl that work with me. But as I don't loose much time doing that way ... I like to have always fast running code even for prototypes and try to comment and insert a pseudo code more straightforward to understand.

Just a remark I used a gallicism (a french distortion in english), that's where nationality counts

vulgar (latin origin) is vulgaire in french which means both vulgar and popular. I meant popular words of course.


[edited by - Charles B on July 5, 2003 2:02:24 PM]
"Coding math tricks in asm is more fun than Java"
quote:Maybe you meant quicker to write not fastest code.

I did mean fastest code not quicker to write, I''d be interested in whatever you think is the fastest most efficient however complicated.

quote:One of my deafults as a coder is I am so trained to optimize code in terms of CPU (3D stuff)

Thats why I considered this site first as the best place to ask, code optimization is second nature to game programmers.

Maybe I''m not up to your standards, but its fun trying
K I'll try to think more deeply about your DLList algo and mentally examine the drawbacks in terms of optimality. I will reedit and fill this mail later. (some work to do right now.)

But first I'd like to know if I guessed your method well :

Your sort is a single pass insert sort whith four steps in the main loop. :

- find the last element appended with the "same-id"
- continue till you find the right position for the struct in the "same id" sub list.
- recombine the weights (increment). Do you update the combined weight in all the structs with same id ? Or did you find a trick to update only the first struct, like a list of all sublist heads ?
- check if the count of structs or combined weights maintain the "same-id" sub-list in position or translate it all closer to the head of the current complete list.

Am I right ?

Then it's grossly a n square algo (insert sort) and even a bit worse.

[edited by - Charles B on July 6, 2003 10:30:09 AM]
"Coding math tricks in asm is more fun than Java"
Thanks Charles,

Exact struct I'm working with:

struct DATA_BANKS {
bool used;
DOCID docid; // unsigned char[4]
unsigned short int cnt; // cnt of positions, maybe higher than MAXPOSTIONS
unsigned short int pos[MAXPOSITIONS];
float weight;
} bank[MAXWORDS][MAXBANKS]; // 10 x 500 (I may up it)

As the data is read from a file into a data bank (seperare file for each word), file contains:
docid weight count pos pos pos pos docid weight count pos pos ... ...
count being how many positions there is for that docid, its inserted into:

typedef struct TOP_DATA {
DOCID docid;
int hits; // incemented each time docid is found, max MAXWORDS
int bank[MAXWORDS][2]; // holds which bank is from, for testing information
float weight; // f - total weight of docid matched words
float average; // f - average of total weight, not used yet
struct TOP_DATA *prev;
struct TOP_DATA *next;
} TDATA, *pTDATA;

- First I see if the docid is already listed (done badly, just traverses the list looking)
- If docid matched, hit and weight is added, bank info set, then I check if hit or weight is higher than previous, if it is swap, check previous, repeat (just thought thats really bad as well, I need to move to position and swap don't know what I was thinking!)
- If docid not matched, traverse from head to hits/weight > next, insert docid, hit, weight etc.
- This is where I will (haven't yet) implement the positioning sort, looking for the closest bank[1, 2, 3] == pos 1, 2, 3
- Then I got a function that returns the head, freeing head and moves to next, ready to return next highest, which is perfect as only a set amount of results are needed at a time.

Thanks Charles, don't worry if you're busy!

[edited by - glacai on July 6, 2003 11:47:51 AM]

[edited by - glacai on July 6, 2003 11:48:50 AM]

This topic is closed to new replies.

Advertisement