
"Correct" Music Note Structure?

9 posts in this topic

I am programming a basic MIDI sequencing tool. I am starting off with the core program and internal music representation, then moving to the UI, and finally to interfacing with the audio output API (I don't even know which API I'm going to use yet, but something MIDI-related). I have written a first draft of my internal representation of a musical note and of its channel. I am asking for advice on both my note structure and on which API to use for MIDI audio development (real-time efficiency must also be considered). Here is my note structure as of now:
[source lang="cpp"]struct note {
char midiVal;
/* MIDI value, or note value */
char timeStmp [3];
/* Time value, sometimes called timestamp; basically when the note will play on the timeline.
It is organized as mm:ss:ms (minutes, seconds, milliseconds). As you can guess,
there is an inherent limit of 99 minutes per song. */
char channel;
/* The software channel. There are 256 channels (0-255). Channels organize groups of similar notes, like a
"nation of notes" if you will. Channels can be manipulated as one whole unit or as individual notes. */
float physChannel;
/* The actual physical channel, of no relation to the previous data. It is, of course, left and right audio streams, and each note will
contain data for how much it will blend on either side of the audio stream. A positive number is right stream, negative is left. */
char duration [3];
/* The duration of the note based on the same timing conventions used earlier for timestamp data. Used to specify the duration,
and yet again, the limit for a single note's duration is 99 minutes. pah. */
char instrument;
/* Specifies, out of a bank of a mere 256 possibilities, which instrument a note belongs to. */
char volume;
/* Dictates what volume a note will be, based on a scale of 0-255: zero being mute, and 255 being
the mortal enemy of your grandmother. */
ULINT name;
/* Not really a name, more like a serial number. Anyway, this puts a limit on possible notes as well (do you like my limits?).
Now, there is a ceiling of only 4,294,967,296 notes in a piece, and that piece can be 99 minutes long. */
};[/source]
The code comments can be a bit basic at times, but they were written so that someone with no clue of what I am doing could pick up at least a little of what I was talking about and recognize similarities with other APIs. I didn't mean to offend anyone with dumb humor or elementary explanations of basic subjects.

EDIT: "ULINT" is an unsigned long integer, fyi.
There are several things I would do differently.

For starters, things will be much easier if you make your timestamps a more reasonable type. Say, a float expressing seconds from the beginning of the song, or perhaps an integer number of milliseconds. Similarly for the duration.
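To make the integer option concrete, here is a small sketch (the helper name is my own invention) that folds a mm:ss:ms stamp into one millisecond count:

```cpp
#include <cstdint>

// Hypothetical helper: collapse minutes/seconds/milliseconds into a
// single millisecond count from the start of the song.
inline std::uint32_t toMilliseconds(unsigned minutes, unsigned seconds,
                                    unsigned millis) {
    return (minutes * 60u + seconds) * 1000u + millis;
}
```

A single integer compares, sorts, and subtracts trivially, and it removes the 99-minute ceiling imposed by the two-digit fields.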

I would make the note value a float, to allow for [url="http://en.wikipedia.org/wiki/Microtonal_music"]microtonality[/url].

I don't know if there is any reason why a note should know what channel it belongs to. Presumably there will be a struct "channel" which will contain the notes in that channel, and you don't need to repeat the information of what the channel is inside each note. It is possible that you have a good reason to do that, but I don't know what it is.

'physChannel' is not very descriptive. It is known as "pan" in MIDI, so perhaps you should consider changing its name to something like that.

I would probably have used a 16-bit integer for the instrument. I know there are synthesizers with more than 256 instruments, and it's probably not worth saving the extra byte, given that people have already bumped into this limit at some point.

Similarly to the channel, I don't think I would make the identifier a part of the note. If some container of notes wants to locate them using an identifier, that's great, but I think that should be part of the container, not the contained type. (In C++ I would perhaps store the notes in an object of type std::map<unsigned, note>, where the unsigned integers are the identifiers.)

[code]struct note {
float pitch; // in semitones, with 60 being C4, like in MIDI (a member can't share the struct's name in C++)
float timestamp; // seconds from the start of the song
float duration; // seconds
short instrument;
char volume;
char pan;
};[/code]
Oh, one more thing. You should reorder the elements in your structs from largest to smallest, or you may end up with a bunch of padding (unused bytes to guarantee proper alignment of some types) that might make your struct larger than it needs to be.
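For example (the field choices are illustrative, and the sizes assume a typical 32/64-bit ABI):

```cpp
#include <cstdint>

// Small members first forces the compiler to insert padding before
// each wider, more strictly aligned member.
struct NotePadded {
    std::uint8_t  volume;     // 1 byte + 3 bytes padding
    float         pan;        // 4 bytes
    std::uint8_t  key;        // 1 byte + 1 byte padding
    std::uint16_t instrument; // 2 bytes
};                            // typically 12 bytes

// Largest to smallest: no interior padding is needed.
struct NoteOrdered {
    float         pan;        // 4 bytes
    std::uint16_t instrument; // 2 bytes
    std::uint8_t  volume;     // 1 byte
    std::uint8_t  key;        // 1 byte
};                            // typically 8 bytes
```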
Thank you, Álvaro, for your good advice and information, it is very much appreciated. By the way, do you have any suggestions on which API I should use?
I have worked with MIDI a ton and long ago I made a tool that generated valid MIDI files (the tool’s goal was to algorithmically generate music—it worked but the music it generated sucked ass).

Álvaro is correct about everything but the time. The times/durations must be stored at a resolution no coarser than microseconds. They should be stored in a ULINT.
Today’s software has resolutions of up to around 960 PPQN (possibly more) at 250 BPM, giving you a resolution as fine as 250 microseconds between ticks.

The variable-length time stamps inside the MIDI files store the number of ticks between each successive event. For efficient run-time performance you should convert all of these event time stamps into literal times, which is why you need to store raw microsecond values. Internally you will still need to maintain this tick-style format so that you can add/remove events reliably and change the tempo, etc., without losing precision, but before playing a song you should make a quick prepass to convert all those ticks into absolute times.
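A sketch of that prepass conversion (the function name is my own, and a fixed tempo is assumed for brevity; MIDI stores tempo as microseconds per quarter note):

```cpp
#include <cstdint>

// Convert a tick offset into an absolute time in microseconds.
// usPerQuarter is the MIDI tempo (microseconds per quarter note);
// ppqn is the file's ticks-per-quarter-note resolution.
inline std::uint64_t ticksToMicroseconds(std::uint64_t ticks,
                                         std::uint32_t usPerQuarter,
                                         std::uint32_t ppqn) {
    return ticks * usPerQuarter / ppqn;
}
```

At 250 BPM (240,000 µs per quarter note) and 960 PPQN, one tick works out to 250 µs, which is why a microsecond-resolution ULINT is comfortable.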

Volume only ranges from 0 to 127, by the way. Most MIDI data values do.

For anything you think will be in the range from 0-255, use an [color=#000080]unsigned char[/color], not a [color=#000080]char[/color]. No reason to bite yourself in the ass with a useless sign bit, especially when shifting things.

For the instrument patch you are storing far too little information. My Yamaha MOTIF XF8 has 1,353 voices, and this is fairly common these days.
You need to look into the MSB/LSB system for selecting banks and patches.

L. Spiro
[quote name='L. Spiro' timestamp='1355554503' post='5010863']
Álvaro is correct about everything but the time. The times/durations must be stored at a resolution no coarser than microseconds. They should be stored in a ULINT.
Today’s software has resolutions of up to around 960 PPQN (possibly more) at 250 BPM, giving you a resolution as fine as 250 microseconds between ticks.[/quote]

That makes sense. I picked milliseconds because that's what he was using, according to his comments (although he was trying to fit the milliseconds field in a single byte...).

[quote]For anything you think will be in the range from 0-255, use an [color=#000080]unsigned char[/color], not a [color=#000080]char[/color]. No reason to bite yourself in the ass with a useless sign bit, especially when shifting things.[/quote]

Oh, yes. This is important, and using `char' by itself is even worse than that: whether `char' is signed or not depends on the compiler, so you should definitely say explicitly which one you want. Sorry I missed that.
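A minimal sketch of why the explicit unsigned type matters (the helper names are mine):

```cpp
#include <cstdint>

// Plain char may be signed, so assigning a value like 200 to it can
// silently come back negative. For 0-255 data, spell out unsigned char
// (or std::uint8_t) so values and shifts behave predictably.
inline bool isAudible(unsigned char volume) {
    return volume > 0;   // 200 stays 200; no surprise sign bit
}

inline unsigned halfVolume(std::uint8_t volume) {
    return volume >> 1;  // right shift is well-defined on unsigned types
}
```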
Bit of a brainfart here, but hopefully something useful:

As L. Spiro says, don't store your times as floats or something like that; store them as ticks.

Sequencers commonly work on a scale of PPQN (pulses per quarter note), so if you adjust tempo (one-off or gradually throughout a song) it just *works*. The PPQN values are usually things like 48, 96, 192, etc.

Bear in mind that if you are doing 4/4 music that's all good, but if you are using triplets, or groove, you'll want the PPQN divisible by 3, and with enough precision for your 'groove'.

You'll also probably want to store your note timings as offsets from the start of a pattern, rather than the start of the song. This way you can place several instances of the same pattern at different parts of the song.

Also, instead of storing things as e.g. char[3] to save space, it's probably more sensible just to make them 4-byte unsigned ints / ints and keep your structures 4-byte aligned so you (or the processor) aren't faffing about for no reason. You can always compress them on import / export, if you really need to.

Another reason for PPQN is so you easily change the output sample rate (assuming you are going to do some audio instead of purely MIDI).

I've done several audio / sequencing apps and don't think I stored anything as floats. PPQN can be used to calculate the exact sample for an instrument to start / end (and you might precache this kind of info). You could possibly use something more accurate to get within sample accuracy for timing, but I've never bothered myself.
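The tick-to-sample calculation can be sketched like this (my own function name; a fixed tempo is assumed):

```cpp
#include <cstdint>

// Convert a PPQN tick offset to an exact sample index at a given
// output rate. 64-bit intermediates avoid overflow, and integer
// arithmetic keeps the result deterministic.
inline std::uint64_t ticksToSample(std::uint64_t ticks,
                                   std::uint32_t bpm,
                                   std::uint32_t ppqn,
                                   std::uint32_t sampleRate) {
    // seconds = ticks * 60 / (bpm * ppqn); samples = seconds * rate
    return ticks * 60ull * sampleRate / (std::uint64_t(bpm) * ppqn);
}
```

At 120 BPM with 96 PPQN and 44.1 kHz output, one quarter note (96 ticks) lands exactly on sample 22,050.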

It's really worth using a plugin architecture for different components of a sequencing / audio app; I'd highly recommend it. You can make effect (reverb, delay, chorus, etc.) plugins and instrument plugins. You could potentially also use VST plugins or similar if you can work out their interface (you may find some open source apps that have managed this).

I'm currently rewriting a sequencer / audio app I wrote a few years ago, and have actually moved to using plugins for things like quantization / legato / groove / arpeggios. Have a think about whether you want to be able to do stuff like 'undo' quantization, keep original values, or have a modification 'stack' applied to notes.

I don't think you'll get the exact structures bang on first time, it's the kind of thing you write a first version, then realise there's a better way of doing it, redo it, etc etc. But it is fairly easy to get something usable. You may also spend as much time on user interface / editing features as the stuff 'under the hood'.

As for APIs, I have so far cheated and don't actually use MIDI input or output (although I have done that in the distant past, and it wasn't that difficult I don't think). I have just been writing a MIDI file importer though, so I'm refreshing my memory.

If you want realtime MIDI input you'll have to pay much more attention to latency and the APIs you use. I was just getting by with the old Win32 audio functions for primary / secondary buffers, but the latency is awful, so using DirectSound (or I think there may be a newer API in Windows 7) would be better. Sorry, I can't help yet on that as I haven't researched it myself.

Also I'd add: consider using Direct3D or (in my case) OpenGL to accelerate the graphics side. This way you can easily show the position within a song without overloading the CPU, causing stalls, and having your audio stutter.

Once you start doing the audio side a bit of SSE / SIMD stuff helps. And you have to think carefully about how you'll structure your tracks / sends to effects, to make it efficient but also customizable.
More stuff:

Note pitch: I'd stick with just a note number like MIDI for now, and the 12-note western scale. 99% of music is written like this, and handling other systems is a bit more advanced and something you can tack on later. I wouldn't recommend storing notes as float frequencies, for several reasons: accuracy (say you transpose down, then up later), and the fact that frequencies don't have a linear relationship with note number. You might want to do operations based on the relative pitches of notes, or detect chords, etc., all of which would be stupidly difficult if you're storing wavelengths / frequencies. Besides, your source instruments may have different base frequencies anyway, and these would need to be compensated for.

Pan: Why limit yourself to stereo pan? What about surround sound?

Channels / instrument info on a note: Would you want the note to determine this, or the track and / or pattern? Having a 'grouping' feature for notes can be useful though. Remember you are going to want to be able to do stuff like edit the instruments you are using quickly and easily, and not change this for every note.

What happens when, by accident, you set two bunches of notes to the same instrument ID (if storing it on the notes)? You have then lost their 'individuality'. Better to store something else that then maps to the instrument.

Volume: This is usually key velocity rather than volume (there is MIDI volume as well, but you wouldn't store this per note; it's a separate event), which in MIDI is 0-127. There is also release velocity, which may or may not be used by the instrument.

There's also other stuff like pitch bend, aftertouch etc, which you can store as a separate event.

Note name / ID: Why try and store this on the note? If your pattern has e.g. an array or vector of 35 notes, then you know its ID as you access it.

An example to start with might be something like this:

[code]class Note
{
public:
    int m_iStartTime;                 // in PPQN; could be negative if you want notes to start before the official start of the pattern
    unsigned int m_uiLength;          // in PPQN
    unsigned int m_uiKey;             // e.g. middle C is 60, as in MIDI
    unsigned int m_uiVelocity;        // 0-127
    unsigned int m_uiReleaseVelocity; // 0-127
};[/code]

Once you have a simple system working then it will become more obvious where to add things.

To reiterate on the notes side of things, don't worry so much about space saving, just concentrate on simplicity. Note data doesn't tend to be that large. It's more when you get to the audio side you need to pay attention to the data structures / bottlenecks.

And rather than just having a struct-like class you can use accessor functions so the actual data underneath can be anything you want.
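A sketch of that accessor idea, reusing the Note fields from above (the method names are illustrative):

```cpp
class Note
{
public:
    // Accessors hide the storage, so callers never depend on the
    // underlying layout and it can change without breaking them.
    int      GetStartTime() const   { return m_iStartTime; }
    void     SetStartTime(int t)    { m_iStartTime = t; }
    unsigned GetKey() const         { return m_uiKey; }
    void     SetKey(unsigned key)   { m_uiKey = key; }

private:
    int      m_iStartTime = 0;  // in PPQN
    unsigned m_uiKey = 60;      // middle C, as in MIDI
};
```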
In MIDI, AFAIR, track voice info is not bound to notes but is controlled by channel events, and is composed of a bank and a voice.

Of course, since you're abstracting away the MIDI protocol, you could put that info on notes. But you'll have to cope with those differences when you're actually sending MIDI data to the MIDI devices.

E.g. sending the voice change event would require you to look ahead in the [i]note stream[/i] to see if the next note has a different voice program, and actually send the voice change event a little earlier.
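That look-ahead might be sketched as follows (the types and names are mine, and actually transmitting the events is elided):

```cpp
#include <cstdint>
#include <vector>

struct NoteEvent {
    std::uint32_t time;   // absolute time of the note-on
    std::uint8_t  voice;  // desired voice/program for this note
    std::uint8_t  key;
};

// Return the times at which a program-change must be issued: one for
// each point where the upcoming note uses a different voice. In a real
// sender these would be scheduled slightly before the note-on itself.
std::vector<std::uint32_t>
programChangeTimes(const std::vector<NoteEvent>& notes) {
    std::vector<std::uint32_t> changes;
    int current = -1;  // no voice selected yet
    for (const NoteEvent& n : notes) {
        if (static_cast<int>(n.voice) != current) {
            changes.push_back(n.time);
            current = n.voice;
        }
    }
    return changes;
}
```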
[quote name='lawnjelly' timestamp='1355586234' post='5010964']
Note pitch: I'd stick with just a note number like MIDI for now, and the 12 note western scale. 99% of music is written like this, and handling other systems is a bit more advanced and something you can tap on later. Storing notes as float frequencies I wouldn't recommend for several reasons : accuracy (say you transpose down, then up later) .. the wavelengths don't have a linear relationship with note number. You might want to do operations based on the relative pitches of notes, or detect chords etc. All of this would be stupidly difficult just trying to store wavelength / frequencies. Besides the fact your source instruments may have different base frequencies anyway and these would need to be compensated for.
[/quote]

I still think I would use a float for this, but instead of the frequency it would be a number of semitones (which is 12*log2(frequency) + some_constant). You don't magically lose precision if you add and subtract integer values, even when those integers are stored in floating-point variables.
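For reference, the mapping between a MIDI-style semitone number and frequency, assuming equal temperament with A4 = 440 Hz:

```cpp
#include <cmath>

// Note number to frequency in Hz (69 = A4 = 440 Hz, 60 = C4).
inline double noteToFrequency(double note) {
    return 440.0 * std::pow(2.0, (note - 69.0) / 12.0);
}

// Inverse: frequency to a (possibly fractional) note number,
// i.e. 12*log2(frequency) plus a constant.
inline double frequencyToNote(double freq) {
    return 69.0 + 12.0 * std::log2(freq / 440.0);
}
```

Fractional note numbers fall out for free, which is what makes this representation friendly to the microtonality point earlier in the thread.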
