OK, so first: I have no experience with this, so I could be very wrong. But here are my initial thoughts.
I would create an object that contains the following information:
1) KeyMask (buttons corresponding to the correct note)
2) StateMask (1=press, 2=held, 4=released)
I would create two queues of these objects, SongQueue and BufferQueue,
and an array of these objects, ActiveNotes.
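A rough Python sketch of that object and the three containers. The names State, Step, key_mask and state_mask are my own placeholders, and storing KeyMask as a plain int with one bit per button is an assumption:

```python
from collections import deque
from dataclasses import dataclass
from enum import IntFlag


class State(IntFlag):
    """StateMask bits as listed above."""
    NONE = 0
    PRESS = 1
    HELD = 2
    RELEASED = 4


@dataclass
class Step:
    """One fixed time step of the song."""
    key_mask: int = 0               # assumed: one bit per button, 0 = no button
    state_mask: State = State.NONE  # which key states count as correct


song_queue = deque()    # SongQueue: steps that are not on screen yet
buffer_queue = deque()  # BufferQueue: steps travelling toward the strike zone
active_notes = []       # ActiveNotes: steps currently in the strike zone
```

Using flags means a step like "press or hold" is just State.PRESS | State.HELD.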
Break the song into fixed time steps, say 5 per second, so a song of a minute and thirty seconds would have 450 of these objects.
Determine the length of time it takes for the animation to start at one end and enter the strike-zone. Assume it takes 5 seconds.
Populate the BufferQueue with 25 empty objects (no mask) (5 seconds * 5 per second).
Determine the length of time it takes for the animation to enter the strike-zone and exit the strike-zone (assume 1 second).
Allocate room for 5 (1 second * 5 per second) of these objects in the ActiveNotes array.
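The sizing arithmetic above, as a small Python sketch. The helper steps_for and the (key_mask, state_mask) tuple used for an empty object are placeholders of mine, not part of the design:

```python
from collections import deque

STEPS_PER_SECOND = 5


def steps_for(seconds):
    """Number of fixed time steps that cover the given duration."""
    return round(seconds * STEPS_PER_SECOND)


song_steps = steps_for(90)    # 450 steps for a 1:30 song
buffer_steps = steps_for(5)   # 25 steps of travel to the strike zone
active_slots = steps_for(1)   # 5 steps inside the strike zone

EMPTY = (0, 0)  # assumed shape: (key_mask, state_mask) with no mask set
buffer_queue = deque([EMPTY] * buffer_steps)  # pre-filled with empty objects
active_notes = [EMPTY] * active_slots         # fixed-size ActiveNotes array
```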
During gameplay I would pop one object off the SongQueue and enqueue it into the BufferQueue at a rate of one every 1/5th of a second.
The animation starts when an object with a mask of press is enqueued into the BufferQueue.
The note plays when the object with a mask of press is dequeued from the BufferQueue.
When an object is dequeued from the BufferQueue, place it at index i%5 of the ActiveNotes array.
Every 1/5th of a second, poll the keyboard. Check the keyboard state against the notes in the active list.
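Here is how one step of that loop might look in Python. This is only a sketch: tick, allowed_states and is_correct are names I made up, each object is a (key_mask, state_mask) tuple, and the rule for the check (take the union of the masks of every active step that mentions the key, and expect an idle key when none does) is my assumption about what "check against the active list" means.

```python
from collections import deque

PRESS, HELD, RELEASED = 1, 2, 4  # StateMask bits
EMPTY = (0, 0)                   # (key_mask, state_mask) with no mask
ACTIVE_SLOTS = 5


def tick(i, song_queue, buffer_queue, active_notes):
    """Advance one 1/5-second step; i is the running step counter."""
    # Next song step enters the buffer (this is where its animation starts).
    if song_queue:
        buffer_queue.append(song_queue.popleft())
    # Oldest buffered step reaches the strike zone and overwrites slot i % 5.
    arrived = buffer_queue.popleft() if buffer_queue else EMPTY
    active_notes[i % ACTIVE_SLOTS] = arrived
    return arrived


def allowed_states(key_bit, active_notes):
    """Union of the state masks of all active steps that use this key."""
    allowed = 0
    for key_mask, state_mask in active_notes:
        if key_mask & key_bit:
            allowed |= state_mask
    return allowed


def is_correct(key_bit, observed_state, active_notes):
    """Compare one polled key state against the active list."""
    allowed = allowed_states(key_bit, active_notes)
    if allowed == 0:
        return observed_state == 0  # assumption: key should be idle
    return (observed_state & allowed) != 0
```

The value returned by tick is the step that just entered the strike zone, so that would be the place to trigger the note sound when its mask is press.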
So... the song is encoded as: A, B, C = button press, + = press or hold, - = hold, ~ = hold or release, < = release, 0 = no mask.
So
A++-----~~<
would give the user 1 and 3/5ths of a second to press A, then they would have to hold it for 11-6 second and then they'd have up to 1 and 3/5ths of a second to release the button.
(The extra one second in every calculation is how long it would take for the last of a given action to leave the active list.)
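A possible parser for that encoding, sketched in Python. The function name parse_song is made up, and two things are assumptions on my part: letters map to key bits as A=1, B=2, C=4 and so on, and the symbols +, -, ~ and < apply to the most recent letter.

```python
PRESS, HELD, RELEASED = 1, 2, 4  # StateMask bits

SYMBOL_STATES = {
    "+": PRESS | HELD,     # press or hold
    "-": HELD,             # hold
    "~": HELD | RELEASED,  # hold or release
    "<": RELEASED,         # release
}


def parse_song(encoded):
    """Turn a string like 'A++-----~~<' into (key_mask, state_mask) steps."""
    steps = []
    key_mask = 0  # key that the following symbols refer to
    for ch in encoded:
        if ch == "0":
            steps.append((0, 0))  # no mask
        elif ch in SYMBOL_STATES:
            steps.append((key_mask, SYMBOL_STATES[ch]))
        elif "A" <= ch <= "Z":
            key_mask = 1 << (ord(ch) - ord("A"))  # assumed: A=1, B=2, C=4
            steps.append((key_mask, PRESS))
        else:
            raise ValueError(f"unknown symbol: {ch!r}")
    return steps
```

Each character becomes exactly one time step, so the example string above produces 11 steps.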
Active queue  Buffer Queue               Song Queue   T (Time in seconds)
00000         0000000000000000000000000  A++-----~~<  T=0
00000         000000000000000000000000A  ++-----~~<   T=0.2 (A button press animation starts)
00000         00000000000000000000000A+  +-----~~<    T=0.4 (A button press animation starts)
00000         A++-----~~<                             T=5.2
0000A         ++-----~~<                              T=5.4 (A note plays, and A note animation enters strike zone) players can press A
000+A         +-----~~<                               T=5.6 players can press or be holding A
00++A         -----~~<                                T=5.8 (See above)
0-++A         ----~~<                                 T=6.0 (See above)
--++A         ---~~<                                  T=6.2 (See above)
--++-         --~~<                                   T=6.4 Players can press or be holding A (due to the + still in active notes)
--+--         -~~<                                    T=6.6 (See above)
-----         ~~<                                     T=6.8 Players must be holding A, pressing or releasing is incorrect
-~---         ~<                                      T=7.0 Players can be holding or releasing A
~~---         <                                       T=7.2 (See above)
~~--<                                                 T=7.4 (See above)
~~-0<                                                 T=7.6 (See above)
~~00<                                                 T=7.8 (See above)
~000<                                                 T=8.0 (See above)
0000<                                                 T=8.2 players can only be releasing
00000                                                 T=8.4 Song is over
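A walkthrough like the one above could also be generated rather than written by hand. The Python sketch below keeps the raw symbols as the queue entries; the function name trace is made up, and the exact timings and slot order are assumptions (it fills slot i%5 counting up from the left, so rows can differ from the hand-written ones by a step or be mirrored).

```python
from collections import deque

ACTIVE_SLOTS = 5     # 1 second in the strike zone * 5 steps per second
BUFFER_STEPS = 25    # 5 seconds of travel * 5 steps per second
STEP_SECONDS = 0.2


def trace(song):
    """Return (time, active, buffer, song) rows, one per time step."""
    song_queue = deque(song)
    buffer_queue = deque("0" * BUFFER_STEPS)  # pre-filled with empty steps
    active = ["0"] * ACTIVE_SLOTS
    rows = [(0.0, "".join(active), "".join(buffer_queue), "".join(song_queue))]
    i = 0
    # Keep stepping until every note has left the active list.
    while song_queue or buffer_queue or any(s != "0" for s in active):
        if song_queue:
            buffer_queue.append(song_queue.popleft())
        # Once the buffer runs dry, blank steps push the old notes out.
        active[i % ACTIVE_SLOTS] = buffer_queue.popleft() if buffer_queue else "0"
        i += 1
        rows.append((round(i * STEP_SECONDS, 1), "".join(active),
                     "".join(buffer_queue), "".join(song_queue)))
    return rows
```

Looping over trace("A++-----~~<") and printing each row gives one line per step, with the active list, the buffer and the remaining song side by side.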
I already see a few issues. For example, being required to release the key after already being allowed to release or hold is no good.
And being required to release or hold after already being allowed to release is also not good.
Instead, maybe it should be looked at as: the key must be released at that point. But that's my thoughts anyway.
*EDIT*:
Thinking about it more, I think the state masks would have to be:
Press
Press or Hold
Hold
Release or Hold
Released, Release or Hold
Released or Release
It feels as though you might be able to skip the press and go straight to press or hold, but then pressing early doesn't cause an issue, and it is a better trigger since you'd only have one per note rather than a press and hold for which there would be multiple per note.
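The six revised masks could be written as flag combinations, sketched here in Python. This assumes "Release" (the act of letting go) and "Released" (the key is already up) are separate bits, which is my reading of the list; the value 8 for RELEASED and the names State and REVISED_MASKS are made up.

```python
from enum import IntFlag


class State(IntFlag):
    PRESS = 1
    HOLD = 2
    RELEASE = 4
    RELEASED = 8  # assumed extra bit: the key is already up


# The six masks from the list, in the same order.
REVISED_MASKS = [
    State.PRESS,
    State.PRESS | State.HOLD,
    State.HOLD,
    State.RELEASE | State.HOLD,
    State.RELEASED | State.RELEASE | State.HOLD,
    State.RELEASED | State.RELEASE,
]
```

With a separate RELEASED bit, a player who let go early during a "release or hold" step would still match the later steps instead of being marked wrong.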