My thoughts would be to make the system far less complicated.
What I mean is that I might start playing the attack animation when the attack key is pressed and then, after a certain delay, check whether I am in range of the enemy (and facing the enemy). If I am, then I assume that the sword hit the enemy.
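To make that concrete, here is a rough sketch of the delayed range-and-facing check. Everything in it is made up for illustration: the names (Actor, Attack, in_range_and_facing), the constants, and the assumption that the hero's facing is stored as a unit vector. Tune the delay so it lines up with the frame of the animation where the sword looks like it connects.

```python
import math
from dataclasses import dataclass

# All values below are assumptions for illustration; tune them per game.
ATTACK_DELAY = 0.25    # seconds between key press and the hit check
ATTACK_RANGE = 1.5     # maximum distance at which the sword can connect
FACING_MIN_DOT = 0.5   # cosine of the widest angle that counts as "facing"


@dataclass
class Actor:
    x: float
    y: float
    facing_x: float = 1.0  # facing is assumed to be a unit vector
    facing_y: float = 0.0
    hp: int = 3


def in_range_and_facing(hero: Actor, enemy: Actor) -> bool:
    """True if the enemy is close enough and roughly in front of the hero."""
    dx = enemy.x - hero.x
    dy = enemy.y - hero.y
    dist = math.hypot(dx, dy)
    if dist > ATTACK_RANGE:
        return False
    if dist == 0:
        return True  # standing on top of the enemy counts as a hit
    # Cosine of the angle between the facing direction and the enemy.
    dot = (dx * hero.facing_x + dy * hero.facing_y) / dist
    return dot >= FACING_MIN_DOT


class Attack:
    """Counts down after the key press, then does a single hit check."""

    def __init__(self) -> None:
        self.timer = None  # None means no attack is in progress

    def start(self) -> None:
        if self.timer is None:
            self.timer = ATTACK_DELAY
            # This is where the attack animation would be started.

    def update(self, dt: float, hero: Actor, enemy: Actor) -> bool:
        """Advance the timer by dt seconds; return True if the enemy was hit."""
        if self.timer is None:
            return False
        self.timer -= dt
        if self.timer > 0:
            return False
        self.timer = None  # the attack resolves exactly once
        if in_range_and_facing(hero, enemy):
            enemy.hp -= 1
            return True
        return False
```

You would call start() from the input handler and update() once per frame with the frame time. Note that the animation never feeds back into this code; it just plays alongside it.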
I don't think I would mess with hit boxes or even create the sword as a separate object (depending on how many swords I had, of course). I would probably draw the hero with the sword, or have an artist draw it. Alternatively, I would animate the sword with the hero on a separate layer, but I certainly would not try to position the sword at the hero's hand every frame, as that would be asking for a lot of frustration.
When possible, I would want to use factors other than animations and collision detection to determine my game state, since animations and collision detection can take a lot of work. So to me the animation would be what is visible to the player, but it would not be directly involved in my game logic.
Just my two cents.