Am I trying too hard to create a classic domain model? Are there any "best practices" with regard to e.g. collections when using a component-based engine? Is it more common to see denormalized data structures in component-based engines?
I'm not sure there's an established set of best practices, because even now there are a lot of different interpretations and implementations of the "component-based" model. In the interest of full disclosure, for a little over a year I did some very heavy database work as part of a project at work, and really grew to love relational data models and the separation of data and logic, so this obviously influenced what I believe to be one of the better implementations of a component-based design. In the case of collections, a collection isn't so much a literal object that contains other objects as it is the result set of a particular query, namely one where you retrieve all objects with CollectionID = 'x' (or in our case, ArmyID = 'x'). The possible values of 'x', or rather the set of army IDs, can either be inferred from the distinct set of army IDs present across all UnitComponents, or be stored on entities that contain an ArmyComponent. In the latter case you establish an informal constraint on army IDs, and with it the notion of "valid" versus "invalid" IDs, but whether or not you need that type of constraint is up to you. Usually you'd only have the ArmyComponent if an army needed additional information and wasn't just a logical grouping or joining of units.
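To make the "collection as a query" idea concrete, here is a minimal sketch in Python. The `UnitComponent` fields, the sample data, and the helper names `units_in_army` and `known_army_ids` are all hypothetical, chosen only for illustration; a real engine would likely index components rather than scan a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitComponent:
    # Hypothetical fields: the entity this component is attached to,
    # and the army that entity belongs to.
    entity_id: int
    army_id: str


# Hypothetical component table: one row per entity that has a UnitComponent.
unit_components = [
    UnitComponent(entity_id=1, army_id="north"),
    UnitComponent(entity_id=2, army_id="north"),
    UnitComponent(entity_id=3, army_id="south"),
]


def units_in_army(components, army_id):
    """The 'collection' is just the result of filtering by army_id."""
    return [c.entity_id for c in components if c.army_id == army_id]


def known_army_ids(components):
    """Infer the set of army IDs from the distinct values found on units."""
    return {c.army_id for c in components}
```

Here `units_in_army(unit_components, "north")` plays the role of the army collection, and `known_army_ids` corresponds to inferring the IDs from the units rather than storing them on an ArmyComponent.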
In this case, would building be an entity connected to a player via a PlayerBuildingComponent (or such)? I'm having difficulty actually separating entities and components at times, usually when relationships are involved (which I have quite many of).
Ideally, an entity is just a logical grouping of components, so there really isn't an entity "object" to add data to; it's a logical grouping, similar to the relationship between armies and units. It's all components. For player ownership, you'd have a PlayerOwnedComponent that can be put on any entity to indicate that it belongs to a particular player. For buildings, you would have a separate BuildingComponent that handles only the data related to being a building. The rule of thumb is, if you have to ask whether or not a component needs to be separated into smaller components, it probably does. It's a bit like the single responsibility principle, only applied to data.
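One way to picture "an entity is just a logical grouping" is the sketch below, where an entity is nothing more than an integer ID and each component type lives in its own table keyed by that ID. The field names, the table layout, and the `components_of` helper are assumptions made for illustration, not a prescribed design.

```python
from dataclasses import dataclass


@dataclass
class PlayerOwnedComponent:
    owner_player_id: int  # hypothetical field name


@dataclass
class BuildingComponent:
    building_type: str  # hypothetical field name


# One table per component type, keyed by entity ID.
# There is no Entity class anywhere: an entity is only the shared key.
player_owned = {}
buildings = {}


def components_of(entity_id):
    """Collect whatever components happen to share this entity ID."""
    tables = {"player_owned": player_owned, "building": buildings}
    return {
        name: table[entity_id]
        for name, table in tables.items()
        if entity_id in table
    }


# A farm is "an entity" only because two components share the ID 10.
player_owned[10] = PlayerOwnedComponent(owner_player_id=1)
buildings[10] = BuildingComponent(building_type="farm")

# Ownership is reusable: entity 11 is owned by a player but is not a building.
player_owned[11] = PlayerOwnedComponent(owner_player_id=1)
```

Because ownership and "being a building" are separate components, the same PlayerOwnedComponent works for units, buildings, or anything else a player can own.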
Concrete example of how it works in my game: Say a player has x farms. There is no interest for the player to see these farms as unique, so I would group them into some data structure which says: BuildingType: Farm, Amount: x. But then behaviour needs to be added to this entity (or component, depending on your answer to the above question), so we add ProducesFoodComponent to this entity. Am I on the right track with regard to making the Building an entity in this case, and then linking this building entity to my player entity via a PlayerBuildingComponent on either the Player or Building entity? If I don't separate the building data structure into an entity, I'll get components having children of components, which does not seem right.
This relates back to what I mentioned earlier about logical groupings and queries. You wouldn't have an actual data structure that groups farms, or even cares that they are farms. All the system cares about is entities that have ProducesFoodComponents, and entities that belong to the player of interest (which will have a PlayerOwnedComponent with the correct player ID). Depending on your level of normalization, it will do a query not far off from this to get the total food production amount for a player with ID 'x' (I apologize ahead of time for any syntax errors in my SQL!):
select sum(food.food_amount)
from produces_food_components as food
inner join player_owned_components as player
    on food.entity_id = player.entity_id
where player.owner_player_id = x
Now of course your implementation might not use an actual query language, but that's the essence of how systems would retrieve the information they need from the collection of components and relations.
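For instance, without a query language, the same join could be sketched in plain Python as below. The two dictionaries stand in for the produces_food_components and player_owned_components tables from the query above; the sample values and the function name `total_food_for_player` are made up for illustration.

```python
# entity_id -> food_amount (stand-in for produces_food_components)
produces_food = {10: 5, 11: 5, 12: 8}

# entity_id -> owner_player_id (stand-in for player_owned_components)
player_owned = {10: 1, 11: 1, 12: 2, 13: 1}


def total_food_for_player(produces_food, player_owned, player_id):
    """Sum food over entities that both produce food and belong to player_id.

    The lookup by entity_id is the inner join; the comparison against
    player_id is the where clause.
    """
    return sum(
        amount
        for entity_id, amount in produces_food.items()
        if player_owned.get(entity_id) == player_id
    )
```

Note that entity 13 is owned by player 1 but produces no food, so it never enters the sum, and nothing in the function knows or cares that entities 10 and 11 are farms. One small difference from the SQL: for a player with no food producers this returns 0, whereas `sum` over an empty result set in SQL yields NULL.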