So that code that uses it can use the same interface without needing to know the difference. That was the whole reason.
I don't see why you need to mix string types that don't need validation and string types that do.
Then have them provide the same interface. They don't need to be able to interact with each other for that. But see below...
I don't. I really don't. Somehow, text has to get into a string. If I read UTF-8 from a file, it goes into an array of code units before it goes into the string class. So, I need to interact with it there. The other functions are for convenience.
Right, but that instance of 'char* to UTF-8' is logically completely different from 'char* to std::string'. You go on to show that you do fully appreciate this difference - so the only thing I don't understand is why you talk about trying to implement an interchangeable interface, when the two are not comparable. The equivalent to std::string(char*) is UTF8String(int*). There is no legitimate part of your code where you have a char* and you could interchangeably create either std::string or UTF8String. We don't build strings out of arrays of the storage type, because that is just an implementation detail - we build them out of arrays of the character type.
Of course you do need a function that builds UTF8 strings out of bytes, performing the encoding operation - but that has no equivalent in std::string.
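To make the distinction concrete, here is a minimal sketch of what "build it out of the character type" could look like. The class name UTF8String comes from the discussion above, but everything else (the char32_t-based constructor, the bytes() accessor, the private appendCodePoint helper) is an assumption made for illustration, not anyone's actual implementation. It assumes C++17.

```cpp
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical sketch: a UTF-8 string constructed from code points,
// the analogue of std::string(const char*) for this class.
class UTF8String {
public:
    // Build from the *character* type (code points), not the storage type.
    explicit UTF8String(std::u32string_view codePoints) {
        for (char32_t cp : codePoints) {
            appendCodePoint(cp);
        }
    }

    // The underlying code units; exposed read-only (assumed accessor).
    const std::string& bytes() const { return bytes_; }

private:
    // Encode one code point as 1 to 4 UTF-8 code units.
    void appendCodePoint(char32_t cp) {
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            throw std::invalid_argument("not a Unicode scalar value");
        }
        if (cp < 0x80) {
            bytes_ += static_cast<char>(cp);
        } else if (cp < 0x800) {
            bytes_ += static_cast<char>(0xC0 | (cp >> 6));
            bytes_ += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            bytes_ += static_cast<char>(0xE0 | (cp >> 12));
            bytes_ += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            bytes_ += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            bytes_ += static_cast<char>(0xF0 | (cp >> 18));
            bytes_ += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            bytes_ += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            bytes_ += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }

    std::string bytes_;
};
```

With this shape, UTF8String(U"...") plays the role that std::string("...") plays for plain char strings, and a constructor taking raw bytes would be a separate, differently named entry point.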
I don't see what you gain from that separate character type. Surely that per-character validation operation is only half of the story, since you already need to have a 'charType' in order to construct it. As I would imagine it, once the data is successfully in the UTF-8 string, all characters are valid by definition, and before the data is in the UTF-8 string, it's just bytes, meaningless until validated.
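A rough sketch of that "validate once at the boundary" idea, assuming C++17. The names ValidatedUtf8, fromBytes and isValidUtf8 are made up for this example; the point is only that raw bytes can enter through a single checked factory, so anything holding a ValidatedUtf8 can rely on its contents being well-formed.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Returns true if `in` is well-formed UTF-8: no stray or missing
// continuation bytes, no overlong forms, no surrogates, nothing > U+10FFFF.
inline bool isValidUtf8(std::string_view in) {
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        if (b0 < 0x80) { ++i; continue; }  // single-byte (ASCII) sequence

        std::size_t len = 0;
        char32_t cp = 0;
        char32_t min = 0;  // smallest code point allowed for this length
        if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (in.size() - i < len) return false;  // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false;  // overlong, out of range, or surrogate
        }
        i += len;
    }
    return true;
}

// Hypothetical wrapper: the only way in from raw bytes is fromBytes(),
// so every instance holds valid UTF-8 by construction.
class ValidatedUtf8 {
public:
    static std::optional<ValidatedUtf8> fromBytes(std::string_view raw) {
        if (!isValidUtf8(raw)) return std::nullopt;
        return ValidatedUtf8(std::string(raw));
    }

    const std::string& bytes() const { return bytes_; }

private:
    explicit ValidatedUtf8(std::string b) : bytes_(std::move(b)) {}
    std::string bytes_;
};
```

Under this design there is no need for a separate per-character validity check after construction: the bytes are either rejected at fromBytes() or known good from then on.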