For the slowness concern; when I write iterative generation algorithms like this, I usually allocate a block of ints twice the size of the map, and re-use it over and over as the process continues, sometimes as a doubly-linked list structure. Before the "grow" step I described, I'd initialize all elements to 0, and all unassigned land adjacent to assigned-province tiles into the list. After changing province assignment of a tile, I'd add all the adjacent, unassigned land to the list. This way, every pass I'm just walking the list of province-bordering tiles, not the entire map. You can tell if a tile is already in the list and doesn't need to be added, because one or both of its pointer indexes will be nonzero. The phase is done when the list is empty.
Once you add the Conquer phase, the blockiness of some provinces will start to disappear, especially if you make your initial seed chunks a little smaller.