The program counter and the accompanying notions of current and next instructions are abstractions, very useful to define what the various jump instructions do (choose the next instruction to execute, overriding the default of the one after the current instruction) but not necessarily related to how a processor actually works. Instructions can be executed out of order and simultaneously, making these concepts something that is merely emulated from the point of view of the program.
From what I read the Program Counter can hold the address of the current instruction or the next instruction.
1) Does that mean it can only hold one of the two at one time? If so, why is this?
You can trust a processor to hold the address of an instruction it's fetching for as long as it's needed and then forget it, but there can be a queue of instruction addresses to fetch (possibly shared by multiple hardware threads or multiple cores) or something even more different from a program counter register.
Fetching data from memory is the job of specialized hardware, including the instruction cache. Registers simply hold addresses, which are sent to the appropriate parts of the microprocessor.
I also read an instruction needs to get fetched, is the program counter doing the fetching too? I thought it was its job is to hold address of an instruction.
If you pretend the program is a sequence of instructions, you go to the next one; at the abstraction level of memory addresses you need to account for the length of encoded instructions, which is often variable.
2) When an instruction gets fetched: what does it mean for a program counter to increased its stored value by 1? Is it as easy as adding the number 1 to this value?
Circuits and registers count in binary (high/low voltage); hexadecimal is only a human-friendly representation of binary numbers that, unlike decimal numbers, doesn't involve carry. But actually it's none of your business, the only addresses you need to read and write correctly are the ones in your program (which can have a far more complex structure with segments, offsets, implicit or discarded bits, etc.).
3) Is the address of the instruction binary or hexadecimal? Is it encoded beforehand before the program counter fetches it?
The IR is an abstraction (what's the processor doing?) that doesn't correspond directly to microprocessor operation; the instructions being executed are represented implicitly in the configuration of what each register and execution unit do at each clock cycle. Moreover, what's executed is usually a sequence of low-level operations, not the complex instructions in the program.
4) If the program counter is holding the address of the next instruction, is the instruction register holding the address of the current instruction?
Usually, dedicated address decoding units (hence the famous historical repurposing of the x86 LEA instruction to do arithmetic); on the cheap, the same ALUs used for general arithmetic (a lot slower). The result is the same, so it doesn't matter for the programmer.
5) Who does the decoding of any of the two addresses of the instructions?