Consider a machine implementing the five-stage pipelined MIPS ISA, as we discussed in lectures: Fetch - Decode - Execute - Memory - Write-back (one clock cycle at each stage). Here, the machine does NOT have a data hazard detection unit (i.e., no interlocking); instead, the compiler inserts nop instructions before dependent instructions for correct program execution. Both register data forwarding and branch prediction are available. Note that we assume the branch predictor predicts every branch as taken, and the next program counter is available after the decode stage.

Initially, registers $t1, $t2, $t3, and $s0 are 0, 0, 0, and 200, respectively.

Loop: lw   $t1, 0($s4)
      lw   $t2, 400($s0)
      add  $t3, $t1, $t2
      sw   $t3, 0($s0)
      sub  $s0, $s0, #4
      bnez $s0, Loop

(a) Re-write the code above with minimal changes for correct execution with minimal latency.
    Hint: insert nop instructions or reorder (reschedule) instructions.

(b) How many cycles are needed to execute the code segment?
    Hint: Loop.
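
For part (b), the bookkeeping can be organized as in the following minimal sketch. It is not a solution: the function names and parameters (`loop_iterations`, `pipeline_cycles`, `slots_per_iter`, `extra_cycles_once`) are hypothetical and do not come from the lectures. It assumes one instruction (or nop) is issued per cycle, that the loop exits when the counter reaches zero, and that any one-time penalty, such as a misprediction on the final branch, is supplied by the reader.

```python
def loop_iterations(start: int, step: int) -> int:
    """Number of passes through a loop whose counter starts at `start`,
    is decremented by `step` at the end of each pass, and exits at zero."""
    if start <= 0 or step <= 0 or start % step != 0:
        raise ValueError("counter must reach exactly zero")
    return start // step


def pipeline_cycles(slots_per_iter: int, iterations: int,
                    stages: int = 5, extra_cycles_once: int = 0) -> int:
    """Total cycles for a loop on an ideal `stages`-deep pipeline.

    slots_per_iter    -- issue slots per pass, counting real instructions AND
                         any nops from the rewritten code (assumption: one
                         slot is issued per cycle)
    extra_cycles_once -- one-time penalties, e.g. wrongly fetched
                         instructions after the last branch (assumption:
                         supplied by the reader)
    """
    issued = slots_per_iter * iterations + extra_cycles_once
    # The last issued slot still needs (stages - 1) cycles to drain.
    return issued + (stages - 1)


# With $s0 starting at 200 and decreasing by 4, the body runs this many times:
iterations = loop_iterations(200, 4)
```

To use it, count the issue slots per iteration in the rewritten code from part (a), decide on the one-time penalty under the always-taken predictor, and pass both to `pipeline_cycles` together with `iterations`.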