I read through all the answers, and what surprised me most was that only one captured the true meaning of the 32-bit and 64-bit designations.
Most of the answers are partially right, but Hiral Patel hit the nail on the head.
The biggest difference between a 32-bit software design and a 64-bit software design is the width of the data and address paths the processor works with at once. As Patel said, think of it as a highway with lanes of traffic dedicated to communication with the CPU. The more lanes available to carry information to the CPU, the faster it can respond as it works through its instructions.
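To make that concrete, here is a small C sketch of my own (not from Patel's answer) showing where the width difference is most visible in software: the size of pointers and of the native machine word. Build the same file as a 32-bit program (e.g. gcc -m32) and as a 64-bit program (gcc -m64), and the printed sizes change from 4 bytes to 8 bytes.

```c
/* Minimal sketch: how the 32-bit vs. 64-bit designation shows up in code.
 * Compile the same file as a 32-bit and as a 64-bit build and compare. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* A pointer has to be able to address all of memory, so its size
     * tracks the machine's address width: 4 bytes on 32-bit, 8 on 64-bit. */
    printf("pointer size:  %zu bytes\n", sizeof(void *));

    /* size_t and intptr_t follow the native word size as well. */
    printf("size_t size:   %zu bytes\n", sizeof(size_t));
    printf("intptr_t size: %zu bytes\n", sizeof(intptr_t));

    /* The addressable range follows directly: roughly 4 GB of address
     * space with 32-bit pointers, vastly more with 64-bit pointers. */
    printf("largest size_t value: %zu\n", (size_t)-1);
    return 0;
}
```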
These lanes work in both directions, so not only can the CPU take in instructions faster, it can also issue commands faster to the connected devices. That is where most of the speed issues lie: the connected peripherals normally operate much more slowly than the CPU does, with hard drives being the biggest culprits. SSDs have made this a little better, but they still have trouble keeping up with the CPU's demands, which is why they tend to cache commands and responses.
In the old days we actually had 4-, 8-, and 16-bit machines. At the time they seemed very fast, but compared to today's hardware they were very slow.
One thing I want to clear up about a statement that Riviera made regarding 32-bit software running on a 64-bit machine. It is true that the older 32-bit software will run on a 64-bit machine, but there is a cost to that. It can run slower than it would on a native 32-bit machine, because it often has to go through a translation (compatibility) layer so that its calls line up with the 64-bit operating system, and that extra step can slow the operation of the software down.
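To ground that in something you can try yourself: on 64-bit Windows, the translation layer a 32-bit program runs through is called WoW64, and a program can ask Windows whether it is running under it. This is a minimal sketch of my own (not from Riviera's answer) using the IsWow64Process call from the Windows API; build it as a 32-bit program and run it on a 64-bit machine to see the layer reported.

```c
/* Minimal sketch: detect whether this process runs under the WoW64
 * compatibility layer (i.e. a 32-bit process on 64-bit Windows). */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    BOOL isWow64 = FALSE;

    /* IsWow64Process reports TRUE for a 32-bit process being translated
     * by the 64-bit OS, and FALSE for a process running natively. */
    if (IsWow64Process(GetCurrentProcess(), &isWow64)) {
        printf(isWow64
               ? "Running as a 32-bit process under the WoW64 layer.\n"
               : "Running as a native process (no translation layer).\n");
    } else {
        printf("Could not determine WoW64 status.\n");
    }
    return 0;
}
```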