Although your calculation is mostly correct, it doesn't take into account the network's ability to gradually raise the block gas limit. If blocks keep filling up with transactions, miners are allowed to bump the limit of each subsequent block by a small amount (for details see the Yellow Paper, page 6, equations 40-42 and the surrounding context).
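As a rough illustration of how that adjustment compounds, here is a minimal sketch assuming the Yellow Paper's bound that each block may move the gas limit by at most 1/1024 of the parent's limit; the starting and target limits below are just example numbers:

```python
# Rough sketch of gas-limit growth, assuming the Yellow Paper bound that each
# block may move the limit by at most parent_limit // 1024.
def blocks_to_reach(start_limit: int, target_limit: int) -> int:
    """Count blocks needed if every miner pushes the limit up by the maximum step."""
    blocks, limit = 0, start_limit
    while limit < target_limit:
        limit += limit // 1024   # maximum upward adjustment per block
        blocks += 1
    return blocks

# Example: doubling the limit takes roughly 1024 * ln(2) ~ 710 blocks,
# i.e. a few hours at ~15-second block times.
print(blocks_to_reach(3_000_000, 6_000_000))
```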
Based on the above, there is no theoretical limit to the number of transactions that can be squeezed into a block; the limit just needs a bit of time to adjust. Practically, you need to get those transactions to the miners, and they have to process them, distribute the results, etc., so depending on how optimal the implementations are there is an upper limit. What that is on the current network, nobody can really say. We did extensive spam tests on the Olympic test network, where we actually rewarded people for pushing junk into the network, and reached a transaction throughput of about 25 tx/sec. Since then a huge amount of work has gone into the implementations, so they could probably handle even more. However, calculating with these 3/4-year-old experimental results, you would get about 2.16M votes per day.
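As a sanity check, the 2.16M figure is just the measured rate extrapolated over a full day (25 tx/sec is the Olympic spam-test number quoted above):

```python
# Quick check of the daily-throughput extrapolation from the quoted spam-test result.
tx_per_sec = 25                      # Olympic spam-test throughput quoted above
seconds_per_day = 24 * 60 * 60       # 86,400
print(tx_per_sec * seconds_per_day)  # 2,160,000 -> about 2.16M transactions/day
```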
That number is probably higher now, but it gives you the order of magnitude the network seems able to handle. To push this number further up, there's extensive research being done on scalability and proof of stake, which would enable orders-of-magnitude larger transaction counts, but that's a long-term goal.
Block time in Ethereum's Proof-of-Stake system (called Casper) is conservatively being targeted at around four seconds. Vlad Zamfir of the Ethereum Foundation believes the block time will ultimately end up much lower (sub-second), while Vitalik is not as convinced on that front. Vlad discusses this in an excellent video explaining Casper: https://youtu.be/3g2CwTnn0Us
The block time can be lowered because validators aren't burning cash on electricity to mine, which makes producing an uncle much less painful relative to producing the canonical block. Validators are rewarded based on how they place their bets, and won't lose money so long as they always bet on blocks that make it into the security record. Their profit is slightly higher for betting on a canonical block than on an uncle, but their cost is the same either way (and much reduced relative to PoW mining, as is the reward).
Casper therefore, in addition to its much-improved security properties relative to Proof-of-Work, is able to provide much lower confirmation latency. This does not mean, however, that we have improved Ethereum's underlying efficiency in terms of transactions per second. With Casper, Ethereum is still a single-threaded global computer, and any improvement in TPS will come from validators upgrading their hardware.
Sharding, not Casper, is Ethereum's proposed scaling solution for increasing TPS by orders of magnitude.
Back-of-the-envelope estimate. Sharding is Ethereum's way of achieving high TPS.
If you're looking at the future of Ethereum, then you're probably looking at a future involving sharding. In that case, the only delay you have to worry about is the propagation delay of getting your transaction to the appropriate shard. With a group of about 100 validators per shard, and each validator in the group connected to 10 others, you can be at most 2 hops away from every validator in your group. Transactions can come either from other validators or from someone outside your validator cluster. Even a naïve gossip algorithm will "only" reduce your effective bandwidth by a factor of 10, since every transaction gets forwarded to each of your 10 peers (the coverage arithmetic is sketched below). I assume CPU cycles and I/O are much faster than network bandwidth.
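Here's a minimal sketch of that coverage and overhead arithmetic, assuming uniform fanout; the shard size of 100 and fanout of 10 are the figures from the paragraph above, not protocol constants:

```python
# Coverage check for the shard-gossip estimate above (assumed figures, not protocol constants).
shard_size = 100   # validators per shard, as assumed above
fanout = 10        # peers each validator forwards a transaction to

# With fanout f, at most 1 + f + f^2 + ... + f^h nodes are reachable within h hops.
hops, covered = 0, 1
while covered < shard_size:
    hops += 1
    covered += fanout ** hops

print(hops, covered)  # 2 hops cover up to 1 + 10 + 100 = 111 >= 100 validators

# Naive gossip re-sends every transaction to all 10 peers, so useful
# bandwidth drops by roughly the fanout factor.
print(f"bandwidth overhead factor ~ {fanout}x")
```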
If every node is connected to 100 other nodes, even with a large amount of overlap -- say you're effectively only connected to 20 nodes that are distinct from your "near neighbours" -- and there is one node per person on the planet, then under some assumptions about efficient routing you'd never have to be more than 8 hops away from the required validator cluster. If we assume the average packet has to propagate across something like North America at each hop (some will be closer, some will be farther), that works out to around a half-second propagation delay. If we have a well-known set of validators, we can instead connect directly to one of the ones involved in your transaction, as in a client-server architecture.
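The 8-hop and half-second figures can be checked the same way; the world population, the 20 "useful" distinct peers, and the ~60 ms per-hop latency below are all assumptions, not measurements:

```python
# Sketch of the planet-scale routing estimate above.
# Assumptions (not measurements): ~7.5B nodes, 20 "useful" distinct peers per hop,
# and ~60 ms per hop if an average hop crosses something like North America.
population = 7_500_000_000
distinct_peers = 20
per_hop_ms = 60

hops, reach = 0, 1
while reach < population:
    hops += 1
    reach *= distinct_peers

print(hops)                     # 8 hops: 20**8 = 25.6B >= 7.5B
print(hops * per_hop_ms, "ms")  # ~480 ms, roughly the half second quoted above
```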
If your validators are on a 1 Gbps pipe, then after the ~10x gossip overhead you can receive roughly 100 Mbps' worth of transactions in your cluster. That's about 2,500 transactions per second per shard, with a propagation time equal to the latency between you and one of the shard's nodes plus the internal shard propagation delay. Worst-case that's a bit over one trip around the "world", or about 400 ms.
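One more back-of-the-envelope check: the quoted 2,500 tx/s at ~100 Mbps of usable bandwidth implies an average transaction size of about 5 KB. The pipe size and overhead factor below are the assumptions from this estimate, not measured values:

```python
# Back-of-the-envelope per-shard throughput, using the figures from the estimate above.
pipe_bps = 1_000_000_000          # 1 Gbps validator connection (assumed)
gossip_overhead = 10              # naive-gossip factor from earlier in the estimate
useful_bps = pipe_bps / gossip_overhead          # ~100 Mbps of unique transaction data

tx_per_sec = 2_500                # the per-shard figure quoted above
implied_tx_bytes = useful_bps / 8 / tx_per_sec   # ~5,000 bytes per transaction
print(useful_bps / 1e6, "Mbps usable")
print(implied_tx_bytes, "bytes per transaction implied by 2,500 tx/s")
```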