Colossal armor is not going to fix it
Colossal armor is spectacle, but once the novelty has worn off, combat will be exactly the same. The PCs' ability to become giant and fight other giants will not really change the way they fight. They could pick up and throw a building, but they could do the same thing on a smaller scale with a rock. Now, if they were fighting giant, armor-clad enemies as regular-sized people, things could certainly get interesting.
"Terrain is often of more value than bravery....Bravery is of more value than numbers." - Sun Tzu
Your original instinct to use terrain was right: terrain can and should shape every combat that occurs. If the party isn't using the terrain, the enemies certainly should. Anything with at least the intelligence of a wild predator (like a wolf) should be able to use the terrain in some way, and sentient creatures and people especially should use it to gain an advantage in combat. For example, if your players flank enemies that were behind cover, those enemies should probably be Shaken, since their safe position is now exposed.
Combat encounters are their own story vignettes
Too often people treat action and story as separate things that don't mix well (this is an issue in books, films, and RPGs). The opposite is in fact true. The best action scenes are those where plot development and characterization are actively ongoing. Likewise, each combat should have its own mini story arc, with tension rising and falling as it goes on. For example, the party has just arrived at an abandoned warehouse they suspect the enemies of using as a safe house. The party attacks; the enemies begin to lose numbers and start retreating. The party rushes forward to capitalize on this and finds itself ambushed by reinforcements the enemies called in. Now the party needs to escape: they pull back into the warehouse with the enemies in pursuit, and as they reach the other side they blow up some fuel tanks and set the building on fire, cutting off pursuit as they disappear into the night.
Such numerical rules of thumb are both design decisions and design guidelines
There is no “correct” ratio of monster damage to player damage or player HP to player damage, but these kinds of ratios are well worth thinking about. They influence balance, but also influence the play and feel of your game. If HP is about 4× damage, you expect people to take four hits (or last four rounds, or whatever period you’re aggregating damage over). Deciding what ratios you want (which ratios will generate the feel you’re going for and encourage the types of play you’re expecting) is a big part of the design process, and it is, of course, very complicated.
There’s also probably not just one ratio you want for any given pair of numbers. Tough characters are supposed to last longer, so perhaps their HP:expected-damage ratio is higher. Nimble characters are expected to deal high damage but be fragile. So on and so forth. So you really need different notions of what these ratios are supposed to be for the different sorts of characters you want to support. You can make this very math-y and codified (e.g. D&D 4e) or very vague and flexible (e.g. almost any point-buy system you care to name).
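As a rough illustration of how such ratios translate into play, here is a minimal sketch. The archetype names, HP totals, and the flat expected damage of 10 per round are all hypothetical numbers chosen for the example, not values from any particular system; `rounds_survived` is likewise just an illustrative helper.

```python
import math


def rounds_survived(hp: int, incoming_damage_per_round: float) -> int:
    """Number of rounds a character lasts when taking a constant
    amount of damage each round (the round that drops them counts)."""
    if incoming_damage_per_round <= 0:
        raise ValueError("incoming damage must be positive")
    return math.ceil(hp / incoming_damage_per_round)


# Hypothetical archetypes: every number here is an assumption for illustration.
ARCHETYPE_HP = {"tough": 60, "baseline": 40, "nimble": 25}
EXPECTED_DAMAGE = 10  # assumed average damage taken per round

for name, hp in ARCHETYPE_HP.items():
    ratio = hp / EXPECTED_DAMAGE
    print(f"{name}: HP:damage ratio {ratio:.1f}, lasts {rounds_survived(hp, EXPECTED_DAMAGE)} rounds")
```

Real damage is rarely constant, so treat the result as an average expectation and not a guarantee. The point is only that picking a ratio is picking roughly how long each kind of character stays in the fight.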
And then there are issues of system-mastery: how much do you want to reward people for thinking carefully about their character’s mechanical components? How much do you want to punish people for failing to do so? Are certain archetypes going to be possible, but bad ideas? Is that a good thing, or something you want to either build real support for or prevent? If you allow them, are you going to warn players about them explicitly?
And these aren’t even all of the numerical questions a game designer needs to consider, never mind the other crucial aspects of the design (setting, tone, description, art, and so on).
Theory-crafting can help
Designing a balanced game requires, more than anything else, a lot of time spent fiddling with it. Theory-crafting is faster than play, so on some level you get more fiddling per unit time. This can be useful: look at your current ruleset, and try to determine the maximum possible values someone can get for various statistics. You’ll have to try going at this a lot of different ways, however; optimums are often found in surprising locations, and you can never be certain you have found one.
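For a small ruleset, one way to hunt for maximums is to enumerate every legal build and score it. The sketch below assumes a toy ruleset invented for this example: the slots, options, bonus values, and the doubling interaction are all hypothetical. It also only works while the number of combinations stays small enough to brute-force.

```python
from itertools import product

# Hypothetical ruleset: each slot offers options with a flat damage bonus.
OPTIONS = {
    "class": {"fighter": 2, "rogue": 1, "mage": 0},
    "weapon": {"dagger": 1, "greatsword": 3, "staff": 0},
    "feat": {"power attack": 2, "sneak attack": 3, "none": 0},
}


def damage_bonus(build: dict) -> int:
    """Total damage bonus of a build, including one assumed cross-interaction."""
    total = sum(OPTIONS[slot][choice] for slot, choice in build.items())
    # Assumed interaction: a rogue sneak-attacking with a dagger doubles the total.
    if (build["class"], build["weapon"], build["feat"]) == ("rogue", "dagger", "sneak attack"):
        total *= 2
    return total


def best_build() -> dict:
    """Brute-force every combination of options and return the highest scorer."""
    slots = list(OPTIONS)
    builds = (dict(zip(slots, combo)) for combo in product(*(OPTIONS[s] for s in slots)))
    return max(builds, key=damage_bonus)


winner = best_build()
print(winner, damage_bonus(winner))
```

In this toy data the winner is not the build with the largest individual bonuses, because the interaction between options outweighs them. That is the kind of surprising optimum a slot-by-slot check would miss.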
Having an interested community helps massively here. More minds, more eyes, more original ideas and approaches.
But it cannot replace play-testing
When theory-crafting finds game-breaking exploits, they’re worth fixing. Sometimes the builds that yield them aren’t even powerful in practice; fixing them is just about maintaining the design decisions you made (e.g. those ratios) and keeping the environment stable. If something achieves an anomalously large value for some number and you ignore it because that number isn’t very important, you had better remember that it can get anomalously large before you add a new thing that depends on that number!
But theory-crafting cannot and will not catch everything that is actually a problem. Having overly-high numbers might make things too easy, but the players having abilities that you didn’t expect to see at all is often much more troublesome. If a player figures out a clever way to fly when they’re not supposed to, that might invalidate whole sections of your plan.
And flying is an easy example. The real danger is the things you didn’t even think of, which never seemed relevant or important until some clever player breaks everything. This is what play-testing uncovers.
Again, an active and involved community helps a ton here; each group may only play it once (or maybe a couple of times), but if you have many groups, that’s many tests. Especially, say, if one person can run it for multiple groups, letting each group try different tricks.
Especially if this is formal testing, with explicit (volunteer or paid) testers, it is worthwhile to encourage destructive testing. This means that the testers try their hardest to break things: they stress the system, maybe by trying out some of those theory-crafted builds that you found to be just within the bounds of acceptable, maybe just by trying oddball things. Repeat players can even abuse foreknowledge of the events of the game: if someone could plan for those specific events and be over-prepared for them, someone else might accidentally stumble upon the same preparations. Plus, it’s just good to get a gauge of the easiest way someone could get through.
Testing the opposite side is useful too: if someone made some truly hare-brained decisions, can they still enjoy this game? Try throwing some intentionally “awful” characters through: how much do they suffer for those poor decisions? Is that level of penalty for bad decisions appropriate to your design goals?
Best Answer
You should test different aspects of the game in phases:
Phase 1: Rule mechanics / balance
This is something you can do on your own. You are proposing some game mechanics, formulas, dice, probabilities, etc. Lay them out in front of you and calculate the end-result probabilities and cross-interactions. What happens when character A attempts action B in situation C? Say there are three possible outcomes: X, Y and Z. What are the probabilities? Is that what you want?
This phase can be a bit math-intensive but gives you a good big-picture view of your game. Modify and tweak as needed, then move on to phase 2 when you have it approximately right. You will be revisiting these calculations later, but for now, "about right" is "right".
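To make the Phase 1 calculation concrete, here is a minimal sketch that enumerates every possible roll and reports exact probabilities for three outcomes. The mechanic is an assumption invented for the example (roll 2d6 plus a modifier; 6 or less is outcome X, 7 to 9 is Y, 10 or more is Z), so substitute your own dice and thresholds.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def outcome_probabilities(modifier: int = 0, dice: int = 2, sides: int = 6) -> dict:
    """Exact probability of each outcome under the assumed mechanic:
    total <= 6 -> X, 7..9 -> Y, >= 10 -> Z."""
    counts = Counter()
    for roll in product(range(1, sides + 1), repeat=dice):
        total = sum(roll) + modifier
        if total >= 10:
            counts["Z"] += 1
        elif total >= 7:
            counts["Y"] += 1
        else:
            counts["X"] += 1
    n_rolls = sides ** dice
    return {outcome: Fraction(counts[outcome], n_rolls) for outcome in ("X", "Y", "Z")}


for mod in (0, 1, 2):
    probs = outcome_probabilities(modifier=mod)
    print(mod, {k: f"{float(v):.1%}" for k, v in probs.items()})
```

Running it for a range of modifiers shows how quickly a small bonus shifts the odds, which is exactly the "is that what you want?" question. Exhaustive enumeration is fine for a handful of dice; larger dice pools would call for a different approach, such as simulation.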
Phase 2: Rules playtesting
Get yourself a few playtesters and run a few games. Your players will find and exploit a huge number of flaws and loopholes. Go back to phase 1 as needed to revise your rules.
Phase 3: Setting review
This only applies if you also have an established setting in your game. Have other people read your setting material and gather feedback. Clear up inconsistencies and ambiguities, and fill gaps as needed.
Phase 4: Alpha testing
By now, you should have your rules and setting written out. Run games for playtesters, but now also join games as a player and observe from a different perspective. If possible, have others run and play games without you, and gather feedback. Revisit previous phases as needed. Move on when the game feels complete and solid.
Phase 5: Beta testing
…is similar to Alpha testing but you should have your book/pdf/website in your intended final format. Keep gathering feedback, but keep the focus on the output material. Build towards having a clear and understandable layout of your rules and setting. Fix your wording to clarify details.
At the end of it, you should have a working game. How much effort you put into any of these phases depends entirely on you and your game.