You raise a valid concern. On the one hand, we often talk of periphrastic tenses (and other constructions); on the other, some insist that a tense should be confined to a single word. Others, again, hold that tense is a property of a sentence or clause, not of a word or phrase. Can this problem be solved at all?
The short answer is: there are different models; some models are incompatible with certain other models; and we are free to choose whichever model we prefer. The term periphrastic tense is useful in a model that allows for tenses that consist of more than one word, but not in a model that doesn't. The definition of "tense" is not an objective fact that exists independent of human analysis: it is ultimately a label of convenience created by the observer. Both kinds of models have merit.
Most language users happen to think of will do as the future tense. Some linguists use other models. There is no consensus, not even among linguists, about what constitutes a tense.
Even word boundaries are not objective facts
Perhaps the most fundamental issue you raise is that of word boundaries. What were once considered two separate words may fuse into a single, new word, as in cantare habeo => chanterai. At some point in its development, the status of this phrase-or-word must have been uncertain. This shows how relative the whole terminology is.
But in most cases, a reasonable case can be made for either one or the other, so that the fundamental issue temporarily recedes to the background; it should be noted, however, that what we consider a "word" is to some extent intrinsically subjective and a matter of convention. It is just a convenient demarcation. But let's move on.
Is tense determined by form or by function?
Let me illustrate the problem by means of Latin, where terminology has been fixed for a long time. Tense comes from Latin tempus, "time"; part of the oldest concept of tenses had to do with notions of time. However, there was never a one-to-one correspondence between tenses and temporal references. The pluperfect, for example, is normally used to refer to a time before a narrated time in the past, just as in English; and yet after postquam, "after", the perfect was used, not the pluperfect. Similarly, the imperfect and pluperfect could be used to refer to a hypothetical situation in the present, as in English if I was rich... (although subjunctives were far more common). And so on.
Si domi eram, pater me puniebat. = If at_home I_was, father me punished.
"If I were at home, father would punish me."
Postquam Galliam vidi, vici. = After Gaul I_saw, I_conquered_it.
"After I had seen Gaul, I conquered it."
And yet we still call the verbs in these examples imperfect and perfect, respectively, even though they do not have their usual temporal references. The reason we do this is that the form is named after its most common function, even though it can indeed have other functions. Latin and English both do this, and they are by no means the only languages that do.
Do we then look only at the form of the verb, not at its function, when defining tenses in Latin? No. What we call the passive perfect is periphrastic/analytic/compound, just as in English:
Canis sum. = Dog I_am.
"I am a dog."
Visus sum. = Seen I_am.
"I am/was seen."
You could say this is not a special tense, but two words, one being a past participle, the other a present verb; and yet this is called the passive perfect. The reason is that it functions just as the perfect does, except that it is passive. Here function determines what we call it. This happens in English too when we say I will do it is in the future tense.
Humans like symmetrical systems
So then what constitutes a tense, if we can count neither on form, nor on function, at least not reliably so? The answer is probably symmetry. If there is a present active (video "I see"), a present passive (videor, "I am (being) seen"), and a perfect active (vidi "I saw"), we would like there to be a perfect passive. Because there was no such verbal form, a phrase was made to be equivalent (visus sum "I was/am seen in the past"). We humans like our systems neat and symmetrical if possible:
            Active     Passive
Present     video      videor
Imperfect   videbam    videbar
Perfect     vidi       [visus sum]
Future      videbo     videbor
Now is this label "passive perfect" merely a convention? It may have been once, but, as people start believing in it, they start using it in ways that neatly fit the system, even if the meaning of visus sum was once somewhat different. It is in some ways a self-fulfilling prophecy. Whenever a sentence in the active perfect was passivized, instead of saying "oh, I can't do that", people started thinking, "this is the passive perfect; I will use it". The same applies to I will do it in English.
All three approaches have up-sides and down-sides
Is this a perfect system of terminology? No. There are serious disadvantages. But it has been in use for a long while, and most people think of "I will do it" as fitting within a neat system of past, present, and future, because that is the most convenient and obvious partition of our verb tenses, or so we feel.
Various branches of linguistics have proposed different systems and different terminologies in the past. This is a productive and beneficial approach. Some chose to focus on form and consider the English periphrastic future not a tense at all; they will only count affixes and endings as capable of forming tenses. This system certainly has merit.
Others have emphasised function; they have gone so far as to declare that, since many forms can be used for more than one function, as with si eram... / "if I was...", only forgoing form altogether leads to a consistent approach. Hence they treat tense as a property of a clause or sentence, not of a word or phrase. That way, only combined with a word like yesterday does was acquire a past tense; in if I was at work today, you wouldn't see me here, it is a present tense, because it refers to a situation in the present, albeit a hypothetical one. This approach, too, has merit.
One could use several systems at once
As an alternative, we could invent new words for these two new approaches, such as *single-word tenses for the English simple present and simple past, and time-reference or temporality for the time-reference of a clause or sentence. Many different models are possible. Insisting on one model without considering the benefits of other models seems unwise. And saying "x is A" when you mean "I find the model in which x is called A most useful" is a simplification.
Suppletion as an illustration of a convenient choice
Some systems are uncontested, even though at some point in the past a fairly arbitrary choice must have been made.
I go.
I went.
Do these two forms belong to the same verb? Yes, you will say, because that is what you were taught, and because they "feel" like the same verb, just with odd forms. But, in the past, there were two verbs, both meaning something like going (although there were no doubt some differences between them). At some point the present form of a verb resembling go was taken, its past forms discarded (or not, if such never existed), and the past form of a verb resembling went was put in their place.
We could say, "there are two defective verbs in modern English, one lacking a past form, the other a present form"; but we choose not to do so. That is to some degree arbitrary, but in this case it is just very convenient. If certain linguists would prefer to treat them as two different verbs, then let them do so, if this is somehow more convenient in a certain linguistic analysis. Or they could just say "this verb consists of two different roots", as they no doubt do.
Best Answer
If you will please read the rest of your link, you will see that it itself spells out that should is “properly” the past tense form of shall, which is itself a present tense verb. This is similar to how would is the morphological past tense form of the present tense verb will. Similarly with may and might.
English verbs have no morphological future form, not now, and not ever. You cannot change an English verb's form to make it a future tense verb the way you can change sing to sang or cry to cried when you create a past tense verb form. With the future, that is impossible, and always has been.
Instead of changing the form of the verb to express such things, English uses other words to convey that an event takes place at some time other than now. For example:
By that tomorrow at three you can tell that the action is not taking place today, but in the future. The verb see does not change form, however. On the other hand, for the past, it does:
There is nothing more to be done with that verb than see and saw, plus seen and seeing for participles. For anything else, you need other words because you cannot turn that verb into a future tense form.
Modalities
English has something better than future tense: it has modalities via modal verbs — which are something else altogether. Mode and tense are different things.
All English modal verbs have both an epistemic sense and a deontic sense. This instance of shall in your translation does ɴᴏᴛ simply indicate some future event in an epistemic way the way this one does here:
Rather, this is the deontic modality of permission, of command, of inevitability — the one seen in:
So this is not “future tense”; it is a present tense modal verb being used in the deontic mode to indicate a special kind of future event.
The verb come is in the infinitive. We cannot use the past tense form came here, and since there is no future tense form for that verb (nor for any in English), we use a modal plus an infinitive instead to get much the same job done as a future tense would have.
Compared with real future tense
For an actual future tense example, since there is nothing in English, you need a language that had such a thing. Conveniently here, you could look at your verse’s Latin translation, since that language had a future tense. In the Vulgate, that verse runs like this:
That verb instabunt is the third person plural future active indicative form of the verb insto. It’s a third person plural form to agree with tempora periculosa, “dangerous times”. But it is not present tense. It is future tense. If it had been in the present tense, it would have been instant, but by changing the tense of the verb from present to future tense, you get instabunt instead.
That’s how future tense works in languages that have such a thing: the verb’s inflection changes. You have to use a different word.
So why did the English translators choose a deontic modal instead of an epistemic modal for this verb? The answer to that can be found by examining the other verb in that sentence. Scito is a future imperative of the verb scio “to know”. A normal present imperative would have been just sci (although that form is rarely used); scito is the future imperative instead.
With the commanding verb in the future imperative, it therefore makes more sense to use the deontic modal shall here for the translation in the other clause rather than the more normal epistemic will. So you get shall come, a present tense verb being used in the deontic modality combined with another verb in the infinitive.
There’s more to it than this, but this should get you started.