Yeah, it's probably meant to make sure people don't say something like "duh-jango". Many English speakers don't know that much about phonetics and aren't consciously aware that the "j" sound is an affricate that starts out the same way as the "d" sound.
Even if they are aware, some might mistakenly think "dj" was supposed to represent a lengthened version of this sound, i.e. /dd͡ʒ/ realized as something like [dː͡ʒ]. See the answers and comments to this ELU question ("In the word “Scent”, is the S or the C silent?") that assert that "scent" is pronounced with a longer initial consonant than "sent" or "cent". I think these comments are wrong, but it shows how someone might think this kind of thing based on the spelling.
Generally, double consonants are not pronounced distinctly in English, unless they are part of different syllables and the emphasis is on the second syllable.
A word like dissatisfied is formed by adding the prefix dis- to the word satisfied. It has two s's in separate syllables, and can be pronounced that way: one at the end of the first syllable, one at the start of the second syllable.
A word like irregular is formed by adding the prefix in- to the word regular. The n-r combination is difficult to say, so we replace the n with another r. The same thing happens before the letter l, so in + licit becomes illicit. According to Cambridge Dictionary, the first l is not pronounced; likewise the first m of immodest (in + modest). Note that this conversion only happens with words that passed through Medieval Latin: more modern words like inroad (1540), inlay (16th C) and inline (1913) are unaffected.
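The spelling rule described above can be sketched in a few lines of Python. This is only an illustration: the function name attach_in_prefix and the latinate flag are made up for this example, and the table covers only the three cases mentioned here (r, l and m), not every form the prefix can take.

```python
# Hypothetical helper: how the prefix "in-" is spelled in front of a stem.
# Only the cases discussed in this answer (r, l, m) are covered.
ASSIMILATED_PREFIX = {"r": "ir", "l": "il", "m": "im"}


def attach_in_prefix(stem: str, latinate: bool = True) -> str:
    """Return the stem with "in-" attached.

    latinate=True stands for a word that came through Latin, where the n
    is replaced by a copy of the following consonant. latinate=False
    stands for a more modern formation, where the n is kept.
    """
    first_letter = stem[:1].lower()
    if latinate and first_letter in ASSIMILATED_PREFIX:
        return ASSIMILATED_PREFIX[first_letter] + stem
    return "in" + stem


print(attach_in_prefix("regular"))               # irregular
print(attach_in_prefix("licit"))                 # illicit
print(attach_in_prefix("modest"))                # immodest
print(attach_in_prefix("road", latinate=False))  # inroad
```

Whether a given word counts as "latinate" has to be supplied by the caller; the sketch does not try to work out a word's history.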
The same kind of conversion happens in Arabic for sun letters (il+r -> irr). In Arabic, double consonants are always clearly pronounced, and this applies to sun letter conversions too.
In non-rhotic dialects there is an identifiable reason for not pronouncing the first r: in non-rhotic dialects (England English, for example), an r followed by a consonant is not pronounced.
In rhotic dialects such as US English, the pronunciation of the n-become-r is, according to Merriam-Webster, optional.
I am a native of England (non-rhotic) and I do not pronounce it as a double r. I can and do double the r when speaking Arabic, so I do understand the difference. Other natives of England do not pronounce the double r. If I heard somebody pronounce it with a double r, I would assume that they were foreign. I believe that I have heard natives of Scotland (rhotic) pronouncing it with a double r. I cannot comment on US English.
Here are recordings of me saying irregular and erectile:
And here I say irregular again, pronouncing the two r's separately.
Best Answer
It's pronounced /spel/ in the audio clip.
Phonemically, English has two bilabial plosive consonants, /b/ and /p/.
Phonetically, these two sounds can be realized in more than one way. The relevant ones to our question are [b] (for /b/), and [pʰ] and [p˭] (for /p/).
[b] is voiced.
[pʰ] is aspirated and unvoiced.
[p˭] is unaspirated and unvoiced.
The unaspirated [p˭] sound is common in English when a "p" (i.e., the /p/ sound) comes after an "s" (the /s/ sound), e.g., spool, spin, spell, etc.
In the audio clip given by Wiktionary, the /p/ sound is a [p˭] sound, that is, it's an unaspirated /p/ sound.
Note that it's not a /b/ sound in English.
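As a rough illustration of the rule above, here is a minimal Python sketch that picks a realization for each /p/ in a list of phonemes. The function name realize is made up for this example, and treating every /p/ that does not follow /s/ as aspirated is a simplifying assumption: it is good enough for words like pin and spin, but it is not a full account of English /p/.

```python
# Toy model: choose an allophone for each /p/ in a phoneme sequence.
# Assumption for illustration only: /p/ right after /s/ is unaspirated,
# and every other /p/ is treated as aspirated.
def realize(phonemes: list[str]) -> list[str]:
    phones = []
    for i, phoneme in enumerate(phonemes):
        if phoneme == "p":
            follows_s = i > 0 and phonemes[i - 1] == "s"
            phones.append("p˭" if follows_s else "pʰ")
        else:
            # Other phonemes are passed through unchanged.
            phones.append(phoneme)
    return phones


print(realize(["p", "ɪ", "n"]))       # ['pʰ', 'ɪ', 'n']
print(realize(["s", "p", "ɪ", "n"]))  # ['s', 'p˭', 'ɪ', 'n']
```

In both outputs the underlying phoneme is still /p/; only its realization changes.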
For a native speaker of a Chinese language/dialect, it's not surprising to hear this unaspirated /p/ sound as a [b], because the Chinese unaspirated unvoiced bilabial plosive consonant sound ([p˭]) is romanized in Pinyin as b.
(For more information, see Standard Chinese phonology.)
So, I'd say that the OP hears the sound correctly, but it's a "b" only in Chinese. In English, it's a "p".
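To make the mismatch concrete, here is a small lookup-table sketch in Python. The names ENGLISH_PHONEME, PINYIN_LETTER and heard_as are invented for this example, and the tables only contain the sounds discussed in this answer.

```python
# Which English phoneme each sound belongs to (as described above).
ENGLISH_PHONEME = {"b": "b", "pʰ": "p", "p˭": "p"}

# Which Pinyin letter is used for the same sound in Standard Chinese.
# Only the two unvoiced sounds from this answer are listed.
PINYIN_LETTER = {"pʰ": "p", "p˭": "b"}


def heard_as(phone: str) -> tuple:
    """Return (English phoneme, Pinyin letter) for a sound.

    An entry is None when the sound is not in that table.
    """
    return ENGLISH_PHONEME.get(phone), PINYIN_LETTER.get(phone)


print(heard_as("p˭"))  # ('p', 'b')
print(heard_as("pʰ"))  # ('p', 'p')
```

The first result is the situation in the question: one and the same sound is a "p" by English conventions and a "b" by Pinyin conventions.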
And in my humble opinion, this is quite normal for a non-native speaker.
The trick is to know that the same sound can be thought of as two different consonants in two different languages. Keep that in mind and you will do just fine in listening tasks.