Short answer
To master the /z/ sound, make a long /s/ sound and sing loudly at the same time. It won't sound anything like a J, /dʒ/ at all.
Full answer
The Production of /z/
/z/ is a voiced alveolar fricative. To understand how we make a /z/, we need to think about some different parts of the mouth.
If you look in the mirror you will see a line running down the middle of your tongue (called the mid-sagittal line).
If you feel behind your teeth with your tongue, you will feel a little shelf. That's your alveolar ridge. Behind that, your mouth suddenly arches upwards to form the roof of your mouth.
To produce a /z/, you have to place the blade of your tongue (that's the bit just behind the very tip) on the alveolar ridge. You create a little furrow, or groove, down the mid-sagittal line. The rims (sides) of your tongue rest against the inside of your side teeth.
You then force air from your lungs down the narrow channel. This creates friction, or turbulence, in the air as it is forced through this furrow and through the narrow hole created between your tongue and the alveolar ridge. This turbulence causes a hissing noise. While this is happening, your vocal folds (sometimes called your vocal cords) vibrate. This gives the sound pitch, which is heard at the same time as the turbulence.
It might be helpful at this point to think about how we make the sound /s/.
The production of /s/
To produce an /s/, you have to place the blade of your tongue (that's the bit just behind the very tip) on the alveolar ridge. You create a little furrow, or groove, down the mid-sagittal line. The rims (sides) of your tongue rest against the inside of your side teeth.
You then force air from your lungs down the narrow channel. This creates friction, or turbulence, in the air as it is forced through this furrow and through the narrow hole created between your tongue and the alveolar ridge. This turbulence causes a hissing noise.
This probably sounds a bit familiar!
How to make a /z/ if you are accidentally making a J sound as in jump, /dʒ/
Now, you will have noticed that to make a /z/ we do exactly the same thing as we do for /s/. There is no difference between the position of your tongue, teeth, or any other part of your mouth at all. The only difference is that when we make a /z/ we have voicing, or vocal fold vibration. This gives the /z/ pitch; we can make a high-pitched /z/ or a low-pitched /z/. When we make an /s/ we just get a hissing sound. It does not have the same quality of pitch, because there is no vibration from the vocal folds.
Because of this, if you can already say /s/ with no problem, you just need to add vocal fold vibration to make a /z/. You need to add pitch. How can you do this? The answer is: you need to sing while you make an /s/. Start making an /s/ and then sing while you are making it. You need to make the /s/ for several seconds. First do it at a high pitch then a low pitch. If you can hear a high or low pitch then you are making a /z/. You can then start practising it at a normal speaking type of pitch.
It is much easier to do this than to try and follow the instructions for making a /z/. If you are making a /dʒ/ sound, the sound in the word jump, then I could give you advice like "move your tongue slightly forward in your mouth towards your front teeth - but don't make a complete closure with your tongue when you start the sound". However, this is very, very difficult to do without anyone to help you. In my experience, singing a note whilst making an /s/ sound is quite easy to do, more effective and more fun.
Hope that's helpful!
References:
You can read about /s/ and /z/ in Gimson's Pronunciation of English by Alan Cruttenden, 8th edition (2014).
It's pronounced /spel/ in the audio clip.
Phonemically, English has two bilabial plosive consonants, /b/ and /p/.
Phonetically, these two sounds can be realized in more than one way. The relevant ones to our question are [b] (for /b/), and [pʰ] and [p˭] (for /p/).
[b] is voiced.
[pʰ] is aspirated and unvoiced.
[p˭] is unaspirated and unvoiced.
The unaspirated [p˭] sound is common in English when a "p" (i.e., the /p/ sound) comes after an "s" (the /s/ sound), e.g., spool, spin, spell, etc.
In the audio clip given by Wiktionary, the /p/ sound is a [p˭] sound, that is, it's an unaspirated /p/ sound.
Note that it's not a /b/ sound in English.
For a native speaker of a Chinese language/dialect, it's not surprising to hear this unaspirated /p/ sound as a [b], because the Chinese unaspirated unvoiced bilabial plosive consonant sound ([p˭]) is romanized in Pinyin as b.
(For more information, see Standard Chinese phonology.)
So, I'd say that the OP hears the sound correctly, but it's a "b" only in Chinese. In English, it's a "p".
And in my humble opinion, this is quite normal for a non-native speaker.
The trick is to know that the same sound can be thought of as two different consonants in two different languages. Keep that in mind and you will do just fine in listening tasks.
Best Answer
Try this for word
Some guidance, Bird Is The Word by the Trashmen, can be found here
and this for ward
These pronunciations are for a flat accent, sometimes termed newscaster or General American. There can be regional differences, but this is the most basic.