Unlike in some other languages, English spelling tends to reflect the developmental history of the word rather than its pronunciation. Therefore, it takes more learning and practice to work out how an English word is pronounced from its spelling. After learning the basic rules, you also need to learn some exceptions, and with enough practice, you may be able to spot some patterns.
Given that English is built on Greek, Latin, Anglo-Saxon / Norse, and French influence, and continues to assimilate words from other languages, it helps to consider which set of pronunciation rules to apply depending on the word's origin. For example, "ch" in words of Greek origin (e.g. psyche) would generally have a /k/ sound. In words taken from French during an earlier period (e.g. chief), "ch" would have a /tʃ/ sound. Later French borrowings (e.g. chef) would have a softer /ʃ/ sound.
Even with lots of experience, any English speaker who claims to be able to read any word correctly is lying. Here is a whole thread on Reddit full of words that people have mispronounced for years. Some examples include:
- hyperbole, epitome, synecdoche
- draught
- lingerie, macabre, melee
- segue
- açaí
- awry
- victuals
- quinoa
- chalcedony
I'd also add:
- row (in the sense of a fight)
- chassis
No amount of experience would ever help you guess the British pronunciation of "lieutenant".
Part of the difficulty was, believe it or not, deliberately introduced. In words like "debt" and "doubt", silent letters were added to make the spelling reflect the words' Latin etymology.
Your only consolation is that English is still easier to read than Chinese.
Have you figured out the pronunciation of the words above? Here are the answers!
/haɪˈpɝːbəli/ /ɪˈpɪt.ə.mi/ /sɪˈnɛkdəki/
/dɹɑːft/
/ˌlɑn.(d)ʒəˈɹeɪ/ /məˈkɑːbɹə/ /mɛˈleɪ/
/ˈsɛɡweɪ/
/ˈa.saj/
/əˈɹaɪ/
/ˈvɪtəlz/
/ˈkinˌwɑ/
/kælˈsɛdəni/
/ɹaʊ/
/ˈtʃæsi/ or /ˈʃæsi/
/lɛfˈtɛnənt/
Most native English speakers you hear will effortlessly pronounce the th digraph you're having trouble with. While there are some dialects of English that pronounce it /d/ or /t/ or /f/ depending on position, the standard accents of the US and UK use the dental fricatives /θ/ and /ð/, and that is what you should strive to emulate if you want to sound like a native speaker.
There are phonemes in every language that non-native speakers have trouble with, and English is no exception. This is the advantage of growing up speaking a language from childhood. And by this I mean from very early childhood, what most people would consider the pre-verbal period of a baby's life.
9-month-old babies are aware of the phonemes in their own language as they start to use both prosodic and phonotactic cues to discriminate individual speech sounds of their language.
Some studies have shown that unless a baby hears its language's phonemes in its first six months of life, it may never code for them at all.
The point is, yes, it's hard to duplicate certain sounds in another language. You may never pronounce those sounds perfectly. But unless you make the effort, your pronunciation will always mark you as foreign* and, worse, you may have trouble communicating with native speakers.
Addendum
In response to a comment I'm including further information about the critical nature of language exposure in the early months of life:
At birth, infants are prepared to learn any language. For example, an American baby adopted by an Inuit-speaking Eskimo family would grow up speaking fluent Inuktitut and have no trouble saying words such as qikturiaqtauniq ("mosquito bite"). However, even before their first birthdays, babies begin to lose the ability to hear the distinctions among phonemes in languages other than their own. By around the age of six months, babies have already begun to hear the sounds of their own language in the same way that adult speakers do, as Patricia Kuhl and her associates (1992) have shown in their research.
It's worth noting that they say babies before their first birthdays are beginning to lose "the ability to hear the distinctions among phonemes in languages other than their own". The claim is not that they have lost it, but that the longer a child goes without hearing those distinctions and, consequently, producing them itself, the harder it will be for that child to reproduce all of the language's sounds. By the time one reaches adulthood, it can be a monumental task.
Anecdotally, my own name, which is Germanic and contains the ü sound in German, is extremely difficult for me to pronounce fluently, and a word like Brüder, with its uvular /r/ immediately followed by the ü, is well-nigh impossible for me, even though I worked in Germany for a time and acquired a fair bit of fluency. It was always a source of chagrin for me, especially when I would hear my coworkers pronounce my name flawlessly and without effort.
* And in case you think that it's all right to use those non-standard sounds produced by dialectal speakers, be aware that even to sound like them you would have to master the whole range of their pronunciations as well, and be able to use those when appropriate, which would be just as big a task (if not bigger) as learning the standard version.
Best Answer
In connected speech, /ð/ at the start of function words may be assimilated to a preceding consonant in some cases. However, I don't think there are any circumstances where this kind of assimilation always occurs—my impression is that it is gradient. Also, the identity of the preceding consonant probably affects the probability of assimilation.
I have found a source, "Applied English Phonology" by Mehmet Yavas, that gives a more specific description of the conditions of this assimilation:
I think "takes them" may not be the best example of the phonetic process in question, since them additionally has an alternative form ’em that may occur after any consonant, not only alveolar consonants.
The fact that "in the" could be realized as [ɪnːə] or [ɪnə] rather than [ɪnðə] is mentioned in Geoff Lindsey's blog post "Lucas quiz – the answers".
Another known phonetic phenomenon is deletion (which could be seen as assimilation followed by mandatory shortening) of [θ] or [ð] before the suffix -(e)s. This is lexicalized for many speakers in the noun clothes /kloʊz/, although the non-assimilated pronunciation /kloʊðz/ is not uncommon either. Some speakers (I think a smaller number) also have this type of assimilation/deletion in the word "months", pronouncing it as [mʌnts]. This has been covered in other places on this site (e.g. How to distinguish 'month' and 'months' in pronunciation?).
As far as I know, no native speakers (without speech impediments) use [z] for /ð/, or [s] for /θ/, in contexts other than assimilation to an adjacent /s/ or /z/.
Some native speakers do use realizations other than [ð] and [θ] more generally—I discuss this in more detail in my answer to Do all native English speakers actually pronounce the "th" sound?—but as far as I know it is always something non-sibilant like [d̪], [d̪͡ð], [v]. If you can't manage [θ] in "thorough" or "thief", I would say it's better to fall back on [f] or [t] than to use [s].
As for "at the beginning" and "what the heck", if you pronounce them at a reasonable pace, it will probably not even be noticed if you use a dental stop [d̪] rather than a dental fricative.