In this question, it is established that Thieves' Cant is more of an encoding system built on top of a language than a language in its own right.
According to the Basic Rules, Thieves' Cant is:
…a secret mix of dialect, jargon, and code that allows you to hide messages in seemingly normal conversation. Only another creature that knows thieves' cant understands such messages. It takes four times longer to convey such a message than it does to speak the same idea plainly.
In addition, you understand a set of secret signs and symbols used to convey short, simple messages, such as whether an area is dangerous or the territory of a thieves' guild, whether loot is nearby, or whether the people in an area are easy marks or will provide a safe house for thieves on the run. (PBR, p. 27)
Can two creatures communicate using Thieves' Cant if they do not have a basic spoken language in common, or if they are speaking different basic languages? E.g. if one creature knows Orcish, Goblin, and Thieves' Cant, and another creature knows Elvish, Giant, Common, and Thieves' Cant, can they communicate with each other by embedding Thieves' Cant into their own languages?
Hypothetically speaking, I can see how this could be possible in our own world, if we assume that Thieves' Cant depends on nuances of body language, emphasis, and tempo, e.g., one might say to a non-Spanish speaker:
Tenemos quEEEEEEE ir (wink) a la tieeeeenda de (shuffle left foot) armas para (twirl hair and giggle) comprar una espada (wink twice) para mi herMANO (wag right pinky and nod).
The literal message itself ("We need to go to the weapon store to buy a sword for my brother.") would just be a carrier – the real message might be hidden within the gestures and the accented and extended syllables. Perhaps the drawn-out "E" vowel means "robbery planned", the wink means "bank", shuffling the left foot means "be there at sunset", etc., and these are things that you wouldn't necessarily need to know Spanish to extract.
Is this how Thieves' Cant actually works, or does it require actual comprehension of the literal message transmitted before the "secret" message can be extracted from it?
RAW is unclear, but Sage intent seems to require a common language
As you've noted, RAW doesn't provide enough detail to answer this by itself.
Sage Advice doesn't have anything specifically answering this, but there are some related questions that illuminate the intention behind Thieves' Cant:
Can Comprehend Languages understand Thieves’ cant?
Is Eyes of the Rune Keeper intended to let you understand codes, such as numeric ciphers or Thieves’ Cant jargon?
Both of these strongly suggest that Thieves' Cant is conveyed through an existing language that both parties to the message understand.
(That said, there's no mechanical consequence to this; make whatever decision seems most fun for your table.)