Learn English – Want to know the components in a sentence, clause, and phrase

sentence, sentence-patterns, syntactic-analysis

This is a slightly different sort of question, coming from a computer science student working on a Natural Language Processing project.

As part of our project, we got stuck on one question: how are the essential components of a sentence arranged?

We know that every sentence contains a main verb (V), and that sentence structures range from a noun (N) followed by a verb (V) to more complex forms built from patterns of N, V, adverb (AV), adjective (AJ), determiner (D), preposition (P), etc.

Example:

"Apple is red in color" is not a correct English sentence. The correct form is "An apple is red in color." So can we infer that the pattern N V AJ P N is wrong English, and that a correct English sentence follows the template D N V AJ P N?

In the same way, clauses and phrases also follow their own patterns of N, V, AV, AJ, A, etc.

Can someone tell me what rules these patterns follow? Is there a set of rules from which we can infer whether a given sentence is valid or not?

To clarify, I want to know whether a given group of words is a valid English sentence by recognizing its pattern and combination of N, V, AV, AJ, A, etc.

My goal is to recognize simple sentences; I am not working with complex sentence forms.

Can anyone help me with this?

Best Answer

You might like to know about Context-Free Phrase Structure Grammar (CFPSG), which is similar to the approach you're taking but allows for intermediate categories, like NP (noun phrase). Here is your example reformulated as such a grammar:

S -> NP VP
NP -> D N
VP -> V AJ P CNP
D -> an
N -> apple
V -> is
AJ -> red
P -> in
CNP -> color

If there is a way to derive a phrase by starting with S and using the rules to make substitutions, then the phrase is said to be generated by the grammar. The set of all phrases generated by such a grammar is said to be the language generated by the grammar.
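To make the idea of derivation by substitution concrete, here is a minimal Python sketch that enumerates the language generated by the grammar above. The dictionary layout and the function name `expand` are illustrative assumptions, not part of any standard library, and the approach only works because this toy grammar is non-recursive (its language is finite).

```python
from itertools import product

# The grammar from the answer: each category maps to a list of alternatives,
# and each alternative is a sequence of categories or words.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "AJ", "P", "CNP"]],
    "D": [["an"]],
    "N": [["apple"]],
    "V": [["is"]],
    "AJ": [["red"]],
    "P": [["in"]],
    "CNP": [["color"]],
}


def expand(symbol):
    """Yield every word sequence derivable from `symbol`.

    Assumes a non-recursive grammar; a recursive one would never terminate.
    """
    if symbol not in GRAMMAR:
        # A plain word: nothing left to substitute.
        yield [symbol]
        return
    for alternative in GRAMMAR[symbol]:
        # Expand each part of the right-hand side, then combine the results.
        expansions = [list(expand(part)) for part in alternative]
        for parts in product(*expansions):
            yield [word for part in parts for word in part]


# The language generated by the grammar, as a set of strings.
language = {" ".join(words) for words in expand("S")}

print(language)
print("an apple is red in color" in language)  # generated by the grammar
print("apple is red in color" in language)     # not generated: no bare-N rule for NP
```

A string counts as a valid sentence under this grammar exactly when it appears in `language`. For realistic grammars the language is infinite, so one would use a parser rather than enumeration.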

An advantage of having intermediate categories such as NP available is that once you have described the fact that "apple" is not a good NP in subject position, it follows that it is also not good in other sentence positions, as in *"I'd like apple" (the asterisk marks an ungrammatical example).
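The reuse of NP can be sketched with a small top-down recognizer. The rule `VP -> TV NP`, the category `TV`, and the words "owl" and "wants" are hypothetical additions made only so that NP can appear in object position; they are not part of the grammar given above. The function names `match` and `is_sentence` are likewise illustrative.

```python
# Toy grammar: the original rules plus hypothetical ones (marked) that
# place NP in object position as well as subject position.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V", "AJ", "P", "CNP"], ["TV", "NP"]],  # second alternative is hypothetical
    "D": [["an"]],
    "N": [["apple"], ["owl"]],                      # "owl" is hypothetical
    "V": [["is"]],
    "TV": [["wants"]],                              # hypothetical transitive verb
    "AJ": [["red"]],
    "P": [["in"]],
    "CNP": [["color"]],
}


def match(symbols, tokens):
    """Return True if the sequence `symbols` can derive exactly `tokens`.

    Plain backtracking; it terminates only for grammars without left recursion.
    """
    if not symbols:
        return not tokens  # success only if every token was consumed
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:
        # A word must equal the next token.
        return bool(tokens) and tokens[0] == first and match(rest, tokens[1:])
    # A category: try each alternative in place of it.
    return any(match(alternative + rest, tokens) for alternative in GRAMMAR[first])


def is_sentence(text):
    """Check whether `text` is generated by the grammar, starting from S."""
    return match(["S"], text.lower().split())


print(is_sentence("An owl wants an apple"))  # NP is well formed in both positions
print(is_sentence("Owl wants an apple"))     # bare noun as subject
print(is_sentence("An owl wants apple"))     # bare noun as object
```

Because both positions are defined in terms of the single rule `NP -> D N`, the bare noun is rejected as subject and as object without any extra rule.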

CFPSGs have been widely used in the study of grammar and in computer science. The classic Unix tool yacc ("Yet Another Compiler-Compiler") is based on CFPSG, for instance, and the languages generated by CFPSGs are exactly those recognized by pushdown automata.
