NDLs vs. linear filters: a layman's analogy
Imagine that we want to send messages through messengers. Each messenger may carry only a portion of the message, so sending the complete message may require several messengers. We may choose to select the messengers based on their height ("frequency") range, for example, between 5'6" and 5'8". This way, a group of messengers in a different height range would carry a different message. To receive the message, we would need to interview the messengers in the respective height ("frequency") range.
During their travel (for example, by train), our messengers may commingle with other travellers, and when they arrive at their destination, they would be surrounded by other people of various heights. Let us first assume that these strangers form a completely disorganized crowd ("thermal noise").
To find our messengers, we would need to reject those people in the crowd whose height is outside of the 5'6" to 5'8" range, keeping only those within it. In our analogy, for a disorganized crowd both the NDL and the linear filter would make the same selection.
Still, there may be quite a few strangers left in our selection. In the figure above, 8 people fall within the 5'6" to 5'8" range, but only 5 of them are our messengers. To get our message, we would need to interview 8 people instead of 5, which would take more time and effort, reducing our "channel capacity".
Let us now assume that there is another group of people travelling together with our messengers in some organized, or structured, fashion, as shown by the blue color in the figure below.
In such an organized group, different "frequencies" are "coupled", forming some kind of "structure". For example, in the figure below the additional group consists of pairs of people of different heights. In our analogy, such a structured group would represent technogenic, or man-made, interference.
Our linear filter simply selects the heights ("frequencies") in the specified range, and will not discriminate between the strangers from the random ("thermal") crowd and those from the structured ("technogenic") group. As can be seen in the figure below, when applied to the whole group, the linear filter will select all our messengers (black) together with the members of the random crowd in the 5'6" to 5'8" range (red), as well as the members of the structured group whose height is between 5'6" and 5'8" (blue). Now we would need to interview 12 people instead of 5, which would take even more time and effort, further reducing our "channel capacity".
The NDL, on the other hand, is configured to recognize a specific structure of the noise, such as the coupling of people of different heights in our analogy. Further, the NDL is capable of rejecting such structured features before making the final selection of the people within the desired height range. As a result, the final selection would not contain any members of the organized group, regardless of how big this group is, as can be seen in the figure below. Such a disproportionately strong (nonlinear) impact of the NDL on the signal+noise mixture is a very important property of the NDL.
Note that, as can be seen in the figure above, the linear filter breaks up the pairs in the organized group, making it impossible to further identify the technogenic interference and separate it from the thermal noise. This is why the NDL must replace (or precede) the linear filter in the receiver chain. If the signal bandwidth has already been reduced by linear filtering and the structure of the technogenic interference has been altered, effective suppression of the man-made interference becomes unattainable.
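For readers who prefer code, the analogy itself (not an actual NDL) can be played out in a few lines of Python. This is only a toy sketch under stated assumptions: the names Person, pair_id, linear_select and drop_coupled are hypothetical, the individual heights are made up so that the head counts match the 5, 8 and 12 used above, and "recognizing the structure" is reduced to checking whether a person's pair partner is still present in the crowd.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional

LOW, HIGH = 66, 68  # the 5'6" to 5'8" range, in inches


@dataclass(frozen=True)
class Person:
    height: int                    # inches
    group: str                     # "messenger", "random", or "structured"
    pair_id: Optional[int] = None  # hypothetical tag: set only for coupled pairs


def linear_select(crowd: List[Person]) -> List[Person]:
    """'Linear filter' of the analogy: keep everyone inside the height range."""
    return [p for p in crowd if LOW <= p.height <= HIGH]


def drop_coupled(crowd: List[Person]) -> List[Person]:
    """Toy stand-in for structure rejection: drop anyone whose partner is present."""
    present = Counter(p.pair_id for p in crowd if p.pair_id is not None)
    return [p for p in crowd if p.pair_id is None or present[p.pair_id] < 2]


# Made-up crowd: 5 messengers, a random crowd with 3 people in range,
# and 4 pairs with one in-range and one out-of-range member each.
messengers = [Person(h, "messenger") for h in (66, 67, 67, 68, 66)]
random_crowd = [Person(h, "random") for h in (60, 63, 66, 67, 68, 70, 72, 74)]
pairs = [(67, 72), (66, 61), (68, 74), (67, 59)]
structured = [Person(h, "structured", i) for i, pair in enumerate(pairs) for h in pair]
crowd = messengers + random_crowd + structured

linear_only = linear_select(crowd)                # range selection alone
ndl_like = linear_select(drop_coupled(crowd))     # structure rejected first
wrong_order = drop_coupled(linear_select(crowd))  # pairs already broken up

print(len(linear_only), len(ndl_like), len(wrong_order))
```

In this toy crowd, range selection alone leaves 12 people to interview, rejecting the pairs first leaves 8, and applying the range selection first leaves 12 again, because each surviving pair member has lost the partner that made the pair recognizable.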
With this analogy in mind, please take a look at a more technical example here, and at the following illustrative application.