Ah well. Looks like I'll be talking to myself, for the most part, since the only commentators have all decided to beggar off (I could have said "bugger off", but I didn't).
So where was I?
Oh right, Information Theory: the idea of channels and connections; something that gets delivered, like a newspaper, or a message of some kind. In the abstract world of Information Theory, information is a string of arbitrary bits. You might chop the string up into chunks and call them words, or symbols, or tokens; again, this is arbitrary.
In good old binary, only two symbols (two states) are needed, and binary comes in convenient chunks of powers of 2.
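Just to make that concrete, here's a little Python sketch, entirely my own toy (the function name is made up, nothing standard): chop a raw bit string into fixed-width chunks and read each chunk as a symbol. The width is the arbitrary part.

```python
def chop_into_symbols(bits: str, width: int) -> list[int]:
    """Split a string of '0'/'1' characters into width-sized symbols,
    dropping any incomplete tail. The width is arbitrary: 8 gives you
    bytes, 16 gives you a 2^16-symbol alphabet, and so on."""
    usable = len(bits) - len(bits) % width
    return [int(bits[i:i + width], 2) for i in range(0, usable, width)]

stream = "0100100001101001"          # 16 arbitrary bits
print(chop_into_symbols(stream, 8))  # [72, 105] -- read as bytes, that's "Hi"
print(chop_into_symbols(stream, 4))  # [4, 8, 6, 9] -- same bits, different symbols
```

Same string both times; only the chunking changed, and with it the "alphabet".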
You can have arbitrarily complex alphabets, which are just representative "mappings" onto some set of possible symbols, say 2^16 of them. One subset might represent certain instructions, another might represent values, fractional and integral, another might be pixels in a compressed or uncompressed ("raw") image. Each depends, obviously, on how it is interpreted, or what is done with or to it (raw video data usually won't run as a process image on a CPU, for instance). The key is recognition: that every pattern is different.
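To see how much hangs on interpretation, here's a quick sketch using Python's struct module: the same four bytes read as an unsigned integer, as a floating-point number, and as four grayscale pixel values. None of these readings is "the" meaning; each is just a different mapping applied to an identical bit pattern.

```python
import struct

# Four bytes, chosen by me purely for illustration.
raw = bytes([0x42, 0x28, 0x00, 0x00])

as_int    = struct.unpack(">I", raw)[0]  # one 32-bit unsigned integer
as_float  = struct.unpack(">f", raw)[0]  # one IEEE-754 float
as_pixels = list(raw)                    # four 8-bit grayscale pixels

print(as_int)     # 1109917696
print(as_float)   # 42.0
print(as_pixels)  # [66, 40, 0, 0]
```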
If you, an observer, start receiving a stream of bits down some connection, and you don't know what the protocol is, all you can do is store it and wait for more.
You might compare what's arriving with what you have already received, in a kind of continuous search for some pattern (a repeated pattern, say). If it's just a continuous stream of random bits, you're sunk. Without some kind of repetition to "recognise", there's only meaningless information.
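That "continuous search for some pattern" could be as dumb as the sketch below, another toy of my own (not any real protocol detector): buffer what arrives, then check whether the buffer so far is just some short chunk repeating over and over.

```python
import random

def shortest_period(bits: str) -> int | None:
    """Return the length of the shortest chunk that, repeated, produces
    the whole buffer -- or None if no chunk shorter than the buffer works."""
    n = len(bits)
    for p in range(1, n // 2 + 1):
        if all(bits[i] == bits[i % p] for i in range(n)):
            return p
    return None

# A stream with structure: "1101" arriving over and over.
print(shortest_period("1101" * 8))  # 4 -- a pattern we can "recognise"

# A stream of random bits: almost certainly no period at all.
noise = "".join(random.choice("01") for _ in range(32))
print(shortest_period(noise))       # None, in all likelihood
```

On the structured stream the search locks onto the repeat; on the random one it comes back empty-handed, which is exactly the "you're sunk" case.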