information-theory

Fibonacci coding

Can anybody suggest a good book/paper/website/background reading about universal codes for integers, and especially the Fibonacci code (in the sense of http://en.wikipedia.org/wiki/Fibonacci_code)? Thanks! Edit: Thanks for the answers and the useful links so far! I am sorry if I have not made myself completely clear: I am not asking about co...
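For readers unfamiliar with the code being asked about: a Fibonacci codeword writes n as a sum of non-consecutive Fibonacci numbers (Zeckendorf's theorem) and appends a final 1 so every codeword ends in "11". A minimal Java sketch of the encoder, with class and method names of my own choosing:

    import java.util.ArrayList;
    import java.util.List;

    public class FibonacciCode {
        // Encode a positive integer as a Fibonacci (Zeckendorf) codeword.
        static String encode(long n) {
            if (n < 1) throw new IllegalArgumentException("n must be >= 1");
            // Fibonacci numbers 1, 2, 3, 5, 8, ... up to n.
            List<Long> fibs = new ArrayList<>();
            long a = 1, b = 2;
            while (a <= n) { fibs.add(a); long t = a + b; a = b; b = t; }
            // Greedy Zeckendorf decomposition, largest Fibonacci number first;
            // greed guarantees no two consecutive Fibonacci numbers are used.
            char[] bits = new char[fibs.size()];
            java.util.Arrays.fill(bits, '0');
            long rest = n;
            for (int i = fibs.size() - 1; i >= 0; i--) {
                if (fibs.get(i) <= rest) { bits[i] = '1'; rest -= fibs.get(i); }
            }
            return new String(bits) + "1";  // trailing 1 creates the "11" terminator
        }

        public static void main(String[] args) {
            for (long n = 1; n <= 6; n++)
                System.out.println(n + " -> " + encode(n));  // 1 -> 11, 2 -> 011, ...
        }
    }

The "11" terminator is what makes the code universal and self-delimiting: no codeword's prefix contains two adjacent ones.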

What is the computer science definition of entropy?

I've recently started a course on data compression at my university. However, I find the use of the term "entropy" as it applies to computer science rather ambiguous. As far as I can tell, it roughly translates to the "randomness" of a system or structure. What is the proper definition of computer science "entropy"? ...
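For concreteness, the usual computer-science definition is Shannon entropy: H = - sum over symbols x of p(x) * log2 p(x), in bits per symbol. A small self-contained Java sketch computing the empirical (order-0) entropy of a string's character distribution:

    import java.util.HashMap;
    import java.util.Map;

    public class Entropy {
        // Empirical (order-0) Shannon entropy in bits per symbol:
        // H = - sum over symbols x of p(x) * log2 p(x).
        static double entropy(String s) {
            Map<Character, Integer> counts = new HashMap<>();
            for (char c : s.toCharArray()) counts.merge(c, 1, Integer::sum);
            double h = 0.0, n = s.length();
            for (int count : counts.values()) {
                double p = count / n;
                h -= p * (Math.log(p) / Math.log(2));
            }
            return h;
        }

        public static void main(String[] args) {
            System.out.println(entropy("aaaa"));      // 0.0 (no uncertainty)
            System.out.println(entropy("abab"));      // 1.0 (one fair binary choice)
            System.out.println(entropy("abcdabcd"));  // 2.0
        }
    }

The "randomness" intuition is right in one direction: high entropy means each symbol is hard to predict, and hence expensive to encode.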

Relation of Entropy to Lossless Compression Rate

From Shannon's Source Coding Theorem we know that the expected length of an optimally compressed string is bounded by the entropy of the source like so: H(X) <= L < H(X) + 1/N, where H(X) is the entropy of the source, N is the length of the source string, and L is the expected length of the compressed string per source symbol. This necessarily means that ther...
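One way to see the lower bound empirically is to compare a real compressor's output against the order-0 estimate N * H(X) bits. A hedged sketch using java.util.zip.Deflater (buffer sizes and the random input are my own choices):

    import java.util.zip.Deflater;

    public class BoundDemo {
        // Order-0 entropy of a byte array, in bits per byte.
        static double entropyBitsPerByte(byte[] data) {
            int[] counts = new int[256];
            for (byte b : data) counts[b & 0xFF]++;
            double h = 0.0, n = data.length;
            for (int c : counts) if (c > 0) {
                double p = c / n;
                h -= p * (Math.log(p) / Math.log(2));
            }
            return h;
        }

        public static void main(String[] args) {
            byte[] data = new byte[100_000];
            new java.util.Random(42).nextBytes(data);  // incompressible input
            Deflater def = new Deflater(Deflater.BEST_COMPRESSION);
            def.setInput(data);
            def.finish();
            byte[] out = new byte[data.length + 1024];
            int compressed = def.deflate(out);
            def.end();
            System.out.printf("entropy bound: %.0f bytes, deflate: %d bytes%n",
                    data.length * entropyBitsPerByte(data) / 8, compressed);
        }
    }

On random input the deflate output hugs (and slightly exceeds) the entropy bound; on structured input it can beat the order-0 estimate because it exploits correlations the per-symbol model ignores.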

Shannon's entropy formula. Help my confusion.

Hi, my understanding of the entropy formula is that it's used to compute the minimum number of bits required to represent some data. It's usually worded differently when defined, but that is the understanding I have relied on until now. Here's my problem. Suppose I have a sequence of 100 '1's followed by 100 '0's = 200 bits. The al...
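Working the numbers in the question through the formula: with p('1') = p('0') = 100/200 = 0.5, the order-0 entropy is H = -(0.5 * log2 0.5 + 0.5 * log2 0.5) = 1 bit per symbol, so the formula says 200 bits, even though "100 ones then 100 zeros" has a far shorter description. The resolution is that the formula bounds codes treating each symbol as an independent draw from that distribution; it knows nothing about the ordering, which is exactly where this sequence's structure lives.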

Entropy repacking

I have been tossing around a conceptual idea for a machine (as in a Turing machine), and I'm wondering if any work has been done on this or related topics. The idea is a machine that takes an entropy stream and gives out random symbols in any range without losing any entropy. I'll grant that this is a far-from-rigorous description, so I'll g...
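For context on what "without losing any entropy" rules out, the usual baseline for mapping fair bits to an arbitrary range is rejection sampling, which is unbiased but discards entropy on every retry. A hedged Java sketch of that baseline (the question is asking for something strictly better):

    import java.util.Random;

    public class RangeFromBits {
        // Draw a uniform value in [0, range) from a stream of fair bits by
        // rejection: read ceil(log2(range)) bits, retry if the value overshoots.
        // Unbiased, but every retry throws the consumed bits away; that waste
        // is the entropy loss the question wants to avoid.
        static int uniform(Random bits, int range) {
            int nbits = 32 - Integer.numberOfLeadingZeros(range - 1);
            while (true) {
                int v = 0;
                for (int i = 0; i < nbits; i++)
                    v = (v << 1) | (bits.nextBoolean() ? 1 : 0);
                if (v < range) return v;   // accept
            }                              // otherwise reject and retry
        }

        public static void main(String[] args) {
            Random bits = new Random();
            int[] hist = new int[6];
            for (int i = 0; i < 60_000; i++) hist[uniform(bits, 6)]++;
            System.out.println(java.util.Arrays.toString(hist)); // roughly equal
        }
    }

Schemes that carry the rejected state forward (arithmetic-coding-style range conversion) approach zero waste, which sounds close to the machine being described.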

what part of numbers has more entropy?

Given the sequence of numbers N1, N2, N3... from some source (not a PRNG, but say sensor or logging data of some kind), is it safe to assume that processing it like this: Nn / B = Qn rem Mn, will result in the sequence Q having less entropy than the sequence M? Note: assume that B is such that both Q and M have the same-sized range. ...
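One way to probe the intuition empirically is to compute the order-0 entropy of the quotient stream and the remainder stream side by side. A hedged sketch; the divisor B and the synthetic input stand in for the real sensor log:

    import java.util.HashMap;
    import java.util.Map;

    public class QuotientVsRemainder {
        // Order-0 entropy in bits of a sequence of longs.
        static double entropy(long[] xs) {
            Map<Long, Integer> counts = new HashMap<>();
            for (long x : xs) counts.merge(x, 1, Integer::sum);
            double h = 0.0, n = xs.length;
            for (int c : counts.values()) {
                double p = c / n;
                h -= p * (Math.log(p) / Math.log(2));
            }
            return h;
        }

        public static void main(String[] args) {
            long B = 256;                        // placeholder divisor
            long[] data = new long[10_000];      // stand-in for the sensor log
            java.util.Random rng = new java.util.Random(1);
            for (int i = 0; i < data.length; i++) data[i] = rng.nextInt(1 << 16);
            long[] q = new long[data.length], m = new long[data.length];
            for (int i = 0; i < data.length; i++) {
                q[i] = data[i] / B;              // Qn
                m[i] = data[i] % B;              // Mn
            }
            System.out.printf("H(Q)=%.3f  H(M)=%.3f%n", entropy(q), entropy(m));
        }
    }

For typical sensor data the high-order digits (Q) are smooth and predictable while the low-order digits (M) look noisier, which matches the intuition in the question, but it is a property of the source, not a theorem.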

How to prove to our users that they are not being cheated?

I have an information theory question about how to prove (or at least give statistical evidence) that an auction website is not shilling its users. We recently launched a pay-per-bid auction website. It is a new type of auction where the users pay to bid on timed auctions. Each bid raises the price and increases the time of the auction...

Theory: Compression algorithm that makes some files smaller but none bigger?

I came across this question: "A lossless compression algorithm claims to guarantee to make some files smaller and no files larger. Is this: a) impossible, b) possible but may run for an indeterminate amount of time, c) possible for compression factor 2 or less, d) possible for any compression factor?" I'm leaning ...
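A counting sketch of why the strong claim fails: there are 2^(k+1) - 1 strings of length <= k. If no file ever grows, the compressor maps that finite set injectively into itself (injectivity is what "lossless" means), so it must permute it. Applying the same argument for every k in turn forces strings of length exactly k to map to strings of length exactly k, so nothing shrinks either; making some files smaller while making none larger is impossible.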

error correction code upper bound

If I want to send a d-bit packet and add another r bits for an error correction code (d > r), how many errors can I detect and correct at most? ...
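The standard ceiling here is the Hamming (sphere-packing) bound: with n = d + r total bits, a code correcting t errors must satisfy sum_{i=0..t} C(n, i) <= 2^r. A sketch computing the largest such t; note this is only an upper bound, and a code achieving it need not exist for every (d, r):

    import java.math.BigInteger;

    public class HammingBound {
        // Largest t such that sum_{i=0..t} C(n, i) <= 2^r, where n = d + r.
        static int maxCorrectable(int d, int r) {
            int n = d + r;
            BigInteger budget = BigInteger.ONE.shiftLeft(r); // 2^r
            BigInteger sum = BigInteger.ZERO;
            BigInteger binom = BigInteger.ONE;               // C(n, 0)
            int t = -1;
            for (int i = 0; i <= n; i++) {
                sum = sum.add(binom);
                if (sum.compareTo(budget) > 0) break;
                t = i;
                // C(n, i+1) = C(n, i) * (n - i) / (i + 1)
                binom = binom.multiply(BigInteger.valueOf(n - i))
                             .divide(BigInteger.valueOf(i + 1));
            }
            return t;
        }

        public static void main(String[] args) {
            System.out.println(maxCorrectable(4, 3));   // 1: the (7,4) Hamming code
            System.out.println(maxCorrectable(11, 5));  // 1
        }
    }

Detection capacity is roughly double: a code with minimum distance 2t + 1 that corrects t errors can instead be used to detect up to 2t errors.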

Calculating Mutual Information For Selecting a Training Set in Java

Scenario I am attempting to implement supervised learning over a data set within a Java GUI application. The user will be given a list of items or 'reports' to inspect and will label them based on a set of available labels. Once the supervised learning is complete, the labelled instances will then be given to a learning algorithm. Thi...
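For the mutual-information computation itself, a self-contained sketch from a joint count table (how the reports and labels are binned into the table is up to the application; all names here are placeholders): I(X;Y) = sum over x,y of p(x,y) * log2( p(x,y) / (p(x) * p(y)) ).

    public class MutualInformation {
        // I(X;Y) estimated from a joint count table counts[x][y].
        static double mutualInformation(int[][] counts) {
            int rows = counts.length, cols = counts[0].length;
            double n = 0;
            double[] px = new double[rows], py = new double[cols];
            for (int x = 0; x < rows; x++)
                for (int y = 0; y < cols; y++) {
                    n += counts[x][y];
                    px[x] += counts[x][y];
                    py[y] += counts[x][y];
                }
            double mi = 0.0;
            for (int x = 0; x < rows; x++)
                for (int y = 0; y < cols; y++) {
                    if (counts[x][y] == 0) continue;     // 0 * log 0 := 0
                    double pxy = counts[x][y] / n;
                    mi += pxy * (Math.log(pxy / ((px[x] / n) * (py[y] / n)))
                                 / Math.log(2));
                }
            return mi;
        }

        public static void main(String[] args) {
            int[][] independent = {{25, 25}, {25, 25}};
            int[][] identical   = {{50, 0}, {0, 50}};
            System.out.println(mutualInformation(independent)); // 0.0
            System.out.println(mutualInformation(identical));   // 1.0
        }
    }

For training-set selection the usual move is to pick the unlabelled item whose features have the highest MI with the label variable under the current model.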

Extended Huffman code

Hello, I have this homework: finding the code words for the symbols in any given alphabet. It says I have to use binary Huffman on groups of three symbols. What does that mean exactly? Do I use regular Huffman on [alphabet]^3? If so, how do I then tell the difference between the 3 symbols in a group? ...
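On the "groups of three" point: extended Huffman coding treats each block of three source symbols as one super-symbol over [alphabet]^3 with the product probabilities, so a single codeword stands for the whole triple and there is nothing to tell apart inside it. A hedged sketch; the two-symbol alphabet and its probabilities are placeholders:

    import java.util.*;

    public class ExtendedHuffman {
        static class Node implements Comparable<Node> {
            final String sym; final double p; final Node left, right;
            Node(String sym, double p) { this(sym, p, null, null); }
            Node(String sym, double p, Node l, Node r) {
                this.sym = sym; this.p = p; left = l; right = r;
            }
            public int compareTo(Node o) { return Double.compare(p, o.p); }
        }

        // Walk the tree; leaves carry the triple and its codeword.
        static void collect(Node n, String prefix, Map<String, String> out) {
            if (n.left == null) { out.put(n.sym, prefix.isEmpty() ? "0" : prefix); return; }
            collect(n.left, prefix + "0", out);
            collect(n.right, prefix + "1", out);
        }

        public static void main(String[] args) {
            // Placeholder source: P(a)=0.8, P(b)=0.2, symbols assumed i.i.d.
            String[] alphabet = {"a", "b"};
            double[] prob = {0.8, 0.2};
            // Extended alphabet: all triples, with product probabilities.
            PriorityQueue<Node> pq = new PriorityQueue<>();
            for (int i = 0; i < alphabet.length; i++)
                for (int j = 0; j < alphabet.length; j++)
                    for (int k = 0; k < alphabet.length; k++)
                        pq.add(new Node(alphabet[i] + alphabet[j] + alphabet[k],
                                        prob[i] * prob[j] * prob[k]));
            // Standard Huffman: repeatedly merge the two least probable nodes.
            while (pq.size() > 1) {
                Node a = pq.poll(), b = pq.poll();
                pq.add(new Node(a.sym + b.sym, a.p + b.p, a, b));
            }
            Map<String, String> code = new TreeMap<>();
            collect(pq.poll(), "", code);
            code.forEach((t, c) -> System.out.println(t + " -> " + c));
        }
    }

The expected length per original symbol drops toward the source entropy as the block size grows, which is the point of the extension.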

Algorithm for rating the monotonicity of an array (i.e. judging the "sortedness" of an array)

EDIT: Wow, many great responses. Yes, I am using this as a fitness function for judging the quality of a sort performed by a genetic algorithm. So cost-of-evaluation is important (i.e., it has to be fast, preferably O(n).) As part of an AI application I am toying with, I'd like to be able to rate a candidate array of integers base...
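One cheap O(n) fitness function consistent with what's being asked: the fraction of adjacent pairs already in order (1.0 = sorted ascending, about 0.5 = random, 0.0 = strictly descending). A sketch:

    public class Sortedness {
        // Fraction of adjacent pairs (a[i], a[i+1]) with a[i] <= a[i+1].
        // O(n); returns 1.0 for a sorted array, about 0.5 for a random one.
        static double sortedness(int[] a) {
            if (a.length < 2) return 1.0;
            int inOrder = 0;
            for (int i = 0; i + 1 < a.length; i++)
                if (a[i] <= a[i + 1]) inOrder++;
            return inOrder / (double) (a.length - 1);
        }

        public static void main(String[] args) {
            System.out.println(sortedness(new int[]{1, 2, 3, 4}));  // 1.0
            System.out.println(sortedness(new int[]{4, 3, 2, 1}));  // 0.0
            System.out.println(sortedness(new int[]{1, 3, 2, 4}));  // 0.666...
        }
    }

The caveat: adjacent-pair counting misses global disorder (e.g. {5, 6, 7, 1, 2, 3} scores well); a full inversion count via a modified merge sort is the more faithful measure, at O(n log n).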

redundant encoding?

This is more of a computer science / information theory question than a straightforward programming one, so if anyone knows of a better site to post this, please let me know. Let's say I have an N-bit piece of data that will be sent redundantly in M messages, where at least M-1 of those messages will be received successfully. I am inte...
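If "at least M-1 of M arrive" means exactly one message can be lost, and the receiver knows which one (an erasure, not a corruption), the classic scheme is to split the data into M-1 shards and send an XOR parity as the M-th message; any missing shard is the XOR of everything that did arrive. A hedged sketch:

    public class XorParity {
        // Split data into m-1 equal shards plus one XOR-parity shard.
        // Any single missing shard can be rebuilt by XORing the other m-1.
        static byte[][] encode(byte[] data, int m) {
            int shardLen = (data.length + m - 2) / (m - 1);  // ceil division
            byte[][] shards = new byte[m][shardLen];
            for (int i = 0; i < data.length; i++)
                shards[i / shardLen][i % shardLen] = data[i];
            for (int s = 0; s < m - 1; s++)
                for (int j = 0; j < shardLen; j++)
                    shards[m - 1][j] ^= shards[s][j];        // parity shard
            return shards;
        }

        // Rebuild shard `lost` from the others (erasure position is known).
        static byte[] recover(byte[][] shards, int lost) {
            byte[] out = new byte[shards[0].length];
            for (int s = 0; s < shards.length; s++)
                if (s != lost)
                    for (int j = 0; j < out.length; j++) out[j] ^= shards[s][j];
            return out;
        }

        public static void main(String[] args) {
            byte[][] shards = encode("hello world!".getBytes(), 4);
            byte[] rebuilt = recover(shards, 1);   // pretend shard 1 was lost
            System.out.println(java.util.Arrays.equals(shards[1], rebuilt)); // true
        }
    }

This sends roughly N * M / (M-1) bits in total instead of N * M for naive repetition, which is optimal for a single known erasure.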

redundant encoding of a sampled counter

This is an extension of my other question about redundant encoding, but the two are different enough that I wanted to split them up. What if I have a counter that increments once per tick, and at erratic rates a message will be sent out containing information about the counter? If I know that the interval between messages will be at most ...
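A common trick for this shape of problem, offered as a sketch: if the gap between messages is at most G ticks, sending only the b = ceil(log2(G + 1)) low-order bits of the counter lets the receiver rebuild the full value from its last known one:

    public class CounterSync {
        // Reconstruct a monotonically increasing counter from its low b bits,
        // assuming the true value advanced by at most 2^b - 1 since lastFull.
        static long reconstruct(long lastFull, long lowBits, int b) {
            long mask = (1L << b) - 1;
            long delta = (lowBits - lastFull) & mask;  // wrap-around difference
            return lastFull + delta;
        }

        public static void main(String[] args) {
            int b = 8;                                  // gaps of at most 255 ticks
            long counter = 1000, last = counter;
            for (long gap : new long[]{3, 200, 255, 1}) {
                counter += gap;
                long received = counter & ((1L << b) - 1);  // what the message carries
                last = reconstruct(last, received, b);
                System.out.println(last == counter);        // true every time
            }
        }
    }

The information-theoretic reading: given the last value and the gap bound, the counter has at most log2(G + 1) bits of fresh uncertainty per message, and this encoding sends essentially exactly that.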

Practical way of explaining "Information Theory"

Information theory comes into play wherever encoding and decoding are present, for example in compression (multimedia) and cryptography. In information theory we encounter terms like "entropy", "self-information", and "mutual information", and the entire subject is based on these terms, which sound like nothing more than abstractions. Frankly, they don't r...
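One concrete anchor that may help: self-information is I(x) = -log2 p(x), the number of bits an event "deserves". A fair coin flip (p = 1/2) carries 1 bit; an event with p = 1/8 carries 3 bits; a certain event carries 0. Entropy is just the average of that over the whole distribution, which is why it doubles as the best achievable bits-per-symbol in compression, and mutual information is how much one variable's entropy drops once you know the other.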

A good intro to information theory, please?

I know about Wikipedia and MacKay's Information Theory, Inference, and Learning Algorithms (is it appropriate as a textbook?). I am looking for a textbook that starts with Shannon's entropy and goes through conditional entropy and mutual information... Any idea? If you are following such a course at your university, which textbook is used? Thanks...

how to compute information entropy in a two-step decision?

I have a question which I think involves "conditional entropy" in the field of information theory. I am trying to wrap my head around it, but could use some help. Consider an example in which we have four houses. In the first house there are eight people, four people live in the second house, and there are two people in the third house,...
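A hedged sketch of the two-step computation, with a made-up joint distribution rather than the questioner's numbers: H(X|Y) = sum over y of p(y) * H(X | Y = y), i.e. the entropy left in the second question, averaged over the answers to the first.

    public class ConditionalEntropy {
        static double log2(double x) { return Math.log(x) / Math.log(2); }

        // H(X|Y) = sum over y of p(y) * H(X | Y = y),
        // given the joint distribution pJoint[y][x].
        static double conditionalEntropy(double[][] pJoint) {
            double h = 0.0;
            for (double[] row : pJoint) {
                double py = 0.0;
                for (double pxy : row) py += pxy;
                if (py == 0) continue;
                for (double pxy : row)
                    if (pxy > 0) h -= pxy * log2(pxy / py);  // pxy / py = p(x|y)
            }
            return h;
        }

        public static void main(String[] args) {
            // Hypothetical example: Y = which group of houses, X = exact house.
            double[][] p = {{0.50, 0.25, 0.0, 0.0},    // y = first group
                            {0.0, 0.0, 0.125, 0.125}}; // y = second group
            System.out.println(conditionalEntropy(p));
        }
    }

The chain rule ties the two steps together: H(X) = H(Y) + H(X|Y) whenever Y is a function of X (such as "which group the house is in").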

How to adjust the distribution of values in a random data stream?

Given an infinite stream of random 0's and 1's that comes from a biased (e.g. 1's are more common than 0's by a known factor) but otherwise ideal random number generator, I want to convert it into a (shorter) infinite stream that is just as ideal but also unbiased. Looking up the definition of entropy finds this graph showing how many bits o...
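The classical answer to exactly this problem is von Neumann's trick: read the stream in pairs, emit 0 for "01" and 1 for "10", and discard "00" and "11". The output is exactly unbiased whenever the input bits are independent, whatever the bias, because P(01) = P(10) = p(1-p). A sketch:

    import java.util.Random;
    import java.util.function.BooleanSupplier;

    public class VonNeumann {
        // Von Neumann extractor: consume bits in pairs; 01 -> 0, 10 -> 1,
        // 00 and 11 -> discard and read another pair.
        static boolean nextUnbiasedBit(BooleanSupplier biasedBits) {
            while (true) {
                boolean a = biasedBits.getAsBoolean();
                boolean b = biasedBits.getAsBoolean();
                if (a != b) return a;    // 10 -> 1, 01 -> 0
            }
        }

        public static void main(String[] args) {
            Random rng = new Random();
            BooleanSupplier biased = () -> rng.nextDouble() < 0.9;  // 90% ones
            int ones = 0, trials = 100_000;
            for (int i = 0; i < trials; i++)
                if (nextUnbiasedBit(biased)) ones++;
            System.out.println(ones / (double) trials);  // approximately 0.5
        }
    }

It is wasteful relative to the entropy graph in the question: it yields only p(1-p) output bits per input pair, well below the stream's entropy; iterated variants and arithmetic-coding-based extractors get arbitrarily close to the entropy limit.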

Is information a subset of data?

I apologize as I don't know whether this is more of a math question that belongs on mathoverflow or if it's a computer science question that belongs here. That said, I believe I understand the fundamental difference between data, information, and knowledge. My understanding is that information carries both data and meaning. One thin...

Can information encoded with a one time pad be distinguished from random noise?

I understand that the cyphertext from a properly used one time pad cypher reveals absolutely no data about the encrypted message. Does this mean that there is no way to distinguish a message encrypted with a one time pad from completely random noise? Or is there some theoretical way to determine that there is a message, even though you ...
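The standard argument for "no": if the key K is uniform over n-bit strings and C = M XOR K, then for every fixed ciphertext c, P(C = c) = sum over m of P(M = m) * P(K = c XOR m) = 2^-n, regardless of the message distribution. The ciphertext is itself uniform random noise, so no statistical test can distinguish the two; any test that could would contradict the pad's perfect secrecy.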