How does entropy depend on location and scale?
The entropy of a continuous distribution with density function $f$ is defined to be the negative of the expectation of $\log(f),$ and therefore equals
$$H_f = -\int_{-\infty}^{\infty} \log(f(x))\, f(x)\,\mathrm{d}x.$$
We also say that any random variable $X$ whose distribution has density $f$ has entropy $H_f.$ (This integral is well-defined even when $f$ has zeros, because $\log(f(x))f(x)$ can be taken to equal zero at such values.)
When $X$ and $Y$ are random variables for which $Y = X+\mu$ ($\mu$ is a constant), $Y$ is said to be a version of $X$ shifted by $\mu.$ Similarly, when $Y = X\sigma$ ($\sigma$ is a positive constant), $Y$ is said to be a version of $X$ scaled by $\sigma.$ Combining a scale with a shift gives $Y = X\sigma + \mu.$
These relations occur frequently. For instance, changing the units of measurement of $X$ shifts and scales it.
How is the entropy of $Y = X\sigma + \mu$ related to that of $X?$
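To make the question concrete, here is a quick numerical illustration of the setup (my own sketch; the standard Normal and the particular values of $\mu$ and $\sigma$ are arbitrary choices, and the entropy is evaluated by straightforward numerical integration):

```python
import numpy as np
from scipy import integrate, stats

# Density of X: take the standard Normal purely as a concrete example.
f_X = stats.norm(loc=0, scale=1).pdf

# Density of Y = X*sigma + mu for an arbitrary shift and scale.
mu, sigma = 3.0, 2.5
f_Y = stats.norm(loc=mu, scale=sigma).pdf

def entropy(density, lo=-50.0, hi=50.0):
    """Numerically evaluate -integral of log(density(x)) * density(x) dx."""
    def integrand(x):
        p = density(x)
        return -p * np.log(p) if p > 0 else 0.0
    value, _ = integrate.quad(integrand, lo, hi)
    return value

print(entropy(f_X))  # entropy of X
print(entropy(f_Y))  # entropy of Y -- how does it relate to the value above?
```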
distributions data-transformation entropy
asked 8 hours ago
whuber♦
1 Answer
Since the probability element of $X$ is $f(x)\,\mathrm{d}x,$ the change of variable $y = x\sigma + \mu$ is equivalent to $x = (y-\mu)/\sigma,$ whence from
$$f(x)\,\mathrm{d}x = f\left(\frac{y-\mu}{\sigma}\right)\mathrm{d}\left(\frac{y-\mu}{\sigma}\right) = \frac{1}{\sigma} f\left(\frac{y-\mu}{\sigma}\right) \mathrm{d}y$$
it follows that the density of $Y$ is
$$f_Y(y) = \frac{1}{\sigma}f\left(\frac{y-\mu}{\sigma}\right).$$
Consequently the entropy of $Y$ is
$$H(Y) = -\int_{-\infty}^{\infty} \log\left(\frac{1}{\sigma}f\left(\frac{y-\mu}{\sigma}\right)\right) \frac{1}{\sigma}f\left(\frac{y-\mu}{\sigma}\right) \mathrm{d}y$$
which, upon changing the variable back to $x = (y-\mu)/\sigma,$ produces
$$\eqalign{
H(Y) &= -\int_{-\infty}^{\infty} \log\left(\frac{1}{\sigma}f\left(x\right)\right) f\left(x\right) \mathrm{d}x \\
&= -\int_{-\infty}^{\infty} \left(\log\left(\frac{1}{\sigma}\right) + \log\left(f\left(x\right)\right)\right) f\left(x\right) \mathrm{d}x \\
&= \log\left(\sigma\right) \int_{-\infty}^{\infty} f(x)\, \mathrm{d}x -\int_{-\infty}^{\infty} \log\left(f\left(x\right)\right) f\left(x\right) \mathrm{d}x \\
&= \log(\sigma) + H_f.
}$$
These calculations used basic properties of the logarithm, the linearity of integration, and the fact that $f(x)\,\mathrm{d}x$ integrates to unity (the Law of Total Probability).
The conclusion is
The entropy of $Y = X\sigma + \mu$ is the entropy of $X$ plus $\log(\sigma).$
In words, shifting a random variable does not change its entropy (we may think of the entropy as depending on the values of the probability density, but not on where those values occur), while scaling a variable (which, for $\sigma \ge 1,$ "stretches" or "smears" it out) increases its entropy by $\log(\sigma).$ This supports the intuition that high-entropy distributions are "more spread out" than low-entropy distributions.
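As a quick sanity check (my own addition, not part of the derivation), the identity $H(Y) = H(X) + \log\sigma$ can be verified by Monte Carlo for any density whose log-density is available; here is a minimal sketch using an Exponential(1) variable and SciPy's `expon` distribution, with arbitrarily chosen $\mu$ and $\sigma$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma = 4.0, 3.0

# X ~ Exponential(1); Y = X*sigma + mu has density (1/sigma) f((y - mu)/sigma).
x = rng.exponential(scale=1.0, size=10**6)
y = sigma * x + mu

# Monte Carlo estimate of H = E[-log f] under each variable's own density.
h_x = -np.mean(stats.expon.logpdf(x))
h_y = -np.mean(stats.expon.logpdf(y, loc=mu, scale=sigma))

print(h_x)        # close to 1.0, the entropy of Exponential(1)
print(h_y - h_x)  # close to log(3) ~ 1.0986, i.e. log(sigma)
```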
As a consequence of this result, we are free to choose convenient values of $\mu$ and $\sigma$ when computing the entropy of any distribution. For example, the entropy of a Normal$(\mu,\sigma)$ distribution can be found by setting $\mu=0$ and $\sigma=1.$ The logarithm of the density in this case is
$$\log(f(x)) = -\frac{1}{2}\log(2\pi) - x^2/2,$$
whence
$$H = -E\left[-\frac{1}{2}\log(2\pi) - X^2/2\right] = \frac{1}{2}\log(2\pi) + \frac{1}{2}.$$
Consequently the entropy of a Normal$(\mu,\sigma)$ distribution is obtained simply by adding $\log\sigma$ to this result, giving
$$H = \frac{1}{2}\log(2\pi) + \frac{1}{2} + \log(\sigma) = \frac{1}{2}\log(2\pi\,e\,\sigma^2)$$
as reported by Wikipedia.
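For what it's worth, this closed form agrees with SciPy's built-in entropy computation (a small sketch; the specific $\mu$ and $\sigma$ below are arbitrary), which also makes visible that the value does not depend on $\mu$:

```python
import numpy as np
from scipy import stats

mu, sigma = -1.5, 0.7
closed_form = 0.5 * np.log(2 * np.pi * np.e * sigma**2)
scipy_value = stats.norm(loc=mu, scale=sigma).entropy()

print(closed_form, scipy_value)  # the two values agree (in nats)
```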
answered 8 hours ago
whuber♦