How can I visualize an ordinal variable predicting a continuous outcome?Treating ordinal variables as...

Are devices supposed to automatically be removed from iCloud when all content and settings are erased?

Why isn't there armor to protect from spells in the Potterverse?

Impersonating user with Core Service App and Angular client

Phonetic distortion when words are borrowed among languages

Smallest PRIME containing the first 11 primes as sub-strings

Why does `FindFit` fail so badly in this simple case?

Is it possible for a company to grow but its stock price stays the same or decrease?

Windows 10 deletes lots of tiny files super slowly. Anything that can be done to speed it up?

Would an object shot from earth fall into the sun?

Knights and Knaves: What does C say?

How important is knowledge of trig identities for use in Calculus

What would happen if I build a half bath without permits?

Top off gas with old oil, is that bad?

Can adverbs modify adjectives?

How is the Apple Watch ECG disabled in certain countries?

GPLv3 forces us to make code available, but to who?

Lost passport and visa, tried to reapply, got rejected twice. What are my next steps?

Is there any site with telescopes data?

Science fiction episode about the creation of a living pegasus, even though flightless

How to identify whether a publisher is genuine or not?

How do we know neutrons have no charge?

Why does it seem the best way to make a living is to invest in real estate?

Create the same subfolders in another folder

Convert a string of digits from words to an integer

How can I visualize an ordinal variable predicting a continuous outcome?

Treating ordinal variables as continuous for regression problemsHow to visualize (make plot) of regression output against categorical input variable?plot predicted values from a cumulative link model (clm, ordinal)Ordinal vs. Continuous Variable and Appropriate Method for Testing Difference of GroupsR Mediate: How to interpret output with an ordinal outcome?continuous independent variable with three levels

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ margin-bottom:0;
}

What is the best way to visualize the relationship between an ordinal predictor and a continuous outcome?

So far I have the below, but I feel like this is lacking...

Enter image description here

The way I modeled it is I treated the ordinal predictor as an interval instead of categorical. If this is not the best way to treat this type of data, I'd appreciate the feedback.

edited 36 mins ago

Peter Mortensen

2052 silver badges8 bronze badges

asked 19 hours ago

762

527 bronze badges

4

$begingroup$
1. Is there any particular reason to imagine the relationship would be linear? 2. Is there indeed any need to draw any kind of curve or line? Why not simply mark in the means (or any other suitable measure of location) in each category? 3. Can you say more about the continuous outcome? What kind of thing is it?
$endgroup$
– Glen_b♦
17 hours ago

add a comment
|

What is the best way to visualize the relationship between an ordinal predictor and a continuous outcome?

So far I have the below, but I feel like this is lacking...

Enter image description here

The way I modeled it is I treated the ordinal predictor as an interval instead of categorical. If this is not the best way to treat this type of data, I'd appreciate the feedback.

edited 36 mins ago

Peter Mortensen

2052 silver badges8 bronze badges

asked 19 hours ago

762

527 bronze badges

4

$begingroup$
1. Is there any particular reason to imagine the relationship would be linear? 2. Is there indeed any need to draw any kind of curve or line? Why not simply mark in the means (or any other suitable measure of location) in each category? 3. Can you say more about the continuous outcome? What kind of thing is it?
$endgroup$
– Glen_b♦
17 hours ago

add a comment
|

What is the best way to visualize the relationship between an ordinal predictor and a continuous outcome?

So far I have the below, but I feel like this is lacking...

Enter image description here

The way I modeled it is I treated the ordinal predictor as an interval instead of categorical. If this is not the best way to treat this type of data, I'd appreciate the feedback.

edited 36 mins ago

Peter Mortensen

2052 silver badges8 bronze badges

asked 19 hours ago

762

527 bronze badges

What is the best way to visualize the relationship between an ordinal predictor and a continuous outcome?

So far I have the below, but I feel like this is lacking...

Enter image description here

The way I modeled it is I treated the ordinal predictor as an interval instead of categorical. If this is not the best way to treat this type of data, I'd appreciate the feedback.

data-visualization ordinal-data

edited 36 mins ago

Peter Mortensen

2052 silver badges8 bronze badges

asked 19 hours ago

762

527 bronze badges

edited 36 mins ago

Peter Mortensen

2052 silver badges8 bronze badges

asked 19 hours ago

762

527 bronze badges

edited 36 mins ago

Peter Mortensen

2052 silver badges8 bronze badges

edited 36 mins ago

Peter Mortensen

2052 silver badges8 bronze badges

edited 36 mins ago

Peter Mortensen

2052 silver badges8 bronze badges

asked 19 hours ago

762

527 bronze badges

asked 19 hours ago

762

527 bronze badges

asked 19 hours ago

762

527 bronze badges

4

$begingroup$
1. Is there any particular reason to imagine the relationship would be linear? 2. Is there indeed any need to draw any kind of curve or line? Why not simply mark in the means (or any other suitable measure of location) in each category? 3. Can you say more about the continuous outcome? What kind of thing is it?
$endgroup$
– Glen_b♦
17 hours ago

add a comment
|

4

$begingroup$
1. Is there any particular reason to imagine the relationship would be linear? 2. Is there indeed any need to draw any kind of curve or line? Why not simply mark in the means (or any other suitable measure of location) in each category? 3. Can you say more about the continuous outcome? What kind of thing is it?
$endgroup$
– Glen_b♦
17 hours ago

1. Is there any particular reason to imagine the relationship would be linear? 2. Is there indeed any need to draw any kind of curve or line? Why not simply mark in the means (or any other suitable measure of location) in each category? 3. Can you say more about the continuous outcome? What kind of thing is it?

– Glen_b♦
17 hours ago

add a comment
|

6 Answers
6

active

oldest

votes

The problem with this is that there's no way of knowing how many dots are bunched up together. Two solutions I've seen:

Box plot

This would give you tighter box if data points are bunched up together.

Bubble chart

Not sure if this is the official name, but basically you put the vertical axis into bins. The size of the bubble is determined by how many observations fall into that bin.

enter image description here

answered 18 hours ago

Art

3351 silver badge7 bronze badges

$begingroup$
How does the bubble chart help to display the ordinal variable? Maybe you can change your example image to one with ordinal data.
$endgroup$
– Pieter
15 hours ago

$begingroup$
@Pieter Thanks for the suggestion but I couldn't find one... but you can see the variable on the x-axis are all integers (11, 12, 13, 14, ...) so that's one example of discrete data, and you can treat ordinal values as discrete data.
$endgroup$
– Art
15 hours ago

add a comment
|

The plot you shown is pretty good. But I think you can improve the data-ink ratio (invented by Edward Tufte) even more by showing all the datapoints. You can do this by adding jitter to the x-axis.

Another improvement is to emphasise that the ordinal variable is categorical and not continuous. You can do this by using a different colour for the different levels.

As an example I have plotted the titanic dataset in R, using the passenger class as an ordinal variable and the passenger age as the continuous variable.

library(tidyverse)

library(ggplot2)

library(titanic)



df <- titanic_train %>% mutate(Class=factor(Pclass))



ggplot(df, aes(Class, Age, color=Class)) +

  geom_jitter(height = 0) + 

  ggtitle("Titanic passenger age vs. class")

enter image description here

answered 15 hours ago

Pieter

1,4976 silver badges19 bronze badges

add a comment
|

To your scatterplot, I would add a large point indicating the mean Y-value at every unique X-value, and also do one or more of the following:

Square-root (or cube-root) transform your Y-axis. Both these
transformations can deal with zeroes, unlike log transformations.
Cube roots can also deal with negative numbers.

Make the points a bit transparent.

Add a little jitter to the X-axis values if the previous steps are insufficient.

As Glen_b notes, there is insufficient information right now to note whether adding a linear regression line is meaningful.

edited 17 hours ago

answered 18 hours ago

mkt

7,4536 gold badges31 silver badges89 bronze badges

$begingroup$
Are there any zeros in the response?
$endgroup$
– Nick Cox
17 hours ago

$begingroup$
@NickCox Hard to say for sure. I squinted at it and it seemed like there might be. See X = 3, for example.
$endgroup$
– mkt
17 hours ago

$begingroup$
Question remains for OP, who should know. Incidentally I am a small fan of cube roots, used in Miles, Stokes, Vieli, Cox in Nature. We had to work hard to persuade reviewers that they were a good idea for a response that was variously positive and negative. But for the question here I would favour a Poisson model, which can work fine for non-negative continuous responses.
$endgroup$
– Nick Cox
16 hours ago

$begingroup$
@NickCox Agreed, but the question was about how to visualise, not how to model.
$endgroup$
– mkt
16 hours ago

1

$begingroup$
Indeed. That was a comment not an answer. But a Poisson model would imply plotting on log scale with a secondary question on how to plot observed zeros.
$endgroup$
– Nick Cox
15 hours ago

add a comment
|

You state that one variable is ordinal, then you decide to treat it as interval. Is that reasonable? There is no way for us to know, as you have not said what the ordinal variable actually is. If you do decide to keep it as ordinal, then what to do depends on your sample size. If N is very large then I like the box plot solution. If N is not so large, then I like jitter. There are other additions you can make to the scatterplot as well - I wrote a presentation about this using SAS, but I am sure it could be duplicated in R. (If that link does not work, Googling flom, scatterplots, enhancements should find it).

But what if treating the variable as interval is not reasonable? You could come to this conclusion either substantively or by trying different codings and seeing how results change. In that case, I suggest trying optimal scaling. There is an R package optiscale that may help (I have not used this package).

answered 15 hours ago

Peter Flom♦

80.7k13 gold badges117 silver badges228 bronze badges

add a comment
|

The basic idea of regression is that the probability distribution of $y$ depends on $x$: there is some family of distributions $P_x(y)$. It's generally assumed that these distributions are all normal with a constant standard deviation (homoscedasticity), leaving only the mean as depending on $x$: $p(Y=y) = N(mu_x,sigma)$. With continuous data, you typically get only one $y$ for finitely many $x$, and no $y$ for the rest, making estimating $mu_x$ by just looking at your sample $y$ for that $x$ unworkable. So a further assumption is often made that $u_x$ is a simple linear function of $x$, so that $p(Y=y) = N(mx+b,sigma)$ for some numbers $m, b, sigma$. The linear regression formula then gives you an estimate of $m$ (slope) and $b$ (intercept) for your data.

Here, you seem to have highly skewed data, and there seems to be a general trend of decreasing spread, so if you were to use linear regression, the normality and homoscedasticity assumptions would be problematic. But you appear to have a large dataset for each value of $x$. So to estimate $mu$ for a particular $x$, there is no need to use the linear regression formula; you can simply take $bar y$ for each $x$. Which is more informative for predicting a $y$ for $x=4$: looking at the $y$ values for $x=4$, or looking at the $y$ values for $x=3$ and $x=5$, and trying interpolate between them?

You may want to show summary statistics other than just $bar y_x$. A box plot can show meadian and quartiles, for instance. You might also want to represent the standard deviation somehow.

You could also show the entire distributions. You could do that with x-dither, as Pieter suggested, or with another type of chart, such as density plots. You could put them side-by-side as in Pieter's answer, but with only six categories, it might be possible to combine them into one chart with the categories separated by colors. Here's a discussion of histograms and density plots: https://towardsdatascience.com/histograms-and-density-plots-in-python-f6bda88f5ac0

answered 3 hours ago

Acccumulation

1,8612 silver badges7 bronze badges

add a comment
|

In addition to the box plot suggested by Art, I suggest a violin plot:

enter image description here

Explicitly showing the median and interquartile range, as done in the above image, is optional.

Quoting from Wikipedia:

Violin plots are similar to box plots, except that they also show the probability density of the data at different values, usually smoothed by a kernel density estimator.

A violin plot is more informative than a plain box plot. While a box plot only shows summary statistics such as mean/median and interquartile ranges, the violin plot shows the full distribution of the data. The difference is particularly useful when the data distribution is multimodal (more than one peak).

answered 1 hour ago

user76284

12014 bronze badges

add a comment
|

Your Answer

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "65"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f428445%2fhow-can-i-visualize-an-ordinal-variable-predicting-a-continuous-outcome%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

6 Answers
6

active

oldest

votes

6 Answers
6

active

oldest

votes

The problem with this is that there's no way of knowing how many dots are bunched up together. Two solutions I've seen:

Box plot

This would give you tighter box if data points are bunched up together.

Bubble chart

Not sure if this is the official name, but basically you put the vertical axis into bins. The size of the bubble is determined by how many observations fall into that bin.

enter image description here

answered 18 hours ago

Art

3351 silver badge7 bronze badges

$begingroup$
How does the bubble chart help to display the ordinal variable? Maybe you can change your example image to one with ordinal data.
$endgroup$
– Pieter
15 hours ago

$begingroup$
@Pieter Thanks for the suggestion but I couldn't find one... but you can see the variable on the x-axis are all integers (11, 12, 13, 14, ...) so that's one example of discrete data, and you can treat ordinal values as discrete data.
$endgroup$
– Art
15 hours ago

add a comment
|

The problem with this is that there's no way of knowing how many dots are bunched up together. Two solutions I've seen:

Box plot

This would give you tighter box if data points are bunched up together.

Bubble chart

Not sure if this is the official name, but basically you put the vertical axis into bins. The size of the bubble is determined by how many observations fall into that bin.

enter image description here

answered 18 hours ago

Art

3351 silver badge7 bronze badges

$begingroup$
How does the bubble chart help to display the ordinal variable? Maybe you can change your example image to one with ordinal data.
$endgroup$
– Pieter
15 hours ago

$begingroup$
@Pieter Thanks for the suggestion but I couldn't find one... but you can see the variable on the x-axis are all integers (11, 12, 13, 14, ...) so that's one example of discrete data, and you can treat ordinal values as discrete data.
$endgroup$
– Art
15 hours ago

add a comment
|

The problem with this is that there's no way of knowing how many dots are bunched up together. Two solutions I've seen:

Box plot

This would give you tighter box if data points are bunched up together.

Bubble chart

Not sure if this is the official name, but basically you put the vertical axis into bins. The size of the bubble is determined by how many observations fall into that bin.

enter image description here

answered 18 hours ago

Art

3351 silver badge7 bronze badges

The problem with this is that there's no way of knowing how many dots are bunched up together. Two solutions I've seen:

Box plot

This would give you tighter box if data points are bunched up together.

Bubble chart

Not sure if this is the official name, but basically you put the vertical axis into bins. The size of the bubble is determined by how many observations fall into that bin.

enter image description here

answered 18 hours ago

Art

3351 silver badge7 bronze badges

answered 18 hours ago

Art

3351 silver badge7 bronze badges

answered 18 hours ago

Art

3351 silver badge7 bronze badges

answered 18 hours ago

Art

3351 silver badge7 bronze badges

$begingroup$
How does the bubble chart help to display the ordinal variable? Maybe you can change your example image to one with ordinal data.
$endgroup$
– Pieter
15 hours ago

$begingroup$
@Pieter Thanks for the suggestion but I couldn't find one... but you can see the variable on the x-axis are all integers (11, 12, 13, 14, ...) so that's one example of discrete data, and you can treat ordinal values as discrete data.
$endgroup$
– Art
15 hours ago

add a comment
|

$begingroup$
How does the bubble chart help to display the ordinal variable? Maybe you can change your example image to one with ordinal data.
$endgroup$
– Pieter
15 hours ago

$begingroup$
@Pieter Thanks for the suggestion but I couldn't find one... but you can see the variable on the x-axis are all integers (11, 12, 13, 14, ...) so that's one example of discrete data, and you can treat ordinal values as discrete data.
$endgroup$
– Art
15 hours ago

How does the bubble chart help to display the ordinal variable? Maybe you can change your example image to one with ordinal data.

– Pieter
15 hours ago

@Pieter Thanks for the suggestion but I couldn't find one... but you can see the variable on the x-axis are all integers (11, 12, 13, 14, ...) so that's one example of discrete data, and you can treat ordinal values as discrete data.

– Art
15 hours ago

add a comment
|

The plot you shown is pretty good. But I think you can improve the data-ink ratio (invented by Edward Tufte) even more by showing all the datapoints. You can do this by adding jitter to the x-axis.

Another improvement is to emphasise that the ordinal variable is categorical and not continuous. You can do this by using a different colour for the different levels.

As an example I have plotted the titanic dataset in R, using the passenger class as an ordinal variable and the passenger age as the continuous variable.

library(tidyverse)

library(ggplot2)

library(titanic)



df <- titanic_train %>% mutate(Class=factor(Pclass))



ggplot(df, aes(Class, Age, color=Class)) +

  geom_jitter(height = 0) + 

  ggtitle("Titanic passenger age vs. class")

enter image description here

answered 15 hours ago

Pieter

1,4976 silver badges19 bronze badges

add a comment
|

The plot you shown is pretty good. But I think you can improve the data-ink ratio (invented by Edward Tufte) even more by showing all the datapoints. You can do this by adding jitter to the x-axis.

Another improvement is to emphasise that the ordinal variable is categorical and not continuous. You can do this by using a different colour for the different levels.

As an example I have plotted the titanic dataset in R, using the passenger class as an ordinal variable and the passenger age as the continuous variable.

library(tidyverse)

library(ggplot2)

library(titanic)



df <- titanic_train %>% mutate(Class=factor(Pclass))



ggplot(df, aes(Class, Age, color=Class)) +

  geom_jitter(height = 0) + 

  ggtitle("Titanic passenger age vs. class")

enter image description here

answered 15 hours ago

Pieter

1,4976 silver badges19 bronze badges

add a comment
|

The plot you shown is pretty good. But I think you can improve the data-ink ratio (invented by Edward Tufte) even more by showing all the datapoints. You can do this by adding jitter to the x-axis.

Another improvement is to emphasise that the ordinal variable is categorical and not continuous. You can do this by using a different colour for the different levels.

As an example I have plotted the titanic dataset in R, using the passenger class as an ordinal variable and the passenger age as the continuous variable.

library(tidyverse)

library(ggplot2)

library(titanic)



df <- titanic_train %>% mutate(Class=factor(Pclass))



ggplot(df, aes(Class, Age, color=Class)) +

  geom_jitter(height = 0) + 

  ggtitle("Titanic passenger age vs. class")

enter image description here

answered 15 hours ago

Pieter

1,4976 silver badges19 bronze badges

The plot you shown is pretty good. But I think you can improve the data-ink ratio (invented by Edward Tufte) even more by showing all the datapoints. You can do this by adding jitter to the x-axis.

Another improvement is to emphasise that the ordinal variable is categorical and not continuous. You can do this by using a different colour for the different levels.

As an example I have plotted the titanic dataset in R, using the passenger class as an ordinal variable and the passenger age as the continuous variable.

library(tidyverse)

library(ggplot2)

library(titanic)



df <- titanic_train %>% mutate(Class=factor(Pclass))



ggplot(df, aes(Class, Age, color=Class)) +

  geom_jitter(height = 0) + 

  ggtitle("Titanic passenger age vs. class")

enter image description here

answered 15 hours ago

Pieter

1,4976 silver badges19 bronze badges

answered 15 hours ago

Pieter

1,4976 silver badges19 bronze badges

answered 15 hours ago

Pieter

1,4976 silver badges19 bronze badges

answered 15 hours ago

Pieter

1,4976 silver badges19 bronze badges

add a comment
|

To your scatterplot, I would add a large point indicating the mean Y-value at every unique X-value, and also do one or more of the following:

Square-root (or cube-root) transform your Y-axis. Both these
transformations can deal with zeroes, unlike log transformations.
Cube roots can also deal with negative numbers.

Make the points a bit transparent.

Add a little jitter to the X-axis values if the previous steps are insufficient.

As Glen_b notes, there is insufficient information right now to note whether adding a linear regression line is meaningful.

edited 17 hours ago

answered 18 hours ago

mkt

7,4536 gold badges31 silver badges89 bronze badges

$begingroup$
Are there any zeros in the response?
$endgroup$
– Nick Cox
17 hours ago

$begingroup$
@NickCox Hard to say for sure. I squinted at it and it seemed like there might be. See X = 3, for example.
$endgroup$
– mkt
17 hours ago

$begingroup$
Question remains for OP, who should know. Incidentally I am a small fan of cube roots, used in Miles, Stokes, Vieli, Cox in Nature. We had to work hard to persuade reviewers that they were a good idea for a response that was variously positive and negative. But for the question here I would favour a Poisson model, which can work fine for non-negative continuous responses.
$endgroup$
– Nick Cox
16 hours ago

$begingroup$
@NickCox Agreed, but the question was about how to visualise, not how to model.
$endgroup$
– mkt
16 hours ago

1

$begingroup$
Indeed. That was a comment not an answer. But a Poisson model would imply plotting on log scale with a secondary question on how to plot observed zeros.
$endgroup$
– Nick Cox
15 hours ago

add a comment
|

To your scatterplot, I would add a large point indicating the mean Y-value at every unique X-value, and also do one or more of the following:

Square-root (or cube-root) transform your Y-axis. Both these
transformations can deal with zeroes, unlike log transformations.
Cube roots can also deal with negative numbers.

Make the points a bit transparent.

Add a little jitter to the X-axis values if the previous steps are insufficient.

As Glen_b notes, there is insufficient information right now to note whether adding a linear regression line is meaningful.

edited 17 hours ago

answered 18 hours ago

mkt

7,4536 gold badges31 silver badges89 bronze badges

$begingroup$
Are there any zeros in the response?
$endgroup$
– Nick Cox
17 hours ago

$begingroup$
@NickCox Hard to say for sure. I squinted at it and it seemed like there might be. See X = 3, for example.
$endgroup$
– mkt
17 hours ago

$begingroup$
Question remains for OP, who should know. Incidentally I am a small fan of cube roots, used in Miles, Stokes, Vieli, Cox in Nature. We had to work hard to persuade reviewers that they were a good idea for a response that was variously positive and negative. But for the question here I would favour a Poisson model, which can work fine for non-negative continuous responses.
$endgroup$
– Nick Cox
16 hours ago

$begingroup$
@NickCox Agreed, but the question was about how to visualise, not how to model.
$endgroup$
– mkt
16 hours ago

1

$begingroup$
Indeed. That was a comment not an answer. But a Poisson model would imply plotting on log scale with a secondary question on how to plot observed zeros.
$endgroup$
– Nick Cox
15 hours ago

add a comment
|

To your scatterplot, I would add a large point indicating the mean Y-value at every unique X-value, and also do one or more of the following:

Square-root (or cube-root) transform your Y-axis. Both these
transformations can deal with zeroes, unlike log transformations.
Cube roots can also deal with negative numbers.

Make the points a bit transparent.

Add a little jitter to the X-axis values if the previous steps are insufficient.

As Glen_b notes, there is insufficient information right now to note whether adding a linear regression line is meaningful.

edited 17 hours ago

answered 18 hours ago

mkt

7,4536 gold badges31 silver badges89 bronze badges

To your scatterplot, I would add a large point indicating the mean Y-value at every unique X-value, and also do one or more of the following:

Square-root (or cube-root) transform your Y-axis. Both these
transformations can deal with zeroes, unlike log transformations.
Cube roots can also deal with negative numbers.

Make the points a bit transparent.

Add a little jitter to the X-axis values if the previous steps are insufficient.

As Glen_b notes, there is insufficient information right now to note whether adding a linear regression line is meaningful.

edited 17 hours ago

answered 18 hours ago

mkt

7,4536 gold badges31 silver badges89 bronze badges

edited 17 hours ago

answered 18 hours ago

mkt

7,4536 gold badges31 silver badges89 bronze badges

answered 18 hours ago

mkt

7,4536 gold badges31 silver badges89 bronze badges

answered 18 hours ago

mkt

7,4536 gold badges31 silver badges89 bronze badges

$begingroup$
Are there any zeros in the response?
$endgroup$
– Nick Cox
17 hours ago

$begingroup$
@NickCox Hard to say for sure. I squinted at it and it seemed like there might be. See X = 3, for example.
$endgroup$
– mkt
17 hours ago

$begingroup$
Question remains for OP, who should know. Incidentally I am a small fan of cube roots, used in Miles, Stokes, Vieli, Cox in Nature. We had to work hard to persuade reviewers that they were a good idea for a response that was variously positive and negative. But for the question here I would favour a Poisson model, which can work fine for non-negative continuous responses.
$endgroup$
– Nick Cox
16 hours ago

$begingroup$
@NickCox Agreed, but the question was about how to visualise, not how to model.
$endgroup$
– mkt
16 hours ago

1

$begingroup$
Indeed. That was a comment not an answer. But a Poisson model would imply plotting on log scale with a secondary question on how to plot observed zeros.
$endgroup$
– Nick Cox
15 hours ago

add a comment
|

$begingroup$
Are there any zeros in the response?
$endgroup$
– Nick Cox
17 hours ago

$begingroup$
@NickCox Hard to say for sure. I squinted at it and it seemed like there might be. See X = 3, for example.
$endgroup$
– mkt
17 hours ago

$begingroup$
Question remains for OP, who should know. Incidentally I am a small fan of cube roots, used in Miles, Stokes, Vieli, Cox in Nature. We had to work hard to persuade reviewers that they were a good idea for a response that was variously positive and negative. But for the question here I would favour a Poisson model, which can work fine for non-negative continuous responses.
$endgroup$
– Nick Cox
16 hours ago

$begingroup$
@NickCox Agreed, but the question was about how to visualise, not how to model.
$endgroup$
– mkt
16 hours ago

1

$begingroup$
Indeed. That was a comment not an answer. But a Poisson model would imply plotting on log scale with a secondary question on how to plot observed zeros.
$endgroup$
– Nick Cox
15 hours ago

Are there any zeros in the response?

– Nick Cox
17 hours ago

@NickCox Hard to say for sure. I squinted at it and it seemed like there might be. See X = 3, for example.

– mkt
17 hours ago

Question remains for OP, who should know. Incidentally I am a small fan of cube roots, used in Miles, Stokes, Vieli, Cox in Nature. We had to work hard to persuade reviewers that they were a good idea for a response that was variously positive and negative. But for the question here I would favour a Poisson model, which can work fine for non-negative continuous responses.

– Nick Cox
16 hours ago

@NickCox Agreed, but the question was about how to visualise, not how to model.

– mkt
16 hours ago

Indeed. That was a comment not an answer. But a Poisson model would imply plotting on log scale with a secondary question on how to plot observed zeros.

– Nick Cox
15 hours ago

add a comment
|

answered 15 hours ago

Peter Flom♦

80.7k13 gold badges117 silver badges228 bronze badges

add a comment
|

answered 15 hours ago

Peter Flom♦

80.7k13 gold badges117 silver badges228 bronze badges

add a comment
|

answered 15 hours ago

Peter Flom♦

80.7k13 gold badges117 silver badges228 bronze badges

answered 15 hours ago

Peter Flom♦

80.7k13 gold badges117 silver badges228 bronze badges

answered 15 hours ago

Peter Flom♦

80.7k13 gold badges117 silver badges228 bronze badges

answered 15 hours ago

Peter Flom♦

80.7k13 gold badges117 silver badges228 bronze badges

answered 15 hours ago

Peter Flom♦

80.7k13 gold badges117 silver badges228 bronze badges

add a comment
|

You may want to show summary statistics other than just $bar y_x$. A box plot can show meadian and quartiles, for instance. You might also want to represent the standard deviation somehow.

answered 3 hours ago

Acccumulation

1,8612 silver badges7 bronze badges

add a comment
|

You may want to show summary statistics other than just $bar y_x$. A box plot can show meadian and quartiles, for instance. You might also want to represent the standard deviation somehow.

answered 3 hours ago

Acccumulation

1,8612 silver badges7 bronze badges

add a comment
|

You may want to show summary statistics other than just $bar y_x$. A box plot can show meadian and quartiles, for instance. You might also want to represent the standard deviation somehow.

answered 3 hours ago

Acccumulation

1,8612 silver badges7 bronze badges

You may want to show summary statistics other than just $bar y_x$. A box plot can show meadian and quartiles, for instance. You might also want to represent the standard deviation somehow.

answered 3 hours ago

Acccumulation

1,8612 silver badges7 bronze badges

answered 3 hours ago

Acccumulation

1,8612 silver badges7 bronze badges

answered 3 hours ago

Acccumulation

1,8612 silver badges7 bronze badges

answered 3 hours ago

Acccumulation

1,8612 silver badges7 bronze badges

add a comment
|

In addition to the box plot suggested by Art, I suggest a violin plot:

enter image description here

Explicitly showing the median and interquartile range, as done in the above image, is optional.

Quoting from Wikipedia:

Violin plots are similar to box plots, except that they also show the probability density of the data at different values, usually smoothed by a kernel density estimator.

A violin plot is more informative than a plain box plot. While a box plot only shows summary statistics such as mean/median and interquartile ranges, the violin plot shows the full distribution of the data. The difference is particularly useful when the data distribution is multimodal (more than one peak).

answered 1 hour ago

user76284

12014 bronze badges

add a comment
|

In addition to the box plot suggested by Art, I suggest a violin plot:

enter image description here

Explicitly showing the median and interquartile range, as done in the above image, is optional.

Quoting from Wikipedia:

Violin plots are similar to box plots, except that they also show the probability density of the data at different values, usually smoothed by a kernel density estimator.

A violin plot is more informative than a plain box plot. While a box plot only shows summary statistics such as mean/median and interquartile ranges, the violin plot shows the full distribution of the data. The difference is particularly useful when the data distribution is multimodal (more than one peak).

answered 1 hour ago

user76284

12014 bronze badges

add a comment
|

In addition to the box plot suggested by Art, I suggest a violin plot:

enter image description here

Explicitly showing the median and interquartile range, as done in the above image, is optional.

Quoting from Wikipedia:

Violin plots are similar to box plots, except that they also show the probability density of the data at different values, usually smoothed by a kernel density estimator.

A violin plot is more informative than a plain box plot. While a box plot only shows summary statistics such as mean/median and interquartile ranges, the violin plot shows the full distribution of the data. The difference is particularly useful when the data distribution is multimodal (more than one peak).

answered 1 hour ago

user76284

12014 bronze badges

In addition to the box plot suggested by Art, I suggest a violin plot:

enter image description here

Explicitly showing the median and interquartile range, as done in the above image, is optional.

Quoting from Wikipedia:

Violin plots are similar to box plots, except that they also show the probability density of the data at different values, usually smoothed by a kernel density estimator.

A violin plot is more informative than a plain box plot. While a box plot only shows summary statistics such as mean/median and interquartile ranges, the violin plot shows the full distribution of the data. The difference is particularly useful when the data distribution is multimodal (more than one peak).

answered 1 hour ago

user76284

12014 bronze badges

answered 1 hour ago

user76284

12014 bronze badges

answered 1 hour ago

user76284

12014 bronze badges

answered 1 hour ago

user76284

12014 bronze badges

add a comment
|

draft saved

draft discarded

Thanks for contributing an answer to Cross Validated!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Mdthbs