How likely is it that sample A and sample B are from distribution C?


Let's say I have a sample A: [0,0,0,1]
and another sample B: [2,0,5,10,100,3,2,6].

I would like to know the probability that A and B were both drawn from the same population C.

I tried applying a hypothesis test, but it gives a p-value of approximately 0.39, whereas it seems clear to me that the two samples are very unlikely to come from the same distribution.







probability hypothesis-testing distributions p-value multivariate-analysis






asked 8 hours ago by Franc Weser

  • I'm guessing you used a pooled 2-sample t test, which is not a good choice here because sample sizes are small, 100 is a far outlier, and sample variances are hugely different. But your intuition that these data are not likely to have come from the same population is correct.
    – BruceET, 6 hours ago












  • As phrased, the question (which contains a request for a probability) appears to be framed as a Bayesian problem. I expect that a Bayesian analysis is likely not the OP's intent, but if answers talk about hypothesis tests they should also discuss what question those tests answer (in place of what the question asks).
    – Glen_b, 1 hour ago
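The question does not say which test produced the p-value of about 0.39. As a quick check of the guess in the first comment, the sketch below runs a pooled two-sample t test on the posted data, next to the Welch version and the two sample variances. The names `a`, `b`, `pooled_fit` and `welch_fit` are illustrative only, and the idea that the original test was a pooled t test is an assumption that the question does not confirm.

```r
# Data as posted in the question
a <- c(0, 0, 0, 1)
b <- c(2, 0, 5, 10, 100, 3, 2, 6)

# Pooled (equal-variance) two-sample t test: the test guessed at in the comment
pooled_fit <- t.test(a, b, var.equal = TRUE)

# Welch t test (R's default), which does not assume equal variances
welch_fit <- t.test(a, b)

# Compare the p-values with the roughly 0.39 reported in the question
print(pooled_fit$p.value)
print(welch_fit$p.value)

# The sample variances differ by several orders of magnitude
print(c(var_a = var(a), var_b = var(b)))
```

If the pooled p-value comes out near 0.39, that supports the guess. The very different variances, driven largely by the single value 100, show why an equal-variance t test is a poor fit for these data.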




















1 Answer
You don't say what kind of hypothesis test you used. Inference on samples as small as these is always going to be difficult. However, a nonparametric Kolmogorov-Smirnov test (in R) does reject the null hypothesis that these two samples were randomly sampled from the same population.

There is a warning message that, on account of the ties, the P-value is not exact, but 0.034 seems sufficiently smaller than 0.05 to say that we can reject at the 5% level.



    x1 = c(0,0,0,1)
    x2 = c(2,0,5,10,100,3,2,6)
    ks.test(x1, x2)

    Two-sample Kolmogorov-Smirnov test

    data: x1 and x2
    D = 0.875, p-value = 0.0337
    alternative hypothesis: two-sided

    Warning message:
    In ks.test(x1, x2) : cannot compute exact p-value with ties
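One way around the ties warning, offered here as a sketch rather than as part of the original answer, is an exact permutation version of the same test. Under the null hypothesis that both samples come from one population, every way of assigning 4 of the 12 pooled values to the first sample is equally likely, so the KS statistic can be enumerated over all `choose(12, 4)` = 495 splits. The helper `ks_stat` and the variable names below are hypothetical.

```r
x1 <- c(0, 0, 0, 1)
x2 <- c(2, 0, 5, 10, 100, 3, 2, 6)

# Two-sample KS statistic: largest gap between the two empirical CDFs,
# evaluated at every distinct pooled value
ks_stat <- function(a, b) {
  grid <- sort(unique(c(a, b)))
  max(abs(ecdf(a)(grid) - ecdf(b)(grid)))
}

pooled_values <- c(x1, x2)
d_obs <- ks_stat(x1, x2)

# Each column of `splits` holds the indices assigned to the first sample
splits <- combn(length(pooled_values), length(x1))
d_perm <- apply(splits, 2, function(idx) {
  ks_stat(pooled_values[idx], pooled_values[-idx])
})

# Exact permutation p-value: share of splits at least as extreme as observed
# (small tolerance guards against floating-point comparison issues)
p_perm <- mean(d_perm >= d_obs - 1e-12)

print(d_obs)
print(p_perm)
```

Because the enumeration uses the observed values themselves, ties included, the resulting p-value is exact for these data under the permutation null and needs no tie correction. As with the tests above, it is the probability of a statistic this extreme if the samples share a population, not the probability that they share one.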


Similar data without ties gives a 'cleaner' test, rejecting the null hypothesis with no warning messages.

    y1 = c(.01, .02, .03, .9)
    y2 = c(2,0,5,10,100,3,2.1,6)
    ks.test(y1, y2)

    Two-sample Kolmogorov-Smirnov test

    data: y1 and y2
    D = 0.875, p-value = 0.0202
    alternative hypothesis: two-sided


Another possible test is the two-sample Wilcoxon (rank sum) test. Its distribution theory is also somewhat disturbed by ties, but it does find a significant difference between your two samples. Looking just at the P-value, we have:

    wilcox.test(x1,x2)$p.val
    [1] 0.02434338
    Warning message:
    In wilcox.test.default(x1, x2) :
    cannot compute exact p-value with ties






answered 7 hours ago by BruceET (edited 7 hours ago)





















