When to remove insignificant variables?


I'm working on a logistic regression model. I checked the summary of the model, which is built on 5 independent variables; one of them is not significant, with a p-value of 0.74. Should we remove the variable directly, or is there another way to check its significance?

A senior colleague of mine suggested applying a logarithmic transformation to the insignificant variable and then looking at the correlation. Would that count towards checking its significance?

    model <- glm(Buy ~ a_score + b_score + c_score + lb + p, data = history, family = binomial)

All variables come out significant, with 2 or 3 stars, apart from a_score, which is shown as insignificant.







      r regression correlation






asked 9 hours ago by AKSHIT SINGH











      migrated from stackoverflow.com 9 hours ago


      This question came from our site for professional and enthusiast programmers.









          2 Answers
Let me first ask this: what is the goal of the model? If you are only interested in predicting whether a customer will buy, then statistical hypothesis tests really aren't your main concern. Instead, you should be externally validating your model via a validation/test procedure on unseen data.

If, instead, you are interested in examining which factors contribute to the probability of a customer buying, then there is no need to remove variables that fail to reject the null (especially in a stepwise sort of manner). Presumably, you included a variable in your model because you thought (from past experience or expert opinion) that it played an important part in a customer deciding whether to buy. That the variable failed to reject the null doesn't make your model a bad one; it just means that your sample didn't detect an effect of that variable. That's perfectly OK.
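As a rough illustration of the hold-out validation mentioned above, here is a minimal sketch in base R. The `history` data frame is simulated here (the real data are not shown in the question), the coefficients are made up, and the `log_loss` helper is a hypothetical name introduced only for this example.

```r
# Simulated stand-in for the `history` data frame from the question (hypothetical values).
set.seed(1)
n <- 1000
history <- data.frame(
  a_score = rnorm(n), b_score = rnorm(n), c_score = rnorm(n),
  lb = rnorm(n), p = rnorm(n)
)
# Assumption for this sketch: a_score has no effect on the outcome.
eta <- -0.5 + 0.8 * history$b_score - 0.6 * history$c_score +
  0.5 * history$lb + 0.4 * history$p
history$Buy <- rbinom(n, size = 1, prob = plogis(eta))

# Hold out 30% of the rows as unseen test data.
test_idx <- sample(n, size = round(0.3 * n))
train <- history[-test_idx, ]
test  <- history[test_idx, ]

# Fit the full model and a model without a_score on the training rows only.
full    <- glm(Buy ~ a_score + b_score + c_score + lb + p,
               data = train, family = binomial)
reduced <- update(full, . ~ . - a_score)

# Mean log loss on held-out rows; lower is better.
log_loss <- function(model, newdata) {
  prob <- predict(model, newdata = newdata, type = "response")
  y <- newdata$Buy
  -mean(y * log(prob) + (1 - y) * log(1 - prob))
}

print(c(full = log_loss(full, test), reduced = log_loss(reduced, test)))
```

If prediction is the goal, comparing the two held-out losses is a more direct guide than the p-value of `a_score`. A single split is noisy, so cross-validation would be a natural next step.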







answered 8 hours ago by Demetri Pananos
• Upvoted for excellence of the answer. – James Phillips, 8 hours ago

• @JamesPhillips Thanks – Demetri Pananos, 8 hours ago

• +1 Removing predictors potentially related to the outcome (even if "insignificant") is tricky in logistic regression, given its inherent omitted-variable bias. Removing a predictor related to the outcome can lead to bias in the estimates of the coefficients of the retained predictors, even if the retained predictors aren't correlated with the removed predictor. – EdM, 8 hours ago
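The omitted-variable point in the last comment can be checked with a small simulation. This is only a sketch: `x1`, `x2`, the sample size and the coefficients 1 and 2 are all made-up assumptions, not values from the question.

```r
set.seed(42)
n  <- 50000
x1 <- rnorm(n)
x2 <- rnorm(n)  # generated independently of x1, so the two are uncorrelated
y  <- rbinom(n, size = 1, prob = plogis(1 * x1 + 2 * x2))

# Coefficient of x1 with and without x2 in the model.
coef_full    <- coef(glm(y ~ x1 + x2, family = binomial))["x1"]
coef_reduced <- coef(glm(y ~ x1, family = binomial))["x1"]

# The x1 coefficient shrinks toward zero once x2 is dropped,
# even though x1 and x2 are uncorrelated.
print(c(full = unname(coef_full), reduced = unname(coef_reduced)))
```

In ordinary least squares, by contrast, dropping a predictor that is uncorrelated with the retained ones leaves their coefficients unbiased. That is why removing an "insignificant" but relevant variable is riskier in logistic regression.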














Have a look at the help pages for step(), drop1() and add1(). These will help you add or remove variables based on AIC. However, all such methods are somewhat flawed because of their path dependence. A better way would be to use the functions in the penalized or glmnet package to perform a lasso regression.
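A minimal sketch of how these functions might be called on a model like the one in the question. The `history` data are simulated here with made-up coefficients, and the lasso part only runs if the glmnet package happens to be installed.

```r
# Simulated stand-in for the `history` data frame (hypothetical values).
set.seed(7)
n <- 500
history <- data.frame(
  a_score = rnorm(n), b_score = rnorm(n), c_score = rnorm(n),
  lb = rnorm(n), p = rnorm(n)
)
# Assumption for this sketch: a_score has no effect on the outcome.
eta <- 0.9 * history$b_score - 0.7 * history$c_score +
  0.6 * history$lb + 0.5 * history$p
history$Buy <- rbinom(n, size = 1, prob = plogis(eta))

model <- glm(Buy ~ a_score + b_score + c_score + lb + p,
             data = history, family = binomial)

# Single-term deletions: AIC and a likelihood-ratio test for dropping each variable.
print(drop1(model, test = "Chisq"))

# Backward stepwise search by AIC (path dependent, as noted above).
stepped <- step(model, direction = "backward", trace = 0)
print(formula(stepped))

# Lasso-penalized logistic regression, with the penalty chosen by cross-validation.
if (requireNamespace("glmnet", quietly = TRUE)) {
  x <- as.matrix(history[, c("a_score", "b_score", "c_score", "lb", "p")])
  cv_fit <- glmnet::cv.glmnet(x, factor(history$Buy), family = "binomial", alpha = 1)
  print(coef(cv_fit, s = "lambda.1se"))
}
```

The lasso shrinks all coefficients and may set some exactly to zero, so selection happens as part of the fit rather than through a sequence of add/drop decisions.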







answered 9 hours ago by Feakster




































